Highly Scalable Framework for Modeling Atmospheric Dynamics

Research Description:

Because climate strongly influences human activities and extreme weather events cause huge losses, climate change has become one of the most important research subjects for governments and research organizations. Among the components of a climate system model, the global atmospheric model is one of the most challenging. Developing highly scalable algorithms for global atmospheric modeling is becoming increasingly important as scientists seek to understand the behavior of the global atmosphere at extreme scales. Heterogeneous architectures that combine processors and accelerators have become an important solution for large-scale computing and are widely used in many key applications to provide significant computing power. Efficiently solving atmospheric problems through HPC techniques has therefore become a hot topic. The atmospheric modeling group in Tsinghua HPGC focuses mainly on the study of developing and accele

Research Contents:

So far, we have studied a set of atmospheric applications on different hardware architectures (CPU, GPU, MIC, and FPGA) and achieved significant improvements in both performance and power consumption.

Beyond applications from the atmospheric field, we also focus on accelerating kernels derived from other key geo-science applications such as geo-exploration and city planning.


The figures above show our peta-scale simulation of the global shallow water equations on TH1A, one of the most powerful supercomputers in China. Using a hybrid CPU-GPU decomposition method and a novel pipe-flow scheme (right figure), we scale the complicated atmospheric equations to over 3750 TH1A nodes and achieve a performance of 803 Tera-Flops in double precision. The left figure above shows the simulation result on 2750 TH1A nodes.
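As a rough illustration of what a hybrid CPU-GPU decomposition involves, the sketch below splits the rows of one node's subdomain between the two devices in proportion to their throughput. This is a generic load-balancing heuristic, not the decomposition method used in the work described above; the function name `split_rows` and the throughput values are hypothetical.

```python
def split_rows(n_rows, cpu_rate, gpu_rate):
    """Split n_rows between CPU and GPU in proportion to their throughput.

    cpu_rate and gpu_rate are hypothetical relative throughputs
    (for example, grid rows updated per second on each device).
    Returns a tuple (cpu_rows, gpu_rows) that sums to n_rows.
    """
    if n_rows < 0 or cpu_rate <= 0 or gpu_rate <= 0:
        raise ValueError("n_rows must be >= 0 and both rates must be > 0")
    # The GPU gets its proportional share, rounded to a whole row.
    gpu_rows = round(n_rows * gpu_rate / (cpu_rate + gpu_rate))
    # The CPU takes whatever remains, so no row is lost or duplicated.
    return n_rows - gpu_rows, gpu_rows
```

For example, `split_rows(1000, 1.0, 3.0)` assigns 250 rows to the CPU and 750 rows to the GPU.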


Targeting the most recent Intel Many Integrated Core (MIC) architecture, we have developed highly efficient methods to accelerate the atmospheric equations. The figures above demonstrate our recent work on Tianhe-2, the world’s No.1 supercomputer with a peak performance of 52 Peta-Flops. Our hybrid algorithms on Tianhe-2 scale the performance of the atmospheric equations to nearly 1.7 million cores, with a strong-scaling efficiency of 77%.
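Strong-scaling efficiency is usually defined as the measured speedup over a baseline run divided by the ideal speedup for a fixed problem size. A minimal sketch of that calculation follows; the function name and the timings in the usage note are made up for illustration and are not measurements from Tianhe-2.

```python
def strong_scaling_efficiency(base_cores, base_time, cores, time):
    """Return strong-scaling efficiency for a fixed problem size.

    base_cores, base_time: core count and run time of the baseline run.
    cores, time: core count and run time of the scaled run.
    """
    if min(base_cores, cores) <= 0 or min(base_time, time) <= 0:
        raise ValueError("core counts and times must be positive")
    speedup = base_time / time    # how much faster the scaled run is
    ideal = cores / base_cores    # speedup expected from perfect scaling
    return speedup / ideal
```

With these illustrative numbers, going from 100 to 400 cores while the run time drops from 80 s to 25 s gives `strong_scaling_efficiency(100, 80.0, 400, 25.0)`, which is approximately 0.8.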


The reconfigurable data flow engine (DFE) is a new HPC platform that has shown promising potential in many key applications. In our recent study, we proposed a highly efficient and green DFE to solve the atmospheric equations. The figures above show the general architecture and the novel techniques used to scale the performance on the DFE. Compared with conventional HPC platforms such as the CPU, GPU, and MIC, the DFE has shown results an order of magnitude better in both performance and power efficiency.


Besides atmospheric modeling, we also focus on other key algorithms derived from geo-science applications. The figure above shows the different stencil kernels we are studying. Stencil evaluation can be considered the most essential computational kernel for a variety of important geo-science applications such as weather prediction, reverse time migration (RTM), and city planning.
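To make the idea of a stencil kernel concrete, the sketch below applies one sweep of a simple 2-D 5-point averaging stencil, where each interior point is replaced by the mean of its four neighbours. It is a textbook example chosen for brevity, not one of the specific kernels studied above, and the function name is hypothetical.

```python
def five_point_stencil(grid):
    """Apply one sweep of a 2-D 5-point averaging stencil.

    grid is a list of equal-length lists of numbers. Interior points
    become the mean of their north, south, west, and east neighbours;
    boundary points are copied unchanged. The input is not modified.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so reads always use old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out
```

Each output point depends only on a fixed neighbourhood of the input, which is why stencil sweeps map well onto parallel hardware.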

Another line of work is the acceleration of the chemical transport model (CTM). As one of the most advanced global 3-D chemical transport models for atmospheric composition in the world, GEOS-Chem can simulate many mechanisms such as transport, deposition, and pollutant emissions. We proposed a scheme to parallelize the chemistry process of GEOS-Chem on the GPU platform, and obtained improved performance.



Professor Guangwen Yang, Tsinghua University
Professor Wayne Luk, Department of Computing, Imperial College London
Dr. Haohuan Fu, Tsinghua University
Dr. Wei Xue, Tsinghua University
Dr. Chao Yang, Institute of Software, Chinese Academy of Sciences


Lin Gan, 4th-year PhD student, Tsinghua University
Junfeng Liao, 3rd-year PhD student, Tsinghua University
Mengyao Jin, 2nd-year Master's student, Tsinghua University
Jingheng Xu, 1st-year PhD student, Tsinghua University


1. Lin Gan, Haohuan Fu, Wayne Luk, et al. Accelerating Solvers for Global Atmospheric Equations Through Mixed-precision Data Flow Engine. The 23rd International Conference on Field Programmable Logic and Applications (FPL 2013), Sept 2-4, 2013.

2. Lin Gan, Haohuan Fu, Wayne Luk, et al. Global Atmospheric Simulation on a Reconfigurable Platform. Poster of the 22nd IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2013), April 28-30, 2013.

3. Haohuan Fu, Lin Gan, Robert G. Clapp, et al. Scaling the Reverse Time Migration Performance through Reconfigurable Data-Flow Engines. IEEE MICRO, pp. 30–40, 2013.

4. Chao Yang, Wei Xue, Haohuan Fu, Lin Gan, et al. A Peta-scalable CPU-GPU Algorithm for Global Atmospheric Simulations. The 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2013), Feb 23-27, 2013.

5. You Yang, Haohuan Fu, Lin Gan, et al. Accelerating the 3D Elastic Wave Forward Modeling on GPU and MIC. Workshop of the 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2013), May 20-24, 2013.

6. Wei Xue, Chao Yang, Haohuan Fu, Xinliang Wang, Yangtong Xu, Lin Gan, et al. Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2. The 28th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2014), May 19-23, 2014.

7. Lin Gan, Haohuan Fu, Chao Yang, Wayne Luk, et al. A Highly-Efficient and Green Data Flow Engine for Solving Euler Atmospheric Equations. The 24th International Conference on Field Programmable Logic and Applications (FPL 2014), Sept 2-4, 2014.

8. Lin Gan, Haohuan Fu, Wayne Luk, Chao Yang, Wei Xue, Xiaomeng Huang, Youhui Zhang, and Guangwen Yang. Solving the Global Atmospheric Equations through Heterogeneous Reconfigurable Platforms. ACM Transactions on Reconfigurable Technology and Systems (TRETS) (to appear).

9. Lin Gan, Haohuan Fu, Wei Xue, Yangtong Xu, Chao Yang, Xinliang Wang, Zihong Lv, Yang You, et al. Scaling and Analyzing the Stencil Performance on Multi-Core and Many-Core Architectures. The 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2014) (to appear).

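10. Jingheng Xu, Lin Gan, Haohuan Fu, et al. Parallel Solution of the Euler Atmospheric Equations on a CPU-GPU Heterogeneous Platform. National Annual Conference on High Performance Computing (accepted).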