Development of a Portable Lattice Boltzmann Code for Direct Numerical Simulations of Multiphase Flow in Porous Media and Microfluidic Devices
University of Illinois at Urbana-Champaign
Key words: Lattice Boltzmann method, Multiphase flow, Porous media, Microfluidics, Code portability, High performance computing
The lattice Boltzmann method (LBM) has become a widely used method for studying multiphase flow in porous media and microfluidic devices, owing to its ability to model complex interfacial dynamics and flow in complex geometries. Furthermore, the LBM is relatively easy to implement and is well suited to modern parallel computing, which is a requirement for simulations of flow in three-dimensional porous media because of the complex pore space morphology and the need to use large domains. Over the past decade, many CUDA LBM codes have been developed to exploit NVIDIA GPU acceleration. In recent years, many-core processors, such as the Intel Many Integrated Core (MIC) processor, have also been widely adopted in high performance computing. Given the rapid development of modern processors, low-level CUDA code may not be the best choice for maintaining a portable code. As modern compilers mature, portable code using hybrid MPI-OpenMP or MPI-OpenACC, which relies on the compiler for the low-level implementation, becomes attractive because the same code can run on different platforms, from everyday laptops to high-end supercomputers equipped with GPUs or MICs.
In this talk, we will describe our experience optimizing our multiphase MRT-LBM code and porting it to MIC and GPU via hybrid MPI-OpenMP/OpenACC programming. We adopt the AA pattern streaming method (Bailey et al., 2009) to reduce memory access and memory consumption, and the structure of arrays (SoA) data layout to maximize vectorization. The bounce-back boundary condition is carried out implicitly by the AA pattern streaming method, so branching is avoided. We will present code performance results obtained on CPU, MIC and GPU, including latest-generation processors such as the Intel Knights Landing (KNL) processor. Significant speedup is observed even on CPU platforms after optimization. Code performance on a KNL node is about 4 times that of the optimized code on a traditional CPU node with two 14-core high-end CPUs. We will also present multiphase flow simulation results from our code for a real Bentheimer sandstone and a heterogeneous micromodel.
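The AA pattern and SoA layout mentioned above can be illustrated with a minimal sketch. It is not the authors' code. It assumes a one-dimensional D1Q3 lattice with periodic wrap-around, a single-relaxation-time (BGK) collision standing in for the MRT operator, and a precomputed index table as one possible way to fold halfway bounce-back into the streaming step. All function names (`collide`, `build_odd_table`, `aa_even`, `aa_odd`, `reference_step`, `run_demo`) are hypothetical. The single array `f` is stored direction by direction (SoA), and a conventional two-array scheme is included only as a check.

```python
# Illustrative sketch only: D1Q3, BGK collision, pure Python (not the authors' code).
C = (0, 1, -1)      # lattice velocities: rest, +x, -x
OPP = (0, 2, 1)     # index of the opposite direction
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)  # D1Q3 weights


def collide(f, omega):
    """BGK relaxation of the three populations at one node (stand-in for MRT)."""
    rho = f[0] + f[1] + f[2]
    u = (f[1] - f[2]) / rho
    post = []
    for i in range(3):
        cu = C[i] * u
        feq = W[i] * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * u * u)
        post.append(f[i] + omega * (feq - f[i]))
    return post


def build_odd_table(solid):
    """For each (direction i, node x), the flat SoA index read on odd steps."""
    n = len(solid)
    table = [0] * (3 * n)
    for i in range(3):
        for x in range(n):
            src = (x - C[i]) % n
            # Solid upstream neighbour: the bounced value already sits in the
            # node's own slot i. Fluid neighbour: it sits in that neighbour's
            # opposite slot. The branch is paid once here, not in the kernels.
            table[i * n + x] = (i * n + x) if solid[src] else (OPP[i] * n + src)
    return table


def aa_even(f, fluid, n, omega):
    """Even step: collide in place and store each result in the opposite slot."""
    for x in fluid:
        post = collide([f[i * n + x] for i in range(3)], omega)
        for i in range(3):
            f[OPP[i] * n + x] = post[i]


def aa_odd(f, fluid, n, table, omega):
    """Odd step: gather through the table, collide, scatter through the same table."""
    for x in fluid:
        post = collide([f[table[i * n + x]] for i in range(3)], omega)
        for i in range(3):
            f[table[OPP[i] * n + x]] = post[i]


def reference_step(f, solid, omega):
    """Two-array collide-and-push step with explicit halfway bounce-back."""
    n = len(solid)
    new = [0.0] * (3 * n)
    for x in range(n):
        if solid[x]:
            continue
        post = collide([f[i * n + x] for i in range(3)], omega)
        for i in range(3):
            dst = (x + C[i]) % n
            if solid[dst]:
                new[OPP[i] * n + x] = post[i]
            else:
                new[i * n + dst] = post[i]
    return new


def run_demo(pairs=3, omega=1.2):
    """Run `pairs` even/odd AA sweeps and 2 * pairs reference steps."""
    solid = [True, False, False, False, False, False, False, True]
    n = len(solid)
    fluid = [x for x in range(n) if not solid[x]]
    f_aa = [0.0] * (3 * n)
    for x in fluid:
        for i in range(3):
            f_aa[i * n + x] = W[i] * (1.0 + 0.1 * x) + 0.01 * i
    f_ref = list(f_aa)
    table = build_odd_table(solid)
    for _ in range(pairs):
        aa_even(f_aa, fluid, n, omega)
        aa_odd(f_aa, fluid, n, table, omega)
        f_ref = reference_step(reference_step(f_ref, solid, omega), solid, omega)
    return f_aa, f_ref
```

Calling `run_demo()` returns the single-array AA state and the two-array reference state, which should agree to rounding error after each even/odd pair. In this sketch the kernels loop only over fluid nodes and resolve walls through the index table, so no solid/fluid test appears inside them. In a compiled code, loops of this shape are the natural place for OpenMP or OpenACC directives, and the direction-major layout keeps neighbouring nodes contiguous in memory for each direction.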
This work was primarily supported as part of the Center for Geologic Storage of CO2, an Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science. The supercomputers used in this work include TACC Stampede, LSU SuperMIC and PSC Bridges, provided by XSEDE, and Blue Waters, accessed through Illinois Blue Waters allocations.
Dr. Yu Chen is a postdoctoral research associate who joined the University of Illinois at Urbana-Champaign and the Center for Geologic Storage of CO2 in August 2015, under the supervision of Prof. Albert Valocchi. His research focuses on direct numerical simulations of multiphase flow in porous media, the lattice Boltzmann method, high performance computing, and microfluidics with applications in biomedical engineering. Dr. Chen received his bachelor's and doctoral degrees from Peking University in 2006 and 2012, respectively, and was a visiting student at the University of Southern California from 2008 to 2009 and at Los Alamos National Lab from 2009 to 2010. Before joining UIUC, Dr. Chen was a postdoctoral research associate at Tsinghua University.