Architecture of Moldex3D Parallel Computing for Solid
Moldex3D parallel computing adopts an SPMD (single program, multiple data) architecture that combines the domain decomposition method with MPI (message passing interface) via MPICH2. Domain decomposition distributes the computing domains of a model across nodes, so that each node is given a subset of the data on which to work. Another critical technology in parallel computing with domain decomposition is load balancing. A good load-balancing algorithm is necessary to minimize the difference in load between CPU cores; otherwise, one or more CPU cores may sit idle during parallel computing. At the same time, data must be exchanged across the sub-domain interfaces to transmit the results of one sub-domain to its neighboring sub-domains. This data communication inevitably slows parallel computing down, and the larger the model, the larger the volume of data that must be communicated between CPU cores. Hence the performance of parallel computing is directly related to the method of domain decomposition.
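The effect of load balance on idle time can be illustrated with a small calculation. The following is a minimal sketch, not part of Moldex3D: it assumes that the work of each CPU core is proportional to the number of mesh elements it holds, and the function names and element counts are hypothetical.

```python
def load_imbalance(elements_per_core):
    """Return the largest load divided by the mean load (1.0 = perfectly balanced).

    Assumption: the cost of a core is proportional to its element count.
    """
    mean_load = sum(elements_per_core) / len(elements_per_core)
    return max(elements_per_core) / mean_load


def idle_fraction(elements_per_core):
    """Return the fraction of total core-time spent waiting for the slowest core."""
    slowest = max(elements_per_core)
    total_capacity = slowest * len(elements_per_core)
    return 1.0 - sum(elements_per_core) / total_capacity


# Hypothetical distributions of a 1,000,000-element mesh over 4 CPU cores.
balanced = [250_000, 250_000, 250_000, 250_000]
skewed = [400_000, 200_000, 200_000, 200_000]

print("balanced:", load_imbalance(balanced), idle_fraction(balanced))
print("skewed:  ", load_imbalance(skewed), idle_fraction(skewed))
```

Under these assumptions, the skewed distribution makes every step take as long as the most heavily loaded core needs, so the other cores wait for part of each step.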
To maximize parallel computing performance, Moldex3D employs domain decomposition technology to partition the model into several small parts, which are distributed to different CPU cores. Each CPU core then performs the computation for its own part. The CPU cores also need to exchange data with one another during computing. This data exchange occurs only on the interfaces between contiguous sub-domains, through MPI (message passing interface) via Intel MPI. The results of each sub-domain are generated when its CPU core finishes computing. Finally, the Moldex3D solver collects the results of all sub-domains into the full-domain result.
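The sequence described above (partition, compute per sub-domain, exchange interface data, collect results) can be sketched on a toy problem. This is only an illustration under stated assumptions: it runs serially in one process, uses a simple 1D Jacobi smoothing step in place of the actual Moldex3D solver, and all function names are hypothetical. In a real MPI program each sub-domain would live in its own process and the interface values would travel through message-passing calls.

```python
def jacobi_serial(u, iterations):
    """Reference: Jacobi smoothing on the full domain with fixed end values."""
    u = list(u)
    for _ in range(iterations):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def split(u, parts):
    """Partition the full domain into contiguous, nearly equal sub-domains."""
    n = len(u)
    bounds = [n * p // parts for p in range(parts + 1)]
    return [u[bounds[p]:bounds[p + 1]] for p in range(parts)]


def jacobi_decomposed(u, parts, iterations):
    """Same computation, done per sub-domain with interface exchange."""
    subs = split(u, parts)
    for _ in range(iterations):
        # Interface exchange: each sub-domain gets one value from each neighbor.
        # None marks a physical boundary where there is no neighbor.
        left_ghost = [None] + [subs[p - 1][-1] for p in range(1, parts)]
        right_ghost = [subs[p + 1][0] for p in range(parts - 1)] + [None]

        new_subs = []
        for p, sub in enumerate(subs):
            new = sub[:]
            for i in range(len(sub)):
                left = sub[i - 1] if i > 0 else left_ghost[p]
                right = sub[i + 1] if i < len(sub) - 1 else right_ghost[p]
                if left is None or right is None:
                    continue  # physical boundary node: value stays fixed
                new[i] = 0.5 * (left + right)
            new_subs.append(new)
        subs = new_subs

    # Collect the sub-domain results into the full-domain result.
    return [value for sub in subs for value in sub]


initial = [0.0] * 11 + [100.0]
serial = jacobi_serial(initial, 50)
decomposed = jacobi_decomposed(initial, 3, 50)
print(serial == decomposed)
```

Only the values at sub-domain ends are exchanged, which is why the amount of communication depends on the size of the interfaces and not on the size of the sub-domains themselves.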
Functions for Parallel Computing
The functions of parallel computing are as follows.
Capability: Reduce computing time and increase the capacity for large-scale computing. Parallel computing can
• Reduce the computing time of Moldex3D analysis for solid models, helping users evaluate design parameters for design revision or optimization more effectively.
• Increase the capacity for large-scale computing on huge meshes, helping users handle more complicated models.
The Mesh Partition Kernel
For parallel computing, the geometry model and its corresponding mesh have to be split into several divisions according to the number of CPU cores participating in parallel computing. Since the previous version (R9.1), a new mesh partition kernel, METIS, developed by the Karypis Laboratory at the University of Minnesota, has been applied. This mesh partition kernel improves the efficiency of parallel computing by reducing the number of partition interfaces, which in turn reduces the data throughput required between CPU cores. As a result, parallel efficiency remains high even as the number of sub-domains increases.
Domain decomposition for a model. Each color represents a different sub-domain.
For detailed information about METIS, please refer to http://glaros.dtc.umn.edu/gkhome/index.php.
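The benefit of fewer partition interfaces can be made concrete by counting how many neighboring cells end up in different sub-domains (often called the edge cut). The sketch below does not call METIS and does not reproduce its algorithm. It only evaluates two hand-made partitions of a small, hypothetical 4 x 8 grid of cells, and all names in it are illustrative.

```python
def grid_edges(rows, cols):
    """Adjacency of a rows x cols grid of cells: pairs of cells sharing a side."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            cell = r * cols + c
            if c + 1 < cols:
                edges.append((cell, cell + 1))      # neighbor to the right
            if r + 1 < rows:
                edges.append((cell, cell + cols))   # neighbor below
    return edges


def edge_cut(edges, part_of):
    """Number of adjacent cell pairs assigned to different sub-domains."""
    return sum(1 for a, b in edges if part_of[a] != part_of[b])


rows, cols = 4, 8
edges = grid_edges(rows, cols)

# Two ways to give each of 2 CPU cores 16 cells (equal load in both cases).
# blocks: left half of the columns to core 0, right half to core 1.
blocks = [0 if (i % cols) < cols // 2 else 1 for i in range(rows * cols)]
# stripes: alternate columns between core 0 and core 1.
stripes = [(i % cols) % 2 for i in range(rows * cols)]

print("blocks edge cut: ", edge_cut(edges, blocks))
print("stripes edge cut:", edge_cut(edges, stripes))
```

Both partitions are perfectly load balanced, but the compact one has far fewer interface pairs, so less data would need to be exchanged between the two cores at every step.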