Unstructured Mesh Technologies
Enabling the high-fidelity simulation of physical phenomena on complex geometries by developing and deploying unstructured mesh technologies for leadership-class computers.
Unstructured meshes can yield required levels of accuracy using far fewer degrees of freedom, at the cost of more complex data structures and algorithms to achieve parallel scalability. In most cases, application developers lack the time and expertise to develop the tools necessary to take full advantage of unstructured meshes for their applications. The goal of the FASTMath unstructured meshing team has been, and continues to be, to develop interoperable tools that application developers can employ within their workflows at different levels of abstraction and granularity, so that they can take full advantage of unstructured mesh technologies. Specifically, FASTMath provides technologies for:
- Unstructured mesh management for parallel meshes being manipulated on the latest parallel computing systems.
- Advanced unstructured mesh PDE discretization methods that address a wide range of physics and include both standard order and high-order methods.
- Parallel anisotropic mesh adaptation methods that can account for highly anisotropic physics, evolving solution features, and evolving domain geometry.
- Load balancing and task placement tools to support a wide range of application needs, and scalable, multi-criteria, dynamic partitioning methods that reduce time spent in communication and computation.
- Support for the effective execution of uncertainty quantification using adaptive unstructured meshes.
- Support for in situ data analytics and visualization in workflows using unstructured meshes.
- Parallel unstructured meshes with full support of particle methods.
- Parallel solution field management tools including error estimation methods.
Unstructured mesh management: The Parallel Unstructured Mesh Infrastructure (PUMI) and Omega_h support parallel services on distributed unstructured meshes. PUMI supports a general range of element types, including boundary layer meshes and curved meshes needed for high-order methods. Omega_h is focused on efficient management of linear simplicial meshes using a broad range of accelerator hardware including GPUs.
Unstructured mesh PDE discretization technologies: The FASTMath unstructured mesh team supports three unstructured mesh finite element codes capable of addressing a wide range of application needs. These codes are being applied to DOE, DoD, and industrial applications. The codes are:
- Albany – An implicit, unstructured grid, finite element code for the solution and analysis of multiphysics problems.
- MFEM – A scalable C++ library for finite element discretizations of partial differential equations on high-order unstructured grids.
- PHASTA – Models compressible or incompressible, laminar or turbulent, steady or unsteady flows in 3D, using unstructured grids.
Parallel anisotropic mesh adaptation: Tools are being developed for parallel conforming and non-conforming mesh adaptation for curved domains that can evolve in time, including curved mesh entities for high-order methods. The procedures have adapted meshes of nearly 100 billion elements on ¾ million cores using 3 million processes. The generalized mesh adaptation tools have been integrated into a wide range of unstructured finite element analysis codes.
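One common way to drive anisotropic adaptation is to measure edge lengths in a metric field and then split edges that are too long and collapse edges that are too short. The sketch below illustrates that idea in 2D. It is not the FASTMath adaptation code: the function names, the diagonal metric built from target sizes `h_x` and `h_y`, and the thresholds `sqrt(2)` and `1/sqrt(2)` are all assumptions chosen for illustration.

```python
import math


def anisotropic_metric(h_x, h_y):
    """Diagonal 2x2 metric requesting edge size h_x along x and h_y along y.

    Hypothetical helper: real metric fields are usually full SPD tensors
    that vary from point to point.
    """
    return ((1.0 / h_x ** 2, 0.0), (0.0, 1.0 / h_y ** 2))


def metric_edge_length(p, q, metric):
    """Length of the edge p -> q measured in the metric: sqrt(e^T M e)."""
    ex, ey = q[0] - p[0], q[1] - p[1]
    (m00, m01), (m10, m11) = metric
    return math.sqrt(ex * (m00 * ex + m01 * ey) + ey * (m10 * ex + m11 * ey))


def classify_edge(length, upper=math.sqrt(2.0), lower=1.0 / math.sqrt(2.0)):
    """Decide what to do with an edge whose ideal metric length is 1.

    The thresholds are an assumed convention, not taken from the source.
    """
    if length > upper:
        return "split"
    if length < lower:
        return "collapse"
    return "keep"


# Request coarse resolution along x and fine resolution along y,
# as one might near a boundary layer.
metric = anisotropic_metric(h_x=1.0, h_y=0.1)
for edge in [((0.0, 0.0), (1.0, 0.0)), ((0.0, 0.0), (0.0, 1.0))]:
    length = metric_edge_length(edge[0], edge[1], metric)
    print(edge, round(length, 3), classify_edge(length))
```

The two sample edges have the same Euclidean length, but only the one aligned with the fine direction is marked for splitting, which is the behavior that makes metric-based adaptation anisotropic.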
Load balancing and task placement: The Zoltan and Zoltan2 load balancing libraries have been and continue to be developed by FASTMath team members. Development efforts are focused on architecture-aware load balancing, multi-criteria fast dynamic load balancing, and MPI task placement to reduce application communication costs.
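To make the idea of geometric load balancing concrete, here is a minimal sketch of recursive coordinate bisection, a classic geometric partitioning scheme. It is a serial, unweighted illustration only: it does not use the Zoltan or Zoltan2 APIs, and the function `rcb` and its arguments are hypothetical names introduced here.

```python
def rcb(points, num_parts, ids=None):
    """Partition points into num_parts parts by recursive coordinate bisection.

    points    : list of coordinate tuples (any dimension)
    num_parts : number of parts requested
    ids       : indices of the points handled by this call (internal use)
    Returns a dict mapping point index -> part number.
    """
    if ids is None:
        ids = list(range(len(points)))
    if not ids:
        return {}
    if num_parts == 1:
        return {i: 0 for i in ids}

    # Cut perpendicular to the axis along which the point set is widest.
    def extent(axis):
        values = [points[i][axis] for i in ids]
        return max(values) - min(values)

    axis = max(range(len(points[0])), key=extent)
    ordered = sorted(ids, key=lambda i: (points[i][axis], i))

    # Give the left side a share of points proportional to its share of parts.
    left_parts = num_parts // 2
    cut = len(ordered) * left_parts // num_parts

    result = rcb(points, left_parts, ordered[:cut])
    right = rcb(points, num_parts - left_parts, ordered[cut:])
    for i, part in right.items():
        result[i] = part + left_parts
    return result


# A 4 x 4 grid of points split into four parts.
grid = [(x, y) for x in range(4) for y in range(4)]
parts = rcb(grid, 4)
print([sum(1 for p in parts.values() if p == k) for k in range(4)])
```

Production partitioners add weights, multiple criteria, and distributed-memory execution; the sketch only shows why geometric methods are cheap, since each level needs just a sort or median search along one coordinate.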
Uncertainty quantification using adaptive unstructured meshes: The importance of adaptive mesh control to the reduction of mesh discretization errors in forward analysis has been well established. Such error control is also a prerequisite for the application of UQ processes. As we move forward in SciDAC-4, coupled FASTMath unstructured mesh and UQ research will address adaptivity in the combined physical and stochastic spaces.
In situ data analytics and visualization: We will build on our preliminary in situ visualization and model reduction research to support the ability to control not only what is being viewed in situ, but to alter the simulation as it progresses by modifying the problem definition (e.g., boundary conditions, loads) and controlling the level of model fidelity being applied.
Parallel unstructured meshes for particle methods: To meet the need of multiple DOE applications to couple particle motion with unstructured mesh solves on increasingly large meshes, we are developing distributed mesh particle methods.
Parallel field management and error estimation: The Attached Parallel Fields (APF) tool supports parallel field information and its manipulation as required by operations that range from results visualization, to error estimation, to uncertainty quantification. Parallel error estimation developments include residual-based and goal-oriented error estimators, and anisotropic error indicators.
Advances in Unstructured Mesh Adaptation. Recent developments in parallel mesh adaptation techniques have focused on the following areas: (i) Movement to fully array-based implementations to support effective execution on upcoming parallel computing architectures. (ii) Development of a GPU-based version of mesh adaptation. (iii) Support for curved mesh adaptation with geometry of up to sixth order. (iv) Support for combined mesh motion and mesh adaptation for evolving geometry problems. (v) Boundary layer mesh adaptation with explicit control of the “in surface” and normal direction mesh adaptation.
Dynamic partitioning strategies. We delivered partitioners that provide high scalability and quality for applications. Zoltan2’s Multijagged (MJ) geometric partitioner scales to O(100K) cores; it is the default partitioner in MueLu (multigrid) and Nalu (wind energy). MJ can also reduce application communication costs by mapping interdependent MPI tasks to “nearby” cores; on 16K cores, MJ task placement reduced communication time in ACME/HOMME (atmospheric modeling) by 31%. While traditional graph partitioners lack scalability and use much more memory, our PuLP/XtraPuLP (Partitioning Using Label Propagation) partitioners work on trillion-edge graphs on O(10K) cores. Related graph analysis tools in HPCGraph (e.g., BFS, connected components) are useful in many scientific algorithms. ParMA has been extended and used in PHASTA to improve partition quality, yielding a reduction of up to 28% in linear algebra work on 512K cores. The diffusive partition improvement methods of ParMA have been extended to operate on a multi-graph-based representation of the application's work and dependency relations.
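Partition quality is commonly summarized by two numbers: the edge cut, which serves as a proxy for communication volume, and the load imbalance, the heaviest part's load divided by the average load. The sketch below computes both for a small graph. The function names and the dictionary-based partition layout are assumptions made for illustration; they are not the interfaces of Zoltan2, PuLP, or ParMA.

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints are assigned to different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, num_parts, weights=None):
    """Heaviest part load divided by the average part load (1.0 is perfect).

    weights is an optional dict of per-vertex work; each vertex counts as 1
    when it is omitted.
    """
    loads = [0.0] * num_parts
    for vertex, p in part.items():
        loads[p] += 1.0 if weights is None else weights[vertex]
    return max(loads) / (sum(loads) / num_parts)


# A path graph 0 - 1 - 2 - 3 split two different ways.
edges = [(0, 1), (1, 2), (2, 3)]
contiguous = {0: 0, 1: 0, 2: 1, 3: 1}
interleaved = {0: 0, 1: 1, 2: 0, 3: 1}
print(edge_cut(edges, contiguous), imbalance(contiguous, 2))
print(edge_cut(edges, interleaved), imbalance(interleaved, 2))
```

Both partitions are perfectly balanced, but the interleaved one cuts every edge. A multi-criteria partitioner has to track several such measures at once, for example separate balance constraints for different kinds of mesh entities.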
Unstructured Mesh and Uncertainty Quantification. We have demonstrated the application of anisotropic mesh/h-adaptivity in the physical domain and p-adaptivity (i.e., a spatially varying setting) for the generalized polynomial chaos (gPC) basis in the stochastic domain to control discretization error in the joint physical and stochastic space and to increase the effectiveness and reliability of UQ processes. We have applied joint adaptivity to complex problems involving uncertain material parameters, source terms, and/or boundary conditions, spanning advective–diffusive regimes.
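As a small illustration of what a gPC expansion provides, the sketch below projects a scalar function of one uniform random variable on [-1, 1] onto Legendre polynomials and reads the mean and variance off the coefficients. It is a one-dimensional toy, not the joint physical and stochastic adaptive procedure described above; the function names and the fixed 3-point quadrature rule are assumptions chosen to keep the example short.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5,
# so the projections below are exact only while deg(f) + order <= 5.
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def gpc_coefficients(f, order):
    """Coefficients c_n of f(xi) ~ sum_n c_n P_n(xi), xi uniform on [-1, 1]."""
    coeffs = []
    for n in range(order + 1):
        # E[f P_n] under the uniform density 1/2.
        projection = 0.5 * sum(
            w * f(x) * legendre(n, x) for x, w in zip(NODES, WEIGHTS)
        )
        # Divide by E[P_n^2] = 1 / (2n + 1).
        coeffs.append(projection * (2 * n + 1))
    return coeffs


def mean_and_variance(coeffs):
    """Mean is c_0; variance is the sum of c_n^2 * E[P_n^2] over n >= 1."""
    variance = sum(c * c / (2 * n + 1) for n, c in enumerate(coeffs) if n > 0)
    return coeffs[0], variance


coeffs = gpc_coefficients(lambda xi: xi * xi, order=2)
print(coeffs, mean_and_variance(coeffs))
```

In a p-adaptive setting, the decay of the higher-order coefficients is one natural indicator of whether the polynomial order in the stochastic space is sufficient at a given location.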
In Situ Visualization. As part of a joint FASTMath and SDEV (now RAPIDS) activity, we are developing an in situ visualization capability coupled with the PHASTA CFD code, with abstraction to enable general use within PDE solvers. In-memory adaptivity and dynamically reconfigurable visualization have been applied at full machine scale on multiple LCF systems at ANL. There is ongoing work to extend this capability to SciDAC applications and to other LCFs. Further effort is required to enhance the capability to support both automated and user-guided adaptivity and problem redefinition.
Integration of unstructured meshing technology into simulation workflows. Strategies for the in-memory integration of mesh adaptation into existing unstructured mesh analysis codes were defined, and the API and data stream tools needed to support the implementation of those strategies were developed. Using these technologies, in-memory integration of FASTMath parallel mesh adaptation tools has been carried out with the ACE3P (SLAC), Albany (Sandia), MFEM (LLNL), M3D-C1 (PPPL), and XGC (PPPL) DOE codes, as well as the FUN3D (NASA), PHASTA (Colorado/RPI), and Proteus (DoD) codes.
High-Order Finite Element Methods: Recent developments in large-scale computational science make it increasingly clear that high-order discretizations have the potential to achieve optimal performance and deliver fast, efficient, and accurate simulations on exascale systems. A key to achieving this efficiency at high orders is the use of matrix-free algorithms that optimize the amount of computation performed per memory transfer. The MFEM project is developing new theory, advanced algorithms, and implementations for these types of high-order discretizations that focus on generality, ease of use in applications, and high performance on the full range of modern computing architectures. A main component of this work is research to enable matrix-free high-order support in all components of the traditional simulation pipeline, including meshing, discretizations, adaptive mesh refinement, solvers, and visualization. In the FASTMath institute, we are further developing these technologies for a wide range of applications, including electromagnetic simulations of interest to SciDAC application partners.
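A standard ingredient of matrix-free high-order methods on tensor-product elements is sum factorization: an operator with Kronecker structure is applied one dimension at a time instead of being assembled. The sketch below shows the 2D case in plain Python and checks it against the assembled form. It is a generic illustration, not MFEM code, and all function names are hypothetical.

```python
def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    return [list(row) for row in zip(*X)]


def apply_sum_factorized(A, B, U):
    """Action of kron(A, B) on the row-major vector of U, computed as A U B^T.

    The Kronecker product is never formed: for n x n factors this costs
    O(n^3) operations instead of the O(n^4) of an assembled operator.
    """
    return matmul(matmul(A, U), transpose(B))


def apply_assembled(A, B, U):
    """Reference: build every entry A[i][k] * B[j][l] of kron(A, B) explicitly."""
    rows_a, cols_a = len(A), len(A[0])
    rows_b, cols_b = len(B), len(B[0])
    out = [[0.0] * rows_b for _ in range(rows_a)]
    for i in range(rows_a):
        for j in range(rows_b):
            out[i][j] = sum(
                A[i][k] * B[j][l] * U[k][l]
                for k in range(cols_a)
                for l in range(cols_b)
            )
    return out


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
U = [[1, 2], [3, 4]]
print(apply_sum_factorized(A, B, U))
print(apply_assembled(A, B, U))
```

The gap widens in 3D, where the assembled element operator has O(n^6) entries while the factorized application costs O(n^4), which is why the amount of computation per memory transfer becomes the quantity to optimize at high order.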