HPC Servers and LS-DYNA
As the computational demands of FEM simulation have grown over the
past several years, with models continuing to grow in complexity,
traditional solution methods have become inadequate. Applying
distributed computing techniques, LSTC has developed a version of
LS-DYNA that can run today’s large models in reasonable times on a wide
range of available hardware. In essence, the problem to be modeled is
split into pieces (domains), and each piece is simulated on a different
processor. Coordination between the simulations is of course required at
the domain boundaries. Contact is a particularly difficult problem,
requiring cooperation between all the processors as the domains
interact. The communication involved produces overhead, which increases
with the number of domains. Consequently, there is a limit to the speed
that can be achieved. For a given problem, the simulation time generally
goes down as the number of processors increases, up to a point. Beyond
that point the speedup drops off and, if too many processors are used, the
simulation time begins to increase. Here is one example of
simulation run time as a function of the number of processors. In this
example the problem is a 450,000 element crash model being run on a
cluster of PCs.
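
The trade-off described above can be illustrated with a toy cost model.
This is only a sketch under stated assumptions, not LS-DYNA's actual
behavior: it assumes the compute work divides evenly across processors
and that communication overhead grows linearly with the number of
domains. The function names and the constants serial_time and
overhead_per_domain are hypothetical values chosen for illustration.

```python
# Toy cost model (an assumption for illustration, not measured LS-DYNA data):
# run time = evenly divided compute work + overhead that grows with domain count.

def run_time(procs, serial_time=1000.0, overhead_per_domain=2.0):
    """Modeled wall time, in arbitrary units, when using `procs` processors."""
    return serial_time / procs + overhead_per_domain * procs


def best_processor_count(max_procs, **model_params):
    """Processor count in 1..max_procs with the lowest modeled run time."""
    return min(range(1, max_procs + 1),
               key=lambda p: run_time(p, **model_params))


if __name__ == "__main__":
    for p in (1, 2, 4, 8, 16, 32, 64, 128):
        print(f"{p:4d} processors: {run_time(p):8.1f} time units")
    print("fastest at", best_processor_count(128), "processors")
```

With these made-up constants the modeled run time falls until roughly
two dozen processors and then rises again, which is the qualitative
shape described in the text. Real overhead depends on the model, the
contact definitions, and the interconnect.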
Currently, the largest application areas for the MPP version of LS-DYNA
are in automotive crash and metal forming. One of LSTC's customers has
been running production sheet metal stamping simulations using MPP-DYNA
for several years. Their problems routinely have 1 million elements, and
they achieve overnight turnaround times on a 30-processor system.
Recent advancements in PC hardware have brought PC clusters to the
attention of large corporations as viable alternatives to vector
supercomputers. One auto manufacturer has a rack-mounted PC cluster of
384 processors on which they perform production simulations utilizing
MPP-DYNA. They can run 24 simultaneous 16-processor problems, and have
overnight turnaround on typical crash models with over 600,000 elements.
Another carmaker recently benchmarked a PC cluster using a 700,000-node
ODB (offset deformable barrier impact) model. They found that running on
7 processors they could match the speed of their current-model vector
supercomputer. On 48 processors it was 7 times faster, and still
scaling. This kind of speed is simply unattainable using traditional SMP
programming techniques as implemented in LS-DYNA.
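
The figures quoted above can be checked with a little arithmetic. The
sketch below is illustrative only: the helper names are hypothetical,
and the inputs are the rounded numbers from the text (384 processors
split into 16-processor jobs, and a 7-fold speedup when going from 7 to
48 processors), so the result is approximate.

```python
def concurrent_jobs(total_procs, procs_per_job):
    """Number of equal-sized jobs that fit on the cluster at once."""
    return total_procs // procs_per_job


def scaling_efficiency(base_procs, procs, observed_speedup):
    """Observed speedup divided by the ideal (linear) speedup."""
    ideal_speedup = procs / base_procs
    return observed_speedup / ideal_speedup


if __name__ == "__main__":
    # 384-processor cluster running 16-processor problems
    print(concurrent_jobs(384, 16))
    # 7x faster on 48 processors, relative to the 7-processor run
    print(round(scaling_efficiency(7, 48, 7.0), 2))
```

An efficiency near 1 means the 48-processor run scaled close to
linearly relative to the 7-processor run, which is consistent with the
observation that the benchmark was still scaling.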
Distributed implicit solution methods are being developed at LSTC. Implicit
methods are better suited to certain kinds of problems than explicit
methods. By extending the implicit method to a distributed system, the
memory and computational power needed to solve large, complex systems become
readily available. The delay in the implicit methodology in LS-DYNA is
primarily related to the effort required to add constitutive matrices to
each material model, the implicit extension of every constraint option
including contact, and the development of stiffness matrices for each
explicit element. This work is nearing completion, so our efforts will
now center on the MPP implementation.
