NA-ASC-500-11 Issue 17
The Meisner Minute
Editorial by Bob Meisner
HQ. Those of you who follow Washington intrigue and read the bills offered by the House and Senate Authorizers and Appropriators may have already noted that Congress is supportive of the Energy Department’s drive to exascale. The bills will need to be reconciled before becoming law, but the language in each supports our joint efforts with the Advanced Scientific Computing Research (ASCR) Program and spurs us on to build more comprehensive plans.
The main thrust in this effort has focused on sending the recent Request for Information (RFI) to industry, soliciting information about the feasibility of delivering “platform and crosscutting co-design and critical research and development technologies targeted at deploying exascale computers by 2019-2020.” The RFI elicited 22 responses that are being used to inform our joint planning with ASCR. The E7 labs (Exascale 7: Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, Pacific Northwest, and Sandia national laboratories) have been engaged in letting the RFI and organizing the substantial responses received from industry. As a result, we here in Washington, DC, are well on our way to building a joint plan.
During the last quarter, the NNSA Administrator assigned us a new mission of broader national security, which includes counter-terrorism, non-proliferation, forensics, and emergency response. I know you have supported these areas out of hide for years. But now that we have been tasked to support these mission areas, we will (over the coming years) have program resources to support them. Our challenge during the next year will be to meet the heightened expectations of these new ASC-supported missions, expectations that you have built through your pioneering operations and use of high-performance computing (HPC).
As we consider how to take on these new missions, fully expecting to acquire and operate platforms, we will continue to use the ASC model of centralized capability and decentralized capacity computing, enabled by the Tri-Lab Linux Capacity Cluster (TLCC) and TOSS. Implementation of this approach began this past year with delivery of small TLCC-1-compatible platforms to the labs. Over the next year you will see Red Storm retired and replaced with TLCC-2 scalable units at the National Security Computing Center.
While exascale looms on the horizon, other missions will begin to benefit from your HPC advances. Our professional reputation, earned through supporting the US stockpile, will now serve an expanded national nuclear security community in need of extreme scale computing. Team ASC makes the difference. Thank you for your service.
Lightning Arrestor Connector Breakdown Model Development
Researchers at Sandia National Laboratories recently calculated the breakdown voltage across rutile cylinders and rutile particles (see images below).
Rutile, or titanium dioxide, is a critical material that is embedded into Lightning Arrestor Connectors to control their breakdown voltage. The models use breakdown paths imaged in experiments and yield agreement with experiments to within 20%. Lightning Arrestor Connectors are safety-critical components in almost every stockpile nuclear weapon and this accomplishment is a significant step toward simulating their performance.
ASC Code Chosen for SciDAC Project
The ASC parallel dislocation dynamics code ParaDiS was selected for a collaborative pilot project with the next Office of Science (SC) “Scientific Discovery through Advanced Computing” Program, known as SciDAC.
The ParaDiS code, which has demonstrated scalability on 132,000 processors of BlueGene/L, enables study of the fundamental mechanisms of plasticity at the dislocation level of microstructure. It is developed at Lawrence Livermore National Laboratory (LLNL) as a key part of the multiscale ASC Physics and Engineering Models (PEM) effort in modeling strength. The success of this new collaboration will provide valuable experience and help to determine the trajectory of the collaboration between NNSA and SC.
This new collaboration may provide SciDAC resources for code refactoring to improve performance and prepare ParaDiS for future architectures. The effective incorporation of SciDAC tools into application codes is one of SciDAC’s goals.
ParaDiS is a massively parallel and specialized material physics code that incorporates dynamic load balancing. It is a workhorse code for the PEM sub-program at LLNL and has used significant amounts of the BlueGene/L machine for calculations underlying improved science-based strength models for a variety of materials. ParaDiS is freely distributable and is shared with the open science community.
CRASH: Predictive Science Using Strongly Radiative Shock Waves
The Center for Radiative Shock Hydrodynamics (CRASH), supported by the ASC Predictive Science Academic Alliances Program, seeks to advance predictive science by working with simulations and experimental data for a laser-driven shock tube with a strongly radiating shock. The experimental system of interest uses 3.8 kJ of laser energy to launch a shock wave in Xe gas at > 100 km/s. The CRASH modeling code is a solution-adaptive, radiation-hydrodynamics code that is typically run on the clusters at the NNSA laboratories. In support of upcoming experiments that will use an elliptical shock tube, CRASH recently completed the simulation whose output is shown in the figure.
Radiative shock in an elliptical tube at 13 ns. Blue shows beryllium, which was shocked and accelerated by the laser for 1 ns. Gold shows a gold washer. Red shows acrylic. Green shows polyimide, present initially as a tube with 25 µm thick walls and surrounded in the simulation by low-density polyimide gas. The black surface shows the shock front. The xenon gas is rendered as transparent. The simulation was performed on 1024 cores of Hera at LLNL and took 3.5 days of run time.
CRASH will generate a probabilistic prediction of the results of the experiments using statistical uncertainty quantification techniques. They will combine large numbers of simulation runs, using models of varying fidelity, with data from experiments that involve components of the entire system or circular tubes. In a recent demonstration of the method, CRASH has combined results of 1D and 2D simulations to calibrate uncertain physical parameters and successfully predicted the later shock location in systems with circular tubes.
The CRASH investigators, primarily from the University of Michigan and Texas A&M University, are actively involved with the NNSA laboratories. During the past five years, these labs have hired 14 of the Ph.D. graduates of the CRASH investigators.
ReALE Promises Dramatic Improvement in ASC Codes
Today, standard arbitrary-Lagrangian–Eulerian (ALE) methods are the core of the ASC FLAG code. The next generation of methods, called ReALE, allows the connectivity of the mesh to change dynamically during a calculation. The new method, developed at Los Alamos National Laboratory (LANL), shows potential for dramatically increasing the robustness and accuracy of ASC codes.
The ReALE method allows the connectivity of the mesh to change in the rezone phase, which leads to a general polygonal mesh and allows it to follow Lagrangian features of the mesh much better than standard ALE methods can. Standard ALE methods do not allow changes in mesh connectivity. Researchers from LANL presented ReALE at France’s government-funded technological research organization, CEA, and are collaborating with CEA.
An article by the collaborators is published in the Journal of Computational Physics. The article is number 10 in the SciVerse ScienceDirect Top 25 Hottest (most read) articles. Using a series of numerical examples, the article illustrates ReALE’s superiority over standard ALE methods without reconnection.
 R. Loubère, et al., “ReALE: A reconnection-based arbitrary-Lagrangian-Eulerian method,” J. Comp. Phys. 229 (12), 4724–4761 (2010).
High Performance and Portability to GPUs Demonstrated Simultaneously
Sandia researchers have demonstrated that key ASC computational tasks can be run portably and with high performance on a range of processors, including graphics processing units (GPUs). GPUs have very attractive performance characteristics but are notoriously difficult to program. Using the APIs in the Trilinos-Kokkos library, the team was able to demonstrate portability across NVIDIA GPUs and Xeon and Opteron microprocessors while achieving high performance on each. The work involved two important Sandia ASC algorithms: Hexahedral Gradient and Modified Gram-Schmidt. This advance will help shield application developers from optimization concerns for different architectures.
These results will appear in the IEEE 2011 Cluster Computing Workshop proceedings.
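For readers unfamiliar with Modified Gram-Schmidt, one of the two kernels named above, the following is a minimal plain-Python sketch of the textbook algorithm. It is not the Sandia implementation and uses none of the Trilinos-Kokkos APIs; the function name and the test vectors are assumptions made for illustration only.

```python
import math


def modified_gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (lists of floats)."""
    basis = []
    for v in vectors:
        w = list(v)
        # "Modified" variant: each projection is computed against the running
        # remainder w rather than the original v, which is numerically more
        # stable than classical Gram-Schmidt.
        for q in basis:
            coeff = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            # Only catches exact dependence; a real code would use a tolerance.
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


# Illustrative input (assumed, not from the article).
q = modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

The inner loop over `basis` is a sequence of dot products and vector updates, which is why the kernel maps naturally onto reduction and element-wise parallel loops.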
The Trilinos Project provides high performance libraries for hundreds of applications on all major computing platforms. Presently there are several distinct many-core node architectures and even more parallel programming models. Mathematical libraries such as Trilinos must be able to support a variety of users and be compatible with the programming models commonly found in applications that call Trilinos functions. At the same time, simple parallel patterns such as parallel_for, which specifies that the iterations of a loop can be executed independently and in any order, and parallel_reduce, which is similar except that a collective operation is part of the loop body, are present in all parallel programming models and differ only in the details of expressing the loop parallelism. Furthermore, parallel_for and parallel_reduce patterns are ubiquitous in scientific and engineering applications.
The Kokkos Node API supports Trilinos developers and users in the generic coding of parallel_for and parallel_reduce constructs. The programmer writes a loop body once in the Kokkos framework using standard C++ functor notation, and the Kokkos compile-time framework compiles the generic expression for any supported node type, including serial (still an important target), pthreads, Intel Threading Building Blocks (TBB), and CUDA, with OpenMP coming soon. Furthermore, the code that is generated via Kokkos does not compromise performance.
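The idea of writing a loop body once and running it on different node types can be sketched in a few lines. The example below is an illustration in Python, not the Kokkos C++ API: the class names `SerialNode` and `ThreadNode`, the functor method names, and the data are all assumptions chosen for this sketch, and Python threads will not show the performance benefits described in the article.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class SerialNode:
    """Hypothetical node type: runs iterations one after another."""

    def parallel_for(self, begin, end, functor):
        for i in range(begin, end):
            functor.execute(i)

    def parallel_reduce(self, begin, end, functor):
        result = functor.identity()
        for i in range(begin, end):
            result = functor.reduce(result, functor.generate(i))
        return result


class ThreadNode:
    """Hypothetical node type: hands iterations to a thread pool."""

    def __init__(self, workers=4):
        self.workers = workers

    def parallel_for(self, begin, end, functor):
        with ThreadPoolExecutor(self.workers) as pool:
            # list() waits for completion and re-raises worker exceptions.
            list(pool.map(functor.execute, range(begin, end)))

    def parallel_reduce(self, begin, end, functor):
        with ThreadPoolExecutor(self.workers) as pool:
            partials = list(pool.map(functor.generate, range(begin, end)))
        # Combine per-iteration values with the functor's collective operation.
        return reduce(functor.reduce, partials, functor.identity())


class Axpy:
    """Loop body for y[i] += a * x[i]; iterations are independent."""

    def __init__(self, a, x, y):
        self.a, self.x, self.y = a, x, y

    def execute(self, i):
        self.y[i] += self.a * self.x[i]


class Dot:
    """Loop body for a dot product; the sum is the collective operation."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def identity(self):
        return 0.0

    def generate(self, i):
        return self.x[i] * self.y[i]

    def reduce(self, a, b):
        return a + b


def run(node):
    """Run the same two loop bodies on whichever node type is passed in."""
    x = [1.0, 2.0, 3.0, 4.0]
    y = [10.0, 20.0, 30.0, 40.0]
    node.parallel_for(0, len(x), Axpy(2.0, x, y))
    return y, node.parallel_reduce(0, len(x), Dot(x, y))
```

Calling `run(SerialNode())` and `run(ThreadNode())` executes identical functor code; only the node decides how iterations are scheduled, which is the separation of concerns the paragraph above describes.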
Debugging Millions of Processes and Winning an R&D 100 Award
A team of Lawrence Livermore National Laboratory (LLNL) computer scientists has won a prestigious R&D 100 Award from the trade journal R&D Magazine for developing a highly scalable debugging tool for identifying errors in computer codes running on supercomputers with 100,000 processor cores and above.
Their work, done in collaboration with researchers from the University of Wisconsin and the University of New Mexico, produced a technology known as the Stack Trace Analysis Tool, or STAT.
Today's largest supercomputers contain hundreds of thousands of processor cores and cost hundreds of millions of dollars. Single faults that disable a small part of a computer code can bring the entire program to a sudden halt, introducing major costs.
STAT is the first tool designed specifically to tackle the challenges of debugging at large scales with the goal of maintaining prompt response times. The tool works on the principle of detecting and grouping similar processes at suspicious points in a program's execution. This permits users to reduce the problem they are trying to debug to only a small number of processes by picking representatives from each group instead of debugging all processes at the same time.
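The grouping principle described above can be illustrated with a small sketch. This is not STAT's implementation or interface; the function names and the per-rank stack traces below are made up for illustration, and a real tool gathers traces from running processes rather than from a dictionary.

```python
from collections import defaultdict


def group_by_stack_trace(traces):
    """Map each distinct call stack to the sorted list of ranks showing it."""
    groups = defaultdict(list)
    for rank, frames in traces.items():
        groups[tuple(frames)].append(rank)
    return {stack: sorted(ranks) for stack, ranks in groups.items()}


def pick_representatives(groups):
    """Choose the lowest rank of each group as the process to debug."""
    return sorted(ranks[0] for ranks in groups.values())


# Made-up traces: rank 2 is still computing while the others wait at a barrier.
traces = {
    0: ["main", "solve", "MPI_Barrier"],
    1: ["main", "solve", "MPI_Barrier"],
    2: ["main", "solve", "compute_flux"],
    3: ["main", "solve", "MPI_Barrier"],
}
groups = group_by_stack_trace(traces)
representatives = pick_representatives(groups)
```

With these inputs, four processes collapse into two groups, so a user would attach a conventional debugger to two representative ranks instead of all four. The same reduction is what makes the approach attractive at hundreds of thousands of processes.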
STAT also includes a powerful graphical user interface that allows the user to identify where a bug exists in an application quickly. The interface can automatically perform several operations that analyze the state of the application and pinpoint potential locations of a bug.
New Discoveries Bring Scientists Closer to a Predictive Theory of Fission
Lawrence Livermore National Laboratory (LLNL) fission theorists have made an important step in quantifying a part of the fission process known as scission: the point at which one fissioning nucleus becomes two fission fragments. The theorists are now determining how the total energy released during fission is partitioned to individual fission fragments.
Coupled with high-performance computing, these calculations represent a key first step in understanding the properties of fission fragments and their impact on program metrics, which will ultimately lead to a predictive theory of fission.
A predictive and comprehensive theory of nuclear fission is critical to applications such as nuclear materials detection, nuclear energy, and stockpile stewardship, but has proven a daunting challenge since the discovery of fission in the 1930s. The recent LLNL work on the fundamental nature of scission uses a concept analogous to that of “Localized Molecular Orbitals” from molecular physics and quantum chemistry to solve the longstanding question of how to follow continuously the evolution of one quantum system (the fissioning nucleus) into two sub-systems (the fragments).
 Report from DOE/NNSA-sponsored workshop on “Scientific Grand Challenges for National Security: The Role of Computing at the Extreme Scale,” Washington D.C. (2009). http://science.energy.gov/~/media/ascr/pdf/program-documents/docs/Nnsa_g...
 W. Younes and D. Gogny, accepted for publication in Physical Review Letters (2011).
Appro Selected to Develop New Capacity Computing Systems
The ASC Program has selected Silicon Valley-based supercomputing provider Appro to expand the weapons complex's supercomputing capacity and bolster computing for stockpile stewardship at NNSA's three national security laboratories.
The Tri-Lab Linux Capacity Cluster 2 (TLCC2) award is a multi-million-dollar, multi-year contract to provide multiple procurement options exceeding 3 petaFLOP/s in “capacity” computing. Under the terms of the contract, computing clusters built of scalable units (SUs) will be delivered to each of the laboratories between September 2011 and June 2012. Each SU represents 50 teraFLOP/s of peak computing power. The SUs are designed to be interconnected to create more powerful systems. SUs will be divided among the three labs, with each lab configuring its SUs into clusters according to mission needs. These computing clusters will provide needed computing capacity for NNSA's day-to-day work managing the nation's nuclear deterrent.
Starting in late September 2011, Lawrence Livermore National Laboratory (LLNL) is scheduled to receive the first of 18 SUs, which will be combined into a single classified cluster. TLCC2 was designed to allow LLNL users to quickly and effectively utilize the new systems. LLNL will bring in additional SUs to support the program's unclassified capacity needs, including ASC Alliance allocations.
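The figures quoted above can be combined with simple arithmetic. The short sketch below assumes that 1 petaFLOP/s equals 1,000 teraFLOP/s and that peak performance adds linearly across SUs; the variable names are chosen for illustration.

```python
import math

TFLOPS_PER_SU = 50            # peak per scalable unit, from the article
LLNL_CLASSIFIED_SUS = 18      # SUs in LLNL's single classified cluster
CONTRACT_OPTIONS_PFLOPS = 3   # procurement options exceed this figure

# Peak of the LLNL classified cluster, in teraFLOP/s.
llnl_cluster_tflops = LLNL_CLASSIFIED_SUS * TFLOPS_PER_SU

# Number of SUs needed to reach 3 petaFLOP/s at 50 teraFLOP/s each.
sus_for_contract_options = math.ceil(CONTRACT_OPTIONS_PFLOPS * 1000 / TFLOPS_PER_SU)
```

Under these assumptions the 18-SU cluster peaks at 900 teraFLOP/s, and procurement options exceeding 3 petaFLOP/s correspond to more than 60 SUs across the three labs.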
TLCC2 is NNSA's second joint procurement of this type and will replace the clusters procured in 2007 that are nearing retirement. This tri-lab procurement model reduces costs through economies of scale based on standardized hardware and software environments at the three labs.
Collaborations Continue Between French and American Computing Sciences at Annual Workshop
The annual NNSA/ASC and CEA/DAM Computing Sciences Workshop was held in Sedona, Arizona, June 6-9, 2011. CEA/DAM is the military applications division of the French Atomic Energy and Alternative Energies Commission.
Hosted by Sandia National Laboratories and attended by staff from the three ASC laboratories, NNSA-HQ, and CEA/DAM, the 10th annual workshop included two additional days for in-depth technical exchanges on HPC operations and various research topics.
Ongoing collaborations in meshing and partitioning, visualization and data analysis, and I/O and parallel file systems were described as vibrant and healthy. Many current and future examples of cross-laboratory collaborations, co-organizing conferences, co-authoring papers, and sharing new capabilities development were cited as indicators.
Future collaborations discussed during the workshop included interest in:
ASC’s International Leadership on Display at ISC’11
ASC’s international leadership in scientific computing and technology research and development was on display at the 26th International Supercomputing Conference (ISC’11) in Hamburg, Germany, in June 2011. About 2,000 attendees and 140 exhibitors from more than 45 countries attended ISC'11.
Department of Energy systems continue to demonstrate leadership with four of the top ten computers on the TOP500 list announced at the conference. Oak Ridge's Jaguar ranked No. 3; the ASC Program's Cielo, sited at Los Alamos, ranked No. 6; NERSC's Hopper ranked No. 8; and the ASC Program's Roadrunner, also sited at Los Alamos, ranked No. 10. The rankings, which are issued every six months, serve as a reminder of how fast computer power is advancing. For example, BlueGene/L was the top-ranked computer in November 2007 (only 3.5 years ago), and it is now in 14th place.
The Lawrence Livermore National Laboratory (LLNL) booth showcased examples of LLNL’s extraordinary high-performance computing (HPC) research and science through simulations, posters, articles, and publications. This is the third year LLNL has participated in a booth at the conference, and plans are already underway for next year.
Two ISC'11 announcements recognized LLNL employee contributions. The International Data Corporation (IDC) awarded Kambiz Salari the HPC Innovation Excellence Award for using modeling and simulation to find practical ways to reduce aerodynamic drag and improve the fuel efficiency of the tractor trailers ubiquitous on America's highways. The second award, No. 7 on the Graph 500, went to Roger Pearce, Maya Gokhale, and Nancy Amato for traversing massive graphs with NAND Flash.
Summer Workshop Prepares Tomorrow’s Computational Physicists
The first annual Computational Physics Student Summer Workshop, coordinated by Dr. Scott Runnels, was held at LANL June 13 – Aug. 8, 2011. Twenty-one graduate and undergraduate students from across the country were selected from an applicant pool of more than forty for admission into the summer workshop. The workshop was sponsored by the Computational Physics Division at LANL and funded largely by LANL’s ASC Program. Computational physics is important to the country because it helps develop scientific models and solutions through computers and programming.
The students spent nine weeks of their summer vacation at LANL. Working from 8 a.m. to 5 p.m., Monday through Friday, they broke up their days from 10 a.m. to noon to attend lectures at the Los Alamos branch of the University of New Mexico on such topics as multimaterial mixing modeling, electromagnetic pulse simulation, and verification testing of production modeling software. The rest of the students’ days were spent on research projects, often in teams. The work is intended to lead eventually to publication of their research in scholarly journals or conference papers. During weekend downtime, the students were treated to recreational opportunities in places like Albuquerque and Roswell, New Mexico, and Durango, Colorado. They visited museums and tourist sites, hiked, white-water rafted, and rode in a hot-air balloon.
LANL ASC Program Creates Opportunities for Students
Los Alamos National Laboratory’s (LANL’s) ASC Program staff visited college and university career fairs in early 2011 to develop and enhance opportunities for collaboration and recruitment. On a trip to North Carolina Agricultural & Technical State University, a Historically Black College and University (HBCU), the ASC Program visitors identified a promising substantial technical collaboration.
The LANL ASC Program initiated a student pipeline with Florida A&M (FAMU), also an HBCU, and recipient of a Massie Chair grant. In Summer 2011, three students and one faculty member from FAMU came to Los Alamos. Dr. Andrew Jones, an FAMU Associate Professor of Mathematics, worked with and provided guidance to the FAMU students and gave technical presentations. Two of the students learned how to add a new capability to FLAG, a hydrodynamics code that is a product of the ASC Program. The third student participated in the Computational Physics Student Summer Workshop where she learned about diffusion and the finite difference method.
Other ASC-Program-sponsored students showed accomplishments in enterprise modeling for computing facilities as a result of their summer internships. Undergraduate students Selina Garcia, from New Mexico State University, and Douglas Keating, from the University of Wisconsin, began a project to improve computer facilities management at the LANL computing center.
Completed Contract Negotiations for the Sequoia Supercomputer Pave Way for 2011 Delivery
Contract negotiations with IBM were completed last week for the new ASC supercomputer, Sequoia. Contract targets were changed to hard requirements and a delivery, integration, and acceptance schedule was finalized.
Racks begin arriving in December 2011, with deliveries continuing through April 2012. Integration will take place in phases. Acceptance of the first half of the system is planned for April 2012, and final acceptance of the 96-rack system is scheduled for September 2012.
Sequoia Supercomputer Earns Top Ranking on Green500
IBM's BlueGene/Q, which will be deployed for the ASC Program at Lawrence Livermore National Laboratory (LLNL) in 2012 as Sequoia, has earned the title of the world's most efficient supercomputer from the Green500. A prototype of the BlueGene/Q next-generation system was announced in June as No. 1 on the Green500 list.
Energy efficiency, including performance per watt for the most computationally demanding workloads, has long been a goal of increasingly powerful supercomputing systems. Energy-efficient supercomputers allow users to realize critical cost savings by lowering power consumption, which reduces the expenses associated with cooling and makes it possible to scale to larger systems while keeping the power bill acceptable. The Green500 ranks systems by performance per watt rather than by speed alone as measured in floating-point operations per second (FLOPS). For more information, see the Green500 Web site.
BlueGene/Q is scheduled to be deployed in 2012 at two DOE national laboratories—Argonne National Laboratory and LLNL, both of which collaborated closely with IBM on the design of BlueGene, influencing many aspects of the system's software and hardware.
Designed to be a 20-petaFLOP/s system, Sequoia will be used by NNSA's Advanced Simulation and Computing (ASC) program to conduct stockpile stewardship research. Sequoia will be installed in the Terascale Simulation Facility starting in early 2012.
The complete IBM news release is available online.
ASC Salutes CSSE Program Manager David Daniel
Being in the fields of computer and computational science, LANL’s David Daniel is poised in the center of the next computing revolution. David is the program manager of the Computational Systems and Software Environment (CSSE) program element for LANL’s ASC Program. He is also the deputy group leader of the Applied Computer Science Group at LANL, a relatively new group with a staff of 26 people. The group is pairing computer and computational scientists with domain scientists, for example, physicists, to focus on easing the transition of applications onto next-generation architectures.
David has been a staff member in the Computer, Computational, and Statistical Science (CCS) Division since 2001, where he has worked on a number of projects including communication libraries (Open MPI) and the performance of scientific applications such as the Roadrunner Universe open-science project. He has degrees in physics: a B.S. from Imperial College in London and a Ph.D. from the University of Edinburgh. He first joined LANL in 1990 as a Director's Postdoctoral Fellow in the Theoretical Division focusing on simulations of lattice quantum chromodynamics (QCD). David left LANL and spent 8 years working in the high-performance and enterprise computing industry.
“David has deep and unique knowledge of a multitude of computational physics and computer science issues that are particularly relevant for the computing challenges that we face in the future,” says CCS Division Leader Stephen Lee, “which will need to be addressed through careful planning within CSSE and similar programs.”
The Applied Computer Science Group at LANL aims to be the vanguard for scientific applications at extreme scale through co-design of algorithms, programming models, system software, and tools. The highly innovative computing platform Darwin is deployed to this group. They will use Darwin to focus on the strategic goals of taming concurrency and power in forthcoming many-core and graphics-processor-based (GPU) systems. David and staff from the group are recognized leaders in computational co-design, and are key members of all three Exascale Co-Design Centers established this year to be a conduit for exascale computing. They are helping to plan the Department of Energy’s exascale strategy.
ASC Relevant Research
Sandia National Laboratories
Citations for Publications in 2011
Los Alamos National Laboratory
Citations for Publications in 2011
ASCeNews Quarterly Newsletter - September 2011