Testing the Edge of the Cloud
January 23, 2012 – Although federal researchers were generally upbeat about the agile capabilities of cloud computing, they found that shifting their niche data sets for scientific experiments to virtual environments might not hold the same savings seen in other industries.
The U.S. Department of Energy Office of Advanced Scientific Computing Research (ASCR) recently released its final report on the Magellan project, which investigated the roles cloud computing can play in midrange computing and data-intensive workloads in terms of performance, usability and cost. Over the last two years, the project examined in particular how the cloud could address the unique big data needs of research at the federal energy department’s Office of Science.
A cloud infrastructure testbed of more than 8,000 CPU cores and 1.4 petabytes of storage was deployed at Argonne Leadership Computing Facility and the National Energy Research Scientific Computing Center. With that infrastructure in place, researchers evaluated the performance of cloud models for IaaS and PaaS, virtual software packs, and MapReduce and Hadoop.
The report’s biggest knock against a move to the cloud was cost. Given the niche technical needs of scientific research and its data sets, along with existing customizations to the DOE’s high-performance computing (HPC) systems, a full commitment to the cloud would cost approximately four times more per year than the department’s present HPC budget. That estimate was based on pricing for comparable cloud functionality from Amazon’s on-demand service, though researchers went on to write that cost savings could be found with performance and customization adjustments in a mix of private cloud deployments and present HPC systems.
“Providing these capabilities would address many of the motivations that lead scientists to consider cloud computing while still preserving the benefits of typical HPC systems which are already optimized for scientific applications,” the report concluded.
In a research survey, a majority of federal science department users reported favorable opinions of the cloud when it came to software control, information sharing and accessibility to resources. Access to additional resources from a cloud deployment was cited by 79 percent of users as a benefit of deployment, and 90 percent of respondents said that their scientific peers outside of the department were interested or engaged in using the cloud with research.
In their conclusions, Magellan researchers stated that while open source virtualized cloud stacks improved during the course of the project, there were gaps in development and customization. As with the enterprise IT side of cloud use, administrative training and security measures would be needed for federal department use of the cloud. Researchers also noted that MapReduce showed “promise” with high-throughput scientific applications but, as with other cloud data workflow applications, it would require the adoption of more tools as well as layers of user configuration and training.
While researchers were testing the data limits of Magellan’s cloud, it also became a resource in a couple of scientific endeavors, such as quick analysis of suspected E. coli strains in a European outbreak and real-time processing of data from an ion collider.
The $32 million cloud testing project was funded under the American Recovery and Reinvestment Act. Expansion of cloud deployments has been a bedrock of top federal CIO Steven VanRoekel’s government IT planning and spending.
The full Magellan report runs 169 pages.