Data Centric Systems: A new paradigm for computing

Editor’s note: This article is by Tilak Agerwala, Vice President of Data Centric Systems at IBM Research

The use of high-performance computing (HPC) to model, simulate and analyze everything from astrophysics to zoology goes back to the dawn of the modern computer era. Since the 1950s, the models studied on HPC systems have increased in both scale and detail, with ever more sophisticated users calling for – and planning on – increased computational power. This power is expressed in Floating Point Operations per Second, or FLOPS. Aggregate installed FLOPS, as measured by the Top500, have increased exponentially since tracking began in 1993, going from less than 60 gigaflops to nearly 300 petaflops today. And the demand for increased FLOPS is not likely to abate for the foreseeable future.

However, we are now seeing a new trend that will dramatically change how HPC system design moves forward: the emergence of data as the world’s newest, and arguably largest, natural resource.

Today’s businesses, academic institutions and government agencies continue to have a multitude of applications for HPC, from fraud detection in transaction-based systems, like credit cards serving millions of users simultaneously, to computational simulations of the human heart beating in real time, cell by cell. But unlike 60 years ago, these models now support a trillion-fold more data – leaping from kilobytes to petabytes, and growing. This onslaught of structured and unstructured data requires a flexible computing architecture capable of handling the workloads that data at this scale demands. To prepare for the future, IBM has adopted Data Centric Systems (DCS) design as our new paradigm for computing.

The rationale is simple. As the size of data grows, the cost of moving data around becomes prohibitive. We must employ distributed intelligence to bring computation to the data.

How Data Centric Systems is like a supply chain

To illustrate this, consider the evolution of today’s sophisticated supply chains. Detailed engineering, design and development work is taken on by highly capable members of the supply chain. Production complexity is distributed by processing components outside of the final manufacturing plant, and the components are brought together just in time for final assembly. This distributed application of specialized skills reduces waste in transport, and also results in a higher level of quality, efficiency, and innovation throughout the manufacturing workflow.

Like today’s modern supply chains, our next-generation systems will minimize data in motion. As data volumes grow exponentially, the cost (in power and time) of moving data in and out of a central processor is becoming prohibitive. So why not move data less by running workloads where the data lives? Doing this will require “computing” at all levels of the system’s hierarchy by introducing active system elements such as network, memory and elastic storage. The architecture will also need to be composable, based on modular and upgradeable hardware design, and scalable from sub-rack to hundreds of racks. Finally, it must be built on an open platform, like our OpenPOWER, which allows deep innovation by partners at all levels of the ecosystem.
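To make the "move data less" argument concrete, here is a minimal Python sketch. It is an illustration only, not a description of any IBM system: the shard layout, the 8-byte value size, and the function names are all assumptions made for the example. It compares the bytes that cross the network when every raw value is shipped to a central processor with the bytes that cross when each storage location computes a small partial result first.

```python
BYTES_PER_VALUE = 8  # assumption: every value is a 64-bit float


def centralized_mean(shards):
    """Ship every raw value to one central processor, then compute the mean."""
    bytes_moved = 0
    all_values = []
    for shard in shards:
        bytes_moved += len(shard) * BYTES_PER_VALUE  # whole shard crosses the network
        all_values.extend(shard)
    return sum(all_values) / len(all_values), bytes_moved


def data_centric_mean(shards):
    """Compute a (sum, count) pair where each shard lives; ship only that pair."""
    bytes_moved = 0
    total, count = 0.0, 0
    for shard in shards:
        local_sum, local_count = sum(shard), len(shard)  # runs "at the data"
        bytes_moved += 2 * BYTES_PER_VALUE  # only two numbers leave the node
        total += local_sum
        count += local_count
    return total / count, bytes_moved


# Hypothetical data set: four storage nodes holding 1,000 readings each.
shards = [[float(i) for i in range(n * 1000, (n + 1) * 1000)] for n in range(4)]

central, central_bytes = centralized_mean(shards)
local, local_bytes = data_centric_mean(shards)

print("same answer:", central == local)
print("bytes moved, centralized: ", central_bytes)
print("bytes moved, data centric:", local_bytes)
```

Both approaches return the same mean, but in the second the traffic depends on the number of nodes rather than on the volume of data. That only works for computations that can be split into small partial results; workloads that cannot be decomposed this way would still need to move more data.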

Data Centric Systems will be future “Systems of Insight”

In 2012, Project Cardioid, a partnership with Lawrence Livermore National Laboratory, used Sequoia, the largest Blue Gene/Q system we've ever delivered, with over 700,000 cores and sustained performance of 12 petaflops, to simulate an entire human heart at nearly the cellular level. This groundbreaking simulation modeled 370 million pieces of the human heart, 1,200 times faster than previous, less-accurate simulations. Going from months to near-real-time simulation of thousands of beats is helping scientists understand disease, design and test drugs, and simulate other human organs.

Image: Blue Gene/Q Sequoia at LLNL

Let’s fast-forward a few years to exascale computing and full Data Centric Systems, which could be applied to everything from weather forecasting to finding oil and gas reserves. New oil and gas sources are increasingly difficult to find and access as companies explore ever more remote and challenging geologies, like the Arctic and the deep ocean. New sensor technologies provide a lens into these deeply buried hydrocarbon reserves, generating vast amounts of data.

While the data enables us to build much higher-resolution models, as with the heart simulation, the opportunity in this industry is in creating multiple simultaneous, coupled simulations to ultimately characterize the reservoir at the molecular level. With this level of insight, oil and gas recoveries can be optimized over time. But getting to this level of fidelity will require 100x the compute power of today's fastest, largest systems. We can begin to tackle this type of problem with current systems, but reaching the goal will ultimately require Data Centric Systems.

True Data Centric Systems will be realized when we can consolidate modeling, simulation, analytics, Big Data, machine learning and cognitive computing as “Systems of Insight.” They will provide these capabilities faster, with more flexibility and lower energy costs, effectively ushering in a new category of computers.

For more details, download my keynote presentation at the 2014 International Conference on Parallel Processing.
