Data Centric Systems: A new paradigm for computing

Editor’s note: This article is by Tilak Agerwala, Vice President of Data Centric Systems at IBM Research

The use of high-performance computing (HPC) to model, simulate and analyze everything from astrophysics to zoology goes back to the dawn of the modern computer era. Since the 1950s, the models studied on HPC systems have increased in both scale and detail, with ever more sophisticated users calling for – and planning on – increased computational power. This increase is expressed in floating-point operations per second, or FLOPS. Aggregate installed FLOPS, as measured by the Top500, have increased exponentially since tracking began in 1993, going from less than 60 gigaflops to nearly 300 petaflops today. And the demand for more FLOPS is not likely to abate for the foreseeable future.
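To put that growth in perspective, here is a back-of-the-envelope calculation (the 1993–2014 window is an assumption based on the article's era, not a figure from the Top500 itself):

```python
import math

# Aggregate Top500 performance: < 60 gigaflops (1993) to ~300 petaflops (~2014).
start_flops = 60e9        # 60 gigaflops
end_flops = 300e15        # 300 petaflops
years = 2014 - 1993

growth = end_flops / start_flops
# Implied doubling time, assuming steady exponential growth.
doubling_months = 12 * years * math.log(2) / math.log(growth)
print(f"{growth:,.0f}x growth, doubling roughly every {doubling_months:.0f} months")
```

That works out to a five-million-fold increase, a doubling roughly every 11 months – faster than Moore's Law.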

However, we are now seeing a new trend emerge that will dramatically change how HPC system design moves forward: the emergence of data as the world’s newest, and arguably largest, natural resource.

Today’s businesses, academic institutions and government agencies continue to have a multitude of applications for HPC, from fraud detection in transaction-based systems like credit cards serving millions of users simultaneously, to computational simulations of the human heart beating in real time, cell by cell. But unlike 60 years ago, these models now draw on a trillion-fold more data, leaping from kilobytes to petabytes, and growing. This onslaught of structured and unstructured data requires a flexible computing architecture capable of addressing the needs of the workloads this data scale demands. To prepare for the future, IBM has adopted Data Centric Systems (DCS) design as our new paradigm for computing.

The rationale is simple. As the size of data grows, the cost of moving data around becomes prohibitive. We must employ distributed intelligence to bring computation to the data. 

How Data Centric Systems are like a supply chain

To illustrate this, consider the evolution of today’s sophisticated supply chains. Detailed engineering, design and development work is taken on by highly capable members of the supply chain. Production complexity is distributed by processing components outside of the final manufacturing plant; these are brought together just in time for final assembly. This distributed application of specialized skills reduces waste in transport, and also results in a higher level of quality, efficiency, and innovation throughout the manufacturing workflow.

Like modern supply chains, our next-generation systems will minimize data in motion. As data volumes grow exponentially, the cost – in power and time – of moving data in and out of a central processor is becoming prohibitive. So why not move data less by running workloads where the data lives? Doing this will require “computing” at all levels of the system’s hierarchy, by introducing active system elements such as network, memory and elastic storage. The architecture will also need to be composable, based on a modular and upgradeable hardware design, and scalable from sub-rack to hundreds of racks. Finally, it must be built on an open platform, like OpenPOWER, which allows deep innovation by partners at all levels of the ecosystem.

Data Centric Systems will be future “Systems of Insight”

In 2012, Project Cardioid, a partnership with Lawrence Livermore National Lab, used Sequoia, the largest Blue Gene/Q system we've ever delivered, with over 700,000 cores and sustained performance of 12 petaflops, to simulate an entire human heart at nearly the cellular level. This groundbreaking simulation modeled 370 million pieces of the human heart, 1,200 times faster than previous, less-accurate simulations. Going from months to near-real time simulation of thousands of beats is helping scientists understand disease, design and test drugs, as well as simulate other human organs.

Blue Gene/Q Sequoia at LLNL
Let’s fast-forward a few years to exascale computing and full DCS systems, and consider how they can be applied to everything from weather forecasting to finding oil and gas reserves. New sources are increasingly difficult to find and access as companies explore ever more remote and challenging geologies, like the Arctic and the deep ocean. New sensor technologies provide a lens into these deeply buried hydrocarbon reserves, generating vast amounts of data.

While the data enables us to build much higher-resolution models, like with the heart simulation, the opportunity in this industry is in creating multiple simultaneous, coupled simulations to ultimately characterize the reservoir at the molecular level. With this level of insight, oil and gas recoveries can be optimized over time. But getting to this level of fidelity will require 100x the compute power of today's fastest, largest systems. We can begin to tackle this type of problem with current systems, but reaching the goal will ultimately require Data Centric Systems.

True Data Centric Systems will be realized when we can consolidate modeling, simulation, analytics, Big Data, machine learning and cognitive computing into “Systems of Insight.” They will provide these capabilities faster, with more flexibility and lower energy costs, effectively ushering in a new category of computers.

For more details, download my keynote presentation from the 2014 International Conference on Parallel Processing.

Winners of the First Heinrich Rohrer Medal

To celebrate the great achievements of the late Dr. Heinrich Rohrer, Nobel Laureate and IBM Fellow, and to further promote progress in research and development in the field of nanoscience and nanotechnology, an international prize was named after him in 2013 by The Surface Science Society of Japan (SSSJ), in collaboration with IBM Research - Zurich, the Swiss Embassy in Japan and his wife, Rose-Marie.

Dr. Rohrer was one of two inventors of the scanning tunneling microscope, which opened the world to nanotechnology.
Heinrich Rohrer with the STM in 1986.

The Heinrich Rohrer Medal comprises the Grand Medal, which goes to researchers who have made distinguished achievements in the field of nanoscience and nanotechnology based on surface science, and the Rising Medal, which goes to researchers 37 years old or younger who have achieved outstanding results and are expected to continue to play active roles in the field. The Medal includes cash prizes and will be awarded every three years at The International Symposium on Surface Science (ISSS), organized by SSSJ and next scheduled for November 2017.

The winner of the first Heinrich Rohrer Grand Medal is Roland Wiesendanger, Professor at the University of Hamburg, Germany, for his pioneering and ground-breaking achievements in spin-resolved scanning tunneling microscopy and spectroscopy, bringing deep insights into the spin-related properties of materials at the atomic scale.

The winners of the first Heinrich Rohrer Rising Medal are Yoshiaki Sugimoto, Associate Professor at Osaka University, Japan, for his outstanding contributions to the manipulation and chemical identification of individual atoms using atomic force microscopy; and Jan Hugo Dil, SNSF Professor at the Ecole Polytechnique Fédérale de Lausanne, Switzerland, for his leading and creative roles in identifying novel spin structures using synchrotron radiation-based spin- and angle-resolved photoemission spectroscopy.

After receiving their medals last night in Japan, they answered a few questions.

Question: As simply as possible, can you please explain your area of research?

Roland Wiesendanger (RW): The understanding of magnetism at the ultimate, atomic length scale is one of the current frontiers in solid state physics. It is a key to future applications in spin electronics and the highest density of magnetic data storage. Spin-polarized scanning tunneling microscopy and spectroscopy are powerful tools to access magnetic phenomena on a scale all the way down to the very atoms. During the past two decades a large variety of surprising magnetic structures were discovered by spin-polarized STM. Competing magnetic interactions effective at the atomic length scale give rise to unexpected ordered structures of great complexity in monolayers of magnetic atoms as well as in artificially created nanostructures built up atom-by-atom. Even the flow of spin-polarized currents through individual molecules could be studied with sub-molecular spatial resolution by spin-polarized STM.

Yoshiaki Sugimoto (YS): Everything is made of atoms, and therefore the various properties of materials can be attributed to their constituent elements and their arrangement at the nanometer scale. Techniques to investigate materials at the atomic scale are therefore required. Furthermore, the ability to assemble nanostructures with unique and specific properties is a key technology for developing next-generation devices. Several major advances are anticipated through the bottom-up approach: an attempt to create such nano-devices from the atomic or molecular level instead of miniaturizing from the macroscopic world. In the bottom-up approach, the ultimate limit is to fabricate artificial nanostructures on surfaces by manipulating single atoms or molecules one by one. In our laboratory, we are developing such atom manipulation techniques, as well as chemical identification and local characterization techniques, using scanning probe microscopy (SPM).

Jan Hugo Dil (JHD): My research aims at understanding the role that the spin of the electron plays in many novel materials. I especially focus on how this intrinsic magnetic moment of the electron is coupled to the path the electron takes in the solid; in technical terms spin-orbit coupling. This coupling can create spin-polarised electron currents at the surface or interface of a material, where the electrons that go to the right have one spin direction and those going to the left have the opposite spin direction. Furthermore, spin-orbit coupling can be used to change the spin of an electron that passes through the material in a controlled way and even to switch the magnetic state of a small magnet. Such effects and possibilities are very promising to use the spin of the electron instead of its charge to transport and control information, which is the aim of so-called spintronics. In our research we have been able to create a basic model of what determines the magnitude of these effects and we have found several materials with promising properties for spintronics.

Question: What does winning this award mean to you?

RW: As a young researcher working at the University of Basel in the 80s, I was greatly influenced by Heinrich Rohrer’s way of thinking. He always encouraged young researchers to find their own research topics guided by novelty and expected impact. “What would change if it could be done?” was one of the important questions he raised in order to steer research activities in a meaningful direction. Right from the early days of the development of the spin-polarized STM technique, I received strong encouragement from Prof. Rohrer, which continued after I moved to the University of Hamburg.

YS: I am honored to receive the Heinrich Rohrer Medal. This award is important not only for me, but also for the field of atomic force microscopy (AFM). The AFM field is still growing, and I hope that this award further promotes the development and application of AFM techniques.

JHD: Winning this award means that the hard work my co-workers and I have invested in this research is recognized, and it is a strong motivation to continue this line of work. I am too young to have had the time to interact directly with Heinrich Rohrer, but I am sure I speak for many others when I say that the work he did at the IBM Research lab in Zurich has been inspiring and has placed this research center even more firmly on the scientific map. As a side note, when I moved to Zurich I lived in the small town of Kilchberg, and I would often walk or run uphill past the IBM lab, which is located across the street from a farm. For me this lab still stands for a place where science can be conducted at a very high level, away from the hustle of the standard academic environment.

Question: What's next for your research?

RW: Having developed spin-polarized STM and applied it to many different material systems, including metallic thin films and nanostructures, semiconductors and oxides, as well as single atoms and molecules on surfaces, the next step will be applying spin-polarized STM to novel material systems, including low-dimensional carbon materials, topological insulators, and unconventional superconductors. Spin-polarized STM promises to unriddle the role of the spin degree of freedom in the fascinating properties of these novel material systems, down to the atomic scale.

YS: We are creating nanoclusters composed of several atoms, based on our chemical identification and atom manipulation techniques. We then investigate the properties of those clusters by AFM and STM. We have found that some clusters composed of semiconductor atoms show switching behavior. The switches work by carrier injection from the tip, as well as by the interaction force with the tip. In the future, our research should lead to atomic-level devices whose operating principles are based on quantum mechanics. It is hoped that in the next decade, by applying these atom manipulation techniques to device fabrication, atomic-level quantum devices such as solid-state quantum computers will become a reality.

We are also developing new systems to achieve higher spatial resolution and higher functionality, such as friction measurement, spin resolution and time resolution.

JHD: Besides the continuing search for materials with even better and more exotic spin properties, my plan is to apply everything we have learned from our research in the past years to another class of materials, such as the high-temperature superconductors -- the other remarkable Nobel Prize discovery at IBM Research - Zurich.


Analytics helping cities and citizens smartly manage water metering

Real-time data on water usage benefits providers and end users

Editor’s note: This article is by Bradley Eck, IBM Research-Ireland’s Smarter Cities Technology Centre manager in Dublin.

A faucet that drips just once per second wastes 2,700 gallons of water annually. My Water and Environment team is developing tools and methodologies, as part of an EU-funded project called iWIDGET, to manage urban water demand by reducing waste like leaky taps.
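As a rough sanity check on that figure, the annual waste can be estimated from an assumed drip volume (the ~0.3 mL per drip below is an illustrative assumption, not a number from the project):

```python
# Rough check of the dripping-faucet figure. The drip volume is an
# assumed value (~0.3 mL per drip); real drips vary with the fixture.
DRIP_ML = 0.32                        # assumed millilitres per drip
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
ML_PER_US_GALLON = 3785.41

wasted_gallons = DRIP_ML * SECONDS_PER_YEAR / ML_PER_US_GALLON
print(f"about {wasted_gallons:,.0f} gallons per year")
```

One drip per second is over 31 million drips a year, which lands in the neighborhood of the 2,700-gallon figure.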

iWIDGET links nine expert partners to look at how analytics and smart meters can help cities and their citizens get real-time data on their water and related energy usage – with the aim of improving the management of urban water demand by reducing waste, improving utilities’ understanding of end-user demand, and ultimately reducing customers’ water and energy costs.

iWIDGET provides householders with easily accessible information that will allow them to make decisions on how to reduce water usage, and thus, their water bills. Plus, as utilities will have better visibility on their customers’ usage, they will be able to more-accurately forecast demand – and send their customers alerts if a leak is suspected.
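To make the leak-alert idea concrete, here is a minimal sketch of one heuristic a utility might run on smart-meter data; the threshold and overnight window are assumptions for illustration, not the iWIDGET project's actual analytics:

```python
# Sketch of a simple leak heuristic: if hourly overnight flow never
# drops to (near) zero, water is probably escaping somewhere, because
# a sleeping household should show some zero-consumption hours.
def leak_suspected(overnight_litres_per_hour, threshold=1.0):
    """Return True when the minimum overnight flow stays above threshold."""
    return min(overnight_litres_per_hour) >= threshold

print(leak_suspected([3.2, 2.9, 3.1]))   # steady flow all night: suspicious
print(leak_suspected([0.0, 1.5, 0.0]))   # flow drops to zero: normal
```

Real systems would account for noisy readings, shift workers, and seasonal patterns before alerting a customer.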

Building iWIDGET

For our part, my team in Dublin focused on iWIDGET’s system architecture to improve a utility’s planning, operation, and management of real-time sensor data, and developed analytics around the high-resolution consumption data. Not an easy task, as this involves noisy raw data and dozens of third party systems and analytical tools. So, we built an API for the entire system that connects and interfaces with each partner’s components.

iWIDGET on the iPhone
Some of the components can be accessed via mobile applications. Meter Replacement, for example, looks at the history of readings and the meter type to determine the best time for a utility to replace its meters. Another app focuses on pump scheduling, where real-time data from smart meters feeds into a pumping plan that helps reduce energy costs.

iWIDGET system trials have started in Portugal, Greece and the UK, and will conclude in 2015. The envisaged outcome of the iWIDGET project is increased interoperability between water information systems at the European Union and national levels, and overall improved efficiency of water resource management. 

For more information on the project please visit the iWIDGET project website, or join our group on LinkedIn, or follow @iWIDGET_FP7 on Twitter.


IBM Scientists Recognize Day of Photonics

DAY OF PHOTONICS is an annual event organized in Europe that promotes photonics to the general public. IBM scientists are researching photonics because it uses light instead of electrical signals to transfer information in future computing systems, allowing large volumes of Big Data to be moved quickly between computer chips in servers, large cloud data centers, and supercomputers.

In the videos below IBM scientists discuss how they are using photonics.

Join the conversation on Twitter #dayofphotonics

Personality and Visualization

Our team at IBM Research – Australia is currently looking for people who have posted at least 200 original public tweets to complete a 20-minute questionnaire about interpreting data from visualizations.

Research shows that people with high levels of Conscientiousness or low levels of Neuroticism (from the Big 5 OCEAN personality traits) tend to be faster learners than people with high levels of Neuroticism or low levels of Conscientiousness. Our goal is to determine whether there is a similar relationship between users’ personality types, and their ability to interpret information presented in visualization graphs. This will then allow systems such as Watson Analytics to automatically select the best visualizations to suit the user's personality type.

Please email our lead scientist, Lida Ghahremanlou, with your Twitter handle if you are interested in participating. There is only a fixed number of spots in the study, but if your handle is accepted, you will receive an email with a link to the questionnaire.

Screenshot of a Visualization Graph of the Research Questionnaire


The possibilities of Project Lucy

The TED Institute micro-documentary on Project SyNAPSE gave us a look at the future of cognitive computing, with glimpses at some possible practical applications. Those exciting possibilities become positively exhilarating in the Project Lucy micro-documentary, which gives us an idea of the potential for cognitive computing to transform a continent.

The project is a collaboration between IBM researchers in Africa and the company’s business and academic partners to apply IBM Watson to the continent’s biggest challenges. The goal is to use Watson to discover insights from big data and develop commercially viable solutions in the areas of energy, healthcare, water and sanitation, agriculture, human mobility and education. 

It is this last area that is the particular focus of IBM researcher Dr. Charity Wayua. In the film, the Kenya-based Wayua lays out the ambitions for cognitive computing to give teachers greater ability to deal with crowded classrooms. Armed with data-based insights, teachers can address needs and situations on a student-by-student basis.

While Africa’s challenges are daunting, it is far from the only place where classrooms are overcrowded, teachers are overstretched, and children are underserved. The potential impact on Africa’s education systems is awesome to contemplate, and it’s easy to see how the benefits could be replicated around the globe. Add in the other challenges Project Lucy is tackling and the potential for cognitive computing to improve the lives of millions becomes even greater. 

“For the African continent,” said Wayua, “I think this is going to be our 'big bet' on transformation.” If that big bet pays off, it won’t just transform Africa, it will transform the world.

Editor's note: This article is by Jonathan Batty, external relations leader for IBM's global labs.


Multilingual Watson

Learning to understand the human gift of language

DJ McCloskey, IBM Watson Group
Machines use programming languages to at least appear to understand our human languages. IBM Watson is one of the most sophisticated, helping everyone from healthcare providers to sous chefs by using several programming languages and algorithms to read and comprehend natural language. But the system could only answer questions posed in English – until now.

Natural Language Processing architect D.J. McCloskey leads a team “teaching” Watson the fundamental mechanisms to comprender español (understand Spanish), entender português (understand Portuguese), 日本語を理解する (understand Japanese), and many other languages.

“Back in the late 1990s and early 2000s, the notion of a machine reading text was primarily defined by creating search indexes out of the words in text. We wanted to take it one step farther where ‘reading’ actually meant ‘understanding’ the text. So, we created LanguageWare in 2001, a technology that could automate fact extraction from the text,” D.J. said.

LanguageWare established a lightweight, optimized library of functions for processing natural language text, using a set of generalized structures and algorithms that captured the essence of language. Multilingual by design, this foundation gave LanguageWare a way to process text from any language so that a machine could understand the atomic sentence context, and build semantic understanding of that sentence in any language.

But D.J.’s team developed this sophisticated tooling with the mantra of “involve the humans” in mind. By letting humans teach the machine everything about language – from word morphology, such as the difference between “run” and “running” or “goose” and “geese,” to transcribing the knowledge of domain experts (learning from a subject’s human masters) – the system can accurately detect facts in text, such as a negative reaction to a drug, or the acquisition of one company by another. Today, Watson’s entire suite of cognitive capabilities uses and extends this tooling.

“And in Watson we have employed this capability to capture and apply precise knowledge from oncology experts, providing a way for human experts to teach the system at a deep level,” D.J. said.

Gluing it together with open architecture

These analytics and algorithms work together on top of Apache’s Unstructured Information Management Architecture, or UIMA (“you-ee-mah”). Its open architecture gave LanguageWare back in 2001, and Watson today, a way to combine their analytics with other complementary analytics to rapidly collaborate and prototype new ideas – a way to end up with a whole much greater than the sum of its parts, like the ideas from the Watson Mobile Challenge.

“I remember trying to convince people of the viability of machines understanding unstructured data, pre-Watson,” said D.J. “And then Watson (and UIMA) happened, and now people believe it can cure cancer, and make our tea!

“Amazingly enough, the power of this technology actually has potential to help do both – and more. Watson can’t cure cancer but we have real solutions where Watson Oncology Advisor helps consultant oncologists improve treatment of cancer patients. And a member of our team recently made Chef Watson’s Korean BBQ lemon cupcakes and they were awesome (with my tea)!”

Parsing languages (other than English)

Another ingredient in Watson’s NLP pantry is its parser. This set of code helps it analyze and understand the written English language down to the grammar and syntax level. For example, Watson’s parser lets the system know “who did what to whom,” as in “the boy kicked the ball.” So, a question about what was kicked will find “the ball” as the receiver of said action.

But not all sentences operate the same way or in the same order.

Say “Hola” to Watson, and find out more about its new capabilities, and its new home at Astor Place in New York City, here.
In English, subjects, verbs, and objects follow a certain order: “John saw Mary.” John did the seeing, while Mary was seen, in a subject-verb-object order. However, in Hindi it is “Jŏna mairī dēkhā,” or “John Mary saw,” a subject-object-verb order. And in Irish, spoken where D.J. lives and works, the verb comes first, giving a verb-subject-object order: “Chonaic John Máire,” or “Saw John Mary.”
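The word-order differences above can be sketched with a toy reordering function. This is only an illustration of surface word order; a real parser like Watson's works over full grammatical structure, and the mapping table here is an assumption for the three example languages:

```python
# Toy illustration: the same (subject, verb, object) triple surfaces
# in a different order depending on the language's canonical word order.
WORD_ORDERS = {
    "English": ("S", "V", "O"),   # "John saw Mary"
    "Hindi":   ("S", "O", "V"),   # "John Mary saw"
    "Irish":   ("V", "S", "O"),   # "Saw John Mary"
}

def linearize(subject, verb, obj, language):
    """Arrange the triple according to the language's word order."""
    slots = {"S": subject, "V": verb, "O": obj}
    return " ".join(slots[role] for role in WORD_ORDERS[language])

for lang in WORD_ORDERS:
    print(f"{lang}: {linearize('John', 'saw', 'Mary', lang)}")
```

A parser does the inverse: given the surface string and the language, it recovers who did what to whom.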

D.J.’s team chose Spanish first, a widely spoken representative of the Romance languages, as Watson’s next language to parse, but hopes to build a generic parser that, once plugged into UIMA, will allow Watson to understand any language.

“We are after the mechanics of language to get to a point where Watson works between languages in a pragmatic way, Watson going global!” D.J. said.

Now, with Watson’s capabilities on BlueMix available to developers all around the world, its ability to process local languages just as well as English will be increasingly valuable. New mobile apps could exploit all of Watson’s natural language power on regionally relevant knowledge sources. Ultimately, Watson will be cross-lingual, meaning questions in one language can find answers in another and be returned to the user, translated back into his or her native or preferred language – making the knowledge of the world available to all, regardless of language.

More about IBM Watson