5.12.2014

Storlets: Turning Object Storage into a Smart Storage Platform


Michael Factor
Editor’s note: This blog post was authored by Michael Factor, Distinguished Engineer and expert on storage systems at IBM Research - Haifa.

Traditional file systems store our information in a tree structure. Although this works fine for small collections of data, like those on our local hard drive, it is not designed for the massive volumes of unstructured content that most businesses are collecting, storing, and accessing on the cloud.

A newer approach, called object storage, stores information as objects. Each object contains the data (the bits and bytes of our documents, movies, images, and so forth) together with metadata that holds user- and system-defined tags. This rich metadata describes the content of the data, how the object is related to other objects, how the data should be handled, replicated, or backed up, and more.

Although object storage can store objects, manage them, protect them, and so on, it doesn’t by itself dramatically increase the rate at which we can extract value from those objects. But what if we could turn a software-defined object store into a smart storage platform?

Storlets bring the computation to the data

A new research prototype called "storlets" promises to greatly increase the value we get out of storage and the speed at which we can access what we need. Storlets are a software-defined mechanism that lets an object store move the computation to the data, instead of moving the data to a separate server to carry out the computation.

Imagine if every time you wanted to cook a meal, you had to bring all your ingredients to a central neighborhood depot where stoves, appliances, and cooking utensils were available to "process" the food. That’s similar to the current situation with data on the cloud. Storlets remedy this by moving the heavy lifting to where the data already resides, much like letting you cook in your own kitchen, where all your raw materials are already located.

The impact of storlets is substantial. Stored data can be processed locally; it no longer needs to be transferred over the network to a remote computer, processed, and then put back onto the storage server, a round trip that incurs network transfer latency and real dollar costs.
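To make the bandwidth argument concrete, here is a minimal sketch in Python. It is not the storlet API: `ToyObjectStore`, its `invoke` method, and `count_error_lines` are hypothetical names invented for illustration. The toy store simply counts how many bytes leave it when the client pulls the object and computes locally, versus when the same function runs next to the data and only the result is returned.

```python
class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store.

    It tracks how many bytes are sent back to clients so the two
    access patterns can be compared.
    """

    def __init__(self):
        self._objects = {}
        self.bytes_sent = 0

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        # Traditional path: the whole object crosses the network.
        data = self._objects[name]
        self.bytes_sent += len(data)
        return data

    def invoke(self, name, storlet):
        # Storlet-style path (assumed interface): run the function
        # next to the data and send back only its result.
        result = storlet(self._objects[name])
        self.bytes_sent += len(result)
        return result


def count_error_lines(data: bytes) -> bytes:
    """Example computation: count the lines that contain b'ERROR'."""
    n = sum(1 for line in data.splitlines() if b"ERROR" in line)
    return str(n).encode()


store = ToyObjectStore()
log = b"\n".join(
    b"ERROR disk full" if i % 100 == 0 else b"INFO ok"
    for i in range(10_000)
)
store.put("app.log", log)

# Move the data to the computation.
client_side = count_error_lines(store.get("app.log"))
bytes_traditional = store.bytes_sent

# Move the computation to the data.
store.bytes_sent = 0
store_side = store.invoke("app.log", count_error_lines)
bytes_storlet = store.bytes_sent

print(client_side == store_side, bytes_traditional, bytes_storlet)
```

Both paths give the same answer, but in the first the entire log is transferred, while in the second only a few bytes of result are. A real deployment would also have to account for the compute capacity of the storage nodes, which this sketch ignores.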

Our vision is to reduce costs, increase flexibility and improve security by turning the object store into a platform, and allowing the functionality of the object store to be extended using software. 

Getting more value from data – much faster

Aside from saving bandwidth by eliminating unnecessary data transfers, storlets are an ideal way to introduce new services. Storlets can analyze each object and extract its metadata, including size, subject, resolution, format, and more. Since storlets are dynamically loaded code, their function is limited only by the developer’s imagination.
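As a rough illustration of the metadata extraction mentioned above, the following Python sketch inspects an object's bytes and reports its size, format, and (for PNG images) resolution. The function name `extract_metadata` and the returned dictionary layout are assumptions made for this example, not part of the storlet prototype; the format checks rely only on the well-known file signatures for PNG, JPEG, and PDF.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def extract_metadata(data: bytes) -> dict:
    """Hypothetical storlet body: derive simple metadata from raw bytes."""
    meta = {"size": len(data), "format": "unknown"}
    if (
        data.startswith(PNG_SIGNATURE)
        and len(data) >= 24
        and data[12:16] == b"IHDR"
    ):
        # In a PNG, the IHDR chunk follows the 8-byte signature; width and
        # height are big-endian 32-bit integers at byte offsets 16 and 20.
        width, height = struct.unpack(">II", data[16:24])
        meta.update(format="png", width=width, height=height)
    elif data.startswith(b"\xff\xd8\xff"):
        meta["format"] = "jpeg"
    elif data.startswith(b"%PDF-"):
        meta["format"] = "pdf"
    return meta


# A minimal fake PNG header (signature + start of IHDR), for demonstration only.
fake_png = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)       # IHDR chunk length
    + b"IHDR"
    + struct.pack(">II", 640, 480)  # width, height
)

print(extract_metadata(fake_png))
print(extract_metadata(b"%PDF-1.4 ..."))
```

Run inside the store, a function like this would let the resulting tags be attached to the object without the object itself ever leaving the storage infrastructure.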

For example, a media company could upload a movie to an object store and have it automatically generate a representative image. A physician doing rounds could have the object store send only the needed portion of a patient’s x-ray to her wireless device for immediate viewing. A lawyer could request that the object store provide a document from a previous court case in a specific format. Or pathologists could have images analyzed in the object store itself. All this sophisticated computation moves into the storage infrastructure in the form of dynamically loaded storlets, making it faster, more flexible, and far less expensive.

IBM researchers in Haifa, Israel have been prototyping storlets for several years in the context of European Union research projects such as ENSURE, VISION Cloud, and ForgetIT. In short, we’re looking forward to seeing more value from data in faster and more flexible ways.

4 comments:

  1. The concept of storlets looks very interesting and promising. The good part is that it seems to allow extensive customization (and thereby flexibility in the underlying storage hierarchy, unlike existing file systems) based on customer requirements.

  2. Thank you for your comments. This flexibility is exactly the reason we did this work.

  3. What an interesting concept! Can the computations be dynamically loaded into a storlet? Are the storlets self-sufficient, meaning that their computations do not rely on other storlets? Is there any sort of communication protocol between storlets? How is data versioning dealt with? Where can we find more information on this project?

  4. Yes, storlets can be dynamically loaded; in fact, they are just objects that are created via a normal PUT. We are working on ways of tying together multiple storlets via well-defined communication graphs. Versioning is somewhat orthogonal to storlets and can be handled, for example, via Swift object versions. It is also important to remember that in an object store like Swift, one can't modify objects, only replace them. We are working to make more information publicly available in an easily consumable fashion.
