What the CSI is Up to at SDC

What the Cloud Storage Initiative Is Doing At SDC

The SNIA Storage Developer Conference (SDC) is less than a week away. We’re looking forward to the conference and in particular want to make note of some exciting news and events that pertain to work the CSI is doing to promote standards that will increase the adoption, interoperability and portability of data stored in the cloud.

SDC Conference session: Introducing CDMI v1.1 – Tuesday, September 16th, 1:00 p.m. by David Slik. This session introduces the new CDMI 1.1 and provides an overview of the capabilities the Technical Work Group has added to the standard, and what CDMI implementers need to know when moving from CDMI 1.0.2 to CDMI 1.1.

Cloud Interoperability Plugfest – Participants at the 12th Cloud Interoperability Plugfest will be testing the interoperability of their cloud storage interfaces based on CDMI. We always have a large showing of CDMI implementations at this event, but are also looking for implementations of Amazon S3, and OpenStack Swift, Cinder and Manila interfaces.

It’s not too late to register for this Plugfest. Find out how here.

SDC 2014 is going to be exciting and educational. It’s “one stop shopping” for IT professionals who focus on the tools, technologies and developments needed for understanding and implementing efficient data storage, management and security. The CSI hopes to see you there.

 

Getting Started with the CDMI Conformance Test Program

Together with our partner, TATA Consultancy Services, we recently had a great live Webcast to launch the Conformance Test Program (CTP) for the SNIA Cloud Data Management Interface (CDMI). CDMI is an ISO/IEC standard that offers end users simplicity and data storage interoperability across a wide range of cloud solutions. Interoperability and portability of data stored in the cloud have become a top IT priority. The CTP tests for conformance against the specification, and gives purchasers of certified cloud storage solutions the assurance that these solutions meet CDMI interoperability standards. Our Webcast is now available on demand. It details the benefits of the CDMI CTP and explains how any cloud storage vendor or ISV can begin the CTP process. I encourage you to check it out to learn:

  • Key benefits of the CDMI standard for vendors and end users
  • Growing adoption of the CDMI standard
  • The suite of conformance tests required to achieve CDMI CTP certification
  • How to begin the CTP process

In addition to the Webcast replay, I encourage you to check out our CDMI CTP Frequently Asked Questions (FAQ). Getting started is easy. Just fill out the CTP form and you’ll be on your way.  

New Cloud Storage Meme – “Enterprise DropBox”

In a number of recent presentations on cloud storage, I have started by asking the audience “how many of you use DropBox?” I have seen rooms where more than half of the hands go up. Of course, the next question I ask is “does your corporate IT department know about this?” – sheepish grins abound.

DropBox has been responsible for a significant fraction of the growth in the number of Amazon S3 objects – that’s where the files end up when you drop them into that icon on your laptop, smartphone or tablet. However, if that file is a corporate document, who is in charge of making sure the data and its storage meet corporate policies for protection, privacy, retention and security? Nobody.

Thus there is now growing interest in bringing that data back in-house and on premise for the enterprise so that business policies for the data can be enforced. This trending meme has been termed “Enterprise Dropbox”. The basic idea is to offer the equivalent service and set of applications to allow corporate IT users to store their corporate documents where the IT department can manage them.

Is this “Private Cloud”? Well, yes in that it uses capitalized corporate storage equipment. But it also sits “at the edge” of the corporate network so as to be accessible by employees wherever they happen to be. In reality, Enterprise DropBox needs to be part of an overall Bring Your Own Device (BYOD) strategy to enable frictionless innovation and collaboration for employees.

Who are likely to be the players in this space? Virtualization vendors such as Citrix (with its ShareFile acquisition) and VMware (with its Project Octopus initiative) look to be first movers, along with startups such as Oxygen Cloud. It’s interesting that major storage vendors have not picked up on this as yet.

Digging into how this works, you find that every vendor has a storage cloud with an HTTP-based object storage interface that is then exposed to the internet with secure protocols. Each interface is just different enough that there is no interoperability. In addition, each vendor develops, maintains and distributes its own set of client “apps” for operating systems, smartphones and tablets. A key feature is integration of the authentication and authorization with the corporate LDAP directory, both for security and to reduce administrative overhead. Support for quotas and department chargeback is essential.

Looking down the road, however, this proliferation of proprietary clients and interfaces is already causing headaches for the poor device user, who may have several of these apps on their devices (all maxed out to their “free” limit). The burden on vendors is the development cost of creating and maintaining all those applications on all those different devices and operating systems. We’ve seen this before, however, in the early days of the Windows ecosystem. You used to have to purchase a separate FTP client for early Windows installations. Want NFS? A separate client purchase and install. Of course, now all those standard protocol clients are built into operating systems everywhere. Nobody thinks twice about it.

The same thing will eventually work its way out in the smart device category as well, but not until a standard protocol emerges that all the applications can use (such as FTP or NFS in the Windows case). The SNIA’s Cloud Data Management Interface (CDMI) is poised to meet this need as its adoption continues to accelerate. CDMI offers a RESTful HTTP object storage data path that is highly secure and has the features that corporate IT departments need in order to protect and secure data while meeting business policies. It enables each smart device to have a single embedded client to multiple clouds – both public and private. No more proliferation of little icons all going to separate clouds.

What will drive this evolution? You – the corporate customer of these vendor offerings. You can ask the Enterprise DropBox vendors simply to “show me CDMI support in your roadmap”. Educate your employees about choosing smart devices that support the CDMI standard natively. Only then will the market forces compel the vendors to realize that there is no value in locking in their customers. Instead they can differentiate on the innovation and execution that separates them from their competitors. Adoption of a standard such as CDMI will actually accelerate the growth of the entire market as the existing friction between clouds gets ground down and smoothed out by virtue of this adoption.

Validating CDMI Features – Metadata Search

Here we go again with an announcement of a cloud offering that validates an existing standardized feature of CDMI. The new Amazon CloudSearch offering lets you store structured metadata in the cloud and perform queries on the metadata. They missed an opportunity, however, to integrate this with their existing cloud object storage offering. After all, if you already have object storage, why not put the metadata with the data object instead of splitting it off into a separate cloud?

CDMI lets you put the user metadata directly into the storage object, where it is protected, backed up, archived and retained along with the actual data. CDMI’s rich query functions are then able to find the storage object based on the values of the metadata without talking to a separate cloud offering with a new, proprietary API.

CDMI standardizes a Query Queue that allows the client to create a scope specification (equivalent to a WHERE clause) to find specific objects that match the criteria, and a results specification (equivalent to a SELECT clause) that determines the elements of the object that are returned for each match. Results are placed in a CDMI queue object and can be processed one at a time, or in bulk. This powerful feature allows any storage cloud that has a search feature to expose it in a standard manner for interoperability between clouds.

An example of the metadata associated with a query queue is as follows:

{
     "metadata" : {
          "cdmi_queue_type" : "cdmi_query_queue",
          "cdmi_scope_specification" : [
               {
                    "domainURI" : "== /cdmi_domains/MyDomain/",
                    "parentURI" : "starts /MyMusic",
                    "metadata" : {
                         "artist" : "*Bono*"
                    }
               }
          ],
          "cdmi_results_specification": {
               "objectID" : "",
               "metadata" : {
                    "title" : ""
               }
          }
     }
}

 

When results are stored in a query queue, each enqueued value consists of a JSON object of MIME-type “application/json”. This JSON object contains the specified values requested in the cdmi_results_specification of the query queue metadata.

An example of a query result JSON object is as follows:

{
     "objectID" : "00007E7F0010EB9092B29F6CD6AD6824",
     "metadata" : {
          "title" : "Vertigo"
     }
}
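To make the WHERE/SELECT analogy concrete, here is a minimal in-memory sketch of how a scope specification and a results specification could be applied to a set of objects. It is not the CDMI server algorithm: it assumes only three matching forms (the "== " prefix, the "starts " prefix, and a bare wildcard pattern), assumes that an object matches if it satisfies any one entry of the scope array, and uses hypothetical helper names (matches, in_scope, project, run_query) and a made-up second object.

```python
from fnmatch import fnmatchcase


def matches(constraint, value):
    """Check one scope constraint against a string value (simplified)."""
    if constraint.startswith("== "):
        return value == constraint[3:]
    if constraint.startswith("starts "):
        return value.startswith(constraint[7:])
    # No operator prefix: treat the constraint as a wildcard pattern.
    return fnmatchcase(value, constraint)


def in_scope(obj, scope):
    """True if obj satisfies every field of one scope entry (the WHERE part)."""
    for key, constraint in scope.items():
        if key not in obj:
            return False
        if isinstance(constraint, dict):
            if not isinstance(obj[key], dict) or not in_scope(obj[key], constraint):
                return False
        elif not matches(constraint, obj[key]):
            return False
    return True


def project(obj, results_spec):
    """Keep only the fields named in the results specification (the SELECT part)."""
    out = {}
    for key, sub in results_spec.items():
        if key in obj:
            out[key] = project(obj[key], sub) if isinstance(sub, dict) else obj[key]
    return out


def run_query(objects, scope_spec, results_spec):
    """Return one projected result per object that matches any scope entry."""
    return [project(o, results_spec) for o in objects
            if any(in_scope(o, s) for s in scope_spec)]


SCOPE = [{
    "domainURI": "== /cdmi_domains/MyDomain/",
    "parentURI": "starts /MyMusic",
    "metadata": {"artist": "*Bono*"},
}]
RESULTS = {"objectID": "", "metadata": {"title": ""}}
OBJECTS = [
    {"objectID": "00007E7F0010EB9092B29F6CD6AD6824",
     "domainURI": "/cdmi_domains/MyDomain/", "parentURI": "/MyMusic/U2/",
     "metadata": {"artist": "Bono", "title": "Vertigo"}},
    {"objectID": "HYPOTHETICAL-ID-2",  # made-up object that should not match
     "domainURI": "/cdmi_domains/MyDomain/", "parentURI": "/MyMusic/Other/",
     "metadata": {"artist": "Sting", "title": "Fields of Gold"}},
]

hits = run_query(OBJECTS, SCOPE, RESULTS)
print(hits)
```

In a real cloud the matching runs on the server and each result is enqueued in the query queue; the sketch only shows how the two specifications divide the work.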

Thus if you are using your storage cloud for storing music files, for example, all of the metadata for each MP3 object can be stored right along with the object, and CDMI’s powerful query mechanisms can be used to find the files you are interested in without invoking a separate search cloud with disassociated metadata.

Validating CDMI features – Object Expiration

Validating yet another feature of the CDMI standard (see previous post for an earlier one), Amazon announced their Object Expiration feature for S3. While not a new concept for storage interfaces, it is the first cloud implementation of this capability that I know of. The idea is simply to have the server side of the cloud do object deletion on your behalf automatically, once the lifecycle of that data has completed.

As part of overall Data Lifecycle Management, object deletion is the most common terminal state for data. CDMI has standardized the interface for this capability in cloud storage with a comprehensive Retention and Hold Management feature (Chapter 17). The granularity of the standard CDMI feature is finer than that of the S3 feature in that it allows for retention and deletion on individual objects (although you could accomplish this in S3 with prefix = object name, it doesn’t scale using the header fields that Amazon uses). The S3 prefix mechanism can be used to scope the expiration policy down to individual “directories” (forward slash terminated parts of object names), and CDMI allows this also for the semantically equivalent CDMI sub-containers.

Complying with Regulations

Although the ability to delete objects when their lifecycle completes is useful, it is insufficient for complying with regulations such as Sarbanes-Oxley, or for eDiscovery needs during litigation. Most enterprises need to show that the data has not been modified during its lifecycle. In addition, if a subpoena is issued for the data, you DO NOT want the object deleted, even if its retention period has expired. That can cost you millions of dollars in a pending court case.
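The interaction between expiration, retention and holds can be sketched in a few lines. This is an illustration of the rule described above, not the CDMI interface: the class and field names (StoredObject, retain_until, holds, auto_delete) are hypothetical, and the actual metadata names are defined in Chapter 17 of the standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    name: str
    retain_until: datetime                      # end of the retention period
    holds: set = field(default_factory=set)     # identifiers of active legal holds
    auto_delete: bool = False                   # delete automatically once allowed


def may_delete(obj, now):
    """Deletion is allowed only after retention ends and with no hold in place."""
    return now >= obj.retain_until and not obj.holds


def expire(objects, now):
    """Return the names of objects the server would delete on the user's behalf."""
    return [o.name for o in objects if o.auto_delete and may_delete(o, now)]


utc = timezone.utc
now = datetime(2012, 1, 1, tzinfo=utc)
objects = [
    StoredObject("expired.doc", datetime(2011, 6, 1, tzinfo=utc), auto_delete=True),
    StoredObject("subpoenaed.doc", datetime(2011, 6, 1, tzinfo=utc),
                 holds={"case-42"}, auto_delete=True),
    StoredObject("retained.doc", datetime(2013, 1, 1, tzinfo=utc), auto_delete=True),
]
print(expire(objects, now))
```

The second object is past its retention period but stays put because a hold is still active, which is exactly the subpoena case.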

The CDMI standard anticipates that storage clouds will want to offer a more robust, full featured retention and hold management for corporate data, and that a standard means of achieving it will be needed. Take a quick look at Chapter 17 (it’s quite compact while being comprehensive) and investigate using the standard way to achieve this function. If you are a cloud vendor trying to emulate the S3 interface, good luck to you – Amazon will continue to expand the definition of what “S3” means (like adding this feature), forcing you to constantly modify your cloud’s storage interface to keep up (as well as requiring you to reverse engineer any bugs that exist).

Validating CDMI features – Server Side Encryption

One of the features of many storage systems and even disk drives is the ability to encrypt the data at rest. This protects against a specific threat – the disk drive going out the back door for replacement or repair. So it was only a matter of time before we would see this important feature start to be offered for Cloud Storage as well. Well, today Amazon announced their Server Side Encryption capability for their S3 cloud offering. This feature was anticipated by the CDMI standard interface when it was finalized as a standard back in April 2010.

Standard Server Side Encryption

So, how does CDMI standardize this feature? Well, as usual, it starts with finding out if the cloud actually supports the feature and what choices are available. In CDMI, this is done through the capabilities resource – a kind of catalog or discovery mechanism. By fetching the capabilities resource for objects, containers, domains or queues, you can tell whether server side encryption of data at rest is available from the cloud offering (yes, this is granular for a reason). The actual capability name is: cdmi_encryption (see section 12.1.3). This indicates that the cloud can do encryption for the data at rest, and also indicates what algorithms are available to do this encryption. The algorithms are expressed in the form of: ALGORITHM_MODE_KEYLENGTH, where:

“ALGORITHM” is the encryption algorithm (e.g., “AES” or “3DES”).

“MODE” is the mode of operation (e.g., “XTS”, “CBC”, or “CTR”).

“KEYLENGTH” is the key size (e.g., “128”, “192”, “256”).

So the cloud can offer the user several different algorithms of different strengths and types, or if it only offers a single algorithm (such as the Amazon offering), the cloud storage client can at least understand what that algorithm is.
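A client that reads the capability needs to split each value back into its three parts. A minimal sketch, assuming the value is always exactly three underscore-separated fields with a numeric key length; the names EncryptionSpec and parse_encryption_spec are made up for this example, and the sample strings are illustrative rather than taken from a real cloud.

```python
from typing import NamedTuple


class EncryptionSpec(NamedTuple):
    algorithm: str   # e.g. "AES" or "3DES"
    mode: str        # e.g. "XTS", "CBC" or "CTR"
    key_length: int  # key size in bits


def parse_encryption_spec(value):
    """Split an ALGORITHM_MODE_KEYLENGTH string into its three components."""
    parts = value.split("_")
    if len(parts) != 3 or not parts[2].isdigit():
        raise ValueError(f"expected ALGORITHM_MODE_KEYLENGTH, got {value!r}")
    return EncryptionSpec(parts[0], parts[1], int(parts[2]))


spec = parse_encryption_spec("AES_XTS_256")
print(spec.algorithm, spec.mode, spec.key_length)
```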

So how does the user tell the cloud that she wants her data encrypted? Amazon does this with a proprietary header of course, but CDMI does it with standard Data System Metadata that can be placed on any object, container of objects, queue or domain. This metadata is called cdmi_encryption (see section 16.4), and contains merely a string with a value chosen from the list of available algorithms in the corresponding capability. There is also a cdmi_encryption_provided metadata value to tell the client whether their data is being encrypted or not by the cloud.

Lastly, there is a system-wide capability called cdmi_security_encryption (section 12.1.1) that tells the user whether the cloud does server side encryption at all.
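Putting the three pieces together, a client could check the system-wide capability, look at the algorithms offered, and build the metadata to place on an object. The sketch below assumes the capabilities have already been fetched and parsed into plain dictionaries, that boolean capabilities are represented as the string "true", and that cdmi_encryption is a list of algorithm strings; choose_encryption is a hypothetical helper, not part of any CDMI library.

```python
def choose_encryption(system_caps, object_caps, preferred):
    """Return data system metadata requesting encryption, or {} if unavailable.

    system_caps  -- parsed system-wide capabilities (assumed shape)
    object_caps  -- parsed capabilities for the object type (assumed shape)
    preferred    -- algorithm strings in the client's order of preference
    """
    if system_caps.get("cdmi_security_encryption") != "true":
        return {}
    offered = object_caps.get("cdmi_encryption", [])
    for algorithm in preferred:
        if algorithm in offered:
            return {"cdmi_encryption": algorithm}
    return {}


metadata = choose_encryption(
    {"cdmi_security_encryption": "true"},
    {"cdmi_encryption": ["AES_CBC_128", "AES_XTS_256"]},
    preferred=["AES_XTS_256", "AES_CBC_128"],
)
print(metadata)
```

After storing the object, the client would read back cdmi_encryption_provided to confirm what the cloud actually applied.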

Server side encryption is an important capability for cloud storage offerings to provide, which is why CDMI standardized this in advance of having cloud offerings available. We expect more clouds to offer this in the future, and customers to soon realize that, without CDMI implementations, these offerings lock them in and make it costly to leave that vendor.

Plan to Attend Cloud Burst and SDC

Cloud Storage Developers will be Converging on Santa Clara in September for the Storage Developer Conference and the Cloud Burst Event

Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

The audience for the SNIA Cloud Burst Summit is IT storage professionals and related colleagues who are looking to cloud storage as a solution for their IT environments. The day’s agenda will be packed with presentations from cloud industry luminaries, the latest cloud development panel discussions, a focus on cloud backup, and a cocktail networking opportunity in the evening.

Check out the Agenda and Register Today…

 

Storage Developer Conference

The SNIA Storage Developer Conference is the premier event for developers of cloud storage, filesystems and storage technologies. This year there is a full cloud track on the Agenda, as well as some great speakers. Some examples include:

Programming the Cloud

CDMI for Cloud IPC

David Slik
Technical Director,
Object Storage
NetApp

Open Source Droplet Library with CDMI Support

Giorgio Regni
CTO,
Scality

CDMI Federations, Year 2

David Slik
Technical Director,
Object Storage,
NetApp

CDMI Retention Improvements

Priya Nc
Principal Software Engineer,
EMC Data Storage Systems

CDMI Conformance and Performance Testing

David Slik
Technical Director,
Object Storage,
NetApp

Use of Storage Security in the Cloud

David Dodgson
Software Engineer,
Unisys

Authenticating Cloud Storage with Distributed Keys

Jason Resch
Senior Software Engineer,
Cleversafe

Resilience at Scale in the Distributed Storage Cloud

Alma Riska
Consultant Software Engineer,
EMC

Changing Requirements for Distributed File Systems in Cloud Storage

Wesley Leggette
Cleversafe, Inc

Best Practices in Designing Cloud Storage Based Archival Solution

Sreenidhi Iyangar
Senior Technical Lead,
EMC

Tape’s Role in the Cloud

Chris Marsh
Market Development Manager,
Spectra Logic

CSI Quarterly Update Q3 2011


Upcoming Activities

Get Involved Now!

A limited number of these activities are open to all. Join SNIA and the CSI to participate in any of them.

July Cloud Plugfest

The purpose of the Cloud Plugfest is for vendors to bring their implementations of CDMI and OCCI to test, identify, and fix bugs in a collaborative setting with the goal of providing a forum in which companies can develop interoperable products.

The Cloud Plugfest starts on Tuesday, July 12 and runs through Thursday, July 14, 2011 at the SNIA Technology Center in Colorado Springs, CO. The SNIA Cloud Storage Initiative (CSI) is underwriting the costs of the event, so there is no participation fee.

More Information

SNIA Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

More information

Cloud Lab Plugfest at SDC

Plugfests have always been an important part of the Storage Developer Conference, and this year will be the first Cloud Lab Plugfest event held over multiple days to test the interoperability of CDMI, OVF and OCCI implementations.

To get involved, please contact: arnold@snia.org

Cloud Pavilion at SNW

At every SNW, one of the highlights is the Cloud Pavilion, where attendees can see public and private cloud offerings and discuss solutions. Space is limited, so get involved early to ensure your spot.

To get involved, please contact: lisa.mercurio@snia.org

Get your hands on a Storage Cloud


Building your own standards-based private storage cloud.

Tuesday, May 24th, 1-5 p.m.

Omni Interlocken Hotel,

Broomfield, CO

This year at Gluecon, SNIA will be conducting a hands-on lab workshop for developers.

This session will take you deeper into cloud storage than you likely have ever been. First we will explore the standard cloud storage interface called CDMI (Cloud Data Management Interface), including some of the rationale and design tradeoffs in its creation.

Learn about how to use the RESTful interface to move data into and out of a storage cloud using a common interface. Learn how CDMI enables data portability between clouds. Dig deep into features such as Data System Metadata (how you order services from the cloud), cloud-side operations, queues, query and more.

Then stick around as we load an open source Java implementation of CDMI onto your laptop to create your own private cloud. Explore the workings of the JAX-RS standard used in this implementation and the storage code working behind the scenes. Advanced users can even implement their own cloud storage features and expose them through the standard interface.
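As a taste of the RESTful interface covered in the workshop, here is a rough sketch of what a client assembles to store a data object together with its metadata. It only builds the request and sends nothing. The host name, path and build_cdmi_put helper are hypothetical, the specification version is passed in as a parameter, and the header and body field names reflect our reading of the CDMI data object format, so check them against the version of the specification you target.

```python
import json


def build_cdmi_put(base_url, path, value, metadata=None, spec_version="1.0.2"):
    """Assemble (but do not send) a request that creates a CDMI data object."""
    headers = {
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
        "X-CDMI-Specification-Version": spec_version,
    }
    body = json.dumps({
        "mimetype": "text/plain",
        "metadata": metadata or {},   # user and data system metadata travel with the object
        "value": value,
    })
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return {"method": "PUT", "url": url, "headers": headers, "body": body}


request = build_cdmi_put(
    "https://cloud.example.com/",          # hypothetical endpoint
    "/MyContainer/notes.txt",
    "Hello, storage cloud",
    metadata={"project": "gluecon-lab"},
)
print(request["method"], request["url"])
```

Because the metadata rides in the same request as the value, the same call works against any cloud that exposes the standard interface, which is the portability point made above.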