SNIA CSI Welcomes Glyn Bowden

At our annual SNIA Members’ Symposium in San Jose, the Cloud Storage Initiative (CSI) elected our 2015 CSI board. I’d like to officially welcome our newest board member, Glyn Bowden from HP. HP now joins our growing list of member companies.

The CSI is committed to the adoption, growth and standardization of cloud storage and related cloud data services to promote interoperability and portability of data stored in the cloud. CSI leads as an industry-neutral authority on cloud storage environments and is committed to educating vendor and end user communities on cloud storage & industry standardization benefits.

It’s only the beginning of March and we’ve already hosted several educational Webcasts on topics ranging from OpenStack Cloud Storage and OpenStack Manila, to CDMI and the LTFS Bulk Transfer Standard. All CSI Webcasts are available on-demand. I encourage you to check them out.

LTFS Bulk Transfer Standard Q&A

Our recent live SNIA Cloud Webcast “LTFS Bulk Transfer Standard” is now available on-demand. Thanks to all the folks who attended the live event. We did not have time to address all of the questions, so here are answers to them. If you think of additional questions, please feel free to comment on this blog.

Q. The LTFS standard seems to support shared extents between files, and by extension, deduplicated files. Is this a correct assessment, and how does it play in the bulk transfer standard?

A. The LTFS Bulk Transfer Standard supports shared extents as supported by the LTFS standard, which can transparently reduce space used by having multiple references to common data stored on tape (deduplication). This typically happens below the bulk transfer layer, by the software used to read and write the LTFS volumes. At this point, few software packages support this feature due to the wear and latency consequences of read seeks resulting from using this feature.

Q. What is the state of the standard in its lifecycle? (e.g., working group draft, public review, published, etc.)

A. The LTFS standard has been around for some time; more information can be found here at http://www.snia.org/tech_activities/standards/curr_standards/ltfs. The LTFS Bulk Transfer Standard is here at http://www.snia.org/tech_activities/publicreview#ltfsbulk, and is in public review.

Q. The standard seems to be based on the idea of moving physical tapes to the cloud. Is there a definition of a virtual LTFS image that can be moved between systems over the network?

A. Not yet, but that is a great idea we’ll be taking forward in the next versions of the proposal.

Q. One of the barriers to greater use of LTFS in the Cloud is the relative lack of enterprise grade management software that ensures that the tape media is refreshed / upgraded as it ages, that its integrity is periodically checked, that reclamation and compaction is done. It needs open standards for support in standard volume management systems as well. Until these things are in place, LTFS will be interesting largely to specialized industries like film/entertainment, seismic, and bulk transfer & bulk storage — but not about the steady-state use of tape as a true additional layer of the cloud storage hierarchy. Tape with LTFS plus proper management could fill this role — but not until the full lifecycle tape management is available and integrated.

A. The management that is always required for a physical product with a well-defined and finite lifetime is not a unique requirement of LTFS. Tape has a long history of use as a backup and archive medium, and there are a number of tape management products that are commercially available from LTO tape suppliers and independent software companies, as well as open source products. A Google search for “tape management software” will provide you with a number of alternative solutions.

Q. Do you have a list of people that sell LTFS based solutions?

A. No we don’t, but it’s a very good idea, and we’ll investigate it further.

 

What’s New in the CDMI 1.1 Cloud Storage Standard

On December 2, 2014, the CSI is hosting a Developer Tutorial Webcast “Introducing CDMI 1.1” to dive into all the capabilities of CDMI 1.1.

Register now to join David Slik, Co-Chair, SNIA Cloud Storage Technical Work Group and me, Alex McDonald, as we explore what’s in this major new release of the CDMI standard, with highlights on what you need to know when moving from CDMI 1.0.2 to CDMI 1.1.

The latest release – CDMI 1.1 –  includes:

  • Enabling support for other popular industry supported cloud storage protocols such as OpenStack Swift and Amazon S3
  • A variety of extensions, some part of the core specification and some stand-alone, that include a CIMI standard extension, support for immediate queries, an LTFS Export extension, an OVF extension, along with multi-part MIME and versioning extensions. A full list can be found here.
  • 100% backwards compatibility with ISO certified CDMI v. 1.0.2 to ensure continuity with existing CDMI systems
  • And more

This event on December 2nd will be live, so please bring your specific questions. We’ll do our best to answer them on the spot. I hope you’ll join us!

 

Object Storage 201 Q&A

Now available on-demand, our recent live CSI Webcast, “Object Storage 201: Understanding Architectural Trade-Offs,” was a highly-rated event that almost 250 people have seen to date. We did not have time to address all of the questions, so here are answers to them. If you think of additional questions, please feel free to comment on this blog.

Q. In terms of load balancers, would you recommend a software approach using HAProxy on Linux or a hardware approach with proprietary appliances like F5 and NetScaler?

A. This really depends on your use case. If you need HA load balancers, or load balancers that can maintain sessions to particular nodes for performance, then you probably need commercial versions. If you just need a basic load balancer, using a software approach is good enough.

Q. With billions of objects what Erasure Codes are more applicable in the long term? Reed Solomon where code words are very small resulting in many billions of code words or Fountain type codes such as LDPC where one can utilize long code words to manage billions of objects more efficiently?

A. Tracking erasure code fragments has a higher cost than replication, but the tradeoff is higher HDD utilization. Using rateless coding lowers this overhead because each fragment has equal value. Reed Solomon requires knowledge of fragment placement for repair.
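
To make the utilization trade-off concrete, here is a rough sketch that compares the raw capacity consumed by replication with that of a k+m erasure code, along with the number of fragments the system has to track. The 3-copy and 10+6 layouts and the object count are illustrative assumptions, not figures from the webcast.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per byte of user data for a k+m erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


def fragments_to_track(objects: int, pieces_per_object: int) -> int:
    """Number of placement records the system must keep."""
    return objects * pieces_per_object


# Assumed example: one million objects, 3 replicas versus a 10+6 code.
objects = 1_000_000
print("replication:", replication_overhead(3), "x raw,",
      fragments_to_track(objects, 3), "pieces")
print("erasure 10+6:", erasure_overhead(10, 6), "x raw,",
      fragments_to_track(objects, 16), "pieces")
```

Under these assumptions the erasure code stores far less raw data per object but has more than five times as many pieces to keep track of, which is the cost the answer refers to.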

Q. What is the impact of having HDDs of varying capacity within the object store?  Does that affect hashing algorithms in any way?

A. The smallest logical storage unit is a Volume. Because Scale-Out does not stripe volumes, there is no impact. Hashing, being used for location, would not understand volume size, so a separate database is used, on a volume basis, to track open space. Hashing algorithms can be modified to suit the underlying disk. The problem is not so much whether they can be designed a priori for the underlying system, but really the rigidity they introduce by tying placement very tightly to topology. That makes failure / exception handling hard.

Q. Do you think RAID6 is sufficient protection with these types of Object Storage Systems or do we need higher parity based Erasure codes?

A. RAID6 makes sense for a Direct Attached storage solution where all drives in the RAID Set can maintain sync. Unlike filesystems (with a few exceptions), Scale-Out Object Storage systems are “Storage as a workload” systems that already have protection as part of the system. So the question is what data protection method is used on solution x as opposed to solution y. You must also think about what you are trying to do. Are you trying to protect against a single disk failure, a node failure, or a site failure? For disk failures, RAID is great, but not if you’re trying to protect against node failure or site failure. Site failure is an EC sweet spot, but hard to solve from a deployment perspective.
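
As a rough illustration of why site failure suits erasure coding, the sketch below checks whether a k+m code survives the loss of one site. It assumes fragments are spread as evenly as possible across sites and that data is recoverable as long as no more than m fragments are lost; the 10+6 and 10+4 layouts are hypothetical examples.

```python
import math


def worst_case_fragments_lost(data_fragments: int, parity_fragments: int,
                              sites: int) -> int:
    """Fragments lost when the fullest site fails, assuming an even spread."""
    if sites <= 0:
        raise ValueError("sites must be positive")
    total = data_fragments + parity_fragments
    return math.ceil(total / sites)


def survives_site_failure(data_fragments: int, parity_fragments: int,
                          sites: int) -> bool:
    """A k+m code can rebuild data if at most m fragments are missing."""
    lost = worst_case_fragments_lost(data_fragments, parity_fragments, sites)
    return lost <= parity_fragments


# Assumed layouts spread over three sites.
print("10+6 over 3 sites:", survives_site_failure(10, 6, 3))
print("10+4 over 3 sites:", survives_site_failure(10, 4, 3))
```

The same check with disks or nodes in place of sites shows which failure domain a given layout actually covers.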

Q. Is it possible to brief how this hash function decides the correct data placement order among the available storage nodes?

A. Take a look at the following links: http://en.wikipedia.org/wiki/Consistent_hashing and https://swiftstack.com/openstack-swift/architecture/
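
For readers who want the idea before following the links, here is a minimal consistent-hashing sketch. It is not the placement code of any particular object store: the `HashRing` class, the node names, the use of SHA-256, and the 64 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual points per node (illustrative)."""

    def __init__(self, nodes, vnodes=64):
        # Each node contributes several points so load spreads more evenly.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def nodes_for(self, object_name, replicas=3):
        """Walk clockwise from the object's hash, collecting distinct nodes."""
        start = bisect.bisect_right(self._points, self._hash(object_name))
        chosen = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == replicas:
                    break
        return chosen


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.nodes_for("photos/vacation.jpg"))
```

The placement order is simply the order in which distinct nodes are met while walking the ring. Adding a node only moves the objects whose nearest point now belongs to the new node, which is why the technique is popular for scale-out stores.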

Q. What do you consider to be a typical ratio of controller to storage nodes? Is it better to separate the two, or does it make sense to consolidate where a node is both controller and storage?

A. The flexibility of Scale-Out Object Storage makes these two components independently scalable. The systems we test all have separate controllers and storage nodes so we can test this independence. This is also very dependent on the Object Store technology you use. We know of some object stores that need 1 GB of RAM per TB of data, while there are others that use 1/10 of that. The compute is dependent on whether you are using erasure coding, and what codes. There is no one answer.

Q. Is the data stored in the Storage depository interchangeable with other vendor’s controller units? For instance, can we load LTO tapes from vendor A’s library to Vendor B’s library and have full access to data?

A. The data stored in these systems is part of the “Storage as a workload” principle, so the system metadata used to track objects is stored as a function within the controller. I would not expect any content stored to be interchangeable with another system architecture.

Q. Would you consider the Seagate Kinetic Open Storage Platform a radical architectural shift in how object storage can be done?  Kinetic basically eliminates the storage server, POSIX and RAID or all of the “busy work” that storage servers are involved in today.

A. Ethernet drives with a key-value interface provide a new approach to designing object storage solutions. It is yet to be seen how compelling they are for TCO and infrastructure availability.

Q. Will the inherent reduction in blast radius by the move towards Ethernet-interface HDDs be a major driver of the Ethernet HDD in object stores?

A. Yes. We define blast radius as the impact of a compute failure on access to connected hard drives. As we lower the number of hard drives connected to each compute node, the blast radius is reduced. For Ethernet drives, you may need redundant Ethernet switches to minimize the blast radius. Blast radius can also be minimized with intelligent data placement in software.
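
A back-of-the-envelope sketch of that definition follows: it expresses blast radius as the fraction of all drives that become unreachable when one compute element fails. The 960-drive system and the 60-drives-per-server figure are assumptions chosen only to show the arithmetic.

```python
def blast_radius(total_drives: int, drives_per_compute: int) -> float:
    """Fraction of all drives made unreachable by one compute failure."""
    if total_drives <= 0 or drives_per_compute <= 0:
        raise ValueError("drive counts must be positive")
    return min(drives_per_compute, total_drives) / total_drives


# Assumed system of 960 drives.
print("60 drives behind each storage server:", blast_radius(960, 60))
print("1 drive per Ethernet-attached compute:", blast_radius(960, 1))
```

The calculation ignores shared components such as switches, which is why the answer above calls out redundant Ethernet switches separately.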

Validating CDMI Features – Metadata Search

Here we go again with an announcement of a cloud offering that again validates an existing standardized feature of CDMI. The new Amazon CloudSearch offering lets you store structured metadata in the cloud and perform queries on the metadata. They missed an opportunity, however, to integrate this with their existing cloud object storage offering. After all, if you already have object storage, why not put the metadata with the data object instead of separating it out in a separate cloud?

CDMI lets you put the user metadata directly into the storage object, where it is protected, backed up, archived and retained along with the actual data. CDMI’s rich query functions are then able to find the storage object based on the values of the metadata without talking to a separate cloud offering with a new, proprietary API.

CDMI standardizes a Query Queue that allows the client to create a scope specification (equivalent to a WHERE clause) to find specific objects that match the criteria, and a results specification (equivalent to a SELECT clause) that determines the elements of the object that are returned for each match. Results are placed in a CDMI queue object and can be processed one at a time, or in bulk. This powerful feature allows any storage cloud that has a search feature to expose it in a standard manner for interoperability between clouds.

An example of the metadata associated with a query queue is as follows:

{
     "metadata" : {
          "cdmi_queue_type" : "cdmi_query_queue",
          "cdmi_scope_specification" : [
               {
                    "domainURI" : "== /cdmi_domains/MyDomain/",
                    "parentURI" : "starts /MyMusic",
                    "metadata" : {
                         "artist" : "*Bono*"
                    }
               }
          ],
          "cdmi_results_specification": {
               "objectID" : "",
               "metadata" : {
                    "title" : ""
               }
          }
     }
}
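
As a hedged sketch of how a client might submit this metadata, the Python below builds (but does not send) an HTTP PUT that would create the query queue. The host `cloud.example.com` and the queue path are hypothetical, and the `application/cdmi-queue` content type and `X-CDMI-Specification-Version` header reflect our reading of the CDMI specification; check them against the version your cloud supports.

```python
import json
import urllib.request

# Same query queue metadata as in the example above.
query_queue = {
    "metadata": {
        "cdmi_queue_type": "cdmi_query_queue",
        "cdmi_scope_specification": [
            {
                "domainURI": "== /cdmi_domains/MyDomain/",
                "parentURI": "starts /MyMusic",
                "metadata": {"artist": "*Bono*"},
            }
        ],
        "cdmi_results_specification": {
            "objectID": "",
            "metadata": {"title": ""},
        },
    }
}

# Hypothetical endpoint and queue name, used only for illustration.
url = "https://cloud.example.com/MyQueries/bono_titles"

request = urllib.request.Request(
    url,
    data=json.dumps(query_queue).encode("utf-8"),
    headers={
        "Content-Type": "application/cdmi-queue",
        "Accept": "application/cdmi-queue",
        "X-CDMI-Specification-Version": "1.0.2",
    },
    method="PUT",
)

# The request is only constructed here; urllib.request.urlopen(request)
# would send it to a real CDMI server.
print(request.get_method(), request.full_url)
```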

 

When results are stored in a query queue, each enqueued value consists of a JSON object of MIME-type “application/json”. This JSON object contains the specified values requested in the cdmi_results_specification of the query queue metadata.

An example of a query result JSON object is as follows:

{
     "objectID" : "00007E7F0010EB9092B29F6CD6AD6824",
     "metadata" : {
          "title" : "Vertigo"
     }
}
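
To show how the scope and results specifications relate to the result above, here is a small client-side simulation in Python. It is not how a CDMI server implements queries: the matching rules (a `== ` prefix for equality, a `starts ` prefix for prefix match, and `*` as a wildcard) are our simplified reading of the example, and the stored objects are made-up sample data.

```python
from fnmatch import fnmatchcase


def matches(expected, actual):
    """Apply one simplified scope rule to one field value (assumed semantics)."""
    if expected.startswith("== "):
        return actual == expected[len("== "):]
    if expected.startswith("starts "):
        return actual.startswith(expected[len("starts "):])
    return fnmatchcase(actual, expected)  # treat '*' as a wildcard


def in_scope(obj, scope):
    """True if every field named in the scope matches the object."""
    for field, expected in scope.items():
        if field == "metadata":
            stored = obj.get("metadata", {})
            for key, pattern in expected.items():
                if key not in stored or not matches(pattern, stored[key]):
                    return False
        elif field not in obj or not matches(expected, obj[field]):
            return False
    return True


def project(obj, results_spec):
    """Keep only the fields named in the results specification."""
    out = {}
    for field, sub in results_spec.items():
        if isinstance(sub, dict):
            out[field] = {k: obj[field][k] for k in sub if k in obj.get(field, {})}
        elif field in obj:
            out[field] = obj[field]
    return out


def run_query(objects, scope_spec, results_spec):
    return [project(o, results_spec) for o in objects
            if any(in_scope(o, scope) for scope in scope_spec)]


SCOPE = [{"domainURI": "== /cdmi_domains/MyDomain/",
          "parentURI": "starts /MyMusic",
          "metadata": {"artist": "*Bono*"}}]
RESULTS = {"objectID": "", "metadata": {"title": ""}}

# Made-up objects; the first mirrors the query result shown above.
OBJECTS = [
    {"objectID": "00007E7F0010EB9092B29F6CD6AD6824",
     "domainURI": "/cdmi_domains/MyDomain/", "parentURI": "/MyMusic/Albums/",
     "metadata": {"artist": "U2 (Bono)", "title": "Vertigo"}},
    {"objectID": "SAMPLE-ID-2",
     "domainURI": "/cdmi_domains/MyDomain/", "parentURI": "/MyDocuments/",
     "metadata": {"artist": "Someone Else", "title": "Notes"}},
]

print(run_query(OBJECTS, SCOPE, RESULTS))
```

Only the first object falls inside the scope, and the projection keeps just its `objectID` and `title`, matching the shape of the enqueued JSON value.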

Thus if you are using your storage cloud for storing music files, for example, all of the metadata for each mp3 object can be stored right along with the object, and CDMI’s powerful query mechanisms can be used to find the files you are interested in without invoking a separate search cloud with disassociated metadata.

Plan to Attend Cloud Burst and SDC

Cloud Storage Developers will be Converging on Santa Clara in September for the Storage Developer Conference and the Cloud Burst Event

Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

The audience for the SNIA Cloud Burst Summit is IT storage professionals and related colleagues who are looking to cloud storage as a solution for their IT environments. The day’s agenda will be packed with presentations from cloud industry luminaries, the latest cloud development panel discussions, a focus on cloud backup, and a cocktail networking opportunity in the evening.

Check out the Agenda and Register Today…

 

Storage Developer Conference

The SNIA Storage Developer Conference is the premier event for developers of cloud storage, filesystems and storage technologies. This year there is a full cloud track on the Agenda, as well as some great speakers. Some examples include:

Programming the Cloud

CDMI for Cloud IPC

David Slik
Technical Director,
Object Storage
NetApp

Open Source Droplet Library with CDMI Support

Giorgio Regni
CTO,
Scality

CDMI Federations, Year 2

David Slik
Technical Director,
Object Storage,
NetApp

CDMI Retention Improvements

Priya Nc
Principal Software Engineer,
EMC Data Storage Systems

CDMI Conformance and Performance Testing

David Slik
Technical Director,
Object Storage,
NetApp

Use of Storage Security in the Cloud

David Dodgson
Software Engineer,
Unisys

Authenticating Cloud Storage with Distributed Keys

Jason Resch
Senior Software Engineer,
Cleversafe

Resilience at Scale in the Distributed Storage Cloud

Alma Riska
Consultant Software Engineer,
EMC

Changing Requirements for Distributed File Systems in Cloud Storage

Wesley Leggette
Cleversafe, Inc

Best Practices in Designing Cloud Storage Based Archival Solution

Sreenidhi Iyangar
Senior Technical Lead,
EMC

Tape’s Role in the Cloud

Chris Marsh
Market Development Manager,
Spectra Logic

CSI Quarterly Update Q3 2011

Upcoming Activities

Get Involved Now!

A limited number of these activities are open to all. Join SNIA and the CSI to participate in any of these activities.

July Cloud Plugfest

The purpose of the Cloud Plugfest is for vendors to bring their implementations of CDMI and OCCI to test, identify, and fix bugs in a collaborative setting with the goal of providing a forum in which companies can develop interoperable products.

The Cloud Plugfest starts on Tuesday, July 12 and runs through Thursday, July 14, 2011 at the SNIA Technology Center in Colorado Springs, CO. The SNIA Cloud Storage Initiative (CSI) is underwriting the costs of the event, so there is no participation fee.

More Information

SNIA Cloud Burst Event

There are a multitude of events dedicated to cloud computing, but where can you go to find out specifically about cloud storage? The 2011 SNIA Cloud Burst Summit educates and offers insight into this fast-growing market segment. Come hear from industry luminaries, see live demonstrations, and talk to technology vendors about how to get started with cloud storage.

More information

Cloud Lab Plugfest at SDC

Plugfests have always been an important part of the Storage Developers Conference and this year will be the first Cloud Lab Plugfest event held over multiple days to test the interoperability of CDMI, OVF and OCCI implementations.

To get involved, please contact: arnold@snia.org

Cloud Pavilion at SNW

Every SNW, one of the highlights is the Cloud Pavilion where attendees can see public and private cloud offerings and discuss solutions. Space is limited, so get involved early to ensure your spot.

To get involved, please contact: lisa.mercurio@snia.org

Get your hands on a Storage Cloud

Building your own standards-based private storage cloud.

Tuesday, May 24th, 1-5pm

Omni Interlocken Hotel,

Broomfield, CO

This year at Gluecon, SNIA will be conducting a Hands-on Lab workshop for developers.

This session will take you deeper into cloud storage than you likely have ever been. First we will explore the standard cloud storage interface called CDMI (Cloud Data Management Interface), including some of the rationale and design tradeoffs in its creation.

Learn about how to use the RESTful interface to move data into and out of a storage cloud using a common interface. Learn how CDMI enables data portability between clouds. Dig deep into features such as Data System Metadata (how you order services from the cloud), cloud-side operations, queues, query and more.

Then stick around as we load an open source Java implementation of CDMI onto your laptop to create your own private cloud. Explore the workings of the JAX-RS standard used in this implementation and the storage code working behind the scenes. Advanced users can even implement their own cloud storage features and expose them through the standard interface.

CDMI breaks out at SNW Spring

CDMI announcements at SNW Spring

The SNIA co-sponsors the Storage Networking World (SNW) conference twice a year. At the Spring 2011 SNW show, the CDMI specification was updated to version 1.0.1h (online at http://cdmi.sniacloud.com) and the first commercial implementation of CDMI was announced.

The SNIA also put out a press release on the latest developments and progress that CDMI has made, including some new research results:

Cloud Storage Standard
Readies for Widespread Adoption

SNIA is establishing relationships with National and
International Standards Groups; Recent Market Research Reveals
CDMI will be Mainstream in RFPs

Santa Clara, Calif. (April 4th, 2011) — The Storage Networking Industry Association (SNIA) Cloud Storage Initiative (CSI), today announced that the Cloud Data Management Interface (CDMI), released as an official SNIA Architecture one year ago, continues to make significant steps toward broad acceptance.

“A critical part of delivering an industry wide standard is building a strong ecosystem of partners, alliances and supporting programs,” said David Slik, Co-Chair of SNIA Cloud Storage Technical Work Group. “As demonstrated by initiating relationships with nationally and internationally recognized standards bodies and our forthcoming CDMI Plugfest, we are making strong progress around delivering not only a strong standard, but a widely accepted and valued one.”

SNIA’s CDMI standard has been refined over the past year and is now being readied for further de jure standardization. The SNIA has joined the DAPS38 Technical Committee (which is responsible for Cloud Computing, among other technology standards) of INCITS – the primary U.S. focus of standardization in the field of Information and Communications Technologies (ICT). The SNIA has also requested a Category A Liaison relationship with the ISO/IEC JTC 1 SC38 subcommittee for Distributed Application Platforms and Services (DAPS).

CDMI has been cited in numerous cloud roadmaps and studies, including those from ITU-T (International Telecommunication Union), TeleManagement Forum, SIENA (the European Standards and Interoperability for eInfrastructure Implementation Initiative), and NIST (the U.S. National Institute of Standards and Technology). The maturing CDMI Reference Implementation has been through initial testing of the NIST SAJACC (Standards Acceleration to Jumpstart Adoption of Cloud Computing) use cases.

SNIA CSI 2011 sponsored activities include Plugfests, with the first taking place April 19–21, 2011 at the SNIA Technology Center in Colorado Springs, Colorado. The Cloud Plugfest allows vendors to bring their implementations of CDMI and the Open Grid Forum’s Open Cloud Computing Interface (OCCI) to test, identify, and fix bugs in a collaborative setting with the goal of providing a forum in which companies can develop interoperable products. For additional details on participating in the Cloud Plugfest, please visit www.snia.org/cloud/cloudplugfest/.

SNIA CSI will repeat its “SNIA Cloud Burst Summit” in Santa Clara, California, on September 22, 2011 as an extended program with the SNIA Storage Developer Conference (SDC). In 2010, over 100 attendees participated in the Cloud Burst Summit, joining other cloud strategists and deployment professionals in this highly successful inaugural program that featured noted industry luminary Geoffrey Moore as the keynote speaker on the topic of clouds and IT transformation.

SNIA CSI has also partnered with Storage Strategies NOW to help bring to market research that will help inform the industry of the key insights around cloud storage. This information, which can be found in the IT Professionals Cloud Adoption Survey released today, will provide a valuable service to help users, vendors and the industry at-large track how adoption and use of cloud technologies should be considered. To learn more, visit www.ssg-now.com.

Deni Connor, principal analyst, Storage Strategies NOW added, “Our findings include that Email (66%) is the primary application for cloud storage, followed by backup (59%) and front office applications (45%). Additionally, 53% say that SNIA’s CDMI will be part of cloud storage RFPs/proposals; and 30% of respondents say SNIA’s CDMI is very important for public/hybrid cloud standard”.

To learn more about SNIA and CSI stop by the SNIA CSI Cloud Pavilion on Tuesday and Wednesday during SNW Expo Hall hours.

About the SNIA Cloud Storage Initiative
The SNIA Cloud Storage Initiative (CSI) fosters the growth and success of the market for cloud storage for vendors, service providers, and users. Members of the CSI work together to advance the adoption of the SNIA Cloud Data Management Interface (CDMI) standard, educate the IT communities about cloud storage, perform market outreach that highlights the virtues of cloud storage, and collaborate with other industry associations on cloud storage technical work. CSI member companies represent a variety of segments in the IT industry and include Actifio, Asigra, Broadcom, CA Technologies, Cisco, Cleversafe, CoreVault, Desktone, EMC, Hitachi Data Systems, HP, IBM, Iron Mountain, LSI Corporation, Mezeo, NetApp, Novell, Oracle, Scality, Sepaton, SpectraLogic, StorSimple, SwiftTest, Terasky, Terremark, and Xiotech. For more information on SNIA’s Cloud Storage activities, visit snia.org/cloud and get involved in the conversation at twitter.com/SNIACloud or http://groups.google.com/group/snia-cloud.

About SNIA
The Storage Networking Industry Association (SNIA) is a not-for-profit global organization, made up of some 400 member companies spanning virtually the entire storage industry. SNIA’s mission is to lead the storage industry worldwide in developing and promoting standards, technologies, and educational services to empower organizations in the management of information. To this end, the SNIA is uniquely committed to delivering standards, education, and services that will propel open storage networking solutions into the broader market. For additional information, visit the SNIA web site at www.snia.org.