Open Source Software-Only Storage – Really.

Virtually every storage solution is more software than hardware. That said, users don’t care much about the ratio of hardware to software. They want their consumption experience to be easy and fast to start up, with a pay-as-you-grow model and the ability to scale without limits. So it should not be a shock that real IT organizations are using software-only storage on standard servers to deliver storage to their customers. What’s more, this type of storage can be powered by open source.

At the upcoming SNIA Data Storage Innovation Conference, we are looking forward to discussing software-defined storage (SDS) from a user experience perspective with examples of OpenStack Swift providing an engine for building SDS clusters with any mixed combination of standard server and HDD hardware in a way that is simple enough for any enterprise to dynamically scale.

Swift is a highly available, distributed, scalable object store available as open source.  It is designed to handle non-relational (that is, not just simple row-column data) or unstructured data at large scale with high availability and durability.  For example, it can be used to store files, videos, documents, analytics results, Web content, drawings, voice recordings, images, maps, musical scores, pictures, or multimedia. Organizations can use Swift to store large amounts of data efficiently, safely, and cheaply. It scales horizontally without any single point of failure.  It offers a single multi-tenant storage system for all applications, the ability to use low-cost industry-standard servers and drives, and a rich ecosystem of tools and libraries.  It can serve the needs of any service provider or enterprise working in a cloud environment, regardless of whether the installation is using other OpenStack components.
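To make the object model concrete, here is a minimal sketch of storing and retrieving a file in Swift with the python-swiftclient library; the auth URL, account, user, and key below are placeholders for your own cluster’s values.

```python
# A minimal sketch of storing and retrieving an object with python-swiftclient.
# The auth URL, account, user, and key are placeholders -- substitute the
# values for your own Swift cluster.
import swiftclient

conn = swiftclient.Connection(
    authurl="http://swift.example.com/auth/v1.0",  # hypothetical endpoint
    user="account:user",
    key="secret",
)

# Create a container and upload an unstructured object into it.
conn.put_container("documents")
conn.put_object("documents", "report.pdf",
                contents=open("report.pdf", "rb"),
                content_type="application/pdf")

# Retrieve the object; Swift returns the response headers and the raw bytes.
headers, body = conn.get_object("documents", "report.pdf")
print(headers["content-type"], len(body), "bytes")
```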

I know what you are thinking: storage is too critical, so it will never work this way. But the same was said more than 25 years ago, when using RAID was seen as too risky because solutions would acknowledge writes while the data was still in cache, prior to being written to disk. The same was also said more than 15 years ago, when VMware was seen as not robust enough to run any manner of demanding or critical application. Replicas and erasure codes are analogous to RAID 1 and RAID 5 respectively, and distributing data as uniquely as possible behind a single namespace abstracts standard hardware in much the same way server virtualization does.
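For a rough sense of the trade-off between these two protection schemes, the sketch below compares the raw-capacity overhead of triple replication with that of an 8+4 erasure code; the parameters are illustrative only.

```python
# Back-of-the-envelope capacity overhead: triple replication (the RAID 1
# analogue) versus an 8+4 erasure code (the RAID 5/6 analogue). The code
# parameters are illustrative, not a recommendation.
def replica_overhead(copies):
    return float(copies)                      # raw bytes stored per usable byte

def ec_overhead(data_fragments, parity_fragments):
    return (data_fragments + parity_fragments) / data_fragments

print(replica_overhead(3))   # 3.0 -> 3x raw capacity, tolerates 2 lost copies
print(ec_overhead(8, 4))     # 1.5 -> 1.5x raw capacity, tolerates 4 lost fragments
```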

Interested in hearing more? Come check out my DSI session, “Swift Use Cases with SwiftStack,” where we look forward to sharing how this new type of storage can work, and to suspend your disbelief that this storage can be enterprise-grade.

 

On-Demand Cloud Storage Webcasts Worth Watching

As the SNIA Cloud Storage Initiative (CSI) starts 2016 with a new set of educational programs and webcasts on topics of interest to those developing, implementing and managing cloud storage, I thought it might be a good time to remind everyone of the vendor-neutral educational work the CSI delivered in 2015.

I’m particularly proud of the work the CSI has done through BrightTalk (a web-based content delivery platform) in producing live hour-long tutorials on a wide variety of subjects.

What you may not know is that these are also recorded, and you can play them back when it’s convenient to you. I know that we have a global audience, and that when we deliver the live version it may be in the middle of your busy working day – or even in the middle of the night.

As part of SNIA, the CSI supports the development of technical storage standards, and that means some of our audience are developers. For those of you who are interested in more technical presentations, we had two developer-focused BrightTalks:

Hierarchical Erasure Coding: Making Erasure Coding Usable

This talk covered two different approaches to erasure coding – a flat erasure code across JBOD, and a hierarchical code with an inner code and an outer code; it compared the two approaches on different parameters that impact the IT business and provided guidance on evaluating object storage solutions.
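As a rough illustration of the kind of comparison the talk made, the sketch below (with made-up code parameters) shows how raw-capacity overhead composes when an inner code within a node is layered beneath an outer code across nodes.

```python
# Rough sketch of how overhead composes in a hierarchical code: an outer code
# spans nodes and an inner code spans drives within each node, so the raw
# capacity overheads multiply. All code parameters here are illustrative.
def overhead(k, m):
    """Raw capacity stored per usable byte for a k-data + m-parity code."""
    return (k + m) / k

flat = overhead(10, 4)                          # one flat 10+4 code across the JBOD
hierarchical = overhead(8, 2) * overhead(5, 1)  # outer 8+2 across nodes, inner 5+1 per node

print(f"flat 10+4:            {flat:.2f}x raw capacity")
print(f"hierarchical 8+2/5+1: {hierarchical:.2f}x raw capacity")
# The hierarchical layout costs more raw capacity here, but single-drive
# failures can be repaired locally with the inner code instead of pulling
# fragments from other nodes.
```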

Expert Panel: Cloud Storage Initiatives – An SDC Preview

At the 2015 Storage Developer Conference (SDC) we presented on a variety of topics:

  • Mobile and Secure – Cloud Encrypted Objects using CDMI
  • Object Drives: A new Architectural Partitioning
  • Unistore: A Unified Storage Architecture for Cloud Computing
  • Using CDMI to Manage Swift, S3, and Ceph Object Repositories

We discussed how encrypted objects can be stored, retrieved, and transferred between clouds, how Object Drives allow storage to scale up and down by single drive increments, end-user and vendor use cases of the Cloud Data Management Interface (CDMI), and we introduced Unistore – an innovative unified storage architecture that efficiently integrates heterogeneous HDD and SCM devices for Cloud storage systems.

(As an added bonus, all these SDC 2015 presentations and others can be found here http://www.snia.org/events/storage-developer/presentations15.)

OpenStack has had a big year, and the CSI contributed to the discussion with:

OpenStack File Services for High Performance Computing

We looked at how OpenStack can consume and control file services appropriate to High Performance Compute (HPC) in a cloud and multi-tenanted environment, and investigated two approaches to integration. One approach is to have OpenStack manage the storage infrastructure services using Cinder, Nova and Neutron to provide HPC Filesystem as a Service. We also reviewed a second option of using Manila file services for OpenStack to control the HPC file system deployment and manage the exports and access (a sketch of this approach follows below). We discussed the development of the Lustre Manila driver and its current progress.
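As a rough sketch of the Manila approach, the snippet below creates an NFS share and grants access using python-manilaclient; the endpoint, credentials, share size, and subnet are placeholders, and the export is provisioned by whatever Manila driver the cloud has configured.

```python
# A sketch of driving file services through Manila, OpenStack's shared file
# system service. Endpoint, credentials, and network values are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from manilaclient import client as manila_client

auth = v3.Password(auth_url="http://controller:5000/v3",   # hypothetical endpoint
                   username="demo", password="secret", project_name="demo",
                   user_domain_name="Default", project_domain_name="Default")
manila = manila_client.Client("2", session=session.Session(auth=auth))

# Create a 1 GB NFS share; the configured Manila driver (for example an HPC
# file system driver) provisions the export behind the scenes.
share = manila.shares.create(share_proto="NFS", size=1, name="hpc-scratch")

# Grant read/write access to a tenant subnet.
manila.shares.allow(share, access_type="ip", access="10.0.0.0/24",
                    access_level="rw")
```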

Hybrid clouds were also in the news. We delivered two sessions, specifically targeted at end users looking to understand the technologies:

Hybrid Clouds: Bridging Private & Public Cloud Infrastructures

Every IT consumer is using cloud in one form or another, and just as storage buyers are reluctant to select a single vendor for their on-premises IT, they will choose to work with multiple public cloud providers. But this desirable “many vendor” cloud strategy introduces new problems of compatibility and integration. To provide a seamless view of these discrete storage clouds, Software Defined Storage (SDS) can be used to build a bridge between them. This presentation explored how SDS, with its ability to deploy on different hardware and supporting rich automation capabilities, can extend its reach into cloud deployments to support a hybrid data fabric that spans on-premises and public clouds.

Hybrid Clouds Part 2: Case Study on Building the Bridge between Private & Public

There are significant differences in how cloud services are delivered to various categories of users. The integration of these services with traditional IT operations remains an important success factor but also a challenge for IT managers. The key to success is to build a bridge between private and public clouds. This Webcast expanded on the previous Hybrid Clouds: Bridging Private & Public Cloud Infrastructures webcast where we looked at the choices and strategies for picking a cloud provider for public and hybrid solutions.

Lastly, we looked at some of the issues surrounding data protection and data privacy (no, they’re not the same thing at all!).

Privacy vs. Data Protection: The Impact of Int’l Data Protection Legislation on the Cloud

Governments across the globe are proposing and enacting strong data privacy and data protection regulations by mandating frameworks that include noteworthy changes like defining a data breach to include data destruction, adding the right to be forgotten, mandating the practice of breach notifications, and many other new elements. The implications of this and other proposed legislation on how the cloud can be utilized for storing data are significant. This webcast covered:

  • EU “directives” vs. “regulation”
  • General data protection regulation summary
  • How personal data has been redefined
  • Substantial financial penalties for non-compliance
  • Impact on data protection in the cloud
  • How to prepare now for impending changes

Moving Data Protection to the Cloud: Trends, Challenges and Strategies

This was a panel discussion; we talked about various new ways to perform data protection using the Cloud and many advantages of using the Cloud this way.

You can access all the CSI BrightTalk Webcasts on demand at the SNIA Website. Many of you will also be happy to learn that PDFs of the Webcast slides are also available there.

We had a good 2015, and I’m looking forward to producing more great educational material during 2016. If you have a topic you’d like to see the CSI cover this year, please comment below in this blog. We value input from all.

Thanks for your support and hopefully we’ll see you some time this year at one of our BrightTalk webcasts.

Exploring the Software Defined Data Center – A SNIA Cloud Webcast

SNIA Cloud is pleased to announce our next live Webcast, “Exploring the Software Defined Data Center.” A Software Defined Data Center (SDDC) is a compute facility in which all elements of the infrastructure – networking, storage, CPU and security – are virtualized and removed from proprietary hardware stacks. Deployment, provisioning and configuration, as well as the operation, monitoring and automation of the entire environment, are abstracted from hardware and implemented in software.

The results of this software-defined approach include maximizing agility and minimizing cost, benefits that appeal to IT organizations of all sizes. In fact, understanding SDDC concepts can help IT professionals in any organization better apply these software defined concepts to storage, networking, compute and other infrastructure decisions.

If you’re interested in Software Defined Data Centers and how such a thing might be implemented – and why this concept is important to IT professionals who aren’t involved with building data centers – then please join us on March 15th as Eric Slack, Sr. Analyst with Evaluator Group, will explain what “software defined” really means and why it’s important to all IT organizations. Eric will be joined by Alex McDonald, Chair for SNIA’s Cloud Storage Initiative who will talk about how these concepts apply to the modern data center.

Register now as we’ll explore:

  • How an SDDC leverages software defined concepts to make the private cloud feasible
  • How we can apply SDDC concepts to an existing data center
  • How to develop your own software defined data center environment

As always, this Webcast will be live. Eric, Alex and I will be on hand to answer your questions. We hope you’ll join us on March 15th.

Data Protection in the Cloud FAQ

SNIA recently hosted a multi-vendor discussion on leveraging the cloud for data protection. If you missed the Webcast, “Moving Data Protection to the Cloud: Trends, Challenges and Strategies”, it’s now available on-demand. As promised during the live event, we’ve compiled answers to some of the most frequently asked questions on this timely topic. Answers from SNIA as well as our vendor panelists are included. If you have additional questions, please comment on this blog and we’ll get back to you as soon as possible.

Q. What is the significance of NIST FIPS 140-2 Certification?

Acronis: FIPS 140-2 certification is a requirement for certain entities to use cloud-based solutions. It is important to understand the customer you are going after and whether this will be a requirement. Many small businesses do not require FIPS, but certain ones do.

Asigra: Organizations that are looking to move to a cloud-based data protection solution should strongly consider solutions that have been validated by the National Institute of Standards and Technology (NIST), an agency of the U.S. Department of Commerce, as this certification represents that the solution has been tested and maintains the most current security requirement for cryptographic modules, or encryption. It is important to validate that the data is encrypted at rest and in flight for security and compliance purposes. NIST issues numbered certificates to solution providers as the validation that their solution was tested and approved.

SolidFire: FIPS 140-2 has four levels of security, 1 through 4, depending on what the application requires. FIPS stands for Federal Information Processing Standard and is required by some non-military federal agencies for hardware/software to be allowed in their datacenter. This standard describes the requirements for how sensitive but unclassified information is stored, and it is focused on how the cryptographic modules secure information for these systems.

Q. How do you ensure you have real time data protection as well as protection from human error?  If the data is replicated, but the state of the data is incorrect (corrupt / deleted)… then the DR plan has not succeeded.

SNIA: The best way to guard against human error or corruption is with regular point-in-time snapshots; some snapshots can be retained for a limited length of time while others are kept for as long as the data needs to be retained.  Snapshots can be done in the cloud as well as in local storage.

Acronis: Each business needs to think through its retention plan to mitigate such cases. For example, it might run 7 daily backups, 4 weekly backups, 12 monthly backups and one yearly backup (see the sketch below). In addition, it is good to have a system that allows one to test the backup with a simulated recovery to guarantee that data has not been corrupted.
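As a simple illustration of such a retention plan, here is a sketch that applies the 7-daily/4-weekly/12-monthly/1-yearly schedule to a list of backup dates; the weekly and monthly anchor days are arbitrary choices for the example.

```python
# A sketch of the retention schedule described above: keep 7 daily, 4 weekly,
# 12 monthly, and 1 yearly backup. Weekly copies are taken as Sundays and
# monthly copies as the 1st of the month; both choices are illustrative.
from datetime import date, timedelta

def backups_to_keep(backup_dates, today):
    daily = sorted(d for d in backup_dates if d > today - timedelta(days=7))
    weekly = sorted(d for d in backup_dates
                    if d.weekday() == 6 and d > today - timedelta(weeks=4))
    monthly = sorted(d for d in backup_dates
                     if d.day == 1 and d > today - timedelta(days=365))
    yearly = sorted(d for d in backup_dates if d.month == 1 and d.day == 1)
    return set(daily[-7:]) | set(weekly[-4:]) | set(monthly[-12:]) | set(yearly[-1:])

# Example: two years of daily backups ending on 1 January 2016.
history = [date(2016, 1, 1) - timedelta(days=i) for i in range(730)]
print(len(backups_to_keep(history, today=date(2016, 1, 1))))  # about 21 distinct restore points
```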

Asigra: One way for organizations that are migrating to SaaS-based applications like Google Apps, Microsoft Office 365 and Salesforce.com to protect the data created and stored in these applications is to consider a cloud-based data protection solution that backs up the data from these applications to a third-party cloud, meeting the unique data protection requirements of your organization. You need to take the responsibility to protect your data born in the cloud much like you protect data created in traditional on-premises applications and databases. The responsibility for data protection does not move to the SaaS application provider; it remains with you.

For example, user error is one of the top ways that data is lost in the cloud. With Microsoft Office 365 by default, deleted emails and mailboxes are unrecoverable after 30 days; if you cancel your subscription, Microsoft deletes all your data after 90 days; and Microsoft’s maximum liability is $5000 US or what a customer paid during the last 12 months on subscription fees – assuming you can prove it was Microsoft’s fault. All the more reason you need to have a data protection strategy in place for data born in the cloud.

SolidFire: You need a real-time asynchronous replication technology that achieves a low RPO and does not rely on snapshots. Application-consistent snapshots must be used alongside real-time replication to achieve both real-time and point-in-time protection. Consider the scenario where you perform a successful failover but then discover the data is corrupted: with application-consistent snapshots at the DR site, you would be able to roll back instantly to a point in time when the data and the application were in a known good state.

Q. What’s the easiest and most effective way for companies to take advantage of cloud data protection solutions today? Where should we start?

SNIA: The easiest way to ease into using cloud storage is to either (1) use the direct cloud interface of your backup software, if it has one, to set up an offsite backup, or (2) use a cloud storage gateway that allows public or private cloud storage to appear as another local NAS resource.

Acronis: The easiest way is to use a solution that supports both cloud and on-premises data protection. Then they can start by backing up certain workloads to the cloud and adding more over time. Today, we see that many workloads are protected with both a cloud and an on-premises copy.

Asigra: Organizations should start with non-production, non-critical workloads to test the cloud-based data protection solution to ensure that it meets their needs before moving to critical workloads. Identifying and understanding their corporate requirements for a public, private and/or hybrid cloud architecture is important as well as identifying the workloads that will be moved to the cloud and the timing of this transition. Also, organizations may want to consult with a third party IT Solutions Provider who has the expertise and experience with cloud-based data protection solutions to explore how others are leveraging cloud-based solutions, as well as conduct a data classification exercise to understand which young data needs to be readily available versus older data that needs to be retained for longer periods of time for compliance purposes. It is important that organizations identify their required Recovery Time Objectives and Recovery Point Objectives when setting up their new solution to ensure that in the event of a disaster they are able to meet these requirements. Tip: Retain the services of a trusted IT Solution Provider and run a proof of concept or test drive the solution before moving to full production.

SolidFire: Find a simple and automated solution that fits into your budget. Work with your local value-added reseller of data protection services. The best thing to do is NOT wait. Even if it’s something like Carbonite, it’s better than nothing. Don’t get caught off guard; no one plans for a disaster.

Q. Is it sensible to move to a pay-as-you-go service for data that may be retained for 7, 10, 30, or even 100 years?

SNIA: Long term retention does demand low cost storage of course, and although the major public cloud storage vendors offer low pay-as-you-go costs, those costs can add up to significant amounts over a long period of time, especially if there is any regular need to access the data. An organization can keep control over the costs of long term storage by setting up an in-house object storage system (“private cloud”) using “white box” hardware and appropriate software such as what is offered by Cloudian, Scality, or Caringo. Another way to control the costs of long-term storage is via the use of tape. Note that any of these methods (public cloud, private cloud, or tape) require an IT organization, or their service provider, to regularly monitor the state of the storage and periodically refresh it; there is always the potential over time for hardware to fail, or for the storage media to deteriorate, resulting in what is called bit rot.
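To see how quickly pay-as-you-go charges accumulate, here is a back-of-the-envelope calculation; the capacity, per-GB rate, and time horizon are made-up figures for illustration, not any vendor’s actual pricing.

```python
# Rough arithmetic on long-term pay-as-you-go storage costs. All figures are
# hypothetical and exclude egress, request, and retrieval fees, which only
# add to the total.
capacity_tb = 100
price_per_gb_month = 0.01    # hypothetical object-storage rate in USD
years = 10

monthly = capacity_tb * 1000 * price_per_gb_month
total = monthly * 12 * years
print(f"${monthly:,.0f} per month -> ${total:,.0f} over {years} years")
# $1,000 per month -> $120,000 over 10 years
```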

Acronis: The cost of storage is dropping dramatically and will continue to do so. The best strategy is to go with a pay as you go model with the ability to adjust pricing (downward) at least once a year. Buying your own storage will lock you into pricing over too long of a period.

SolidFire: The risk of moving to a pay-as-you-go service for that long is that you lock yourself in for as long as you need to keep the data. Make sure that contractually you can migrate or move the data from them, even if it’s for a fee. The sensible part is that you can contract that portion of your IT needs out and focus on your business and advancing it, rather than worrying about completing backups on your own.

Q. Is it possible to set up a backup so that one copy is with one cloud provider and another with a second cloud provider (replicated between them, not just doing the backup twice) in case one cloud provider goes out of business?

SNIA: Standards like SNIA’s CDMI (Cloud Data Management Interface) make replication between different cloud vendors pretty straightforward, since CDMI provides a data- and metadata-neutral way of transferring data, and it provides both standard and extensible metadata to control policy too.
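As a sketch of what such a transfer can look like on the wire, the snippet below writes an object to a CDMI endpoint along with a piece of user metadata that a replication policy could act on; the endpoint URL, credentials, and metadata key are placeholders rather than any particular vendor’s interface.

```python
# A minimal sketch of a CDMI object PUT carrying user metadata. The endpoint,
# credentials, and metadata key are illustrative placeholders.
import json
import requests

headers = {
    "X-CDMI-Specification-Version": "1.1.1",
    "Content-Type": "application/cdmi-object",
    "Accept": "application/cdmi-object",
}
body = {
    "mimetype": "text/plain",
    "metadata": {"org_example_replication_target": "cloud-b"},  # hypothetical policy metadata
    "value": "hello from cloud A",
}

resp = requests.put("https://cloud-a.example.com/cdmi/backups/note.txt",  # hypothetical endpoint
                    headers=headers, data=json.dumps(body),
                    auth=("user", "secret"))
print(resp.status_code)  # 201 Created on success
```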

Acronis: Yes, this is possible but this is not a good strategy to mitigate a provider going out of business. If that is a concern then pick a provider you trust and one where you control where the data is stored. Then you can easily switch provider if needed.

SolidFire: Yes, setting up a DR site and a tertiary site is very doable. Many data protection software companies do this for you with integrations at the cloud providers. When looking at data protection technology, make sure their policy engine is capable of being aware of multiple targets and moving data seamlessly between them. If you’re worried about cloud service providers going out of business, make sure you bet on the big ones with proven success and revenue flow.

 

Q&A – The Impact of International Data Protection Laws on the Cloud

The impact of international data protection legislation on the cloud is complicated and constantly changing. In our recent SNIA Cloud Storage Webcast on this topic we did our best to cover some of the recent global data privacy and data protection regulations being enacted. If you missed the Webcast, I encourage you to watch it on-demand at your convenience. We answered questions during the live event, but as promised we’re providing more complete answers in this blog. If you have additional questions, please comment here and we’ll reply as soon as we can.

The law is complex, and neither SNIA, the authors nor the presenters of this presentation are lawyers. Nothing here or in the presentation should be construed as legal advice. For that you need the services of a qualified professional.

Q. What are your thoughts on Safe Harbour being considered invalid, and the potential for a Safe Harbour 2?

A. Since 6 October 2015 when the European Court of Justice invalidated the European Commission’s Safe Harbour Decision, there’s been a lot written about Safe Harbour 2 in the press. But it was clear that a renegotiation was essential two years before that, when discussions for a replacement were started. Many think (and many hope!) that a new and valid agreement in terms of Europe’s Human Rights legislation will be settled between the US and Europe sometime in March 2016.

Q. Are EU Model Clauses still available to use instead of BCRs (Binding Corporate Rules)?

A. EU-US data transfers facilitated by the use of model clauses probably today fail to comply with EU law. But as there appears to be no substitute available, the advice appears to be – use them for now until the problem is fixed. Full guidance can be found on the EC website.

Q. What does imbalance mean relative to consent?

A. An example might help. You might be an employee and agree (the “consent”) to your data being used by your employer in ways that you might not have agreed to normally – perhaps because you feel you can’t refuse because you might lose your job or a promotion for example. That’s an imbalanced relationship, and the consent needs to be seen in that light, and the employer needs to demonstrate that there has been, and will be, no coercion to give consent.

Come See SNIA at the Software-Defined Infrastructure Summit

Demand for software-defined infrastructure (SDI) is on the rise, and with good reason. SDI helps data centers meet the challenges of cloud computing, big data/analytics, mobility and social media, in an agile and cost-effective way.  I’m pleased to announce that SNIA will be an active participant at next week’s Software-Defined Infrastructure Summit in Santa Clara, CA, December 1-3.

My colleagues and I at the SNIA Cloud Storage Initiative have organized a “Working with OpenStack” Seminar that kicks off the Summit on Tuesday, December 1.

I will keynote an OpenStack fireside chat along with Chris DePuy, VP at Dell’Oro Group. We’ll be discussing the SNIA Cloud Data Management Interface (CDMI) and its interface with OpenStack, OpenStack implementations, how standards play, and the future of open source in the 21st century.

My keynote will be accompanied by additional SNIA talks in the Introduction to OpenStack session and the Application Management session:

  • Sam Fineberg, PhD, SNIA Cloud Storage Initiative member and Distinguished Technologist at Hewlett Packard Enterprise Storage, will provide an overview of the storage aspects of OpenStack including the core projects for block storage (Cinder) and object storage (Swift), and the new shared file service (Manila). He’ll cover some common configurations and use cases for these technologies, and discuss how they interact with the other parts of OpenStack.
  • Richelle Ahlvers, SNIA Open Source Task Force member and Principal Storage Management Architect at Avago Technologies, will discuss application integration in OpenStack and how SNIA-developed standards enable cross-vendor management interoperability and help open source projects interoperate with more industry solutions.

Tuesday’s Seminar day will include additional sessions from leaders in OpenStack, Ceph, and Software Defined Storage. SDI Summit days 2 and 3 will provide information on hardware, software, and data center technology and applications of software-defined infrastructure featuring keynotes from IBM, Intel, Red Hat, and VMware, all SNIA member companies.  It’s a must attend event.

SNIA will also be exhibiting at the Summit. Please stop by booth #408 to learn how SNIA standards are used in open source projects including cloud data management, non-volatile memory, self-contained information retention, and storage management. We will also have information on SNIA programs such as membership, certification, conformance testing, and conferences.

SNIA members and colleagues can use the code SPGP to receive a $100 discount on any level of SDI Summit registration. I hope to see you in Santa Clara!

Upcoming Webcast: The Impact of International Data Protection Legislation on the Cloud

Data privacy vs. data protection has become a heated debate in businesses around the world as governments across the globe propose and enact strong data privacy and data protection regulations. Join us on November 18th for our next Cloud Storage live Webcast, “Data Privacy vs. Data Protection: The Impact of International Data Protection Legislation on the Cloud.”

Mandating frameworks that include noteworthy changes like defining a data breach to include data destruction, adding the right to be forgotten, mandating the practice of breach notifications, and many other new elements are literally changing the rules when it comes to data protection. The implications of this, and other proposed legislation, on how the cloud can be utilized for storing data are significant. Join this live Webcast to hear:

  • “Directives” vs. “regulation”
  • General data protection regulation summary
  • How personal data has been redefined
  • Substantial financial penalties for non-compliance
  • Impact on data protection in the cloud
  • How to prepare now for impending changes

Our experts, Bob Plumridge, SNIA Europe Board Member; Eric Hibbard, Chair of the SNIA Security TWG; and I will all be available to answer your questions during the event. I encourage you to register today for this timely discussion. We hope to see you on November 18th!

Moving Data Protection to the Cloud: Key Considerations

Leveraging the cloud for data protection can be an advantageous and viable option for many organizations, but first you must understand the pros and cons of different approaches. Join us on Nov. 17th for our live Webcast, “Moving Data Protection to the Cloud: Trends, Challenges and Strategies” where we’ll discuss the experiences of others, with advice on how to avoid the pitfalls, especially during the transition from strictly local resources to cloud resources. We’ve pulled together a unique panel of SNIA experts as well as perspectives from leading vendor experts from Acronis, Asigra and SolidFire, who’ll discuss and debate:

  • Critical cloud data protection challenges
  • How to use the cloud for data protection
  • Pros and cons of various cloud data protection strategies
  • Experiences of others to avoid common pitfalls
  • Cloud standards in use – and why you need them

Register now for this live and interactive event. Our entire panel will be available to answer your questions. I hope you’ll join us!

 

OpenStack File Services Options

How can OpenStack consume and control file services appropriate to High Performance Compute (HPC) in a cloud and multi-tenanted environment? Find out on September 22nd when SNIA Cloud hosts a live Webcast and examines two approaches to integration.

One approach is to have OpenStack manage the storage infrastructure services using Cinder, Nova and Neutron to provide HPC Filesystem as a Service.

A second option is to use Manila file services for OpenStack to control the HPC file system deployment and manage the exports and access. This part will also look at the Lustre Manila driver, which is currently in development, and its progress to date.

I hope you’ll join Alex McDonald and me as we discuss the pros and cons of each approach. Register today and join us on September 22nd.

Cloud Storage Development Challenges – An SDC Preview

This year’s Storage Developer Conference (SDC) is expected to draw over 400 storage developers and professionals. On August 4th, you can get a sneak preview of key cloud topics that will be covered at SDC in this live Webcast, where David Slik and Mark Carlson, Co-Chairs of the SNIA Cloud Technical Work Group, together with Yong Chen, Assistant Professor at Texas Tech University, will discuss:

  • Mobile and Secure – Cloud Encrypted Objects using CDMI
  • Object Drives: A new Architectural Partitioning
  • Unistore: A Unified Storage Architecture for Cloud Computing
  • Using CDMI to Manage Swift, S3, and Ceph Object Repositories

You’ll learn how encrypted objects can be stored, retrieved, and transferred between clouds, how Object Drives allow storage to scale up and down by single drive increments, end-user and vendor use cases of the Cloud Data Management Interface (CDMI), and we’ll introduce Unistore – an innovative unified storage architecture that efficiently integrates heterogeneous HDD and SCM devices for Cloud storage systems.

I’ll be moderating the discussion among this expert panel. It should be an enlightening and lively hour. I hope you’ll register now to join us.