Q&A – OpenStack Mitaka and Data Protection

At our recent SNIA Webcast “Data Protection and OpenStack Mitaka,” Ben Swartzlander, Project Team Lead OpenStack Manila (NetApp), and Dr. Sam Fineberg, Distinguished Technologist (HPE), provided terrific insight into data protection capabilities surrounding OpenStack. If you missed the Webcast, I encourage you to watch it on-demand at your convenience. We did not have time to get to all of our attendees’ questions during the live event, so as promised, here are answers to the questions we received.

Q. Why are there NFS drivers for Cinder?

A. It’s fairly common in the virtualization world to store virtual disks as files in filesystems. NFS is widely used to connect hypervisors to storage arrays for the purpose of storing virtual disks, which is Cinder’s main purpose.

 Q. What does “crash-consistent” mean?

A. It means that the data on disk is what would be there if the system “crashed” at that point in time. In other words, the data reflects the order of the writes, and any writes that are lost are the most recent ones. To avoid losing data with a crash-consistent snapshot, one must force all recently written data and metadata to be flushed to disk prior to snapshotting, and prevent further changes during the snapshot operation.
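
The flush-then-freeze discipline can be sketched in a few lines of Python. This is only an illustration of the ordering at the file level; real block-storage snapshots happen at the array or hypervisor layer (e.g., after an `fsfreeze`), and the function and file names here are hypothetical.

```python
import os
import shutil
import tempfile

def crash_consistent_snapshot(path, snapshot_path):
    """Illustrative only: flush dirty data to stable storage first,
    then take the point-in-time copy while no new writes occur."""
    # 1. Force recently written data and metadata to disk.
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    # 2. Take the snapshot. A real system would freeze the filesystem
    #    here (e.g., fsfreeze) to prevent further changes meanwhile.
    shutil.copyfile(path, snapshot_path)

# Usage
workdir = tempfile.mkdtemp()
data = os.path.join(workdir, "volume.img")
snap = os.path.join(workdir, "volume.snap")
with open(data, "w") as f:
    f.write("committed writes")
crash_consistent_snapshot(data, snap)
with open(snap) as f:
    print(f.read())  # the snapshot reflects all flushed writes
```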

Q. How do you recover from a Cinder replication failover?

A. The system will continue to function after the failover; however, there is currently no mechanism to “fail back” or “re-replicate” the volumes. This function is currently in development, and the OpenStack community will have a solution in a future release.

 Q. What is a Cinder volume type?

A. Volume types are administrator-defined “menu choices” that users can select when creating new volumes. Each type carries hidden metadata, defined in the cinder.conf file, which Cinder uses to decide where to place new volumes at creation time and which drivers to use to configure them.
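
Conceptually, the scheduler matches a type’s hidden extra specs against the capabilities each backend reports. The backend names, spec keys, and matching rule below are illustrative only, not Cinder’s actual filter-scheduler code.

```python
# Hypothetical backends and the capabilities they report.
backends = {
    "ssd-array": {"volume_backend_name": "ssd-array", "compression": True},
    "nfs-pool":  {"volume_backend_name": "nfs-pool",  "compression": False},
}

# Admin-defined "menu choices"; the extra specs stay hidden from users.
volume_types = {
    "gold":   {"volume_backend_name": "ssd-array"},
    "bronze": {"volume_backend_name": "nfs-pool"},
}

def pick_backend(type_name):
    """Return the first backend whose capabilities satisfy every
    extra spec on the chosen volume type."""
    specs = volume_types[type_name]
    for name, caps in backends.items():
        if all(caps.get(k) == v for k, v in specs.items()):
            return name
    raise LookupError("no backend matches type %s" % type_name)

print(pick_backend("gold"))    # -> ssd-array
print(pick_backend("bronze"))  # -> nfs-pool
```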

 Q. Can you replicate when multiple Cinder backends are in use?

A. Yes.

 Q. What makes a Cinder “backup” different from a Cinder “snapshot”?

A. Snapshots are used for preserving the state of a volume against subsequent changes, allowing recovery from software or user errors, and also allowing a volume to remain stable long enough for it to be backed up. Snapshots are also very efficient to create, since many devices can create them without copying any data. However, snapshots are local to the primary data and typically have no additional protection from hardware failures. In other words, the snapshot is stored on the same storage devices and typically shares disk blocks with the original volume.

Backups are stored in a neutral format which can be restored anywhere and typically on separate (possibly remote) hardware, making them ideal for recovery from hardware failures.
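
The distinction can be sketched with a toy model in which a volume is a map of block number to data. The dict-based “storage” and the names here are purely illustrative, not how any real Cinder driver works.

```python
# Toy model: a volume is a map of block number -> data.
primary_array = {"vol1": {0: "a", 1: "b", 2: "c"}}

# Snapshot: a point-in-time block map kept on the SAME array as the
# volume -- cheap to create, but it shares fate with the primary hardware.
primary_array["vol1-snap"] = dict(primary_array["vol1"])

# Backup: every block serialized to a neutral format on SEPARATE storage.
backup_store = [(i, d) for i, d in sorted(primary_array["vol1"].items())]

primary_array["vol1"][1] = "B"        # later corruption or user error
print(primary_array["vol1-snap"][1])  # "b": the snapshot can roll it back

primary_array.clear()                 # primary hardware failure...
print(backup_store)                   # ...only the off-array backup survives
```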

 Q. Can you explain what “share types” are and how they work?

A. Share types are Manila’s version of Cinder’s volume types. One key difference is that some of their metadata is not hidden; it is visible to end users. Certain APIs only work with shares whose types have specific capabilities.

 Q. What’s the difference between Cinder’s multi-attached and Manila’s shared file system?

A. Multi-attached Cinder volumes require cluster-aware filesystems or similar technology to be used on top of them. Ordinary file systems cannot handle multi-attachment and will corrupt data quickly if attached to more than one system. Therefore Cinder’s multi-attach mechanism is only intended for filesystems or database software that is specifically designed to use it.

Manila’s shared filesystems use industry-standard network protocols, like NFS and SMB, to provide filesystems to arbitrary numbers of clients where shared access is a fundamental part of the design.

 Q. Is it true that failover is automatic?

A. No. Failover is not automatic, for either Cinder or Manila.

Q. Follow-up on failure: my question was about the array-loss scenario described in the block storage discussion. Once the admin decides the array has failed, does failover need to be performed on a “VM-by-VM” basis? How does the VM know to re-attach to another fabric, etc.?

A. Failover is all at once, but VMs do need to be reattached one at a time.

Q. What about Cinder? Is unified object storage on standard high-volume (SHV) servers the future of storage?

 A. This is a matter of opinion. We can’t give an unbiased response.

Q. What about a “global file share/file system view” of a lot of Manila “file shares” (i.e., a scalable global namespace)?

 A. Shares have disjoint namespaces intentionally. This allows Manila to provide a simple interface which works with lots of implementations. A single large namespace could be more valuable but would preclude many implementations.


Data Protection and OpenStack Mitaka

Interested in data protection and storage-related features of OpenStack? Then please join us for a live SNIA Webcast “Data Protection and OpenStack Mitaka” on June 22nd. We’ve pulled together an expert team to discuss the data protection capabilities of the OpenStack Mitaka release, which includes multiple new resiliency features. Join Dr. Sam Fineberg, Distinguished Technologist (HPE), and Ben Swartzlander, Project Team Lead OpenStack Manila (NetApp), as they dive into:

  • Storage-related features of Mitaka
  • Data protection capabilities – Snapshots and Backup
  • Manila share replication
  • Live migration
  • Rolling upgrades
  • HA replication

Sam and Ben will be on-hand for a candid Q&A near the end of the Webcast, so please start thinking about your questions and register today. We hope to see you there!

This Webcast is co-sponsored by two groups within the Storage Networking Industry Association (SNIA): the Cloud Storage Initiative (CSI), and the Data Protection & Capacity Optimization Committee (DPCO).

 

Data Protection in the Cloud FAQ

SNIA recently hosted a multi-vendor discussion on leveraging the cloud for data protection. If you missed the Webcast, “Moving Data Protection to the Cloud: Trends, Challenges and Strategies”, it’s now available on-demand. As promised during the live event, we’ve compiled answers to some of the most frequently asked questions on this timely topic. Answers from SNIA as well as our vendor panelists are included. If you have additional questions, please comment on this blog and we’ll get back to you as soon as possible.

Q. What is the significance of NIST FIPS 140-2 Certification?

Acronis: FIPS 140-2 Certification is a requirement by certain entities to use cloud-based solutions. It is important to understand the customer you are going after and whether this will be a requirement. Many small businesses do not require FIPS, but certain ones do.

Asigra: Organizations that are looking to move to a cloud-based data protection solution should strongly consider solutions that have been validated by the National Institute of Standards and Technology (NIST), an agency of the U.S. Department of Commerce. This certification represents that the solution has been tested and meets the most current security requirements for cryptographic modules, i.e., encryption. It is important to validate that data is encrypted at rest and in flight for security and compliance purposes. NIST issues numbered certificates to solution providers as validation that their solution was tested and approved.

SolidFire: FIPS 140-2 defines four levels of security, 1 through 4, depending on what the application requires. FIPS stands for Federal Information Processing Standard, and compliance is required by some non-military federal agencies before hardware or software is allowed in their datacenters. The standard describes the requirements for how sensitive but unclassified information is stored, focusing on how cryptographic modules secure information for these systems.

Q. How do you ensure you have real time data protection as well as protection from human error?  If the data is replicated, but the state of the data is incorrect (corrupt / deleted)… then the DR plan has not succeeded.

SNIA: The best way to guard against human error or corruption is with regular point-in-time snapshots; some snapshots can be retained for a limited length of time while others are kept for as long as the data needs to be retained.  Snapshots can be done in the cloud as well as in local storage.

Acronis: Each business needs to think through its retention plan to mitigate such cases. For example, it might keep 7 daily backups, 4 weekly backups, 12 monthly backups and one yearly backup. In addition, it is good to have a system that allows one to test the backup with a simulated recovery to guarantee that data has not been corrupted.
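
A retention plan like this is easy to express in code. The sketch below selects which nightly backups to keep under a 7-daily / 4-weekly / 12-monthly / 1-yearly scheme; the boundary choices (Sundays for weeklies, first-of-month for monthlies, January 1 for yearlies) are one reasonable interpretation, not a standard.

```python
from datetime import date, timedelta

def retained(backup_dates, today):
    """Return the subset of nightly backups kept under a
    7-daily / 4-weekly / 12-monthly / 1-yearly retention plan.
    Boundary choices here are illustrative assumptions."""
    dailies = [d for d in backup_dates if (today - d).days < 7]
    weeklies = [d for d in backup_dates
                if d.weekday() == 6 and (today - d).days < 28]   # Sundays
    monthlies = [d for d in backup_dates
                 if d.day == 1 and (today - d).days < 365]       # 1st of month
    yearlies = [d for d in backup_dates if d.month == 1 and d.day == 1]
    keep = set()
    keep.update(dailies, weeklies, monthlies, yearlies[-1:])
    return sorted(keep)

# Usage: a year of nightly backups, pruned down to the retained set.
today = date(2016, 6, 22)
backups = [today - timedelta(days=n) for n in range(365)]
kept = retained(backups, today)
print(len(kept), "of", len(backups), "backups retained")
```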

Asigra: Organizations that are migrating to SaaS-based applications like Google Apps, Microsoft Office 365 and Salesforce.com should consider a cloud-based data protection solution that backs up the data created and stored in these applications to a third-party cloud, meeting the unique data protection requirements of your organization. You need to take responsibility for protecting your data born in the cloud, much as you protect data created in traditional on-premises applications and databases. The responsibility for data protection does not move to the SaaS application provider; it remains with you.

For example, user error is one of the top ways that data is lost in the cloud. With Microsoft Office 365, by default, deleted emails and mailboxes are unrecoverable after 30 days; if you cancel your subscription, Microsoft deletes all your data after 90 days; and Microsoft’s maximum liability is $5,000 US or what a customer paid during the last 12 months in subscription fees, assuming you can prove it was Microsoft’s fault. All the more reason to have a data protection strategy in place for data born in the cloud.

SolidFire: You need a technology that provides real-time asynchronous replication, achieving a low RPO without relying on snapshots. Application-consistent snapshots must be used concurrently with a real-time replication technology to achieve both real-time and point-in-time protection. Consider the scenario where you perform a successful failover but the replicated data is corrupted: with application-consistent snapshots at the DR site, you would be able to roll back instantly to a point in time when the data and the application were in a known good state.

Q. What’s the easiest and most effective way for companies to take advantage of cloud data protection solutions today? Where should we start?

SNIA: The easiest way to ease into using cloud storage is to either (1) use the direct cloud interface of your backup software if it has one to set up an offsite backup, or (2) use a cloud storage gateway that allows public or private cloud storage to appear as another local NAS resource.

Acronis: The easiest way is to use a solution that supports both cloud and on premises data protection. Then they can start by backing up certain workloads to the cloud and adding more over time. Today, we see that many workloads are protected with both a cloud and on premise copy.

Asigra: Organizations should start with non-production, non-critical workloads to test the cloud-based data protection solution to ensure that it meets their needs before moving to critical workloads. Identifying and understanding their corporate requirements for a public, private and/or hybrid cloud architecture is important, as is identifying the workloads that will be moved to the cloud and the timing of this transition. Also, organizations may want to consult with a third-party IT solutions provider who has expertise and experience with cloud-based data protection solutions, both to explore how others are leveraging cloud-based solutions and to conduct a data classification exercise to understand which newer data needs to be readily available versus older data that must be retained for longer periods for compliance purposes. It is important that organizations identify their required Recovery Time Objectives and Recovery Point Objectives when setting up their new solution to ensure that in the event of a disaster they are able to meet these requirements. Tip: Retain the services of a trusted IT solution provider and run a proof of concept or test drive the solution before moving to full production.

SolidFire: Find a simple and automated solution that fits into your budget. Work with your local value-added reseller of data protection services. The best thing to do is NOT wait. Even if it’s something like Carbonite, it’s better than nothing. Don’t get caught off guard. No one plans for a disaster.

Q. Is it sensible to move to a pay-as-you-go service for data that may be retained for 7, 10, 30, or even 100 years?

SNIA: Long-term retention does demand low-cost storage of course, and although the major public cloud storage vendors offer low pay-as-you-go costs, those costs can add up to significant amounts over a long period of time, especially if there is any regular need to access the data. An organization can keep control over the costs of long-term storage by setting up an in-house object storage system (“private cloud”) using “white box” hardware and appropriate software such as that offered by Cloudian, Scality, or Caringo. Another way to control the costs of long-term storage is via the use of tape. Note that any of these methods — public cloud, private cloud, or tape — require an IT organization, or its service provider, to regularly monitor the state of the storage and periodically refresh it; there is always potential over time for hardware to fail or for the storage media to deteriorate, resulting in what is called bit rot.
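
The way small monthly rates compound over long retention periods is simple arithmetic. The $0.01/GB-month rate below is an illustrative assumption, not any vendor’s actual price, and the model ignores price drops, egress fees, and request charges.

```python
# Back-of-envelope cumulative cost of pay-as-you-go retention.
def cumulative_cost(tb, price_per_gb_month, years):
    """Total spend for keeping `tb` terabytes at a flat per-GB-month
    rate for `years` years (no price changes, no access charges)."""
    gb = tb * 1024
    return gb * price_per_gb_month * 12 * years

# Usage: 100 TB at an assumed $0.01/GB-month over the horizons asked about.
for years in (7, 10, 30, 100):
    print(years, "years:", f"${cumulative_cost(100, 0.01, years):,.0f}")
```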

Acronis: The cost of storage is dropping dramatically and will continue to do so. The best strategy is to go with a pay as you go model with the ability to adjust pricing (downward) at least once a year. Buying your own storage will lock you into pricing over too long of a period.

SolidFire: The risk of moving to a pay-as-you-go service for that long is that you lock yourself in for as long as you need to keep the data. Make sure that contractually you can migrate or move the data away from the provider, even if it’s for a fee. The sensible part is that you can contract that portion of your IT needs out and focus on your business and advancing it, not worrying about completing backups on your own.

Q. Is it possible to set up a backup so that one copy is with one cloud provider and another with a second cloud provider (replicated between them, not just doing the backup twice) in case one cloud provider goes out of business?

SNIA: Standards like the SNIA’s CDMI (Cloud Data Management Interface) make replication between different cloud vendors pretty straightforward, since CDMI provides a data and metadata neutral way of transferring data; and provides both standard and extensible metadata to control policy too.

Acronis: Yes, this is possible, but it is not a good strategy for mitigating a provider going out of business. If that is a concern, pick a provider you trust and one where you control where the data is stored. Then you can easily switch providers if needed.

SolidFire: Yes, setting up a DR site and a tertiary site is very doable. Many data protection software companies do this for you with integrations at the cloud providers. When looking at data protection technology, make sure the policy engine is capable of being aware of multiple targets and moving data seamlessly between them. If you’re worried about cloud service providers going out of business, make sure you bet on the big ones with proven success and revenue flow.