Watson: From Jeopardy! to Digital Support Assistant

When IBM Watson premiered on “Jeopardy!”, viewers were mesmerized by Watson’s ability to answer the quiz show’s questions and, most of the time, beat the human contestants! Fast-forward to today and the real-world applications extend well beyond playing trivia games. Watson is being deployed in a variety of medical and business scenarios.

In fact, NetApp is now using Watson as part of Elio, a virtual support assistant that responds to queries in natural language. Elio is built using Watson’s cognitive computing capabilities, which enable Elio to analyze unstructured data by using natural language processing to understand grammar and context, interpret complex questions, and evaluate all possible meanings to determine what is being asked. Elio then reasons and identifies the best answers to questions with help from experts who monitor the quality of answers and continue to train Elio on more subjects. It’s a fascinating application of artificial intelligence (AI) that we will discuss in detail at our SNIA Cloud Storage webcast on February 22, 2018, “Customer Support through Natural Language Processing and Machine Learning.”

Elio and Watson represent a novel use of large quantities of unstructured data to help solve problems, on average, four times faster than traditional methods. Join us at this webcast, where those on the front lines of this application will discuss:

  • The challenges of utilizing large quantities of valuable yet unstructured data
  • How Watson and Elio continuously learn as more data arrives, and navigate an ever-growing volume of technical information
  • How Watson understands customer language and provides understandable responses

Learn how these new and exciting technologies are changing the way we look at and interact with large volumes of traditionally hard-to-analyze data. Register now! We look forward to seeing you on Feb. 22nd.

Evaluator Group to Share Hybrid Cloud Research

In a recent survey of enterprise hybrid cloud users, the Evaluator Group saw that nearly 60% of respondents indicated that lack of interoperability is a significant technology issue that they must overcome in order to move forward. In fact, lack of interoperability was the number one issue, surpassing public cloud security and network security as significant inhibitors.

The SNIA Cloud Storage Initiative (CSI) is pleased to have John Webster, Senior Partner at Evaluator Group, who will join us on December 12th for a live webcast to dive into the findings of their research. In this webcast, “Multi-Cloud Storage: Addressing the Need for Portability and Interoperability,” my SNIA Cloud colleague, Mark Carlson, and John will discuss enterprise hybrid cloud objectives and barriers to adoption. John and Mark will focus on cloud interoperability within the storage domain and the CSI’s work that promotes interoperability and portability of data stored in the cloud.

Expert Answers to Cloud Object Storage and Gateways Questions

In our most recent SNIA Cloud webcast, “Cloud Object Storage and the Use of Gateways,” we discussed market trends toward the adoption of object storage and the use of gateways to execute on a cloud strategy.  If you missed the live event, it’s now available on-demand together with the webcast slides. There were many good questions at the live event and our expert, Dan Albright, has graciously answered them in this blog.

Q. Can object storage be accessed by tools for use with big data?

A. Yes. Technically, big data tools can access object storage in real time through HDFS connectors such as S3, but performance is conditional on latency. If the object storage is based on local hard drives, it should not be used as the primary storage, as it would run very slowly. The guidance is to use hard drive based object storage either as an online archive or a backup target for HDFS.

Q. Will current block storage or NAS be replaced with cloud object storage + gateway?

A. Yes and no.  It’s dependent on the use case. For ILM (Information Lifecycle Management) uses, only the aged and infrequently accessed data is moved to the gateway+cloud object storage, to take advantage of a lower cost tier of storage, while the more recent and active data remains on the primary block or file storage.  For file sync and share, the small office/remote office data is moved off of the local NAS and consolidated/centralized and managed on the gateway file system. In practice, these methods will vary based on the enterprise’s requirements.
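As a rough illustration of the ILM use case described above, the following Python sketch splits a set of files into those that stay on primary storage and those that are candidates for the gateway and cloud object tier, based on last access time. The 90-day threshold, the `FileRecord` type, the file paths, and the function name are all assumptions made for this example; they do not come from any gateway product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    last_access: datetime


def split_by_age(files, now, max_idle_days=90):
    """Return (active, archive_candidates) based on last access time.

    max_idle_days is an illustrative threshold, not a recommended value.
    """
    cutoff = now - timedelta(days=max_idle_days)
    active = [f for f in files if f.last_access >= cutoff]
    archive = [f for f in files if f.last_access < cutoff]
    return active, archive


# Hypothetical sample data.
now = datetime(2017, 10, 1)
files = [
    FileRecord("/projects/q3-report.docx", datetime(2017, 9, 20)),
    FileRecord("/projects/2014-audit.pdf", datetime(2015, 1, 5)),
]
active, archive = split_by_age(files, now)
print([f.path for f in archive])
```

In practice the policy would likely weigh more than access time (file size, compliance rules, application ownership), which is why the enterprise’s requirements drive the final design.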

Q. Can we use cloud object storage for IoT storage that may require high IOPS?

A. High IOPS workloads are best supported by local SSD based Object, Block or NAS storage. Remote or hard drive based Object storage is better deployed with low IOPS workloads.

Q. What about software defined storage?

A. Cloud object storage may be implemented as SDS (Software Defined Storage) but may also be implemented by dedicated appliances. Most cloud Object storage services are SDS based.

Q. Can you please define NAS?

A. The SNIA Dictionary defines Network Attached Storage (NAS) as:

1. [Storage System] A term used to refer to storage devices that connect to a network and provide file access services to computer systems. These devices generally consist of an engine that implements the file services, and one or more devices, on which data is stored.

2. [Network] A class of systems that provide file services to host computers using file access protocols such as NFS or CIFS.

Q. What are the challenges with NAS gateways into object storage? Doesn’t NAS have latency requirements that a typical Object store solution can’t meet?

A. The key factor to consider is workload. If the applications accessing data residing on NAS perform a high frequency of reads and writes, then that data is not a good candidate for remote or hard drive based object storage. However, it is commonly known that up to 80% of data residing on NAS is infrequently accessed. It is this data that is best suited for migration to remote object storage.

Thanks for all the great questions. Please check out our library of SNIA Cloud webcasts to learn more, and follow us on Twitter @SNIACloud for announcements of future webcasts.

How Gateways Benefit Cloud Object Storage

The use of cloud object storage is ramping up sharply, especially in the public cloud, where its simplicity can significantly reduce capital budgets and operating expenses. And while it makes good economic sense, enterprises are challenged with legacy applications that do not support standard protocols to move data to and from the cloud.

That’s why the SNIA Cloud Storage Initiative is hosting a live webcast on September 26th, “Cloud Object Storage and the Use of Gateways.”

Object storage is a secure, simple, scalable, and cost-effective means of managing the explosive growth of unstructured data enterprises generate every day. Enterprises have developed data strategies specific to the public cloud: improved data protection, long-term archive, application development, DevOps, big data analytics and cognitive artificial intelligence, to name a few.

However, these same organizations have legacy applications and infrastructure that are not object storage friendly and instead use file protocols like NFS and SMB. Gateways enable SMB and NFS data transfers to be converted to Amazon’s S3 protocol while optimizing data with deduplication and providing QoS (quality of service) and efficiencies on the data path to the cloud.
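To make the gateway idea a little more concrete, here is a minimal Python sketch of the two jobs mentioned above: mapping file paths onto objects and deduplicating identical content before it travels to the cloud. The `ToyGateway` class and its in-memory dictionaries are hypothetical stand-ins for illustration only; a real gateway would speak NFS or SMB on one side and issue S3 requests on the other.

```python
import hashlib


class ToyGateway:
    """Toy model of a file-to-object gateway with content deduplication.

    The dictionaries stand in for a remote object store; nothing here
    talks to a real cloud service.
    """

    def __init__(self):
        self.objects = {}   # content hash -> bytes (each unique blob stored once)
        self.index = {}     # file path -> content hash

    def write_file(self, path, data):
        """Store a file; return True only if new content had to be uploaded."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.objects
        if is_new:
            self.objects[digest] = data
        self.index[path] = digest
        return is_new

    def read_file(self, path):
        return self.objects[self.index[path]]


gw = ToyGateway()
gw.write_file("/share/a/report.txt", b"quarterly numbers")
uploaded_again = gw.write_file("/share/b/copy.txt", b"quarterly numbers")
print(uploaded_again, len(gw.objects))
```

The second write refers to content the gateway has already stored, so only the path index changes and no additional data needs to cross the link.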

This webcast will highlight the market trends toward the adoption of object storage and the use of gateways to execute a cloud strategy, the benefits of object storage when gateways are deployed, and the use cases that are best suited to leverage this solution.

You will learn:

  • The benefits of object storage when gateways are deployed
  • Primary use cases for using object storage and gateways in private, public or hybrid cloud
  • How gateways can help achieve the goals of your cloud strategy without retooling your on-premises infrastructure and applications

We plan to share some pearls of wisdom on the challenges organizations are facing with object storage in the cloud from a vendor-neutral, SNIA perspective. If you need a firm background on cloud object storage before September 26th, I encourage you to watch the SNIA Cloud on-demand webcast, “Cloud Object Storage 101.” It will provide you with a foundation to get even more out of this upcoming webcast.

I hope you will join us on September 26th. Register now to save your spot.

IP-Based Object Drives Q&A

At our recent SNIA Cloud Storage webcast “IP-Based Object Drives Now Have a Management Standard,” our panel of experts discussed how the SNIA release of the IP-Based Drive Management Standard eases the management of these drives. If you missed the webcast, you can watch it on-demand.

A lot of interesting questions came up during the live event. As promised, here are answers to them:

Q. Am I correct in thinking that each IP based drive will have a unique IP address?

A. Each Ethernet interface on the drive will have its own unique IP Address. Object Drives may be deployed in private address spaces (such as in a fully configured rack). In such configurations, two Object Drives might have the same IP address, but would be on completely separate networks.
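The point about private address spaces can be illustrated with a short Python sketch using the standard `ipaddress` module. The rack names and addresses below are made up for the example; the idea is simply that an inventory should identify a drive by its network plus its address, because the address alone may repeat across isolated racks.

```python
import ipaddress

# Hypothetical inventory: (rack network name, drive address).
drives = [
    ("rack-a", "192.168.10.21"),
    ("rack-b", "192.168.10.21"),   # same address, separate private network
    ("rack-b", "192.168.10.22"),
]


def drive_key(rack, addr):
    """Identify a drive by network plus address, since private addresses can repeat."""
    return (rack, ipaddress.ip_address(addr))


keys = {drive_key(rack, addr) for rack, addr in drives}
all_private = all(ipaddress.ip_address(addr).is_private for _, addr in drives)
print(len(keys), all_private)
```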

Q. Assuming vendors will be using Redfish, will the API calls be made through existing middleware or directly to the BMCs (baseboard management controllers, specialized service processors that monitor the physical state of a computer) on the platforms?

A. Redfish can be supported by host based middleware, the enclosure’s BMC, or may be supported directly from the drive.
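Whichever component hosts it, a Redfish service is a REST interface that returns JSON resources under the `/redfish/v1/` service root. The Python sketch below parses a trimmed-down collection document of the kind such a service returns. The sample payload, the `Drive1`/`Drive2` names, and the choice of the Chassis collection are assumptions made for illustration, not excerpts from the IP-Based Drive Management Specification.

```python
import json

# Hypothetical collection payload in the style Redfish uses; a real
# service would return something like this from an HTTPS GET request.
sample = """
{
  "Name": "Chassis Collection",
  "Members@odata.count": 2,
  "Members": [
    {"@odata.id": "/redfish/v1/Chassis/Drive1"},
    {"@odata.id": "/redfish/v1/Chassis/Drive2"}
  ]
}
"""


def member_paths(collection_json):
    """Return the resource paths listed in a Redfish-style collection."""
    doc = json.loads(collection_json)
    return [member["@odata.id"] for member in doc.get("Members", [])]


paths = member_paths(sample)
print(paths)
```

A management client would then fetch each returned path in turn, regardless of whether middleware, a BMC, or the drive itself answers the request.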

Q. Would a drive with native iSCSI protocol and an Ethernet interface be considered an “IP Drive”?  

A. Yes. This is why we use the generic IP Drive term as it allows for multiple protocols to be supported.

Q. What are the data protection schemes supported in the existing products in this space?

A. Examples of data protection typically used with IP drives include erasure coding and traditional RAID.
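For readers new to these schemes, the simplest case is a single XOR parity block, the idea behind RAID 5-style protection. The Python sketch below shows a lost block being rebuilt from the survivors. It is a toy with hypothetical four-byte blocks; production erasure codes (Reed-Solomon, for example) tolerate more simultaneous failures and are considerably more involved.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]      # three data blocks on three drives
parity = xor_blocks(data)               # one parity block on a fourth drive

# Simulate losing the second drive and rebuilding its block from the rest.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```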

Q. Is this approach similar to the WD Ethernet Drive?

A. The WD Ethernet Drive is an IP based drive.

Q. Do you expect to see interposers with higher Ethernet bandwidth that could be used with SSD vs. HDDs?

A. Yes, there are multiple examples starting to appear in the market of interposers for SSDs.

Q. Is this regular Ethernet or NVMe over Fabrics?

A. Regular Ethernet. It does not require Converged Ethernet or anything layered on it. NVMe over Fabrics could utilize IP based Drive Management in the future.

Security and Privacy in the Cloud

When it comes to the cloud, security is always a topic for discussion. Standards organizations like SNIA are in the vanguard of describing cloud concepts and usage, and (as you might expect) are leading on how and where security fits in this new world of dispersed and publicly stored and managed data. On July 20th, the SNIA Cloud Storage Initiative is hosting a live webcast “The State of Cloud Security.” In this webcast, I will be joined by SNIA experts Eric Hibbard and Mark Carlson who will take us through a discussion of existing cloud and emerging technologies, such as the Internet of Things (IoT), Analytics & Big Data, and more, and explain how we’re describing and solving the significant security concerns these technologies are creating. They will discuss emerging ISO/IEC standards, SLA frameworks and security and privacy certifications. This webcast will be of interest to managers and acquirers of cloud storage (whether internal or external), and developers of private and public cloud solutions who want to know more about security and privacy in the cloud.

Topics covered will include:

  • Summary of the standards developing organization (SDO) activities:
    • Work on cloud concepts, Cloud Data Management Interface (CDMI), an SLA framework, and cloud security and privacy
  • Securing the Cloud Supply Chain:
    • Outsourcing and cloud security, Cloud Certifications (FedRAMP, CSA STAR)
  • Emerging & Related Technologies:
    • Virtualization/Containers, Federation, Big Data/Analytics in the Cloud, IoT and the Cloud

Register today. We hope to see you on July 20th, when Eric, Mark and I will be ready to answer your cloud security questions.

IP-Based Object Drives Now Have a Management Standard

The growing popularity of object-based storage has resulted in the development of Ethernet-connected storage devices, also referred to as IP-Based Drives, that support object interfaces, and in some cases the ability to run applications on the drives themselves. These scale-out storage nodes consist of relatively inexpensive drive-sized enclosures with IP network connectivity, CPU, memory and storage.

While inexpensive to deploy, these solutions require more management than a traditional drive. In order to simplify management of these drives, SNIA has developed and approved the release of the IP-Based Drive Management Specification. On April 20th, the SNIA Cloud Storage Initiative is hosting a live webcast, “IP-Based Object Drives Now Have a Management Standard.” It will be a unique opportunity to learn about this specification from the authors who wrote it. In this webcast, we’ll discuss:

  • Major components of the IP-Based Drive Management Standard
  • How the standard leverages the DMTF Redfish management standard to manage IP-Based Drives
  • The standard management interface for drives that are part of JBOD (Just A Bunch Of Disks) or JBOF (Just A Bunch Of Flash) enclosures

This standard allows drive management to scale to data centers and beyond, enabling high degrees of automation and software-only management of data centers. Reserve your spot today to learn more and ask questions of the folks behind the spec. I hope to see you on April 20th.

Containers, Docker and Storage – An Expert Q&A

Containers continue to be a hot topic today as is evidenced by the more than 2,000 people who have already viewed our SNIA Cloud webcasts, “Intro to Containers, Container Storage and Docker” and “Containers: Best Practices and Data Management Services.” In this blog, our experts, Keith Hudgins of Docker and Andrew Sullivan of NetApp, address questions from our most recent live event.

Q. What is the major challenge for storage in a containerized environment?

A. Containers move fast. Users can spin up and spin down containers extremely quickly. The biggest challenge in production-bound container environments is simply keeping up with the movement of data.

Docker Engine does not delete base container images when the container is shut down. Likewise, Registry assumes you’ve got unlimited storage on hand. For containers that push frequent revisions (as would be the case in a continuous delivery environment), that leads to a lot of orphaned container images that can fill up all available storage if left unchecked.

There are some community-led scripts that will help to keep things under control. That’s the beauty of community-led technology.
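The kind of cleanup such scripts perform can be sketched as a simple retention policy: keep the newest few images per repository and flag the rest for removal. The Python below works on a made-up list of (repository, tag, timestamp) tuples and only decides what to delete; it does not call Docker, and the `keep=2` default is an arbitrary assumption.

```python
from collections import defaultdict


def images_to_remove(images, keep=2):
    """Pick images to delete, keeping the newest `keep` per repository.

    `images` is a list of (repository, tag, created_timestamp) tuples.
    """
    by_repo = defaultdict(list)
    for repo, tag, created in images:
        by_repo[repo].append((created, tag))

    doomed = []
    for repo, entries in by_repo.items():
        entries.sort(reverse=True)                      # newest first
        doomed += [(repo, tag) for _, tag in entries[keep:]]
    return sorted(doomed)


# Hypothetical inventory from a continuous delivery pipeline.
images = [
    ("web", "build-101", 1000),
    ("web", "build-102", 2000),
    ("web", "build-103", 3000),
    ("db", "build-7", 1500),
]
stale = images_to_remove(images)
print(stale)
```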

Q. What about the speed of retrieving the data from storage?

A. That’s where being a solid storage architect comes in. Every storage system has different strengths and weaknesses, so it’s important to engineer your solution to fit your performance goals. Docker containers are running on the main kernel of the host system. IO is not constrained by abstraction, as in the case of virtual machines. Rather, it is constrained more by density – hundreds of containers on a host can push massive IOPS, so you want your pipes fat and data sources close to the host systems.
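A quick back-of-the-envelope calculation shows why density matters. The figures in this Python sketch (200 containers, 150 IOPS each, a 25,000 IOPS backend) are illustrative assumptions only; substitute measurements from your own workloads.

```python
def aggregate_iops(container_count, iops_per_container):
    """Total IOPS a host may push if every container is busy at once."""
    return container_count * iops_per_container


# Illustrative numbers only.
demand = aggregate_iops(container_count=200, iops_per_container=150)
backend_limit = 25_000
print(demand, demand <= backend_limit)
```

Each container’s demand is modest on its own, but the combined load can exceed what the storage path was sized for.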

Q. Can you expand on moving Docker Volumes from On-Premises bare metal to Cloud Service Providers? Data Migration? Encryption?

A. None of these capabilities are built into Docker Engine. We rely on external storage systems to provide those features. Private-to-cloud replication is primarily a feature of software-based companies, like Portworx, Blockbridge, or Hedvig. Encryption and migration are both common features across other companies as well. Flocker from ClusterHQ is a service broker system that provides many bolt-on features for storage systems they support. You can also use community-supplied services like Ceph to get you there.

Q. Are you familiar with “Flocker” that apparently is able to copy persistent data to another container? Can you share your thoughts?

A. Yes. ClusterHQ (makers of Flocker) provide an API broker that sits between storage engines and Docker (and other dynamic infrastructure providers, like OpenStack), and they also provide some bolt-on features like replication and encryption.

Q. Is there any sort of feature in the volume plugins that allows a persistent volume to re-connect to a container if the container is moved across multiple hosts?

A. There’s no feature in plugins to cover that specifically. The plugin API is very simple. In practice, what you would do is write your plugin to expose volumes to Docker Engine on every host where it’s possible to mount that volume. In your container specification, whether it’s a Compose file, DAB file, or what have you, specify the name of your volume. Wherever that unique name is encountered, it will be mounted and attached to the container when it’s re-launched.
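The name-based approach can be modeled in a few lines of Python. This is a conceptual toy rather than the Docker volume plugin API: the `VOLUME_CATALOG` dictionary, the NFS path, and the `launch` function are all hypothetical and stand in for a plugin that exposes the same named volumes on every host.

```python
# Hypothetical shared catalog that a volume plugin exposes on every host.
VOLUME_CATALOG = {"orders-db-data": "nfs://filer01/exports/orders"}


def launch(container_spec, host):
    """Resolve the spec's named volume on whichever host runs the container."""
    name = container_spec["volume"]
    backend = VOLUME_CATALOG[name]          # same answer on every host
    return {"host": host, "volume": name, "mounted_from": backend}


spec = {"image": "postgres", "volume": "orders-db-data"}
first = launch(spec, host="node-1")
moved = launch(spec, host="node-2")         # container re-launched elsewhere
print(first["mounted_from"] == moved["mounted_from"])
```

Because the specification carries only the volume name, the container reaches the same data no matter which host picks it up.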

If you have more questions on containers, Docker and storage, check out our first Q&A blog: Containers: No Shortage of Interest or Questions.

I also encourage you to join our Containers opt-in email list. It will be a good way to keep up with all that SNIA Cloud is doing on this important technology.

Containers: No Shortage of Interest or Questions

Based on record-breaking registration and attendance at our recent SNIA Cloud webcast, Intro to Containers, Container Storage and Docker, it’s clear that containers are a hot topic that folks want to learn more about – especially from a vendor-neutral authority like SNIA. If you missed the live event, it’s now available on-demand together with the webcast slides.

We were bombarded with questions at the live webcast and we ran out of time before we could answer them all, so as promised, here are answers from our expert presenters, Chad Thibodeau and Keith Hudgins. Oh, and please don’t forget to register for part two of this webcast, Containers: Best Practices and Data Management Services, on December 7, 2016.

Q: Would you please highlight key challenges a company may face to move from a hypervisor to container environment?

CT: The main challenge that gets raised in moving from virtual machines to containers is around security: when deployed on bare metal, all of the containers share the core operating system. However, there are arguments that containers can still be effectively isolated.

KH: Primarily paring down your applications to their minimum running requirements. This can be quite difficult with long-entrenched legacy applications!

Q: With a VM you allocate a finite amount of vCPU and RAM, with a high degree of confidence that those resources will be available to whatever workload is running in the VM. Is that also true of containers – does (or can) the workload get a guaranteed allocation of CPU and memory resources?

CT: Keith, I’ll let you address this one from the application microservice; from the storage side, an SLA or Quality-of-Service can be defined for a container volume if the storage provider offers this capability.

KH: By default, you don’t allocate CPU or RAM availability. Most containers are small enough that it’s not a consideration. However, if you need to specify priority, we have a method to do that. Please review the docs here.
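As one hedged illustration of what specifying resources can look like, `docker run` accepts options such as `--memory` (a hard memory limit) and `--cpu-shares` (a relative CPU weight, not a guaranteed allocation). The Python helper below only assembles the argument list; the function name, the image, and the values are assumptions for this example, and the Docker documentation remains the authority on which options apply to your version.

```python
def docker_run_args(image, memory=None, cpu_shares=None):
    """Build a `docker run` argument list with optional resource settings."""
    args = ["docker", "run", "-d"]
    if memory is not None:
        args += ["--memory", memory]                # hard limit, e.g. "256m"
    if cpu_shares is not None:
        args += ["--cpu-shares", str(cpu_shares)]   # relative weight under contention
    args.append(image)
    return args


cmd = docker_run_args("nginx", memory="256m", cpu_shares=512)
print(" ".join(cmd))
```

The list could be handed to `subprocess.run` on a host with Docker installed; it is printed here so the sketch runs anywhere.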

Q: Where are microservices most useful? Are there certain environments where they are more likely to be deployed & which verticals or type of solutions/apps will see more benefit?

CT: Microservices can apply to applications within most verticals, from financials (it was mentioned that Goldman Sachs is planning on containerizing 90% of its existing applications) to web services such as Netflix. Some of the determining factors are whether the application(s) would benefit from what container technology provides, such as rapid deployment, a lightweight footprint, portability, and the ability to scale beyond typical monolithic applications.

KH: Microservices are most useful with network-facing applications that don’t require heavy transactional control. Note that it *is* possible to build transactional microservices, but the best practices on that route haven’t been optimized yet.

Q: Which OS versions and hypervisors support containerization, and are they working towards cutting the “noisy neighbor” issue?

CT: Containers are supported by both MS Windows and Linux operating systems. The specific version of Linux OS will be more dependent upon the level of capabilities included (Keith, more your area) and MS Windows Server 2016 is the first release of Windows with container (Docker) support.

KH: Docker supports running containers under Windows and Linux kernels. We don’t care whether it’s on metal or virtualized. It’s possible to set affinity groups in a production Docker installation to help manage noisy neighbor issues, but note that fundamentally Docker is NOT a multi-tenant system.

Q: What is a “stateful database”? How does it differ from regular databases?

CT: Most databases, such as Oracle, MySQL, Cassandra, MongoDB or Redis, are stateful. The confusion may be around the Gartner quote that mentioned “Stateful Database Applications,” by which they simply meant that databases are examples of stateful applications.

KH: Any database is by definition stateful. A “stateless” container is one that is running a process that doesn’t store persistent data to disk. This could be a caching system, web application server, load balancer, queue runner, or pretty much any component that doesn’t need to store data. Everything else is “stateful” and needs some way to shove that data into a reliable datastore.

Q: What factors should be considered when choosing between containers and virtual servers for a given project/use case?

CT: The driving factors for container deployments are: portability, minimal footprint (low overhead since no hypervisor or guest OS), rapid provisioning and de-commissioning, scalability and largely open-source based. If any (or all) of these are deemed valuable to you, then you should consider container deployment technology.

KH: That’s a very broad question! It helps to understand that a container is simply a wrapper around one process that is running on a container host. So it’d be one database service, or one web app server, for example. If you can break up your application into a bunch of these single processes and chain those processes together via networking (DB serves data through the network layer to your cache, which supplies the web app, which is behind the load balancer, etc) then it’s a great candidate for containerization.

Q: Sounds like a lightweight hypervisor?

CT: Containers have been compared to virtual machines as a “lightweight VM”; however, there are distinctions mostly around the fact that the hardware resources are not virtualized for containers, but rather the application is abstracted.

KH: That’s not a bad way to start thinking about it. However, you don’t have a second kernel underneath the hypervisor, so there’s no hardware abstraction. Also, in general you don’t want to run a full OS stack per container, just what you need for the application. That way your containers are lean and efficient.

Q: So is there a practical limit to the number of users you need to have for an app in order for containers/microservices to be preferable vs. traditional apps?

CT: Not necessarily. It is more about what you are trying to achieve with the application and the requirements you have around things like: platform agnostic, portability, ease-of-deployment, scalability, etc. But I wouldn’t put a hard number on when containers make more sense over virtual machines or bare-metal deployments for that matter.

KH: Nope! Microservices is far more about making it easier to build and maintain your applications than it is about scaling. Like anything, you can over-abstract your application design and go extra silly with it, but it’s fundamentally about a better way of managing your applications’ lifecycle than it is about how many users you can push through the pipe.

Q: So are graph and memory the same thing?

CT: Keith, I’ll let you address this one.

KH: Nope. Graph refers to our copy-on-write storage for images at runtime. Our docs can explain it way better than I can in a Q&A session. Look here for more info.

Q: Similar to the Docker Container Networking, are there any specific efforts going on around Docker Storage? For example, are you (or will you be) building any products to support features that you mentioned (such as ‘Storage vMotion’ like capabilities)?

CT: Keith, I’ll let you address this one. However, there are initiatives and activities on the storage side around providing vMotion like capabilities for the data and application state.

KH: It’s always a possibility. There’s nothing I can say right now, but stay tuned.

Q: Let me shift gears here: where does containerization work with NFV, and how should one correlate it to the asks of Telco provider(s)?

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: While this webinar is fundamentally about storage technologies, Docker does have a very broad ecosystem of network partners. NFV is a very broad topic and can’t easily be covered in one quick bite, but there are definitely efforts using Docker as both an enablement technology for NFV, as well as integrating Docker’s built-in networking capabilities in an NFV scope for application delivery.

Q: It would be helpful to circle back at the end and summarize what is Open Source and what is a commercial product, I’m trying to grasp what you miss out on by staying just Open Source. I know that excludes the Universal Control Plane but I don’t yet see what UCP delivers, what its USP is.

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: UCP is the only unique commercial component of Docker. It combines a web-based GUI with role-based access control (RBAC) to make it easier to control security and access to Docker components in a production environment. We do maintain a separate codebase for our commercially supported Engine and Registry, but that’s mainly done to maintain a more stable release, with critical patches backported from the upstream open source projects. Fundamentally, CS Engine and DTR are the same product as their open source upstreams, only on a slower, more stable release cycle. Click here for an overview, and links to some more detailed information on what’s involved in our commercial products.

Q: Is there demand for concurrent access, across container hosts, to persistent data? If so, what are those use case scenarios?

CT: Yes, actually if you think about a micro-service architecture, you will most likely have many containers accessing a common set of container data volumes simultaneously. This is exactly the reason for persistent storage–if the containers running the application services get migrated or moved to other physical nodes, they need to maintain access to their respective container data volumes in many cases.

KH: What a great question! That demand is small, but there. In most cases, persistent data is maintained concurrently through clustering processes (database replication, object storage, etc) but there are some edge cases for large file processing (rendering, big data needs) where there are some asks for that capability.

Q: Is it possible to run Windows applications on a Linux container and vice versa?

CT: To my knowledge, you should be able to run Linux applications within the recently announced Windows Server 2016 supported containers (see article here), but you can’t run Windows applications within Linux containers.

KH: No. A container is essentially a process running in a named concurrency group under the kernel. Therefore, you need a Linux kernel to run Linux processes, and likewise for Windows. It will be possible to run Windows and Linux containers under the same management umbrella very soon. We’re waiting on some network features to roll out in Windows Server 2016 SP1 for that capability.

Q: Is it a good idea to run legacy apps in a container? Exactly what is the relation between microservices and containers? Container-like technology used to be popular in various UNIX OSes. What is different now? Is the best choice a microservice in a container and spinning up multiple instantiations fast?

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: Container technology is still popular in several UNIX OSes. Under the hood, a Docker Linux container isn’t very different from a Solaris Zone. The difference is primarily the lifecycle tools to build and maintain your containers from both the developer and operations sides. The newer generation of container runtimes is simply much easier to use than older methods. From a Docker perspective: Docker Hub, the ease of use of the ‘docker’ CLI command tools, and clustering capabilities in Engine are the main differences. As always, design your architecture to fit your team, user, and application needs. However, if you do want to use a microservices approach, maintaining each part of your application stack as a suite of microservices does make running them widely parallel a strong approach.

Q: Is a micro service self-contained with respect to data requirements? Can a service that depends on an external datasource be a micro service?

CT: A micro service is by definition self-contained; however, it also would typically connect to one or more container data volumes. Regarding accessing external data sources, not exactly sure what is meant here, but the micro services can be running on one physical server with the container data volumes being created and managed on a separate DAS/SAN/NAS storage platform.

KH: Yes, absolutely. An API broker for an external, legacy datastore is a good example of a microservice.

Q: How are container images qualified so that they can be trusted for automated pulls?

CT: Keith, I’ll let you address this one–should be right up your alley. I believe that there is NOT a vetting or certification process done by Docker when posting to either the public Hub or to a trusted registry. This would be the responsibility of the container image developer.

KH: In a few ways. First, containers are rarely built from scratch. They are normally built from base images released by trusted providers like Microsoft, Ubuntu, Red Hat, etc. First you should prove trust in that base image through similar methods as you would a VM image. In a Docker Datacenter install, a user with Admin rights can then bring those base images into Docker Trusted Registry (DTR) and then also do a review of internal images built on top of that base before blessing them to go into production. There are also 3rd party security scanning technologies you can use, should that be a concern.
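One building block that often underlies this kind of review is comparing content digests: an approved artifact is recorded by its SHA-256 hash, and anything that does not match is rejected. The Python sketch below shows only that idea. The allowlist, the sample bytes, and the `is_approved` function are hypothetical, and this is not a description of how Docker Trusted Registry works internally.

```python
import hashlib

# Stand-in for image content that an administrator has reviewed.
image_bytes = b"pretend this is an image layer archive"

# Hypothetical allowlist of approved content digests.
APPROVED = {"sha256:" + hashlib.sha256(image_bytes).hexdigest()}


def is_approved(content):
    """Check content against the allowlist by its SHA-256 digest."""
    digest = "sha256:" + hashlib.sha256(content).hexdigest()
    return digest in APPROVED


print(is_approved(image_bytes), is_approved(b"tampered content"))
```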

Q: For stateless applications, can Docker help apply updates to the application without downtime? For example, if a container is running version n of an application and version n+1 needs to be deployed without causing downtime for users, could one spin up a new container with version n+1 of the application and deploy it?

CT: If the application is truly stateless, then it shouldn’t matter if its containers are torn down and restarted on another physical server/node to allow the application of a new patch or OS update on the original node. However, this would need to be correctly architected.

KH: Yes. Using Docker Engine in Swarm mode, we provide a command ‘docker service update’ to do exactly that. Check the docs.
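The general idea of a rolling update can be modeled in a few lines of Python: replace the running tasks a batch at a time so that some instances keep serving while others are swapped. This is a toy simulation, not Swarm’s implementation; the `rolling_update` function and its `parallelism` parameter are names chosen for this example.

```python
def rolling_update(tasks, new_version, parallelism=1):
    """Replace task versions in batches, recording the state after each batch."""
    tasks = list(tasks)
    history = []
    for start in range(0, len(tasks), parallelism):
        for i in range(start, min(start + parallelism, len(tasks))):
            tasks[i] = new_version          # stop the old task, start the new one
        history.append(list(tasks))
    return history


steps = rolling_update(["v1", "v1", "v1"], "v2", parallelism=1)
for state in steps:
    print(state)
```

With three replicas and a batch size of one, at least two instances are available at every step, which is what lets users avoid noticing the change.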

Q: Flocker vs. Convoy vs. others – could you talk about these interfaces and their adoption?

CT: ClusterHQ (Flocker) has developed a generic storage volume plugin that they then contributed back to the Docker community for incorporation into the Docker engine. I’m not very familiar with Convoy, but it appears to be a Rancher-developed storage plugin that they have made available as open source, but it is not part of the Docker release.

KH: Flocker and Convoy are brokerage-type volume drivers that have the capability to connect with several storage backends. Each has its own API to talk to and manage volumes under the hood. It’s also possible to integrate directly with Docker’s volume API. If you’re mainly interested in integrating purely with Docker Engine, a direct Docker Volume API plugin is the best approach. However, both Flocker and Convoy provide some ease-of-use features and capabilities that might make it attractive to go their routes. Volume API docs are here.

Q: Does the link in communication between different containers that run microservices incur the very load we are trying to escape from monolithic approach?

CT: Keith, I’ll let you address this one–should be right up your alley.

KH: That’s a very philosophical question! I’d argue that using modern API methods like REST over HTTP is so lightweight that the distributed approach makes more sense.

Q: Docker Swarm?

CT: Keith, I’ll let you address this one.

KH: Swarm is our clustering technology to chain together several Docker Engine hosts into one big cluster. Prior to Engine 1.12, it was a standalone product. After 1.12, we added SwarmKit into Engine to make building and maintaining swarms much easier. For more info, check out old Swarm docs and new Swarm docs.

Q: Do the applications need to be re-written/revised to take advantage of Container approach?

CT: Most legacy or monolithic applications will need to be refactored to best take advantage of a micro-service architecture.

KH: Typically, yes. Web applications are already built in a distributed way, so they’re the easiest to convert.

Q: Do microservices implement Unikernels?

CT: Keith, I’ll let you also address this. My take: Containers run microservices and leverage the entire OS (Linux kernel and all of its libraries, drivers, etc.). A unikernel is a very small and minimalistic kernel that doesn’t contain the additional bloat of the full kernel and is therefore considered to run that much faster and leaner. Docker acquired a unikernel company and will most likely provide support for running microservices with unikernels, possibly with a container-like wrapper.

KH: Not directly. Unikernels are a new method to run arbitrary runtimes under a single kernel. Docker is currently doing some early work with microkernel technology to improve containers, but it’s not rolled into core Engine yet.

Q: Can a Docker container run directly on bare metal instead of on a host OS? If yes, what benefits does this approach provide?

CT: A Docker container requires a host OS to run; however, when we refer to “bare-metal” we are referring to a “non-virtualized server”. The key benefit is that eliminating the hypervisor and guest OS removes their overhead. It is also much more manageable: in the hypervisor and guest OS scenario, you have to manage and maintain all of the VMs, which may run different guest OSs and versions.

KH: No. A Docker container needs Docker Engine to run, so you’ll need to run Engine under a supported OS on the metal. Running Engine on a physical server means your containers will get full “on the iron” IO, since there’s no hypervisor abstraction layer between your container and the hardware it’s running on.

Q: Are packaged software companies like Oracle moving to containerization?

CT: Oracle is developing product offerings that are containerized applications. They would be best able to address your question.

KH: Here is Oracle’s GitHub repository of their official Docker containers and here are their official images in Docker Hub.

See? I told you there was no shortage of questions! If you still have one, please comment in this blog below and we’ll get back to you as soon as we can. Follow us on Twitter @SNIACloud to stay up-to-date on what SNIA Cloud is doing with containers. And don’t forget to register for part two of this webcast, Containers: Best Practices and Data Services, on December 7th. We hope to see you there!

 

 

Cloud Object Storage – You’ve Got Questions, We’ve Got Answers

The SNIA Cloud Storage Initiative hosted a live Webcast “Cloud Object Storage 101.” Like any “101” type course, there were a lot of good questions. Here they all are – with our answers. If you have additional questions, please let us know by commenting on this blog.

Q. How do you envision the new role of tape (LTO) in this unstructured data growth?

A. Exactly the same way that tape has always played a part; it’s the storage medium that requires no power to store cold data and is cheap per bit. Although it has a limited shelf life, and although we believe that flash will eventually replace it, it still has a secure & growing foreseeable future.

Q. What are your thoughts on whether object storage can exist outside the bounds of supporting file systems? Block devices directly storing objects using the key as reference and removing the intervening file system? A hierarchy of objects instead of files?

A. All of these things. Objects can be identified by an ID in a flat, non-hierarchical structure; or we can impose a hierarchy by key-to-object-ID translation; or indeed, an object may contain complete file systems or be treated like a block device. There are really no restrictions on how we can build metadata that describes all these things over the bytes of storage that make up an object.
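
The first two options can be sketched in a few lines of Python. This is a toy, in-memory model for illustration only: the class name `FlatObjectStore`, its methods, and the sample keys are hypothetical and do not correspond to any particular product or API.

```python
import uuid


class FlatObjectStore:
    """Toy object store: a flat ID namespace plus a key-to-ID index."""

    def __init__(self):
        self._objects = {}    # object ID -> bytes (flat, no hierarchy)
        self._key_to_id = {}  # path-like key -> object ID (imposed hierarchy)

    def put(self, key, data):
        """Store data under a new object ID and point the key at it."""
        old_id = self._key_to_id.get(key)
        if old_id is not None:
            # Overwriting a key: drop the object it used to reference.
            del self._objects[old_id]
        object_id = uuid.uuid4().hex
        self._objects[object_id] = data
        self._key_to_id[key] = object_id
        return object_id

    def get_by_id(self, object_id):
        """Flat access: fetch an object directly by its ID."""
        return self._objects[object_id]

    def get(self, key):
        """Hierarchical-looking access: translate the key, then fetch."""
        return self._objects[self._key_to_id[key]]

    def list_prefix(self, prefix):
        """Emulate a directory listing by filtering keys on a prefix."""
        return sorted(k for k in self._key_to_id if k.startswith(prefix))
```

The "hierarchy" here exists only in the key index. The objects themselves stay in one flat namespace, so no file system is needed between the key and the bytes.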

Q. Can you run write-intensive, low-latency apps on object storage, e.g., virtual machines?

A. Yes. Object storage can be made up of the same stuff as other high performance storage systems; for instance, flash connected via high bandwidth and low latency networks. Or they could even be object stores built over PCIe and NVDIMM.

Q. Is erasure coding (EC) expensive in terms of networking and resources utilization (especially in case of rebuild)?

A. No, that’s one of the advantages of EC. Rebuilds take place by reading data from many disks and writing it to many disks; in traditional RAID rebuilds, the focus is normally on the one disk that’s being rebuilt.
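
The rebuild pattern can be seen in the simplest possible erasure code, a single XOR parity shard. This Python sketch is a deliberate simplification: production object stores typically use codes with several parity shards (Reed-Solomon style, for example), and the function names here are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data shards followed by one XOR parity shard."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(shards, lost_index):
    """Recompute one lost shard by reading every surviving shard."""
    survivors = [s for i, s in enumerate(shards) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    shards = encode([b"abcd", b"efgh", b"ijkl"])
    # Pretend shard 1 was lost and rebuild it from the other three.
    print(rebuild(shards, 1) == shards[1])
```

Each shard would normally live on a different disk or node, so the reads needed for a rebuild are spread across all the survivors rather than concentrated on a single device.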

Q. Is there any overhead for small files or object use cases? Do you have a recommended size?

A. Each system will have its own advantages and disadvantages for objects of specific sizes. In general, object stores are designed to store billions of objects, so the number of objects is usually not an issue.

Q. Can you comment on Internet bandwidth limitations on geographically dispersed erasure coded data?

A. Smart caching can make a big difference, but at the end of the day, a geographically dispersed EC object store won’t be faster than a local store. You can’t beat the speed of light.
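
A back-of-the-envelope calculation shows where that floor sits. The Python sketch below assumes signals in optical fiber travel at about two thirds of the speed of light in a vacuum, and it ignores routing, queuing, and protocol overhead, so real round trips will be slower. The 4,000 km distance is an arbitrary example.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
# Assumption: light in optical fiber travels at roughly 2/3 of c.
FIBER_FRACTION = 2 / 3


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber, in milliseconds."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBER_FRACTION)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # Two sites about 4,000 km apart: roughly 40 ms per round trip.
    print(round(min_round_trip_ms(4000), 1))
```

Every read that has to gather fragments from a remote site pays at least this delay, which is why caching helps but cannot make a dispersed store match a local one.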

Q. The suppliers all claim easy exit strategies from their systems. If we were to use one of the on-premise solutions such as ECS or Cleversafe, and then down the road decide to move off-premise, is the migration/egress typically as easy as claimed?

A. In general, any proprietary interface might lock you in. The SNIA’s CDMI is vendor neutral, and supported by a number of vendors. Amazon’s S3 is a popular and common interface. Ultimately, vendors want your data on their systems – and that means making it easy to get the data from a competing vendor’s system; lock-in is not what vendors want. Talk to your vendor and ask for other users’ experiences to get confirmation of their claims.

Q. Based on factual information, where are you seeing the most common use cases for Object Storage?

A. There are many, and each vendor of cloud storage has particular markets. Backup is a common case, as are systems in the healthcare space that treat data such as scans and X-rays as objects.

Q. NAS filers only scale up not out. They are hard to manage at scale. Why use them anymore?

A. There are many NAS systems that scale out as well as up. NFSv4 supports high degrees of scale out, and there are file systems like Gluster that provide very large-scale solutions indeed, into the multi-petabyte range.

Q. Are there any specific use cases to avoid when considering object storage?

A. Yes. Many legacy applications will not generate any savings or gains if moved to object storage.

Q. Would you agree with industry statements that 80% of all data written today will NEVER be accessed again; and that we just don’t know WHICH 20% will be read again?

A. Yes to the first part, and no to the second. Knowing which 80% is cold is the trick. The industry is developing smart ways of analyzing data to help with the issue of ensuring cached data is hot data, and that cold data is placed correctly first time around.

Q. Is it also possible to bring “compliance” into object storage? (Thinking about banking, medical, and other sensitive data that needs tracking, retention, etc.)

A. Yes. Many object storage vendors provide software to do this.