Containers continue to be a hot topic today as is evidenced by the more than 2,000 people who have already viewed our SNIA Cloud webcasts, “Intro to Containers, Container Storage and Docker“ and “Containers: Best Practices and Data Management Services.” In this blog, our experts, Keith Hudgins of Docker and Andrew Sullivan of NetApp, address questions from our most recent live event.
Q. What is the major challenge for storage in containerized environment?
A. Containers move fast. Users can spin up and spin down containers extremely quickly. The biggest challenge in production-bound container environments is simply keeping up with the movement of data.
Docker Engine does not delete base container images when the container is shut down. Likewise, Registry assumes you’ve got unlimited storage on hand. For containers that push frequent revisions (as would be the case in a continuous delivery environment), that leads to a lot of orphaned container images that can fill up all available storage if left unchecked.
There are some community-led scripts that will help to keep things in control. That’s the beauty of community-led technology.
Q. What about the speed of retrieving the data from storage?
A. That’s where being a solid storage architect comes in. Every storage system has different strengths and weaknesses, so it’s important to engineer your solution to fit your performance goals. Docker containers are running on the main kernel of the host system. IO is not constrained by abstraction, as in the case of virtual machines. Rather, it is constrained more by density – hundreds of containers on a host can push massive IOPS, so you want your pipes fat and data sources close to the host systems.
Q. Can you expand on moving Docker Volumes from On-Premise bare metal to Cloud Service Providers? Data Migration? Encryption?
A. None of these capabilities are built-in to Docker Engine. We rely on external storage systems to provide those features. Private-to-cloud replication is primarily a feature of software-based companies, like Portworx, Blockbridge, or Hedvig. Encryption and migration are both common features across other companies as well. Flocker from ClusterHQ is a service broker system that provides many bolt-on features for storage systems they support. You can also use community-supplied services like Ceph to get you there.
Q. Are you familiar with “Flocker” that apparently is able to copy persistent data to another container? Can share your thoughts?
A. Yes. ClusterHQ (makers of Flocker) provide an API broker that sits between storage engines and Docker (and other dynamic infrastructure providers, like OpenStack), and they also provide some bolt-on features like replication and encryption.
Q. Is there any sort of feature in the volume plugins that allows a persistent volume to re-connect to a container if the container is moved across multiple hosts?
A. There’s no feature in plugins to cover that specifically. The plugin API is very simple. In practice, what you would do is write your plugin to expose volumes to Docker Engine on every host that it’s possible to mount that volume. In your container specification, whether it’s a Compose file, DAB file, or what have you, specify the name of your volume. Wherever that unique name is encountered, it will be mounted and attached to the container when it’s re-launched.
If you have more questions on containers, Docker and storage, check out our first Q&A blog: Containers: No Shortage of Interest or Questions.
I also encourage you to join our Containers opt-in email list. It will be a good way to keep up with all the SNIA Cloud is doing on this important technology.