Difference between Hypervisor Virtualization and Container Virtualization
Recently a new technology (well, not a new technology exactly, but a new way of implementing an old one) gained a lot of traction in the open source community, and many major players in the industry adopted it as part of their upcoming releases. It's none other than Docker. The traction gained by this open source product was so high that it became the darling of packaging applications inside small containers, and it became the hottest trend in application development, deployment, and testing within months of its initial release.
Docker solves one of the main problems that system administrators and developers have faced for years: "It was working in dev and QA. So why the hell is it not working in the production environment?" Most of the time the culprit is a version mismatch in some library, a few packages that were never installed, and so on. This is where Docker steps in and solves the problem for good, by making an image of the entire application, with all of its dependencies, and shipping it to your target environment/server. So in short, if the app worked on your local system, it should work anywhere in the world (because you are shipping the entire thing).
Well, you might be thinking that hypervisor based virtualization can also solve this problem of "working in dev and QA but not in production", by taking an image of an entire virtual host and launching new virtual instances from it (the thing we generally do in AWS or OpenStack). Agreed, that can be done. But a container is so lightweight that you do not have to go through the hassle of setting up a whole new host just for your app. In fact, it takes only a few seconds to pull a container image from a registry and start it. That is the main advantage of Docker's container virtualization. We will not be discussing Docker in depth here, simply because it needs special attention and requires a series of posts to cover it.
In this post we will be discussing the differences between hypervisor based virtualization and container based virtualization.
Well, the general term virtualization can be defined as follows…
It is nothing but a method or technique used to run an operating system on top of another operating system, so that the hardware resources are fully utilized and shared by each of the operating systems running on top of the base operating system.
The basic idea behind hypervisor based virtualization is to emulate the underlying physical hardware and create virtual hardware (with your desired resources, like processor and memory). An operating system is then installed on top of this newly created virtual hardware. This type of virtualization is therefore operating system agnostic: you can have a hypervisor running on a Windows system create virtual hardware and install Linux on it, and vice versa.
So the basic thing to understand about hypervisor based virtualization is that everything is done at the hardware level. If the base operating system (the operating system on the physical server, which runs the hypervisor) has to modify anything in the guest operating system (which runs on the virtual hardware created by the hypervisor), it can only modify the hardware resources, and nothing else.
A hypervisor is also called a Virtual Machine Monitor (VMM), because it sits between the guest operating system and the real physical hardware, controlling the allocation of resources to the guest operating systems running on top of the physical hardware.
Now, the main added advantage of virtualization is full utilization of hardware resources (which are costly). Imagine a situation where you have a physical server with 10G of RAM, an 8 core processor, and a 1G NIC, and you are using that server for your organization's internal website and an FTP server your employees use to share files. The server might be idle most of the time, because there will be no heavy usage of an internal website and FTP server. The hardware resources dedicated to it remain idle most of the time, which is a waste of computing resources.
Using virtualization, you can easily create multiple virtual machines on that server and allocate each of them only the required amount of hardware resources (your website and FTP server might not need more than 1G of memory); the remaining virtual machines can be used for other purposes.
Bare metal virtualization has only one major difference compared to hosted virtualization: the hypervisor sits directly on top of the hardware. The hardware device drivers are part of the hypervisor, and there is only one memory and CPU manager (part of the hypervisor itself, which sits directly on top of the hardware; this is the reason it is called bare metal virtualization).
What is Container Virtualization?
While discussing hosted and bare metal virtualization, the one common thing we found was that both work at the hardware level (basically, they virtualize hardware resources). Container virtualization, however, is done at the operating system level rather than the hardware level. The main thing that needs to be understood about container virtualization is this:
Each container (we'll call it a guest operating system) shares the kernel of the base system.
Now you can guess the main advantage of container based virtualization over hosted and bare metal virtualization, which is quite obvious: as each container sits on top of the same kernel, and shares most of the base operating system, containers are much smaller and more lightweight than a virtualized guest operating system. Being so lightweight, an operating system can run many containers on top of it, compared to the limited number of guest operating systems it can run.
Broadly, what we want from containers is the following:
- isolation of application environments,
- resource isolation,
- all of this without impacting performance,
- easy sharing of common things between virtualized hosts or containers.
As the container shares the kernel with the base system, you can see the processes running inside the container from the base system. However, when you are inside the container, you will only be able to see the container's own processes.
This isolation of containers is provided by two major Linux kernel features, called cgroups and namespaces.
In other words, you can have the same PID, say 100, inside container 1, container 2, and container N. This is because each container resides in its own PID namespace.
The initial state of a Linux system is one single namespace for everything (networks, PIDs, devices, etc.). What we do for containers is slice this into a different namespace for each container, so that each one runs inside its own isolated:
- PID (isolated process IDs)
- net (network interfaces)
- mnt (mounted file systems)
- UTS (hostname separation)
- user (user and group IDs).
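You can actually see these namespaces on any reasonably modern Linux box, without any container tooling at all. Each namespace a process belongs to appears as a symlink under /proc/&lt;pid&gt;/ns (a quick sketch; nothing Docker-specific is assumed here):

```shell
# List the namespaces the current shell is part of.
ls /proc/self/ns

# Each symlink resolves to a namespace type plus an inode number;
# two processes in the same namespace see the same inode number.
readlink /proc/self/ns/pid
readlink /proc/self/ns/net
readlink /proc/self/ns/mnt
```

Two processes running inside the same container would show identical inode numbers here, while processes in different containers would not.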
If you are a system administrator, then I am sure you have already heard of a term called "chroot". It is a method of making a root file system in which your process runs isolated from others. In simple terms, it is done so that the process is unable to access any other data on the system (an added level of security at the file system level). I am sure you have chrooted Apache, or any web server for that matter, for security reasons (which limits access to the system, because the root of the process becomes your given directory, so there is no access to the real root directory). Container based virtualization takes this idea and expands it to each and every resource a process/application requires (mentioned above), to such an extent that you can have multiple operating systems, with different applications on top, all running on the same kernel.
Now, what more do you need for an isolated environment with different namespaces for different applications? The above list covers almost all application environment requirements.
Now, something must take care of resource allocation to these containers. That very thing is done by cgroups in the Linux kernel. There is an excellent Red Hat doc explaining cgroups.
The idea of cgroups originated with Google engineers, who had to limit resource utilization (CPU, memory, etc.) for different process groups.
Related: Understanding Processes in Linux
By default on a Linux system, all processes are children of the init process, which means all processes are part of a single tree structure. With cgroups, different process groups can exist on a single system. So instead of the single process tree of the default Linux method, the cgroup method can have different trees of processes (with different parents, whose children inherit attributes from them), all isolated from each other. You should now have an idea of how cgroups and namespaces are leveraged by container based virtualization.
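You can peek at cgroups from a shell on any Linux system that has them enabled (a minimal sketch; no container runtime is assumed):

```shell
# Show which cgroup(s) the current shell has been placed in.
cat /proc/self/cgroup

# The cgroup hierarchy itself is exposed as a virtual file system,
# where each directory is a group with its own resource limits.
ls /sys/fs/cgroup
```

A container runtime simply creates a new directory in this hierarchy, writes the desired limits into it, and places the container's processes inside it.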
Related: Linux Booting process Explained
Examples of container based virtualization include LXC, OpenVZ, Parallels Virtuozzo, Solaris Containers, Docker (which uses LXC), HP-UX Containers, etc.
Well, you might be wondering why there is all this hype around Docker if container based virtualization has been around for a long time. It is because of the excellent API Docker brings along with it for managing containers.
- The main advantage of using Docker for container based virtualization is its ability to easily build and ship containers. Yes, ship your application to anybody, or to a remote server, with ease. The entire container is shipped (so whatever worked in your local environment is guaranteed to work in your target environment).
- You get a Git-like version control feature for your containers. That's correct: make changes to your application, commit them, and upload the result to your repository or ship the latest version.
- There is an excellent community-shared repository of containers, where you can get ready-made containers with your required open source application on top. It is as simple as installing a package with apt-get or yum, so it is easy and fast to pull containers from registries (which are like yum repositories for packages) and get running in minutes.
- Reuse: any existing container built by others can be modified to make your own version. Docker has concepts like the base image, on top of which you can add your own stuff/configuration for your custom application.
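The reuse idea in that last point can be sketched with a hypothetical Dockerfile (the base image, package, and file names here are made-up placeholders, not from this article): the FROM line pulls somebody else's base image from the registry, and every line after it layers your own configuration on top.

```dockerfile
# Hypothetical example: layering a custom app on a shared base image.
FROM ubuntu:22.04

# Add your own packages on top of the base image.
RUN apt-get update && apt-get install -y nginx

# Ship your application files inside the image itself.
COPY index.html /var/www/html/

# Command to run when a container is started from this image.
CMD ["nginx", "-g", "daemon off;"]
```

Building this (with something like `docker build`) and pushing the result means anyone who pulls the image gets the exact same environment you tested locally.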
Comments
need clarification
Are sun virtual box and oracle VM virtual box examples of hypervisors?
very Informative
Thanks for shedding light on something which has lot of ambiguity..
Excellent Info
This tutorial gave me a good understanding on the basics of virtualization and containers. Thanks!
REALLY NICE article - clear, logical and easy to understand; with related buzzwords explained.THANKS for all the effort!!!
very simple and straightforwrard article
After reading this article, I know the differences now between Virtualization and Containers. Thanks for sharing this knowledge. Would expect more on examples also :)
That well explained the distinction between virtual machine and virtual container. Thanks
Best and simplest virtualization explanation I have ever read
Thank you very much for such great article. This is what I want to look. Very simple and well explained, I really love your awesome work. This save me a lot of time when I searched for figuring out what are differences between docker and vagrant (and other visualization engines). That is what I'm here.
So thank you again, keep doing your great works.
Thank you so much!
Thanks a lot
I was searching for Virtualization image to build the presentation. And I stumbled upon your blog. Very informative. Documented the differences clearly.
Thank you so much.
container
very helpful
Thanks for the clear explanation
I really liked the way you have explained the concept it's simple and clear.
Thanks a lot for this , anyone can understand whats virtualization and container means .:)
Very nicely explaned.
I always refer slashroot whenever I think I need to build the concept first.