My name is Philipp C. Heckel and I write about nerdy things.

Hybrid Clouds: A Comparison of Cloud Toolkits







Download as PDF: This article is a slightly shortened version of my seminar paper. Feel free to download the original PDF version, or the presentation slides.


3. Cloud Toolkits

After discussing the opportunities and obstacles of cloud computing, this section introduces existing toolkits for deploying private and hybrid clouds, i.e. software to manage a virtual private infrastructure. The first part gives a brief overview of commercial and non-commercial toolkits, and outlines their differences as well as similarities. The second part of the chapter analyzes the technical requirements for deploying a hybrid cloud, and then presents the two open source toolkits OpenNebula and Eucalyptus (was at eucalyptus.com, site now defunct, July 2019) in greater detail.

3.1. Market Overview

Many companies use the hype around cloud computing to advertise their products as cloud-enabling software. And in fact, the vague definition of the term allows a broad interpretation of what this comprises, and makes it rather difficult to determine relevant products. Table 1 and the following paragraphs briefly introduce the most important virtual infrastructure management tools, i.e. cloud toolkits. The scope of this analysis will only include IaaS solutions that allow deploying private and/or hybrid clouds. It does not attempt to be complete, but simply represents a snapshot of well-known toolkits.

| | VMware vSphere | RHEV | XenServer | Hyper-V | Eucalyptus | Nimbus | OpenNebula | oVirt |
|---|---|---|---|---|---|---|---|---|
| Hypervisor | VMware | KVM | Xen, Hyper-V | Hyper-V, Xen | Xen, KVM, VMware | Xen | Xen, KVM, VMware | KVM |
| VLAN | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Scheduling | Yes | Yes | N/A | N/A | Limited | External | External | No |
| Live Migration | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| High Availability | Yes | Yes | Yes | Yes | No | No | No | No |
| Hybrid Cloud | No | No | No | No | Partially | Partially | Yes | No |
| Admin GUI | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Requires Intel VT / AMD-V | No | Yes | only for Windows guests | No | if KVM hypervisor is used | only for Windows guests | if KVM hypervisor is used | if KVM hypervisor is used |
| Guest OSs | W/L/So/N | W/R | W/R/C/S/D | W/S/R | Depends | Depends | Depends | Depends |
| License | Proprietary | Proprietary | Proprietary | Proprietary | BSD | Apache 2 | Apache 2 | GPLv2 |
| Annual Cost | up to $4400 per CPU | up to $750 per socket | Free / up to $5500 per host | up to $3300 per CPU | Free / N/A | Free | Free | Free |

Table 1: Comparison of Cloud Toolkits. In case there is a free and premium version, terms in italic font indicate features that are only available in the premium version. This table is mainly based on Sotomajor, 2009, Blanco, 2009 (was at: http://www.opennebula.org/_media/constantino_vazquez_-_opennebula_-_executing_sge_clusters_on_top_of_hybrid_clouds_using_opennebula.ppt, site now defunct, July 2019), VMware Costs, RHEV Pricing (was at: http://www.redhat.com/f/pdf/rhev/DOC113R8-Pricing-and-Licensing-for-RHEV-for-Servers.pdf, site now defunct, July 2019), Citrix Requirements, RHEV Datasheet, and VMware Paravirtualization (was at: http://www.vmware.com/files/pdf/VMware_paravirtualization.pdf, site now defunct, July 2019).

VMware, the biggest player in the virtualization market, offers several cloud-enabling pieces of software: its flagship vSphere, formerly known as VMware Infrastructure, is a full data center virtualization solution. It is based on the ESX hypervisor, which manages a single host. By combining many ESX hosts and connecting them with an internal network, the virtualized servers form a private cloud. Even though VMware advertises vSphere as a hybrid cloud solution, its hybrid abilities are very limited: vSphere is able to scale out only if the public cloud provider also uses vSphere, i.e. using Amazon’s EC2 for cloudbursting is not supported (cmp. McLaughlin, 2009 and Blanco, 2009 [was at: http://www.opennebula.org/_media/constantino_vazquez_-_opennebula_-_executing_sge_clusters_on_top_of_hybrid_clouds_using_opennebula.ppt, site now defunct, July 2019]). Because of this major deficit, vSphere classifies as pure data center virtualization software rather than as a hybrid cloud toolkit. In terms of features, however, vSphere is far ahead of its open competitors. In particular, VMware offers enterprise features such as high availability, fault-tolerance, and distributed resource scheduling.

With its data center virtualization solution Red Hat Enterprise Virtualization (RHEV), Red Hat also focuses on enterprise customers. The RHEV hypervisor is based on the Kernel-Based Virtual Machine (KVM) and is built from a subset of Red Hat Enterprise Linux (RHEL). RHEV’s features include live migration, virtual networking, and image management. Similar to VMware vSphere, it additionally provides a sophisticated Administration User Interface and enhanced enterprise features.

As a relatively new player in the cloud market, Microsoft entered the competition with its virtualization solution Hyper-V. Hyper-V provides functionality similar to that of its competitors, but is not yet as advanced in terms of fault-tolerance and other virtualization-specific features. Instead, it focuses on traditional server features and goes deeper with alert management and monitoring. In the new R2 revision, it added live migration and a cluster-aware file system. Hyper-V is available as a standalone hypervisor, or included in the Windows Server 2008 R2 operating system.

Another big player in the virtualization sector is Citrix Systems: with its XenServer, it competes with vSphere and RHEV in the enterprise segment and therefore mainly focuses on private cloud provisioning for data centers. XenServer is based on the Xen hypervisor and allows the deployment of a virtual data center. Like its competitors, it features live migration (XenMotion) and simple GUI-based multi-server management. Since February 2009, XenServer has been available for free, including most of its enterprise features. The commercial extension Citrix Essentials allows using Microsoft’s hypervisor Hyper-V, and enables additional features such as high availability and e-mail alerting.

Compared to the commercial products, the open source cloud toolkits Eucalyptus (was at eucalyptus.com, site now defunct, July 2019), Nimbus, OpenNebula and oVirt have a very limited feature set. None of them provides high availability or fault-tolerance, and only oVirt includes a usable graphical user interface. The projects are at a very early stage of development and have not yet reached production stability. Most projects lack complete, up-to-date documentation and have a complicated installation process. Deploying a private or hybrid cloud using these toolkits involves a lot of Linux scripting and requires significant knowledge about other software and technologies (e.g. about external schedulers, lease managers, or virtual file formats). In fact, some experts believe that the open source solutions are “light years away” from their commercial competitors (Magnus, 2010).

The most prominent example of the open source toolkits is Eucalyptus (was at eucalyptus.com, site now defunct, July 2019), a project initiated at the University of California, Santa Barbara in 2007. Eucalyptus is an acronym for Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems and is now run by Eucalyptus Systems, an open source company founded to promote and further develop the software. Eucalyptus is available as an open software package as well as a commercial product, the Enterprise Edition (EE). While the open product supports the Xen and KVM hypervisors, Eucalyptus EE additionally works with VMware’s ESX/ESXi, i.e. with vSphere. Eucalyptus implements several Amazon AWS interfaces (including EC2, S3 and EBS) and thereby allows the creation of an EC2-compatible private cloud. Like all presented solutions, it also features live migration, VLAN support, and image management.

Similar to Eucalyptus, the Nimbus project also focuses on providing a private IaaS cloud with AWS-compatible interfaces. It was founded by researchers at the University of Chicago in 2008, and mainly concentrates on scientific computing. The initial implementation was based on the Xen hypervisor, but newer versions also support KVM. It is very adjustable in terms of third party software and can be configured to use external schedulers such as the Sun Grid Engine (SGE) (was at: gridengine.sunsource.net, site now defunct, July 2019). In future releases, the developers plan to include an EC2 back-end so that EC2 instances can be integrated transparently.

In contrast to Eucalyptus and Nimbus, OpenNebula explicitly advertises itself as a hybrid cloud toolkit. The project was started as a cooperation between the Universities of Chicago and Madrid in 2008, and focuses on integrating different clouds in a single system. It uses a plugin-based technology to support a variety of different hypervisors and public cloud providers. It currently works with Xen, KVM or VMware and can integrate Amazon EC2 as well as ElasticHosts as cloud providers. Compared to the other projects, OpenNebula corresponds most closely to the hybrid cloud definition.

oVirt is part of Red Hat’s Emerging Technology project and was first released in 2008. The project describes itself as a virtualization management framework for realizing a virtual data center. It is based on the libvirt library and can therefore potentially support a variety of hypervisors. However, the current stable version of the software only works with KVM. Compared to the other open source projects, oVirt’s greatest strength is its Web-based administration interface: while OpenNebula, Eucalyptus and Nimbus only provide command line interfaces and APIs, oVirt allows managing multiple hosts and resource pools with a simple Web GUI.

Currently, commercial cloud software providers such as VMware, Red Hat and Citrix have a more advanced feature set and provide fully integrated enterprise solutions. With enhanced GUI tools and seamless integration of multiple servers, they all allow relatively easy management of virtualized data centers. In contrast, the open source solutions are far behind in terms of completeness and stability. However, while they do not offer a complete enterprise-ready software package, they are more advanced when it comes to hybrid cloud computing: Eucalyptus, OpenNebula and Nimbus, for instance, allow the creation of a hybrid cloud at least to some extent. While Eucalyptus and Nimbus reach this goal by simply imitating the EC2 interface, OpenNebula integrates remote resources transparently. Nimbus and Eucalyptus can hence be controlled with the same tools as Amazon’s Elastic Compute Cloud.

3.2. Technical Requirements and Restrictions

The currently available cloud software allows the creation of a very flexible virtualized computing infrastructure and can bring great benefits in a modern IT environment (cmp. section 2.1). However, as indicated in section 2.2.4, clouds have much room for improvement in terms of interoperability. Most cloud software is designed to work in a well-defined infrastructure, and the cloud works as expected only if all requirements are met. The commercial cloud software in particular (such as VMware vSphere or RHEV) strictly defines the supported hardware and software, but open source solutions also place high demands on the systems.

3.2.1. Hardware Requirements

The hardware requirements of cloud solutions vary widely, and depend not only on the hypervisor used, but also on the cloud management software. In order to guarantee service levels for their software packages, VMware, Citrix and Red Hat define very large hardware compatibility lists of supported processors, storage and I/O systems (cmp. Citrix HCL, VMware HCL, Red Hat HCL). For companies with incompatible hardware, switching to a virtualized infrastructure can hence become very expensive because new hardware might be necessary. The non-commercial projects do not explicitly define supported hardware, but instead simply specify minimal hardware requirements such as memory or CPU speed.

These requirements often include CPU virtualization technologies such as Intel VT/VT-x or AMD-V. Depending on the hypervisor and the type of virtualization, the host system’s processors must provide these virtualization extensions to function. While KVM only supports hardware-assisted virtualization, i.e. virtualization that relies on these CPU extensions, ESX- and Xen-based cloud toolkits also allow full virtualization and paravirtualization (cmp. Citrix Requirements and VMware Paravirtualization [was at: http://www.vmware.com/files/pdf/VMware_paravirtualization.pdf, site now defunct, July 2019]). In the case of Xen, however, paravirtualization is only possible for Linux guests. For proprietary operating systems, only hardware-assisted virtualization is possible. Table 1 indicates exact compatibilities.
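Whether a Linux host offers the CPU extensions mentioned above can be checked before a KVM-based toolkit is installed. The following is a minimal sketch, not part of any of the toolkits: it assumes a Linux host that exposes its CPU flags in /proc/cpuinfo, where the flag vmx marks Intel VT-x and svm marks AMD-V. The function names are made up for this illustration.

```python
from typing import Optional

# CPU flag names as they appear in /proc/cpuinfo on Linux:
# "vmx" marks Intel VT-x, "svm" marks AMD-V.
VIRT_FLAGS = {"vmx": "Intel VT-x", "svm": "AMD-V"}


def detect_hw_virtualization(cpuinfo_text: str) -> Optional[str]:
    """Return the name of the virtualization extension found, or None."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            for flag, name in VIRT_FLAGS.items():
                if flag in flags:
                    return name
    return None


def read_host_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the CPU description of the local Linux host (illustrative helper)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

On a candidate host, `detect_hw_virtualization(read_host_cpuinfo())` returns the detected extension or `None`. The flag only describes what the CPU is capable of; the extension may still be switched off in the firmware settings.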

Besides CPU compatibility, especially the commercial cloud solutions rely on a very specific hardware configuration and topology. Some products require certain network layouts, or other components to be present. Citrix XenServer, for instance, requires VM images to reside on a SAN storage “to use advanced platform features such as resource pools, shared storage, live migration and high availability” (Getting Started with XenServer). VMware’s vSphere only supports live migration for a limited set of processors: administrators have to “make sure that the source and destination hosts have compatible processors” (VMware Compatibility Guide).

While the big vendors have very high requirements, they can at least guarantee that the system works as expected if the listed hardware is used. For the open source projects, however, compatibility between hosts requires a trial-and-error approach.

3.2.2. Operating Systems and Software Restrictions

In addition to the numerous hardware requirements, most cloud toolkits restrict the number of usable operating systems and other software significantly. While the open source solutions do not specifically list the officially supported operating systems, the vendors of commercial virtualization software only certify a very small number of guest OSs.

Microsoft and Red Hat in particular focus on supporting their own operating systems and hence only provide a very limited choice: Hyper-V supports several Windows versions, including Windows Server 2000 to 2008 R2 and Windows XP to 7 (excluding the Home editions; cmp. Hyper-V Guest OSs [was at: http://www.microsoft.com/windowsserver2008/en/us/hyperv-supported-guest-os.aspx, site now defunct, July 2019]). However, in terms of other OSs such as Solaris or Linux, it only supports two enterprise Linux distributions for a single virtual CPU (Red Hat Enterprise Linux 5 and SUSE Enterprise Linux 10–11). Red Hat’s RHEV officially supports even fewer operating systems: besides its own RHEL, it is only compatible with certified versions of Windows XP, 2003 and 2008. The other commercial solutions have similar compatibility lists: even though vSphere and XenServer officially support a broader range of OSs, they still limit the choices significantly.

Compared to the commercial vendors, the open source solutions seem to have much broader OS support. KVM-based cloud toolkits, for instance, have been reported to work with over 100 guest operating systems of seven OS families, including Windows, Debian/Ubuntu, Red Hat/Fedora, BSD, and Solaris. However, the two paradigms pursue completely different goals: while the commercial software aims for stability and production use, its open competitors focus on supporting a large number of operating systems.

In addition to the limited number of certified guests, all of the available VI toolkits enforce the use of specific APIs and command-line tools to manage the cloud. These tools are mostly product-bound and require a certain amount of expertise. Since they cannot be used to control other clouds, switching to different software can become very expensive.

As with the required hardware, the OS and software restrictions of current cloud toolkits are very noticeable and have to be considered before deploying a private or hybrid cloud. When businesses have to choose a cloud solution, they not only have to consider obvious issues like availability, security and cost, but also face many problems when it comes to compatibility of hardware and software: current cloud toolkits are still far from running on arbitrary machines. Instead, they are only usable in certain configurations and topologies, using certified hardware and a limited set of software.

3.3. OpenNebula and Eucalyptus

As two of the most promising open source solutions for deploying private and/or hybrid clouds, OpenNebula and Eucalyptus have attracted considerable attention in the last year. Even though they both enable the conversion of a regular data center into a virtualized infrastructure, they follow completely different approaches.

3.3.1. Eucalyptus

Eucalyptus (was at eucalyptus.com, site now defunct, July 2019) is an open source software framework that implements an IaaS environment. It can deploy private or public clouds, and “gives users the ability to run and control entire virtual machine instances deployed across a variety of physical resources” (Nurmi et al., 2009 [was at: http://www.cca08.org/papers/Paper32-Daniel-Nurmi.pdf, site now defunct, July 2019]). Since October 2009, Eucalyptus has been part of the Linux distribution Ubuntu Server, rebranded as Ubuntu Enterprise Cloud.

The Eucalyptus API is compatible with Amazon EC2 and hence makes it possible to control both Amazon and Eucalyptus instances with the same tools. Its main objectives are to provide a platform for testing applications before they are moved to Amazon’s infrastructure, as well as to manage and control large collections of distributed resources.

In general, Eucalyptus (and the included storage daemon Walrus) emulates EC2 and S3 by providing the same SOAP and Query interfaces as Amazon, and by behaving similarly to the real Amazon cloud. However, even though the external APIs are mostly identical, the interior of the Eucalyptus cloud is rather different: Amazon’s EC2 is based on a modified version of the Xen hypervisor and uses its own image format (Amazon Machine Image, AMI). In contrast, Eucalyptus can run on Xen, KVM or VMware ESX and also ships with a different VM format (Eucalyptus Machine Image, EMI). With the exception of the APIs, the two systems are hence completely incompatible.
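What "the same Query interface" means in practice can be illustrated with a small sketch: a Query request is an HTTP request whose parameters name an action such as DescribeInstances, so pointing a client at a different cloud is a matter of changing the endpoint. The endpoint values below are assumptions for illustration only (in particular the host name, port and path of the private cloud), and the authentication parameters and request signature that a real request needs are deliberately left out.

```python
from urllib.parse import urlencode


def build_query_url(endpoint: str, action: str, **params: str) -> str:
    """Build an unsigned EC2-style Query API URL for the given endpoint."""
    query = {"Action": action}
    query.update(params)
    # Sort parameters so the same request always yields the same query string.
    return endpoint + "?" + urlencode(sorted(query.items()))


# Assumed endpoints: the public EC2 endpoint and a hypothetical private
# Eucalyptus front-end (host name, port and path are illustrative).
AMAZON_ENDPOINT = "https://ec2.amazonaws.com/"
PRIVATE_ENDPOINT = "http://cloud.example.internal:8773/services/Eucalyptus"

amazon_url = build_query_url(AMAZON_ENDPOINT, "DescribeInstances")
private_url = build_query_url(PRIVATE_ENDPOINT, "DescribeInstances")
```

Both URLs carry the identical query string and differ only in the endpoint, which is why the same client tools can address either cloud. The image identifiers passed to such requests are not interchangeable, though, since AMIs and EMIs are different formats.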

Eucalyptus does not advertise itself as a hybrid toolkit, but rather as private cloud software. Whether or not it qualifies as hybrid in fact depends strongly on the definition of hybrid cloud computing: Eucalyptus does not integrate remote resources transparently in the private infrastructure, and it provides no tools to extend the local capacity via external cloud providers. Instead it simply emulates the EC2 infrastructure, and can thereby serve as a foundation for hybrid solutions. It is not designed to be hybrid cloud software, but rather “to bridge between public and private clouds to enable hybrid cloud infrastructures” (Dooley, 2010).

3.3.2. OpenNebula

In contrast to Eucalyptus, OpenNebula is currently the only software on the market that describes itself as a hybrid cloud toolkit. And in fact, compared to the other solutions it corresponds most closely to the definition of a hybrid cloud. While other toolkits either introduce their own proprietary infrastructure (vSphere, RHEV, XenServer, Hyper-V) or emulate others (Eucalyptus, Nimbus), OpenNebula transparently integrates external resources in the cloud.

It has a flexible component-based structure and includes several predefined drivers for information management (monitoring), image and storage management (e.g. via NFS, LVM, or SSH), and several hypervisors (KVM, Xen, VMware). In addition to the numerous private cloud drivers, OpenNebula also integrates drivers for Amazon EC2 and ElasticHosts, and can be easily extended to support other cloud providers. With its support for EC2, OpenNebula can be configured to use Amazon’s infrastructure for cloudbursting, or to use other EC2-compatible systems (such as Eucalyptus or Nimbus).
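The idea behind such a driver-based design with cloudbursting can be sketched in a few lines. The following is a simplified illustration and not OpenNebula's actual driver interface: all class and function names are hypothetical, local hosts are modeled with a fixed number of free slots, and the external provider is assumed to always have capacity.

```python
from dataclasses import dataclass
from typing import List, Protocol


class Driver(Protocol):
    """Common interface every (hypothetical) driver has to offer."""

    name: str

    def has_capacity(self) -> bool: ...

    def deploy(self, vm_name: str) -> str: ...


@dataclass
class LocalHostDriver:
    """Hypothetical driver for a local hypervisor host with fixed capacity."""

    name: str
    free_slots: int

    def has_capacity(self) -> bool:
        return self.free_slots > 0

    def deploy(self, vm_name: str) -> str:
        self.free_slots -= 1
        return f"{vm_name} -> {self.name}"


@dataclass
class RemoteCloudDriver:
    """Hypothetical driver for an external provider, assumed always available."""

    name: str

    def has_capacity(self) -> bool:
        return True

    def deploy(self, vm_name: str) -> str:
        return f"{vm_name} -> {self.name}"


def place(vm_name: str, drivers: List[Driver]) -> str:
    """Deploy on the first driver with capacity; list local drivers first."""
    for driver in drivers:
        if driver.has_capacity():
            return driver.deploy(vm_name)
    raise RuntimeError("no driver has capacity for " + vm_name)
```

With `drivers = [LocalHostDriver("kvm-host-1", 1), RemoteCloudDriver("ec2")]`, the first call to `place` lands on the local host and the second one bursts to the external provider once the local capacity is used up. The caller never needs to know which kind of resource served the request, which is the transparency the text describes.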

An OpenNebula cloud typically consists of a front-end node for administration purposes (i.e. for managing hosts and images), as well as of several cluster nodes to execute the VM images. The hosts are controlled by the front-end either via command-line tools, or via well-defined programming interfaces.

Even though OpenNebula in its current state of development is far from production-ready, it “differentiates itself from the other platforms in the sense that it has been designed to federate existing technologies” (Cerbelaud et al., 2009). By leveraging the advantages of other virtualization software, it combines them into a strong hybrid cloud tool with high potential.
