The Business Value of Automation

Many enterprises have now reached a stage where managing the scale of their IT infrastructures is becoming a challenge second only to increasing the speed of provisioning. A broad spectrum of technologies, solutions, processes, and skillsets is available to help manage a large scale environment. Among management technologies, automation is one of the most powerful.

Automation is not merely a technology choice. First and foremost, it’s a business choice.
Without automation, supporting the growth of your business can be increasingly complex, to the point of becoming impossible beyond a certain scale. If it’s true that software is eating the world, as the well-known venture capitalist Marc Andreessen said in 2011, and if it’s true that every company is becoming a technology company, as the Chief of Research at Gartner said in 2013, then automation becomes a must-have tool in the hands of the business, not just of the IT organization.

Here are four key reasons why automation is critical to managing a large-scale IT environment:

Less waste: automation optimizes IT operations

To grow your business, you can either offer more services or focus on expanding the capabilities of an existing one, providing multiple service levels or various degrees of customization.
Most likely, an enterprise organization makes both choices, multiple times throughout its lifespan. Accordingly, the IT environment evolves over time, starting from a fairly simple environment with limited capacity and ending as a large-scale jungle of multiple languages, platforms and architectures that must be supported for many years.

To describe the evolution of the IT environment, I will use a maturity model that we introduced in the blog post “How to manage the cloud journey?”, built around two dimensions: scale and complexity. According to our model, if your business is successful, the IT infrastructure that sustains your organization should grow both in scale, hosting more and more workloads, and in complexity, hosting an increasingly diversified set of workloads.

To support such evolution, you have two options: hiring at the same pace as your infrastructure grows, or empowering your IT organization with a new set of tools that can scale their operational capabilities. Doing nothing isn’t really a viable option, because you can’t expect to manage the growing scale and complexity with a mostly flat number of people.

To give you more context, I will use the research on the TCO for a private cloud based on OpenStack that we published last year. Some of the data we used in this research comes from the Server Support Staffing Ratios report published by Computer Economics, Inc. The study found that a large organization*, with an average level of automation, supports 46 operating system instances (mix of physical and virtual) per system administrator while the same large organization, with high levels of automation, supports 101 instances per admin.

In our research, we assumed the number of workloads doubles every year. Hypothetically, in a similar situation, if you decide not to invest in automation, you would have to double your operations staff as well.

Matching a similar pace from a hiring perspective can be extremely hard, if not impossible, due to a number of factors: limited OPEX budget, slow hiring process, scarcity of skilled resources on the market, and more.
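To make the arithmetic concrete, here is a minimal sketch, in Python, of the staffing math implied by the Computer Economics ratios quoted above. The starting environment of 500 OS instances, doubling every year, is a hypothetical illustration, not a figure from the study:

```python
import math

# Support ratios from the Computer Economics report cited above:
# OS instances supported per sysadmin at average vs. high automation.
RATIO_AVERAGE = 46
RATIO_HIGH = 101

def admins_needed(instances, ratio):
    """Whole sysadmins required to support a given number of OS instances."""
    return math.ceil(instances / ratio)

# Hypothetical environment: 500 instances today, doubling every year.
instances = 500
for year in range(4):
    avg = admins_needed(instances, RATIO_AVERAGE)
    high = admins_needed(instances, RATIO_HIGH)
    print(f"Year {year}: {instances} instances -> "
          f"{avg} admins (average automation), {high} (high automation)")
    instances *= 2
```

Even in this toy scenario, by year three the average-automation organization needs more than twice the staff of the highly automated one for the same workload.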

In the second scenario, by empowering your organization with new tools, assuming the right tools are identified, you can enable IT operations to get more of the existing job done in the same number of working hours. The new tools must be designed to execute tasks in a more efficient way, through increased ease of use, higher flexibility to adapt to the use cases, faster computation, or a mix of all these things. For example, automation can help a team deploy applications more often, and deploy them faster; fix issues at a broader scale, and fix them faster.

Highways are a good analogy to explain the concept. When the population in a geographical area grows, the government is forced to develop the infrastructure to support the additional cars. In some places, these highways must be equipped with toll gateways to regulate access, and toll gateways must be operated by humans, each performing thousands of repetitive operations per day. In turn, the newly built highways attract even more citizens to the region, and more cars onto the road. The government can deal with the spike in traffic either by adding more gateways and hiring new people to manage them, or it can make the existing highways more efficient by implementing automated barriers and automated access systems like E-ZPass.

Electronic tolls let more cars through the highway in the same amount of time: by eliminating the brief stop at the booth and the interaction between the driver and the human operator or the cash machine, automation allows each car to move through the toll at a higher speed, in less time.

In the same way, IT automation can help your Operations team to manage more workloads over the same working hours, reducing the need to hire more staff to support the growth of the infrastructure.

Less complexity: automation orchestrates sophisticated services

Beyond a certain point, success often leads to a diversification in market demand, as your popularity attracts a broader audience. In an enterprise environment, this dynamic implies that some of the Lines of Business (LoBs) that you are serving may eventually start requesting services that are outside the planned offering, or far more complex than originally anticipated.

For example, after a cloud computing environment designed to offer a few highly standardized services becomes highly successful, the organization may experience a rising number of requests to serve complex custom services. Each of those custom services includes many application tiers that must be coordinated in terms of provisioning, configuration, sequential system updates and patching, migration (if necessary), and retirement. The risk is that this newly introduced complexity, paired with the scale you have reached, lessens productivity if not properly managed.

Let’s use a different analogy to explain how automation can simplify by orchestrating complex systems: the automatic transmission in a car. In modern cars, the transmission is managed by a computer which operates the gearbox and the clutch, coordinating them with engine, brakes, wheels and many other components. Automating the gear shifting task reduces the amount of work required to drive because it simplifies the entire process, for example by removing the need to monitor the tachometer.

The simplification introduced by the automatic transmission is particularly useful when a certain aspect of driving must be repeated over and over. For example, thanks to automation, the continual cycle of accelerating, stopping, and re-accelerating when stuck in traffic during rush hour doesn’t require the driver to shift gears countless times.

Like the automatic transmission in your car, an automation tool is designed to deal with a large number of moving parts at the same time, taking care of repetitive tasks and keeping all the pieces together while delivering predictable results.

Moreover, to comply with regulations and reduce emissions, car manufacturers have introduced more efficient gearboxes with eight or even ten gears. Manually shifting through eight gears not only would be complex and distracting, but it would also almost certainly result in highly inefficient driving.

In the same way, automation tools can simplify your experience of deploying and maintaining applications of increasing complexity, from the multi-tier service composition to the configuration of ancillary components such as networking and firewalls.

Fewer mistakes: automation reduces human errors

Even the most talented member of your IT operations team is human, and humans are prone to mistakes. The larger and more complex an environment, the higher the chances for mistakes.
For example, large-scale environments easily force the IT organization to deal with time pressure and stress. The psychological stress comes from the realization that a task, even a simple one, can’t be accomplished across all the managed machines in the allocated amount of time with a low probability of errors.

It must also be considered that growing complexity leads to more intricate operations that need to be performed. Highly complex tasks require constant focus and precision – skills that not all team members may possess.

In the previous section, I mentioned how automation can more easily coordinate the operations performed on a series of application tiers that compose a sophisticated business service. The challenge is not just about dealing with many moving parts (the “what”), but also dealing with the highly complex configurations that apply to each tier and define the relationship between the tiers (the “how”).

Back to our car analogy: the Autopilot feature recently introduced by Tesla is a good example of how automation can assist humans and help them avoid mistakes. Driving is certainly a complex task, yet one manageable by humans. Automation here, while not yet perfect, can be twice as reliable as human drivers.

Less uncertainty: automation prepares you for the future

So far I have talked about the value of automation in facing today’s challenges, but automation can do more than that. Automation can also better equip organizations to face the uncertainty of the future.

As an abstraction layer that interconnects many elements of enterprise IT and operates at scale with minimal effort, automation can be seen as an extensible platform that evolves and adapts to market changes.

Automation as a platform builds upon the foundational elements you already have in your computing environment, simplifying the evolution of existing services and the creation of completely new ones.

For example, automation can simplify the deployment of existing applications across new public and private cloud infrastructures, even ones that don’t exist today.

In another example, automation can make it easier to combine new IT components, like a new Identity and Access Management (IAM) service, with existing ones, to help create new offerings at a fraction of the time otherwise required to re-engineer the whole stack from scratch.

Let’s take the automatic gearbox analogy to another level. Think about a highway with a long queue of cars progressing at a very low, tedious speed. To reduce the annoyance of driving in queues, several manufacturers introduced a technology called Adaptive Cruise Control (ACC). ACC uses information from sensors like radars and cameras to instruct the car’s control systems to automatically follow the vehicle in front, adjusting speed and stopping when necessary. The development of such technology, just as with the Autopilot, has been possible thanks to the automation of several car components, pioneered by the automatic transmission. In this example, the automatic transmission is a key building block for car innovation. As car manufacturers introduce more and more new features, the automatic transmission acts as both enabler and actuator of many new capabilities.

In summary, automation is not just a great tool to deal with today’s market demands, but it can also be a fundamental building block to help sustain the growth and evolution of your business tomorrow. However, as I said at the beginning of this post, automation is just one of the many technological, operational, and cultural elements that you may need to introduce in your organization as part of a digital transformation journey. Automation alone is not enough.

Massimo Ferrari
Management Strategy Director

*With an IT operational budget of $20 million or greater.


How to Manage the Cloud Journey?

By now, it’s hopefully clear that Red Hat is very serious about management, with a continued commitment and a constant eye on the big picture.


Over the years, we have expressed our commitment to becoming a key IT management player in a number of ways.

This week, we further express that commitment in additional ways.

We have grown and evolved our portfolio to enable and support our customers in their march towards Frictionless IT, shaping our offering around its core principles, like ease of use. Three examples:

  • Our newest offering, Insights, features a Software-as-a-Service delivery model to minimize the cost of entry.
  • Ansible, already considered one of the easiest products to use among IT automation and configuration management tools on the market, has grown in popularity as an easy way to manage containers.
  • CloudForms ships as a single virtual appliance, where some competitors still want you to set up and configure 6-12 systems to deploy their cloud management platform.

But what is the big picture? Why these specific products? What is guiding our decisions in terms of management portfolio growth?


Part of the answer to these questions is in the cloud maturity model below, built around the dimensions of scale and complexity.
Cloud Journey
If your cloud project is successful, it will likely grow in scale as more and more Lines of Business trust your IT organization to host their applications. This is not just common sense: we have seen the relationship between success and scale first hand, working with many customers worldwide, with further validation coming from a highly detailed TCO analysis that we published recently.

Popularity has a side effect. As more LoBs approach the private or hybrid cloud you built, the business demand will also likely start to diversify, and your IT organization may be asked to host not just a great variety of greenfield applications, but also brownfield ones that were not designed to run on IaaS and PaaS clouds. We saw this over and over in conversations with clients, and we heard it even more on stage at industry events like the OpenStack Summit, from now very experienced early adopters.

This diversification increases the complexity of the cloud environment, both in terms of complexity of applications to deploy and manage, and in terms of integration with IT systems outside the cloud environment.

So the question is: “How do you manage that growing complexity as you evolve in your cloud journey?”
Cloud Journey with Products
You begin the cloud journey by asking yourself, “Can we move faster?” The first step to answering that question consists of deploying a cloud engine. Your decision to adopt an IaaS or PaaS cloud engine depends on many factors, including cultural fit, readiness to standardize the computing stack at a certain level, preference for working with virtual machines or containers, and much more.

Your cloud engine of choice comes with its own set of management tools, which are perfectly fine for addressing the needs of an IT organization up to a certain level of complexity. Beyond that level, which varies from organization to organization, your business will likely require more sophisticated support, just like a landlord who expands his or her real estate business by embracing services like Airbnb and beyond.

As you grow in scale, the first management solutions you may want to consider are the ones that can preserve the health of your growing IT environment, enabling you to run at scale. For this stage of maturity, Red Hat offers Insights and Satellite. Insights can proactively identify configuration issues and security vulnerabilities before they become critical, generating an appropriate remediation plan*. Satellite, in turn, can deploy trusted software content and security patches at scale, enabling IT Ops to fix whatever issues have been identified by Insights or by the IT organization manually.

As the complexity increases along the way, and your cloud environment is requested to serve and host increasingly diverse applications, you may want to consider an IT automation solution to help maximize the efficiency of your cloud. For this stage of maturity, Red Hat offers Ansible. Ansible is capable of automating the provisioning and configuration of the components of a multi-tier application, including the underlying resources that serve the application, like networking.
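As a toy illustration of the kind of coordination an automation tool handles for you (this is plain Python, not Ansible itself), consider provisioning the tiers of a multi-tier application in dependency order. The tier names and dependencies below are hypothetical:

```python
# Provisioning a multi-tier app means executing tiers in dependency order.
# An automation tool computes and executes this ordering for you; here we
# sketch the idea with Python's standard-library topological sorter.
from graphlib import TopologicalSorter

# Hypothetical three-tier app: each tier lists the tiers it depends on.
tiers = {
    "network": [],
    "database": ["network"],
    "web": ["database"],
    "load_balancer": ["web"],
}

# static_order() yields every tier after all of its dependencies.
order = list(TopologicalSorter(tiers).static_order())
print(order)  # provisioning order, dependencies first
```

Doing this by hand for four tiers is easy; doing it reliably for dozens of interdependent components, including networking, is where automation earns its keep.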

As both scale and complexity reach their peak, at the most advanced stage of maturity in your cloud journey, you may want to consider a cloud management platform to govern the private cloud environment side by side with public clouds and pre-existing server virtualization environments in a coherent way. For this stage of maturity, Red Hat offers CloudForms. CloudForms provides a single pane of glass to control, automate and keep compliant a truly hybrid IT environment composed of VMware, Microsoft, Amazon, Google, and Red Hat technologies**.

In other words, we are investing in the Red Hat Management portfolio to support our customers at every stage of the maturity model described here, both when they follow the adoption path described so far and when they have more ad-hoc business needs to address***.

There’s much more that we can do, and that we’ll do, to empower IT organizations in their digital transformation. So, as usual, stay tuned for more.

Alessandro Perilli
GM, Management Strategy

* Red Hat Insights can do this with both IaaS and PaaS clouds. In fact, the new version of the platform that we just announced introduces support for containers, OpenStack-based private clouds, and KVM-based server virtualization environments.

** At the same time this post goes live, we announce the official support for Google Cloud Platform, alongside a remarkable number of other new capabilities and improvements.

*** Do we expect all customers to follow the exact adoption path described in this blog post? No. In fact, multiple customers start adopting some of our management solutions much earlier in their cloud journey. Which is why we leverage CloudForms in both our IaaS and PaaS cloud engines, or why we launched the Ansible Container project, as a way to support IT organizations that want to work with containers from the earliest maturity stages.

Open Source for Business People

Thanks to the effort of companies like Red Hat, Google, Netflix and many others, it’s safe to say that open source is no longer a mystery in today’s IT organizations. However, many struggle to understand the nuances that make a huge difference between vendors commercially supporting the same open source technologies.

Should the general public have any interest in understanding those nuances? A few years ago the answer would have been “no.” However, today, understanding those nuances is critical to select the right business partner when an IT organization wants to adopt open source.
As more vendors start offering commercial support for various projects, from Linux to OpenStack to Kubernetes, the need to understand the real difference between vendor A and vendor B becomes critical for CIOs and IT Directors.

At Red Hat, we have a TL;DR answer to the question “What makes you different from vendor XYZ?”. Our short answer is that we have more experience supporting open source projects, and we participate in and nurture the open source communities in a way most other industry players simply don’t.
This is a true statement, but what does it actually mean? How does that translate into a competitive advantage that a CIO can appreciate when selecting the best business partner to support his or her agenda? Today, I’ll try to provide the long version of that answer, with some simplifications, in a way that is hopefully easy for business-oriented people to understand.

To narrate this story, let’s take as an example a fictitious open source project that we’ll call “Project-O”, and tell its story in three chapters:

Chapter 1: Innovation brings instability

At any given time during the lifecycle of Project-O, any individual in the world can contribute a piece of code to:

  • introduce, complete or fix a feature (innovate)
  • improve performance (optimize)
  • increase reliability (stabilize)

To serve the business, we need to innovate and optimize. To protect the business, we need to stabilize. The continuous tension between these two needs drives hundreds or thousands of code contributions to Project-O at any given time. The bigger the project, and the larger the community supporting it, the more code is submitted.

Let’s use an analogy: if Project-O is an existing house, each code contribution is a renovation proposal. Imagine having hundreds or thousands of renovation proposals per day.

Just like renovation proposals, new code, especially code that introduces new features, can be written in a very conservative way or in a very disruptive way:

  • It’s conservative code when adoption doesn’t break other parts of Project-O. In other words, the individual who wrote the code has been mindful of the “backwards compatibility.”
    In our analogy, it’s when a renovation proposal doesn’t force the house owner to demolish existing walls or do some other major intervention to accommodate the proposed changes. Imagine painting a guest room.
  • It’s disruptive code when adoption breaks other parts of Project-O and requires some major reworking.
    In our analogy, it’s when a renovation proposal requires the house owner to make drastic changes to the plumbing system in the only bathroom. It can be done, of course, but it implies temporary instability and disruption inside the house.

Obviously, the more conservative the code, the fewer chances there are to innovate. And vice versa.

When an individual wants to improve Project-O, he or she has to submit the proposed code to a group of individuals, called “maintainers”, who govern the project and have the mandate to review the quality and impact of the code before accepting it.

A maintainer has the right to reject the code for various reasons (we’ll explain this in full in Chapter 3), and needs to make a fundamentally binary choice: requesting strong backwards compatibility or allowing disruptive code.

In our analogy, the maintainer is the house owner who has to carefully evaluate the pros and cons of each renovation proposal before approving or rejecting it.

If the house owner wants an amazing new wing of the house, he has to be ready to tear down walls, rework the plumbing system, and deal with a fair amount of redesign. In similar fashion, the maintainer that wants to innovate and quickly evolve Project-O has to allow more disruptive code and deal with the implications of that disruption.

To address the business demand, especially in a highly competitive market like today’s, the maintainer has no choice but to allow disruptive code wherever possible*.
How a vendor deals with that disruption makes all the difference, and can truly define its competitive advantage. This is where things get nuanced and interesting.

Chapter 2: Instability is exponentially difficult to manage in large projects

As we said, the larger the community behind an open source project, the larger the number of code contributions submitted at any given time. In other words, the number of things you can renovate in a standard apartment is infinitely smaller than the number of things you can renovate in a castle.

Let’s say that Project-O is a fairly complicated open source project, equivalent to a hotel in our analogy. For the maintainer of Project-O, the challenge is to consider and approve enough code contributions to keep the project innovative, but not so many that the community is overwhelmed by the number of things to fix at the same time. Imagine renovating the rooms in one wing, rather than all of them at once.

When many functionalities of Project-O break simultaneously due to too many code contributions, the difficulty of fixing them all together in a reasonable amount of time grows exponentially. The problem is that the market cannot wait forever for Project-O to become stable enough to use again. The innovation provided by the newly contributed code must be delivered within a reasonable amount of time to be competitive. Usually, large enterprises struggle to adopt a new version of Project-O if stable releases arrive more often than every six months; yet the same large enterprises won’t wait years for a new stable release.

Again, it would be as if the hotel owner in our analogy approved 10,000 renovation proposals, all executed at the same time, each one breaking existing parts of the hotel. Imagine upgrading the electrical, plumbing, and heating systems while remodeling the restaurant, all at once. Fixing the resulting disruption would be so incredibly difficult that the hotel would be completely unusable for an excessive amount of time.

For these reasons, the maintainer sets goals and deadlines to stop accepting code contributions. Once the deadline passes, no more code contributions are applied, and the community works to stabilize the new version of Project-O enough to be usable.

However, “usable” doesn’t necessarily mean “tested” or “certified as reliable”. It’s the difference between “I tried to run the code a dozen times and every time it worked” and “I ran the code thousands of times, under the most disparate conditions, and I know it will always work in the conditions I tested”. This is where competing vendors can make a business out of an open source technology that is fundamentally free for the entire world to access and use.

So, at a certain point, the maintainer freezes code contributions for Project-O. Subsequently, competing vendors look at all submitted code contributions and decide how much of it should be commercially supported** after their own extensive QA testing.
Because of this, the open source version of Project-O, called the “upstream”, is not necessarily identical to the commercially supported version of Project-O provided by vendor A, which in turn is not necessarily identical to the version provided by vendor B. There are small and big differences between these three versions, as they represent three discrete states of the same open source project.

Vendor A and vendor B need to make a decision on how much of Project-O they want to commercially support, trying to balance the need for innovation (addressed by newly disruptive code being accepted by the maintainer) and the exponential complexity of fixing the amount of instability caused by that innovation.

Chapter 3: How vendors manage instability is their competitive advantage

At this point, you may think that the differentiation between vendor A and vendor B is in how savvy or smart they are in “making the cut,” in how many new code contributions to Project-O they decide to support at any given time. In reality, that is only partially relevant. What really differentiates the two vendors is how they deal with the instability caused by the newly contributed code.
To manage this instability each vendor can leverage up to three resources:

  • Deep knowledge
  • Special tooling
  • Strong credibility

Deep knowledge
When much of the newly contributed code is disruptive in nature, many things can break at the same time within Project-O. Sometimes the new code breaks dependencies in a domino effect that is very complicated to fully understand. Fixing all broken dependencies quickly and effectively requires a broad and deep knowledge of all aspects of Project-O. It’s like the hotel owner who intimately knows the property inside and out through many years of renovations, and has a very clear idea of all the areas, obvious and non-obvious, affected by the changes in a renovation plan.

This is why vendors involved in the open source world make a big deal of statistics like the number of contributions to any given project, such as the ones captured by Stackalytics. Knowing how much and how broadly a vendor contributes to an open source project may seem a superficial and sometimes misleading metric, but it’s meant to measure how deep that vendor’s knowledge is. The deeper the knowledge, the more skilled the vendor is at managing the instability created by disruptive code.

Special tooling
No matter how deep the knowledge available, at the end of the day a vendor is an organization made of people, and people can make mistakes. Human error is unavoidable. Hence, to mitigate the risk of human error, some vendors develop special internal tooling that assists humans in understanding the impact of the instability created by newly contributed code, and in making the necessary changes across the board to make Project-O as stable as possible, as quickly as possible.

Without deep knowledge of Project-O, it can be impossible to develop and maintain any special tooling. So, human capital is the biggest asset a vendor involved in open source has.

Strong credibility
Through deep knowledge and/or specialized tooling, a vendor can identify and fix the broken dependencies in open source code faster than its competitors, but there’s one last challenge: submitting the patches to the maintainers and making sure that each part of Project-O is fixed in time for the newly contributed code to work in the upcoming release. If vendors get fixes accepted back “upstream,” they don’t have to maintain those fixes alone. But, for the fixes to be accepted, vendors have to prove their code helps Project-O, not just themselves.

Back to our analogy: the hotel owner accepted a certain number of renovation proposals to build a new wing, and compiled them into a renovation plan. The plan is ambitious, and the contractors executing it will break the current plumbing system in the process. Nonetheless, the plan must be completed within three weeks or the hotel will not remain competitive enough to justify the renovation plan itself. The contractor building the new wing breaks the plumbing system, as expected, and must ask for modifications from the contractor that owns that system. The owner of the plumbing system is willing to help, of course, but to comply he has to review the new wing project and the proposed changes to the plumbing system, and, if he agrees with them, order new pipes. The whole process would normally take five weeks, enough to compromise the whole renovation plan.

The only way to save the day is if the contractor building the new wing has strong credibility in plumbing: credibility so strong that the requested modifications to the plumbing system are accepted without question, and the pipes are ordered with express delivery. In other words, the owner of the plumbing system trusts the wing builder so much that a further review is not necessary.

Such credibility is not granted lightly in the open source world. Few individuals are granted that sort of trust, and it is earned over years of continuous contribution of new code and demonstration of deep knowledge.

A vendor’s ability to fix broken dependencies in a timely way depends on the amazing open source contributors who decide to join it. In fact, contrary to what one might assume, highly trusted open source contributors are not easily hired and retained through standard HR practices. They independently decide to join and stay with a vendor primarily because they believe in that vendor’s mission and in how it conducts business.

So, in summary, the difference between two vendors operating in the open source world boils down to how capable they are at managing the instability caused by innovation. That differentiation is very subtle and hard for anybody to appreciate until it’s time to face the instability.

Alessandro Perilli
GM, Management Strategy

* The deeper you go into the computing stack, all the way down to the operating system kernel, the less disruptive code is allowed, to avoid compromising the reliability of mission-critical systems and their capability to integrate with a well-established ecosystem of ISVs and IHVs. That’s why it’s much harder to innovate at the lowest levels of the stack.

** Commercially supporting open source software means that the vendor performs QA testing to verify code stability, provides technical support in case something doesn't work, issues updates and patches for security and functionality improvements, and certifies integration with third-party software and hardware components.

Elephant In The Room: What’s The TCO For An OpenStack Cloud?

A few months ago, for our own internal use, we started a project to calculate what it costs to run an OpenStack-based private cloud. More specifically, the total cost of ownership (TCO) over the years of its useful life. We found the exercise to be complex and time consuming, as we had to gather all of the inputs, decide on assumptions, vet the model and inputs, etc. So, in addition to the results, we're offering up a few lessons we learned along the way, which hopefully can save you a scar or three when you create your own TCO model.

Ultimately, we wanted answers to three layers of cost:

  1. What is the most cost effective method for acquiring and running OpenStack?
  2. How does OpenStack compare financially to non-OpenStack alternatives?
  3. How should we prioritize technical improvements to provide financial improvements?

An exhaustive survey of cloud TCO research showed that none of the cost models we could get our hands on were complete enough for our needs: some did not break out costs by year, some did not include all of the relevant costs, and none addressed potential economies of scale. We needed a realistic, objective, and holistic view, not hand-picked marketing results, though we did find a few suggestions that helped us get there, whatever the technology.

Since we could not find anything both comprehensive and transparent, we created our own model, and used the opportunity to go a few steps further by adding dimensions: the full accounting impact across cash flow, income statement, and balance sheet. The additional complexity made the model harder to understand and consume. Further, we needed the model to not only spit out projections, but to be a reliable way to compare options and support decision making throughout the life of a cloud, as options and assumptions change. So we decided to create a tool, rather than just a TCO model, for easy comparisons and conversations with financial teams and lines of business.

To help us view the data objectively, we relied as much as possible on industry data. Making assumptions was inevitable, since not all of the required data is available, but we made as few as possible and verified the model and results with a number of reputable and trusted organizations and individuals in both finance and IT.

What is the most cost effective method for acquiring and running OpenStack?

If you’re considering or even running OpenStack already, we imagine you’re asking yourself a few questions, “I have a smart team, why can’t we just support the upstream code ourselves?”. As Red Hat is commercially supported open source software, we can talk all day about the value of supported open source software, including the direct impact on OpenStack, but we also want to address the direct costs, the line items in your budget. To get to these costs and answer our questions, we shaped the model to analyze two different acquisition and operation methods for OpenStack:

  • Self-supported upstream OpenStack
  • Commercially supported OpenStack


As the model shows, the self-supported upstream use of OpenStack, despite having the least expensive software acquisition cost, ends up being the most expensive, which may seem counter-intuitive. Why? Because of the cost of people and operations.

All of the costs of a dedicated team* running the cloud (salaries, hiring, training, loaded costs, benefits, raises, etc.) are, regardless of the underlying technology, a large chunk of the total costs. With a commercially supported OpenStack distribution, you only need to staff the operations of your cloud, rather than the software engineers, QA team, etc., needed to support both the cloud and the code. We expect that you would need to hire fewer people as your cloud grows, and that the savings would exceed the incremental cost of the software subscription. Your alternative is this:


Taking our analysis a step further, we also explored the financial impact of increasing the level of automation in an OpenStack cloud with a Cloud Management Platform (CMP). Why? Because most companies’ experience shows** that managing complex systems usually doesn’t go according to plan. However, if automation is appropriately implemented, it can lower the TCO of any complex system.

CMP is a term coined by Gartner to describe a class of software encompassing many of the overlaid operations we think of in a mature cloud: self-service, service catalogs, chargeback, automation, orchestration, etc. In some respects, a CMP is a complement to any cloud infrastructure engine, like OpenStack, necessary to provide enterprise-level capabilities.

Our model shows coupling a CMP with OpenStack for automation can be significantly less expensive than either using and supporting upstream code, or using a commercial distribution. Why? As with the commercial distribution, our model shows that you would need to hire fewer people as your cloud grows, and the savings can potentially dwarf the incremental software subscription cost. The combined costs are drawn from Red Hat Cloud Infrastructure, which includes the Red Hat CloudForms CMP and Red Hat Enterprise Linux OpenStack Platform.


One of the sets of industry data we used to help create an unbiased model came from an organization named Computer Economics, Inc. They study IT staffing ratios and all kinds of similar things. They found that the average organization, with an average amount of automation, supports 53 operating system instances (a mix of physical and virtual) per system administrator. They also found that the average organization with a high level of automation supports 100 instances per admin.

So, in our scenario, with the cloud expected to double in size next (and every) year, you have a few options. You can double your cloud staff (good luck with that), double the load on your administrators (and watch them leave for new jobs), or invest in IT automation.
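The staffing gap behind those options can be sketched with the Computer Economics ratios above (53 instances per admin at average automation, 100 at high automation); the starting size of 500 instances is a hypothetical assumption, while the doubling every year comes from the scenario.

```python
import math

def admins_needed(instances, instances_per_admin):
    """Admins required to support a given number of OS instances."""
    return math.ceil(instances / instances_per_admin)

# Assumption: the cloud starts at 500 instances and doubles yearly.
instances = 500
for year in range(1, 5):
    avg = admins_needed(instances, 53)    # average automation
    high = admins_needed(instances, 100)  # high automation
    print(f"Year {year}: {instances} instances -> {avg} admins (avg) vs {high} (high)")
    instances *= 2
```

At 4,000 instances the gap is already 76 admins versus 40, which is why the hiring curve, not the software bill, dominates the discussion below.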

The aforementioned study shows that high levels of automation can nearly double the number of OS instances supported per administrator. While automation can reduce the cost curve for hiring and make your cloud admins' lives easier, we're in a financial discussion: automation only makes financial sense if it lowers the cost per VM. Which is exactly what we found:


In order to compare the costs and advantages of automation more closely, we looked inward (it was an internal study after all). We compared the fully loaded costs (hardware, software, and people) for one VM of our commercial distribution of OpenStack, Red Hat Enterprise Linux OpenStack Platform (RHELOSP), with those of Red Hat Cloud Infrastructure (RHCI), which includes both RHELOSP and our CMP, Red Hat CloudForms.

Looking at the waterfall chart above, we start with the fully loaded cost of one VM provided by RHELOSP, $5,340 per VM, and compare the similarly loaded costs for RHCI. The RHCI software costs an additional $53 per VM under these density assumptions, which increases the cost to $5,393. Next, factoring in the $1,229 savings through automation from hiring fewer people as your cloud grows, we see a loaded cost of $4,164 per VM for RHCI. Under our model, using a CMP with OpenStack resulted in savings of roughly $1,200 per VM.
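The waterfall arithmetic is worth reproducing in a few lines; all three figures are taken directly from the chart discussed above.

```python
base = 5340        # fully loaded cost per VM, RHELOSP alone
cmp_extra = 53     # additional RHCI/CloudForms software cost per VM
automation = 1229  # per-VM savings from hiring fewer people

loaded_rhci = base + cmp_extra - automation
print(loaded_rhci)  # -> 4164
```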

Moving from just an average level of automation to a high level, our model showed such a significant improvement in costs as you grow that the extra cost of automation can be dwarfed by the potential savings. High automation here only means moving from the median to the 75th percentile, so our model shows that there's a lot of headroom for improvement above and beyond even what we show.

At $1,200+ savings per-VM per-year, automation has the potential to quickly add up to millions in savings once you’ve reached even moderate scale.

That’s the kind of benefit is one of the many reasons why Red Hat recently acquired Ansible. And given that Ansible is easy to use, use of Ansible tools can not only improves the TCO through automation, but can also help customers achieve those savings faster.

How do OpenStack and non-OpenStack compare financially?

As we said, we wanted the model to be useful also for comparing different market alternatives, but for the comparison to be useful, it needed to be apples-to-apples. Competitive private-cloud technology available on the market at the time of our research provided much more than just the cloud infrastructure engine, so we decided to compare OpenStack plus a CMP against commercial bundles made of a hypervisor plus a CMP, which is what Red Hat customers and prospects ask us to do most of the time.

In the model, we conservatively assume that the level of automation is exactly the same. If you have data you are willing to share which supports or refutes this, please let us know.

As we expected, the model showed us that an OpenStack-based private cloud, even augmented by a CMP, costs less than a non-OpenStack-based counterpart. The model shows savings of $500 per VM, increasing to $700 and more over time as the number of VMs grows and the cloud matures.


However, the question is: is the $500-700+ in savings per-VM worth the risk of bringing in a new technology? To find the financial answer, we had to consider how these savings add up.


As the chart shows, by the time you have even a moderately sized cloud, the total annual cost savings of OpenStack with a CMP can exceed two million dollars. We are aware that it's common business practice to apply discounts to retail prices, but to keep the comparison as objective as possible, we referred to the list prices disclosed by every vendor we evaluated in our research. Because our competitors were not keen on sharing their discount rates, the only objective comparison we can make is on these list prices. We estimate that a small portion of these savings comes through increased VM density (which we'll talk about later), but the majority is in software costs.
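The per-VM savings scale linearly with cloud size, so it's easy to see how the annual total clears two million dollars. The 3,000-VM cloud below is a hypothetical assumption; the $700 per-VM figure is the mature savings rate from the model above.

```python
def annual_savings(vms, savings_per_vm):
    """Total annual savings for a cloud of a given size."""
    return vms * savings_per_vm

# Hypothetical moderately sized cloud at the mature $700/VM savings rate
print(annual_savings(3000, 700))  # -> 2100000, i.e. over two million dollars
```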

With this in mind, if you take a look at these numbers, and think about the software discounts you’ve negotiated with your vendors, you’ll have a reasonable idea of what this would look like for you. And as a reminder, these are just for the exponential growth model starting from a small base. We’ll wager there are any number of you reading this who have already well exceeded these quantities and are accumulating savings even faster than we show here.

We also recommend looking at the total costs over the life of a project. In fact, when we look at the accumulated savings over the life of your private cloud, we notice something rather striking.


Our model showed that it really doesn't matter what your discount level is: if you plan on any production scale, OpenStack with a CMP can potentially save you millions of dollars over the life of your private cloud.

How should we prioritize technical improvements to provide financial improvements?

In order to move from one-time decisions to deliberate on-going improvements, you need the “why” of the model as well as the outputs. By the time we finished building and vetting our TCO model, we made a number of interesting, and sometimes surprising, discoveries:

Cost per VM is the most important financial metric

For most of this post, we've been focusing on cost per VM. While necessary for budgeting, total costs are simply not instructive. Here's an example of the total annual costs over six years, for one of the many private cloud scenarios we considered:


A typical approach in TCO calculations is looking at the annual costs, but this metric alone isn’t particularly helpful in the analysis of a private cloud, with or without OpenStack. In private clouds, we can’t get away from the fact that we are providing a service, and what our Lines of Business or customers consume is a unit, like a VM or container. Hence, we believe that it’s much more significant to look at the annual per-VM cost.


In the same scenario we showed with the rapidly increasing total costs, the VM cost dropped by more than half from the first year to the third. That dramatic improvement is impossible to see in the total costs curve. Without accounting for VM costs, you'd miss that the total costs are increasing because of greater usage, while you're getting more for your dollar every year. Increasing growth while increasing cost efficiency is a good problem to have.
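The difference between the two metrics is easy to sketch with hypothetical numbers (the dollar figures and VM counts below are invented for illustration, not taken from the model): total spend rises every year while the per-VM cost falls.

```python
# Hypothetical three-year scenario: VM count doubles each year while
# total spend grows more slowly, so the cost per VM falls.
years = [
    {"vms": 500,  "total_cost": 2_500_000},
    {"vms": 1000, "total_cost": 3_400_000},
    {"vms": 2000, "total_cost": 4_600_000},
]
for y in years:
    y["cost_per_vm"] = y["total_cost"] / y["vms"]
    print(y["vms"], round(y["cost_per_vm"]))
# Total cost nearly doubles, yet cost per VM drops from $5,000 to $2,300,
# more than half, which the total-cost curve alone would never reveal.
```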

In other words, we recommend using VM Cost as your main metric because it shows how good you are at reducing the cost of what you provide. Total Cost does not distinguish between cost improvement and usage growth.

The hardware impact on total spend is marginal

We’ve woven in analysis of two of the three main cost components related to acquiring and running OpenStack, and financially comparing OpenStack and non-OpenStack alternatives. Our model shows that the selection of private cloud software choices has the potential to save you millions of dollars. The investment in automation similarly shows the potential to save additional millions of dollars. Either or both of these can save an organization a lot of money, despite the additional expenses. But, so far, we’ve only hinted at hardware costs.

Some of our readers may be surprised at the results: hardware is a large and easily identifiable cost, so if you can cut the amount of hardware, in theory you can save a lot of money. Our model suggests that's not really the case.


We asked the model how costs change across a large range of VM densities: 10, 15, 20, and 30 VMs per server, with no other changes. The numbers show very little difference in costs even across this large range of densities.

If we start with an average density of, say, 15 VMs per server and (unrealistically) double it to 30, we see savings of around $350 per VM. Not a trivial amount, and one that adds up quickly at scale, but these amounts are before the costs of any software and the effort required to make this monumental jump in efficiency.

If we make a more realistic (but still really big) stretch to a ⅓ increase in density, from 15 VMs per server up to 20, the model indicates $175 in savings per VM before the cost of software and effort. This is tiny compared to the $1,200 or more in savings per VM through automation in the same scenarios.
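Both density figures follow from dividing an annualized server cost by the VM density. The roughly $10,500 per-server figure below is our own inference from the two savings numbers, not a value stated in the model.

```python
ANNUAL_SERVER_COST = 10_500  # assumed annualized cost per server (inferred)

def hw_cost_per_vm(density):
    """Amortized hardware cost per VM at a given VMs-per-server density."""
    return ANNUAL_SERVER_COST / density

print(hw_cost_per_vm(15) - hw_cost_per_vm(30))  # -> 350.0 (doubling density)
print(hw_cost_per_vm(15) - hw_cost_per_vm(20))  # -> 175.0 (15 -> 20 VMs/server)
```

The hyperbolic shape of cost-per-VM versus density is why the savings shrink so quickly: each extra VM per server buys less than the one before it.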

Never neglect your hardware costs, but don't start there for cost improvements; it's unlikely to provide the biggest bang for your buck.

Lowering VM costs will increase usage and total costs

Our model shows that the more you lower the VM costs for the same service, the more you will increase your total costs. There’s a direct causal effect: the less expensive this service is, the more people want to use it.

Here’s a different example from our industry, to further prove our point. 1943 saw the beginning of construction of the ENIAC, the first electronic general-purpose computer, which cost about $500,000. In 2015 dollars, that’s well over $6,000,000. Today, servers cost less than 1/100th of that, and we buy 1,000,000’s of them every year. We now spend much, much more on IT than the first IT organizations did supporting those early giant beasts and, yet, our unit costs are significantly lower.

Based on this awareness, we looked at the market numbers for consumption of servers and VMs from IDC, and ran some calculations: for every 1% you reduce your VM cost, you should expect to see a 1.2% increase in total cost, due to a 2.24% increase in consumption. This seems counterintuitive, but the increase in total costs is due to your success. You've reduced the costs to your customers, so they're buying more. Once again, your reduction in VM cost directly increases the demand for the services of your cloud.
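The 1% / 2.24% / 1.2% relationship is simple multiplication: total cost is cost per VM times consumption, so the two percentage changes combine as a product.

```python
cost_per_vm_factor = 1 - 0.01      # VM cost reduced by 1%
consumption_factor = 1 + 0.0224    # consumption up 2.24% in response

# Total cost = cost per VM x consumption, so the factors multiply.
total_cost_factor = cost_per_vm_factor * consumption_factor
print(round((total_cost_factor - 1) * 100, 2))  # -> 1.22, i.e. ~1.2% higher total cost
```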

IT, and in particular IT components like servers and VMs, have "elastic demand curves," broadly meaning that reducing prices leads to greater utilization and greater total cost. If increased efficiency causing higher total costs comes as a surprise to you, you're not the only one.

Track all of your costs to prioritize efforts

Tracking the costs of as many components as possible enables you to prioritize improvements over time, even as your cloud matures, your staff gets better and better at running it, and the demands from your customers change. In order to build a tool around our TCO model, we had to decide on which costs we wanted to track and model together. Our model accounts for all hardware, software, and personnel required to operate a private cloud. Each and every one of them is a potential lever in affecting how your costs change over time.


The levers built into the model include: VM density affecting hardware spend, IT automation for personnel costs, and software choices for software costs. Between the three of these, the model addresses all of the major costs of acquiring and operating a private cloud, with the exception of data center facilities. Given the low cost impact of hardware and density changes, we assumed that data center facility costs would be largely the same across technologies, so they were not a focus of this model. However, should you have great data center cost information you'd like to contribute, please let us know, as we strive to increase the completeness and accuracy of our model.

The model suggests IT automation should be the first item on your to-do list.

Considering the timeframe increases model accuracy

Even though building a cloud can be quick, getting the most from its operation is a journey: staff will learn along the way, corporate functions will have to adjust, and business demands for new technologies and faster IT response will only increase.

Per-VM costs are inseparable from timing. You’re buying hardware, hiring people, buying software, suffering talent loss, refreshing hardware, and buying still more to support growth. All of these costs can, and usually do, hit your budget differently every year. If you’re buying software licenses, you have a large upfront cost and maintenance. If your staff gets promoted, gets raises, and sometimes takes new jobs, these will affect salaries, hiring, and training costs. Some you can plan for, some you can’t.

Put another way, if next year you provide the exact same quality of service, to the exact same customers, in the exact same quantity, with the exact same technology, there's still a very real chance your costs will not be the same as they are this year.

We’re showing costs and cost changes over six years, but we modelled out to ten to find out when the costs start flattening out.

If you want your TCO model to be a tool for ongoing decision making, you need to not only look at costs, but how costs change over time.

The cloud growth curve doesn’t affect the TCO

One of the nice things about creating a flexible model is that it allows you to try all sorts of hypotheses and inputs. While absolute costs depend on the success and speed of your private cloud adoption, one of our surprising discoveries is that relative costs are not dependent upon your adoption curve. None of the advice the model provides is affected by the growth curve.

This means IT organizations can get started even when unsure of how quickly their private cloud is going to take off. This also makes the particular growth model we discussed here a lot less important. Our examples have the VM count doubling every year, which is the most common customer story you hear during IT conference keynotes. But the advice is equally applicable no matter what your particular growth model is.

Technical conversations with Lines of Business (LOBs) are frustrating for both sides: they often can't provide the information you need in order to produce a thoughtful architecture and plan, and, for any number of reasons, you can't provide accurate costs and changes to costs over time. With a good TCO model, these conversations can get considerably easier for both sides of the table: you can model different scenarios, provide ranges of pricing, and help your LOBs work through priorities. Invest the required time in an accurate TCO model, and you'll not only make these conversations easier, but you'll have the tools in place to add financial input into your designs even as the services you provide change over time.

If you’re interested in expanding on what we’ve built, please let us know.

Erich Morisse
Management Strategy Director

Massimo Ferrari
Management Strategy Director

* If you think that you can run a cloud by leveraging existing IT Ops, think again. Research published by Gartner shows that not creating a dedicated team is one of the primary reasons for the failure of cloud projects: Climbing the Cloud Orchestration Curve 

** Velocity 2012: Richard Cook, “How Complex Systems Fail”
The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win
Complex Adaptive Systems: 1 Complexity Theory
What You Should Know About Megaprojects and Why: An Overview

Why Red Hat is extending management support to Microsoft private and public clouds

Today, Red Hat announced an unprecedented partnership with Microsoft, focused on mutually supporting a number of technologies and platforms. Among others, we announced the upcoming support of Microsoft Azure and System Center in our cloud management platform CloudForms.

Over the past 18 months, I’ve seen Microsoft’s public cloud evolve and mature, and the market interest grow, to the point that supporting Azure side by side with Amazon Web Services has been the number one request from many of our enterprise customers planning to, or currently building a hybrid cloud. These customers are demanding a single pane of management glass to consistently orchestrate the lifecycle of their applications across the two leading cloud platforms.

We are working hard to satisfy that demand, not only because it's the best for our customers, but also because it aligns with one of the core principles shaping the Red Hat Management portfolio: multi-vendor support.

CloudForms can already orchestrate and govern a broad range of server virtualization and private IaaS cloud engines from multiple vendors, not just Red Hat. This agreement with Microsoft extends our range to Microsoft cloud offerings. As we said for the Ansible acquisition, our enterprise customers have complex heterogeneous IT environments and don’t want IT organizations to create redundant management silos, or embrace single vendor stacks if it’s not the best for their business.

Red Hat aims to be our customers' most trusted business partner in their journey to cloud computing. The first step has been recognizing that there's no single cloud technology, product, platform, or vendor that can solve all problems in the most efficient way. Now, the second step is to enable our customers to consume the cloud technologies that fit their business goals and corporate culture in the most frictionless way.

This is what you should expect from us as result of today’s announcement: an easy and streamlined approach to manage workloads deployed across Microsoft Azure and Amazon Web Services, Microsoft System Center Virtual Machine Manager and VMware vSphere, and of course Red Hat Enterprise Virtualization and Red Hat Enterprise Linux OpenStack Platform.

This seamless management will extend to all CloudForms capabilities, including self-service provisioning and lifecycle management, policy-based orchestration, show back and chargeback, configuration management and drift analysis, capacity planning and reporting.

There are many more steps we plan to take to fully enable an enterprise hybrid cloud. Expect big things from Red Hat.

Alessandro Perilli
GM, Management Strategy

When And Why OpenStack Needs A Cloud Management Platform

At Red Hat we are seeing more and more organizations choosing OpenStack for the next step in their cloud journey. Very often, this transformation journey is marked by three main evolutionary stages:

  1. Build a server virtualization environment for scale-up workloads
  2. Extend the server virtualization environment with an Infrastructure-as-a-Service (IaaS) cloud for scale-out workloads
  3. Unify and enforce enterprise-grade governance for both server virtualization and IaaS cloud environments

Different companies stop at different stages of this maturity model, depending on the business needs and the maturity of their IT organization. As the environments in stage 1 and stage 2 grow in size and complexity, companies can reach an operational scale that requires more sophisticated management tools than the ones provided out of the box by server virtualization and IaaS cloud engines.

A Cloud Management Platform (CMP) offers an additional layer to govern a complex server virtualization environment or IaaS cloud as needed by a large-scale end user organization.

In fact, while OpenStack is a powerful and flexible IaaS cloud engine, it doesn't offer the wide range of management capabilities that some organizations may be looking for, such as:

  • Capacity & Performance Management
  • Configuration & Change Management
  • Chargeback
  • Orchestration

OpenStack does a great job of providing the instrumentation for the aforementioned capabilities – think of the metering APIs that OpenStack Telemetry (Ceilometer) offers, or the orchestration templates that you can define with OpenStack Orchestration (Heat) – but the management tools that it provides on top of that instrumentation don't meet the needs of every organization.

To better understand why a CMP is so important at a certain operational scale, let's use an analogy: professional property renting.

When you think about the management tools that IT organizations use at each stage of our maturity model, think of:

The Virtual Infrastructure Manager for the amateur landlord

As we said before, at this stage an organization has in place a server virtualization environment and its management console like, for example, Red Hat Enterprise Virtualization Management. The organization is an amateur landlord.

Let’s say that you own one or more apartments that you want to rent. All of them are ideally located in the same city but different in size, finishes, prestige of the location, etc. You want to rent them as long as you can, carefully selecting the best possible occupant for each. You want to keep things simple: long term, fixed price contracts, personally track every change in each apartment and, if something bad happens, you personally work with the occupant to determine responsibility and find a solution.

Your apartments are unique, lovely, hand cared for, just like VMs in a server virtualization environment.

However, you don’t get the most from your properties because this simple, not-automated, way to do business is slow rather than agile, reactive rather than proactive, and with an unbalanced level of attention dedicated to each asset. For example, if one of your tenants starts acting unpredictably and against the law, evicting him can become a nightmare, distracting you from managing all other apartments. In another example, if a growth opportunity knocks at the door, you need time to carefully plan a new property acquisition, select tenants, etc., and this will likely make you lose the opportunity window.

This way of doing business is perfectly fine and sufficient as long as your ambitions as a landlord (or your scalability needs as an IT organization) remain contained. If your ambitions or needs grow, maybe due to a highly competitive market, you need better tools to manage your property portfolio (or your application portfolio) in a more efficient and operationally scalable way.

The IaaS Cloud Manager for the Airbnb-enabled landlord

At this stage an organization has in place an Infrastructure as a Service (IaaS) engine like, for example, Red Hat Enterprise Linux OpenStack Platform. The organization is an Airbnb-enabled landlord.

If the number of apartments you want or need to manage grows, maybe due to early success and increasing market demand, you feel the need for a tool like Airbnb. Airbnb maximizes your capability to address market demand and minimizes the friction in the renting process in many ways. It offers a wonderfully designed website that lists your properties on a map, showing photos of the rooms and furniture, giving guidance about the services around the apartments, and providing a complete booking service that your potential tenants can use in a self-service way.

Airbnb enables you to easily manage different contract options (monthly, weekly, daily), rent a single room or the entire apartment, open and close the calendar for availability instantaneously and, more importantly, gives you the flexibility to change your mind whenever you want (and offers up to $1M host protection insurance). Airbnb exposes a rating for each property, encouraging landlords to offer a consistent experience for every apartment. Services like Airbnb can help the real estate market grow by increasing competition, pushing landlords to invest more in their properties as revenues come in quicker and in a more frictionless way.

In the same way, OpenStack offers your lines of business a self-service portal that they can leverage to provision what they need, gives you the flexibility to build instance flavours offering different lease times, amounts of resources, and pre-baked images, and grants you the flexibility to introduce or retire those flavours as needed. The usage model encourages users to standardize on the OS/middleware offering, consequently increasing predictability and efficiency in terms of maintenance, hardware resources, purchasing, etc.

Landlords embrace tools like Airbnb to manage their properties because they want to be agile and catch new business opportunities. To do so, they accept cutting their emotional bond with each individual apartment. IT departments are driven by similar logic, and accept the move from pet VMs to cattle instances.

The CMP for the professional property manager

At this stage an organization may have deployed a Cloud Management Platform (CMP) like, for example, Red Hat CloudForms, to govern both the server virtualization environment and IaaS cloud. The organization is a professional property manager.

Let’s say that the agility offered by a tool such as Airbnb makes you feel confident to serve hundreds or even thousands of tenants and manage many more properties. This last step in your career as a landlord introduces a completely new set of needs and the complexity is so high that you cannot do everything by yourself. At this point, a tool like Airbnb can’t fulfill all your needs because it’s not designed to serve landlords at scale:

  • Managing bookings, cancellations, and changes at scale can't be done with a spreadsheet; you need a professional booking system, with some level of automation to manage your capacity and, at the same time, supervise the performance of each property.
  • For each tenant you need to inventory the stay, consumption, reimbursements, etc., and offer transparent billing. This requires a professional chargeback process.
  • For every booking of every property you need to arrange cleaning, supplies, access, etc. When the numbers start rising, this can become a massive effort, impossible to fulfill manually. You need to orchestrate all the external services connected to your estates: a professional cleaning service for both the property and bed linens, for example; suppliers of things like soap, toilet paper, and coffee; someone who distributes the keys; and so on.
  • Every time a tenant leaves, you have to check that everything is OK. You need to plan minor and major maintenance activities, changes, and improvements for every single property, and even evaluate the opportunity to buy new ones!

Operational Burden

Exactly like in our analogy, a CMP introduces a set of critical management capabilities to enhance and augment what OpenStack can do out of the box. Additionally, and critically, a CMP can unify the self-service provisioning experiences across both the server virtualization environment and the IaaS cloud that it manages side by side.

Cloud Management Platform
Following these principles, a CMP like Red Hat CloudForms has capacity planning capabilities that enable IT organizations to know which OpenStack availability zone has enough resources to deploy new instances. For example, capacity planning can tell you that a single web server instance with 2 vCPUs and 2 GB of memory can be safely deployed in zone A, but that if you plan to scale it out at a certain point in time, zone B is a better choice, given the amount of additional resources needed.

It also provides Performance Analysis capabilities to monitor and forecast the utilization of instances, hosts and providers. For example, these capabilities can track the average load of the physical hosts over time, suggesting the right moment to add more hardware to support the increasing demand for resources.

In combination with Ansible (which Red Hat recently acquired), CloudForms offers automation capabilities that allow administrators to create orchestration and configuration workflows for the deployment, setup, and retirement of instances. For example, the deployment of a web server hosting a public website will require your firewall to open a number of ports, and your router to set up a NAT on a public IP to grant access to the Internet audience.
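As a sketch of what such an automation workflow might look like, here is a minimal, hypothetical Ansible playbook for the web server scenario above. Host group, package and firewall service names are illustrative assumptions, and the router-side NAT configuration would be a separate play targeting the network device:

```yaml
---
# Illustrative sketch: install a web server, open the HTTP/HTTPS
# ports on the host firewall, and start the service.
- name: Deploy a public web server
  hosts: webservers          # hypothetical inventory group
  become: yes
  tasks:
    - name: Install Apache
      yum:
        name: httpd
        state: present

    - name: Open the HTTP and HTTPS ports on the host firewall
      firewalld:
        service: "{{ item }}"
        permanent: yes
        state: enabled
      with_items:
        - http
        - https

    - name: Start and enable the web server
      service:
        name: httpd
        state: started
        enabled: yes
```

A CMP such as CloudForms would trigger a workflow like this in response to a self-service provisioning request, rather than an administrator running it by hand.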

Moreover, CloudForms’ change management and policy enforcement capabilities will keep the entire environment in compliance, tracking modifications and enforcing specific configurations or patch installations on instances and hosts. For example, if one of the tenants configures an instance in its domain in a way that opens a potential security breach, CloudForms will automatically restore a safe state.

Last but not least, CloudForms’ chargeback capabilities allow IT organizations to charge for OpenStack instance allocation and usage based on a number of different criteria. For example, you can account for the utilization of a specific instance by the minute, hour, or day, or at a fixed price, depending on the kind of workload it is going to support.

So, in summary: some organizations may find the management capabilities that come out of the box with traditional server virtualization or Infrastructure-as-a-Service engines a perfect fit for their business needs. However, for those organizations planning to build a large-scale, enterprise-grade private or hybrid cloud, a CMP offers a governance layer that allows them to reach an operational scale that would be impossible to manage otherwise.

Massimo Ferrari
Management Strategy Director

Why did Red Hat acquire Ansible?

Today, we announced a definitive agreement to acquire Ansible, a popular IT automation tool launched in early 2013. Like in any acquisition, customers and partners will likely have a number of questions, so let me get straight to the point and cover the top three questions I anticipate:

Why an IT automation tool?

Automation helps IT organizations address the increasing demand for speed and simplicity coming from the lines of business (LOB) across a wide range of key initiatives, including:

  • Support for cloud-native applications through the deployment of Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) clouds
    Next-generation applications require next-generation computing environments, like scale-out IaaS and PaaS clouds. The deployment of these cloud environments (e.g. OpenStack) can be challenging due to their inherent complexity and the relative maturity of the underlying technology.

    IT automation tools can help to dramatically speed up cloud deployments while drastically reducing human errors associated with manual intervention.

  • Agile application development through the DevOps practice
    Next-generation applications are developed according to new methodologies, like DevOps, and new patterns, like the microservices architecture. Supporting the continuous delivery advocated by the DevOps methodology requires a toolchain that empowers developers to release early and often. In turn, the application update frequency depends on how fast, simple and efficient the DevOps tools in the toolchain are.

    IT automation tools are a critical addition to any DevOps toolchain, as they can apply a large number of changes to complex application architectures, and to a large number of application instances, in a very short amount of time.

  • Service orchestration through IT process automation
    The ultimate ambition of IT organizations worldwide is to offer their LOB a fully automated provisioning of entire application stacks, through virtual machines (VMs) or containers, or “service orchestration”. It’s an ambition as old as the private cloud, and yet, the industry struggles to make it a reality. The problem is that orchestration and automation are two incredibly challenging processes, because of the myriad of moving parts to coordinate, and the lack of standardized interfaces to programmatically coordinate them.

    Red Hat CloudForms, our cloud management platform, is best in class at orchestrating the whole lifecycle of an enterprise application (from provisioning to retirement), according to configuration and compliance policies. However, a great orchestration engine still depends on last-mile automation to compose each tier of the application. The more flexible and powerful the IT automation engine, the more complex the applications that can be provisioned.

Our customers already use Red Hat solutions in conjunction with various IT automation tools. With this acquisition, we want to offer that type of integration through the world-class Red Hat support and certification that makes open source consumable for the enterprise (exactly the same way we do for OpenStack and every other product in our portfolio).

Why Ansible?

We see in Ansible a perfect alignment with the core principles that shape Red Hat’s management, both at the product level and at the portfolio level.

At the product level, Ansible matches Red Hat’s desire to deliver a frictionless design and a modular architecture through open development:

  • Ansible is simple to use.
    A quick Google search will reveal an overwhelmingly consistent sentiment about Ansible’s gentle learning curve and simple manageability. As we work to deliver the Frictionless IT that our customers need to address the demands of current and future generations, this focus on “simple” is critical.

    How simple? Let me give you two examples.
    First: Ansible’s “playbooks” are written in human-readable YAML, which makes the automation workflows easier to both write and maintain.
    Second: Ansible is agentless, using standard SSH connectivity to execute automation workflows, making it much easier to blend into an existing enterprise IT environment and its intricate operational framework.

  • Ansible is modular.
    At the time of writing, Ansible ships with 400+ modules, which can be invoked at will to extend the product’s capabilities beyond its core feature set and intent. This is a critical capability that we want to offer in all Red Hat management products to support our customers as their needs evolve in terms of the maturity, complexity and scale of their IT.

    How modular? Let me give you one example.
    Ansible’s modular capabilities span from managing VM images in the OpenStack Image Service (Glance), to managing Linux containers, to collecting data from an F5 Big-IP application delivery controller.

  • Ansible is a very popular open source project.
    Ansible is an incredibly popular open source project and the community members contribute to both the core technology and the modules that come with the core. We believe that supporting and nurturing great open source communities is the only way to guarantee a continuous stream of innovation, and it’s what makes Red Hat so special.

    How popular? Let me share some telling examples.
    First, Ansible has almost 13,000 stars and almost 4,000 forks on GitHub.
    Second, according to RedMonk, the number of mentions of Ansible in the Hacker News community is skyrocketing.
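To make the simplicity point above concrete, here is a hypothetical minimal playbook (host group, package and service names are assumptions): plain YAML that Ansible executes over standard SSH, with no agent installed on the managed hosts:

```yaml
---
# Keep the system clocks in sync on every managed host.
# Ansible connects over standard SSH; nothing to install on the targets.
- name: Ensure NTP is installed and running
  hosts: all
  become: yes
  tasks:
    - name: Install the NTP package
      yum:
        name: ntp
        state: present

    - name: Start and enable the NTP daemon
      service:
        name: ntpd
        state: started
        enabled: yes
```

Even someone who has never seen Ansible before can read this playbook and understand what it does, which is precisely the point.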

At the portfolio level, Ansible matches Red Hat’s desire to support a multi-tier architecture, provide multi-layer consistency, and deliver multi-vendor support:

  • Ansible supports multi-tier deployments.
    Ansible is designed to support the deployment and configuration of a multi-tier application, through VMs and containers. This means that organizations can automatically provision different components of the same application on the tier that is most efficient to run them: scale-up workloads on bare metal and server virtualization engines, scale-out workloads on IaaS cloud engines and PaaS cloud engines. We do not believe in “one size fits all” approaches and we are committed to supporting the broadest range of infrastructure and platform engines possible.

    How far does Ansible’s multi-tier support go? Here’s an example.
    Ansible can manage VMs and guest OSes in a VMware vSphere server virtualization environment, deploy and manage instances in an OpenStack IaaS cloud, and deploy applications inside an OpenShift PaaS cloud, all at the same time.

  • Ansible brings consistency at multiple layers of the architecture.
    Ansible can be used to programmatically manipulate every layer of a computing architecture, from the infrastructure to the application, and for every use case, from orchestration to deployment to configuration. As I said at the beginning of this post, Red Hat is committed to enabling the provisioning of entire application stacks in the easiest possible way, and management consistency is a great way to keep things easy.

    How far does Ansible’s multi-layer support go? Here’s an example.
    Ansible can automate everything including the configuration of network, storage, compute (e.g. OpenStack instances), OS, middleware (e.g. Red Hat JBoss Middleware) and finally, application layers.

  • Ansible supports heterogeneous IT environments.
    Ansible can automate the configuration of a broad range of technologies from many vendors, not just Red Hat. Our enterprise customers have complex, heterogeneous IT environments, and the last thing we want is for customers to create redundant management silos, or embrace single-vendor stacks if that’s not the best choice for their business.

    How far does Ansible’s multi-vendor support go? I have two final examples for you.
    First: Ansible supports both Linux and Windows environments, performing equally well configuring an Apache2 web server or a web application pool on Microsoft IIS.
    Second: through its modules, Ansible empowers IT organizations to manage a wide range of ISV and IHV technologies, from F5 Big-IP and Citrix NetScaler network controllers to Amazon Web Services and Google clouds.
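As a sketch of what that cross-platform support looks like in practice, a single playbook can contain one play for Linux hosts (reached over SSH) and another for Windows hosts (reached over WinRM). The inventory group names below are illustrative assumptions:

```yaml
---
# One playbook, two platforms: Apache on Linux, IIS on Windows.
- name: Configure the Linux web tier
  hosts: linux_webservers        # hypothetical group, managed over SSH
  become: yes
  tasks:
    - name: Install the Apache2 web server
      apt:
        name: apache2
        state: present

- name: Configure the Windows web tier
  hosts: windows_webservers      # hypothetical group, managed over WinRM
  tasks:
    - name: Install the IIS Web Server role
      win_feature:
        name: Web-Server
        state: present
```

The same automation workflow, in the same language and the same tool, spans both environments.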

How does Ansible fit Red Hat’s management strategy?

If you read this far, you already have a pretty good idea of how Ansible will augment and complement Red Hat’s current management portfolio:

  • Red Hat CloudForms will continue to offer overall orchestration and policy enforcement across all architectural tiers we support, within the corporate boundaries and on public clouds.
  • Ansible will automate the provisioning and configuration of infrastructure resources and applications within each architectural tier, as requested through the CloudForms self-service provisioning portal. This will include deploying Red Hat Satellite agents on bare metal machines when the use case requires it.
  • Red Hat Satellite will continue to enable the provisioning and configuration of Red Hat systems (and security patches and software updates) within each architectural tier, as defined by the Ansible automation workflows.

Red Hat Open Management Platform

Red Hat customers will be able to adopt any of the three as standalone products, but we’ll work hard to tighten the integration between the three to enable them to work great together.

We are very excited to have the Ansible team joining the Red Hat family and we can’t wait to put the product in the hands of our customers.

Alessandro Perilli
GM, Management Strategy

Towards a Frictionless IT (whether you like it or not)

With the term Frictionless IT, Red Hat means an enterprise IT that just works, reshaped after the experience offered by modern consumer-grade public cloud services, which business users are growing to expect.

What does Frictionless IT have to do with Red Hat and the IT organisations that we serve? Simple: if we don’t start moving towards Frictionless IT, we all risk irrelevance.

Current generations of IT professionals are experiencing a growing disconnect between Enterprise IT and Personal IT.

  • Enterprise IT remains reliable, but in most cases slow to procure, complex to use, and overall frustrating. Think about your expense report system.
  • Personal IT is evolving into a set of tools and services that are instantaneously available, incredibly easy to understand, and blazingly fast at executing the tasks they are supposed to execute. Think about Gmail, Dropbox, Evernote, IFTTT, and the plethora of other public cloud services that we all interact with on a daily basis through our phones, tablets, and laptops.

The first problem with this split brain between Personal and Enterprise IT is that our brain is exactly the same, inside and outside the office. Any interaction with this emerging Personal IT raises the bar on how the IT experience should be. The more we use Gmail, Dropbox, Evernote and IFTTT in our personal life, the more our expectations grow for a similar experience at work. We wonder more and more, “if my Personal IT is such a breeze to use, why does my Enterprise IT have to be miserable?”

The second problem is that current generations can endure frustrating Enterprise IT only because that’s all that they have experienced for decades. New generations will not be so forgiving. The kids in college today, and those who just started their first job in a new, exciting startup, are growing used to only one kind of IT experience: the frictionless one.

At some point in the near future, these kids will land more reliable and less stressful jobs in large enterprises. It will not be just one or two individuals with a different set of expectations joining a typical bank or insurance company. It will be a whole generation that permeates every department of an end user organisation, from marketing to engineering, with a completely different set of demands and expectations. The overwhelming majority of IT organisations, and the traditional solution providers that support them, are completely unprepared to meet that demand.

At Red Hat, we recognise this challenge. In it we see an opportunity to simplify enterprise software in many dimensions, from the user interface to the underlying architecture, through not only the technology, but also aspects like documentation, licensing and much more.

We believe that at least three ingredients are necessary to meet the demand for frictionless IT:

Ease of use

A key enabler for a Frictionless IT is a smooth user experience (UX). The user experience is defined by the quality of an interaction between the human and the system, and it takes place when you deploy, integrate, customize and use enterprise systems. Intelligent installers and self-contained binaries, simplified back-end architectures, supported out-of-the-box plug-ins, modular front-ends, consistent UIs and even coherent documentation all contribute to improve the quality of the UX. However, very few organisations in the world look at these aspects from a holistic standpoint and take a user-centric approach. For example, the user interface (UI), in both commercial-off-the-shelf and custom-made applications, is one of the most overlooked aspects of enterprise software.

If you think that investing in state-of-the-art UI is unnecessary, or not worth the effort, think again. The primary reason why some public cloud offerings become overnight successes at a planetary scale is their intuitive UI. In our Personal IT we are already getting used to intuitiveness, and the demand for it is supported by the broad market offering. We have already reached the point that when an app on our smartphones is too complex to use in the first few minutes, we simply delete it and download an alternative. There’s no second chance for the app that is not frictionless.

Now let’s go back to the upcoming generation of technology consumers. Even among the most technical of them, some may have never built a computer by screwing a motherboard to the case (like many of us did, including me), used a command prompt or plugged in a network cable. Those users will expect that installing software will be as frictionless as deploying a virtual appliance, plugging a cable will be as frictionless as drawing a line on a service catalog UI and so on.

If the IT organisations of tomorrow don’t deliver that kind of ease of use, future generations of business users will simply circumvent them, more than today, relying on external cloud service providers. And to meet the expectations of future generations, the UX in enterprise software has to dramatically improve.

Red Hat understands the challenge, and we are working hard to influence the open source projects that we support in the short and long term. For example, our commercial cloud management platform, CloudForms, comes as a single virtual appliance; this is in contrast to other cloud management platforms that may require 6 to 9 different tools (and not all of them available as virtual appliances). We consider this a prime example of the effort we put in engineering more frictionless enterprise solutions.


Speed

A second key enabler for a Frictionless IT is speed. If the interface is pretty but you still need to take 20 steps (or 20 weeks) to get the job done, it’s not frictionless. We already know that speed deeply influences the UX, to the point of impacting search engine rankings, thanks to the enormous amount of research conducted on aspects like loading time in web development. And yet, it took the industry a long time to realize that the same human brain that doesn’t tolerate a very slow page load very likely won’t tolerate a very slow enterprise IT experience.

Speed has become an increasingly important factor in the last five years, to the point that the industry constantly mentions agility as the most desired attribute for business and development models. Of course agility is not just speed, but speed is a very big part of it. Which is one of the many reasons why, for example, we are seeing a shift of interest from virtual machines (VMs) to application containers.

Operating system and application virtualization are as old as (and in some cases, older than) hardware virtualization. More than ten years ago, the emerging virtualization industry was rich with technology startups focused on all three approaches. As we know, eventually the mainstream audience preferred VMs over what we used to call operating system partitions and application layers, but today we are experiencing a second coming of the latter technologies because customers’ business needs are changing and evolving, as they always do.

Ten years ago, IT organizations’ primary challenge was modernizing the data center while maximizing the ROI on existing hardware equipment, and hardware virtualization brilliantly helped to accomplish the goal. Today, IT organizations’ primary challenge is addressing the business demand as fast as possible, because there’s now a competitor that never existed before: the public cloud provider. Application containers can be deployed in seconds rather than the minutes needed for VMs, significantly shrinking the reaction time for a variety of scenarios, including scaling out a web application to address an unexpected traffic peak and avoiding a fatally slow loading time.

Red Hat understands the challenge. This is why, for example, we invested so heavily in application containers, introducing enterprise support for the Docker format across a growing number of our products (Red Hat Enterprise Linux 7 first, then OpenShift 3, and soon CloudForms 4).

Application containers are just one example (and to be fair, they have more virtues than just speed of deployment); we constantly look at solutions that can dramatically increase operational speed.


Integration

A third enabler for Frictionless IT is seamless integration between enterprise products and the ancillary services necessary to make them work or unlock their full potential. No successful software or hardware comes without a certain degree of integration with the existing enterprise IT environment, but the extent of that integration makes or breaks the UX, in turn impacting users’ productivity.

Integration can happen at the back-end level and at the front-end level. The latter is rarely considered, so I’ll focus on that in this post. To clarify the deeply underestimated importance of front-end integration, I always use the analogy of the smart calendar.

In preparation for a business meeting, we typically check a couple of apps on our smartphones: the calendar app, to know when, where, and how we need to meet; and the map app, to know how to get there. In a perfect world, especially if the business meeting is a delicate negotiation with parties you’re meeting for the first time, we might want to check at least another couple of apps: LinkedIn, to learn more about the people we are going to meet; and Twitter, to learn more about what those people have to say about topics that may be relevant to the negotiation. Of the four, it is the last two apps that could provide the intelligence necessary to successfully close the negotiation. But because the information is spread across so many different apps, which dramatically increases the friction, we limit ourselves to checking the first two, the indispensable ones. Crucially, because of the friction, we skip the information that could be most valuable for the meeting, which deeply impacts our effectiveness.

Thankfully, there’s now a better way. A wave of so-called smart calendar apps is emerging (and rapidly being acquired), and their biggest value is the ability to blend the front ends of the aforementioned four apps into a single, consistent UI that dramatically reduces friction. If you have ever tried smart calendars like Tempo or Sunrise, you have an idea.

Enterprise IT has to follow the same path: improve integration to minimize the friction (which in this case can appear as a steep learning curve) and maximise the productivity of the enterprise audience.

Red Hat understands the challenge, and we are working hard to influence the open source projects we support in the short and long term. For example, we are working within the ManageIQ community, the upstream project behind CloudForms, to develop a coherent UI allowing our customers to manage side by side virtual machines and containers in a consistent fashion:

CloudForms managing Kubernetes


CloudForms managing Containers


Ease of use, speed, and integration are key ingredients to dramatically improve the enterprise software (and hardware) UX. But what’s the difference from the past, you might ask. User experience has been considered a key differentiator since the late ’60s by companies like IBM, and there are plenty of ROI calculators showing that UX has a quantifiable impact on business. The difference is that now enterprise users have a choice, and enterprise IT organizations have competitors. And the choice is incredibly broad and incredibly accessible. If IT organisations fail to deliver Frictionless IT, lines of business (LoB) will simply go elsewhere and get the job done with the most convenient tool (in terms of simplicity, not cost) of the many available.

A LoB doesn’t care about security, compliance and integration issues, nor does it trouble itself with the politics driving the IT organization’s choice of one solution over another. A LoB only wants to get the job done within the deadline. And if the corporate policies get in the way, they will often be circumvented. In turn, if the corporate policies get circumvented and the tools that empower a LoB are provided by external cloud service providers, in the long term the role of the IT organisation will become less relevant. To stay relevant in the eyes of upcoming generations, both vendors and their clients must recognise the ongoing transformation, anticipate the upcoming demand, and adapt.

It’s great to see how some vendors are starting to realise the need for Frictionless IT. For example, during last week’s Red Hat Summit 2015, our long-term partner SAP demonstrated a growing awareness of the need for simplicity.

On our side, we are working to deliver the most frictionless products that the open source communities, supported by Red Hat’s expertise and vision, can offer. We have a long way to go, but we are confident that this is the right path to walk. Stay tuned for more on this front.


Alessandro Perilli
GM, Management Strategy

OpenStack: Scary Enterprise Support

At Red Hat, we’re seeing a surge of confidence from large organizations and more and more OpenStack adoption (with deployments in thousands and tens of thousands of sockets) in industries like financial services, insurance, healthcare, and retail.

This increased confidence can be tied to the increasing maturity of the OpenStack code (at least for the core services), and an emerging set of features that are critical for enterprises, like identity federation in Keystone, and Red Hat’s strong, enterprise-grade support. The latter is incredibly important and has made the difference in many deals.

Enterprise-grade support for any open source project, and especially for one as complex as OpenStack, can be articulated through many dimensions. However, they are almost never part of the conversation until too late. When you evaluate enterprise-grade support from any OpenStack provider, assess at least these six key dimensions:

Expertise in the underlying operating system

OpenStack depends on the underlying Linux operating system to function. As OpenStack needs a number of facilities and libraries (e.g., cryptographic modules) provided by the operating system (OS), there is no way to decouple the two.

To provide enterprise-grade support, any vendor offering a commercial edition of OpenStack must package it with an OS that is proven from a reliability and security standpoint, as well as deeply understood, in order to fix issues when things go wrong (and things will go wrong at some point; savvy CIOs know this to be a fundamental truth).

As I talk to customers around the globe, a common theme is emerging: when evaluating OpenStack, enterprises prefer to rely on tested, supported, and certified Linux distributions rather than unknown OSes.

When your OpenStack vendor:

  • is using a Linux distribution that has been in the market for only a very short period (e.g., one year)
  • has no history of contribution to the Linux distribution of choice (check the stats provided by the awesome Bitergia research firm)
  • doesn’t even mention its Linux distribution of choice in its marketing materials

…that spells scary enterprise support.

Case in point: at the 2015 OpenStack Summit in Vancouver, sharing their experience in operating their OpenStack cloud, Time Warner Cable stated “Kernel panics happen, kernel panics happen, kernel panics happen”, and then asked “Do you have a kernel vendor on your vendor list?”

In more than one deal, Red Hat has been called to replace an existing OpenStack vendor due to our deep experience and expertise in both the underlying OS and OpenStack itself. In most cases, we have been asked to first support the existing implementation, and then help the client migrate to our own OpenStack distribution.

Security response

Like every other piece of software, OpenStack is prone to security vulnerabilities. The problem is that, like any other cloud engine, OpenStack is a mission critical piece of software, on which many lines of business depend when their apps run in the cloud.

To provide enterprise-grade support, any vendor offering a commercial edition of OpenStack must be capable of addressing security issues, as they arise, as quickly and competently as possible.

When your OpenStack vendor:

  • doesn’t have a security response team
  • has a security response team that consists of a single person per continent
  • cannot backport a security fix to older versions of OpenStack, or forward-port it to newer ones, before it’s fixed in the trunk code

…that spells scary enterprise support.

More often than you might imagine, Red Hat has also been selected in deals because of the vast skills of our global Security Response Team and its track record of fixing 97% of security issues within 24 hours. We are incredibly proud of them.

Certification and compliance

To trust any solution on the market to run mission critical systems, enterprises need certification and compliance in a wide number of areas: software and hardware integration, security, government regulation. Without them, large organizations can’t have the peace of mind from calling a single, well-defined actor to help solve issues, should they arise. Who do you call if you experience storage corruption in your Windows virtual machine hosted by a KVM hypervisor running on an IBM blade system, connected to an EMC storage system through an Emulex HBA in a non-certified OpenStack cloud?

In specific environments, peace of mind is not even the biggest issue; organizations simply cannot operate without regulatory compliance.

When your OpenStack vendor:

  • only supports a handful of ISVs or IHVs
  • only supports its own hardware
  • has no security or government certifications

…that spells scary enterprise support.

Red Hat is often preferred over other OpenStack providers because we truly support a multi-vendor cloud stack, at the hardware and software level. With more than 270 certified OpenStack partners, Red Hat boasts the industry’s largest certified ecosystem in support of commercial OpenStack deployments. This helps provide customers with freedom of choice and peace of mind that is necessary to build and operate an actual hybrid cloud. Do you know, for example, that Red Hat has 25 Microsoft SVVP certifications to support various Windows operating systems?

Vertical consulting

Like any software solution, OpenStack must be adapted by enterprises to their ever-evolving business needs, and integrated with their remarkably heterogeneous IT systems. While the incredible industry support for OpenStack enables a wide range of environments and use cases, OpenStack cannot satisfy every need for every company in every vertical out of the box.

To provide enterprise-grade support, any vendor offering a commercial edition of OpenStack must support its deployment, integration and customization with a global consulting arm that is vertically skilled on the product and its complexities.

When your OpenStack vendor:

  • has no consulting division
  • has a consulting division that consists of four engineers across five continents
  • has a generic consulting division with no vertical expertise and dedicated practice on OpenStack

…that spells scary enterprise support.

Red Hat realized that enterprise customers want a business partner to support them well beyond the initial deployment of OpenStack. This awareness led to key strategic investments, like the acquisition of eNovance, which brought more than 100 OpenStack engineers to the Red Hat Consulting organization; the establishment of a Cloud Innovation Practice, to help transfer the new skill sets required to govern a cloud environment without taking over its ownership in typical managed-services fashion; and the creation of comprehensive, hands-on OpenStack training.

Code Indemnification

Like any other open source project, OpenStack receives contributions from a vibrant and highly skilled community of individuals and vendors. Despite their deep expertise, contributors are human and can unintentionally violate intellectual property rights in their open source code.

To provide enterprise-grade support, any vendor offering a commercial edition of OpenStack must protect its customers from legal repercussions in case of intellectual property rights infringement.

When your OpenStack vendor:

  • doesn’t mention or recognize code indemnification as an important legal protection
  • has no experience with the legal implications of open source licensing
  • is not formally committed to code indemnification

…that spells scary enterprise support.

Red Hat is proud to be one of the few open source vendors to offer code indemnification as an additional support mechanism for our enterprise customers. Our commitment doesn’t just mean handling the legal implications; it extends to quickly providing a technical replacement for the disputed code.

Extended cloud management

OpenStack is a powerful and flexible Infrastructure-as-a-Service (IaaS) engine, but it’s not enough to build an enterprise-grade cloud. Large organizations need governance capabilities, such as policy enforcement, capacity management, and configuration management, that are not provided out of the box by OpenStack, and that are not limited to OpenStack management. Depending on cost, performance, reliability, security, compliance, and many other constraints, lines of business must be able to host their applications across a wide range of private and public cloud engines. Managing all of those clouds in a consistent fashion, orchestrating the lifecycle of the workloads as coherently as possible, is a massive challenge that requires a powerful single pane of glass for management.

To provide enterprise-grade support, any vendor offering a commercial edition of OpenStack must be capable of offering a robust and powerful governance layer that integrates with, and augments, OpenStack’s basic operational management capabilities. That way, when the enterprise is ready to grow, the vendor can fully support its evolution.

When your OpenStack vendor:

  • suggests that OpenStack is all you need to build a mature enterprise cloud
  • has no cloud management platform that tightly integrates with its OpenStack distribution
  • has a cloud management platform that cannot support server virtualization, IaaS, and Platform-as-a-Service (PaaS) side by side across both private and public environments

…that spells scary enterprise support.

Large enterprises we talk to that are interested in OpenStack consistently ask us the same question: how can you help us build our enterprise cloud on top of OpenStack?

That is where managing OpenStack through our cloud management platform, CloudForms, makes a huge difference. Customers may not want to buy their full cloud stack from us, and we are committed to supporting multi-vendor approaches, but many also want the assurance that if they ever want a full Red Hat cloud, we have it. We do.

Any OpenStack provider claiming to offer enterprise-grade support must excel in every one of those dimensions, not just one of them.


Alessandro Perilli
GM, Management Strategy