Do you really know what cloud is and how to use it? Many in business, and most in the technology industry in particular, would say “yes” with confidence. The word is everywhere, and descriptions are readily available from conferences, books and online references. The problem is that we first need to understand what it truly is, what its capabilities are and how we can work with it. There are abstraction, communication, governance, security and size criteria related to the cloud. This article discusses these criteria and how to work with the cloud. It is based on two simple premises: 1) cloud is more than a thing and 2) software infrastructure architects are the ones best able to handle it.
We can start by looking for a definition of cloud. Wikipedia attaches a last name to it to differentiate it from other clouds: cloud computing. It is basically “the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet)”[1]. That definition is reasonably accurate, but we will make it a bit more abstract by saying that cloud is a pool of resources that can be accessed and used remotely.
Those resources are not restricted to hardware and software since they can be services or data/information (requiring software and hardware, of course). The slight difference is the actual abstraction of the resources that live, or reside, in the cloud. That is, the cloud is the place “out there” that is a pool of resources we can access.
There are many ways those cloud resources can be accessed, but the near de facto standard is the service metaphor. The cloud is then that pool of resources that offers a set of services (business functionality) through a special communication channel that you can consume over the network. This abstraction brought a plethora of resources delivered as services, like Infrastructure as a Service (IaaS)[2], Platform as a Service (PaaS)[3] and even Data as a Service (DaaS)[4].
These are the same resources that normal “on premises” systems use. The difference is that they are in the cloud, in the pool, accessed using the service metaphor. This comes with some important restrictions on what the system can and cannot do, and it also affects how the system itself is designed. Let’s review those considerations.
Abstraction criteria
Resources in the cloud may not be what they seem. Using virtualization and other techniques, the cloud can offer different types of resources that may not be the real thing. At the infrastructure level, the cloud can offer computers with some memory and processor power but not necessarily a physical machine that exists. The abstraction of the resources is a very important piece of the cloud offering that has many benefits and some restrictions.
The abstraction will give the software development and IT teams the all-important “ease of use.” The configuration and deployment of a machine, for instance, is as easy as adding it to an account and then burning a pre-configured system onto it. Creating a new database or installing a new service takes only a couple of clicks. That is because all the heavy lifting is done by the cloud implementation, not by your own system. You only consume the service of adding a new database. How that is done is hidden.
What you lose then is a little bit of control. To offer that ease of use, the offering has to be kept simple, and that means reducing the configuration granularity. There is little room for fine-tuning or directly handling configuration parameters in order to tweak the solution to your needs. That limitation may prevent the use of cloud resources for applications whose requirements differ from the standard, or most common, configurations.
Communication criteria
One important characteristic of accessing resources in the cloud is the need for communication technology. Communication means information (control, status and business) flowing from the local system to the cloud and vice-versa. That comes with an interesting set of considerations to take into account.
External API. To access the resources you must follow the API specification the cloud provider gives you. That API is out of your control and may provide fantastic capabilities or limit your own functionality. That API should be evaluated to determine if it fits your needs.
Speed. All communication affects the speed of processing. In the cloud context that is clear when you include security (the cost of encrypting), data validation, network latency, etc. Requirements that are sensitive to turnaround times should be validated before using the cloud for them.
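As a rough illustration of that validation, the sketch below adds up the per-call overheads named above and compares the total with a turnaround budget. The class name, the field names and every number are assumptions made for this example, not measurements.

```python
from dataclasses import dataclass


@dataclass
class CallCost:
    """Estimated cost, in milliseconds, of one call (illustrative fields)."""
    network_round_trip_ms: int
    encryption_ms: int
    validation_ms: int
    processing_ms: int

    def total_ms(self) -> int:
        return (self.network_round_trip_ms + self.encryption_ms
                + self.validation_ms + self.processing_ms)


def fits_budget(cost: CallCost, budget_ms: int) -> bool:
    """Return True when the estimated turnaround stays within the budget."""
    return cost.total_ms() <= budget_ms


# Hypothetical numbers: the same work done locally and through a cloud service.
local = CallCost(network_round_trip_ms=0, encryption_ms=0,
                 validation_ms=3, processing_ms=20)
remote = CallCost(network_round_trip_ms=80, encryption_ms=5,
                  validation_ms=3, processing_ms=20)

print(fits_budget(local, budget_ms=100))   # True: 23 ms in total
print(fits_budget(remote, budget_ms=100))  # False: 108 ms in total
```

In practice the estimates would come from measuring real calls against the provider rather than from fixed constants.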
Data/control jeopardy. As mentioned before, you can lose control. A third-party cloud provider is the one managing your resources and data. Any problem or exception will be handled by the cloud provider.
The online state. Of course, communication means you need to be connected, online. When the cloud is an external one, this online state requires a continuous Internet connection. A line failure will leave the system inoperable, a risk that should be evaluated and mitigated.
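One common way to mitigate a line failure is to retry the remote call a few times and then fall back to a local answer. The sketch below is a minimal version of that idea; the function names, the delays and the "cached value" fallback are all hypothetical choices for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try a remote operation a few times, then fall back to a local answer."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            # Wait 0.5 s, 1 s, 2 s, ... between attempts, but not after the last.
            if attempt < attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    return fallback()


def unreachable_cloud() -> str:
    # Stand-in for a remote call made while the line is down.
    raise ConnectionError("line is down")


# The sleep function is replaced here so the example runs instantly.
result = call_with_retry(unreachable_cloud, lambda: "cached value",
                         sleep=lambda seconds: None)
print(result)  # cached value
```

Whether a fallback is acceptable at all depends on the requirement; some operations can only be queued and replayed once the connection returns.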
Governance criteria
Governance may need to be expanded. The policies and norms may change due to the newly acquired capabilities in infrastructure and software when working with the cloud. Let’s review.
Dynamic provisioning. The cloud can create resources dynamically. That allows on-demand provisioning and on-the-fly creation of resources based on actual needs, and it should be automatic. Of course, this requires gathering data for decision making, but it also minimizes chaotic changes in infrastructure and service size. Special care should be taken, as adding more resources may impact sensitive components. (Imagine an overload of database queries or saturation of the network.)
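A provisioning policy of this kind can be sketched as a small function: scale in proportion to measured load, but never beyond what a sensitive shared component can absorb. The proportional rule, the parameter names and the database connection figures below are assumptions for illustration only.

```python
def desired_instances(
    current: int,
    cpu_percent: int,
    target_cpu_percent: int,
    min_instances: int,
    max_instances: int,
    db_connections_per_instance: int,
    db_connection_limit: int,
) -> int:
    """Return how many instances the policy would run for the measured load."""
    # Scale in proportion to load, rounding up (integer ceiling division).
    wanted = -(-current * cpu_percent // target_cpu_percent)
    # Never create more instances than the shared database can serve.
    db_cap = db_connection_limit // db_connections_per_instance
    # If the limits conflict, the minimum wins in this sketch.
    return max(min_instances, min(wanted, max_instances, db_cap))


# 4 instances at 90% CPU with a 60% target: the proportional rule asks for 6,
# but a database limited to 100 connections (20 per instance) allows only 5.
print(desired_instances(4, 90, 60, 1, 10, 20, 100))  # 5
```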
Failover/recovery. More resources mean more possibilities of component failure. On the other hand, problems with bloated or abandoned resources can be solved with the careful creation of more resources to replace the ones that are broken. Policies regulating this are needed.
Operations team impacted. Handling a static set of servers is hard work. Handling large sets of servers that are created and removed dynamically may be cumbersome. Operations may need to switch gears toward resource manipulation and the use of templates.
Cloud-aware development. Development teams should be ready for dynamic provisioning. That is, the software needs to be aware of the cloud and, having access to the cloud API, it can auto-provision the needed resources. It should also provide services for sanity checks, monitoring and external control so that a central management component can detect problems and isolate them. That may be easy with a couple of resources, but a dynamic mass will require additional tooling.
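The detect-and-isolate role of a central management component might look like the following sketch. It assumes each resource exposes a sanity check as a callable that returns True when healthy; the resource names and the checks themselves are made up for the example.

```python
from typing import Callable, Dict, List, Tuple


def partition_by_health(
    checks: Dict[str, Callable[[], bool]],
) -> Tuple[List[str], List[str]]:
    """Run every sanity check and split resources into healthy and isolated."""
    healthy: List[str] = []
    isolated: List[str] = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A check that crashes is treated the same as a failed check.
            ok = False
        (healthy if ok else isolated).append(name)
    return healthy, isolated


def broken_check() -> bool:
    raise TimeoutError("no answer from resource")


healthy, isolated = partition_by_health({
    "web-1": lambda: True,
    "web-2": lambda: False,
    "db-1": broken_check,
})
print(healthy)   # ['web-1']
print(isolated)  # ['web-2', 'db-1']
```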
Security criteria
In security there are several concerns and constraints. The two major concerns are as follows.
Data CIA. One of the main concerns in security is ensuring the confidentiality, integrity and availability (CIA) of data. These properties should be enforced according to the data and the processing that data requires. They are also affected by “where” the data is located. Data in a public cloud relies heavily on the cloud implementation. The cloud provider may be able to see the data, and problems with the provider may alter the data or, even worse, leave it unreachable. The data is somewhere else and it is not totally under your control. So additional enforcement should be in place, such as encryption at the origin (so data is encrypted on premises), data validation and live data backups (as in business-continuity plans, see next point).
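The data validation part can be illustrated with a keyed hash computed on premises before upload and checked again after download. The sketch below uses Python's standard hmac and hashlib modules; the key and the record are placeholders, and it covers integrity only. Encryption at the origin would need a vetted cryptography library and is not shown.

```python
import hashlib
import hmac


def make_tag(data: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag on premises, before the data leaves."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_unaltered(data: bytes, tag: str, key: bytes) -> bool:
    """Check data fetched back from the cloud against the stored tag."""
    return hmac.compare_digest(make_tag(data, key), tag)


# Placeholder key and record; a real key would be managed on premises.
key = b"example-key-kept-on-premises"
record = b"customer=42;balance=100"
tag = make_tag(record, key)

print(is_unaltered(record, tag, key))                       # True
print(is_unaltered(b"customer=42;balance=900", tag, key))   # False
```

Because the key never leaves the premises, a change made to the stored data on the provider side cannot be hidden by recomputing the tag there.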
Massive failure. Yes, it has happened. The cloud provider may have a serious blackout and all of your services in the cloud would be unreachable. Depending on how critical those services are, you need to take this concern into account in your business-continuity plan.
There are, of course, some legal constraints you also need to account for: for instance, data that must not leave the country or cannot be “exported” to some locations.
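Such constraints can be enforced in software before any data is sent. The sketch below checks a data classification against a table of permitted regions; the classifications and region names are invented for the example and carry no legal meaning.

```python
# Hypothetical policy table: which regions may hold each class of data.
ALLOWED_REGIONS = {
    "public": {"region-a", "region-b", "region-c"},
    "personal": {"region-a"},
}


def may_store(classification: str, region: str) -> bool:
    """Return True only when policy explicitly allows this region."""
    # Unknown classifications are rejected rather than allowed by default.
    return region in ALLOWED_REGIONS.get(classification, set())


print(may_store("public", "region-b"))    # True
print(may_store("personal", "region-b"))  # False
print(may_store("unlabeled", "region-a")) # False
```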
Size criteria
Yes, the cloud offers immediate scaling of your infrastructure or services. Suddenly, you can scale from one to thousands of servers. Controlling a huge set of resources is a complex task. Here’s what you need to consider.
Creation and control. How, when, and how much to create should be driven by policies, regular monitoring and careful analysis of the impact of adding resources. The other part of the job is how to control that mass of resources. There are several strategies, from grids to viral expansion and control of resources (where each resource may additionally create and control a small set of resources), and each one of them will require a control component. A single centralized controller is not a good idea, as it may not be able to control that mass without help.
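To see why delegated control scales, consider the viral strategy as a tree in which every resource controls at most a fixed number of others. The sketch below counts how many levels of delegation are needed to cover a given total; the function name and the fan-out of 10 are assumptions for the example.

```python
def control_depth(total_resources: int, fan_out: int) -> int:
    """Levels of delegation needed when each resource controls `fan_out` others.

    Level 0 is a single root resource; each further level multiplies the
    number of newly controlled resources by `fan_out`.
    """
    if total_resources < 1 or fan_out < 2:
        raise ValueError("need at least one resource and a fan-out of 2 or more")
    depth = 0
    covered = 1      # resources covered so far (the root)
    level_size = 1   # resources on the deepest level so far
    while covered < total_resources:
        level_size *= fan_out
        covered += level_size
        depth += 1
    return depth


# With each resource controlling 10 others, three levels cover
# 1 + 10 + 100 + 1000 = 1111 resources.
print(control_depth(1000, fan_out=10))  # 3
```

No single component ever handles more than 10 resources directly in this model, which is the point of avoiding one central controller.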
Sanity of resource. Self-awareness is a must. You need a self-sanity control enabling system components to notify you promptly and in an illuminating way when something is wrong.
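A self-sanity control can be as simple as a function each component runs on itself, returning a report whose messages say exactly what is wrong. The metrics, thresholds and names in this sketch are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SanityReport:
    resource_id: str
    problems: List[str] = field(default_factory=list)

    @property
    def healthy(self) -> bool:
        return not self.problems


def self_check(
    resource_id: str,
    disk_free_percent: int,
    queue_depth: int,
    min_disk_free_percent: int = 10,
    max_queue_depth: int = 1000,
) -> SanityReport:
    """Compare the component's own metrics with its limits."""
    report = SanityReport(resource_id)
    if disk_free_percent < min_disk_free_percent:
        report.problems.append(
            f"disk free {disk_free_percent}% is below the "
            f"{min_disk_free_percent}% minimum"
        )
    if queue_depth > max_queue_depth:
        report.problems.append(
            f"queue depth {queue_depth} exceeds the limit of {max_queue_depth}"
        )
    return report


report = self_check("worker-7", disk_free_percent=4, queue_depth=250)
print(report.healthy)   # False
print(report.problems)  # ['disk free 4% is below the 10% minimum']
```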
Multiple-vendor integration. Of course, there are cases where multiple vendors are needed to fulfill the system requirements. Managing large sets of resources that can be located in different clouds requires a little bit of abstraction. Careful planning and integration with the business-continuity plan should be performed.
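That little bit of abstraction might be a common interface that hides each vendor's API behind the same call. The sketch below is one possible shape for it; VendorA and VendorB are fake stand-ins, not real provider SDKs, and the round-robin placement is only one of many possible policies.

```python
from itertools import cycle
from typing import List, Protocol


class CloudProvider(Protocol):
    """The only operation the rest of the system is allowed to depend on."""
    def create_server(self, size: str) -> str: ...


class VendorA:
    def __init__(self) -> None:
        self._count = 0

    def create_server(self, size: str) -> str:
        self._count += 1
        return f"vendor-a/{size}/{self._count}"


class VendorB:
    def __init__(self) -> None:
        self._count = 0

    def create_server(self, size: str) -> str:
        self._count += 1
        return f"vendor-b/{size}/{self._count}"


def provision(providers: List[CloudProvider], size: str, count: int) -> List[str]:
    """Spread new servers across vendors in round-robin order."""
    if not providers:
        raise ValueError("at least one provider is required")
    rotation = cycle(providers)
    return [next(rotation).create_server(size) for _ in range(count)]


servers = provision([VendorA(), VendorB()], size="small", count=3)
print(servers)  # ['vendor-a/small/1', 'vendor-b/small/1', 'vendor-a/small/2']
```

Spreading resources this way also supports the business-continuity plan, since a blackout at one vendor leaves part of the capacity running at the other.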
Summary
Cloud is a great new way of providing resources, using the service metaphor, offering interesting capabilities for growing and dynamic systems. But, at the same time, it poses some constraints and related concerns that software infrastructure architects should understand and solve when creating the software and infrastructure architectures.
References
[1] Wikipedia, “Wikipedia-Cloud Computing”
[2] Wikipedia, “Wikipedia IaaS”
[3] Wikipedia, “Wikipedia PaaS”
[4] Wikipedia
William Martinez, a software architect and R&D manager at the nearshore software engineering services company, Avantica Technologies, works with venture-backed startups, the Software 500 and industry in development, testing, high-end productivity evaluations, SOA technologies and mobile platforms. He is a REST evangelist and is currently writing a book on the subject from the architecture perspective. He also leads the IASA Costa Rica Chapter (International Association of Software Architects) and is a professor at the University of Costa Rica. He has been a technical reviewer of popular books including Fast SOA and Rest in Practice, and co-wrote the Planning and Evaluation of IT Projects book for UNED.