
Three Keys to Navigating the Fluid Future of IT Infrastructure

By Krishna Yeddanapudi | January 26, 2016 | Article

A single infrastructure can no longer serve as a standard for all applications, data formats and environments; different use cases demand different applications. Today's IT managers must understand the requirements of all these applications and their data formats, and deliver a data center built for the fluid nature of IT infrastructure. Massive growth in data volumes, major open source innovation in distributed applications such as Hadoop, Cassandra, Elasticsearch and Spark, and a wide array of file and object storage formats layered on top of a plethora of traditional databases together create an increasingly amorphous data center perimeter spanning public, private and hybrid clouds. This complexity makes it extremely difficult to make deterministic long-term IT infrastructure decisions, and it can bewilder even top IT managers.

Given this uncertainty, the safer projection is that future IT infrastructures will combine private and public cloud environments with a high degree of mobility. Several diverging trends make it unrealistic to expect common infrastructure standards across all public clouds and on-premises environments.

If dependency on the cloud infrastructure is kept at the level of basic services, customers can avoid cloud lock-in entirely, and migration across clouds and on-premises environments becomes far easier. The fluidity described above makes limiting that dependency especially important. To achieve complete functionality while restricting reliance on the cloud provider to basic services and enabling seamless deployment, three key capabilities are needed:

  • Creation of elastic resources with assured quality of service
  • Ability to view, control, secure and modify data in all formats
  • Anticipation of volatility in infrastructures and environments  


Create elastic resources

Emerging container technologies, such as Docker and LXC, together with host-side caching that uses SSDs and RAM for both reads and writes, help create highly elastic resources with assured quality of service. Enforcing quality of service at the storage layer decouples application performance from storage performance, so applications deliver consistent performance in every cloud environment.
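The article does not describe a specific enforcement mechanism, but storage quality of service is commonly implemented with a rate limiter such as a token bucket that caps each tenant's IOPS. The sketch below is a minimal, hypothetical illustration of that idea; the `TokenBucket` class and its parameters are assumptions, not part of any product named here.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: one common way to cap a
    tenant's IOPS so a noisy application cannot starve its neighbors."""

    def __init__(self, rate_iops, burst):
        self.rate = rate_iops        # tokens (I/O credits) added per second
        self.capacity = burst        # maximum burst of back-to-back I/Os
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, cost=1):
        """Refill tokens based on elapsed time, then grant or deny one I/O."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# A tenant limited to 100 IOPS with a burst of 10: of 50 back-to-back
# requests, roughly the burst size is granted immediately and the rest
# are denied until tokens refill.
bucket = TokenBucket(rate_iops=100, burst=10)
granted = sum(1 for _ in range(50) if bucket.allow())
print(granted)
```

In a real deployment this throttle would sit in the I/O path (for example in a caching layer or block driver), with denied requests queued rather than dropped.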

Know your data 

Data is the lifeblood of every organization, and complete control over it is essential. The emerging IT infrastructure forces customers to store data in different formats and in different environments. It is therefore important to maintain a global namespace over all the data, with full ability to view, modify, secure and control access to it.
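One way to picture a global namespace is as a registry that maps a single logical path to whichever backend actually holds the data, with access control attached at the logical level. The sketch below is purely illustrative; the `GlobalNamespace` class, the backend names and the `acl` field are assumptions invented for this example.

```python
class GlobalNamespace:
    """Illustrative registry mapping one logical path to whatever
    backend (object store, HDFS, NFS, block) holds the data, with a
    simple access-control list enforced on every lookup."""

    def __init__(self):
        self._entries = {}   # logical path -> {backend, location, acl}

    def register(self, logical, backend, location, acl=("owner",)):
        self._entries[logical] = {"backend": backend,
                                  "location": location,
                                  "acl": set(acl)}

    def grant(self, logical, principal):
        self._entries[logical]["acl"].add(principal)

    def resolve(self, logical, principal):
        """Return (backend, physical location) if the principal is allowed."""
        entry = self._entries[logical]
        if principal not in entry["acl"]:
            raise PermissionError(f"{principal} may not access {logical}")
        return entry["backend"], entry["location"]

ns = GlobalNamespace()
ns.register("/sales/q3", backend="s3",
            location="s3://acme-data/q3.parquet", acl=("alice",))
ns.grant("/sales/q3", "bob")
print(ns.resolve("/sales/q3", "bob"))  # ('s3', 's3://acme-data/q3.parquet')
```

The point of the indirection is that data can later be migrated to a different backend by updating the registry entry, without changing any logical path the applications use.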

Prepare for volatility 

Emerging hardware technologies, such as NVMe and large amounts of host-side non-volatile memory, are injecting volatility into hardware infrastructure decisions. Growing host-side computing power, non-volatile memory and I/O bandwidth are forcing infrastructure designers to re-architect how compute and storage are configured, blurring the traditional boundary between primary and secondary storage. On top of this hardware volatility, emerging cloud technologies are forcing CIOs to plan for a high level of infrastructure volatility.

In effect, we need infrastructure software that abstracts applications and data from the underlying hardware, addressing the problems of deployment across different infrastructures and of dynamic IT operations.
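The core of such abstraction is that an application declares what it needs, and a thin scheduling layer maps those needs onto whatever nodes the current infrastructure provides, so that node details never leak into the application. The `place` function and the node/app dictionaries below are hypothetical, a minimal sketch of that idea rather than any real scheduler.

```python
def place(app, nodes):
    """Return the name of the first node that satisfies the app's
    declared resource requirements, or None if nothing fits.
    The app never sees whether the node is cloud or on-premises."""
    for node in nodes:
        if (node["cpus"] >= app["cpus"]
                and node["ram_gb"] >= app["ram_gb"]
                and node["storage"] in app["storage_ok"]):
            return node["name"]
    return None

# A mixed pool: one public-cloud instance, one on-premises KVM host.
nodes = [
    {"name": "aws-i3.large", "cpus": 2, "ram_gb": 15, "storage": "nvme"},
    {"name": "onprem-kvm-1", "cpus": 8, "ram_gb": 64, "storage": "hdd"},
]

# The application states requirements, not locations.
app = {"cpus": 4, "ram_gb": 32, "storage_ok": {"nvme", "hdd"}}
print(place(app, nodes))  # onprem-kvm-1
```

Because the application is described only in terms of resources, the same declaration can be placed on a public cloud instance today and an on-premises server tomorrow, which is exactly the mobility the article argues for.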

At a broad level, we can classify every private and public cloud environment into two basic classes: 

  1. Basic cloud services, that is, virtual machines for compute with configurable local memory, number of CPU cores and local SSD or HDD storage, or bare-metal servers; alongside shared, highly available storage (object storage, HDFS file storage, NFS-based file storage, block storage, or a combination of these). 
  2. Premium cloud services.  

As an example, Amazon EC2 instances and AWS S3 storage qualify as basic cloud services. Comparable basic services are readily available from other cloud providers, and from VMware or KVM in on-premises environments.  

Premium cloud services, by contrast, are highly proprietary to each cloud provider. They include container orchestration, database as a service, Hadoop as a service, specialized application deployment mechanisms, special networking configurations and so on. 

Global cloud providers often expose private application programming interfaces for these value-added services. Through such APIs and proprietary applications, the cloud vendor effectively limits your ability to move to another provider, since the alternative may not offer the same services or APIs. On the positive side, these premium services let customers get started on cloud infrastructure quickly. 

Krishna Yeddanapudi is the founder and CTO of Robin Systems, founded in 2013. Robin Systems is leading the revolution in reinventing IT infrastructure for an increasingly data-centric world. Robin is pioneering the creation of the industry's first Data-Centric Compute and Software Containerization software to help enterprises accelerate, consolidate and simplify their modern data applications. Robin's software dramatically improves performance for distributed applications such as Hadoop, NoSQL and Elasticsearch, and eliminates data duplication by enabling data sharing across applications and clusters, providing a substantial boost in agility to deploy new applications. In his 20-plus-year career, Yeddanapudi has gained deep insight into many technologies.








