“We don’t look at the cloud storage as this huge dumb Internet disk at the end of a WAN” – Ian Howells, CMO StorSimple
StorSimple is a Silicon Valley based start-up focused on application-optimized cloud storage for Microsoft Server applications. StorSimple’s investors include Ignition Partners, Index Ventures, Mayfield Fund and Redpoint Ventures. Following a recent $13 million Series B funding round, Ian Howells joined the company as its Chief Marketing Officer. Howells is an industry veteran with 25 years of experience at leading companies such as Ingres, Documentum, SeeBeyond and Alfresco. He is responsible for StorSimple’s marketing strategy and operational activities globally.
StorSimple’s approach is a hybrid cloud storage strategy that promises to bring the benefits of the cloud to on-premises applications without forcing the migration of applications into the cloud or switching users to new, unfamiliar applications. I spoke with Ian Howells about the drivers of the cloud storage market, and in particular about the role hybrid cloud storage plays in the evolution of the cloud.
What are the adoption drivers for the cloud storage market?
Ian Howells: Cloud is an industry disruption. The notion of elasticity is a key differentiating characteristic of the cloud and applies to both compute and storage. Elasticity is very important when your business and IT needs are unpredictable and you want to expand and contract quickly in response to those changes. Elastic compute is more appropriate for customer-facing Web and social applications with unpredictable demand patterns. In the enterprise context, storage is generally more unpredictable than compute (e.g., you don’t double your employees in a year).
Enterprise storage needs are growing at 50-60 percent a year. A recent IBM study reported that demand for storage has grown from 150 exabytes (an exabyte is a million terabytes) in 2005 to 1,200 exabytes this year. Customers need a better and more cost-effective way to manage that explosion in data.
What are some of the problems customers face with managing their current storage requirements, and how does cloud storage solve those problems?
Ian Howells: One of the main reasons storage requirements are growing at an uncontrollable rate is the explosion of redundant content. During the typical document creation and review process, companies create a large number of copies of content that differ only slightly from one another.
When you treat all content equally, we refer to that as “content communism,” and when it comes to the cloud, content communism doesn’t work. The key is to understand that most people care more about version 20 than the previous 19 versions. You have got to think like Google. Google doesn’t treat every page you search for equally; its algorithms use the notion of PageRank to automatically give you the most appropriate page first. When it comes to cloud storage, you need to treat content in a similar way.
In cloud storage, we view storage as a set of blocks and use the concept of a block rank, similar to the PageRank Google uses. We store the highest-ranking blocks in fast Solid State Drives (SSDs), tier out lower-ranking blocks to Serial Attached SCSI (SAS) devices, and finally push the lowest-ranking blocks out to the cloud. A tiered architecture is different from caching. (See this StorSimple blog post.) Essentially, the block rank drives where the data gets stored. The vast majority of your data is transparently and automatically trickled into cloud storage using this ranking method. For most common applications, including email and collaboration, you don’t access your email archive, files on shared file drives, or libraries of virtual machine images frequently. Therefore, the least frequently accessed data will all eventually be stored in the cloud, while the “working set” (the most frequently used data) is kept in the fastest tier.
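The tiering Howells describes can be sketched roughly as follows. This is an illustrative sketch only: StorSimple’s actual block rank algorithm is proprietary, and the access-count scoring and thresholds here are hypothetical stand-ins.

```python
# Illustrative sketch: a block's "rank" is approximated by its access
# count, and hypothetical thresholds decide which tier it lands in.

def assign_tier(access_count, ssd_threshold=100, sas_threshold=10):
    """Map a block's access frequency to a storage tier."""
    if access_count >= ssd_threshold:
        return "SSD"      # hottest blocks: fast solid-state tier
    if access_count >= sas_threshold:
        return "SAS"      # warm blocks: spinning-disk tier
    return "cloud"        # cold blocks trickle out to cloud storage

# Example: a working-set block vs. an archived-email block
blocks = {"inbox_index": 450, "2008_archive": 2}
tiers = {name: assign_tier(count) for name, count in blocks.items()}
print(tiers)  # {'inbox_index': 'SSD', '2008_archive': 'cloud'}
```

A real system would recompute ranks continuously and migrate blocks between tiers as access patterns change; this sketch only shows the classification step.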
Furthermore, we don’t need to store entire previous versions of a particular document in the cloud, but only differences between them. Using de-duplication technology, we never store the same block twice, which dramatically reduces the amount of storage needed. Some of our customers were able to reduce their storage requirements by a factor of 10.
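The “never store the same block twice” idea is the core of block-level deduplication, and can be sketched with a content-addressed store. This is a minimal sketch, not StorSimple’s implementation; real systems add chunking, reference counting, and compression.

```python
import hashlib

# Minimal sketch of block-level deduplication: each block is stored once,
# keyed by its content hash, so identical blocks shared across document
# versions consume space only once.

class DedupStore:
    def __init__(self):
        self.blocks = {}   # content hash -> block bytes

    def put(self, block: bytes) -> str:
        key = hashlib.sha256(block).hexdigest()
        if key not in self.blocks:   # never store the same block twice
            self.blocks[key] = block
        return key

store = DedupStore()
v19 = [b"intro", b"chapter", b"old ending"]
v20 = [b"intro", b"chapter", b"new ending"]   # only the last block changed
for block in v19 + v20:
    store.put(block)
print(len(store.blocks))  # 4 unique blocks stored, not 6
```

Because consecutive document versions share most of their blocks, storing 20 versions costs little more than storing one, which is where the factor-of-10 reductions come from.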
We don’t look at the cloud as this huge dumb Internet disk at the end of a WAN. We look at in terms of a tiered architecture with the cloud being the last tier.
What’s your target market and what problems do you specifically solve for the customers in that market?
Ian Howells: Let’s look at the content management space. Collaborative content management is a $9 billion market, about 50 percent of the overall content management market. Companies buy CMS systems very differently today from the way they did 10 years ago. They used to spend millions of dollars standardizing on one system for the entire enterprise. They don’t do that anymore. Acquisitions now happen on a project-by-project basis, with SharePoint, Documentum, or FileNet on the short list for each buy decision.
Generically, we call our target market “application-optimized storage,” and we believe the big market opportunity in that space is SharePoint (SP). SP manages its content in a database. In 2007, the recommended maximum database size was 100 GB, and in 2010 the maximum is 200 GB, which is pretty small. What we are able to do with our appliance is make SP a BIG SP, allowing it to easily manage multiple terabytes.
If you transform a SharePoint system from 200 GB into a multi-terabyte system, it opens up a massive opportunity for SharePoint to win against its competitors.
In addition, we took SharePoint’s backup time from 68 minutes to 27 seconds, and recovery from 68 minutes to 38 seconds, thus solving major performance issues customers are having today.
We are a clear leader in the Microsoft ecosystem and that means in SharePoint, Exchange, shared file drives, and shared virtual libraries. We are the only cloud storage vendor certified for Windows.
How should customers think about their cloud storage strategy today?
Ian Howells: The cloud is a very new and dynamically evolving space. Customers have many options to evaluate. There’s SaaS, private clouds, public clouds, and hybrid clouds. Companies need to ask three simple questions for their cloud strategy:
- Are we new or established?
- Are our applications isolated, integrated, or customized?
- Do our applications fit into the “working set” model?
For example, if you are a new company and are making an email decision, it’s easy to make the case to go the SaaS route. Email is pretty isolated, doesn’t need much customization and there’s a natural working set (today’s email vs. the archive).
On the other hand, if you are a 100-year-old insurance company using a SharePoint system that is heavily customized and deeply integrated with your other applications like CRM and ERP, it’s quite difficult to move that application to the cloud and still integrate it easily with your on-premises applications. That is much more of a hybrid cloud use case.
I think you really need to have huge economies of scale to make private cloud financially meaningful. It also takes a lot longer to roll out a private cloud strategy than a hybrid or a public cloud strategy. Most CXOs are looking for a cloud strategy that can show results in the next 18 months. With the hybrid cloud, they can show a quick short-term win, even if a private cloud is their long-term strategy. They can’t just bet on a long-term strategy alone.
How does your solution work with public clouds?
Ian Howells: We provide an appliance that we install on-premises in less than 10 minutes. The customer can choose a public cloud like Windows Azure, Amazon, Iron Mountain, or AT&T, or an on-premises solution such as EMC Atmos. The tiers are automatically and transparently managed by the appliance using the block rank algorithm I mentioned earlier. Typically, 75 percent of your data will be in the cloud because you don’t access it that often. Based on the access and use patterns of the application, we optimize the location of the storage. For example, in the SharePoint case, application optimization automatically places the database in SSD and the log file in SAS, and moves the least-accessed content to the cloud while keeping the working set in SSD or SAS. It’s really critical to understand the application to maximize performance.
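The SharePoint placement Howells describes amounts to a per-component policy. The table below is a hedged sketch of that idea; the component names and mappings are illustrative, not StorSimple’s actual configuration.

```python
# Hypothetical application-optimized placement policy for the SharePoint
# case described above: each application component maps to the tier that
# best matches its I/O pattern.

PLACEMENT_POLICY = {
    "database": "SSD",       # the DB needs the fastest random I/O
    "log": "SAS",            # sequential log writes suit SAS well
    "cold_content": "cloud", # least-accessed BLOBs tier out to the cloud
}

def place(component: str) -> str:
    """Return the storage tier for an application component."""
    return PLACEMENT_POLICY.get(component, "SAS")  # default: middle tier

print(place("database"))  # SSD
```

The point of application optimization is that this mapping differs per application (Exchange, file shares, virtual machine libraries), which is why understanding the application matters for performance.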
When you have data spread over multiple tiers like that, how do you ensure coherent backup and recovery operations?
Ian Howells: With the tiered approach, we find that 75 percent of the data is already in the cloud and, in a sense, already backed up. We just have to back up the working set of 25 percent, which is really small. Furthermore, we dedup and get size reductions of a factor of 10, which means we are only backing up 2.5 percent. Deduping data into the cloud gives us a huge benefit: when you have to recover, you don’t have to pull 10 terabytes over a WAN connection. All you need is the working set, which trickles back into the appliance, and you are good to go. This way we recover very quickly, and we get faster over time as the working set is rebuilt. Your backup and recovery are an order of magnitude quicker.
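The arithmetic behind the 2.5 percent figure can be made explicit. The 10 TB total below is a hypothetical example; the 75 percent and factor-of-10 numbers are the ones quoted above.

```python
# Worked example of the backup arithmetic: with 75% of data already
# tiered into the cloud and a 10x dedup reduction on the rest, only a
# small fraction of the total must actually be backed up.

total_tb = 10.0                 # hypothetical total data volume, in TB
working_set = total_tb * 0.25   # the 25% not yet tiered into the cloud
after_dedup = working_set / 10  # factor-of-10 dedup reduction
fraction = after_dedup / total_tb
print(f"{after_dedup} TB to back up ({fraction:.1%} of the total)")
# 0.25 TB to back up (2.5% of the total)
```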
We have a concept called Cloud Clone that takes the whole application (the DB, the log files, the BLOBs, everything) and backs it up in an isolated way in the cloud. Once the initial clone is backed up, each subsequent backup is tiny because it applies only to changes in the working set, so you can do a clone daily and have off-site snapshots of your data volume. You can recover individual files or entire volumes because clones can be mounted as data volumes for restore. You also have the ability to test disaster recovery as many times as you like because it is all integrated and easy to use.
We need to only recreate a new working set instead of the entire data in the cloud so we recover very quickly.
In a disaster recovery scenario, if your entire data center is destroyed and you switch to a different data center, you load your configuration files into the appliance and you are up and running immediately with your content in the cloud. Your application is completely isolated in the cloud. In effect, we have created a redundant data center for you in the cloud, something many medium-sized companies cannot otherwise afford.
Economics, security, and performance are big topics in the cloud, your thoughts?
Ian Howells: In the case of the cloud, you pay only for what you need, and you pay today’s prices. If you need more storage in the future, you pay the future prices, and if those are lower, you pay less. Because storage prices are constantly decreasing, this works out to be cheaper than paying up front for all the storage you will need over five years.
And because we dedup and send only a tenth of the data into the cloud, instead of paying 15 cents per gigabyte you are effectively paying only 1.5 cents per gigabyte.
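That pricing claim is simple arithmetic: a tenfold dedup reduction divides the effective per-gigabyte rate by ten. (The 15-cent rate is the article’s example, not a current price.)

```python
# Effective storage cost after deduplication: uploading a tenth of the
# raw data cuts the effective per-gigabyte rate proportionally.

raw_rate = 0.15          # $/GB charged by the cloud provider (example)
dedup_factor = 10        # size reduction from deduplication
effective_rate = raw_rate / dedup_factor
print(f"${effective_rate:.3f}/GB effective")  # $0.015/GB effective
```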
As for security, I think it is more of an education issue. We have more secure systems in the cloud today than in traditional data centers. We use military-grade encryption, the customer retains the encryption keys, and because we dedup, there is no chance of anyone being able to make sense of the data even if they break into the system.
People often switch off application-level encryption because of performance issues. Our appliance uses hardware-based encryption that works automatically across Exchange, SharePoint, and other applications (taking humans, where most security problems happen, out of the picture), and therefore has virtually no impact on performance.
We thank Ian Howells, Chief Marketing Officer, StorSimple for talking with us and sharing valuable insights on the evolution of cloud storage.
Kamesh Pemmaraju heads cloud research at Sand Hill Group, where he helps companies — enterprises and technology vendors — accelerate their transition to the cloud. Follow him on Twitter @kpemmaraju.