Editor’s note: Denodo Express is the industry’s first no-cost data virtualization tool. It allows data management professionals to start proving the value of data virtualization by generating new data insights in hours instead of weeks, providing the business with faster access to information that’s consistent, reliable, and trusted across the enterprise. The solution connects to disparate on-premises, cloud, Big Data, structured, and unstructured sources to create normalized views, integrated and cleansed in real time or in memory, for on-demand consumption.
I talked with Suresh Chandrasekaran, senior vice president, North America, of Denodo Technologies, about data virtualization challenges and trends and how Denodo Express addresses users’ needs.
Is data virtualization primarily a solution for business intelligence and reporting tools?
Suresh Chandrasekaran: Traditionally, data virtualization use cases have focused on analytical uses, integrating data for BI and reporting tools. But recently we have seen more and more interest in operational use cases, such as a “single view of the customer” in call centers or providing real-time dashboards to operational managers on such aspects as the status of equipment on a production line.
Actually, while many people have heard about data virtualization, a good many are still unsure what it is. Adding to the confusion, the term is getting overloaded as more people jump on the bandwagon and twist it for their own purposes when describing what a data virtualization platform can do.
What are the top three drivers?
Suresh Chandrasekaran: A significant driver is the need to integrate data from Hadoop or NoSQL databases with other data in the enterprise, thereby avoiding the “data science” silo.
Second is data warehouse “offloading,” which moves older, less frequently accessed data out of the data warehouse and into Hadoop to leverage less-expensive storage. Companies then use data virtualization for queries across the two data stores, hiding the partitioned nature of the data from end users.
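The offloading pattern can be sketched in a few lines. This is a conceptual illustration only, not Denodo code: all table names and data here are hypothetical. Recent rows live in the "warehouse" store, older rows in the "Hadoop" store, and consumers query one logical table that hides the partition.

```python
# Conceptual sketch of warehouse offloading behind a virtualization layer.
# All names and data are hypothetical; a real platform would push queries
# down to each physical store rather than load rows into memory.
from datetime import date

# Two physical stores produced by offloading older, rarely accessed data.
warehouse_orders = [
    {"order_id": 101, "order_date": date(2015, 3, 1), "amount": 250.0},
]
hadoop_orders = [
    {"order_id": 57, "order_date": date(2012, 7, 9), "amount": 80.0},
]

def query_orders(since):
    """Federated view: union both stores so consumers never see the split."""
    combined = warehouse_orders + hadoop_orders
    return sorted(
        (row for row in combined if row["order_date"] >= since),
        key=lambda row: row["order_date"],
    )

# Consumers query one logical "orders" table regardless of where rows live.
all_orders = query_orders(date(2000, 1, 1))
```

The key point is that `query_orders` is the only interface consumers see; whether a row happens to sit in the warehouse or in Hadoop is invisible to them.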
Another driver is the growing adoption of the Internet of Things (IoT). Organizations realize that they might be collecting massive volumes of data from clickstreams, sensors, mobile devices, etc., and storing it in systems like Hadoop. Data virtualization helps them make use of that data by accessing it directly or integrating it with other data.
So it’s primarily a data integration solution. What are some of the integration challenges it solves?
Suresh Chandrasekaran: Developers struggle with data consistency and availability. Business and data analysts often spend hours hunting for data and resolving data inconsistencies. They also need to push data to mobile devices that are not well suited to traditional SQL-centered data integration technologies.
Enterprise architects face the lack of a common, consistent data model across the organization and across all of the different data sources. As a result, there is no single version of the truth, because customer data can reside in many systems across the enterprise.
Data virtualization enables enterprise architects to build a common data model across all of their data assets regardless of format, location, technology, protocols, etc. They can do this without forcing everyone to conform to a single enterprise-wide data model. The data virtualization layer “translates” between data source models.
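The "translation" role described above can be illustrated with a toy example. This is a hypothetical sketch, not Denodo functionality: each source keeps its own native model, and a small per-source mapping function projects its records into one canonical customer view.

```python
# Conceptual sketch (hypothetical field names) of mapping two source models
# into one canonical customer model, without forcing either source to change.
crm_record = {"CustID": "C-9", "FullName": "Ada Lopez", "Phone": "555-0100"}
billing_record = {"customer_no": 9, "name": "LOPEZ, ADA", "balance": 42.5}

def from_crm(rec):
    """CRM uses 'C-' prefixed IDs and 'First Last' names."""
    return {"customer_id": rec["CustID"].replace("C-", ""),
            "name": rec["FullName"]}

def from_billing(rec):
    """Billing uses integer IDs and 'LAST, FIRST' names."""
    last, first = [p.strip() for p in rec["name"].split(",")]
    return {"customer_id": str(rec["customer_no"]),
            "name": f"{first.title()} {last.title()}"}

# Both sources now agree on one canonical shape: customer_id and name.
canonical = [from_crm(crm_record), from_billing(billing_record)]
```

Each mapping lives in the virtualization layer, so neither the CRM nor the billing system has to conform to an enterprise-wide schema.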
Business-side executives struggle with getting a full picture of their business, customer, or product from the data. They also look for ways to take advantage of new technologies such as Big Data and want to determine whether they are getting left behind by their competitors. Specifically, CMOs ask how they can get better customer information, as the information they get from different reports is often inconsistent.
They want an accurate picture of what is happening across the business so they can avoid misspending or alienating a customer, and data virtualization makes that possible. It allows users to access all the data they need, not just the data that is easily accessible, and gives them a single version of the truth. The result is a more complete picture of the business from any level of the organization, which can have a material impact on business performance.
It seems like data governance is one of the main challenges.
Suresh Chandrasekaran: Yes, it is. Enterprise architects must determine how to guarantee the veracity of reported data when it can come from any one of multiple sources. Even if they know where it came from, how do they know what happened to it between the source and the final output or report? They also have to protect the data and determine who is accessing what data, how and when they are accessing it, and what the impacts are.
Data virtualization simplifies data governance by providing a single layer of access to the data. It provides data lineage, so architects have a clear view of where the data came from and how it was changed, integrated, transformed, cleansed, etc. between the source and the user. The data virtualization layer also serves as a security control point, governing who can access what data and logging who accessed it and when.
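Because every read passes through the layer, access control and auditing can live in one place. A minimal sketch of that idea, with hypothetical users, views, and policy (again, not Denodo code):

```python
# Conceptual sketch: the virtualization layer as a single control point that
# enforces a simple allow-list and records an audit entry for every attempt.
from datetime import datetime, timezone

audit_log = []
ALLOWED = {("alice", "customers"), ("bob", "orders")}  # hypothetical policy

def read_view(user, view, sources):
    """Gate every read: check policy, log the attempt, then return rows."""
    granted = (user, view) in ALLOWED
    audit_log.append({"user": user, "view": view, "granted": granted,
                      "at": datetime.now(timezone.utc)})
    if not granted:
        raise PermissionError(f"{user} may not read {view}")
    return sources[view]

sources = {"customers": [{"id": 1}], "orders": [{"id": 7}]}
rows = read_view("alice", "customers", sources)  # allowed, and audited
```

The same chokepoint that answers "who accessed what, and when" is also where lineage metadata would be attached in a real platform.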
How does data virtualization help with the Internet of Things?
Suresh Chandrasekaran: Not just data virtualization but Denodo Express specifically removes a frustration developers often face: waiting months for new data sources to be prepared for ETL into the data warehouse. This is a real differentiator for businesses that have IT bottlenecks and a project backlog but want new technologies such as Big Data and IoT to be usable now. It streams all data without requiring the reengineering of what is already in place, making it easier to adopt new technologies and integrate them into corporate data processes without waiting on IT.
Free use is cool, of course, but what other aspects of Denodo Express are driving customers to adopt it?
Suresh Chandrasekaran: It’s delivered to users and development teams in the format they prefer, whether that’s SQL for BI and reporting tools, Web Services for Web and mobile applications, Web Parts for SharePoint integration, etc. It features a drag-and-drop UI, eliminating the need for coding when configuring data integration.
Because it’s free for life and there’s no risk, Denodo Express is especially valuable for users who want to try data virtualization but may not have the budget or the authority to get started. They can be up and running in minutes, integrating data assets from disparate, heterogeneous sources regardless of location. Users also get online community-based support and plenty of online tutorials and how-to videos.
Aside from cost, what criteria are most important when selecting a data virtualization product?
Suresh Chandrasekaran: The criteria vary with the needs of developers and enterprise architects, as follows:
- Does the product support a wide range of data sources, including semi-structured and unstructured sources? Is it easy to connect to these data sources?
- Does it support a broad range of data delivery options, e.g., virtual, cached or physical delivery via JDBC, ODBC, ADO.NET, SOAP Web Services, RESTful Web Services (output as XML, JSON, HTML, RSS), Portlets and Data Widgets (Microsoft Web Parts for SharePoint, Java JSR-168 and JSR-286 Portlets), JMS message queues?
- Is it extensible for new data sources and technologies? In essence, is it future-proof?
- Does the product support rapid and agile development or would you need to resort to coding to do anything of medium complexity?
- Does it integrate with source control systems?
- Does it have tools to help with testing and debugging? How easy is it to promote projects from development to test to production environments? Can this be automated easily?
- Does the product allow you to easily create a common data model across all of your data sources?
- Can you transform and cleanse the data on the fly?
- Does it provide a full range of data access patterns across all enterprise information (query, reference, browse, events/alerts, search)?
- Does it scale?
- Can it combine real-time query optimization techniques (rule-based and cost-based), intelligent caching of intermediate result sets, and the pre-processing and staging of data to offload enterprise systems (replication) to provide a high-throughput architecture? Does it combine federation, caching, and replication in a single product?
- Is it secure? Can it automatically detect data source changes and handle impact analysis and change propagation in an integrated fashion? And does it include built-in monitoring and auditing tools with standard integration with the major monitoring tools on the market?
Data virtualization products such as Denodo Express are designed for data architects who are tired of being prisoners to archaic data integration methods, or who are simply frustrated at being unable to realize the true value of their data. Selecting a product against the above criteria will enable companies to realize value from their use of data virtualization.
Suresh Chandrasekaran is senior vice president at Denodo. Throughout his career in product management and marketing roles at Vitria, AltaVista, and Compaq, and as a management consultant at Booz Allen, Suresh has helped businesses leverage information through technology to deliver a competitive advantage. He speaks frequently at conferences on technology and its business impact, drawing from 15+ years of experience in leading integration middleware and Web companies. Contact him at firstname.lastname@example.org.
Kathleen Goolsby is managing editor of SandHill.com.