With 26+ patents in parallel data management and optimization, TigerGraph’s founder and CEO, Dr. Yu Xu, has extraordinary expertise in big data and database systems.
Yu worked on Twitter’s data infrastructure for massive data analytics and led Teradata’s big data initiatives as a Hadoop architect. Beyond an impressive resume, his ability to explain detailed concepts in simple terms made for easy conversation.
M.R. Rangaswami: Graph databases are gaining momentum as more organizations adopt the technology to achieve deeper business insights. What exactly is a graph database?
Yu Xu: The world is more hyper-connected than ever before, and the ability to tap into the power of rich, growing networks – whether that be financial transactions, social media networks, recommendation engines, or global supply chains – will make or break an organization’s bottom line. Given the importance of connections in the modern business environment, it’s critical for database technology to keep up.
Legacy databases (known as relational databases, or RDBMS) were built for well-mapped, stable and predictable processes like finance and accounting. These databases store data in rigid rows, columns and tables. That structure works well when it rarely needs to change, but adjustments are costly and time-consuming when they do need to be made.
The graph database model is built to store and retrieve connections from the ground up. It’s more flexible, scalable and agile than RDBMS, and is the optimal data model for applications that harness artificial intelligence and machine learning.
A graph database stores two kinds of data: entities (vertices) and the relationships between them (edges). This network of interconnected vertices and edges is called a graph. Graph database software stores all the records of these interconnected vertices, attributes, and edges so they can be harnessed by various software applications. AI and ML applications thrive on connected data, and that’s exactly what graph technology delivers.
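To make the vertex-and-edge model concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the data model described above, not how TigerGraph or any other product stores data; the class names (`Vertex`, `Edge`, `Graph`), the type labels, and the sample customer and account records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: str                     # unique identifier of the entity
    vtype: str                   # kind of entity, e.g. "Customer"
    attrs: dict = field(default_factory=dict)


@dataclass
class Edge:
    src: str                     # id of the source vertex
    dst: str                     # id of the target vertex
    etype: str                   # kind of relationship, e.g. "OWNS"
    attrs: dict = field(default_factory=dict)


class Graph:
    """A toy property graph: vertices, edges, and attributes on both."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vid, vtype, **attrs):
        self.vertices[vid] = Vertex(vid, vtype, attrs)

    def add_edge(self, src, dst, etype, **attrs):
        # An edge only makes sense between two known vertices.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append(Edge(src, dst, etype, attrs))

    def neighbors(self, vid, etype=None):
        # Follow outgoing edges, optionally filtered by relationship type.
        return [e.dst for e in self.edges
                if e.src == vid and (etype is None or e.etype == etype)]


g = Graph()
g.add_vertex("alice", "Customer", city="Austin")
g.add_vertex("acct-1", "Account", balance=250.0)
g.add_vertex("acct-2", "Account", balance=90.0)
g.add_edge("alice", "acct-1", "OWNS")
g.add_edge("acct-1", "acct-2", "TRANSFERRED_TO", amount=40.0)

print(g.neighbors("alice"))
print(g.neighbors("acct-1", "TRANSFERRED_TO"))
```

The relationships here are stored as records in their own right, with their own type and attributes, rather than being implied by matching key columns across tables.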
M.R.: What’s the difference between native and non-native graph databases?
Yu: As graph technology grows in popularity, more database vendors offer “graph” capabilities alongside their existing data models. The trouble with these graph add-on offerings is that they’re not optimized to store and query the connections between data entities. If an application frequently needs to store and query data relationships, it needs a native graph database.
The key difference between native and non-native graph technology is what each was originally engineered for. A native graph database uses something called index-free adjacency, in which each vertex physically points to the vertices it is connected to, so that queries over connected data are highly performant. Essentially, if a database model is specifically engineered to store and query connected data, it’s a native graph database. If the database was first engineered for a different data model and added “graph” capabilities later, then it’s a non-native graph database. Non-native graph data storage is often slower because all of the relationships in the graph have to be translated into a different data model for every graph query.
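The contrast can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any vendor’s storage engine: the `Node` class, the `edge_table` list, and the sample names are hypothetical. In the first style each vertex holds direct references to its neighbors; in the second, relationships live in a separate table that must be searched on every hop.

```python
class Node:
    """Native-style vertex: it holds direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.out = []            # references to neighboring Node objects


def link(a, b):
    a.out.append(b)


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
link(alice, bob)
link(bob, carol)


def neighbors_native(node):
    # One hop is a pointer dereference; the cost depends only on
    # how many neighbors this vertex has.
    return [n.name for n in node.out]


# Non-native style: relationships are rows in a separate table.
edge_table = [("alice", "bob"), ("bob", "carol")]


def neighbors_table(name):
    # One hop is a search over the edge rows. A real system would use
    # an index rather than a full scan, but that lookup still happens
    # on every hop.
    return [dst for src, dst in edge_table if src == name]


def two_hops_native(node):
    return [m.name for n in node.out for m in n.out]


def two_hops_table(name):
    return [far for mid in neighbors_table(name) for far in neighbors_table(mid)]


print(two_hops_native(alice))
print(two_hops_table("alice"))
```

Both approaches return the same answer. The difference is in how much work each hop costs as the graph grows.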
M.R.: What are some ways that businesses are leveraging graph databases?
Yu: The use cases for graph technology are vast, diverse, and growing. If an application frequently queries and harnesses the relationships between users, products, locations, or any other entities, it will benefit from a native graph database. The same is true if a use case leverages network effects or requires multiple-hop queries across data.
Some of the most popular use cases for graph include fraud detection, recommendation engines, supply chain management, cybersecurity, anti-money laundering, and customer 360, just to name a few. If your enterprise relies on graph analytics or graph data science, then it needs a native graph database to ensure real-time performance for mission-critical applications.
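As a rough illustration of the multiple-hop queries mentioned above, the following Python sketch finds every vertex reachable within a given number of hops using a breadth-first search. The function name `within_hops`, the adjacency dictionary, and the account ids are hypothetical; a production graph database would run this kind of traversal inside its own query engine.

```python
from collections import deque


def within_hops(adj, start, max_hops):
    """Return {vertex: hop_count} for vertices within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if hops[v] == max_hops:
            continue             # do not expand past the hop limit
        for n in adj.get(v, ()):
            if n not in hops:    # first visit is the shortest path in BFS
                hops[n] = hops[v] + 1
                queue.append(n)
    del hops[start]              # the start vertex is not part of the result
    return hops


# Hypothetical transfer graph: which accounts sent money to which.
adj = {
    "acct-1": ["acct-2", "acct-3"],
    "acct-2": ["acct-4"],
    "acct-3": ["acct-4"],
    "acct-4": ["acct-5"],
}

result = within_hops(adj, "acct-1", 2)
print(result)
```

In a fraud-detection setting, a query like this could answer a question such as “which accounts are within two transfers of a flagged account?”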
M.R. Rangaswami is the Co-Founder of Sandhill.com.