Graph DB

“Relational databases are not the right tool for every job,” says Neo4j. “Tabular data with a consistent structure and fixed schema is a perfect fit for a relational database (RDBMS). But if your application demands flexibility or highly connected data, then it’s time to look for an alternative.”

The differences between a relational and graph database

In many cases, that alternative will be a graph database, which treats the connections between data points as being just as important as the data points themselves. In a traditional RDBMS, those connections are not stored as such. They are discovered, and defined, by the code that extracts and uses the tabular data, typically by joining tables on matching keys, which increases both application complexity and the time required to run a query. Not so with a graph database, in which the relationships are already recorded and available for use right away.
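To make the contrast concrete, here is a minimal sketch in plain Python rather than a real database engine. The names, tables and helper functions are all hypothetical. The relational version has to rebuild each friendship by matching IDs across two tables, while the graph version simply follows connections that are already stored.

```python
# Hypothetical data: the same "who knows whom" facts stored two ways.

# Relational style: two tables. The connection has to be rebuilt at query
# time by matching keys across them (in effect, a join).
people = [
    {"id": 1, "name": "Asha"},
    {"id": 2, "name": "Ben"},
    {"id": 3, "name": "Chloe"},
]
knows = [  # (person_id, friend_id)
    (1, 2),
    (1, 3),
]


def friends_relational(name):
    """Find friends by scanning both tables and matching IDs."""
    person_ids = [p["id"] for p in people if p["name"] == name]
    friend_ids = [f for (p, f) in knows if p in person_ids]
    return sorted(p["name"] for p in people if p["id"] in friend_ids)


# Graph style: each node already holds its outgoing connections.
graph = {
    "Asha": ["Ben", "Chloe"],
    "Ben": [],
    "Chloe": [],
}


def friends_graph(name):
    """Find friends by following the stored edges directly."""
    return sorted(graph[name])


print(friends_relational("Asha"))
print(friends_graph("Asha"))
```

Both functions return the same answer. The difference is where the work happens: the first reconstructs the relationship on every call, while the second reads it straight off the node.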


Why graph databases?

Graph databases are often more intuitive than the RDBMS since, as humans, we already tend to ‘picture’ ideas in a similar way. We are all familiar with the family tree or the business organisation chart, which use lines to define the connections (called ‘edges’ in a graph database) between the people (data points, or ‘nodes’ in graph speak). Without the lines, the chart would be neither as rich nor as informative, since they’re an integral part of the data being presented – just like the edges in a graph database.

Graph databases also store ‘properties’ on each node, which could similarly be represented in the family tree or organisational chart as additional information, like dates of birth, job titles, or the date someone joined a company.
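The three building blocks described above (nodes, edges and properties) can be sketched with a couple of small Python classes. This is an illustration only: the class names, the `REPORTS_TO` label and the people in it are assumptions, not the data model of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data point, such as a person in an organisation chart."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A labelled, directed connection between two nodes."""
    source: str
    target: str
    label: str


# Hypothetical two-person organisation chart.
nodes = {
    "Asha": Node("Asha", {"job_title": "Director", "joined": 2015}),
    "Ben": Node("Ben", {"job_title": "Engineer", "joined": 2019}),
}
edges = [Edge("Ben", "Asha", "REPORTS_TO")]


def describe(edge):
    """Render one edge together with a property from each endpoint."""
    src, dst = nodes[edge.source], nodes[edge.target]
    return (
        f"{src.name} ({src.properties['job_title']}) {edge.label} "
        f"{dst.name} ({dst.properties['job_title']})"
    )


print(describe(edges[0]))
```

Remove the `edges` list and the two nodes become unrelated records, which is the same loss of information as erasing the lines from the chart.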

What are graph databases good for?

When using a graph database rather than an RDBMS to record relationships between different objects, developers are saved the additional step of reformatting the data to fit a rigid structure of columns and rows. Equally, when they later extract the data, there’s no need to reverse the process, which removes the risk that either operation might introduce errors or rob the data of vital elements.

The evolution of graph databases

Graph databases are a rapidly developing technology, and new uses are being found all the time. This has led to the emergence of several competing technologies. Although there is, as yet, no single dominant standard, query languages like Property Graph Query Language (PGQL) are emerging, giving applications a degree of portability.

Types of graph databases

There are also several different types of graph database to choose from, although two of the most common options are RDF (Resource Description Framework) graphs and property graphs.

Resource Description Framework (RDF) graphs

RDF graphs give data points greater context by supplementing them with extensive metadata, with each fact expressed as a subject-predicate-object statement. This makes them ideal for openly published or highly structured information. They are widely used by governments and research organisations to present publicly accessible data for incorporation by third parties.
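RDF expresses each fact as a triple of subject, predicate and object. The sketch below models that idea with plain Python tuples instead of a real RDF library. The `ex:` names and the dataset they describe are invented for illustration, and `match` is a hypothetical helper, not part of any RDF toolkit.

```python
# Each fact is one (subject, predicate, object) triple.
# Every "ex:" identifier below is made up for this example.
triples = {
    ("ex:Dataset1", "rdf:type", "ex:Dataset"),
    ("ex:Dataset1", "ex:publishedBy", "ex:AgencyA"),
    ("ex:Dataset1", "ex:licence", "ex:OpenLicence"),
    ("ex:Dataset2", "rdf:type", "ex:Dataset"),
    ("ex:Dataset2", "ex:publishedBy", "ex:AgencyB"),
}


def match(s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "Which things are datasets?"
datasets = [s for (s, _, _) in match(p="rdf:type", o="ex:Dataset")]
print(datasets)

# "What do we know about ex:Dataset1?"
for triple in match(s="ex:Dataset1"):
    print(triple)
```

Because every statement has the same three-part shape, a third party can merge triples from different publishers into one set without first agreeing on a table layout.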

Property graphs

Property graphs, meanwhile, are ideal for use in operations that focus on analytics, as they not only supplement the data points with descriptors but also describe the relationships between them. This gives each ‘edge’ within the database a context of its own that can be used in queries.
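As a rough illustration of edges carrying their own context, the sketch below stores a type and a `since` year on each relationship and then filters on them. The field names, the data and the `edges_where` helper are assumptions made for this example. A real property graph database would express the same filter in its own query language.

```python
# Hypothetical edges, each with its own properties ("type" and "since").
edges = [
    {"source": "Asha", "target": "Ben", "type": "FOLLOWS", "since": 2018},
    {"source": "Asha", "target": "Chloe", "type": "FOLLOWS", "since": 2023},
    {"source": "Ben", "target": "Chloe", "type": "BLOCKS", "since": 2021},
]


def edges_where(edge_type, min_since):
    """Return (source, target) pairs for edges of a given type that were
    created in or after the given year."""
    return [
        (e["source"], e["target"])
        for e in edges
        if e["type"] == edge_type and e["since"] >= min_since
    ]


# The query is answered from edge properties alone,
# without inspecting any node.
print(edges_where("FOLLOWS", 2020))
```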

Use cases for graph databases

Unsurprisingly, graph databases are of particular interest to social networks as they allow administrators (via machine learning and artificial intelligence) to better understand the connections between members. This helps them to surface relevant content which, in turn, increases member loyalty and time spent on site.

Graph databases used in fraud detection

However, they can be used in any instance where analysis of each ‘edge’ will deliver insights. For example, graph databases are particularly useful for detecting fraud and money laundering, where criminals move money around to obscure its origin. While a traditional database might be able to plot the location of a financial instrument at a specific point, a graph database can map its movements and compare them to other assets following similar tracks.
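One pattern an analyst might look for is money that leaves an account, passes through intermediaries and returns to where it started. The sketch below finds such round trips with a simple depth-first search over a made-up transfer graph. The account names and the `find_round_trips` function are hypothetical, and real fraud detection systems use far richer signals than this.

```python
# Hypothetical transfers: each account maps to the accounts it paid.
transfers = {
    "acct_A": ["acct_B", "acct_C"],
    "acct_B": ["acct_D"],
    "acct_C": ["acct_D"],
    "acct_D": ["acct_A"],  # funds arrive back at the origin
    "acct_E": ["acct_A"],
}


def find_round_trips(start):
    """Return every simple path that leaves `start` and comes back to it."""
    round_trips = []
    stack = [(start, [start])]  # (current account, path taken so far)
    while stack:
        account, path = stack.pop()
        for nxt in transfers.get(account, []):
            if nxt == start:
                round_trips.append(path + [start])
            elif nxt not in path:  # never revisit an account on this path
                stack.append((nxt, path + [nxt]))
    return round_trips


for trip in sorted(find_round_trips("acct_A")):
    print(" -> ".join(trip))
```

Here two different routes both return funds to `acct_A` via `acct_D`, the kind of shared track that is awkward to spot when each transfer is only a row in a table.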

Graph databases used for tracking financial risk

“Since the relationships in [a] graph database are treated with as much value as the database records themselves, the engine that navigates the connections between nodes can do so efficiently, enabling millions of connections per second,” explains Jackie Vendetti at Tibco. “Graph database enables quick extraction of new insight from large and complex databases to help uncover unknown interactions and relationships. This means that with a graph database, banks can process data and compute risks quicker than today’s current relational databases so they can spot opportunities and threats before the competition.”

Graph databases used for Covid contact tracing

Similarly, when coronavirus started to spread in Hainan province, Chinese researchers used graph databases for contact tracing, allowing them to map potential patients and locations that could present a high likelihood of further infection.
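The general idea behind graph-based contact tracing can be sketched as a breadth-first search that starts at a known case and stops after a set number of hops. This is not the method the researchers used, which the article does not describe. The contact data, identifiers and `exposed_within` function are all invented for illustration.

```python
from collections import deque

# Hypothetical contact graph: each person maps to the people they met.
contacts = {
    "patient_0": ["p1", "p2"],
    "p1": ["patient_0", "p3"],
    "p2": ["patient_0"],
    "p3": ["p1", "p4"],
    "p4": ["p3"],
}


def exposed_within(start, max_hops):
    """Return {person: hops} for everyone reachable from `start`
    in at most `max_hops` contacts, excluding `start` itself."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in contacts.get(person, []):
            if other not in distance:
                distance[other] = distance[person] + 1
                queue.append(other)
    del distance[start]
    return distance


print(exposed_within("patient_0", 2))
```

Raising `max_hops` widens the net. With a limit of 2, `p4` is left out because that person is three contacts away from the starting case.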

Do graph databases signal the end of relational databases?

So, if graph databases are so much faster and more flexible, does this spell the end of the traditional relational database? Far from it. Graph databases are ideal for specific use cases involving artificial intelligence and machine learning, and for dealing with large amounts of data – but traditional databases still have a future in more conventional use cases.

SQL, and its associated tools, processes, applications and stored data, have a heritage stretching back almost 50 years. They are well understood, and ideal for storing much of the data we collect on a daily basis. This is particularly true of data that is highly structured, can be easily tabulated, and analysed at the point of use.

Graph databases certainly have a bright future, and their use will continue to grow, but the traditional RDBMS shows no sign of becoming irrelevant any time soon.
