What is a graph database?

A graph database is a special type of database that stores data in the form of nodes and edges. This approach enables efficient modelling and querying of complex relationships. Graph databases are therefore particularly suitable for applications that map highly interconnected information.

What is a graph database made of and what is it used for?

A graph database is, as the name suggests, based on graphs. A graph represents interconnected items of information together with the relationships between them and stores everything as one large, connected dataset.

The graphs consist of nodes, which are uniquely identifiable data entities or objects, and edges, which represent the relationships between these objects. Visually, the two components are drawn as points and lines. Each edge has a start node and an end node, while each node can have any number of relationships with other nodes, whether incoming, outgoing, or undirected.
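As a minimal sketch of this structure, nodes and directed edges can be held in plain Python collections. The node names and the `degree` helper below are hypothetical and only illustrate the idea; they are not taken from any particular graph database.

```python
from collections import defaultdict

# Hypothetical sample data: node identifiers and directed edges (start, end).
nodes = {"alice", "bob", "carol"}
edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]

# Adjacency lists: for each node, the nodes one edge away in each direction.
outgoing = defaultdict(list)
incoming = defaultdict(list)
for start, end in edges:
    outgoing[start].append(end)
    incoming[end].append(start)


def degree(node):
    """Return (number of incoming edges, number of outgoing edges)."""
    return len(incoming[node]), len(outgoing[node])


print(degree("alice"))  # (0, 2)
print(degree("carol"))  # (2, 0)
```

Keeping the neighbours of each node in a list means a node's relationships can be read directly, without scanning all edges.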

Graph databases are used, for example, to analyse user relationships in social networks or the purchasing behaviour of customers in online shops. Because the relationships themselves are stored, the database can produce product and friend recommendations and build networks of people and products.
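One common way to derive friend recommendations from stored relationships is to count mutual friends. The sketch below assumes a small, hypothetical friendship graph and a hypothetical `recommend_friends` function; real systems typically combine several such signals.

```python
from collections import Counter

# Hypothetical undirected friendship graph: person -> set of friends.
friends = {
    "ana": {"ben", "cem"},
    "ben": {"ana", "cem", "dia"},
    "cem": {"ana", "ben", "dia", "eli"},
    "dia": {"ben", "cem"},
    "eli": {"cem"},
}


def recommend_friends(graph, person):
    """Rank people who are not yet friends by their number of mutual friends."""
    mutual = Counter()
    for friend in graph[person]:
        for candidate in graph[friend]:
            if candidate != person and candidate not in graph[person]:
                mutual[candidate] += 1
    # Sort by count (descending), then by name for a stable order.
    return sorted(mutual.items(), key=lambda item: (-item[1], item[0]))


print(recommend_friends(friends, "ana"))  # [('dia', 2), ('eli', 1)]
```

The function only looks at the friends of the person's friends, so its work depends on the size of that neighbourhood rather than on the whole graph.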

Note

Relational databases store data in tables and use SQL for queries. In contrast, graph databases belong to the NoSQL family and offer a more flexible structure for efficiently handling complex relationships between data.

Types of graph data models

There are different models that describe how the data in a graph DBMS is structured. The best-known are the labeled property graph and the Resource Description Framework (RDF).

Labeled property graph

In the labeled property graph (LPG), the nodes and edges of the graph can carry labels and properties. Labels serve for categorisation so that, for example, a node can be marked as a ‘person’ or ‘company’. Properties are key-value pairs that store further information about the entity or relationship, such as names, ages, or geographic coordinates.

This structure enables very flexible and powerful data querying because relationships and properties are stored directly in the database and can be retrieved through simple queries. LPGs are particularly well-suited for modelling complex networks in which entities and their connections are described in different contexts.
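The following sketch models a tiny labeled property graph in Python. The `Node` and `Edge` classes, the `WORKS_AT` relationship type, and all sample values are hypothetical and chosen only for illustration; a real graph DBMS would expose this through its own query language.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # key-value pairs


@dataclass
class Edge:
    start: int                                       # id of the start node
    end: int                                         # id of the end node
    rel_type: str                                    # category of the relationship
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
nodes = {
    1: Node(1, {"Person"}, {"name": "Ada", "age": 36}),
    2: Node(2, {"Company"}, {"name": "Example Ltd"}),
    3: Node(3, {"Person"}, {"name": "Max", "age": 41}),
}
edges = [
    Edge(1, 2, "WORKS_AT", {"since": 2019}),
    Edge(3, 2, "WORKS_AT", {"since": 2021}),
]


def employees_of(company_name):
    """Names of Person nodes with a WORKS_AT edge to the named Company node."""
    result = []
    for edge in edges:
        source, target = nodes[edge.start], nodes[edge.end]
        if (edge.rel_type == "WORKS_AT"
                and "Person" in source.labels
                and "Company" in target.labels
                and target.properties.get("name") == company_name):
            result.append(source.properties["name"])
    return result


print(employees_of("Example Ltd"))  # ['Ada', 'Max']
```

The query filters on labels, relationship type, and properties at once, which is the combination that makes the LPG model expressive.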

Resource description framework

In the resource description framework (RDF), information is organised in triples consisting of subject, predicate, and object, providing a simple structure for representing relationships between entities. Each triple represents a statement where the subject designates the resource, the predicate describes the property or relationship, and the object represents the value or another resource.

With RDF, data can be linked in a standardised way, allowing it to be combined and retrieved across different systems. This flexibility makes RDF particularly useful for applications that depend on connecting data from various sources, such as knowledge graphs.
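To illustrate the triple structure, the sketch below stores statements as plain Python tuples and matches them against a pattern. The `ex:` names and the `match` helper are hypothetical; real RDF data identifies resources with IRIs and is usually queried with SPARQL.

```python
# Each triple is (subject, predicate, object). All "ex:" names are hypothetical.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:name", "Acme"),
]


def match(pattern, store):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        triple for triple in store
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


# Everything stated about ex:alice.
print(match(("ex:alice", None, None), triples))
# Every resource that has a name.
print(match((None, "ex:name", None), triples))
```

Because every statement has the same three-part shape, triples from different sources can be merged into one store simply by concatenating them.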

How do queries work in a graph database?

When working with a graph database, various query methods are used, mainly because there is no single query language that all products share. Unlike traditional models, graph databases also rely on graph algorithms to fulfil their primary task: simplifying and accelerating complex data queries.

The most important algorithms include depth-first search and breadth-first search: depth-first search follows one path as deep as possible before backtracking, while breadth-first search visits all neighbours at one level before moving on to the next. These algorithms make it possible to find patterns (called graph patterns) as well as direct and indirect neighbouring nodes. Other algorithms calculate the shortest path between two nodes or identify cliques (subsets of nodes in which every node is connected to every other) and hotspots (highly connected nodes). One strength of graph databases is that relationships are stored directly in the database, so they do not need to be computed at query time. This results in high performance even for complex queries.
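The two traversal strategies and a shortest-path search can be sketched in a few lines of Python. The sample graph and the function names are hypothetical, and the shortest path here is measured in number of edges (an unweighted graph), which is an assumption of this example.

```python
from collections import deque

# Hypothetical undirected graph as adjacency lists.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def bfs_order(graph, start):
    """Breadth-first search: visit nodes level by level."""
    visited, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


def dfs_order(graph, start, visited=None):
    """Depth-first search: follow one branch as deep as possible, then backtrack."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbour in graph[start]:
        if neighbour not in visited:
            dfs_order(graph, neighbour, visited)
    return visited


def shortest_path(graph, start, goal):
    """Shortest path by number of edges, found with breadth-first search."""
    parents, queue = {start: None}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start via the parents
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None  # goal is not reachable from start


print(bfs_order(graph, "A"))           # ['A', 'B', 'C', 'D', 'E']
print(dfs_order(graph, "A"))           # ['A', 'B', 'D', 'C', 'E']
print(shortest_path(graph, "A", "E"))  # ['A', 'B', 'D', 'E']
```

Both traversals only touch nodes reachable from the start node, which is why their cost follows the size of the explored neighbourhood.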

Advantages and disadvantages of graph databases

The strength of a database can primarily be measured by four factors: integrity, performance, efficiency, and scalability. The main purpose of graph databases is to make data queries faster and easier. Where relational databases reach their performance limits on heavily interlinked data, the graph-based model remains agile, because the cost of a query depends mainly on the part of the graph it traverses rather than on the total size of the dataset.

In addition, the graph database model allows real-world scenarios to be stored in a natural way. The structure resembles the way people think about connections, which makes it easy to understand. However, graph databases are not all-purpose solutions. They reach their limits in terms of scalability, for example: many are primarily designed for a single-server architecture, and splitting a highly connected graph across several servers so that few relationships cross server boundaries is a mathematically hard problem. There is also no single standardised query language.
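The scaling difficulty can be made concrete by counting how many relationships end up crossing server boundaries when nodes are distributed. The sketch below is only an illustration under assumed data: the edges, the two node-to-server assignments, and the `cut_edges` helper are all hypothetical.

```python
# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by one edge.
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]


def cut_edges(edges, assignment):
    """Count edges whose two endpoints are stored on different servers."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


# Assignment that follows the structure of the graph (one triangle per server).
by_cluster = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
# Assignment that ignores the structure (servers alternate by node name).
alternating = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(cut_edges(edges, by_cluster))   # 1
print(cut_edges(edges, alternating))  # 5
```

Every cut edge means a traversal has to hop between servers. Finding a balanced assignment with few cut edges is computationally hard in general, and densely connected graphs may not have a good one at all.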

The advantages and disadvantages of graph databases at a glance:

Advantages	Disadvantages
✓ query speed depends mainly on the number of relationships traversed, not on the total amount of data	✗ harder to scale because of single-server architecture
✓ results delivered in real time	✗ no single standardised query language
✓ clear and intuitive representation of relationships	
✓ flexible and agile structures	

Graph databases should not be considered a universal or inherently better replacement for traditional databases. Relational structures remain useful standard models that guarantee high integrity and stability of data and allow flexible scalability. As is often the case, the key is the intended use.

Graph database comparisons

There are various graph database systems suited to different use cases. Below are four popular examples:

Neo4j: Neo4j is widely regarded as the most popular graph DBMS; its Community Edition is open source.
Amazon Neptune: This graph database is available through the Amazon Web Services public cloud and was released in 2018 as a high-performance managed database.
SAP HANA Graph: With SAP HANA, the developer SAP created a platform built on a relational database management system and enhanced it with the integrated graph-based model SAP HANA Graph.
OrientDB: This multi-model database combines document-oriented and graph-based approaches and is regarded as one of the faster systems of its kind.

A direct comparison shows that these databases offer various features that can be helpful depending on the specific use case:

	Neo4j	Amazon Neptune	SAP HANA Graph	OrientDB
type	native	managed (cloud)	graph extension	multi-model
query languages	Cypher	SPARQL, Gremlin, openCypher	SQL-based	SQL-like, Gremlin
data model(s)	property graph	property graph, RDF	relational, graph model	graph, documents
typical use cases	social networks, fraud detection, recommendation services, network management	knowledge graphs, identity and access management, cloud-native apps	business analytics, IoT, financial analysis, SAP applications	content management, complex data relationships, distributed systems

Reviewer

Julia Hertler
With over 18 years of experience in content marketing, Julia Hertler has deep expertise in digital communications. For the past 10 years, she has specialised in the areas of domains and hosting at the IONOS Digital Guide, making complex technical topics easy to understand.
