A database collects data and links it into a logical unit. The individual data units are provided with meta-descriptions and the information necessary for their processing. Databases are extremely useful for managing large data sets and facilitating the retrieval of specific information. In addition, many databases let you define rights that determine which persons or programs are authorised to access which data. Databases also serve to present content in a clear, demand-oriented manner.

Database systems differ conceptually from each other and so have individual strengths and weaknesses. However, all of them are subdivided into the database and the database management system. The “database” in this narrower sense refers to the complete set of data to be managed. The database management system (DBMS) is responsible for administration and determines structure, order, access rights, dependencies, etc. It often uses a specially defined database language and a suitable database model that defines the database system’s architecture.

Many of these systems can only be read by specific, precisely defined database ap­plic­a­tions. Nowadays, there is often confusion when a database program is just referred to as a “database” without other spe­cific­a­tions. The term is also often used when referring to simple col­lec­tions of files. Tech­nic­ally, however, a folder on a computer that contains many files, for example, is not yet a database.

Defin­i­tion: Databases

Databases are logically structured systems for electronic data management. They use database management systems to control relationships and access rights and to store information about the data they contain. Most databases can only be opened, edited, and read using special database applications.

Why do we need databases?

To make electronic data processing structurally efficient, the concept of the electronic database as a separate software layer between the operating system and the application program was developed in the 1960s. It was the result of practical experience: working manually with individual files, and supervising and granting access rights to them, simply proved too difficult, so the database system was developed to make the task easier. The idea for the electronic database system was one of the most important innovations in the development of the computer.

First, network and hierarchical database models were developed. However, these soon proved to be too simple and technically limited. IBM achieved a major breakthrough in the 1970s with the development of the much more powerful relational database model, which quickly spread into working life. The database language SQL also originated at IBM during this period. The most successful products of this time were Oracle’s SQL-based database system and IBM’s own products SQL/DS and DB2.

Well-known manufacturers dominated the market for database software until the 2000s, when several open source projects brought in a breath of fresh air. The most popular freely accessible systems include MySQL and PostgreSQL. The trend towards NoSQL systems, which began in 2001, continued the move away from the proprietary relational database systems of the established manufacturers.

Today, it is impossible to imagine many areas of application without database systems. Virtually all business software is based on powerful databases that provide system administrators with extensive options and tools. In addition, data security has become an increasingly important topic in database systems. After all, passwords, personal information and even electronic currencies are stored and encrypted in electronic databases.

The modern financial system, for example, can be imagined as a network of databases. Most sums of money exist as elec­tron­ic in­form­a­tion units – pro­tect­ing this in­form­a­tion with the help of secure databases is an essential task for financial in­sti­tu­tions. Not least because of this, elec­tron­ic databases are extremely important for modern civil­isa­tion.

Functions and re­quire­ments of a database man­age­ment system (DBMS)

A widely used term to describe functions and re­quire­ments for trans­ac­tions in a database man­age­ment system is ACID, an acronym for atomicity, con­sist­ency, isolation, and dur­ab­il­ity. The partial terms of ACID cover the most important re­quire­ments for a DBMS:

  • Atomicity is the “all or nothing” property of a DBMS: a transaction is either executed completely, with all its steps in the correct order, or not at all
  • Consistency requires that every successful transaction leaves the database in a valid state, which means each transaction has to be checked against the rules defined for the data
  • Isolation is the requirement that concurrent transactions do not “stand in each other’s way,” which is usually achieved with locking functions
  • Durability means that once a transaction has been completed successfully, its data remains stored permanently in the DBMS, even in the event of system errors or DBMS failures. Transaction logs, which record all processes in the DBMS, are essential for durability
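The “all or nothing” behaviour can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the accounts table, the account names, and the helper functions are hypothetical and only serve as illustration. A CHECK constraint plays the role of a consistency rule, and a transfer that would violate it is rolled back as a whole.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, sender, receiver, amount):
    """Move `amount` between two accounts as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed: the credit above is undone as well.
        return False


def balances(conn):
    """Return all balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_db()
print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

In the second call the receiver is credited first, but the debit would push the sender below zero, so the whole transaction is rolled back and both balances stay as they were after the first transfer.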

The following is a further sub­di­vi­sion of the functions and re­quire­ments that a database man­age­ment system might have, beyond the ACID model.

| Function/requirements | Explanation |
| --- | --- |
| Saving data | Databases store electronic texts, documents, passwords, and other information that can be accessed through queries. |
| Data revision | Most databases allow you to edit stored information directly, depending on access rights. |
| Deleting data | Data records contained in databases can be deleted completely. In some cases deleted data can be recovered, in others the information is lost forever. |
| Metadata management | Information is usually stored in databases with metadata or metatags. These create order within the database and make a search function possible, for example. Access rights are also often regulated through metadata. Data management follows four fundamental operations: create, read/retrieve, update, and delete. This concept is known as the CRUD principle and is the basis for data management. |
| Data security | Databases need to be secure to prevent unauthorised access to stored data. In addition to a powerful encryption process, careful management (particularly by the main administrator) is essential for data security. Data security usually means taking technical precautions to prevent manipulation or loss of data. It is a core concept of privacy. |
| Data integrity | Data integrity means that data within a database follows certain rules to ensure the data’s accuracy and define the business logic of the database. This is the only way to ensure that the database functions consistently as a whole. There are four of these rules in relational database models: domain integrity, entity integrity, referential integrity, and logical consistency. |
| Multi-user mode | Database applications allow access to the database from different devices. In multi-user operations, the distribution of rights and data security are fundamental. Another challenge for multi-user databases is how to keep data consistent with simultaneous read and write access for multiple users without affecting performance too much. |
| Query optimisation | On the technical side, a database must be able to process each query as efficiently as possible to guarantee good performance. If a database takes “too many paths” to answer a data query, the overall database performance suffers. |
| Triggers and stored procedures | These procedures are mini-applications stored within a database management system that are called upon (“triggered”) when certain actions occur. Among other things, this improves data integrity. In relational databases, database triggers and stored procedures are typical processes; the latter can also contribute to system security if users are only allowed to perform actions with prefabricated procedures. |
| System transparency | System transparency is particularly relevant for distributed systems: by hiding data distribution and implementation from the user, the use of the distributed database resembles that of a centralised database. Different levels of system transparency reveal or obscure the background processes. However, the main function is to simplify use as much as possible. |
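The four CRUD operations and the trigger mechanism from the table above can be sketched in a few lines. The example assumes Python's built-in sqlite3 module; the table names products and price_log, the trigger name, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price REAL NOT NULL
    );
    CREATE TABLE price_log (product_id INTEGER, old_price REAL, new_price REAL);

    -- Trigger: runs automatically whenever a product's price is updated.
    CREATE TRIGGER log_price_change AFTER UPDATE OF price ON products
    BEGIN
        INSERT INTO price_log VALUES (OLD.id, OLD.price, NEW.price);
    END;
    """
)

# Create
cur = conn.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("lamp", 20.0))
product_id = cur.lastrowid

# Read
row = conn.execute(
    "SELECT name, price FROM products WHERE id = ?", (product_id,)
).fetchone()

# Update (this fires the trigger, which writes one row to price_log)
conn.execute("UPDATE products SET price = ? WHERE id = ?", (25.0, product_id))
log = conn.execute("SELECT * FROM price_log").fetchall()

# Delete
conn.execute("DELETE FROM products WHERE id = ?", (product_id,))
remaining = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

print(row, log, remaining)
```

The application code never writes to price_log itself; the DBMS does so on its own, which is how triggers can help keep related data consistent.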
Note

If you run your own database, a com­pre­hens­ive data backup method is extremely important!

What kind of databases are there?

The commonly used database models differ in more than just their stage of technical development. Their design was driven by efficiency and user-friendliness, but also by the competition between well-known manufacturers.

Hier­arch­ic­al database model

The oldest database model is the hierarchical one. It has since been largely replaced by the relational database and other models. However, the hierarchical model has recently seen renewed use: XML, for example, stores data in this simple tree structure. Insurance companies and banks also still use hierarchical databases here and there, especially for older database applications. The best known hierarchical database system is IMS/DB from IBM.

Hier­arch­ic­al databases have very clear de­pend­en­cies. This means that each data record has exactly one pre­de­cessor (parent-child re­la­tion­ships, PCR) except the root of the database. This leads to the tree structure shown above. Although each “child” can have only one “parent,” each “parent” can have any number of “children.” Because of the strict hier­arch­ic­al order, layers that are not directly adjacent cannot interact with each other. Also, a con­nec­tion between two different trees cannot be easily es­tab­lished. Hier­arch­ic­al database struc­tures are therefore extremely in­flex­ible, albeit very clear.

Data records that have “children” are called “records.” Data records without “children” are called “leaves”; they usually contain the actual documents in a hierarchical database, while records are used to organise the leaves. Each query in a hierarchical database reaches a leaf by starting at the root and passing through the records above it.
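The strict parent-child structure can be illustrated with a minimal in-memory sketch. The Record class and the sample names are hypothetical and do not reflect how IMS/DB or any other product is implemented; the point is only that every entry has exactly one parent and therefore exactly one access path from the root.

```python
class Record:
    """A data record with at most one parent and any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_leaf(self):
        """A data record without children holds the actual data."""
        return not self.children

    def path_from_root(self):
        """Return the single access path from the root down to this record."""
        path = []
        node = self
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


root = Record("bank")
customers = Record("customers", root)
branches = Record("branches", root)
contract = Record("contract_17", customers)

print(contract.path_from_root(), contract.is_leaf(), customers.is_leaf())
```

Linking contract_17 to branches as well is not possible in this structure without duplicating the data, which is the inflexibility described above.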

Network type database model

The network database model was designed at approximately the same time as the relational database model, even though the relational model outperformed it in the long term. Unlike in the hierarchical model, records have no strict parent-child relationships. Each data record can have several predecessors, which results in a net-like structure. As a consequence, there is no single, unique access path to a data record.

The data set in the middle of the graph can theoretically be reached from five other data sets. At the same time, access to the middle data set allows access to five additional data sets. Dependencies can also be defined in the network database model: the topmost record has no direct link to the one on the far right, so access has to go through the middle one (which can then allow or deny it). On the other hand, the user can directly access the data record in the upper left-hand corner. Data sets can be added and removed fluidly in the network model without significantly interfering with the overall structure.
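A small sketch shows how several predecessors lead to several access paths. The NetworkDB class and the record names (top, left, middle, right) are hypothetical; they loosely follow the description above and are not taken from UDS, DMS, or any other product.

```python
class NetworkDB:
    """Records connected by directed links; a record may have several predecessors."""

    def __init__(self):
        self.links = {}  # record name -> set of record names it points to

    def link(self, owner, member):
        self.links.setdefault(owner, set()).add(member)
        self.links.setdefault(member, set())

    def predecessors(self, record):
        """All records that link directly to `record`."""
        return sorted(o for o, members in self.links.items() if record in members)

    def paths(self, start, goal, seen=()):
        """Every cycle-free access path from `start` to `goal`."""
        if start == goal:
            return [[goal]]
        seen = seen + (start,)
        found = []
        for nxt in sorted(self.links.get(start, ())):
            if nxt not in seen:
                for rest in self.paths(nxt, goal, seen):
                    found.append([start] + rest)
        return found


db = NetworkDB()
db.link("top", "middle")
db.link("left", "middle")
db.link("middle", "right")
db.link("left", "right")

print(db.predecessors("middle"))
print(db.paths("top", "right"))
print(db.paths("left", "right"))
```

Here top can only reach right through middle, while left has both a direct link and a route through middle, so there is no single unique access path.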

Today, the network database model is used mainly on mainframes. In other areas, users either continue to trust the hierarchical model (especially IBM customers) or have made the transition to the more flexible and easy-to-use relational model. Well-known network database systems are UDS from Siemens and DMS from Sperry Univac. Over the years, both companies also developed interesting hybrids of the network model and the relational model, without achieving a real breakthrough. However, the results of this work can still be traced in Siemens SQL today. A modern further development of the network model is the graph database, whose structure is reminiscent of a network.

Re­la­tion­al database model

The most popular database model today is the relational one, even if it is considered imperfect by many. The associated relational database management system is better known by the acronym RDBMS, and SQL is usually used as the database language. The table-based relational database model is built on the core concept of the “relation,” a term that is firmly defined in mathematics. Relations are manipulated using relational algebra, which can be used to extract specific information from them. This principle is the basis of the database language SQL.

The relational database model works with individual tables that define where information is located and how it is linked. Each row of a table forms a data set (a line in the diagram, also called a “tuple”). The individual pieces of information are collected as attributes in the columns (A1 to An in the graph). The relation as a whole (“relation” is often used synonymously with the term “table”) is derived from related attributes. The primary key, which is usually defined as the first attribute (A1) and must never change, is essential for the unique identification of a data record. In other words, the primary key (also “ID”) uniquely identifies a data set together with all its attributes.
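As a minimal sketch of these ideas, the following example builds two linked tables with Python's built-in sqlite3 module. The tables customers and orders and their contents are hypothetical. The first column of each table is the primary key, a second row with the same key is rejected, and a join brings related rows from both tables together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only on request
conn.executescript(
    """
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,          -- primary key, the first attribute
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'monitor'), (12, 2, 'mouse');
    """
)

# Link the two relations through the key to read one customer's orders.
rows = conn.execute(
    "SELECT c.name, o.item FROM customers AS c"
    " JOIN orders AS o ON o.customer_id = c.id"
    " WHERE c.id = ? ORDER BY o.id",
    (1,),
).fetchall()

# A second tuple with the same primary key is rejected.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows, duplicate_rejected)
```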

Note

Read our article on the relational database to find out why it has become the established standard, the details of how it works, and what kind of criticism it faces.

Object-oriented database model

Object databases were only conceived at the end of the 1980s and still find few application areas today. Object-oriented databases, some of which are available as open source, are most frequently used on Java and .NET platforms. The best-known object database is db4o, which scores above all with its small memory footprint. Object databases usually work with the query language OQL, which is very similar to SQL.

In the object-oriented database model, data is stored in an object together with its functions and methods. An object usually represents a concept and carries associated attributes that describe it in more detail. Access to these objects is defined in the object database management system using the “methods” that are stored together with the data in the object.

Objects can be complex and can consist of any number of data types. In addition, each object is unique within the database system and is identified by a unique identification number (object ID, OID). As you can see in the diagram, individual objects are grouped into object classes, resulting in a class hierarchy. Although this resembles the hierarchical database model, the object-oriented approach is what matters here, and there are no fixed parent-child relationships. Nevertheless, the method of access can be specified by the object class.
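The following sketch illustrates objects, OIDs, and object classes with plain Python classes. The ObjectStore class and the museum-style sample data are hypothetical and only keep objects in memory; a real object database such as db4o adds persistence and querying, which are not shown here.

```python
import itertools


class ObjectStore:
    """Keeps whole objects and assigns each one a unique object ID (OID)."""

    def __init__(self):
        self._objects = {}
        self._next_oid = itertools.count(1)

    def store(self, obj):
        oid = next(self._next_oid)
        self._objects[oid] = obj
        return oid

    def get(self, oid):
        return self._objects[oid]

    def instances_of(self, cls):
        """All stored objects that belong to an object class (or a subclass)."""
        return [o for o in self._objects.values() if isinstance(o, cls)]


class Exhibit:
    """Data (attributes) and behaviour (methods) live together in the object."""

    def __init__(self, title, year):
        self.title = title
        self.year = year

    def describe(self):
        return f"{self.title} ({self.year})"


class Painting(Exhibit):
    """A more specific object class in the class hierarchy."""

    def __init__(self, title, year, artist):
        super().__init__(title, year)
        self.artist = artist

    def describe(self):
        return f"{super().describe()} by {self.artist}"


store = ObjectStore()
vase_oid = store.store(Exhibit("Bronze vase", 1650))
painting_oid = store.store(Painting("Harbour at dawn", 1890, "A. Example"))

print(store.get(painting_oid).describe())
print(len(store.instances_of(Exhibit)), len(store.instances_of(Painting)))
```

The object is stored and retrieved as a whole, and the way it is read (describe) is determined by its class rather than by a table layout.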

Object databases offer advantages for complex problems with correspondingly deep object structures. The object database works largely independently, without much manual normalisation and ID referencing, and so allows new, complex objects to be added relatively simply and smoothly. However, simple queries are much faster in a relational database system. Because object-oriented database systems are not very popular, they are often insufficiently compatible with many common database applications.

Document-oriented database model

In this model, documents form the basic unit for storing data. They structure data and should not be confused with documents like those used in text editing programs. The data is stored in key/value pairs, each consisting of a “key” and a “value.” Since the structure and number of these pairs are not fixed, individual documents within a document-oriented database can look very different. Each document is a self-contained unit. Relations between documents are not easy to establish, but they are also not necessary in this model.

Note

In recent years, document-oriented databases have experienced a real boom thanks to the success of NoSQL, especially because of their good scalability. An example of this kind of database system is MongoDB.

In the re­la­tion­al model (shown in the graph with the tables), different relations are linked together to read a common data set. In the document model, a single document is suf­fi­cient to store all in­form­a­tion. The schema is freely se­lect­able: the document-oriented database model is con­cep­tu­ally schema-free, as long as the database language used remains the same.

An ele­ment­ary idea for document databases is that related data is always stored together in one place (in the document). While re­la­tion­al databases usually display and output related in­form­a­tion by linking several tables, the specific query of a document in the document-based model is suf­fi­cient. This reduces the number of op­er­a­tions required in the database.
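The idea can be sketched with plain Python dictionaries serialised as JSON. The collection, the insert and find helpers, and the order documents are hypothetical and do not correspond to the API of MongoDB, CouchDB, or any other product. Each document holds all related data in one place, and two documents in the same collection may have different keys.

```python
import json

collection = {}  # document ID -> serialised document


def insert(doc_id, document):
    """Store a self-contained document made of key/value pairs."""
    collection[doc_id] = json.dumps(document)


def find(doc_id):
    """One lookup returns everything that belongs to the document."""
    return json.loads(collection[doc_id])


insert("order-1", {
    "customer": {"name": "Ada", "city": "London"},
    "items": [{"product": "keyboard", "qty": 1}, {"product": "monitor", "qty": 2}],
})

# A second document with a different structure: no city, but an extra key.
insert("order-2", {
    "customer": {"name": "Grace"},
    "items": [{"product": "mouse", "qty": 1}],
    "gift_note": "Happy birthday",
})

order = find("order-1")
total_qty = sum(item["qty"] for item in order["items"])
print(order["customer"]["name"], total_qty, sorted(find("order-2")))
```

In a relational design the customer, the order, and the order items would typically sit in separate tables and be joined at query time; here a single read of order-1 is enough.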

Document-based database systems are particularly interesting for web applications, because complete HTML forms can be stored in them. Document databases became more and more popular especially during the development of Web 2.0. However, there are considerable differences between the different document-oriented database systems, from syntax to internal structure, so not every document database is suitable for every area of application. Precisely because of these variations, there are a number of well-known document-oriented database systems today: Lotus Notes, Amazon SimpleDB, MongoDB, CouchDB, ThruDB, OrientDB, and many more.

Overview: database models

| Database model | Development | Advantages | Disadvantages | Fields of application | Known representatives |
| --- | --- | --- | --- | --- | --- |
| Hierarchical | 1960s | Extremely fast read access, clear structure, technically simple | Rigid tree structure that does not allow links between branches | Banks, insurance companies, operating systems | IMS/DB |
| Networked | Start of the 1970s | Multiple search paths to the data set, no strict hierarchy | Poor overview with larger databases | Mainframe | UDS (Siemens), DMS (Sperry Univac) |
| Relational | 1970 | Simple, flexible creation and editing, easily expandable, fast commissioning, lively and competitive | Unmanageable with large amounts of data, poor segmentation, artificial key attributes, external programming interface, object properties and object behavior difficult to map | Controlling, accounting, merchandise management systems, content management systems, and much more | MySQL, PostgreSQL, Oracle, SQLite, DB2, Ingres, MariaDB, Microsoft Access |
| Object-oriented | End of the 1980s | Best support of object-oriented programming languages, storage of multimedia content | Increasingly poorer performance with large data volumes, few compatible interfaces | Inventory (museums, retail) | db4o |
| Document-oriented | 1980s | Central storage of related data in individual documents, free structure, multimedia orientation | Relatively high organisational effort, often programming skills are required | Web applications, internet search engines, text databases | Lotus Notes, Amazon SimpleDB, MongoDB, CouchDB, Riak, ThruDB, OrientDB |