Sensors that collect scientific or industrial measurement data generate large volumes of data in a short period of time. This data has to be processed together with the timestamp of each measurement, which calls for special databases for time series data. This article focuses on InfluxDB, a database management system (DBMS) specially designed for this task.

What is InfluxDB?

InfluxDB is a database management system developed by InfluxData, Inc. InfluxDB is open-source and can be used free of charge. The InfluxDB Enterprise version offers maintenance agreements and special access controls for business customers, and is installed on a server within a corporate network. In addition, the new InfluxDB 2.0 version runs as a customisable cloud service with a web-based user interface for data ingestion and visualisation.

The InfluxDB database management system is written in Google’s programming language Go, also known as Golang. The first version of InfluxDB used InfluxQL, a query language developed by InfluxData, for external database queries.

InfluxDB 2.0 introduces a new query and scripting language called Flux, which InfluxData publishes on GitHub as an open-source project. The project is updated on GitHub by developers working with time series data. Flux is a standalone language for time series databases (TSDB). It can be used with InfluxDB version 1.7 and higher, either independently or with third-party databases.

Flux is optimised for ETL processes (extract, transform, load) in databases and is not compatible with the InfluxQL query language previously used. However, InfluxData is planning a migration path for existing customers to translate InfluxQL code into Flux.

Flux syntax is inspired by the popular language JavaScript. It is easy to learn and can be extended. A key feature of Flux is that it can integrate different data sources, for example through third-party APIs. As a result, Flux is compatible with analytics tools like Jupyter. The Apache Arrow data interchange interface permits communication with other systems and integration in big data environments.
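
To give a feel for what a Flux script looks like, here is a minimal Python sketch that assembles one as a string. The functions `from`, `range`, `filter` and `aggregateWindow` are part of Flux. The bucket name "sensors", the measurement "sensor_data" and the tag key "sensor" are assumptions made up for illustration, and the helper does no escaping of its arguments.

```python
def build_flux_query(bucket: str, measurement: str, sensor: str,
                     start: str = "-1h", every: str = "5m") -> str:
    """Return a Flux script that averages one sensor's values per time window.

    All names passed in are hypothetical; values are inserted without escaping,
    so this is only suitable for trusted, simple identifiers.
    """
    return "\n".join([
        f'from(bucket: "{bucket}")',
        f'  |> range(start: {start})',
        f'  |> filter(fn: (r) => r._measurement == "{measurement}" and r.sensor == "{sensor}")',
        f'  |> aggregateWindow(every: {every}, fn: mean)',
    ])


# Example: mean value of "Sensor 1" in 5-minute windows over the last hour.
query = build_flux_query("sensors", "sensor_data", "Sensor 1")
print(query)
```

Each `|>` passes the result of one step on to the next, which is how a Flux script chains data selection, filtering and aggregation.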

When is InfluxDB used?

InfluxDB is a time series database (TSDB), a type of database designed to store time series. These databases are used, among other things, to store and analyse sensor data or logs with timestamps over a certain period of time. For example, Internet of Things devices or scientific measuring instruments deliver millions of incoming data sets in a constant stream of data.

This data must be processed quickly once it reaches the database. Because every value is tied to a timestamp, InfluxDB depends on accurate clocks: the Network Time Protocol (NTP) is used to ensure that time is synchronised between all systems.

With InfluxDB, a database can be very compact and needs only two or three columns. In the following example, the data source, the actual value and the corresponding timestamp are stored in the database.

Sensor      Value     Time
Sensor 1    140.50    04/23/2020 @ 10:00 AM
Sensor 2    110.02    04/23/2020 @ 10:00 AM
Sensor 1    142.32    04/23/2020 @ 10:05 AM
Sensor 2    110.50    04/23/2020 @ 10:05 AM

InfluxDB differentiates between tag and field columns. Whereas a tag is simply metadata that is included in the index, fields contain the values that can be analysed. In our example, the first column is a tag and the second one is a field. This differentiation makes it easier to manage the database and analyse measurement data.
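
The split into tag, field and timestamp is visible in InfluxDB's text-based line protocol, where a point is written as `measurement,tag=value field=value timestamp`. The following Python sketch formats two rows of the example table this way. The measurement name "sensor_data", the keys "sensor" and "value", and the use of UTC are assumptions for illustration. The helper handles only float fields and assumes a measurement name without spaces or commas.

```python
from datetime import datetime, timezone


def escape_key(text: str) -> str:
    """Escape commas, equals signs and spaces, which line protocol uses as separators."""
    return text.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def to_line_protocol(measurement: str, tags: dict, fields: dict, when: datetime) -> str:
    """Format one point as 'measurement,tags fields timestamp' with a nanosecond timestamp."""
    tag_part = ",".join(f"{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items()))
    # Only float fields are supported in this sketch.
    field_part = ",".join(f"{escape_key(k)}={float(v)!r}" for k, v in fields.items())
    head = f"{measurement},{tag_part}" if tag_part else measurement
    timestamp_ns = int(when.timestamp()) * 1_000_000_000
    return f"{head} {field_part} {timestamp_ns}"


# First two rows of the example table; UTC is an assumption, the table names no time zone.
rows = [
    ("Sensor 1", 140.50, datetime(2020, 4, 23, 10, 0, tzinfo=timezone.utc)),
    ("Sensor 2", 110.02, datetime(2020, 4, 23, 10, 0, tzinfo=timezone.utc)),
]
lines = [to_line_protocol("sensor_data", {"sensor": s}, {"value": v}, t) for s, v, t in rows]
print("\n".join(lines))
```

The sensor name ends up in the indexed tag set, while the measured value is stored as a field.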

What are the advantages of InfluxDB?

Compared to ordinary relational databases, TSDBs like InfluxDB offer clear speed advantages when it comes to storing and processing time-stamped measurement data. A traditional DBMS slows down when organising complex indexes, which are not used at all in this area of application. InfluxDB can maintain high write speeds over a long period of time because it uses a very simple index.

Unlike version 1.x, the new InfluxDB Cloud 2.0 from InfluxData is a cloud-based solution that can run on Amazon Web Services (AWS), the Google Cloud Platform (GCP) or Microsoft Azure. Because it is serverless, you don’t need your own server infrastructure and no longer have to reserve individual servers. Instead, the system automatically adjusts to the load, which is important for industrial IoT applications and machine learning, where the volume of data can change abruptly.

Whereas the first version required the TICK stack (Telegraf, InfluxDB, Chronograf and Kapacitor), InfluxDB 2.0 already has everything you need. Both the local and cloud versions contain the entire database management system in a single program file, currently available for 64-bit Linux, Linux for ARM processors and macOS, and as a Docker container. Telegraf and the other tools can still be used to collect data for InfluxDB 2.0.

First steps in InfluxDB

InfluxData offers free access to InfluxDB Cloud 2.0 for anyone getting started with the solution. This plan allows you to try out the database and the entire hosted, multi-user data platform for time series data. InfluxDB Cloud 2.0 also contains modules for collecting, evaluating and visualising stored data.

The free version offers limited data rates for reads and writes, up to 10,000 data sets, and a maximum storage period of 30 days. These limits are usually sufficient for hobby projects. A free plan can later be upgraded to a paid, usage-based plan without losing data.

To get started, create a free user account on the InfluxDB Cloud 2.0 signup page. Then click the verification link in the email.

After verifying your user account, log in and select your cloud provider. In Europe, InfluxDB Cloud 2.0 is currently available only via Amazon Web Services (AWS). However, this is not an issue if you’re using the free version. If you’re already using Amazon Web Services or Google Cloud Platform (GCP), you can subscribe to the InfluxDB cloud products through the marketplaces of these cloud providers.

Once you’ve logged in, InfluxDB displays your personal dashboard, where your data is collected and visualised. Data can be collected via Telegraf plug-ins, the InfluxDB v2 API, the Influx command line interface (CLI) or directly via the InfluxDB user interface. Client libraries for various popular programming languages are also available.
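
As a rough illustration of the InfluxDB v2 API route, the following Python sketch prepares a write request for the `/api/v2/write` endpoint using only the standard library. The host name, organisation, bucket and token are placeholders, not real values, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url: str, org: str, bucket: str,
                        token: str, lines: list) -> Request:
    """Prepare a POST to /api/v2/write carrying line protocol with nanosecond precision."""
    params = urlencode({"org": org, "bucket": bucket, "precision": "ns"})
    return Request(
        url=f"{base_url}/api/v2/write?{params}",
        data="\n".join(lines).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


# All argument values below are hypothetical placeholders.
write_request = build_write_request(
    "https://influx.example.invalid", "my-org", "sensors", "MY_TOKEN",
    ["sensor_data,sensor=Sensor\\ 1 value=140.5 1587636000000000000"],
)
print(write_request.get_method(), write_request.full_url)
# Sending is left out on purpose; urllib.request.urlopen(write_request) would perform the call.
```

In practice, one of the official client libraries or Telegraf will usually be more convenient than hand-built requests, since they take care of batching and retries.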

You can create Telegraf configurations interactively or copy existing configurations to send data to the InfluxDB Cloud 2.0 instance. Once you’ve configured InfluxDB Cloud to collect data, you can create personal dashboards to query and display the data.

In the InfluxDB data explorer, you can explore and visualise the collected data. You can adjust time intervals and ranges for refreshing the dashboard’s data according to the needs of your project. The InfluxDB user interface provides a variety of attractive visualisation options. The web interface allows you to move seamlessly between the Flux Builder and manual editing of database queries.
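
A manually edited Flux query can also be run outside the web interface through the `/api/v2/query` endpoint, which accepts a Flux script as the request body and can return the result as CSV. The Python sketch below only prepares such a request. The host, organisation, token and bucket name are hypothetical placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(base_url: str, org: str, token: str, flux: str) -> Request:
    """Prepare a POST to /api/v2/query with a Flux script as the body."""
    return Request(
        url=f"{base_url}/api/v2/query?{urlencode({'org': org})}",
        data=flux.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",  # body is a raw Flux script
            "Accept": "application/csv",             # ask for a CSV result
        },
        method="POST",
    )


# Placeholder query against a hypothetical bucket named "sensors".
flux_script = 'from(bucket: "sensors")\n  |> range(start: -30m)'
query_request = build_query_request(
    "https://influx.example.invalid", "my-org", "MY_TOKEN", flux_script
)
print(query_request.get_method(), query_request.full_url)
```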

On the “Usage” page, you can view your current database usage at any time to determine whether a paid plan might be worthwhile.

The most important new features of InfluxDB Cloud 2.0 at a glance

Free plan (with limits): No downloading, no installation and no in-house server infrastructure required; fastest introduction to InfluxDB 2.0 technology; the free plan is designed for getting started with InfluxDB and for small hobby projects.

Flux support: Flux is a standalone scripting and query language for time series databases that increases productivity by allowing easy reuse of code. Flux was developed and optimised for working with data in InfluxDB 2.0, but it can also be used with other data sources.

Unified API: The unified InfluxDB v2 API offers access to all InfluxDB components, such as data ingestion, query, storage and visualisation. This enables seamless movement between the installed open source version and the InfluxDB Cloud 2.0 version.

Visualisation and dashboards: Based on the innovative Chronograf project from the first version of InfluxDB, the new user interface offers significantly faster results when visualising and querying data in real-time.

Usage-based pricing plans: Usage-based billing offers more flexibility than a self-hosted database system and ensures that you only pay for what you actually use.
