Time series databases are a viable alternative to files for storing engineering data. InfluxDB and Timescale are the two best-known options currently available. In this article we dive into their differences on data model, performance, data importing, hardware integration, stability, and tooling. Our findings below are based on 40+ conversations we had with users of both databases over the past 3 years.
Data model
InfluxDB has a rather strict data model. Data is stored in buckets > measurements > fields. Tags can be added to data points for further classification. The advantage is that this structure is somewhat self-explanatory. Still, users are often confused about when to create a new measurement and when to use a tag. The set of tags also tends to explode, as users keep adding new ones instead of sticking to a predefined set.
Timescale has a freer model, just like the underlying Postgres database. This can make a data schema more difficult to understand, since every relational structure is possible. On the upside, relational data and time series data can be freely combined. For example, a ship operating company we talked to needs to compute the total weight of the ship:
SELECT (cargo_weight + vessel_weight) AS total_weight FROM cargo_log JOIN vessel_props ON cargo_log.vessel_id = vessel_props.id
This kind of calculation is only possible because of the relational nature of Timescale.
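As a small runnable illustration of the join above, the sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres/Timescale, so no server is needed. The table and column names follow the query above; the `time` column and all weights are made-up example data.

```python
import sqlite3


def total_weights():
    """Join time series rows (cargo_log) with static properties (vessel_props)."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a Timescale database
    conn.executescript(
        """
        CREATE TABLE vessel_props (id INTEGER PRIMARY KEY, vessel_weight REAL);
        CREATE TABLE cargo_log (time TEXT, vessel_id INTEGER, cargo_weight REAL);
        -- Made-up example data
        INSERT INTO vessel_props VALUES (1, 5000.0);
        INSERT INTO cargo_log VALUES ('2023-08-01T00:00:00Z', 1, 1200.0);
        INSERT INTO cargo_log VALUES ('2023-08-01T01:00:00Z', 1, 1350.0);
        """
    )
    rows = conn.execute(
        """
        SELECT cargo_log.time, (cargo_weight + vessel_weight) AS total_weight
        FROM cargo_log
        JOIN vessel_props ON cargo_log.vessel_id = vessel_props.id
        ORDER BY cargo_log.time
        """
    ).fetchall()
    conn.close()
    return rows


for timestamp, total in total_weights():
    print(timestamp, total)
```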
Winner: Timescale
Performance
This is a tough nut to crack, because performance really depends on your use case. For our application at Marple, Timescale querying was about 2x faster. But we have talked to numerous other users who had the opposite experience.
Be wary of published benchmarks, because they tend to focus on one specific test that is hard to generalise. If performance is important, you should set up a test case yourself that is representative of your application.
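Setting up your own test case does not need to be complicated. Below is a minimal timing harness, assuming you wrap a representative query for each database in a Python function. The name `benchmark` and its parameters are our own choice, and the dummy workload at the bottom only stands in for a real query.

```python
import statistics
import time


def benchmark(run_query, repeats=5, warmup=1):
    """Time a zero-argument callable and return summary statistics in seconds."""
    # Warm-up runs let caches and connections settle before measuring.
    for _ in range(warmup):
        run_query()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


def dummy_query():
    # Placeholder workload; replace with a real query against your database.
    return sum(range(10_000))


print(benchmark(dummy_query))
```

Run the same harness against both databases with the same data set and the same queries, and compare medians instead of single runs.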
Winner: Undecided
Data importing
Telegraf is a tool built by InfluxData that can be used to import various data formats into InfluxDB, as well as into Timescale. It supports a few data formats commonly used in engineering, such as CSV, Avro and JSON. If you are using something else, like MAT, TDMS, MDF or HDF5, you’re out of luck.
For both databases, you can always write a custom script that reads data from a file and pushes it to the database piece by piece. If you have large files, however, this will be slow. For Timescale, you’re better off converting the data to CSV and leveraging the COPY command.
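The CSV plus COPY route for Timescale could look roughly like the sketch below. It assumes `conn` is a psycopg2 connection (whose cursors provide `copy_expert`). The function names, table name and column names are hypothetical. Table and column names are interpolated directly into the SQL, so they must come from trusted code, not from user input.

```python
import csv
import io
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def rows_to_csv_buffer(rows):
    """Serialise rows into an in-memory CSV buffer, rewound and ready to read."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    return buf


def copy_rows(conn, table, columns, rows, batch_size=50_000):
    """Bulk-load rows with COPY, one batch at a time (assumes a psycopg2 connection)."""
    copy_sql = f"COPY {table} ({', '.join(columns)}) FROM STDIN WITH (FORMAT csv)"
    with conn.cursor() as cur:
        for batch in batched(rows, batch_size):
            cur.copy_expert(copy_sql, rows_to_csv_buffer(batch))
    conn.commit()


# Usage sketch with a hypothetical table and columns:
# copy_rows(conn, "cargo_log", ["time", "vessel_id", "cargo_weight"], rows)
```

Batching keeps memory use bounded for large files, while each COPY call still loads many rows in a single round trip.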
You can also consider using Node-RED, a tool that can wire data flows together. It has plugins for writing to both InfluxDB and Postgres (and therefore Timescale).
Winner: Undecided
Hardware integration
InfluxDB has been picked up quite well by hardware manufacturers. Some suppliers provide integrations that push data from sensors to InfluxDB, for example:
- CSS Electronics: for logging CAN data
- Dewesoft: loggers for general purpose test & measurement
For Timescale, we are not aware of any such partnerships.
Winner: InfluxDB
Stability
InfluxDB Cloud has apparently shut down some customer databases without notifying the customers first. As far as we know, this is the only such incident. The query language has also changed more than once: v2.0 replaced the SQL-like InfluxQL of v1.x with Flux, and with InfluxDB 3.0 the database went back to SQL as its main query language. Having to adapt each time has frustrated some users.
For Timescale, we have heard of no such issues to date (Aug 2023).
Winner: Timescale
Tooling
InfluxDB comes with a visualisation frontend out of the box. This makes it low-barrier to start exploring the data. The interface is quite limited, however, so engineers often add Grafana (for monitoring) or Marple (for in-depth investigation) to their workflow.
Timescale can be used with many existing Postgres tools (e.g. DBeaver, pgAdmin), but such tools are not user-friendly for time series data.
Winner: InfluxDB
Other alternatives
Two notable time series databases have not been discussed in this comparison and are worth taking a look at:
- QuestDB (2014) if performance is really important
- TDengine (2019) if your use case is in IoT
Don’t forget that files might still be the best solution, even CSV files.
Conclusion
On the six topics that we examined, the result is:
InfluxDB 2 - Timescale 2 - Undecided 2
I didn’t think it would end up as a tie, but that’s an honest conclusion.
InfluxDB has the upper hand for Hardware integration and Tooling. If these are important to you, go for Influx. Timescale wins on Data model and Stability.
In any case, don’t forget to benchmark performance for your specific use case, as explained above.
If you have a question, or want to talk in depth about which database would fit your use case, don’t hesitate to get in touch!
Written by Nero Vanbiervliet