Experience with OpenTSDB and Elasticsearch as a Time-Series DB

This post describes my experience over the past year working on an analytics platform at a networking company.

When I started looking into a problem regarding our statistics and analytics infrastructure, I wanted to give myself a clean slate. To begin with, I defined the problem at hand as: finding a good TSDB (time-series database). Why a TSDB? Across the company's product use-cases, I learned that the required statistical metrics were almost always time-series data: a sequence of numerical points in successive order.

I started defining “good” by understanding the problem in depth and surveying the different technologies out there. Having spent nine years of my professional career in networking companies, this was a brand new challenge that I was really looking forward to. The field was vast and the options were plenty. At the time we took up the problem, we already had OpenTSDB as our chosen DB, and we were clearly having issues with it. I wanted to keep things simple and came up with the following top five attributes for my “good” TSDB.

  1. Easy to deploy and upgrade: Much of the time a TSDB acts as an add-on technology, so meeting or exceeding the deployment and upgrade standards of the core product is a must. The ability to deploy with minimal dependencies and simple APIs is something I found critical, and most of the technologies out there qualified on this attribute. Upgrades, however, tend to be an afterthought in many products, and more often than not they do not work as seamlessly as described. I highly recommend trying out a version upgrade and a schema upgrade before qualifying a TSDB.
  2. Scale, performance and memory/storage requirements: The first thing I did was get the ballpark scale numbers we were predicting for the next 2 to 3 years; I wanted to focus on achieving that scale and measuring performance against those numbers. Many technologies that I read about and prototyped with had very strong benchmarking numbers, including Elasticsearch. But some of them came with memory and storage requirements we could not afford for a supplementary analytics system.
  3. Continuous support model: There are two ways to adopt these technologies. One is to build an in-house team that specializes in the technology to the point of contributing back to the project. The other is to choose one with a great community and company around it that can provide continuous support. In our case, I thought OEMing the technology made more sense overall, and it happens to be cost-effective in the long run in this ever-changing landscape.
  4. Rich API set: Kudos to all the technology providers in the analytics space for making a rich API set part of the initial product offering. Almost all the technologies I read about and tried qualified here. It lowered the barrier to getting hands-on, which was essential in choosing the technology.
  5. Advanced search/query capabilities: Classic TSDB searches and queries are covered by almost every option out there; very few go beyond that. The ability to group the top x, perform prefix searches and query across indices was important for our use-case (a sketch of such a query follows this list).
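
To make the last attribute concrete, here is a minimal sketch, assuming the Elasticsearch 2.x Java transport client, of the kind of query we needed: a prefix search, a "top x" terms aggregation, and a search spanning multiple indices. The class, index and field names are illustrative, not from our actual schema.

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;

public class TopTalkersQuery {
    public static SearchResponse topTalkers(Client client) {
        return client.prepareSearch("app1-*", "app2-*")                   // cross-index query via wildcards/aliases
                .setQuery(QueryBuilders.prefixQuery("interface", "eth"))  // prefix search
                .addAggregation(AggregationBuilders.terms("top_hosts")    // "top x" grouping
                        .field("host")
                        .size(10))
                .setSize(0)                                                // aggregation only, skip individual hits
                .get();
    }
}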

Here are some of the interesting findings from taking a second look at OpenTSDB (OTSDB):

  • Limited tag support (maximum 8): this was a huge restriction, as some of our new requirements meant storing more tags per data point. This did not play well with attribute 1 of a good TSDB.
  • It does not work well if different sets of tags are supplied for the same time-series, which meant that any schema change required a migration step during upgrades. Again, not a good fit per attribute 1.
  • At the time, OTSDB used Hadoop/HBase as its backend data store. Vendors providing official support had very steep base requirements for memory and storage: a minimum of 8 nodes with 8/16 GB of memory each on average. That meant attribute 2 of a good TSDB was not satisfied for us.
  • One can find and buy support for the data-store backend (Hadoop/Cassandra etc.), but there is no official support for the TSDB layer. Not a good fit per attribute 3.
  • In OTSDB, grouping, downsampling, interpolation, aggregation and rate calculation are all performed after fetching data from Hadoop, and every step adds to the total query time. As the data grew, the query time grew steeply as well. Given these findings and its very rudimentary query options, it was failing attribute 5 of my "good" TSDB.

Some additional findings also worked against OTSDB:

  • When you want just one value (e.g., the sum of a particular metric over a time period), it is not possible without downsampling.
  • Downsampling is not always accurate, even with the 2.1 release (the latest at the time), based on going through its change notes.
  • Serializing large result sets can take quite a long time, and aggregating a large result set isn't free either.
  • From the OpenTSDB FAQ: “Can I store sub-second precision timestamps in OpenTSDB? As of version 2.0 you can store data with millisecond timestamps. However we recommend you avoid storing a data point at each millisecond as this will slow down queries dramatically. See Dates and Times for details.”

At the time, after looking into InfluxDB and Elasticsearch, ES was the option that qualified on all five attributes.


We ran internal benchmarking tests on proprietary data sets (1 million and 10 million records) and found that ES query speed was, on average, 10 times faster than OTSDB's. In terms of per-document writes, OTSDB definitely gave faster performance. But ES gave us the ability to batch document writes, and the fact that you can store objects such as dictionaries in ES meant we could reduce the number of documents to write and thus the overall write time. We also found http://cds.cern.ch/record/2011172/files/LHCb-TALK-2015-060.pdf to be a supporting benchmarking document that made ES's case stronger.
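
Here is a minimal sketch, assuming the Elasticsearch 2.x Java transport client, of the two ideas above: batching writes into a single bulk request, and storing a dictionary-like object inside one document so that several related values land in a single write. The index, type and field names are illustrative.

import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;

public class MetricWriter {
    public static BulkResponse writeBatch(Client client, long timestamp) {
        BulkRequestBuilder bulk = client.prepareBulk();

        // One document carries a whole map of counters instead of one document per counter.
        Map<String, Object> doc = new HashMap<>();
        doc.put("timestamp", timestamp);
        doc.put("app", "app1");
        Map<String, Long> counters = new HashMap<>();
        counters.put("rx_bytes", 123456L);
        counters.put("tx_bytes", 654321L);
        doc.put("counters", counters);

        bulk.add(client.prepareIndex("app1-2016.05.14", "metrics").setSource(doc));
        // ... more prepareIndex() calls would be added here before the single flush below.
        return bulk.get();
    }
}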

I started defining the schema, and this Elastic blog post helped a lot. I made sure that mappings and aliases were used properly. We defined daily indices per application and used application-level aliases to enable searches across multiple days. This gave the shards of a given index two factors of modularization: 1. time and 2. application. It also helps with purging, since historical data is typically purged by going back in time.
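
Here is a minimal sketch, assuming the Elasticsearch 2.x Java transport client, of the daily-index-plus-application-alias scheme described above. The application name, date suffix and class name are illustrative; the shard and replica counts come from the elasticsearch.yml shown below (they could equally be set here or via an index template).

import org.elasticsearch.client.Client;

public class IndexLayout {
    public static void rollDailyIndex(Client client, String app, String day) {
        String index = app + "-" + day;   // e.g. "app1-2016.05.14"

        // Create the daily index for this application.
        client.admin().indices().prepareCreate(index).get();

        // Point the application-level alias at the new day, so searches through the
        // alias transparently span multiple daily indices.
        client.admin().indices().prepareAliases()
                .addAlias(index, app)
                .get();
    }

    // Purging historical data is then a whole-index delete of the oldest day.
    public static void purgeDay(Client client, String app, String day) {
        client.admin().indices().prepareDelete(app + "-" + day).get();
    }
}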

After tweaking parameters during benchmark testing, the final elasticsearch.yml looked like this:

network.host: <hostname>, _local_
cluster.name: <clustername>
node.name: <nodename>
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: [ <host1>, <host2>, <host3>]
discovery.zen.minimum_master_nodes: 2
http.cors.enabled: true
http.cors.allow-origin: "*"
index.number_of_shards: 3
index.number_of_replicas: 1
index.translog.durability: async
index.translog.flush_threshold_size: 1g
index.refresh_interval: 30s
bootstrap.mlockall: true
threadpool.bulk.type: fixed
threadpool.bulk.queue_size:  5000

We decided to start by supporting a 3-node cluster (all nodes both master-eligible and data nodes). Here are some of the settings we ended up playing with and changing based on our benchmarking exercise:

Number of shards and replica settings:

One shard per node is all you need, so we defined 3; one replica per shard is all you need, so we defined 1. Together with discovery.zen.minimum_master_nodes: 2 on a 3-node cluster, these settings guard against split-brain scenarios while avoiding the extra computation that comes with more shards and replicas.

Translog settings:

Transaction log (translog) settings help control how data is fsynced and committed to the file system. By increasing the default flush threshold size from 512 MB to 1 GB, we got better performance in the cases where our bulk writes were indexing more than 512 MB worth of data every 5 seconds.

Refresh interval settings:

By default, Elasticsearch keeps the refresh interval at 1 second, which gives a near "real time" peek into the data. But it has a computation and I/O cost associated with it, as the in-memory buffers are flushed and written to disk. Increasing it to 30 seconds meant that threads could spend more time indexing, which suited the bulk writes we wanted to do. It also meant that our reads were up to 30 seconds behind, but that was fine for our application.
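
Since the refresh interval is a dynamic index setting, it can also be adjusted at runtime rather than only in elasticsearch.yml. A minimal sketch, again assuming the Elasticsearch 2.x Java transport client and an illustrative index pattern:

import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

public class RefreshTuning {
    public static void relaxRefresh(Client client) {
        // Relax the refresh interval on all daily indices for an application,
        // so more thread time goes to indexing instead of refreshes.
        client.admin().indices().prepareUpdateSettings("app1-*")
                .setSettings(Settings.settingsBuilder()
                        .put("index.refresh_interval", "30s")
                        .build())
                .get();
    }
}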

Writing:

The Java transport client is the most feature-rich ES client to use. We wanted to write in bulk for the performance gain, and while doing so we discovered a problem with Elasticsearch's bulk requests over the transport client. We found that if, for any reason, the bulk size hit its threshold (setting: maxbulkactions) before the write interval (setting: flushinterval) elapsed, the flush/write thread ended up performing the indexing synchronously. This back-pressured the writer threads, which were picking data to send to Elasticsearch off a shared queue. We worked around it by making sure the flush interval always triggers the write, so bulk writes happen only periodically, every 3 seconds in our case.
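
Here is a minimal sketch of that workaround, assuming the Elasticsearch 2.x Java transport client and its BulkProcessor, which I assume maps to the maxbulkactions/flushinterval settings mentioned above. The idea is to disable the size-based triggers so that only the periodic flush, every 3 seconds here, pushes data:

import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.ByteSizeValue;
import org.elasticsearch.common.unit.TimeValue;

public class PeriodicBulkWriter {
    public static BulkProcessor build(Client client) {
        return BulkProcessor.builder(client, new BulkProcessor.Listener() {
            @Override public void beforeBulk(long id, BulkRequest request) { }
            @Override public void afterBulk(long id, BulkRequest request, BulkResponse response) { }
            @Override public void afterBulk(long id, BulkRequest request, Throwable failure) { }
        })
        .setBulkActions(-1)                               // disable the action-count trigger
        .setBulkSize(new ByteSizeValue(-1))               // disable the size trigger
        .setFlushInterval(TimeValue.timeValueSeconds(3))  // flush only on a fixed 3-second cadence
        .setConcurrentRequests(1)
        .build();
    }
}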

Following this implementation we achieved what we wanted: decent write numbers and very impressive read numbers from Elasticsearch.

