Saturday, March 2, 2019

(#31) Persistence is Futile!


We are the IoT Sensors. Lower your shields and surrender your ships. Persistence is futile.


At least that’s what I think about trying to store IoT data in a database. Or, it’s what I thought!


I believed that the data volume and velocity of a large IoT ecosystem would swamp any data store. Persistence is Futile! More so if the target was a relational database.


I have other beefs with relational databases, mostly about their resemblance to a landline telephone in the 21st century.

But particularly in an IoT world, a world awash in sensor data.

Because a universal attribute of sensor data is time.

  • “What is the temperature now?”
  • “What was it yesterday?”
  • “What will it be this afternoon?”
While any database can store and manipulate time data, I’ll argue that few treat time as a first-class citizen.


And then I met InfluxDB.

InfluxDB is a product in the emerging field of Time Series Database (TSDB) technology.
It is a data store where time is treated as a foundational element. Time is more than just a datatype. A TSDB “…is a software system that is optimized for handling time series data, arrays of numbers indexed by time (a datetime or a datetime range)” (TSDB Wikipedia).
Asking time-based questions and getting time-based answers from InfluxDB is easy. It’s what a TSDB does.
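To make the time-based questions concrete, here is a minimal Python sketch that builds InfluxQL query strings for the first two of them. The measurement name "weather" and field name "temperature" are made up for illustration, and "yesterday" is approximated as the 24-hour window that ended 24 hours ago. The third question, a forecast, is left out.

```python
# Hypothetical names for illustration only; use your own measurement and field.
MEASUREMENT = "weather"
FIELD = "temperature"


def latest_value_query(measurement: str, field: str) -> str:
    """'What is the temperature now?' Ask for the most recent point."""
    return f'SELECT LAST("{field}") FROM "{measurement}"'


def yesterday_hourly_query(measurement: str, field: str) -> str:
    """'What was it yesterday?' Hourly means over the window 48h to 24h ago."""
    return (
        f'SELECT MEAN("{field}") FROM "{measurement}" '
        "WHERE time >= now() - 48h AND time < now() - 24h "
        "GROUP BY time(1h)"
    )


if __name__ == "__main__":
    print(latest_value_query(MEASUREMENT, FIELD))
    print(yesterday_hourly_query(MEASUREMENT, FIELD))
```

The time window and the hourly bucketing live in the query itself, with no date arithmetic in the application code.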

Let’s get to the meat and cover how I’ve put InfluxDB into production in two scenarios. First, I have to call out the ease of installation. I am stunned, amazed and grateful (no Oxford comma) at the installation process!

On my Ubuntu 18.04 LTS based system, the installation was trivial. No, that’s an overstatement. It was easier than trivial. It was ‘easier than falling off a log.’ A half dozen commands, ending with:
$ sudo apt-get install influxdb
$ sudo systemctl unmask influxdb.service
$ sudo systemctl start influxdb.service
And InfluxDB is up and running.
Waiting to ingest some data.
Boom!


InfluxDB is one of four components bundled in what the vendor calls their TICK stack. The other components I installed (with equal ease) were Telegraf and Chronograf. I skipped the installation of Kapacitor.
  • InfluxDB is the persistence layer for the time series data.
  • Telegraf is their tool for putting data into InfluxDB.
  • Chronograf is their visualization tool.

Let's start with Chronograf since that’s Eye Candy. 

Two commands bring Chronograf online:
$ wget https://dl.influxdata.com/chronograf/releases/chronograf.deb
$ sudo dpkg -i chronograf.deb

Then hop over to http://localhost:8888 (replacing localhost with the address of the server where Chronograf is running).

Solar Array Dashboard

Here’s my first example of a Chronograf Dashboard. This is a screenshot of data coming from my Solar Charging System setup.

These are real-time charts (i.e. the charts change as the data changes) that show a handful of key data items important to a small solar charging system.

I’m far from being an expert Chronograf user; I’m far from being even an intermediate user. I’ve spent about 90 minutes with the tool over the last several months.
This dashboard took me 15 minutes to create!


Let’s move down to the next layer in the TICK stack: ingestion of data. Telegraf is their tool that eases the chore of sending IoT data to InfluxDB.
Installation of Telegraf was, again, stunningly simple:
$ sudo apt-get install telegraf
$ sudo systemctl start telegraf
$ telegraf config > telegraf.conf


The last command creates a default configuration file with what I think are all ingestion methods disabled. All I had to do was open that file in an editor and change a few lines:
[[inputs.mqtt_consumer]]
   servers = ["tcp://192.168.2.1:1883"]
   topics = [
      "LS1024B/1/DATA",
      "sensors/#",
   ]
   data_format = "json"


My whole home IoT ecosystem is already based on JSON and MQTT. All of the devices in the house send their data, formatted as JSON payloads, to an MQTT broker.
Thus, for me, four steps. Nothing more.
My Solar Panel System devices are flowing their data into InfluxDB!
Total elapsed time?
Maybe 30 minutes.


Crank it up a notch



Now, let’s switch to a slightly more complicated InfluxDB setup that I put into place at work.
For context, at work, we have a few packaging machines that we’ve tapped into for their event stream. These packaging machines spew events as they package packages. The machine events are formatted as JSON messages and routed to an MQTT broker.
I’ve written other applications in Python and Java to subscribe to the events and “do things” (make business-related ‘conclusions’). “Java Application 1” uses Esper, a Complex Event Processing engine, to create business-relevant conclusions based on patterns in the event stream.
Skipping the gory details, “Java Application 1” looks for certain patterns that indicate something important has occurred. When a pattern is recognized, the Esper-based Java program publishes a new event to the MQTT broker.

The points are:
  • Everything is an event
  • All events are represented as JSON messages and
  • All are flowing to an MQTT broker
“Java Application 2” subscribes to these events and puts them into InfluxDB using the Java client. Why not just use Telegraf again? I wanted hands-on experience with the Java client so I could compare the two.
Recap:
  • The packaging machines publish low-level, JSON-formatted events to MQTT
  • Java App 1 uses Esper to detect patterns in those events and publish new events, also in JSON, to MQTT
  • Java App 2 subscribes to all of these events and uses the InfluxDB Java client to store the data
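
The last step in that recap, storing events, boils down to sending line protocol to InfluxDB. The sketch below is Python rather than Java and is not the code from “Java Application 2”. It only illustrates the HTTP write endpoint that InfluxDB 1.x exposes. The database name "packaging", the measurement "machine_events" and the sample field are assumptions for illustration. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url: str, database: str, lines: list,
                        precision: str = "ns") -> Request:
    """Build a POST to the InfluxDB 1.x /write endpoint; body is line protocol."""
    query = urlencode({"db": database, "precision": precision})
    body = "\n".join(lines).encode("utf-8")
    return Request(f"{base_url}/write?{query}", data=body, method="POST")


if __name__ == "__main__":
    # Hypothetical event already converted to line protocol.
    event = "machine_events,machine=1 packages=1.0 1551528000000000000"
    request = build_write_request("http://localhost:8086", "packaging", [event])
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Batching several lines into one request body, separated by newlines, is the usual way to keep up with a busy event stream.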


Eye candy for openers:
Packaging Machines Status Dashboard

Again, this is a real-time dashboard that instantly conveys the status and business value of the machines: throughput, machine issues, and concerns that need to be attended to.
Admittedly, I’ve not spent a lot of time with other visualization tools (Tableau, PowerBI, etc.), but the ease of creating the charts was surprising. I spent about 40 minutes creating that dashboard.
In subsequent posts I’ll go into more detail, show code, show the JSON and configurations.

There's more to InfluxDB than just eye candy!
