Pete Shima
Pete Shima
Site Author
Dec 28, 2017 10 min read

Monitoring your home network with InfluxDB on Raspberry Pi with Docker

thumbnail for this post

Backstory

Over a year ago I was having all sorts of networking problems at home, major packet loss, complete networking outages and more. They were spurious, unpredictable and hard to diagnose. I wasn’t sure if it was my desktop/laptop, phone, wireless, or my internet provider.

So I decided to put a Raspberry Pi 2 on my network with InfluxDB, telegraf and grafana with some network monitors in place. I need historical data because the issues I was seeing were sporadic.

I’ve used it over the past year to debug many issues and this year I’m polishing it up with a new setup on a Raspberry Pi 3.

Preface

These instructions are set for a Mac and a Pi 2 or 3 using raspbian 9 Stretch. This expects you know the basics of running Linux and a Raspberry Pi.

Initial Setup

First we download Etcher and the raspbian image. We use this to drop the image in the SD card we use in the Pi. We want the lite image, however the full image may work just fine as well.

Etcher by resin.io

Download Raspbian for Raspberry Pi

Next for a Pi3 we need to connect it to the wifi.

Setting WiFi up via the command line - Raspberry Pi Documentation

Then we enable SSH to remotely connect to the Pi.

SSH (Secure Shell) - Raspberry Pi Documentation

After that we can now remotely connect. % ssh pi@172.16.1.200 pi@172.16.1.200’s password: Linux raspberrypi 4.9.59-v7+ #1047 SMP Sun Oct 29 12:19:23 GMT 2017 armv7lThe programs included with the Debian GNU/Linux system are free software; the exact distribution terms for each program are described in the individual files in /usr/share/doc/*/copyright.Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law. Last login: Tue Dec 26 19:19:27 2017 from 172.16.1.86 pi@raspberrypi:~ $

Now that the Pi is setup time to install our software packages.

Setup Docker

Install docker-ce via https://docs.docker.com/engine/installation/linux/docker-ce/debian/#install-using-the-convenience-script

Shortcut is pi@raspberrypi:~/docker $ curl -fsSL get.docker.com -o get-docker.sh pi@raspberrypi:~/docker $ sudo sh get-docker.sh

Check that docker is running pi@raspberrypi:~ $ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES pi@raspberrypi:~ $

Setup InfluxDB

For this we use a pre built container: https://github.com/hypriot/rpi-influxdb sudo docker run -d --volume=/var/influxdb:/data -p 8086:8086 hypriot/rpi-influxdb

Now when running this I got this error: pi@raspberrypi:~ $ sudo docker run -d --volume=/var/influxdb:/data -p 8086:8086 hypriot/rpi-influxdb 9499d9a93862749fa6573dde8d0b6303751d2985e74d7612a84c3ac866f1a07e docker: Error response from daemon: cgroups: memory cgroup not supported on this system: unknown.

Which lead me to this issue: https://github.com/moby/moby/issues/35587

And leads me to this specific comment on how to fix: https://github.com/moby/moby/issues/35587#issuecomment-353976863

After that, reboot.

And now we’ve got influxdb running pi@raspberrypi:~ $ sudo docker run -d --volume=/var/influxdb:/data -p 8086:8086 hypriot/rpi-influxdb c10b584e321034f5f360be506ba36b80142054af11162828a792bb66bb64146f pi@raspberrypi:~ $ sudo docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c10b584e3210 hypriot/rpi-influxdb "/usr/bin/entry.sh /…" 23 seconds ago Up 21 seconds 0.0.0.0:8086->8086/tcp sad_neumann

Now we need to follow the rest of the instructions to initialize the influxdb database `pi@raspberrypi:~ $ sudo docker exec -it c10b584e3210 /usr/bin/influx
Connected to http://localhost:8086 version 1.2.2
InfluxDB shell version: 1.2.2
> CREATE DATABASE db1
> SHOW DATABASES
name: databases

name

_internal
db1``> USE db1
Using database db1
> CREATE USER root WITH PASSWORD 'passhere' WITH ALL PRIVILEGES
> GRANT ALL PRIVILEGES ON db1 TO root
> SHOW USERS
user admin


root true`

Setting up Telegraf

We’re going to use telegraf as our local agent to collect metrics and statistics. https://hub.docker.com/r/arm32v7/telegraf/

We have to use the arm32v7 version pi@raspberrypi:~ $ sudo docker run --net=container:c10b584e3210 arm32v7/telegraf Unable to find image 'arm32v7/telegraf:latest' locally latest: Pulling from arm32v7/telegraf Digest: sha256:7da91efbb3e228a31d3859daa0609c91f8c76970dd966a20d9e65e5193cff40a Status: Downloaded newer image for arm32v7/telegraf:latest 2017/12/26 21:35:15 I! Using config file: /etc/telegraf/telegraf.conf 2017-12-26T21:35:15Z I! Starting Telegraf v1.5.0 2017-12-26T21:35:15Z I! Loaded outputs: influxdb 2017-12-26T21:35:15Z I! Loaded inputs: inputs.mem inputs.processes inputs.swap inputs.system inputs.cpu inputs.disk inputs.diskio inputs.kernel 2017-12-26T21:35:15Z I! Tags enabled: host=c10b584e3210 2017-12-26T21:35:15Z I! Agent Config: Interval:10s, Quiet:false, Hostname:"c10b584e3210", Flush Interval:10s

Now let’s generate a config file we can mount into the container sudo docker run --rm arm32v7/telegraf telegraf config > telegraf.conf

And we can modify it as needed, we won’t do that quite yet though, but now we can run the container mounted with the local config file docker run `--net=container:c10b584e3210 `-v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro arm32v7/telegraf

Setup Grafana

More recently there are now 3rd party grafana docker containers on docker hub which makes this considerably easier

https://hub.docker.com/r/fg2it/grafana-armhf/ docker run -i -p 3000:3000 --name grafana fg2it/grafana-armhf:v4.1.2

Now we can test by hitting the grafana endpoint and logging in with admin:admin

Voila

image

But adding the data source fails. Trying to run the container with the existing influxdb networked container fails as well. pi@raspberrypi:~ $ docker run --net=container:c10b584e3210 -i -p 3000:3000 --name grafana fg2it/grafana-armhf:v4.1.2 docker: Error response from daemon: conflicting options: port publishing and the container type network mode. See 'docker run --help'.

This is a subtetly of docker networking and docker has several modes of networking. In this case we are going to just use host based networking for simplicity.

Pulling things together.

So far we’ve done the basics of booting the three software packages together to ensure that has worked but now we need to reconfigure them to work together and to start on boot. After that we can start customizing our configuration.

First we need to stop all the containers we have running. The first steps were just to make sure everything ran. pi@raspberrypi:~ $ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES c10b584e3210 hypriot/rpi-influxdb "/usr/bin/entry.sh /…" About an hour ago Up About an hour 0.0.0.0:8086->8086/tcp sad_neumann pi@raspberrypi:~ $ docker stop c10b584e3210 c10b584e3210 pi@raspberrypi:~ $ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES pi@raspberrypi:~ $

Since we are only planning on running monitoring on this pi, we can just use docker host networking.

Start influxdb using host networking. pi@raspberrypi:~ $ sudo docker run --rm -d --name=influxdb --net=host --volume=/var/influxdb:/data hypriot/rpi-influxdb 2e30695d09b2baac36653fca83fc5d797c9bf3d6243c831543beeb7121463811

And run telegraf pi@raspberrypi:~ $ sudo docker run --rm -d --net=host --name telegraf -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro arm32v7/telegraf 616a096391fcda76c111528e6e568dff9c6159a49f7c18ef1498fd9f108e6487

First let’s create some persistent storage for grafana pi@raspberrypi:~ $ sudo docker run -d -v /var/lib/grafana –name grafana-storage busybox:latest``

And then run grafana with the persistent storage pi@raspberrypi:~ $ sudo docker run --rm -d --net=host --name grafana --volumes-from grafana-storage fg2it/grafana-armhf:v4.1.2 de31ad82e994f579bb3aff0a3765152c8af3d290b1ce66fd04bb728c99a72e3d

Verify the install

Before we get to the monitoring we should verify the install is working.

First we can tail all the docker logs

Influx pi@raspberrypi:~ $ docker logs influxdb => Starting InfluxDB ... => No database need to be pre-created exec influxd -config=${CONFIG_FILE} <omitted> [I] 2017-12-28T17:43:00Z Listening for signals [I] 2017-12-28T17:43:00Z Sending usage statistics to usage.influxdata.com [httpd] 127.0.0.1 - - [28/Dec/2017:17:43:00 +0000] "POST /write?consistency=any&db=telegraf HTTP/1.1" 204 0 "-" "-" 8c1abea5-ebf6-11e7-8001-000000000000 15528 [httpd] 127.0.0.1 - - [28/Dec/2017:17:43:00 +0000] "POST /write?consistency=any&db=telegraf HTTP/1.1" 204 0 "-" "-" 8c1d763e-ebf6-11e7-8002-000000000000 18108

Telegraf 2017/12/28 17:44:20 I! Using config file: /etc/telegraf/telegraf.conf 2017-12-28T17:44:20Z I! Starting Telegraf v1.5.0 2017-12-28T17:44:20Z I! Loaded outputs: influxdb 2017-12-28T17:44:20Z I! Loaded inputs: inputs.kernel inputs.mem inputs.processes inputs.swap inputs.system inputs.cpu inputs.disk inputs.diskio 2017-12-28T17:44:20Z I! Tags enabled: host=raspberrypi 2017-12-28T17:44:20Z I! Agent Config: Interval:10s, Quiet:false, Hostname:"raspberrypi", Flush Interval:10s

Grafana pi@raspberrypi:~ $ docker logs grafana t=2017-12-28T17:42:45+0000 lvl=info msg="Starting Grafana" logger=main version=4.1.2 commit=v4.1.2 compiled=2017-02-13T12:13:31+0000 t=2017-12-28T17:42:45+0000 lvl=info msg="Config loaded from" logger=settings file=/usr/share/grafana/conf/defaults.ini t=2017-12-28T17:42:45+0000 lvl=info msg="Config loaded from" logger=settings file=/etc/grafana/grafana.ini t=2017-12-28T17:42:45+0000 lvl=info msg="Config overriden from command line" logger=settings arg="default.paths.data=/var/lib/grafana" t=2017-12-28T17:42:45+0000 lvl=info msg="Config overriden from command line" logger=settings arg="default.paths.logs=/var/log/grafana" t=2017-12-28T17:42:45+0000 lvl=info msg="Config overriden from command line" logger=settings arg="default.paths.plugins=/var/lib/grafana/plugins" t=2017-12-28T17:42:45+0000 lvl=info msg="Path Home" logger=settings path=/usr/share/grafana t=2017-12-28T17:42:45+0000 lvl=info msg="Path Data" logger=settings path=/var/lib/grafana t=2017-12-28T17:42:45+0000 lvl=info msg="Path Logs" logger=settings path=/var/log/grafana t=2017-12-28T17:42:45+0000 lvl=info msg="Path Plugins" logger=settings path=/var/lib/grafana/plugins t=2017-12-28T17:42:45+0000 lvl=info msg="Initializing DB" logger=sqlstore dbtype=sqlite3 t=2017-12-28T17:42:45+0000 lvl=info msg="Starting DB migration" logger=migrator t=2017-12-28T17:42:45+0000 lvl=info msg="Executing migration" logger=migrator id="copy data account to org" t=2017-12-28T17:42:45+0000 lvl=info msg="Skipping migration condition not fulfilled" logger=migrator id="copy data account to org" t=2017-12-28T17:42:45+0000 lvl=info msg="Executing migration" logger=migrator id="copy data account_user to org_user" t=2017-12-28T17:42:45+0000 lvl=info msg="Skipping migration condition not fulfilled" logger=migrator id="copy data account_user to org_user" t=2017-12-28T17:42:45+0000 lvl=info msg="Starting plugin search" logger=plugins t=2017-12-28T17:42:45+0000 lvl=info msg="Initializing Alerting" logger=alerting.engine t=2017-12-28T17:42:45+0000 lvl=info msg="Initializing CleanUpService" logger=cleanup t=2017-12-28T17:42:45+0000 lvl=info msg="Initializing HTTP Server" logger=server address=0.0.0.0:3000 protocol=http subUrl=

If all is well we can then setup Grafana. There are other tests we can run here to verify and I might add them later.

Setup Grafana Datasource

When loading up grafana at http:<ip>:3000 we are presented with the front page again and can move on to configuring a basic influx datasource

image

With the data source setup we can now setup a new dashboard. To test things out we will graph performance of the Pi itself, then add our networking monitoring graphs.

Hitting the edit button on the Panel Title allows us to edit it.

image

And we can get a simple graph together of average CPU as a test.

image

image

Adding basic monitoring

With the above we should now have a working monitoring setup, we just need to add some checks to it.

With influxdb and telegraf this is pretty easy right out of the box. In the telegraf.conf file (initially mapped right in the pi user home dir) we are going to change a few of the inputs.

Scrolling through the default file you will see settings regarding the agent, outputs aggregators and inputs. You’ll also notice that most of the items are commented out except our existing host metrics like [[inputs.cpu]] which we used to graph above.

We want to start by modifying [[inputs.dns_query]] # # Query given DNS server and gives statistics [[inputs.dns_query]] ## servers to query servers = [&#34;8.8.8.8&#34;, &#34;4.2.2.1&#34;] # required## Domains or subdomains to query. &#34;.&#34;(root) is default domains = [&#34;[www.google.com](http://www.google.com)&#34;] # optional## Query record type. Default is &#34;A&#34; ## Posible values: A, AAAA, CNAME, MX, NS, PTR, TXT, SOA, SPF, SRV. record_type = &#34;A&#34; # optional## Dns server port. 53 is default port = 53 # optional## Query timeout in seconds. Default is 2 seconds timeout = 2 # optional

Here we are setting up telegraf to do a dns query for www.google.com against 2 common DNS servers. You should also input the DNS server you use on your local network as well. With this we will get graphs for DNS response time.

Next we want to modify [[inputs.ping]] `# # Ping given url(s) and return statistics
[[inputs.ping]]

NOTE: this plugin forks the ping command. You may need to set capabilities

via setcap cap_net_raw+p /bin/ping

urls to ping

urls = ["www.github.com","www.amazon.com","8.8.8.8","4.2.2.1","172.16.1.1"]

number of pings to send per collection (ping -c <COUNT>)

count = 3

interval, in s, at which to ping. 0 == default (ping -i <PING_INTERVAL>)

ping_interval = 15.0

per-ping timeout, in s. 0 == no timeout (ping -W <TIMEOUT>)

timeout = 10.0

interface to send ping from (ping -I <INTERFACE>)

interface = "wlan0"`

Here we setup ping monitoring to some popular sites, our DNS servers and something on our local network. If you are using ethernet instead of wifi (standard on Pi2) then you’ll want to change the interface to “eth0”

With the note in the ping plugin comments we also need to allow pings pi@raspberrypi:~ $ sudo setcap cap_net_raw+p /bin/ping

Otherwise we will see something like this in our telegraf log: 2017-12-28T18:21:50Z E! Error in plugin [inputs.ping]: Fatal error processing ping output: 8.8.8.8 2017-12-28T18:21:50Z E! Error in plugin [inputs.ping]: Fatal error processing ping output: 172.16.1.1 2017-12-28T18:21:50Z E! Error in plugin [inputs.ping]: Fatal error processing ping output: 4.2.2.1

There is also a pretty silly issue in the standard telegraf docker container we used. The version of ping that is installed isn’t compatible with it’s own ping plugin. There is an open issue for this here that you will need to use as a workaround until it is fixed: https://github.com/influxdata/telegraf/issues/3295

You can also see throughout the config file there are many other things we can enable to graph if we wanted as well. [[inputs.http_response]] is something you may want to enable as well.

After changing the configuration we just restart telegraf. pi@raspberrypi:~ $ sudo docker restart telegraf telegraf

Setting up the network dashboard

We can setup a new dashboard for ping response time.

image

We can also add one for packet loss

image

And finally DNS

image

Summary

Now we’ve got monitoring in place and a dashboard setup to review it as needed. For about $50 we’ve now got a flexible monitoring solution that can be plugged in pretty much anywhere on your network to give you another view of network performance.

Now whenever I have a networking issue I can load up these graphs to get the Pi’s view of what is happening. Turns out I had several different issues going on and sometimes it was ISP, sometimes wireless and sometimes my local device.