I have a tiny i686 at home running OpenBSD where I run, amongst some other
things, an instance of unbound.

Last night I decided that I wanted a dashboard to collect some statistics
about it.

My first thought was the ELK stack. The only problem is the RAM:
the little i686 has 1GB of it, and I don't know if that's enough to run
logstash, let alone the whole ELK.

A simple solution would be to collect the logs elsewhere, but I'm not
going to do this for various reasons (laziness being the first, followed
by the fact that having statistics about my DNS queries isn't that
useful in my opinion, even if it's nice to have).

Instead, my solution involves a bit of bash (don't hate me for this),
some fifos, tmux and ttyplot.

The primary source of inspiration is [this
post](https://dataswamp.org/~solene/2019-07-29-ttyplot.html) that I read
some time ago: it's about plotting various system statistics with ttyplot.

The result is this:

![unbound dashboard screenshot](/img/unbound-dashboard.png)

(note that I usually disable colors in xterm)

## The flow

                         .-
                        /  various   -------> multiple
    unbound stats -------  fifos     -------> ttyplot
                        \            -------> per tmux pane
                         `-

The idea is to run `unbound-control stats` every once in a while,
multiplex its output and draw each (interesting) stat with ttyplot
in a tmux pane.

Why the fifos? Well, if I'm not wrong, every time you call
`unbound-control stats` it clears the statistics, so you can't run
it *n* times to collect *n* different stats. And since the whole setup
requires only a couple of scripts, the easiest way was to use some fifos.

The whole setup requires three scripts:

- `gen-dashboard.sh`
- `dashboard.sh`
- `mystatd.sh`

### `gen-dashboard.sh`

This is the startup script. I run it from my crontab as `@reboot
/path/to/gen-dashboard.sh`. It creates the required fifos, then
spawns a tmux session with two windows and some panes.

```sh
#!/bin/sh

# create the fifos
for f in netstat queries hit miss time; do
    mkfifo "/tmp/my-$f"
done

session=dashboard

tmux new-session -d -s "$session"

# start mystatd.sh
tmux new-window -t "$session:1" -n 'logs'
tmux send-keys "/path/to/mystatd.sh" C-m

# create the dashboard
tmux select-window -t "$session:0"

# setup the layout of the panes
tmux split-window -h
tmux select-pane -L
tmux split-window -v
tmux select-pane -R
tmux split-window -v -p 66
tmux split-window -v -p 50

# load the correct ttyplot in the panes
tmux select-pane -t 0
tmux send-keys "/path/to/dashboard.sh netstat" C-m

tmux select-pane -t 1
tmux send-keys "/path/to/dashboard.sh queries" C-m

tmux select-pane -t 2
tmux send-keys "/path/to/dashboard.sh hit" C-m

tmux select-pane -t 3
tmux send-keys "/path/to/dashboard.sh miss" C-m

tmux select-pane -t 4
tmux send-keys "/path/to/dashboard.sh time" C-m
```

(A possible improvement would be to tell tmux which command to run when
creating a pane instead of sending the keys to the shell, but it works
nevertheless.)

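A minimal sketch of that idea, assuming that `new-session` and `split-window` are given the command as their last argument. `build_dashboard` and `tmux_bin` are names made up for this example, the paths are the same placeholders as above, and only three of the five panes are shown.

```sh
#!/bin/sh
# Sketch: let tmux start each command itself instead of typing it
# into a shell with send-keys.
# build_dashboard and tmux_bin are hypothetical names; the paths are
# placeholders.

tmux_bin=${TMUX_BIN:-tmux}
session=dashboard
dash=/path/to/dashboard.sh

build_dashboard() {
    # the first pane runs the command given to new-session
    "$tmux_bin" new-session -d -s "$session" "$dash netstat"

    # every split starts its own command directly
    "$tmux_bin" split-window -t "$session:0" -h "$dash queries"
    "$tmux_bin" split-window -t "$session:0" -v "$dash hit"
}
```

Calling `build_dashboard` at the end of the script would build the window. One difference to keep in mind: a pane started this way closes as soon as its command exits, while with `send-keys` you are left with a shell.
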
There's nothing special about this script, so let's move on to the next one.

### `dashboard.sh`

This script isn't interesting either: all it does is pull the data out of
the correct fifo and start ttyplot with the correct labels and units.

```sh
#!/bin/sh

if [ -z "$1" ]; then
    echo "missing dashboard type" >&2
    echo "usage: $0 <dashboard-name>" >&2
    exit 1
fi

case "$1" in
    netstat)
        (while :; do
            cat /tmp/my-netstat
        done) | ttyplot -t "IN Bandwidth in KB/s" \
            -u "KB/s" -c "#"
        ;;

    queries)
        (while :; do
            cat /tmp/my-queries
        done) | ttyplot -t "DNS Queries/5s" \
            -u "q/5s" -c "#"
        ;;

    hit)
        (while :; do
            cat /tmp/my-hit
        done) | ttyplot -t "DNS cache hit/5s" \
            -u "ch/5s" -c "#"
        ;;

    miss)
        (while :; do
            cat /tmp/my-miss
        done) | ttyplot -t "DNS cache miss/5s" \
            -u "cm/5s" -c "#"
        ;;

    time)
        (while :; do
            cat /tmp/my-time
        done) | ttyplot -t "DNS query time avg/5s" \
            -c "#"
        ;;

    *)
        printf "%s\n" "$1 is not a valid dashboard" >&2
        exit 1
        ;;
esac
```

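The `while :; do cat ...; done` loop is there because `cat` hits end-of-file, and exits, every time the writer closes the fifo; the loop reopens it for the next value. Here is a small sketch of that behaviour with a throwaway fifo in a temporary directory. The path and the `read_n` helper are made up for the demo, and the one-second pause stands in for the interval between two real samples.

```sh
#!/bin/sh
# Sketch: why the reader needs a loop around cat.
dir=$(mktemp -d)
fifo=$dir/demo
mkfifo "$fifo"

# writer: opens and closes the fifo once per value, like the
# `> /tmp/my-$2` redirection does in mystatd.sh
(
    echo 1 > "$fifo"
    sleep 1
    echo 2 > "$fifo"
) &

# reader: each cat ends when the writer closes the fifo, so run it
# once per expected value (dashboard.sh loops forever instead)
read_n() {
    i=0
    while [ "$i" -lt "$1" ]; do
        cat "$fifo"
        i=$((i + 1))
    done
}

values=$(read_n 2)
wait
rm -r "$dir"
echo "$values"
```

A single `cat` would have returned after the first value and ttyplot would never see the second one.
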
### `mystatd.sh`

This is the (only?) interesting script. It's also the only one that
requires bash, because I'm lazy, because it was already installed as a
dependency of something, and because of the `>(cmd)` construct. Rewriting
the script using only pure sh(1) constructs is left as an exercise to the
reader (hint: you need some extra fifos.)

```sh
#!/usr/bin/env bash

filter() {
    grep "$1" | awk -F= '{print $2}' > /tmp/my-$2
}

# unbound stats
( while :; do
    unbound-control stats \
        | grep -v thread0 \
        | tee >(filter queries= queries) \
        | tee >(filter hit hit) \
        | tee >(filter miss miss) \
        | filter time.avg time

    sleep 5
done ) &

# netstat - ty solene@ for the awk
(
    (while :; do
        netstat -ibn
        sleep 1
    done) | awk '
        BEGIN {
            old=-1
        }
        /^em0/ {
            if(!index($4,":") && old>=0) {
                print ($5-old)/1024
                fflush()
                old = $5
            }
            if(old==-1) {
                old = $5
            }
        }' | tee -a /tmp/my-netstat
) &

wait
```

The first piece collects the stats from unbound. Let's break it into pieces.

- `unbound-control stats` outputs the stats. Keep in mind that this
  requires some privileges. I've solved this by creating a script
  in /usr/local/bin that executes the command and allowing my user to
  launch that script via `doas(1)`. Or you can run `mystatd.sh` as root.
  Do as you please.
- `grep -v thread0` removes the per-thread stats (since my unbound
  uses only one thread). A more robust approach like `egrep -v ^thread`
  is probably better.
- `tee >(filter queries= queries) |` duplicates the stream: one copy
  goes to the subshell running `filter` and another copy continues down
  the pipeline.
- `filter` is just a small function that greps the desired entry and sends
  its value to `/tmp/my-$something`.

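The same fan-out can be sketched in plain sh(1) with one extra fifo per consumer, which is one possible answer to the exercise above and not the script I actually run. To make it runnable without unbound or ttyplot, `sample_stats` prints a few made-up `key=value` lines, and `filter` writes to regular files (`out-*`) in a temporary directory instead of the `/tmp/my-*` fifos; all those names are assumptions of the example.

```sh
#!/bin/sh
# Sketch: the tee fan-out of mystatd.sh without >(cmd).
# sample_stats, the temporary directory and the out-* files are
# stand-ins so the example runs on its own.

dir=$(mktemp -d)

# stand-in for `unbound-control stats | grep -v thread0`
sample_stats() {
    printf '%s\n' \
        'total.num.queries=42' \
        'total.num.cachehits=30' \
        'total.num.cachemiss=12' \
        'total.recursion.time.avg=0.081'
}

# same as filter in mystatd.sh, but writing to a regular file
filter() {
    grep "$1" | awk -F= '{print $2}' > "$dir/out-$2"
}

# one extra fifo per consumer that >(cmd) used to feed
mkfifo "$dir/raw-queries" "$dir/raw-hit" "$dir/raw-miss"

# start the consumers first; each blocks until tee opens its fifo
filter queries= queries < "$dir/raw-queries" &
filter hit hit < "$dir/raw-hit" &
filter miss miss < "$dir/raw-miss" &

# tee copies the stream into every fifo and down the pipe
sample_stats \
    | tee "$dir/raw-queries" "$dir/raw-hit" "$dir/raw-miss" \
    | filter time.avg time

wait
```

After `wait` returns, each `out-*` file holds the value extracted by its filter, for example `42` in `out-queries`.
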
The netstat bit filters the output of netstat (the awk is copy-pasted
from the previously linked post by solene@). You may want to change the
`^em0` to match your network device.

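To see what the awk program does, here is a sketch that feeds it two made-up samples. The input lines only imitate the column layout (name, mtu, network, address, input bytes, output bytes) and the addresses are documentation placeholders; `bandwidth` is a name invented for the example.

```sh
#!/bin/sh
# Sketch: run the awk program on two made-up samples.

bandwidth() {
    awk '
    BEGIN {
        old = -1
    }
    /^em0/ {
        # only lines whose 4th field has no colon produce a value
        if (!index($4, ":") && old >= 0) {
            print ($5 - old) / 1024
            fflush()
            old = $5
        }
        # the very first em0 line seeds the counter
        if (old == -1) {
            old = $5
        }
    }'
}

result=$(printf '%s\n' \
    'em0 1500 <Link>      00:00:5e:00:53:01 1024000 2048' \
    'em0 1500 192.0.2/24  192.0.2.10        1024000 2048' \
    'em0 1500 <Link>      00:00:5e:00:53:01 1126400 4096' \
    'em0 1500 192.0.2/24  192.0.2.10        1126400 4096' \
    | bandwidth)
echo "$result"
```

This prints `0` for the first sample (the counter was seeded by the line just before it) and `100` for the second: the input counter grew by 102400 bytes, which is 100 KB.
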
And that's all!

## Possible improvements

- if you `SIGINT` `mystatd.sh` some of its subprocesses keep running. Maybe a
  `trap` is needed. Since it is the only bash running on that system,
  `pkill bash` is, albeit a bit aggressive, a working solution.
- replace bash. It's not difficult, but requires more fifos.
- ...

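A rough sketch of the `trap` idea: remember the PID of every background loop and kill them when the script is interrupted. `worker`, `start_worker` and `cleanup` are hypothetical names, and `worker` is just a stand-in for one of the collection loops.

```sh
#!/bin/sh
# Sketch: kill the background workers on SIGINT/SIGTERM.
pids=""

# hypothetical stand-in for one of the collection loops
worker() {
    while :; do
        sleep 1
    done
}

start_worker() {
    worker &
    pids="$pids $!"
}

cleanup() {
    # $pids is intentionally unquoted: one argument per PID
    [ -n "$pids" ] && kill $pids 2>/dev/null
    pids=""
}

trap 'cleanup; exit 130' INT
trap 'cleanup; exit 143' TERM
```

The script would then call `start_worker` once per loop and finish with `wait`. This only kills the shells running the loops: a command they started (a `sleep`, say) lives on until it finishes by itself, and nested pipelines like the netstat one would probably need more care.
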
## Final considerations

This was fun. Now I have a tmux session with cool graphs that I can attach
to remotely. This doesn't cover archiving the statistics, though.
I think it should be trivial to add (just one more `| tee -a` to a local
file, maybe a cronjob to do rotation, ...) but for the moment I'm happy
with this result.
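
For what it's worth, here is a sketch of how the archiving could look. Instead of a bare `tee -a` it uses a small loop, so that every stored value gets a timestamp; `archive` and the log path are made up for the example.

```sh
#!/bin/sh
# Sketch: keep a timestamped copy of every value while still passing
# it on. archive and the log path are hypothetical.

# usage: producer | archive /path/to/log | consumer
archive() {
    while read -r value; do
        printf '%s %s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$value" >> "$1"
        printf '%s\n' "$value"
    done
}
```

Inside `filter` it would sit between the `awk` and the redirection to the fifo, for example `... | archive "$HOME/unbound-$2.log" > /tmp/my-$2`.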