Ok, let’s chat about Wavefront’s UI and getting value from our data! This is the most user-friendly product that I’ve used for time series data! Let’s explore a quick example using disk space to showcase some of that functionality.
Telegraf only sends raw values for:
- Total Space
- Free Space
- Used Space
This is seen below where I have intentionally limited the results to a single host and single disk object. We have ~55 GB Total and ~30 GB Free.
What if I want to know the percentage of used space?
Wavefront is impressively smart about letting you do math-based operations on metrics dynamically. I’m even going to do it a bit backwards, intentionally, to showcase this. Ideally, the formula for the percentage of used space would be:
100 * (disk.used / disk.total)
To find the inverse (percent available), we can simply subtract that value from 100 (%):
100 - (100 * (disk.used / disk.total))
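If you want to sanity-check the arithmetic outside of Wavefront, here is a minimal Python sketch of the two formulas above. The function names are my own for illustration, and the numbers are the rough figures from the chart (~55 GB total, ~30 GB free). It also assumes used = total - free, which may not hold exactly on file systems with reserved blocks.

```python
def percent_used(used_gb: float, total_gb: float) -> float:
    """Percentage of the disk in use: 100 * (used / total)."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return 100 * (used_gb / total_gb)


def percent_free(used_gb: float, total_gb: float) -> float:
    """Percentage available: the inverse of percent used."""
    return 100 - percent_used(used_gb, total_gb)


# Approximate figures from the chart: ~55 GB total, ~30 GB free.
total_gb = 55.0
free_gb = 30.0
used_gb = total_gb - free_gb  # assumption: used + free == total

print(f"used: {percent_used(used_gb, total_gb):.1f}%")
print(f"free: {percent_free(used_gb, total_gb):.1f}%")
```

The two results always add up to 100, which is a handy check that the "backwards" version is doing what we expect.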
Let’s see if that actually works:
Wow, just like that we can take two metrics, multiply, divide and subtract to show the percent of disk space available. Wavefront automatically handles the correlation of the devices and properly applies the math. Ok, that’s nice, but how can I tell if I’m rapidly running out of disk space WITHOUT setting a static threshold?
First, we’re going to clone our original query using the little copy icon to the right of the query. Then, select the Query Wizard.
The Query Wizard makes people like me look smart. I can select the general category that I want…
and then I can select the method to use in that category. The wizard automatically applies the requested query syntax and previews the results for you to easily verify it is behaving as desired.
That’s great, now let’s create an alert on this standard deviation. In this case I can see that my disk space deviation on an hourly basis is generally under 1. For this example, I’m going to configure our alarm to trigger when the SD is greater than 2.
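To make "deviation on an hourly basis" a bit more concrete, here is a rough Python sketch of a moving standard deviation over a one-hour window. This is not how Wavefront implements it. It is just my illustration of the idea, assuming samples arrive as (timestamp in seconds, value) pairs sorted by time, and assuming a population standard deviation. The sample data is made up.

```python
from statistics import pstdev


def moving_stddev(samples, window_seconds=3600):
    """For each (timestamp, value) sample, compute the population standard
    deviation of all values in the window (ts - window_seconds, ts].

    `samples` must be sorted by timestamp.
    """
    results = []
    start = 0  # index of the oldest sample still inside the window
    for i, (ts, _) in enumerate(samples):
        # Drop samples that have aged out of the window.
        while samples[start][0] <= ts - window_seconds:
            start += 1
        window_values = [value for _, value in samples[start:i + 1]]
        results.append((ts, pstdev(window_values)))
    return results


# Hypothetical percent-free readings every 10 minutes: flat, then a fast drop.
samples = [(0, 54.5), (600, 54.5), (1200, 54.4),
           (1800, 54.4), (2400, 48.0), (3000, 41.0)]

for ts, sd in moving_stddev(samples):
    print(f"t={ts:>5}s  stddev={sd:.2f}")
```

While the disk is steady, the deviation stays near zero. It only climbs when the value starts moving quickly, which is exactly the behavior we want to alert on.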
First, for reference, this is what our values look like:
Now, remember that an alert will be triggered whenever your query returns a non-zero value. This means that we need to modify our query slightly to include a threshold, so that only values above our threshold (2) produce a non-zero result and trigger an alarm. You can see how the spikes below in our alert definition correlate to the raw data above.
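The threshold trick is easier to see in code. Here is a small Python sketch (again my own illustration with made-up values, not Wavefront's implementation) that turns a series of standard deviation values into a signal that is 0 when things are normal and 1 when the threshold of 2 is breached:

```python
def alert_signal(stddev_series, threshold=2.0):
    """Map each (timestamp, stddev) pair to 1 if stddev exceeds the
    threshold, otherwise 0. A non-zero value means the alert would fire."""
    return [(ts, 1 if sd > threshold else 0) for ts, sd in stddev_series]


# Hypothetical hourly standard deviation values: mostly under 1, one spike.
stddev_series = [(0, 0.3), (3600, 0.8), (7200, 4.6), (10800, 0.5)]

for ts, fired in alert_signal(stddev_series):
    print(f"t={ts:>6}s  alert={'YES' if fired else 'no'}")
```

Only the spike at the third point crosses the threshold, so that is the only moment the signal goes non-zero.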
There you have it: the basics, a bit of advanced data manipulation, and some alerting based on statistical analysis!