cdeftutorial
DESCRIPTION You provide a question and I will try to provide an answer
in the next release. No feedback equals no changes!
Additions to this document are also welcome.
Alex van den Bogaerdt <alex@ergens.op.het.net>
Why this tutorial?
One of the powerful parts of RRDtool is its ability to do
all sorts of calculations on the data retrieved from its
databases. However, RRDtool's many options and syntax make
it difficult for the average user to understand. The
manuals are good at explaining what these options do;
however, they do not (and should not) explain in detail why
they are useful. As with my RRDtool tutorial: if you want a
simple document in simple language you should read this
tutorial. If you are happy with the official
documentation, you may find this document too simple or
even boring. If you do choose to read this tutorial, I also
expect you to have read and fully understood my other
tutorial.
More reading
If you have difficulties with the way I try to explain it
please read Steve Rader's rpntutorial. It may help you
understand how this all works.
What are CDEFs?
When retrieving data from an RRD, you are using a "DEF" to
work with that data. Think of it as a variable that
changes over time (where time is the x-axis). The value of
this variable is what is found in the database at that
particular time and you can't do any modifications on it.
This is what CDEFs are for: they take values from DEFs
and perform calculations on them.
Syntax
DEF:var_name_1=some.rrd:ds_name:CF
CDEF:var_name_2=RPN_expression
You first define "var_name_1" to be data collected from
data source "ds_name" found in RRD "some.rrd" with
consolidation function "CF".
Assume the ifInOctets SNMP counter is saved in mrtg.rrd as
the DS "in". Then the following DEF defines a variable
for the average of that data source:
DEF:inbytes=mrtg.rrd:in:AVERAGE
RPN-expressions
RPN is shorthand for Reverse Polish Notation. It works as
follows. You put the variables or numbers on a stack. You
also put operations (things-to-do) on the stack and this
stack is then processed. The result will be placed on the
stack. At the end, there should be exactly one number
left: the outcome of the series of operations. If there is
not exactly one number left, rrdtool will complain loudly.
The multiplication of inbytes by eight, written in RPN as
"CDEF:inbits=inbytes,8,*", will look like:
1. Start with an empty stack
2. Put the content of variable inbytes on the stack
3. Put the number eight on the stack
4. Put the operation multiply on the stack
5. Process the stack
6. Retrieve the value from the stack and put it in
variable inbits
We will now do an example with real numbers. Suppose the
variable inbytes has the value 10; the stack would be:
1. (empty)
2. 10
3. 10 8
4. 10 8 *
5. 80
6. (empty)
Processing the stack (step 5) will retrieve one value from
the stack (from the right at step 4). This is the
operation multiply and this takes two values off the stack
as input. The result is put back on the stack (the value
80 in this case). For multiplication the order doesn't
matter but for other operations like subtraction and
division it does. Generally speaking you have the
following order:
y = A - B --> y=minus(A,B) --> CDEF:y=A,B,-
This is not very intuitive (at least most people don't
think so). For the function f(A,B) you reverse the
position of "f", but you do not reverse the position of
"A" and "B".
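If you would rather let a computer pretend to be rrdtool, the stack handling described above can be sketched in a few lines of Python. This is only an illustration and not how rrdtool is implemented: the function name rpn and the sample values are made up for the example, and only the four basic operators are handled.

```python
def rpn(expression, variables):
    """Evaluate a comma-separated RPN string with +, -, * and /."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in expression.split(","):
        if token in ops:
            b = stack.pop()  # the right-hand operand was pushed last
            a = stack.pop()
            stack.append(ops[token](a, b))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("expected exactly one value, got %d" % len(stack))
    return stack[0]


print(rpn("inbytes,8,*", {"inbytes": 10}))  # ten bytes as bits
print(rpn("A,B,-", {"A": 7, "B": 2}))       # A minus B, not B minus A
```

The operator pops the right-hand value first, which is why "A,B,-" means A minus B. An expression that leaves more than one value behind raises an error here, much like rrdtool complains.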
Suppose you have these RRDs, each containing two data
sources:
router1.rrd with link1in link2in
router2.rrd with link1in link2in
router3.rrd with link1in link2in
Suppose you would like to add up all these counters,
except for link2in inside router2.rrd. You need to do:
(in this example, "router1.rrd:link1in" means the DS
link1in inside the RRD router1.rrd)
router1.rrd:link1in
router1.rrd:link2in
router2.rrd:link1in
router3.rrd:link1in
router3.rrd:link2in
-------------------- +
(outcome of the sum)
As a mathematical function, this could be written:
"add(router1.rrd:link1in , router1.rrd:link2in ,
router2.rrd:link1in , router3.rrd:link1in ,
router3.rrd:link2in)"
With RRDtool and RPN, first, define the inputs:
DEF:a=router1.rrd:link1in:AVERAGE
DEF:b=router1.rrd:link2in:AVERAGE
DEF:c=router2.rrd:link1in:AVERAGE
DEF:d=router3.rrd:link1in:AVERAGE
DEF:e=router3.rrd:link2in:AVERAGE
Now, the mathematical function becomes: "add(a,b,c,d,e)"
In RPN, there's no operator that sums more than two values
so you need to do several additions. You add a and b, add
c to the result, add d to the result and add e to the
result.
push a:          a     stack contains the value of a
push b and add:  b,+   stack contains the result of a+b
push c and add:  c,+   stack contains the result of a+b+c
push d and add:  d,+   stack contains the result of a+b+c+d
push e and add:  e,+   stack contains the result of a+b+c+d+e
What was calculated here would be written down as:
( ( ( (a+b) + c) + d) + e)
This is in RPN: "CDEF:result=a,b,+,c,+,d,+,e,+"
push operator + on the stack:  a b c P +  (where P == d+e)
and process it:                a b Q      (where Q == c+P)
push operator + on the stack:  a b Q +
and process it:                a R        (where R == b+Q)
push operator + on the stack:  a R +
and process it:                S          (where S == a+R)
As you can see the RPN expression "a,b,c,d,e,+,+,+,+"
(five values need only four additions) will evaluate to
"((((d+e)+c)+b)+a)" and it has the same outcome as
"a,b,+,c,+,d,+,e,+". According to Steve Rader this is
called the commutative law of addition but you may forget
this right away, as long as you remember what it
represents.
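To see for yourself that both ways of writing the sum end up in the same place, here is a small Python sketch that records the stack after every token. It only understands "+", and the function name trace and the sample values are made up for this illustration.

```python
def trace(expression, values):
    """Return the stack after every token of a '+'-only RPN string."""
    stack, history = [], []
    for token in expression.split(","):
        if token == "+":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            stack.append(values[token])
        history.append(list(stack))
    return history


values = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}  # made-up sample data
for expression in ("a,b,+,c,+,d,+,e,+", "a,b,c,d,e,+,+,+,+"):
    print(expression, "->", trace(expression, values)[-1])
```

Both expressions leave a single value on the stack. A fifth "+" would find only one value left to work with and fail, which is the paper-and-pencil equivalent of rrdtool complaining.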
Now look at an expression that contains a multiplication:
First in normal math: "let result = a+b*c". In this case
you can't choose the order yourself, you have to start
with the multiplication and then add a to it. You may
alter the position of b and c, you may not alter the
position of a and b.
You have to take this into consideration when converting
this expression into RPN. Read it as: "Add the outcome of
b*c to a" and then it is easy to write the RPN expression:
"result=a,b,c,*,+". Another expression that would return
the same: "result=b,c,*,a,+".
In normal math, you may encounter something like "a*(b+c)"
and this can also be converted into RPN. The parentheses
just tell you to first add b and c, and then multiply a
by the result. Again, now it is easy to write it in RPN:
"result=a,b,c,+,*". Note that this is very similar to one
of the expressions in the previous paragraph, only the
multiplication and the addition changed places.
When you have problems with RPN or when rrdtool is
complaining, it's usually a Good Thing to write down the
stack on a piece of paper and see what happens. Have the
manual ready and pretend to be rrdtool. Just do all the
math by hand to see what happens; I'm sure this will solve
most, if not all, problems you encounter.
Some special numbers
The unknown value
Sometimes collecting your data will fail. This can be very
common, especially when querying over busy links. RRDtool
can be configured to allow for one (or even more) unknown
values and calculate the missing update. You can, for
instance, query your device every minute. This creates
one so-called PDP, or primary data point, per minute. If you
unknown, the CDP is unknown as well. This is determined
by the xff factor. Please note that one unknown counter
update can result in two unknown PDPs! If you only allow
for one unknown PDP per CDP, this makes the CDP go
unknown!
Suppose the counter increments with one per second and you
retrieve it every minute:
counter value    resulting rate
10000
10060            1; (10060-10000)/60 == 1
10120            1; (10120-10060)/60 == 1
unknown          unknown; you don't know the last value
10240            unknown; you don't know the previous value
10300            1; (10300-10240)/60 == 1
If the CDP was to be calculated from the last five
updates, it would get two unknown PDPs and three known
PDPs, so two fifths (0.4) of it would be unknown. If xff
had been set to 0.5, which by the way is a commonly used
factor, the CDP would have a known value of 1. If xff had
been set to 0.2 then the resulting CDP would be unknown.
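The xff decision for this example can be sketched in Python. This is my reading of the rule as described above, not rrdtool's actual code: I assume a CDP becomes unknown when the number of unknown PDPs exceeds xff times the number of PDPs, that the consolidation function is AVERAGE, and that None stands for an unknown PDP. The function name is hypothetical.

```python
def consolidate_average(pdps, xff):
    """Average the known PDPs, or return None if too many are unknown."""
    unknown = sum(1 for p in pdps if p is None)
    if unknown > xff * len(pdps):
        return None  # too large a share of the interval is unknown
    known = [p for p in pdps if p is not None]
    if not known:
        return None  # nothing left to average
    return sum(known) / len(known)


# The five rates from the table above: three known, two unknown
rates = [1, 1, None, None, 1]
print(consolidate_average(rates, 0.5))
print(consolidate_average(rates, 0.2))
```

With xff at 0.5 the two unknown PDPs are tolerated and the average of the remaining three is returned; with 0.2 the same input yields an unknown CDP.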
You have to decide the proper values for heartbeat, number
of PDPs per CDP and the xff factor. As you can see from
the previous text they define the behavior of your RRA.
Working with unknown data in your database
As you have read in the previous chapter, entries in an
RRA can be set to the unknown value. If you do
calculations with this type of value, the result has to be
unknown too. This means that an expression such as
"result=a,b,+" will be unknown if either a or b is
unknown. It would be wrong to just ignore the unknown
value and return the value of the other parameter. By
doing so, you would assume "unknown" means "zero" and this
is not true.
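Floating point NaN ("Not a Number") behaves the same way in most programming languages, so the effect is easy to reproduce. A short Python sketch, with made-up sample values and NaN used as a stand-in for the unknown value:

```python
import math

UNKN = float("nan")  # stand-in for the unknown value

old_counter = [120.0, 130.0, 125.0]  # hypothetical known samples
new_counter = [UNKN, UNKN, 40.0]     # hypothetical, mostly unknown

# Equivalent of "result=a,b,+" applied to every sample
total = [a + b for a, b in zip(old_counter, new_counter)]
print(total)
```

Every sample where one of the inputs is unknown comes out unknown; only the last one has a usable sum.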
There has been a case where somebody had been collecting
data for over a year. A new piece of equipment was
installed, a new RRD was created and the scripts were
changed to add a counter from the old database and a
counter from the new database. The result was
disappointing: a large part of the statistics seemed to
have vanished mysteriously... They of course didn't;
values from the old database (known values) were added to
values from the new database (unknown values) and the
result was unknown.
In this case, it is fairly reasonable to use a CDEF that
alters unknown data into zero. The counters of the device
depending on the position relative to zero.
RRDtool is capable of representing (not graphing!)
infinity by stopping at its current maximum (for positive
infinity) or minimum (for negative infinity) without
knowing this maximum (minimum).
Infinity in rrdtool is mostly used to draw an AREA without
knowing its vertical dimensions. You can think of it as
drawing an AREA with an infinite height and displaying
only the part that is visible in the current graph. This
is probably a good way to approximate infinity and it sure
allows for some neat tricks. See below for examples.
Working with unknown data and infinity
Sometimes you would like to discard unknown data and
pretend it is zero (or any other value for that matter) and
sometimes you would like to pretend that known data is
unknown (to discard known-to-be-wrong data). This is why
CDEFs have support for unknown data. There are also
examples available that show unknown data by using infinity.
Some examples
Example: using a recently created RRD
You have been keeping statistics on your router for over a
year now. Recently you installed an extra router and you
would like to show the combined throughput for these two
devices.
If you just add up the counters from router.rrd and
router2.rrd, you will add known data (from router.rrd) to
unknown data (from router2.rrd) for the bigger part of
your stats. You could solve this in a few ways:
· While creating the new database, fill it with zeros
from the start to now. You have to make the database
start at or before the least recent time in the other
database.
· Alternatively, you could use CDEF and alter unknown data
to zero.
Both methods have their pros and cons. The first method is
troublesome and if you want to do that you have to figure
it out yourself. It is not possible to create a database
filled with zeros, you have to put them in on purpose.
Implementing the second method is described next:
What we want is: "if the value is unknown, replace it with
zero". This could be written in pseudo-code as: if (value
is unknown) then (zero) else (value).
The "IF" function has two possible paths:
if true return a
if false return b
In RPN: "result=x,a,b,IF" where "x" is either true or
false.
Now we have to fill in "x", this should be the "(value is
unknown)" part and this is in RPN: "result=value,UN"
We now combine them: "result=value,UN,a,b,IF" and when we
fill in the appropriate things for "a" and "b" we're
finished:
"CDEF:result=value,UN,0,value,IF"
You may want to read Steve Rader's RPN guide if you have
difficulties with the way I explained this last example.
If you want to check this RPN expression, just mimic
RRDtool's behavior:
For any known value, the expression evaluates as follows:
CDEF:result=value,UN,0,value,IF   (value,UN) is not true so it becomes 0
CDEF:result=0,0,value,IF          "IF" will return the 3rd value
CDEF:result=value                 The known value is returned
For the unknown value, this happens:
CDEF:result=value,UN,0,value,IF   (value,UN) is true so it becomes 1
CDEF:result=1,0,value,IF          "IF" sees 1 and returns the 2nd value
CDEF:result=0                     Zero is returned
Of course, if you would like to see another value instead
of zero, you can use that other value.
Eventually, when all unknown data is removed from the RRD,
you may want to remove this rule so that unknown data is
properly displayed.
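If you prefer to check "value,UN,0,value,IF" with a program instead of by hand, the following Python sketch mimics the two functions involved. It is a toy with a made-up function name, not rrdtool's evaluator: unknown is modelled as NaN, and only "UN" and "IF" are understood.

```python
import math


def rpn(expression, variables):
    """Evaluate an RPN string that may use UN and IF (illustration only)."""
    stack = []
    for token in expression.split(","):
        if token == "UN":
            # 1 if the popped value is unknown (NaN here), otherwise 0
            stack.append(1.0 if math.isnan(stack.pop()) else 0.0)
        elif token == "IF":
            if_false = stack.pop()   # pushed last, so popped first
            if_true = stack.pop()
            condition = stack.pop()
            stack.append(if_true if condition != 0 else if_false)
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    assert len(stack) == 1, "exactly one value must remain"
    return stack[0]


UNKN = float("nan")
print(rpn("value,UN,0,value,IF", {"value": 42.0}))  # known value
print(rpn("value,UN,0,value,IF", {"value": UNKN}))  # unknown value
```

A known value passes through unchanged and an unknown one is turned into zero. Replace the "0" in the expression to substitute some other value.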
Example: better handling of unknown data, by using time
The above example has one drawback. If you do log unknown
data in your database after installing your new equipment,
it will also be translated into zero and therefore you
won't see that there was a problem. This is not good and
what you really want to do is:
· If there is unknown data, look at the time that this
sample was taken
· If the unknown value is before time xxx, make it zero
date -d "19990917 00:35:57" +%s
You could also dump the database and see where the data
starts to be known. There are several other ways of doing
this, just pick one.
Now we have to create the magic that allows us to process
unknown values differently depending on the time that the
sample was taken. This is a three-step process:
1. If the timestamp of the value is after 937521357,
leave it as is
2. If the value is a known value, leave it as is
3. Change the unknown value into zero.
Let's look at part one:
if (true) return the original value
We rewrite this:
if (true) return "a"
if (false) return "b"
We need to calculate true or false from step 1. There is a
function available that returns the timestamp for the
current sample. It is called, surprisingly enough, "TIME".
This time has to be compared to a constant number, so we
need "GT". The output of "GT" is true or false and this is
good input to "IF". We want "if (time > 937521357) then
(return a) else (return b)".
This process was already described thoroughly in the
previous chapter so let's do it quickly:
if (x) then a else b
   where x represents "time>937521357"
   where a represents the original value
   where b represents the outcome of the previous example
time>937521357        --> TIME,937521357,GT
if (x) then a else b  --> x,a,b,IF
substitute x          --> TIME,937521357,GT,a,b,IF
substitute a          --> TIME,937521357,GT,value,b,IF
substitute b          --> TIME,937521357,GT,value,value,UN,0,value,IF,IF
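The nested decision that these substitutions build up can be cross-checked in ordinary code. A Python sketch, assuming unknown is modelled as NaN; the function name and the sample timestamps are made up for the illustration.

```python
import math

CUTOFF = 937521357  # the timestamp used in the text
UNKN = float("nan")


def result(timestamp, value):
    """Mirror of TIME,937521357,GT,value,value,UN,0,value,IF,IF."""
    if timestamp > CUTOFF:                      # outer IF: TIME,...,GT
        return value                            # leave it as is
    return 0.0 if math.isnan(value) else value  # inner IF: value,UN,...


print(result(CUTOFF - 3600, UNKN))  # unknown before the cutoff
print(result(CUTOFF + 3600, UNKN))  # unknown after the cutoff
print(result(CUTOFF - 3600, 12.5))  # known before the cutoff
```

Only unknown samples from before the cutoff are turned into zero. Later unknowns stay unknown, so real problems remain visible.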
We end up with:
"CDEF:result=TIME,937521357,GT,value,value,UN,0,value,IF,IF"
almost 10mb/s while the rest of your network activity does
not produce numbers higher than 100kb/s.
There are two options:
1. If the number exceeds 100kb/s it is wrong and you want
it masked out by changing it into unknown
2. You don't want the graph to show more than 100kb/s
Pseudo code: if (number > 100) then unknown else number or
Pseudo code: if (number > 100) then 100 else number.
The second "problem" may also be solved by using the
"--rigid" option of rrdtool graph; however, this does not
have the same result. In this example you can end up with
a graph that does autoscaling. Also, if you use the
numbers to display maxima they will be set to 100kb/s.
We use "IF" and "GT" again. "if (x) then (y) else (z)" is
written down as "CDEF:result=x,y,z,IF"; now fill in x, y
and z. For x you fill in "number greater than 100kb/s"
becoming "number,100000,GT" (kilo is 1000 and b/s is what
we measure!). The "z" part is "number" in both cases and
the "y" part is either "UNKN" for unknown or "100000" for
100kb/s.
The two CDEF expressions would be:
CDEF:result=number,100000,GT,UNKN,number,IF
CDEF:result=number,100000,GT,100000,number,IF
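The difference between the two CDEFs is easy to see when they are written as ordinary functions. A Python sketch with made-up sample values, using NaN as a stand-in for "UNKN"; the function names are hypothetical.

```python
import math

UNKN = float("nan")
LIMIT = 100000  # 100kb/s, as in the text


def mask(number):
    """Mirror of number,100000,GT,UNKN,number,IF."""
    return UNKN if number > LIMIT else number


def clamp(number):
    """Mirror of number,100000,GT,100000,number,IF."""
    return LIMIT if number > LIMIT else number


samples = [42000, 99000, 9800000, 51000]  # one spike of almost 10mb/s
print([mask(s) for s in samples])
print([clamp(s) for s in samples])
```

The first variant leaves a gap where the spike was. The second draws a flat top at the limit, which also means any maximum you print will read 100000.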
Example: working on a certain time span
If you want a graph that spans a few weeks, but would only
want to see some router's data for one week, you need to
"hide" the rest of the time frame. Don't ask me when this
would be useful, it's just here for the example :)
We need to compare the time stamp to a begin date and an
end date. Comparing isn't difficult:
TIME,begintime,GE
TIME,endtime,LE
These two parts of the CDEF produce either 0 for false or
1 for true. We can now check if they are both 0 (or 1)
using a few IF statements but, as Wataru Satoh pointed
out, we can use the "*" or "+" functions as logical AND
and logical OR.
For "*", the result will be zero (false) if either one of
return true. You could also do it the other way around:
DEF:ds0=router1.rrd:ds0:AVERAGE
CDEF:ds0modified=TIME,begintime,LT,TIME,endtime,GT,+,UNKN,ds0,IF
This will return an UNKNOWN if either comparison returns
true.
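Both ways of writing the time window can be compared in a short Python sketch. The function names, begintime, endtime and the sample timestamps are made up for the illustration, and NaN stands in for "UNKN".

```python
import math

UNKN = float("nan")


def window_and(timestamp, value, begintime, endtime):
    """'*' as logical AND: keep the value only if both tests are 1."""
    inside = (1 if timestamp >= begintime else 0) * \
             (1 if timestamp <= endtime else 0)
    return value if inside != 0 else UNKN


def window_or(timestamp, value, begintime, endtime):
    """'+' as logical OR: hide the value if either test is 1."""
    outside = (1 if timestamp < begintime else 0) + \
              (1 if timestamp > endtime else 0)
    return UNKN if outside != 0 else value


for ts in (50, 100, 150, 200, 250):
    print(ts, window_and(ts, 7.0, 100, 200), window_or(ts, 7.0, 100, 200))
```

The two functions agree for every timestamp: the value is shown from begintime up to and including endtime and is unknown everywhere else.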
Example: You suspect to have problems and want to see
unknown data.
Suppose you add up the number of active users on several
terminal servers. If one of them doesn't give an answer
(or an incorrect one) you get "NaN" in the database ("Not
a Number") and NaN is evaluated as Unknown.
In this case, you would like to be alerted to it and the
sum of the remaining values is of no value to you.
It would be something like:
DEF:users1=location1.rrd:onlineTS1:LAST
DEF:users2=location1.rrd:onlineTS2:LAST
DEF:users3=location2.rrd:onlineTS1:LAST
DEF:users4=location2.rrd:onlineTS2:LAST
CDEF:allusers=users1,users2,users3,users4,+,+,+
If you now plot allusers, unknown data in one of
users1..users4 will show up as a gap in your graph. You
want to modify this to show a bright red line, not a gap.
Define an extra CDEF that is unknown if all is okay and is
infinite if there is an unknown value:
CDEF:wrongdata=allusers,UN,INF,UNKN,IF
"allusers,UN" will evaluate to either true or false, it is
the (x) part of the "IF" function and it checks if
allusers is unknown. The (y) part of the "IF" function is
set to "INF" (which means infinity) and the (z) part of
the function returns "UNKN".
The logic is: if (allusers == unknown) then return INF
else return UNKN.
You can now use AREA to display this "wrongdata" in bright
red. If it is unknown (because allusers is known) then the
red AREA won't show up. If the value is INF (because
allusers is unknown) then the red AREA will be filled in
on the graph at that particular time.
AREA:allusers#0000FF:combined user count
DEF:users1=location1.rrd:onlineTS1:LAST
DEF:users2=location1.rrd:onlineTS2:LAST
DEF:users3=location2.rrd:onlineTS1:LAST
DEF:users4=location2.rrd:onlineTS2:LAST
CDEF:allusers=users1,users2,users3,users4,+,+,+
CDEF:wrongdata=allusers,UN,INF,UNKN,IF
AREA:users1#0000FF:users at ts1
STACK:users2#00FF00:users at ts2
STACK:users3#00FFFF:users at ts3
STACK:users4#FFFF00:users at ts4
AREA:wrongdata#FF0000:unknown data
If there is unknown data in one of users1..users4, the
"wrongdata" AREA will be drawn and because it starts at
the X-axis and has infinite height it will effectively
overwrite the STACKed parts.
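The way "wrongdata" flips between unknown and infinity can be mimicked in Python. A sketch with made-up user counts, using NaN for "UNKN" and floating point infinity for "INF"; the function name is hypothetical.

```python
import math

UNKN = float("nan")
INF = float("inf")


def wrongdata(users):
    """Mirror of allusers,UN,INF,UNKN,IF for one point in time."""
    allusers = sum(users)  # unknown if any single input is unknown
    return INF if math.isnan(allusers) else UNKN


print(wrongdata([12.0, 7.0, 3.0, 9.0]))   # all servers answered
print(wrongdata([12.0, UNKN, 3.0, 9.0]))  # one server gave no answer
```

When everything is known the result is unknown, so nothing is drawn. A single unknown input makes the result infinite, which is what fills the red AREA.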
You could combine the two CDEF lines into one (we don't
use "allusers") if you like. But there are good reasons
for writing two CDEFs:
· It improves the readability of the script
· It can be used inside GPRINT to display the total
number of users
If you choose to combine them, you can substitute the
"allusers" in the second CDEF with the part after the
equal sign from the first line:
CDEF:wrongdata=users1,users2,users3,users4,+,+,+,UN,INF,UNKN,IF
If you do so, you won't be able to use these next GPRINTs:
COMMENT:"Total number of users seen"
GPRINT:allusers:MAX:"Maximum: %6.0lf"
GPRINT:allusers:MIN:"Minimum: %6.0lf"
GPRINT:allusers:AVERAGE:"Average: %6.0lf"
GPRINT:allusers:LAST:"Current: %6.0lf\n"
The examples from the rrd graph manual page
Degrees Celsius vs. Degrees Fahrenheit
rrdtool graph demo.gif --title="Demo Graph" \
DEF:cel=demo.rrd:exhaust:AVERAGE \
CDEF:far=cel,32,-,0.55555,* \
LINE2:cel#00a000:"D. Celsius" \
LINE2:far#ff0000:"D. Fahrenheit\c"
This example gets the DS called "exhaust" from database
"demo.rrd" and puts the values in variable "cel". The CDEF
used is evaluated as follows:
6. the resulting value is now "(cel-32)*0.55555"
Note that "(cel-32)*5/9" is the Fahrenheit to Celsius
function, and that 0.55555 is not exactly 5/9. It is close
enough for this purpose and it saves a calculation.
Converting Celsius to Fahrenheit would instead be
"cel*9/5+32", or "cel,9,*,5,/,32,+" in RPN.
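A quick numeric check of the arithmetic, written as a Python sketch. The function names and test temperatures are made up; the formulas are the standard conversions between the two scales.

```python
def cdef_as_written(x):
    """What cel,32,-,0.55555,* computes for an input x."""
    return (x - 32) * 0.55555


def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9


def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32


print(cdef_as_written(212), fahrenheit_to_celsius(212))
print(celsius_to_fahrenheit(100))
```

For an input of 212 the CDEF as written lands within a hundredth of 100, which shows both that 0.55555 is a fair approximation of 5/9 and that the expression maps Fahrenheit onto Celsius rather than the other way around.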
Changing unknown into zero
rrdtool graph demo.gif --title="Demo Graph" \
DEF:idat1=interface1.rrd:ds0:AVERAGE \
DEF:idat2=interface2.rrd:ds0:AVERAGE \
DEF:odat1=interface1.rrd:ds1:AVERAGE \
DEF:odat2=interface2.rrd:ds1:AVERAGE \
CDEF:agginput=idat1,UN,0,idat1,IF,idat2,UN,0,idat2,IF,+,8,* \
CDEF:aggoutput=odat1,UN,0,odat1,IF,odat2,UN,0,odat2,IF,+,8,* \
AREA:agginput#00cc00:"Input Aggregate" \
LINE1:aggoutput#0000FF:"Output Aggregate"
These two CDEFs are built from several functions. It helps
to split them when viewing what they do. Starting with
the first CDEF we would get:
idat1,UN --> a
0        --> b
idat1    --> c
if (a) then (b) else (c)
The result is therefore "0" if it is true that "idat1" is
unknown ("UN"). If not, the original value of "idat1" is
put back on the stack. Let's call this answer "d". The
process is repeated for the next five items on the stack;
it is done the same way and will return answer "h". The
resulting stack is therefore "d,h". The expression has
been simplified to "d,h,+,8,*" and it will now be easy to
see that we add "d" and "h", and multiply the result by
eight.
The end result is that we have added "idat1" and "idat2"
and in the process we effectively ignored unknown values.
The result is multiplied by eight, most likely to convert
bytes/s to bits/s.
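The same aggregate can be written as a couple of Python functions, which may make the "d" and "h" steps easier to follow. A sketch with made-up byte rates, using NaN as the unknown value; the function names are hypothetical.

```python
import math

UNKN = float("nan")


def zero_if_unknown(value):
    """Mirror of value,UN,0,value,IF (the answers "d" and "h")."""
    return 0.0 if math.isnan(value) else value


def aggregate_bits(idat1, idat2):
    """Mirror of d,h,+,8,* : add both inputs and convert to bits."""
    return (zero_if_unknown(idat1) + zero_if_unknown(idat2)) * 8


print(aggregate_bits(1000.0, 500.0))  # both interfaces known
print(aggregate_bits(1000.0, UNKN))   # second interface unknown
```

With one input unknown the result is simply eight times the other input instead of unknown.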
Infinity demo
rrdtool graph example.png --title="INF demo" \
DEF:val1=some.rrd:ds0:AVERAGE \
DEF:val2=some.rrd:ds1:AVERAGE \
DEF:val3=some.rrd:ds2:AVERAGE \
DEF:val4=other.rrd:ds0:AVERAGE \
CDEF:background=val4,POP,TIME,7200,%,3600,LE,INF,UNKN,IF \
CDEF:wipeout=val1,val2,val3,val4,+,+,+,UN,INF,UNKN,IF \
AREA:background#F0F0F0 \
AREA:val1#0000FF:Value1 \
STACK:val2#00C000:Value2 \
"TIME,7200,%" takes the modulo of time and 7200 (which is
two hours). The resulting value on the stack will be a
number in the range from 0 to 7199.
For people who don't know the modulo function: it is the
remainder after an integer division. If you divide 16 by
3, the answer would be 5 and the remainder would be 1. So,
"16,3,%" returns 1.
We have the result of "TIME,7200,%" on the stack, let's
call this "a". The start of the RPN has become "a,3600,LE"
and this checks if "a" is less than or equal to "3600". It
is true half of the time. We now have to process the rest
of the RPN and this is only a simple "IF" function that
returns either "INF" or "UNKN" depending on the time. This
is returned to variable "background".
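The striping produced by the "background" CDEF can be previewed in Python. A sketch with a made-up function name and sample timestamps, using NaN for "UNKN" and floating point infinity for "INF".

```python
import math

UNKN = float("nan")
INF = float("inf")


def background(timestamp):
    """Mirror of TIME,7200,%,3600,LE,INF,UNKN,IF."""
    a = timestamp % 7200  # remainder, somewhere in 0..7199
    return INF if a <= 3600 else UNKN


print(16 % 3)  # the modulo example from the text
for ts in (0, 1800, 3600, 3601, 7199, 7200):
    print(ts, background(ts))
```

The result is infinite during the first hour of every two-hour block and unknown during the second, which gives the graph alternating shaded and unshaded bands.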
The second CDEF has been discussed earlier in this
document so we won't do that here.
Now you can draw the different layers. Start with the
background that is either unknown (nothing to see) or
infinite (the whole positive part of the graph gets
filled). Next you draw the data on top of this
background. It will overlay the background. Suppose one of
val1..val4 is unknown; in that case you end up with
only three bars stacked on top of each other. You don't
want to see this because the data is only valid when all
four variables are valid. This is why you use the second
CDEF: it will overlay the data with an AREA so the data
cannot be seen anymore.
If your data can also have negative values you also need
to overwrite the other half of your graph. This can be
done in a relatively simple way: what you need is the
"wipeout" variable with a negative sign placed before it:
"CDEF:wipeout2=wipeout,-1,*"
Filtering data
You may do some complex data filtering:
MEDIAN FILTER: filters shot noise
DEF:var=database.rrd:traffic:AVERAGE
CDEF:prev1=PREV(var)
CDEF:prev2=PREV(prev1)
CDEF:prev3=PREV(prev2)
CDEF:median=prev1,prev2,prev3,+,+,3,/
LINE3:median#000077:filtered
LINE1:prev2#007700:'raw data'
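To see what these CDEFs compute, here is a Python sketch of the same steps. It assumes that PREV yields the value of a variable at the previous time step and unknown at the first step; the sample series and function names are made up, and NaN stands in for unknown. Note that the "median" line is the mean of three consecutive samples, which is not the same thing as a true median.

```python
import math
import statistics

UNKN = float("nan")


def prev(series):
    """Shift a series one step: unknown first, then the earlier values."""
    return [UNKN] + series[:-1]


def filtered(series):
    """Mirror of prev1,prev2,prev3,+,+,3,/ for every time step."""
    prev1 = prev(series)
    prev2 = prev(prev1)
    prev3 = prev(prev2)
    return [(a + b + c) / 3 for a, b, c in zip(prev1, prev2, prev3)]


var = [10.0, 10.0, 100.0, 10.0, 10.0, 10.0]  # one spike of shot noise
print(filtered(var))
print(statistics.median([100.0, 10.0, 10.0]))  # a true median, for contrast
```

The spike is spread out over three steps rather than removed, and the first three results are unknown because there is no earlier data to look back on.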
Out of ideas for now
This document was created from questions asked by either
myself or by other people on the list. Please let me know
if you find errors in it or if you have trouble
understanding it. If you think there should be an addition,
mail me: <alex@ergens.op.het.net>
Remember: No feedback equals no changes!
SEE ALSO The RRDtool manpages
AUTHOR Alex van den Bogaerdt <alex@ergens.op.het.net>
1.0.45 2002-06-20 CDEFTUTORIAL(1)
