doc/rrdtutorial.txt

   1 RRDTUTORIAL(1)                      rrdtool                     RRDTUTORIAL(1)
   2
   3
   4
   5 rrdtutorial - Alex van den Bogaerdt's RRDtool tutorial
   6
   7 DESCRIPTION
   8        RRDtool is written by Tobias Oetiker <tobi@oetiker.ch> with contribu-
   9        tions from many people all around the world. This document is written
  10        by Alex van den Bogaerdt <alex@vandenbogaerdt.nl> to help you under-
  11        stand what RRDtool is and what it can do for you.
  12
  13        The documentation provided with RRDtool can be too technical for some
  14        people. This tutorial is here to help you understand the basics of RRD-
  15        tool. It should prepare you to read the documentation yourself.  It
  16        also explains the general things about statistics with a focus on net-
  17        working.
  18
  19 TUTORIAL
  20        Important
  21
  22        Please don't skip ahead in this document!  The first part of this docu-
  23        ment explains the basics and may be boring.  But if you don't under-
  24        stand the basics, the examples will not be as meaningful to you.
  25
  26        Sometimes things change.  This example used to provide numbers like
  27        "0.04" instead of "4.00000e-02".  Those are really the same numbers,
  28        just written down differently.  Don't be alarmed if a future version of
  29        rrdtool displays a slightly different form of output. The examples in
  30        this document are correct for version 1.2.0 of RRDtool.
  31
  32        Also, sometimes bugs do occur. They may also influence the outcome of
  33        the examples. Example speed4.png was suffering from this (the handling
  34        of unknown data in an if-statement was wrong). Normal data will be just
  35        fine (a bug in rrdtool wouldn't last long) but special cases like NaN,
  36        INF and so on may last a bit longer.  Try another version if you can,
  37        or just live with it.
  38
  39        I fixed the speed4.png example (and added a note). There may be other
  40        examples which suffer from the same or a similar bug.  Try to fix it
  41        yourself, which is a great exercise. But please do not submit your
  42        result as a fix to the source of this document. Discuss it on the
  43        user's list, or write to me.
  44
  45        What is RRDtool?
  46
  47        RRDtool refers to Round Robin Database tool.  Round robin is a tech-
  48        nique that works with a fixed amount of data, and a pointer to the cur-
  49        rent element. Think of a circle with some dots plotted on the edge.
  50        These dots are the places where data can be stored. Draw an arrow from
  51        the center of the circle to one of the dots; this is the pointer.  When
  52        the current data is read or written, the pointer moves to the next ele-
  53        ment. As we are on a circle there is neither a beginning nor an end,
  54        you can go on and on and on. After a while, all the available places
  55        will be used and the process automatically reuses old locations. This
  56        way, the dataset will not grow in size and therefore requires no
  57        maintenance.  RRDtool works with Round Robin Databases (RRDs). It
  58        stores and retrieves data from them.
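
            The circle with a moving pointer can be sketched in a few lines of
            shell. This is only an illustration of the round robin idea, not
            how RRDtool stores data internally; the names SLOTS, store and dump
            are made up for this sketch.

```shell
#!/bin/sh
# Round robin storage sketch: a fixed number of places and a pointer
# that wraps around once the last place has been used.
SLOTS=4
pointer=0

# store VALUE: write into the current place, then advance the pointer.
store() {
    eval "slot_$pointer=\$1"
    pointer=$(( (pointer + 1) % SLOTS ))
}

# dump: print all places in storage order (not in order of age).
dump() {
    i=0
    out=
    while [ "$i" -lt "$SLOTS" ]; do
        eval "val=\${slot_$i:-empty}"
        out="$out${out:+ }$val"
        i=$((i + 1))
    done
    echo "$out"
}

# Six values go into four places, so the two oldest are overwritten.
for v in 10 20 30 40 50 60; do
    store "$v"
done
dump
```

            After six values the four places hold "50 60 30 40": the values 10
            and 20 were overwritten and the storage did not grow.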
  59
  60        What data can be put into an RRD?
  61
  62        You name it, it will probably fit as long as it is some sort of time-
  63        series data. This means you have to be able to measure some value at
  64        several points in time and provide this information to RRDtool. If you
  65        can do this, RRDtool will be able to store it. The values must be
  66        numerical but don't have to be integers, as is the case with MRTG (the
  67        next section will give more details on this more specialized applica-
  68        tion).
  69
  70        Many examples below talk about SNMP which is an acronym for Simple Net-
  71        work Management Protocol. "Simple" refers to the protocol. It does not
  72        mean it is simple to manage or monitor a network. After working your
  73        way through this document, you should know enough to be able to under-
  74        stand what people are talking about. For now, just realize that SNMP
  75        can be used to query devices for the values of counters they keep. It
  76        is the value from those counters that we want to store in the RRD.
  77
  78        What can I do with this tool?
  79
  80        RRDtool originated from MRTG (Multi Router Traffic Grapher). MRTG
  81        started as a tiny little script for graphing the use of a university's
  82        connection to the Internet. MRTG was later (ab-)used as a tool for
  83        graphing other data sources including temperature, speed, voltage, num-
  84        ber of printouts and the like.
  85
  86        Most likely you will start to use RRDtool to store and process data
  87        collected via SNMP. The data will most likely be bytes (or bits)
  88        transferred from and to a network or a computer.  But it can also be used to
  89        display tidal waves, solar radiation, power consumption, number of vis-
  90        itors at an exhibition, noise levels near an airport, temperature on
  91        your favorite holiday location, temperature in the fridge and whatever
  92        your imagination can come up with.
  93
  94        You only need a sensor to measure the data and be able to feed the num-
  95        bers into RRDtool. RRDtool then lets you create a database, store data
  96        in it, retrieve that data and create graphs in PNG format for display
  97        on a web browser. Those PNG images are dependent on the data you col-
  98        lected and could be, for instance, an overview of the average network
  99        usage, or the peaks that occurred.
 100
 101        What if I still have problems after reading this document?
 102
 103        First of all: read it again! You may have missed something.  If you are
 104        unable to compile the sources and you have a fairly common OS, it will
 105        probably not be the fault of RRDtool. There may be pre-compiled ver-
 106        sions around on the Internet. If they come from trusted sources, get
 107        one of those.
 108
 109        If on the other hand the program works but does not give you the
 110        expected results, it will be a problem with configuring it. Review your
 111        configuration and compare it with the examples that follow.
 112
 113        There is a mailing list and an archive of it. Read the list for a few
 114        weeks and search the archive. It is considered rude to just ask a ques-
 115        tion without searching the archives: your problem may already have been
 116        solved for somebody else!  This is true for most, if not all, mailing
 117        lists and not only for this particular one. Look in the documentation
 118        that came with RRDtool for the location and usage of the list.
 119
 120        I suggest you take a moment to subscribe to the mailing list right now
 121        by sending an email to <rrd-users-request@lists.oetiker.ch> with a sub-
 122        ject of "subscribe". If you ever want to leave this list, just write an
 123        email to the same address but now with a subject of "unsubscribe".
 124
 125        How will you help me?
 126
 127        By giving you some detailed descriptions with detailed examples.  I
 128        assume that following the instructions in the order presented will give
 129        you enough knowledge of RRDtool to experiment for yourself.  If it
 130        doesn't work the first time, don't give up. Reread the stuff that you
 131        did understand; you may have missed something.
 132
 133        By following the examples you get some hands-on experience and, even
 134        more important, some background information on how it works.
 135
 136        You will need to know something about hexadecimal numbers. If you don't
 137        then start with reading bin_dec_hex before you continue here.
 138
 139        Your first Round Robin Database
 140
 141        In my opinion the best way to learn something is to actually do it.
 142        Why not start right now?  We will create a database, put some values in
 143        it and extract this data again.  Your output should be the same as the
 144        output that is included in this document.
 145
 146        We will start with some easy stuff and compare a car with a router, or
 147        compare kilometers (miles if you wish) with bits and bytes. It's all
 148        the same: some number over some time.
 149
 150        Assume we have a device that transfers bytes to and from the Internet.
 151        This device keeps a counter that starts at zero when it is turned on,
 152        increasing with every byte that is transferred. This counter will
 153        probably have a maximum value. If this value is reached and an extra byte is
 154        counted, the counter starts over at zero. This is the same as many
 155        counters in the world such as the mileage counter in a car.
 156
 157        Most discussions about networking talk about bits per second so let's
 158        get used to that right away. Assume a byte is eight bits and start to
 159        think in bits not bytes. The counter, however, still counts bytes!  In
 160        the SNMP world most of the counters are 32 bits. That means they are
 161        counting from 0 to 4294967295. We will use these values in the
 162        examples.  The device, when asked, returns the current value of the
 163        counter. We know the time that has passed since we last asked so we now
 164        know how many bytes have been transferred ***on average*** per second.
 165        This is not very hard to calculate. First in words, then in
 166        calculations:
 167
 168        1. Take the current counter, subtract the previous value from it.
 169
 170        2. Do the same with the current time and the previous time (in sec-
 171           onds).
 172
 173        3. Divide the outcome of (1) by the outcome of (2), the result is the
 174           amount of bytes per second. Multiply by eight to get the number of
 175           bits per second (bps).
 176
 177          bps = (counter_now - counter_before) / (time_now - time_before) * 8
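
            The formula above can be tried out with a small shell function.
            This is a sketch only: the function name rate_bps and the sample
            readings are invented, and the wrap handling assumes a 32 bit
            counter that started over at zero at most once between the two
            readings.

```shell
#!/bin/sh
# rate_bps COUNTER_NOW COUNTER_BEFORE TIME_NOW TIME_BEFORE
# Prints the average number of bits per second between two readings
# of a byte counter.
rate_bps() {
    awk -v c1="$1" -v c0="$2" -v t1="$3" -v t0="$4" 'BEGIN {
        delta = c1 - c0
        # Assumption: a 32 bit counter that wrapped once past 4294967295.
        if (delta < 0) delta += 4294967296
        printf "%.2f\n", delta / (t1 - t0) * 8
    }'
}

rate_bps 1500000 1200000 600 300     # 300000 bytes in 300 seconds
rate_bps 1000 4294967000 600 300     # the counter wrapped in between
```

            The first call prints 8000.00, the second 34.56 (1296 bytes were
            counted across the wrap).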
 178
 179        For some people it may help to translate this to an automobile example.
 180        Do not try this example, and if you do, don't blame me for the results!
 181
 182        People who are not used to thinking in kilometers per hour can translate
 183        most into miles per hour by dividing km by 1.6 (close enough).  I will
 184        use the following abbreviations:
 185
 186         m:    meter
 187         km:   kilometer (= 1000 meters).
 188         h:    hour
 189         s:    second
 190         km/h: kilometers per hour
 191         m/s:  meters per second
 192
 193        You are driving a car. At 12:05 you read the counter in the dashboard
 194        and it tells you that the car has moved 12345 km until that moment.  At
 195        12:10 you look again, it reads 12357 km. This means you have traveled
 196        12 km in five minutes. A scientist would translate that into meters per
 197        second and this makes a nice comparison toward the problem of (bytes
 198        per five minutes) versus (bits per second).
 199
 200        We traveled 12 kilometers which is 12000 meters. We did that in five
 201        minutes or 300 seconds. Our speed is 12000m / 300s or 40 m/s.
 202
 203        We could also calculate the speed in km/h: 12 times 5 minutes is an
 204        hour, so we have to multiply 12 km by 12 to get 144 km/h.  For our
 205        native English speaking friends: that's 90 mph so don't try this exam-
 206        ple at home or where I live :)
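
            The same arithmetic as a small shell sketch. The function name
            speed is made up; the division by 1.6 is the "close enough"
            conversion from above.

```shell
#!/bin/sh
# speed KM_BEFORE KM_NOW SECONDS
# Prints the average speed between two readings of the dashboard counter.
speed() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN {
        km = b - a
        printf "%.0f m/s, %.0f km/h, %.0f mph\n",
            km * 1000 / s, km * 3600 / s, km * 3600 / s / 1.6
    }'
}

speed 12345 12357 300
```

            For the readings of 12:05 and 12:10 this prints
            "40 m/s, 144 km/h, 90 mph".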
 207
 208        Remember: these numbers are averages only.  There is no way to figure
 209        out from the numbers, if you drove at a constant speed.  There is an
 210        example later on in this tutorial that explains this.
 211
 212        I hope you understand that there is no difference in calculating m/s or
 213        bps; only the way we collect the data is different. Even the k from
 214        kilo is the same: in networking terms k also means 1000.
 215
 216        We will now create a database where we can keep all these interesting
 217        numbers. The method used to start the program may differ slightly from
 218        OS to OS, but I assume you can figure it out if it works differently on
 219        yours. Make sure you do not overwrite any file on your system when
 220        executing the following command and type the whole line as one long
 221        line (I had to split it for readability) and skip all of the '\' char-
 222        acters.
 223
 224           rrdtool create test.rrd             \
 225                    --start 920804400          \
 226                    DS:speed:COUNTER:600:U:U   \
 227                    RRA:AVERAGE:0.5:1:24       \
 228                    RRA:AVERAGE:0.5:6:10
 229
 230        (So enter: "rrdtool create test.rrd --start 920804400 DS ...")
 231
 232        What has been created?
 233
 234        We created the round robin database called test (test.rrd) which starts
 235        at noon the day I started writing this document, 7th of March, 1999
 236        (this date translates to 920804400 seconds as explained below). Our
 237        database holds one data source (DS) named "speed" that represents a
 238        counter. This counter is read every five minutes (this is the default,
 239        so you don't have to put "--step=300").  In the same database
 240        two round robin archives (RRAs) are kept. One averages the data every
 241        time it is read (i.e., there's nothing to average) and keeps 24 samples
 242        (24 times 5 minutes is 2 hours). The other averages 6 values (half an
 243        hour) and contains 10 such averages (i.e., 5 hours).
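
            The time span covered by each archive follows from the step, the
            number of readings per average and the number of rows. A sketch of
            that arithmetic (the names STEP and rra_seconds are made up for
            this illustration):

```shell
#!/bin/sh
STEP=300    # seconds between readings, the default step mentioned above

# rra_seconds STEPS ROWS: time span covered by one archive, in seconds.
rra_seconds() {
    echo $(( STEP * $1 * $2 ))
}

echo "RRA:AVERAGE:0.5:1:24 covers $(( $(rra_seconds 1 24) / 3600 )) hours"
echo "RRA:AVERAGE:0.5:6:10 covers $(( $(rra_seconds 6 10) / 3600 )) hours"
```

            This prints 2 hours for the first archive and 5 hours for the
            second one.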
 244
 245        RRDtool works with special time stamps coming from the UNIX world.
 246        This time stamp is the number of seconds that passed since January 1st
 247        1970 UTC.  The time stamp value is translated into local time and it
 248        will therefore look different for different time zones.
 249
 250        Chances are that you are not in the same part of the world as I am.
 251        This means your time zone is different. In all examples where I talk
 252        about time, the hours may be wrong for you. This has little effect on
 253        the results of the examples, just correct the hours while reading.  As
 254        an example: where I will see "12:05" the UK folks will see "11:05".
 255
 256        We now have to fill our database with some numbers. We'll pretend to
 257        have read the following numbers:
 258
 259         12:05  12345 km
 260         12:10  12357 km
 261         12:15  12363 km
 262         12:20  12363 km
 263         12:25  12363 km
 264         12:30  12373 km
 265         12:35  12383 km
 266         12:40  12393 km
 267         12:45  12399 km
 268         12:50  12405 km
 269         12:55  12411 km
 270         13:00  12415 km
 271         13:05  12420 km
 272         13:10  12422 km
 273         13:15  12423 km
 274
 275        We fill the database as follows:
 276
 277         rrdtool update test.rrd 920804700:12345 920805000:12357 920805300:12363
 278         rrdtool update test.rrd 920805600:12363 920805900:12363 920806200:12373
 279         rrdtool update test.rrd 920806500:12383 920806800:12393 920807100:12399
 280         rrdtool update test.rrd 920807400:12405 920807700:12411 920808000:12415
 281         rrdtool update test.rrd 920808300:12420 920808600:12422 920808900:12423
 282
 283        This reads: update our test database with the following numbers
 284
 285         time 920804700, value 12345
 286         time 920805000, value 12357
 287
 288        etcetera.
 289
 290        As you can see, it is possible to feed more than one value into the
 291        database in one command. I had to stop at three for readability but the
 292        real maximum per line is OS dependent.
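
            The time stamps in the update commands are simply the time of the
            first reading plus 300 seconds for every following reading. A
            sketch that builds the arguments for the first five readings; it
            only prints the command and does not touch the database:

```shell
#!/bin/sh
# First reading at 920804700 (12:05 in the author's time zone),
# then one reading every 300 seconds.
ts=920804700
args=
for km in 12345 12357 12363 12363 12363; do
    args="$args${args:+ }$ts:$km"
    ts=$((ts + 300))
done

# Print the command instead of running it.
echo "rrdtool update test.rrd $args"
```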
 293
 294        We can now retrieve the data from our database using "rrdtool fetch":
 295
 296         rrdtool fetch test.rrd AVERAGE --start 920804400 --end 920809200
 297
 298        It should return the following output:
 299
 300                                  speed
 301
 302         920804700: nan
 303         920805000: 4.0000000000e-02
 304         920805300: 2.0000000000e-02
 305         920805600: 0.0000000000e+00
 306         920805900: 0.0000000000e+00
 307         920806200: 3.3333333333e-02
 308         920806500: 3.3333333333e-02
 309         920806800: 3.3333333333e-02
 310         920807100: 2.0000000000e-02
 311         920807400: 2.0000000000e-02
 312         920807700: 2.0000000000e-02
 313         920808000: 1.3333333333e-02
 314         920808300: 1.6666666667e-02
 315         920808600: 6.6666666667e-03
 316         920808900: 3.3333333333e-03
 317         920809200: nan
 318
 319        If it doesn't, something may be wrong.  Perhaps your OS will print
 320        "NaN" in a different form. "NaN" stands for "Not A Number".  If your OS
 321        writes "U" or "UNKN" or something similar that's okay.  If something
 322        else is wrong, it will probably be due to an error you made (assuming
 323        that my tutorial is correct of course :-). In that case: delete the
 324        database and try again.
 325
 326        The meaning of the above output will become clear below.
 327
 328        Time to create some graphics
 329
 330        Try the following command:
 331
 332         rrdtool graph speed.png                                 \
 333                 --start 920804400 --end 920808000               \
 334                 DEF:myspeed=test.rrd:speed:AVERAGE              \
 335                 LINE2:myspeed#FF0000
 336
 337        This will create speed.png which starts at 12:00 and ends at 13:00.
 338        There is a definition of a variable called myspeed, using the data from
 339        data source "speed" out of database "test.rrd". The line drawn is 2
 340        pixels thick and represents the variable myspeed. The color is red
 341        (specified by its rgb-representation, see below).
 342
 343        You'll notice that the start of the graph is not at 12:00 but at 12:05.
 344        This is because we have insufficient data to tell the average before
 345        that time. This only happens when samples are missing, which
 346        hopefully will not happen a lot.
 347
 348        If this has worked: congratulations! If not, check what went wrong.
 349
 350        The colors are built up from red, green and blue. For each of the com-
 351        ponents, you specify how much to use in hexadecimal where 00 means not
 352        included and FF means fully included.  The "color" white is a mixture
 353        of red, green and blue: FFFFFF. The "color" black is all colors off:
 354        000000.
 355
 356           red     #FF0000
 357           green   #00FF00
 358           blue    #0000FF
 359           magenta #FF00FF     (mixed red with blue)
 360           gray    #555555     (one third of all components)
 361
 362        Additionally you can (with a recent RRDtool)  add an alpha channel
 363        (transparency).  The default will be "FF" which means non-transparent.
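
            To see how much of each component a color contains, the
            hexadecimal pairs can be converted to decimal numbers (0 to 255).
            A sketch; the function name rgb_parts is made up and an alpha
            channel is not handled:

```shell
#!/bin/sh
# rgb_parts RRGGBB: print the red, green and blue components as decimals.
rgb_parts() {
    hex=${1#\#}                              # drop a leading "#", if any
    r=$(printf '%s' "$hex" | cut -c1-2)
    g=$(printf '%s' "$hex" | cut -c3-4)
    b=$(printf '%s' "$hex" | cut -c5-6)
    echo "$((0x$r)) $((0x$g)) $((0x$b))"
}

rgb_parts '#FF00FF'    # magenta
rgb_parts 555555       # gray
```

            Magenta gives "255 0 255" and gray gives "85 85 85", which is one
            third of 255 for every component.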
 364
 365        The PNG you just created can be displayed using your favorite image
 366        viewer.  Web browsers will display the PNG via the URL
 367        "file:///the/path/to/speed.png"
 368
 369        Graphics with some math
 370
 371        When looking at the image, you notice that the horizontal axis is
 372        labeled 12:10, 12:20, 12:30, 12:40 and 12:50. Sometimes a label doesn't
 373        fit (12:00 and 13:00 would be likely candidates) so they are skipped.
 374
 375        The vertical axis displays the range we entered. We provided kilometers
 376        and when divided by 300 seconds, we get very small numbers. To be
 377        exact, the first value was 12 (12357-12345) and divided by 300 this
 378        makes 0.04, which is displayed by RRDtool as "40 m" meaning "40/1000".
 379        The "m" (milli) has nothing to do with meters (also m), kilometers or
 380        millimeters! RRDtool doesn't know about the physical units of our data,
 381        it just works with dimensionless numbers.
 382
 383        If we had measured our distances in meters, this would have been
 384        (12357000-12345000)/300 = 12000/300 = 40.
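
            Both calculations can be reproduced in the notation that "rrdtool
            fetch" used earlier. This sketch only divides the difference
            between two readings by the number of seconds; it matches the
            stored values here because the readings in this tutorial fall
            exactly on the five minute boundaries. The function name
            stored_rate is made up.

```shell
#!/bin/sh
# stored_rate VALUE_BEFORE VALUE_NOW SECONDS
# Prints the rate per second in the notation used by "rrdtool fetch".
stored_rate() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.10e\n", (b - a) / s }'
}

stored_rate 12345 12357 300          # readings in kilometers
stored_rate 12345000 12357000 300    # the same readings in meters
```

            The first call prints 4.0000000000e-02 (the "40 m" on the graph),
            the second prints 4.0000000000e+01.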
 385
 386        As most people have a better feel for numbers in this range, we'll cor-
 387        rect that. We could recreate our database and store the correct data,
 388        but there is a better way: we do some calculations while creating the
 389        png file!
 390
 391           rrdtool graph speed2.png                           \
 392              --start 920804400 --end 920808000               \
 393              --vertical-label m/s                            \
 394              DEF:myspeed=test.rrd:speed:AVERAGE              \
 395              CDEF:realspeed=myspeed,1000,\*                  \
 396              LINE2:realspeed#FF0000
 397
 398        Note: I need to escape the multiplication operator * with a backslash.
 399        If I don't, the operating system may interpret it and use it for file
 400        name expansion. You could also place the line within quotation marks
 401        like so:
 402
 403              "CDEF:realspeed=myspeed,1000,*"                  \
 404
 405        It boils down to: it is RRDtool which should see *, not your shell.
 406        And it is your shell interpreting \, not RRDtool. You may need to
 407        adjust examples accordingly if you happen to use an operating system or
 408        shell which behaves differently.
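
            To check what a program receives after the shell has done its
            work, print the arguments one by one. A sketch for a Bourne style
            shell; the function name show_args is made up, and the single
            quoted form is a third variant that such shells also accept.

```shell
#!/bin/sh
# Print every argument exactly as a program such as RRDtool would get it.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args CDEF:realspeed=myspeed,1000,\*
show_args "CDEF:realspeed=myspeed,1000,*"
show_args 'CDEF:realspeed=myspeed,1000,*'
```

            All three calls print [CDEF:realspeed=myspeed,1000,*]: the
            backslash and the quotation marks are consumed by the shell.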
 409
 410        After viewing this PNG, you notice the "m" (milli) has disappeared.
 411        This is what the correct result would be. Also, a label has been added
 412        to the image.  Apart from the things mentioned above, the PNG should
 413        look the same.
 414
 415        The calculations are specified in the CDEF part above and are in
 416        Reverse Polish Notation ("RPN"). What we requested RRDtool to do is:
 417        "take the data source myspeed and the number 1000; multiply those".
 418        Don't bother with RPN yet, it will be explained later on in more
 419        detail. Also, you may want to read my tutorial on CDEFs and Steve
 420        Rader's tutorial on RPN. But first finish this tutorial.
 421
 422        Hang on! If we can multiply values with 1000, it should also be possi-
 423        ble to display kilometers per hour from the same data!
 424
 425        To change a value that is measured in meters per second:
 426
 427         Calculate meters per hour:     value * 3600
 428         Calculate kilometers per hour: value / 1000
 429         Together this makes:           value * (3600/1000) or value * 3.6
 430
 431        In our example database we made a mistake and we need to compensate for
 432        this by multiplying with 1000. Applying that correction:
 433
 434         value * 3.6  * 1000 == value * 3600
 435
 436        Now let's create this PNG, and add some more magic ...
 437
 438         rrdtool graph speed3.png                             \
 439              --start 920804400 --end 920808000               \
 440              --vertical-label km/h                           \
 441              DEF:myspeed=test.rrd:speed:AVERAGE              \
 442              "CDEF:kmh=myspeed,3600,*"                       \
 443              CDEF:fast=kmh,100,GT,kmh,0,IF                   \
 444              CDEF:good=kmh,100,GT,0,kmh,IF                   \
 445              HRULE:100#0000FF:"Maximum allowed"              \
 446              AREA:good#00FF00:"Good speed"                   \
 447              AREA:fast#FF0000:"Too fast"
 448
 449        Note: here we use another means to escape the * operator by enclosing
 450        the whole string in double quotes.
 451
 452        This graph looks much better. Speed is shown in km/h and there is even
 453        an extra line with the maximum allowed speed (on the road I travel on).
 454        I also changed the colors used to display speed and changed it from a
 455        line into an area.
 456
 457        The calculations are more complex now. For speed measurements within
 458        the speed limit they are:
 459
 460           Check if kmh is greater than 100    ( kmh,100 ) GT
 461           If so, return 0, else kmh           ((( kmh,100 ) GT ), 0, kmh) IF
 462
 463        For values above the speed limit:
 464
 465           Check if kmh is greater than 100    ( kmh,100 ) GT
 466           If so, return kmh, else return 0    ((( kmh,100) GT ), kmh, 0) IF
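
            The two expressions can be followed step by step with a toy RPN
            evaluator. This is a sketch written for this explanation, not
            RRDtool's own evaluator: it only knows numbers, the name "kmh", GT
            and IF, and it assumes that IF treats any non-zero value as true.
            The function name rpn is made up.

```shell
#!/bin/sh
# rpn KMH EXPRESSION: evaluate a comma separated RPN expression in which
# the name "kmh" stands for the first argument.
rpn() {
    awk -v kmh="$1" -v rpnexpr="$2" 'BEGIN {
        n = split(rpnexpr, tok, ",")
        sp = 0
        for (i = 1; i <= n; i++) {
            t = tok[i]
            if (t == "kmh") {
                stack[++sp] = kmh + 0
            } else if (t == "GT") {
                # A,B,GT leaves 1 if A is greater than B, else 0
                b = stack[sp--]; a = stack[sp--]
                stack[++sp] = (a > b) ? 1 : 0
            } else if (t == "IF") {
                # A,B,C,IF leaves B if A is true, else C
                c = stack[sp--]; b = stack[sp--]; a = stack[sp--]
                stack[++sp] = (a != 0) ? b : c
            } else {
                stack[++sp] = t + 0      # anything else is taken as a number
            }
        }
        print stack[sp]
    }'
}

rpn 144 kmh,100,GT,kmh,0,IF    # "fast" at 144 km/h
rpn 144 kmh,100,GT,0,kmh,IF    # "good" at 144 km/h
rpn 72 kmh,100,GT,kmh,0,IF     # "fast" at 72 km/h
rpn 72 kmh,100,GT,0,kmh,IF     # "good" at 72 km/h
```

            The four calls print 144, 0, 0 and 72: at any moment only one of
            "fast" and "good" carries the speed, the other one is zero.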
 467
 468        Graphics Magic
 469
 470        I like to believe there are virtually no limits to how RRDtool graph
 471        can manipulate data. I will not explain how it works, but look at the
 472        following PNG:
 473
 474           rrdtool graph speed4.png                           \
 475              --start 920804400 --end 920808000               \
 476              --vertical-label km/h                           \
 477              DEF:myspeed=test.rrd:speed:AVERAGE              \
 478              CDEF:nonans=myspeed,UN,0,myspeed,IF             \
 479              "CDEF:kmh=nonans,3600,*"                        \
 480              CDEF:fast=kmh,100,GT,100,0,IF                   \
 481              CDEF:over=kmh,100,GT,kmh,100,-,0,IF             \
 482              CDEF:good=kmh,100,GT,0,kmh,IF                   \
 483              HRULE:100#0000FF:"Maximum allowed"              \
 484              AREA:good#00FF00:"Good speed"                   \
 485              AREA:fast#550000:"Too fast"                     \
 486              STACK:over#FF0000:"Over speed"
 487
 488        Remember the note in the beginning?  I had to remove unknown data from
 489        this example. The 'nonans' CDEF is new, and the 6th line (which used to
 490        be the 5th line) used to read 'CDEF:kmh=myspeed,3600,*'
 491
 492        Let's create a quick and dirty HTML page to view the three PNGs:
 493
 494           <HTML><HEAD><TITLE>Speed</TITLE></HEAD><BODY>
 495           <IMG src="speed2.png" alt="Speed in meters per second">
 496           <BR>
 497           <IMG src="speed3.png" alt="Speed in kilometers per hour">
 498           <BR>
 499           <IMG src="speed4.png" alt="Traveled too fast?">
 500           </BODY></HTML>
 501
 502        Name the file "speed.html" or similar, and look at it in your web
 503        browser.
 504
 505        Now, all you have to do is measure the values regularly and update the
 506        database.  When you want to view the data, recreate the PNGs and make
 507        sure to refresh them in your browser. (Note: just clicking reload may
 508        not be enough, especially when proxies are involved.  Try shift-reload
 509        or ctrl-F5).
 510
 511        Updates in Reality
 512
 513        We've already used the "update" command: it took one or more parameters
 514        in the form of "<time>:<value>". You'll be glad to know that you can
 515        specify the current time by filling in an "N" as the time.  Or you could
 516        use the "time" function in Perl (the shortest example in this tuto-
 517        rial):
 518
 519           perl -e 'print time, "\n" '
 520
 521        How to run a program on regular intervals is OS specific. But here is
 522        an example in pseudo code:
 523
 524           - Get the value and put it in variable "$speed"
 525           - rrdtool update speed.rrd N:$speed
 526
 527        (do not try this with our test database, we'll use it in further exam-
 528        ples)
 529
 530        This is all. Run the above script every five minutes. When you need to
 531        know what the graphs look like, run the examples above. You could put
 532        them in a script as well. After running that script, view the page
 533        speed.html we created above.
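
            The pseudo code could look like this as a shell script. It is a
            sketch under assumptions: read_speed is a hypothetical placeholder
            for whatever obtains your value, and by default the script only
            prints the update command instead of running it.

```shell
#!/bin/sh
# Hypothetical reader: replace the body with whatever obtains your value.
read_speed() {
    echo 12423
}

# Dry run by default: the command is printed, not executed.
# Set RRDTOOL=rrdtool in the environment to perform the update for real.
RRDTOOL=${RRDTOOL:-echo rrdtool}

# update_once: get the value and feed it to the database with time "N".
update_once() {
    speed=$(read_speed)
    $RRDTOOL update speed.rrd "N:$speed"
}

update_once
```

            As written this prints "rrdtool update speed.rrd N:12423". On
            systems with cron, many implementations accept a schedule such as
            "*/5 * * * *" to run a script like this every five minutes; check
            the documentation of your own scheduler.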
 534
 535        Some words on SNMP
 536
 537        I can imagine very few people that will be able to get real data from
 538        their car every five minutes. All other people will have to settle for
 539        some other kind of counter. You could measure the number of pages
 540        printed by a printer, for example, the cups of coffee made by the cof-
 541        fee machine, a device that counts the electricity used, whatever. Any
 542        incrementing counter can be monitored and graphed using the stuff you
 543        learned so far. Later on we will also be able to monitor other types of
 544        values like temperature.
 545
 546        Many people interested in RRDtool will use the counter that keeps track
 547        of octets (bytes) transferred by a network device. So let's do just that
 548        next. We will start with a description of how to collect data.
 549
 550        Some people will make a remark that there are tools which can do this
 551        data collection for you. They are right! However, I feel it is impor-
 552        tant that you understand they are not necessary. When you have to
 553        determine why things went wrong you need to know how they work.
 554
 555        One tool used in the example has been talked about very briefly in the
 556        beginning of this document; it is called SNMP. It is a way of talking
 557        to networked equipment. The tool I use below is called "snmpget" and
 558        this is how it works:
 559
 560           snmpget device password OID
 561
 562        or
 563
 564           snmpget -v[version] -c[password] device OID
 565
 566        For device you substitute the name, or the IP address, of your device.
 567        For password you use the "community read string" as it is called in the
 568        SNMP world.  For some devices the default of "public" might work, how-
 569        ever this can be disabled, altered or protected for privacy and secu-
 570        rity reasons.  Read the documentation that comes with your device or
 571        program.
 572
 573        Then there is this parameter, called OID, which means "object identi-
 574        fier".
 575
 576        When you start to learn about SNMP it looks very confusing. It isn't
 577        all that difficult when you look at the Management Information Base
 578        ("MIB").  It is an upside-down tree that describes data, with a single
 579        node as the root and from there a number of branches.  These branches
 580        end up in another node, they branch out, etc.  All the branches have a
 581        name and they form the path that we follow all the way down.  The
 582        branches that we follow are named: iso, org, dod, internet, mgmt and
 583        mib-2.  These names can also be written down as numbers and are 1 3 6 1
 584        2 1.
 585
 586           iso.org.dod.internet.mgmt.mib-2 (1.3.6.1.2.1)
 587
 588        There is a lot of confusion about the leading dot that some programs
 589        use.  There is *no* leading dot in an OID.  However, some programs can
 590        use the above part of OIDs as a default.  To indicate the difference
 591        between abbreviated OIDs and full OIDs they need a leading dot when you
 592        specify the complete OID.  Often those programs will leave out the
 593        default portion when returning the data to you.  To make things worse,
 594        they have several default prefixes ...
 595
 596        OK, let's continue from the start of our OID: we had 1.3.6.1.2.1. From
 597        there, we are especially interested in the branch "interfaces" which
 598        has number 2 (i.e., 1.3.6.1.2.1.2 or 1.3.6.1.2.1.interfaces).
 599
 600        First, we have to get some SNMP program. Check whether there is a pre-
 601        compiled package available for your OS. This is the preferred way.  If
 602        not, you will have to get the sources yourself and compile those.  The
 603        Internet is full of sources, programs etc. Find information using a
 604        search engine or whatever you prefer.
 605
 606        Assume you have the program. First try to collect some data that is
 607        available on most systems. Remember: there is a short name for the part
 608        of the tree that interests us most in the world we live in!
 609
 610        I will give an example which can be used on Fedora Core 3.  If it
 611        doesn't work for you, work your way through the manual of snmp and
 612        adapt the example to make it work.
 613
 614           snmpget -v2c -c public myrouter system.sysDescr.0
 615
 616        The device should answer with a description of itself, perhaps an empty
 617        one. Until you get a valid answer from a device, perhaps using a dif-
 618        ferent "password", or a different device, there is no point in continu-
 619        ing.
 620
 621           snmpget -v2c -c public myrouter interfaces.ifNumber.0
 622
 623        Hopefully you get a number as a result, the number of interfaces.  If
 624        so, you can carry on and try a different program called "snmpwalk".
 625
 626           snmpwalk -v2c -c public myrouter interfaces.ifTable.ifEntry.ifDescr
 627
 628        If it returns with a list of interfaces, you're almost there.  Here's
 629        an example:
 630           [user@host /home/alex]$ snmpwalk -v2c -c public cisco 2.2.1.2
 631
 632           interfaces.ifTable.ifEntry.ifDescr.1 = "BRI0: B-Channel 1"
 633           interfaces.ifTable.ifEntry.ifDescr.2 = "BRI0: B-Channel 2"
 634           interfaces.ifTable.ifEntry.ifDescr.3 = "BRI0" Hex: 42 52 49 30
 635           interfaces.ifTable.ifEntry.ifDescr.4 = "Ethernet0"
 636           interfaces.ifTable.ifEntry.ifDescr.5 = "Loopback0"
 637
 638        On this Cisco equipment, I would like to monitor the "Ethernet0" inter-
 639        face and from the above output I see that it is number four. I try:
 640
 641           [user@host /home/alex]$ snmpget -v2c -c public cisco 2.2.1.10.4 2.2.1.16.4
 642
 643           interfaces.ifTable.ifEntry.ifInOctets.4 = 2290729126
 644           interfaces.ifTable.ifEntry.ifOutOctets.4 = 1256486519
 645
 646        So now I have two OIDs to monitor and they are (in full, this time):
 647
 648           1.3.6.1.2.1.2.2.1.10
 649
 650        and
 651
 652           1.3.6.1.2.1.2.2.1.16
 653
 654        both with an interface number of 4.
 655
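            The following Python sketch (not part of RRDtool or of any SNMP
            package) shows how the short names used above map onto full numeric
            OIDs. The helper name full_oid and the lookup table are made up for
            this illustration; the numbers are the ones seen in the examples
            above. The two lines it prints are the full OIDs of the counters on
            interface 4.

```python
# Numeric form of iso.org.dod.internet.mgmt.mib-2, as given in the text.
MIB2_PREFIX = "1.3.6.1.2.1"

# Branch numbers below mib-2, read off the snmpwalk/snmpget examples above.
BRANCH_NUMBERS = {
    "interfaces": 2,
    "ifTable": 2,
    "ifEntry": 1,
    "ifDescr": 2,
    "ifInOctets": 10,
    "ifOutOctets": 16,
}


def full_oid(short_name, instance):
    """Turn e.g. 'interfaces.ifTable.ifEntry.ifInOctets' plus an interface
    number into a complete numeric OID (hypothetical helper)."""
    numbers = [str(BRANCH_NUMBERS[name]) for name in short_name.split(".")]
    return ".".join([MIB2_PREFIX] + numbers + [str(instance)])


print(full_oid("interfaces.ifTable.ifEntry.ifInOctets", 4))
print(full_oid("interfaces.ifTable.ifEntry.ifOutOctets", 4))
```
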
 656        Don't get fooled, this wasn't my first try. It took some time for me
 657        too to understand what all these numbers mean. It does help a lot when
 658        they get translated into descriptive text... At least, when people are
 659        talking about MIBs and OIDs you know what it's all about.  Do not for-
 660        get the interface number (0 if it is not interface dependent) and try
 661        snmpwalk if you don't get an answer from snmpget.
 662
 663        If you understand the above section and get numbers from your device,
 664        continue on with this tutorial. If not, then go back and re-read this
 665        part.
 666
 667        A Real World Example
 668
 669        Let the fun begin. First, create a new database. It contains data from
 670        two counters, called input and output. The data is put into archives
 671        that average it. They take 1, 6, 24 or 288 samples at a time.  They
 672        also go into archives that keep the maximum numbers. This will be
 673        explained later on. The time in-between samples is 300 seconds, a good
 674        starting point, which is the same as five minutes.
 675
 676         1 sample "averaged" stays 1 period of 5 minutes
 677         6 samples averaged become one average on 30 minutes
 678         24 samples averaged become one average on 2 hours
 679         288 samples averaged become one average on 1 day
 680
 681        Let's try to be compatible with MRTG which stores about the following
 682        amount of data:
 683
 684         600 5-minute samples:    2   days and 2 hours
 685         600 30-minute samples:  12.5 days
 686         600 2-hour samples:     50   days
 687         732 1-day samples:     732   days
 688
 689        These ranges are appended, so the total amount of data stored in the
 690        database is approximately 797 days. RRDtool stores the data differ-
 691        ently, it doesn't start the "weekly" archive where the "daily" archive
 692        stopped. For both archives the most recent data will be near "now" and
 693        therefore we will need to keep more data than MRTG does!
 694
 695        We will need:
 696
 697         600 samples of 5 minutes  (2 days and 2 hours)
 698         700 samples of 30 minutes (2 days and 2 hours, plus 12.5 days)
 699         775 samples of 2 hours    (above + 50 days)
 700         797 samples of 1 day      (above + 732 days, rounded up to 797)
 701
 702           rrdtool create myrouter.rrd         \
 703                    DS:input:COUNTER:600:U:U   \
 704                    DS:output:COUNTER:600:U:U  \
 705                    RRA:AVERAGE:0.5:1:600      \
 706                    RRA:AVERAGE:0.5:6:700      \
 707                    RRA:AVERAGE:0.5:24:775     \
 708                    RRA:AVERAGE:0.5:288:797    \
 709                    RRA:MAX:0.5:1:600          \
 710                    RRA:MAX:0.5:6:700          \
 711                    RRA:MAX:0.5:24:775         \
 712                    RRA:MAX:0.5:288:797
 713
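            If you want to check the row counts used in the RRA definitions
            above, the small Python sketch below redoes the arithmetic. It is
            only an illustration: the helper name rrdtool_rows is made up, and
            the MRTG figures are the approximate ones listed earlier. The idea
            is that every archive has to reach back from "now" to the end of the
            matching MRTG range, rounded up to whole rows.

```python
STEP = 300  # seconds between samples, as in the example database

# (samples per consolidated row, rows kept by MRTG), from the lists above
MRTG_ARCHIVES = [(1, 600), (6, 600), (24, 600), (288, 732)]


def rrdtool_rows(archives, step=STEP):
    """Rows each RRA needs so that it covers all appended MRTG ranges
    up to and including its own (hypothetical helper)."""
    rows = []
    covered = 0  # seconds covered by the appended MRTG ranges so far
    for samples_per_row, mrtg_rows in archives:
        row_seconds = samples_per_row * step
        covered += mrtg_rows * row_seconds
        # round up to whole rows without using floating point
        rows.append((covered + row_seconds - 1) // row_seconds)
    return rows


print(rrdtool_rows(MRTG_ARCHIVES))  # [600, 700, 775, 797]
```
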
 714        Next thing to do is to collect data and store it. Here is an example,
 715        written as a Bourne shell loop that assumes the Net-SNMP tools.  You
 716        will have to find out what to do exactly on your OS to make it work.
 717
 718           while true
 719           do
 720              # Read both counters. The -Oqv option (Net-SNMP) prints only
 721              # the value, without the OID name in front of it.
 722              in=$(snmpget -v2c -c public -Oqv myrouter 2.2.1.10.4)
 723              out=$(snmpget -v2c -c public -Oqv myrouter 2.2.1.16.4)
 724
 725              # "N" stands for "now", the time of this update.
 726              rrdtool update myrouter.rrd "N:$in:$out"
 727
 728              # Wait for 5 minutes, the step size of the database.
 729              sleep 300
 730           done
 731
 732        Then, after collecting data for a day, try to create an image using:
 733
 734           rrdtool graph myrouter-day.png --start -86400 \
 735                    DEF:inoctets=myrouter.rrd:input:AVERAGE \
 736                    DEF:outoctets=myrouter.rrd:output:AVERAGE \
 737                    AREA:inoctets#00FF00:"In traffic" \
 738                    LINE1:outoctets#0000FF:"Out traffic"
 739
 740        This should produce a picture with one day worth of traffic.  One day
 741        is 24 hours of 60 minutes of 60 seconds: 24*60*60=86400, we start at
 742        now minus 86400 seconds. We define (with DEFs) inoctets and outoctets
 743        as the average values from the database myrouter.rrd and draw an area
 744        for the "in" traffic and a line for the "out" traffic.
 745
 746        View the image and keep logging data for a few more days.  If you like,
 747        you could try the examples from the test database and see if you can
 748        get various options and calculations to work.
 749
 750        Suggestion: Display in bytes per second and in bits per second. Make
 751        the Ethernet graphics go red if they are over four megabits per second.
 752
 753        Consolidation Functions
 754
 755        A few paragraphs back I mentioned the possibility of keeping the maxi-
 756        mum values instead of the average values. Let's go into this a bit
 757        more.
 758
 759        Recall all the stuff about the speed of the car. Suppose we drove at
 760        144 km/h during 5 minutes and then were stopped by the police for 25
 761        minutes.  At the end of the lecture we would take our laptop and create
 762        and view the image taken from the database. If we look at the second
 763        RRA we did create, we would have the average from 6 samples. The sam-
 764        ples measured would be 144+0+0+0+0+0=144, divided by 30 minutes, cor-
 765        rected for the error by 1000, translated into km/h, with a result of 24
 766        km/h.  I would still get a ticket but not for speeding anymore :)
 767
 768        Obviously, in this case we shouldn't look at the averages. In some
 769        cases they are handy. If you want to know how many km you have traveled,
 770        the averaged picture would be the right one to look at. On the other
 771        hand, for the speed that we traveled at, the maximum numbers seen are
 772        much more interesting. Later we will see more types.
 773
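            The difference between the two kinds of archive can be sketched in
            a few lines of Python. This is a simplified model, not RRDtool's
            own code: the helper name consolidate is made up, and unknown
            samples are not handled. The six samples are the car example from
            above (5 minutes at 144 km/h, then 25 minutes at a standstill).

```python
def consolidate(samples, function):
    """Reduce several samples to one value, roughly what an RRA does
    when it stores one row per six samples (hypothetical helper)."""
    if function == "AVERAGE":
        return sum(samples) / len(samples)
    if function == "MAX":
        return max(samples)
    raise ValueError("unsupported consolidation function: " + function)


speeds = [144, 0, 0, 0, 0, 0]           # km/h, one sample per 5 minutes
print(consolidate(speeds, "AVERAGE"))   # 24.0
print(consolidate(speeds, "MAX"))       # 144
```
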
 774        It is the same for data. If you want to know the amount, look at the
 775        averages. If you want to know the rate, look at the maximum.  Over
 776        time, they will grow apart more and more. In the last database we have
 777        created, there are two archives that keep data per day. The archive
 778        that keeps averages will show low numbers, the archive that shows max-
 779        ima will have higher numbers.
 780
 781        For my car this would translate into averages per day of 96/24=4 km/h
 782        (as I travel about 96 kilometers a day) during working days, and maxima
 783        of 120 km/h (my top speed that I reach every day).
 784
 785        Big difference. Do not look at the second graph to estimate the dis-
 786        tances that I travel and do not look at the first graph to estimate my
 787        speed. This will work if the samples are close together, as they are in
 788        five minutes, but not if you average.
 789
 790        On some days, I go for a long ride. If I go across Europe and travel
 791        for 12 hours, the first graph will rise to about 60 km/h. The second
 792        one will show 180 km/h. This means that I traveled a distance of 60
 793        km/h times 24 h = 1440 km. I did this with a higher speed and a maximum
 794        around 180 km/h. However, it probably doesn't mean that I traveled for
 795        8 hours at a constant speed of 180 km/h!
 796
 797        This is a real example: go with the flow through Germany (fast!) and
 798        stop a few times for gas and coffee. Drive slowly through Austria and
 799        the Netherlands. Be careful in the mountains and villages. If you would
 800        look at the graphs created from the five-minute averages you would get
 801        a totally different picture. You would see the same values on the aver-
 802        age and maximum graphs (provided I measured every 300 seconds).  You
 803        would be able to see when I stopped, when I was in top gear, when I
 804        drove over fast highways etc. The granularity of the data is much
 805        higher, so you can see more. However, this takes 12 samples per hour,
 806        or 288 values per day, so it would be a lot of data over a longer
 807        period of time. Therefore we average it, eventually to one value per
 808        day. From this one value, we cannot see much detail, of course.
 809
 810        Make sure you understand the last few paragraphs. There is no value in
 811        only a line and a few axes, you need to know what they mean and inter-
 812        pret the data in an appropriate way. This is true for all data.
 813
 814        The biggest mistake you can make is to use the collected data for some-
 815        thing that it is not suitable for. You would be better off if you
 816        didn't have the graph at all.
 817
 818        Let's review what you now should know
 819
 820        You know how to create a database and can put data in it. You can get
 821        the numbers out again by creating an image, do math on the data from
 822        the database and view the result instead of the raw data.  You know
 823        about the difference between averages and maximum, and when to use
 824        which (or at least you should have an idea).
 825
 826        RRDtool can do more than what we have learned up to now. Before you
 827        continue with the rest of this doc, I recommend that you reread from
 828        the start and try some modifications on the examples. Make sure you
 829        fully understand everything. It will be worth the effort and helps you
 830        not only with the rest of this tutorial, but also in your day to day
 831        monitoring long after you read this introduction.
 832
 833        Data Source Types
 834
 835        All right, you feel like continuing. Welcome back and get ready for an
 836        increased speed in the examples and explanations.
 837
 838        You know that in order to view a counter over time, you have to take
 839        two numbers and divide the difference between them by the time elapsed.
 840        This makes sense for the examples I gave you but there are other possi-
 841        bilities.  For instance, I'm able to retrieve the temperature from my
 842        router in three places namely the inlet, the so called hot-spot and the
 843        exhaust.  These values are not counters.  If I take the difference of
 844        the two samples and divide that by 300 seconds I would be asking for
 845        the temperature change per second.  Hopefully this is zero! If not, the
 846        computer room is probably on fire :)
 847
 848        So, what can we do?  We can tell RRDtool to store the values we measure
 849        directly as they are (this is not entirely true but close enough). The
 850        graphs we make will look much better, they will show a rather constant
 851        value. I know when the router is busy (it works -> it uses more elec-
 852        tricity -> it generates more heat -> the temperature rises). I know
 853        when the doors are left open (the room is air conditioned -> the warm
 854        air from the rest of the building flows into the computer room -> the
 855        inlet temperature rises). Etc. The data type we used when creating the
 856        database before was COUNTER, we now have a different data type and thus
 857        a different name for it. It is called GAUGE. There are more such data
 858        types:
 859
 860         - COUNTER   we already know this one
 861         - GAUGE     we just learned this one
 862         - DERIVE
 863         - ABSOLUTE
 864
 865        The two additional types are DERIVE and ABSOLUTE. Absolute can be used
 866        like counter with one difference: RRDtool assumes the counter is reset
 867        when it's read. That is: its delta is known without calculation by RRD-
 868        tool whereas RRDtool needs to calculate it for the counter type.  Exam-
 869        ple: our first example (12345, 12357, 12363, 12363) would read:
 870        unknown, 12, 6, 0. The rest of the calculations stay the same.  The
 871        other one, derive, is like counter. Unlike counter, it can also
 872        decrease so it can have a negative delta. Again, the rest of the calcu-
 873        lations stay the same.
 874
 875        Let's try them all:
 876
 877           rrdtool create all.rrd --start 978300900 \
 878                    DS:a:COUNTER:600:U:U \
 879                    DS:b:GAUGE:600:U:U \
 880                    DS:c:DERIVE:600:U:U \
 881                    DS:d:ABSOLUTE:600:U:U \
 882                    RRA:AVERAGE:0.5:1:10
 883           rrdtool update all.rrd \
 884                    978301200:300:1:600:300    \
 885                    978301500:600:3:1200:600   \
 886                    978301800:900:5:1800:900   \
 887                    978302100:1200:3:2400:1200 \
 888                    978302400:1500:1:2400:1500 \
 889                    978302700:1800:2:1800:1800 \
 890                    978303000:2100:4:0:2100    \
 891                    978303300:2400:6:600:2400  \
 892                    978303600:2700:4:600:2700  \
 893                    978303900:3000:2:1200:3000
 894           rrdtool graph all1.png -s 978300600 -e 978304200 -h 400 \
 895                    DEF:linea=all.rrd:a:AVERAGE LINE3:linea#FF0000:"Line A" \
 896                    DEF:lineb=all.rrd:b:AVERAGE LINE3:lineb#00FF00:"Line B" \
 897                    DEF:linec=all.rrd:c:AVERAGE LINE3:linec#0000FF:"Line C" \
 898                    DEF:lined=all.rrd:d:AVERAGE LINE3:lined#000000:"Line D"
 899
 900        RRDtool under the Microscope
 901
 902
 903        · Line A is a COUNTER type, so it should continuously increment and
 904          RRDtool must calculate the differences. Also, RRDtool needs to divide
 905          the difference by the amount of time lapsed. This should end up as a
 906          straight line at 1 (the deltas are 300, the time is 300).
 907
 908        · Line B is of type GAUGE. These are "real" values so they should match
 909          what we put in: a sort of a wave.
 910
 911        · Line C is of type DERIVE. It should be a counter that can decrease.
 912          It does so between 2400 and 0, with 1800 in-between.
 913
 914        · Line D is of type ABSOLUTE. This is like counter but it works on val-
 915          ues without calculating the difference. The numbers are the same and
 916          as you can see (hopefully) this has a different result.
 917
 918        This translates into the following values, starting at 23:10 and ending
 919        at 00:10 the next day (where "u" means unknown/unplotted):
 920
 921         - Line A:  u  u  1  1  1  1  1  1  1  1  1  u
 922         - Line B:  u  1  3  5  3  1  2  4  6  4  2  u
 923         - Line C:  u  u  2  2  2  0 -2 -6  2  0  2  u
 924         - Line D:  u  1  2  3  4  5  6  7  8  9 10  u
 925
 926        If your PNG shows all this, you know you have entered the data cor-
 927        rectly, the RRDtool executable is working properly, your viewer doesn't
 928        fool you, and you successfully entered the year 2000 :)
 929
 930        You could try the same example four times, each time with only one of
 931        the lines.
 932
 933        Let's go over the data again:
 934
 935        · Line A: 300,600,900 and so on. The counter delta is a constant 300
 936          and so is the time delta. A number divided by itself is always 1
 937          (except when dividing by zero which is undefined/illegal).
 938
 939          Why is it that the first point is unknown? We do know what we put
 940          into the database, right? True, but we didn't have a value to calcu-
 941          late the delta from, so we don't know where we started. It would be
 942          wrong to assume we started at zero so we don't!
 943
 944        · Line B: There is nothing to calculate. The numbers are as they are.
 945
 946        · Line C: Again, the start-out value is unknown. The same story
 947          holds as for line A. In this case the deltas are not constant, there-
 948          fore the line is not either. If we would put the same numbers in the
 949          database as we did for line A, we would have gotten the same line.
 950          Unlike type counter, this type can decrease and I hope to show you
 951          later on why this makes a difference.
 952
 953        · Line D: Here the device calculates the deltas. Therefore we DO know
 954          the first delta and it is plotted. We had the same input as with line
 955          A, but the meaning of this input is different and thus the line is
 956          different.  In this case the deltas increase each time by 300. The
 957          time delta stays at a constant 300 and therefore the division of the
 958          two gives increasing values.
 959
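            The four lines can also be reproduced with a short Python sketch.
            It is a deliberately simplified model: it assumes every update
            arrives exactly one step (300 seconds) after the previous one, and
            it ignores counter wraps and the min/max settings. The helper name
            stored_values is made up; None stands for "unknown". The input
            lists are the columns of the "rrdtool update" command above.

```python
STEP = 300  # seconds between two updates in the example


def stored_values(ds_type, samples, step=STEP):
    """Values a data source of the given type would yield for samples
    taken exactly `step` seconds apart (hypothetical helper)."""
    values = []
    previous = None
    for sample in samples:
        if ds_type == "GAUGE":
            value = sample                    # stored as measured
        elif ds_type == "ABSOLUTE":
            value = sample / step             # sample already is the delta
        elif ds_type in ("COUNTER", "DERIVE"):
            # a delta needs a previous reading, so the first one is unknown
            value = None if previous is None else (sample - previous) / step
        else:
            raise ValueError("unknown data source type: " + ds_type)
        values.append(value)
        previous = sample
    return values


a = [300, 600, 900, 1200, 1500, 1800, 2100, 2400, 2700, 3000]
b = [1, 3, 5, 3, 1, 2, 4, 6, 4, 2]
c = [600, 1200, 1800, 2400, 2400, 1800, 0, 600, 600, 1200]

for name, ds_type, data in (("A", "COUNTER", a), ("B", "GAUGE", b),
                            ("C", "DERIVE", c), ("D", "ABSOLUTE", a)):
    print("Line", name, stored_values(ds_type, data))
```
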
 960        Counter Wraps
 961
 962        There are a few more basics to show. Some important options are still
 963        to be covered and we haven't looked at counter wraps yet. First the
 964        counter wrap: In our car we notice that the counter shows 999987. We
 965        travel 20 km and the counter should go to 1000007. Unfortunately, there
 966        are only six digits on our counter so it really shows 000007. If we
 967        would plot that on a type DERIVE, it would mean that the counter was
 968        set back 999980 km. It wasn't, and there has to be some protection for
 969        this. This protection is only available for type COUNTER which should
 970        be used for this kind of counter anyways. How does it work? Type
 971        counter should never decrease and therefore RRDtool must assume it
 972        wrapped if it does decrease!  If the delta is negative, this can be
 973        compensated for by adding the maximum value of the counter + 1. For our
 974        car this would be:
 975
 976         Delta = 7 - 999987 = -999980    (instead of 1000007-999987=20)
 977
 978         Real delta = -999980 + 999999 + 1 = 20
 979
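            The same correction for the car, written as a Python sketch. The
            helper name odometer_delta is made up for this illustration; it
            assumes the counter wrapped at most once between the two readings.

```python
def odometer_delta(before, after, digits=6):
    """Distance between two odometer readings, compensating for one wrap
    of a counter with the given number of decimal digits."""
    max_value = 10 ** digits - 1   # 999999 for six digits
    delta = after - before
    if delta < 0:
        # the counter can only go up, so a negative delta means it wrapped
        delta += max_value + 1
    return delta


print(odometer_delta(999987, 7))   # 20
```
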
 980        At the time of writing this document, RRDtool knows of counters that
 981        are either 32 bits or 64 bits of size. These counters can handle the
 982        following different values:
 983
 984         - 32 bits: 0 ..           4294967295
 985         - 64 bits: 0 .. 18446744073709551615
 986
 987        If these numbers look strange to you, you can view them in their hex-
 988        adecimal form:
 989
 990         - 32 bits: 0 ..         FFFFFFFF
 991         - 64 bits: 0 .. FFFFFFFFFFFFFFFF
 992
 993        RRDtool handles both counters the same. If an overflow occurs and the
 994        delta would be negative, RRDtool first adds the maximum of a small
 995        counter + 1 to the delta. If the delta is still negative, it had to be
 996        the large counter that wrapped. Add the maximum possible value of the
 997        large counter + 1 and subtract the erroneously added small value.
 998
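            The two-stage correction described above can be sketched in Python
            as follows. This follows the description in this tutorial and is
            not taken from the RRDtool sources; the name counter_delta is made
            up. Python integers do not overflow, so the 64 bit values can be
            used as they are.

```python
MAX32 = 2 ** 32 - 1   # 4294967295
MAX64 = 2 ** 64 - 1   # 18446744073709551615


def counter_delta(before, after):
    """Delta between two readings of a counter that may have wrapped once,
    without knowing whether it is a 32 bit or a 64 bit counter."""
    delta = after - before
    if delta < 0:
        # first assume that a 32 bit counter wrapped
        delta += MAX32 + 1
    if delta < 0:
        # still negative: it was a 64 bit counter, so add its range and
        # take back the 32 bit correction that was added by mistake
        delta += (MAX64 + 1) - (MAX32 + 1)
    return delta


print(counter_delta(4294967200, 4))   # 100
```
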
 999        There is a risk in this: suppose the large counter wrapped while adding
1000        a huge delta, it could happen, theoretically, that adding the smaller
1001        value would make the delta positive. In this unlikely case the results
1002        would not be correct. The increase should be nearly as high as the max-
1003        imum counter value for that to happen, so chances are you would have
1004        several other problems as well and this particular problem would not
1005        even be worth thinking about. Even so, I did include an example, so
1006        you can judge for yourself.
1007
1008        The next section gives you some numerical examples for counter-wraps.
1009        Try to do the calculations yourself or just believe me if your calcula-
1010        tor can't handle the numbers :)
1011
1012        Correction numbers:
1013
1014         - 32 bits: (4294967295 + 1) =                                4294967296
1015         - 64 bits: (18446744073709551615 + 1)
1016                                            - correction1 = 18446744069414584320
1017
1018         Before:        4294967200
1019         Increase:                100
1020         Should become: 4294967300
1021         But really is:             4
1022         Delta:        -4294967196
1023         Correction1:  -4294967196 + 4294967296 = 100
1024
1025         Before:        18446744073709551000
1026         Increase:                             800
1027         Should become: 18446744073709551800
1028         But really is:                        184
1029         Delta:        -18446744073709550816
1030         Correction1:  -18446744073709550816
1031                                        + 4294967296 = -18446744069414583520
1032         Correction2:  -18446744069414583520
1033                           + 18446744069414584320 = 800
1034
1035         Before:        18446744073709551615 ( maximum value )
1036         Increase:      18446744069414584320 ( absurd increase, minimum for
1037         Should become: 36893488143124135935             this example to work )
1038         But really is: 18446744069414584319
1039         Delta:                     -4294967296
1040         Correction1:  -4294967296 + 4294967296 = 0
1041         (not negative -> no correction2)
1042
1043         Before:        18446744073709551615 ( maximum value )
1044         Increase:      18446744069414584319 ( one less increase )
1045         Should become: 36893488143124135934
1046         But really is: 18446744069414584318
1047         Delta:                     -4294967297
1048         Correction1:  -4294967297 + 4294967296 = -1
1049         Correction2:  -1 + 18446744069414584320 = 18446744069414584319
1050
1051        As you can see from the last two examples, you need strange numbers for
1052        RRDtool to fail (provided it's bug free of course), so this should not
1053        happen. However, SNMP or whatever method you choose to collect the
1054        data, might also report wrong numbers occasionally.  We can't prevent
1055        all errors, but there are some things we can do. The RRDtool "create"
1056        command takes two special parameters for this. They define the minimum
1057        and maximum allowed values. Until now, we used "U", meaning "unknown".
1058        If you provide values for one or both of them and if RRDtool receives
1059        data points that are outside these limits, it will ignore those values.
1060        For a thermometer in degrees Celsius, the absolute minimum is just
1061        under -273. For my router, I can assume this minimum is much higher so
1062        I would set it to 10, whereas the maximum temperature I would set to
1063        80. Any higher and the device would be out of order.
1064
1065        For the speed of my car, I would never expect negative numbers and also
1066        I would not expect a speed higher than 230. Anything else, and there
1067        must have been an error. Remember: the opposite is not true, if the
1068        numbers pass this check, it doesn't mean that they are correct. Always
1069        judge the graph with a healthy dose of suspicion if it seems weird to
1070        you.
1071
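            What the minimum and maximum settings do can be pictured with the
            following Python sketch. It only models the idea; the helper name
            check_limits is made up, None is used both for a limit of "U" and
            for a value that ends up unknown, and the limits are the router
            temperature and car speed examples from above.

```python
def check_limits(value, minimum=None, maximum=None):
    """Return the value if it lies within the limits, else None (unknown).
    A limit of None means "U": no check on that side."""
    if minimum is not None and value < minimum:
        return None
    if maximum is not None and value > maximum:
        return None
    return value


print(check_limits(45, minimum=10, maximum=80))    # 45, a plausible temperature
print(check_limits(95, minimum=10, maximum=80))    # None, router out of order?
print(check_limits(250, minimum=0, maximum=230))   # None, faster than my car
```
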
1072        Data Resampling
1073
1074        One important feature of RRDtool has not been explained yet: it is vir-
1075        tually impossible to collect data and feed it into RRDtool on exact
1076        intervals. RRDtool therefore interpolates the data, so they are stored
1077        on exact intervals. If you do not know what this means or how it works,
1078        then here's the help you seek:
1079
1080        Suppose a counter increases by exactly one for every second. You want
1081        to measure it in 300-second intervals. You should retrieve values that
1082        are exactly 300 apart. However, due to various circumstances you are a
1083        few seconds late and the interval is 303. The delta will also be 303 in
1084        that case. Obviously, RRDtool should not put 303 in the database and
1085        make you believe that the counter increased by 303 in 300 seconds.
1086        This is where RRDtool interpolates: it alters the 303 value as if it
1087        would have been stored earlier and it will be 300 in 300 seconds.  Next
1088        time you are at exactly the right time. This means that the current
1089        interval is 297 seconds and also the counter increased by 297. Again,
1090        RRDtool interpolates and stores 300 as it should be.
1091
1092              in the RRD                 in reality
1093
1094         time+000:   0 delta="U"   time+000:    0 delta="U"
1095         time+300: 300 delta=300   time+300:  300 delta=300
1096         time+600: 600 delta=300   time+603:  603 delta=303
1097         time+900: 900 delta=300   time+900:  900 delta=297
1098
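            The table above can be imitated with a little linear interpolation
            in Python. Take this as a simplified picture of the idea only:
            RRDtool's real bookkeeping is more involved, and the helper name
            resample is made up. Readings are (time, counter) pairs with the
            time relative to the start, as in the table; at least two readings
            are needed.

```python
def resample(readings, step=300):
    """Estimate the counter value at every exact multiple of `step`
    between the first and the last reading (hypothetical helper)."""
    result = []
    # first step boundary at or after the first reading
    t = -(-readings[0][0] // step) * step
    i = 0
    while t <= readings[-1][0]:
        # move on to the pair of readings that surrounds boundary t
        while i + 2 < len(readings) and readings[i + 1][0] < t:
            i += 1
        (t0, v0), (t1, v1) = readings[i], readings[i + 1]
        # straight line between the two readings, evaluated at time t
        result.append((t, v0 + (t - t0) * (v1 - v0) / (t1 - t0)))
        t += step
    return result


reality = [(0, 0), (300, 300), (603, 603), (900, 900)]
for t, value in resample(reality):
    print("time+%03d: %d" % (t, value))
```
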
1099        Let's create two identical databases. I've chosen the time range
1100        920805000 to 920805900 as this goes very well with the example numbers.
1101
1102           rrdtool create seconds1.rrd   \
1103              --start 920804700          \
1104              DS:seconds:COUNTER:600:U:U \
1105              RRA:AVERAGE:0.5:1:24
1106
1107        Make a copy
1108
1109           for Unix: cp seconds1.rrd seconds2.rrd
1110           for Dos:  copy seconds1.rrd seconds2.rrd
1111           for vms:  how would I know :)
1112
1113        Put in some data
1114
1115           rrdtool update seconds1.rrd \
1116              920805000:000 920805300:300 920805600:600 920805900:900
1117           rrdtool update seconds2.rrd \
1118              920805000:000 920805300:300 920805603:603 920805900:900
1119
1120        Create output
1121
1122           rrdtool graph seconds1.png                       \
1123              --start 920804700 --end 920806200             \
1124              --height 200                                  \
1125              --upper-limit 1.05 --lower-limit 0.95 --rigid \
1126              DEF:seconds=seconds1.rrd:seconds:AVERAGE      \
1127              CDEF:unknown=seconds,UN                       \
1128              LINE2:seconds#0000FF                          \
1129              AREA:unknown#FF0000
1130           rrdtool graph seconds2.png                       \
1131              --start 920804700 --end 920806200             \
1132              --height 200                                  \
1133              --upper-limit 1.05 --lower-limit 0.95 --rigid \
1134              DEF:seconds=seconds2.rrd:seconds:AVERAGE      \
1135              CDEF:unknown=seconds,UN                       \
1136              LINE2:seconds#0000FF                          \
1137              AREA:unknown#FF0000
1138
1139        View both images together (add them to your index.html file) and com-
1140        pare. Both graphs should show the same, despite the input being differ-
1141        ent.
1142
1143 WRAPUP
1144        It's time now to wrap up this tutorial. We covered all the basics for
1145        you to be able to work with RRDtool and to read the additional documen-
1146        tation available. There is plenty more to discover about RRDtool and
1147        you will find more and more uses for this package. You can easily cre-
1148        ate graphs using just the examples provided and using only RRDtool. You
1149        can also use one of the front ends to RRDtool that are available.
1150
1151 MAILINGLIST
1152        Remember to subscribe to the RRDtool mailing list. Even if you are not
1153        answering mails that come by, it helps both you and the rest of the
1154        users. A lot of the stuff that I know about MRTG (and therefore about
1155        RRDtool) I've learned while just reading the list without posting to
1156        it. I did not need to ask the basic questions as they are answered in
1157        the FAQ (read it!) and in various mails by other users. With thousands
1158        of users all over the world, there will always be people who ask ques-
1159        tions that you can answer because you read this and other documentation
1160        and they didn't.
1161
1162 SEE ALSO
1163        The RRDtool manpages
1164
1165 AUTHOR
1166        I hope you enjoyed the examples and their descriptions. If you do, help
1167        other people by pointing them to this document when they are asking
1168        basic questions. They will not only get their answers, but at the same
1169        time learn a whole lot more.
1170
1171        Alex van den Bogaerdt <alex@vandenbogaerdt.nl>
1172
1173
1174
1175 1.3.7                             2009-02-21                    RRDTUTORIAL(1)