X-Git-Url: https://git.tokkee.org/?p=pkg-rrdtool.git;a=blobdiff_plain;f=doc%2Frrdcreate.html;h=adc0755016ed333275f93346144a3470fcd96ee6;hp=97172f63f487218bfe49fc41ab5cb649fa2ba19a;hb=51c3d3fb997c22e1ee828470413f1e84989e1f6c;hpb=1559397b94b4af3de73cfa23c04be31d8bee53e7 diff --git a/doc/rrdcreate.html b/doc/rrdcreate.html index 97172f6..adc0755 100644 --- a/doc/rrdcreate.html +++ b/doc/rrdcreate.html @@ -9,8 +9,10 @@ -

+ +
+

+ + +

@@ -62,11 +68,11 @@ and filled with *UNKNOWN* data.

filename

The name of the RRD you want to create. RRD files should end -with the extension .rrd. However, RRDtool will accept any +with the extension .rrd. However, RRDtool will accept any filename.

-

--start|-b start time (default: now - 10s)

+

--start|-b start time (default: now - 10s)

Specifies the time in seconds since 1970-01-01 UTC when the first value should be added to the RRD. RRDtool will not accept any data timed before or at the time specified.

@@ -74,12 +80,12 @@ any data timed before or at the time specified.

rrdfetch documentation for other ways to specify time.

-

--step|-s step (default: 300 seconds)

+

--step|-s step (default: 300 seconds)

Specifies the base interval in seconds with which data will be fed into the RRD.

-

DS:ds-name:DST:dst arguments

+

DS:ds-name:DST:dst arguments

A single RRD can accept input from several data sources (DS), for example incoming and outgoing traffic on a specific communication line. With the DS configuration option you must define some basic @@ -94,16 +100,16 @@ DERIVE, and ABSOLUTE the format for a data source entry is:

For COMPUTE data sources, the format is:

DS:ds-name:COMPUTE:rpn-expression

In order to decide which data source type to use, review the -definitions that follow. Also consult the section on ``HOW TO MEASURE'' +definitions that follow. Also consult the section on "HOW TO MEASURE" for further insight.

-
GAUGE
+
GAUGE

is for things like temperatures or number of people in a room or the value of a RedHat share.

-
COUNTER
+
COUNTER

is for continuous incrementing counters like the ifInOctets counter in @@ -114,7 +120,7 @@ rate. When the counter overflows, RRDtool checks if the overflow happened at the 32bit or 64bit border and acts accordingly by adding an appropriate value to the result.

-
DERIVE
+
DERIVE

will store the derivative of the line going from the last to the @@ -126,10 +132,10 @@ might want to use DERIVE and combine it with a MIN value of 0.

NOTE on COUNTER vs DERIVE

by Don Baarda <don.baarda@baesystems.com>

If you cannot tolerate ever mistaking the occasional counter reset for a -legitimate counter wrap, and would prefer ``Unknowns'' for all legitimate +legitimate counter wrap, and would prefer "Unknowns" for all legitimate counter wraps and resets, always use DERIVE with min=0. Otherwise, using COUNTER with a suitable max will return correct values for all legitimate -counter wraps, mark some counter resets as ``Unknown'', but can mistake some +counter wraps, mark some counter resets as "Unknown", but can mistake some counter resets for a legitimate counter wrap.

For a 5 minute step and 32-bit counter, the probability of mistaking a counter reset for a legitimate wrap is arguably about 0.8% per 1Mbps of @@ -139,7 +145,7 @@ probably preferable. If you are using a 64bit counter, just about any max setting will eliminate the possibility of mistaking a reset for a counter wrap.

-
ABSOLUTE
+
ABSOLUTE

is for counters which get reset upon reading. This is used for fast counters @@ -148,7 +154,7 @@ after every read to make sure you have a maximum time available before the next overflow. Another usage is for things you count like number of messages since the last update.

-
COMPUTE
+
COMPUTE

is for storing the result of a formula applied to other data sources @@ -158,7 +164,7 @@ the data sources according to the rpn-expression that defines the formula. Consolidation functions are then applied normally to the PDPs of the COMPUTE data source (that is the rpn-expression is only applied to generate PDPs). In database software, such data sets are referred -to as ``virtual'' or ``computed'' columns.

+to as "virtual" or "computed" columns.

heartbeat defines the maximum number of seconds that may pass @@ -197,22 +203,22 @@ the archive. There are several consolidation functions that consolidate primary data points via an aggregate function: AVERAGE, MIN, MAX, LAST.

-
AVERAGE
+
AVERAGE

the average of the data points is stored.

-
MIN
+
MIN

the smallest of the data points is stored.

-
MAX
+
MAX

the largest of the data points is stored.

-
LAST
+
LAST

the last data point is used.
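The four consolidation functions listed above can be illustrated with a minimal Python sketch. It is not RRDtool's implementation: the function name `consolidate` is hypothetical, and the sketch assumes every primary data point in the list is a known number (it does not model *UNKNOWN* values).

```python
def consolidate(pdps, cf):
    """Reduce a list of known primary data points to one value.

    `pdps` is a non-empty list of floats; `cf` is one of the
    consolidation function names from the text above.
    """
    if not pdps:
        raise ValueError("need at least one primary data point")
    if cf == "AVERAGE":
        return sum(pdps) / len(pdps)
    if cf == "MIN":
        return min(pdps)
    if cf == "MAX":
        return max(pdps)
    if cf == "LAST":
        return pdps[-1]
    raise ValueError("unknown consolidation function: " + cf)
```

For example, `consolidate([1.0, 3.0, 2.0], "LAST")` returns the final element of the list, while `"MAX"` returns the largest one.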

@@ -236,7 +242,7 @@ Obviously, this has to be greater than zero.


-

Aberrant Behavior Detection with Holt-Winters Forecasting

+

Aberrant Behavior Detection with Holt-Winters Forecasting

In addition to the aggregate functions, there are a set of specialized functions that enable RRDtool to provide data smoothing (via the Holt-Winters forecasting algorithm), confidence bands, and the @@ -386,28 +392,28 @@ default value is 9.

It may help you to sort out why all this *UNKNOWN* data is popping up in your databases:

RRDtool gets fed samples/updates at arbitrary times. From these it builds Primary -Data Points (PDPs) on every ``step'' interval. The PDPs are +Data Points (PDPs) on every "step" interval. The PDPs are then accumulated into the RRAs.

-

The ``heartbeat'' defines the maximum acceptable interval between -samples/updates. If the interval between samples is less than ``heartbeat'', +

The "heartbeat" defines the maximum acceptable interval between +samples/updates. If the interval between samples is less than "heartbeat", then an average rate is calculated and applied for that interval. If -the interval between samples is longer than ``heartbeat'', then that -entire interval is considered ``unknown''. Note that there are other -things that can make a sample interval ``unknown'', such as the rate +the interval between samples is longer than "heartbeat", then that +entire interval is considered "unknown". Note that there are other +things that can make a sample interval "unknown", such as the rate exceeding limits, or a sample that was explicitly marked as unknown.

-

The known rates during a PDP's ``step'' interval are used to calculate -an average rate for that PDP. If the total ``unknown'' time accounts for -more than half the ``step'', the entire PDP is marked -as ``unknown''. This means that a mixture of known and ``unknown'' sample -times in a single PDP ``step'' may or may not add up to enough ``known'' +

The known rates during a PDP's "step" interval are used to calculate +an average rate for that PDP. If the total "unknown" time accounts for +more than half the "step", the entire PDP is marked +as "unknown". This means that a mixture of known and "unknown" sample +times in a single PDP "step" may or may not add up to enough "known" time to warrant a known PDP.

-

The ``heartbeat'' can be short (unusual) or long (typical) relative to -the ``step'' interval between PDPs. A short ``heartbeat'' means you +

The "heartbeat" can be short (unusual) or long (typical) relative to +the "step" interval between PDPs. A short "heartbeat" means you require multiple samples per PDP, and if you don't get them mark the -PDP unknown. A long heartbeat can span multiple ``steps'', which means +PDP unknown. A long heartbeat can span multiple "steps", which means it is acceptable to have multiple PDPs calculated from a single -sample. An extreme example of this might be a ``step'' of 5 minutes and a -``heartbeat'' of one day, in which case a single sample every day will +sample. An extreme example of this might be a "step" of 5 minutes and a +"heartbeat" of one day, in which case a single sample every day will result in all the PDPs for that entire day period being set to the same average rate. -- Don Baarda <don.baarda@baesystems.com>
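The "more than half the step" rule described above can be sketched in Python as follows. This is only an illustration of the rule as worded in the text, not RRDtool's actual code. The function name `pdp_rate` and the `(seconds, rate)` input format are assumptions made for this example. An interval whose rate is `None` stands for "unknown" time, for instance because the gap between samples exceeded the heartbeat.

```python
def pdp_rate(intervals, step):
    """Return the average rate for one PDP, or None if it is unknown.

    `intervals` is a list of (seconds, rate) pairs inside one step;
    a rate of None marks that stretch of time as unknown. Any part
    of the step not covered by `intervals` also counts as unknown.
    """
    if step <= 0:
        raise ValueError("step must be greater than zero")
    known_time = sum(sec for sec, rate in intervals if rate is not None)
    unknown_time = step - known_time
    # Rule from the text: more than half the step unknown -> PDP unknown.
    if unknown_time > step / 2:
        return None
    # Time-weighted average over the known stretches only.
    weighted = sum(sec * rate for sec, rate in intervals if rate is not None)
    return weighted / known_time
```

With a 300 second step, 200 known seconds and 100 unknown seconds still give a known PDP, whereas 100 known seconds and 200 unknown seconds do not.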

@@ -453,7 +459,7 @@ same average rate. -- Don Baarda <HOW TO MEASURE
 

Here are a few hints on how to measure:

-
Temperature
+
Temperature

Usually you have some type of meter you can read to get the temperature. @@ -462,7 +468,7 @@ that the temperature reading happened at a certain time. You can use the GAUGE data source type for this. RRDtool will then record your reading together with the time.

-
Mail Messages
+
Mail Messages

Assume you have a method to count the number of messages transported by @@ -476,7 +482,7 @@ from RRDtool for the day in question and multiply this number with the number of seconds in a day. Because all math is run with Doubles, the precision should be acceptable.

-
It's always a Rate
+
It's always a Rate

RRDtool stores rates in amount/second for COUNTER, DERIVE and ABSOLUTE @@ -484,7 +490,7 @@ data. When you plot the data, you will get on the y axis amount/second which you might be tempted to convert to an absolute amount by multiplying by the delta-time between the points. RRDtool plots continuous data, and as such is not appropriate for plotting -absolute amounts as for example ``total bytes'' sent and received in a +absolute amounts as for example "total bytes" sent and received in a router. What you probably want is plot rates that you can scale to bytes/hour, for example, or plot absolute amounts with another tool that draws bar-plots, where the delta-time is clear on the plot for @@ -503,7 +509,7 @@ on the y axis, days on the x axis and one bar for each day).

RRA:MIN:0.5:12:2400 \ RRA:MAX:0.5:12:2400 \ RRA:AVERAGE:0.5:12:2400
-

This sets up an RRD called temperature.rrd which accepts one +

This sets up an RRD called temperature.rrd which accepts one temperature value every 300 seconds. If no new data is supplied for more than 600 seconds, the temperature becomes *UNKNOWN*. The minimum acceptable value is -273 and the maximum is 5'000.
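The limits described for temperature.rrd can be expressed as a small Python sketch. It is a hedged illustration of the stated behaviour, not RRDtool code: the function name `accept_reading` and its parameters are hypothetical, and the defaults simply mirror the values quoted above (600 second heartbeat, minimum -273, maximum 5000).

```python
def accept_reading(value, seconds_since_last_update,
                   heartbeat=600, minimum=-273.0, maximum=5000.0):
    """Return the reading if it would count as known, else None.

    None stands for *UNKNOWN*: either no data arrived for more than
    `heartbeat` seconds, or the value lies outside [minimum, maximum].
    """
    if seconds_since_last_update > heartbeat:
        return None
    if value < minimum or value > maximum:
        return None
    return value
```

A reading of 21.5 arriving 300 seconds after the previous update is accepted; the same reading arriving after 601 seconds, or a reading of -300, is treated as unknown.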