collectd.conf(5): Document the new `RandomTimeout' option.
rrdtool plugin: Optimize away the ‘random_timeout_mod’ variable.
rrdtool plugin: Call rand(3) less often.
2009/8/18 Florian Forster <octo@verplant.org>:
> Hi Mariusz,
>
> On Mon, Aug 17, 2009 at 02:20:29AM +0200, Mariusz Gronczewski wrote:
>> I was thinking about how to "spread out" writes to RRD files a bit, because
>> right now there is a big spike every CacheTimeout, or a slightly smaller
>> "square" on the graph if you use WritesPerSecond.
>
> in general I like your patch, thank you very much for posting it :)
> I have some doubts about calling rand() in such a busy place though,
> since getting random numbers is potentially costly. Also, rand(3) is not
> thread-safe, though I don't think that's really an issue for us.
Yeah, good point, but that would probably only be noticeable on very slow
(like PIII 800 slow) machines with tons of RRD files, and such a machine
would run out of disk bandwidth first.
> Maybe a solution would be to add a ‘random_timeout’ member to the
> ‘rrd_cache_t’ struct, too. This member is then set when creating the
> entry and set again right after the values have been removed. That way
> rand(3) is only called once for each write instead of calling for every
> check.
Yeah, very good idea, I didn't think of that (to be honest, I hadn't looked
much into the "interiors" of the rrdtool plugin). I've implemented it in the
attached patch; I've been testing it for about an hour so far and it works
pretty well.
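A minimal sketch of the idea in C: apart from `rrd_cache_t' and the
`random_timeout' member discussed above, the names and the exact check are
illustrative assumptions, not the actual plugin code.

  #include <stdlib.h>
  #include <time.h>

  /* Sketch: draw a fresh offset in [-random_timeout_max, +random_timeout_max)
   * whenever a cache entry is (re)filled, so rand(3) runs once per write
   * instead of once per check. */
  struct rrd_cache_s
  {
    time_t first_value;
    time_t last_value;
    int    random_timeout;   /* per-entry offset, refreshed after each flush */
    /* ... existing members ... */
  };

  static int random_timeout_max = 0; /* seconds, from the RandomTimeout option */

  static void cache_update_random_timeout (struct rrd_cache_s *rc)
  {
    if (random_timeout_max <= 0)
      rc->random_timeout = 0;
    else
      rc->random_timeout = (rand () % (2 * random_timeout_max))
        - random_timeout_max;
  }

  static int cache_entry_is_old (const struct rrd_cache_s *rc,
      time_t now, int cache_timeout)
  {
    /* The per-entry offset simply shifts the timeout for this one entry. */
    return ((now - rc->first_value) >= (cache_timeout + rc->random_timeout));
  }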
> As an interesting sidenote: With the above approach, the random write
> times are distributed “uniform”, i. e. every delay from 0 to max-1
> seconds has the same probability. With your code, I think the actual
> time a value is written follows a “normal” distribution (you know, that
> famous bell curve). So I'd expect the above approach to spread the value
> quicker.
Yup, it's exactly as you said, it spreads out much more quickly that way.
I'm wondering what the config variable should be called; the name
"RandomTimeout" doesn't say anything useful ("random timeout of what?").
Maybe TimeoutSpread? RandomizeTimeout?
Random write timeout for rrdtool plugin
Hi,
I was thinking about how to "spread out" writes to RRD files a bit, because
right now there is a big spike every CacheTimeout, or a slightly smaller
"square" on the graph if you use WritesPerSecond. So I've written a little
patch which "spreads out" the writing by varying the cache timeout every time
the rrdtool plugin looks for data to save. Basically, instead of moving data
older than CacheTimeout to the write queue, it moves data that is older than
CacheTimeout ± RandomTimeout. What does it change?
Without it, the gathered data is "synchronised" with each other, for
example (CacheTimeout = 600):
1. collectd starts
2. after 10 minutes, the data from all plugins gets "too old", is pushed
   into the write queue and is saved
3. after another 10 minutes the same thing happens: all data "ages" at the
   same time and is saved in one big chunk
With it (RandomTimeout = 300) it works like this:
1. collectd starts
2. after 5 minutes some data (let's call it A) starts to go into the write queue
3. 10 minutes after the start, about 50% of the data (on average) has been
   saved (let's call it B)
4. finally, after 15 minutes, all the "leftover" data is saved (let's call it C)
5. the next "cycle" starts
6. data A ages first (because it was written to disk first) and, like before,
   some of it is written earlier and some later
7. after that data B ages and, like before, the writes are spread over 10 minutes
8. the same happens with C
So the first cycle (looking at the I/O) looks like a sine wave, the next
10-minute cycle is the same sine wave but flattened a bit, and so on (like a
fading sine wave); after a few cycles you get pretty much the same number of
writes per second, with no ugly spikes.
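A configuration matching the numbers used above could look like this (a
sketch; the DataDir path is only an example, and the option name follows the
`RandomTimeout' spelling used in this patch series):

  <Plugin rrdtool>
    DataDir "/var/lib/collectd/rrd"
    CacheTimeout 600
    RandomTimeout 300
  </Plugin>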
The effect looks like this:
http://img24.imageshack.us/img24/7294/drrawcgi.png
(after a few more hours it will be even "smoother")
Regards
Mariusz
Signed-off-by: Florian Forster <octo@huhu.verplant.org>
madwifi plugin: Signal an error in the read function when appropriate.
An error will be signaled to the daemon if querying all interfaces failed.
Querying an interface fails if all ioctls return an error.
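A sketch of the pattern in C; the helper, the interface list and its contents
are placeholders rather than the plugin's actual code — the point is that a
collectd read callback signals failure by returning non-zero.

  #include <stddef.h>

  static const char *interfaces[] = { "wifi0", "wifi1" }; /* example names */

  static int query_interface (const char *ifname)
  {
    /* In the real plugin this issues several ioctls; it fails only if
     * every ioctl for this interface returned an error. */
    (void) ifname;
    return 0;
  }

  static int madwifi_read (void)
  {
    int num_success = 0;

    for (size_t i = 0; i < sizeof (interfaces) / sizeof (interfaces[0]); i++)
      if (query_interface (interfaces[i]) == 0)
        num_success++;

    /* A non-zero return value signals the error to the daemon. */
    return (num_success > 0) ? 0 : -1;
  }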
madwifi plugin: Rename the antenna stats.
The first part of the type instance is already something like `ast_ant_rx' -
using `antenna%i' as the second part is therefore redundant. Thanks to Ondrej
for the pointer.
madwifi plugin: Unify ioctl error handling.
If an ioctl fails, a debug message is generated rather than an error message.
There are several types of interfaces managed by the madwifi driver, and not
all interfaces support all ioctls. Thanks to Ondrej for pointing this out.
madwifi plugin: Fix buffer handling around `readlink'.
readlink(2) does not null-terminate the buffer. Thanks to Ondrej for
pointing this out.
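The usual idiom, sketched as a small helper (the function name and the
calling convention are assumptions for the example):

  #include <string.h>
  #include <unistd.h>

  /* Read a symbolic link into a fixed buffer and null-terminate it ourselves,
   * because readlink(2) never appends the terminating null byte. */
  static int read_link_safe (const char *path, char *buf, size_t buf_size)
  {
    ssize_t len;

    if (buf_size < 1)
      return -1;

    len = readlink (path, buf, buf_size - 1);
    if (len < 0)
    {
      buf[0] = '\0';
      return -1;
    }
    buf[len] = '\0';
    return 0;
  }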
Merge branch 'ff/genericjmx'
AUTHORS: Add Ondrej.
madwifi plugin: Add some assertions …
… to otherwise unchecked array indices.
madwifi plugin: Fix a few best practices.
Use `sstrncpy' and `ssnprintf' instead of the unsafe versions. Don't
specify array dimensions twice. Don't cast _Bool to int.
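For context, a small usage sketch of collectd's safe string helpers from
src/common.h; the struct and its fields are made up for the example.

  #include "collectd.h"
  #include "common.h"   /* sstrncpy(), ssnprintf() */

  struct example_s { char name[32]; char type_instance[64]; };

  static void fill_example (struct example_s *ex, const char *ifname, int antenna)
  {
    /* sstrncpy() behaves like strncpy() but guarantees null-termination. */
    sstrncpy (ex->name, ifname, sizeof (ex->name));
    /* ssnprintf() is collectd's bounded snprintf() wrapper. */
    ssnprintf (ex->type_instance, sizeof (ex->type_instance),
        "%s-%i", ifname, antenna);
  }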
madwifi plugin: Rename the ‘DisableSysfs’ option to ‘Source’.
Configurations like
DisableSysfs false
are confusing.
madwifi plugin: Plugin for detailed information from the MadWifi driver.
Hello
After some time I managed to make a new version of the madwifi plugin. The
main change is that it is now possible to finely tune the set of monitored
statistics; only the most important statistics are monitored by default. The
number of new data types has also been reduced (by using type instances).
Signed-off-by: Florian Forster <octo@verplant.org>
src/utils_cache.c: Don't tell the user about missing values.
This is bound to confuse users..
network plugin: Use the meta data to implement the `Forward' option.
Previously, a cache in the network plugin was used to keep track of
which values were received via the network in order to distinguish
between ``forwarded'' values and values that were received from
somewhere else.
The same cache was also used to avoid loops when forwarding packets, by
keeping track of the highest timestamp that was sent by the plugin and
discarding received data that was older than or as old as that.
This information is now kept in the meta data of the global cache (what
is the last timestamp sent?) and in the meta data of the value list (was
this value list received via the network?). The cache that was
maintained in the network plugin has been removed.
src/utils_cache.[ch]: Make the `value_list_t *' const.
The struct is only needed to build the name (a string) anyway..
src/utils_cache.c: Free the meta data when removing a cache entry.
src/utils_cache.[ch]: Implement a meta-data interface for cached values.
This should make it possible to write stateful matches and similar nifty
stuff - at last \o/
src/common.h: Remove the `ds' argument from the `FORMAT_VL' macro.
Since `type' is now included in `value_list_t' the `data_set_t' is no
longer required.
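For illustration, the simplified macro presumably ends up along these lines
(a sketch; the exact expansion in src/common.h may differ slightly):

  /* `type' now comes from the value list itself, so the former `ds'
   * argument -- and (ds)->type -- is no longer needed.
   * format_name() is declared in src/common.h. */
  #define FORMAT_VL(ret, ret_len, vl) \
    format_name (ret, ret_len, (vl)->host, (vl)->plugin, (vl)->plugin_instance, \
        (vl)->type, (vl)->type_instance)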
src/collectd.conf.in: Fix the default class path of the java plugin.
contrib/GenericJMX.conf: Added a sample config file for the GenericJMX plugin.
java bindings: GenericJMX: Add support for more numeric classes.
java bindings: GenericJMX: Fix a couple of error messages.
Also renamed a variable to fit the naming scheme.
java bindings: GenericJMX: Add support for "InstanceFrom".
This can be used to specify so-called "properties" to include in the
plugin instance.
java bindings: GenericJMX: This first prototype version seems to do something.
Well, at least it's not throwing exceptions like mad..
java bindings: JMXMemory: Remove an annoying folding.
java bindings: Add first take at a `GenericJMX' plugin.
src/collectd.conf.in: java plugin: Use @prefix@ when building the example class path.
java bindings: DataSource: Add `DERIVE' and `ABSOLUTE'.
df plugin, AUTHORS: Add Paul.
df plugin: Fix some "best practices" that have been changed.
Add option to collectd.conf
collectd.conf(5): Add new config option.
df plugin: Add option to report by mountpoint or devicename
match_empty_counter plugin: Match for zero counter values.
src/utils_cache.[ch]: Add uc_get_history[_by_name].
These two new functions can be used to get historical data of values in
the cache. This can be used to calculate floating averages, hysteresis
and a shipload of other aggregation and consolidation functions.
The current implementation is probably not yet perfect:
- If not enough values are available to satisfy the request, the buffer
will be enlarged and NaNs will be returned in the newly allocated
cells. The caller has no way to recognize this case.
- If a value is missing, no NaNs will be added to the cache. It's
unclear if this was desirable.
- The returned values are reversed, i. e. val[0] will be the newest
value, val[n-1] will be the oldest. Here, too, I'm unsure which way
is easier to comprehend / use. I went for this implementation because
it was easier to write.
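A usage sketch for a single-data-source moving average. The prototype of
uc_get_history_by_name() is assumed from the description above and should be
checked against src/utils_cache.h; the helper itself is purely illustrative.

  #include <math.h>
  #include "collectd.h"
  #include "plugin.h"        /* gauge_t */
  #include "utils_cache.h"   /* uc_get_history_by_name(), assumed prototype */

  /* Average the newest `steps' values of a metric with one data source.
   * history[0] is the newest value; NANs mark slots the cache could not fill. */
  static gauge_t history_average (const char *name, size_t steps)
  {
    gauge_t history[16];
    gauge_t sum = 0.0;
    size_t  valid = 0;

    if ((steps < 1) || (steps > 16))
      return NAN;
    if (uc_get_history_by_name (name, history, steps, /* num_ds = */ 1) != 0)
      return NAN;

    for (size_t i = 0; i < steps; i++)
    {
      if (isnan (history[i]))
        continue;
      sum += history[i];
      valid++;
    }
    return (valid > 0) ? (sum / valid) : NAN;
  }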
Merge branch 'collectd-4.7'
ChangeLog: Fix a typo.
Bumped version to 4.7.2; Updated ChangeLog.
Merge branch 'collectd-4.6' into collectd-4.7
Conflicts:
ChangeLog
version-gen.sh
Bumped version to 4.6.4; Updated ChangeLog.
Merge branch 'collectd-4.7'
Merge branch 'collectd-4.6' into collectd-4.7
Conflicts:
src/memcached.c
src/configfile.c: Warn if an unexpected block is found.
If the `snmp' plugin isn't loaded (but a configuration for it exists), no
warning was printed, because there are only blocks in the SNMP
configuration.
build.sh, version-gen.sh: Remove bashisms.
Thanks to Peter Bray for pointing them out.
src/collectd.conf.in: Fix a typo in tokyotyrant's sample config.
Merge branch 'ps/tokyotyrant'
collectd.conf(5): Improved markup of the tokyotyrant documentation.
.gitignore: Update the file.
The pattern `Makefile.in' will match `src/Makefile.in' and others,
because the pattern does not contain a slash.
`/configure' will only match the configure script in the base directory
due to special syntax.
`.libs/' matches only directories named `.libs', special syntax again.
For more information see the `gitignore(5)' manual page. The syntax used
corresponds to Git 1.6.
tokyotyrant plugin: Lookup service names (port names) and minor fixes.
Build system: Improve detection of the tokyotyrant library.
src/utils_cache.c: `ce' *is* written to in `c_avl_remove'.
Therefore we should definitely free it.
src/collectd.conf.in: Fix a typo.
src/utils_cache.c: uc_check_timeout: Don't free a `ce' from the previous iteration.
This may have been a cause of the reported assertion failure, too.
src/utils_cache.c: Add a missing `continue'.
tokkee on IRC and I think we found a bug in utils_cache.c: the uc_check_timeout
function is missing a `continue' after the "uninteresting" service check,
which causes a key to be NULL.
This probably caused the assertion failure in cache_compare, as reported by
Mariusz.
tokyotyrant plugin: Make DB handle `static'.
.gitignore: Add some *.o files.
tokyotyrant plugin: Don't need to pass the db handle around, it's global.
tokyotyrant plugin: Only connect once.
tokyotyrant plugin: Handle port config param as a string
Add some documentation for tokyotyrant to the collectd.conf manpage
cpu plugin: Fix a typo.
src/utils_threshold.c: Change the percentage code so it works with the DataSource option.
The percentage code used to *always* check the first data source. With this
patch, the code honors the `DataSource' option again, checking only the
configured data sources if applicable.
collectd.conf(5): Document the new `Percentage' option.
src/utils_threshold.c: Fix a typo.
src/utils_threshold.c: Add a percent sign to the minimum value, too.
src/utils_threshold.c: Percentage support in thresholds
Hi all!
I'm attaching a patch that adds percentage support in thresholds, as in this
example:
<Threshold>
  <Type df>
    WarningMax 90
    Percentage true
  </Type>
</Threshold>
The percentage option works like collectd-nagios, that is, it calculates the
percentage of the value of the first DS over the total. For the df plugin,
for example, it calculates the percentage of the "used" DS.
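To make the arithmetic concrete (the numbers are invented for the example):
with the df type's "used" and "free" data sources at 45 GB and 5 GB, the
checked value is

  45 / (45 + 5) * 100 = 90 %

which is exactly the WarningMax configured above.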
Bugs and suggestions are welcome :)
Enjoy!
Regards,
Andres
Update README and add Paul to the AUTHORS file.
Fix a bug with recording of port
The port was being written to plugin_instance as "1978.00000", because
apparently that's the value returned by the config.
Changes suggested by Sebastian Harl.
* Separate Host and Port in config, report Host as hostname, and Port as
plugin instance.
* Submit before closing connection.
* Else-case in config, in case of invalid config params.
* Flounder around at using pkg-config in configure.in
* Remove forward declarations.
* Include plugin in config summary.
Plugin for monitoring TokyoTyrant
This plugin monitors the record count and file size of the configured
tokyocabinet server.
TokyoTyrant: http://tokyocabinet.sourceforge.net/tyrantdoc/
memcached plugin: Pass `ai_hints' to `getaddrinfo'.
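For context, the general pattern looks roughly like this (a sketch; the host
and port are illustrative defaults, not the plugin's configuration handling):

  #include <netdb.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  /* Without a hints struct, getaddrinfo() may return address families and
   * socket types the caller cannot actually use. */
  static int resolve_example (struct addrinfo **ret)
  {
    struct addrinfo ai_hints;

    memset (&ai_hints, 0, sizeof (ai_hints));
    ai_hints.ai_family   = AF_UNSPEC;     /* IPv4 or IPv6 */
    ai_hints.ai_socktype = SOCK_STREAM;   /* TCP */

    return getaddrinfo ("localhost", "11211", &ai_hints, ret);
  }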
Merge branch 'collectd-4.7'
Merge branch 'collectd-4.6' into collectd-4.7
bindings/java/Makefile.am: Fully support $DESTDIR.
src/Makefile: Link the ping plugin against libm.
The plugin now uses sqrt() which is provided by the math lib.
collectd2html.pl: Added --recursive command line option.
This option may be used to recursively scan the specified directory for RRD
files. This way, the script works reasonably well with collectd 4.
Thanks to 'ABL <abl@xxx.lt>' for providing an initial patch in Debian bug
#482185.
collectd2html.pl: Allow for --imgformat to be passed to rrdtool.
This was reported as Debian bug #482185.
Signed-off-by: Sebastian Harl <sh@tokkee.org>
rrdcached plugin: Fix a typo.
network plugin: Cast data sources to their respective types.
Various plugins: Fix format string errors.
perl plugin: Improve handling of DERIVE and ABSOLUTE data source types.
java plugin: Improve handling of DERIVE and ABSOLUTE data source types.
csv plugin: Improve handling of DERIVE and ABSOLUTE data source types.
collectd-perl(5): Add the DERIVE and ABSOLUTE data source types.
gmond plugin: Add the DERIVE and ABSOLUTE data source types.
couchdb plugin: Add the DERIVE and ABSOLUTE data source types.
src/utils_cmd_putval.c: Use `parse_values'.
snmp plugin: Use `parse_value' instead of using a separate function here.
src/plugin.c: Introduce the `DS_TYPE_TO_STRING' macro.
src/common.c: Rewrite `parse_value'.
src/utils_cache.c: Add the DERIVE and ABSOLUTE data source types.
src/plugin.h: Use `int64_t' for `derive_t' and `uint64_t' for `absolute_t'.
network plugin: Add the DERIVE and ABSOLUTE data source types.
gmond plugin: Use `strtoull' to parse counter values.
Instead of `strtoll'.
src/common.c: More reliable error reporting in `parse_values'.
Introduce the DERIVE and ABSOLUTE data source types.
Hi,
I've updated my patch to 4.7.0; most of the "data input" plugins (curl, java,
exec, perl, tail, couchdb) should work with DERIVE. In the case of couchdb and
curl, if you use an ABSOLUTE DS you can only "Set", not "Inc" or "Add",
because obviously that wouldn't make much sense with it. Other plugins can be
"enabled" globally to use DERIVE by changing "COUNTER" to "DERIVE" in
types.db, but that way is ugly (though it makes sense in some cases, e.g. when
you have a lot of tunnels or ppp interfaces) and requires either converting or
recreating the RRD files.
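As an example of the types.db change mentioned above, switching the network
interface counters would mean turning a line like

  if_octets  rx:COUNTER:0:4294967295, tx:COUNTER:0:4294967295

into

  if_octets  rx:DERIVE:0:U, tx:DERIVE:0:U

(the entry is reproduced from memory, so treat the exact limits as
approximate; as noted, existing RRD files then have to be converted or
recreated).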
Regards
Mariusz
---
Hi,
I've been running my patch with 4.7.1. I found a minor bug, but after fixing
that I haven't had any problems with it on my servers. I'm including the
patch (against 4.7.1 from the webpage).
Regards,
XANi
configure.in: Add -rpath to JAVA_LDFLAGS.
src/plugin.[ch]: Add meta data to value_list_t.