From: Mariusz Gronczewski
Date: Tue, 18 Aug 2009 19:18:06 +0000 (+0200)
Subject: rrdtool plugin: Call rand(3) less often.
X-Git-Tag: collectd-4.8.0~40^2~2
X-Git-Url: https://git.tokkee.org/?a=commitdiff_plain;h=bdcac4078f8052b8e4f425a1e5aea3957551e0d3;p=collectd.git

rrdtool plugin: Call rand(3) less often.

2009/8/18 Florian Forster:
> Hi Mariusz,
>
> On Mon, Aug 17, 2009 at 02:20:29AM +0200, Mariusz Gronczewski wrote:
>> I was thinking how to "spread out" writes to rrd files a bit, because
>> now it's a big spike every CacheTimeout, or a slightly smaller "square"
>> on the graph if you use WritesPerSecond.
>
> in general I like your patch, thank you very much for posting it :)
> I have some doubts about calling rand() in such a busy place though,
> since getting random numbers is potentially costly. Also, rand(3) is not
> thread-safe, though I don't think that's really an issue for us.

Yeah, good point, but that would probably only be noticeable on very slow
(like PIII 800 slow) machines with tons of rrd files, and then the machine
would run out of disk bandwidth first.

> Maybe a solution would be to add a ‘random_timeout’ member to the
> ‘rrd_cache_t’ struct, too. This member is then set when creating the
> entry and set again right after the values have been removed. That way
> rand(3) is only called once for each write instead of calling for every
> check.

Yeah, very good idea, I didn't think about that (well, tbh I didn't look
much into the internals of the rrdtool plugin). I've implemented it in the
attached patch; so far I've been testing it for about an hour and it works
pretty well.

> As an interesting sidenote: With the above approach, the random write
> times are distributed “uniform”, i.e. every delay from 0 to max-1
> seconds has the same probability. With your code, I think the actual
> time a value is written follows a “normal” distribution (you know, that
> famous bell curve). So I'd expect the above approach to spread the value
> quicker.

Yup, exactly as you said, it's much quicker like that.
I'm wondering what the config variable should be called; the name
"RandomTimeout" doesn't mean anything useful ("random timeout of what?").
Maybe TimeoutSpread? RandomizeTimeout?
---

diff --git a/src/rrdtool.c b/src/rrdtool.c
index 27879443..2bcfb6d9 100644
--- a/src/rrdtool.c
+++ b/src/rrdtool.c
@@ -40,6 +40,7 @@ struct rrd_cache_s
 	char **values;
 	time_t first_value;
 	time_t last_value;
+	int random_variation;
 	enum
 	{
 		FLAG_NONE   = 0x00,
@@ -742,15 +743,14 @@ static int rrd_cache_insert (const char *filename,
 			filename, rc->values_num,
 			(unsigned long)(rc->last_value - rc->first_value));
 
-
-	if ((rc->last_value - rc->first_value) >= (cache_timeout + (random_timeout - (rand() % random_timeout_mod) ) ) )
+	if ((rc->last_value - rc->first_value + rc->random_variation) >= cache_timeout)
 	{
 		/* XXX: If you need to lock both, cache_lock and queue_lock, at
 		 * the same time, ALWAYS lock `cache_lock' first! */
 		if (rc->flags == FLAG_NONE)
 		{
 			int status;
-
+			rc->random_variation = (random_timeout - (rand() % random_timeout_mod));
 			status = rrd_queue_enqueue (filename, &queue_head, &queue_tail);
 			if (status == 0)
 				rc->flags = FLAG_QUEUED;
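
For readers who want to try the idea outside the plugin, here is a minimal
standalone sketch of the "draw once per write" approach discussed in the
thread. It is not the plugin's code: cache_entry_t, draw_variation(),
entry_is_due() and entry_flushed() are hypothetical names, and the symmetric
range [-spread, +spread] is an assumption, since the definition of
random_timeout_mod is not part of this diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, trimmed-down stand-in for the plugin's rrd_cache_t. */
typedef struct
{
	time_t first_value;      /* timestamp of the oldest cached value */
	time_t last_value;       /* timestamp of the newest cached value */
	int    random_variation; /* per-entry offset, drawn once per write */
} cache_entry_t;

/* Draw an offset in [-spread, +spread]; returns 0 if spread is 0.
 * `spread' plays the role of random_timeout (range is an assumption). */
static int draw_variation (int spread)
{
	assert (spread >= 0);
	if (spread == 0)
		return (0);
	return (spread - (rand () % (2 * spread + 1)));
}

/* Same shape as the patched check: no rand() call on the hot path. */
static int entry_is_due (const cache_entry_t *e, int cache_timeout)
{
	return ((e->last_value - e->first_value + e->random_variation)
			>= cache_timeout);
}

/* Call after the values were handed off; re-arms the entry for the
 * next cycle and draws the only random number of that cycle. */
static void entry_flushed (cache_entry_t *e, time_t now, int spread)
{
	e->first_value = now;
	e->last_value = now;
	e->random_variation = draw_variation (spread);
}
```

A caller would update last_value on every insert, test entry_is_due() each
time, and call entry_flushed() only when the entry is actually written, so
rand(3) runs once per write rather than once per check.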