Discussion:
question about --bwlimit=
Wallace Matthews
2004-05-21 18:48:12 UTC
I am doing some benchmarking of rsync. I am using the --bwlimit= option to throttle rsync down to predict its behavior over slow communication links. I am using rsync 2.6.2 from the release site without any patches. I downloaded the release rather than pulling from the CVS tree.

I have 2 servers, "wilber" (the remote archive) and "judy" (the local archive), connected via gigabit Ethernet. I have a file on "judy" whose transfer with the following command completes in under 1 second:

time rsync -ar --rsh=rsh bluesAlbums/Pilgrim/track1.mp3 wilber://test/bluesAlbums/Pilgrim

the track1 file on wilber already exists, and the track1 file on judy has been touched. The file track1 is 6.3 MB in size. With --write-batch, the checksums file is 60 KB and the difference file is 40 KB (no differences). I check the modification time on wilber after each transfer to make sure the transfer actually happened.

If I use the command

time rsync -ar --rsh=rsh --bwlimit=4001 bluesAlbums/Pilgrim/track1.mp3 wilber://test/bluesAlbums/Pilgrim

real = 0.70 to 0.90

If I use the command

time rsync -ar --rsh=rsh --bwlimit=4000 bluesAlbums/Pilgrim/track1.mp3 wilber://test/bluesAlbums/Pilgrim

real = 1m34.000 to 1m35.000

There are no other processes running on either server, and I touch the file on judy each time I repeat the test. I actually have 2 telnet sessions open on judy: in one I repeat the touch, and in the other I repeat the command string after changing the --bwlimit= option. I do this to make sure I don't fat-finger things.

I can repeat this time after time. If --bwlimit is > 4000 (i.e. 4005, 4025, 4050, 5000, 7500, 10000, 100000), real is in the same range as for 4001. If --bwlimit is 4000 or under (i.e. 3725, 2000, 1000, 100), real is in the same range as for 4000.

I can understand bimodal behavior at extremes of bandwidth availability, but I can't understand the cutoff being so acute that a 1K difference yields such a dramatic result.

At unlimited bandwidth, the real time is twice what it is for rcp, so I believe the 0.70 to 0.90 range is correct.

Is there something going on with --bwlimit around the value of 4000 that could be causing this sharp break?

Wally
Wallace Matthews
2004-05-21 20:39:31 UTC
Since --bwlimit depends upon sleep(1 second), I repeated the experiment with a file that was 383 MB, so that running unthrottled it takes significantly longer than a second (i.e. ~50 seconds) to complete. I get the same bi-modal behavior, but with different times for 4000 and 4001 respectively. The fact that the break point stays fixed isn't intuitive (to me at least).

wally
Paul Slootman
2004-05-24 13:36:42 UTC
Post by Wallace Matthews
Since --bwlimit depends upon sleep(1 second), I repeated the experiment with a file that was 383 MB, so that running unthrottled it takes significantly longer than a second (i.e. ~50 seconds) to complete. I get the same bi-modal behavior, but with different times for 4000 and 4001 respectively. The fact that the break point stays fixed isn't intuitive (to me at least).
There have been earlier discussions about the --bwlimit behaviour, and
that it's not that well suited for e.g. slower ADSL lines because it's
rather "bursty": a limit of 20 means it'll write out at full throttle
until it reaches 20k, then it sleeps.

There's an alternative --bwlimit patch that was posted back then, that
takes a subtly different approach (by also limiting the size of the
writes). This prompted a discussion of whether this may have impact on
the tcp packets going out on the wire, perhaps leading to extra tcp
overhead which is contrary to rsync's goal of reducing network traffic
at all costs...

In the most recent Debian versions I've made the other bwlimit
implementation available (via --bwlimit-mod, for "modified"). Of course,
both ends need the Debian hacked version. My private tests have shown it
to work pretty well, other people have also been happy.

I haven't tried it with larger limits than about 100, though...
I doubt whether it will have any effect on your test case.

The patch was basically this:

--- rsync-2.6.2.orig/io.c
+++ rsync-2.6.2/io.c
@@ -814,6 +814,8 @@
 		if (FD_ISSET(fd, &w_fds)) {
 			int ret;
 			size_t n = len-total;
+			if (bwlimit && n > (unsigned)(bwlimit*100))
+				n = bwlimit*100;
 			ret = write(fd,buf+total,n);
 
 			if (ret < 0) {


Paul Slootman
Wayne Davison
2004-05-24 20:54:42 UTC
Post by Wallace Matthews
I can repeat this time after time. If --bwlimit is > 4000 (i.e. 4005,
4025, 4050, 5000, 7500, 10000, 100000), real is in the same range as
for 4001. If --bwlimit is 4000 or under (i.e. 3725, 2000, 1000, 100),
real is in the same range as for 4000.
That is because of this calculation:

tv.tv_usec = bytes_written * 1000 / bwlimit;

Rsync calls this function after a lot of 4-byte writes, and with integer
division the sleep time for "4 * 1000 / 4001" (or any larger bwlimit) is
0, whereas "4 * 1000 / 4000" is 1. Thus, above 4000, rsync skips a bunch
of sleep calls (but not all of them).

I'm looking into some of the old bwlimit patches to see about improving
this.

..wayne..
Wayne Davison
2004-05-24 23:20:14 UTC
Post by Wayne Davison
I'm looking into some of the old bwlimit patches to see about
improving this.
Here's a potential patch to make --bwlimit better. This started with
Roger's idea of accumulating delay until we have enough to make a sleep
call without significant rounding error, but I modified it to keep the
count in bytes written so that we avoid the problem discovered when
bwlimit is 4001 KB/s or larger. The patch subtracts out the elapsed time
since the last call to sleep_for_bwlimit() (but only in a limited way)
and also makes note of any rounding after the sleep() call when it
resets the counter. I also changed the use of 1000 for "K" to 1024 so
that it would more closely match the value reported by the progress
output. Finally, I applied a modified version of the patch that Paul
just reminded us that Debian is using, though I decided to limit the
write size to "bwlimit * 512" rather than "bwlimit * 100" (at least for
now, but feel free to argue that a different value is better).

Comments? Is this overkill? Does it have flaws? In my limited testing
this made the bwlimit more accurate.

..wayne..
Paul Slootman
2004-05-26 07:42:55 UTC
Post by Wayne Davison
output. Finally, I applied a modified version of the patch that Paul
just reminded us that Debian is using, though I decided to limit the
write size to "bwlimit * 512" rather than "bwlimit * 100" (at least for
now, but feel free to argue that a different value is better).
What is a typical value for "len-total"? If it's typically less than a
couple of k, then bwlimit * 512 is a bit big, meaning that the patch
there will do mostly nothing...

I think the 100 was chosen because the point of this patch was to
prevent bursts of writing, and then waiting for buffers to drain. If you
have a 512 kbit link, you typically don't want writes of more than
5 kbytes if you also want to use the link interactively; it takes about
0.1 s for 5 kB to go over the line (ignoring overhead). So if I start a
transfer with --bwlimit=40, I'd expect to still be able to use an
interactive ssh session over the same line without big delays. 40*512
means writes of 20 kB, meaning my keystrokes could take almost half a
second to go out if one of these writes has just been done. With 40*100,
writes are 4 kB, still much more than the typical MTU, and the max delay
should be less than one tenth of a second.

So my vote is still for 100.

Paul Slootman
Wayne Davison
2004-05-26 10:09:59 UTC
Post by Paul Slootman
What is a typical value for "len-total"?
The most typical value is "4". Once a file starts to transfer, values
are typically several K (I've seen up to 32K, but only on larger files).

I had been thinking that a better algorithm for setting the max value
correctly for both large and small values of bwlimit would be good.
Maybe something like this:

--- io.c 15 May 2004 19:31:10 -0000 1.121
+++ io.c 26 May 2004 09:48:27 -0000
@@ -47,6 +47,7 @@ static time_t last_io;
 static int no_flush;
 
 extern int bwlimit;
+extern size_t bwlimit_writemax;
 extern int verbose;
 extern int io_timeout;
 extern int am_server;
@@ -812,6 +813,8 @@ static void writefd_unbuffered(int fd,ch
 		if (FD_ISSET(fd, &w_fds)) {
 			int ret;
 			size_t n = len-total;
+			if (bwlimit && n > bwlimit_writemax)
+				n = bwlimit_writemax;
 			ret = write(fd,buf+total,n);
 
 			if (ret < 0) {
--- options.c 24 May 2004 18:38:05 -0000 1.152
+++ options.c 26 May 2004 09:48:27 -0000
@@ -83,6 +83,7 @@ int safe_symlinks = 0;
 int copy_unsafe_links = 0;
 int size_only = 0;
 int bwlimit = 0;
+size_t bwlimit_writemax = 0;
 int delete_after = 0;
 int only_existing = 0;
 int opt_ignore_existing = 0;
@@ -728,6 +729,12 @@ int parse_arguments(int *argc, const cha
 	if (do_progress && !verbose)
 		verbose = 1;
 
+	if (bwlimit) {
+		bwlimit_writemax = (size_t)bwlimit * 128;
+		if (bwlimit_writemax < 512)
+			bwlimit_writemax = 512;
+	}
+
 	if (files_from) {
 		char *colon;
 		if (*argc != 2 && !(am_server && am_sender && *argc == 1)) {

This makes the calculation more like the original for larger bwlimit
values, but tries to avoid making really small TCP packets for smaller
bwlimit values (which was a concern someone expressed a while back).
The above logic for setting bwlimit_writemax should probably be tuned
with actual testing (it's currently just an arbitrary "this looks good"
choice).

..wayne..
