random technical thoughts from the Nominet technical team

Flow control with token bucket algorithm

1 Star2 Stars3 Stars4 Stars5 Stars (4 votes, average: 4 out of 5)
Loading ... Loading ...
Posted by jay on May 27th, 2005

We have a number of applications where we have to limit the access of users, otherwise the systems would collapse under the strain of excessive use. For that we rely heavily on the excellent token bucket algorithm. This has three different ways (at least) in which it can be used.

Rate limiting

The most basic use of token bucket is to rate limit a client to a number of operations within a specific time period. For example you could use it to limit the number of whois lookups to 60 per 60 seconds for a particular client. The way to do this is to assign a bucket of 60 tokens to each client when it makes its initial request. Each time it makes a request a token is removed from the bucket. If there are no tokens left then the request is not carried out and so the client is effectively blocked. Every 60 seconds the bucket is refilled to a maximum of 60 tokens and so request can once again be processed.

Rolling window

There are times when you might not be worried about the rate but you do want to limit a client to a maximum number of operations within a rolling time window. For this example lets say you want to limit a client to 10,000 operations per rolling 24 hours and you are going to recalculate every hour. You start by giving each client the maximum 10,000 tokens when it first connects. Once again a token is removed each time a request is serviced and no tokens means no service. Every hour the total number of requests over the last hour is calculated and stored.

The algorithm to refill the bucket is a bit more complex. The total the number of requests received over the last 23 hours (the 23 is the key) is calculated and this total is then subtracted from 10,000, which leaves the maximum remaining requests that could be made over the next hour. So the number of tokens in the bucket is then increased (or decreased) to that maximum.

Just to make sure you get this, assume you want to limit a client to Q queries over a rolling window of W time and you are going to recalculate every P time units. Then every P you refill the bucket to be equal to Q - (Total queries for previous period of (W - P)).

Rolling window and rate limiting

Combining the two is pretty simple if you want to limit a client to a certain number of queries within a rolling window time period plus limiting the maximum rate at which they can carry out operations. Say for example you wanted to allow a maximum of 10,000 operations per rolling 24 hours, with a maximum rate of 1,000 per hour and you are going to recalculate every hour. This time you give each new client a bucket of 1,000 tokens. Once again a token is removed each time a request is serviced and no tokens means no service. Again, every hour the total number of requests over the last hour is calculated and stored.

The algorithm to refill the bucket works like this. Again the total number of requests over the last 23 hours is calculated and again this is substracted from 10,000. This time though the bucket is filled to either this figure if it is less than 1,000 or to 1,000 otherwise. This effectively limits the rate to a maximum of 1,000 per hour.

Of course in the last two examples if someone does exceed their limit then they could be blocked for up to an hour, which may be far too long for your application. The processing requirements obviously increase as you make the recalculation window shorter.

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.

Recent Posts

Highest Rated

Categories

Archives

Meta: