Traffic Shaping with the Token Bucket Algorithm
The Token Bucket algorithm is a mechanism to rate-limit the average traffic in a stream, while still allowing for a certain amount of burstiness. (A closely related variation, the Leaking Bucket, is described below.)
The Basic Algorithm
The basic idea is quite simple, really. First, define a bucket size; this will control the maximum burst size that is allowed to pass. Then define a flow of tokens that are added to the bucket at a constant rate; this rate determines the average flow that is allowed. (If the bucket is full, additional tokens are dropped.)
When a network packet arrives on the stream, the corresponding number of tokens is removed from the bucket, and the packet is allowed to proceed. If the bucket does not contain a sufficient number of tokens, the packet is either queued or dropped (or even passed, but marked as “invalid” in some way).
The “tokens” need not be discrete; in fact, they do not need to be added explicitly to the bucket at all. Instead, one can remember the current bucket content together with the timestamp of the last packet arrival. When the next packet arrives, one multiplies the time since the last arrival by the token flow rate and adds the result to the remembered bucket content, capping the sum at the bucket size.
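A minimal sketch of this timestamp-based bookkeeping, written here in Python purely for illustration, might look as follows; the class and method names, and the choice to simply reject (rather than queue or mark) a packet when tokens are insufficient, are assumptions, not part of the algorithm itself.

    import time

    class TokenBucket:
        """Token bucket that refills lazily, based on elapsed time."""

        def __init__(self, rate, capacity):
            self.rate = rate            # tokens added per second: the allowed average flow
            self.capacity = capacity    # bucket size: the maximum allowed burst
            self.tokens = capacity      # start with a full bucket (an assumption)
            self.last = time.monotonic()

        def allow(self, cost=1):
            """Return True if a packet costing the given number of tokens may pass."""
            now = time.monotonic()
            # Refill: elapsed time times the token flow rate, capped at the bucket size.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= cost:
                self.tokens -= cost     # remove tokens; the packet proceeds
                return True
            return False                # insufficient tokens: drop (or queue, or mark)

In this sketch, a limiter allowing 100 packets per second on average, with bursts of up to 20 packets, would be created as TokenBucket(rate=100, capacity=20).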
Variation: The Leaking Bucket
As described so far, the traffic emanating downstream from the bucket may itself be bursty. There is an alternative formulation that makes it easier to maintain a steady outbound flow. In this variation of the algorithm, the messages themselves act as tokens: each arriving message is queued, provided there is still room in the bucket; otherwise it is dropped. Messages are dequeued from the bucket and passed downstream at a constant rate.
This mechanism guarantees a steady rate of messages emanating from the bucket. If the messages are not all of equal size, additional bookkeeping is necessary in order to ensure that the downstream bandwidth is used evenly.
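A comparable sketch of this queue-based variant, again in Python for illustration, is shown below; the class name and the per-message drain are assumptions, and, as noted above, variable message sizes would require additional bookkeeping.

    import collections

    class LeakyBucket:
        """Queue-based bucket that releases messages at a constant rate."""

        def __init__(self, capacity):
            self.capacity = capacity          # bucket size: maximum number of queued messages
            self.queue = collections.deque()

        def arrive(self, message):
            """Queue an arriving message if there is room; otherwise drop it."""
            if len(self.queue) < self.capacity:
                self.queue.append(message)
                return True
            return False                      # bucket full: message is dropped

        def drain(self):
            """Called at a constant rate; releases the next message downstream."""
            if self.queue:
                return self.queue.popleft()
            return None                       # nothing to send this tick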
Statistical Process Control
It is worth noting that a mechanism similar to the algorithm(s) described here is employed in statistical process control, in particular in cumulative-sum (CUSUM) control charts. These charts track the cumulative deviation of a process from an expected value. Under normal conditions, the fluctuations around the expected value cancel out when forming the cumulative sum, but when the process is disturbed, the cumulative sum deviates significantly from zero.
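For comparison, such a cumulative-sum calculation might be sketched as follows; the fixed expected value and the simple symmetric threshold are assumptions made for illustration.

    def cusum(samples, expected, threshold):
        """Track the cumulative deviation of the samples from the expected value.

        Returns the index at which the cumulative sum first exceeds the
        threshold, or None if the process stays in control throughout.
        """
        total = 0.0
        for i, x in enumerate(samples):
            total += x - expected        # deviations cancel out under normal conditions
            if abs(total) > threshold:   # a disturbance makes the sum drift away from zero
                return i
        return None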