Manual Chapter : About rate limiting API requests

Applies To:

  • 17.1.0, 17.0.0, 16.1.4, 16.1.3, 16.1.2, 16.1.1, 16.1.0

About rate limiting API requests

Rate limiting allows you to enhance system security and protect API servers by limiting the number of API requests that can be made within a given period of time. A request can be as simple as a request for the homepage of a website or a request on a log-in form.
You create rate limiting configurations that specify Request Quota and Spike Arrest limits. Request Quota and Spike Arrest are both counters, but they have different time windows: the Request Quota window is typically much longer than the Spike Arrest window. Spike Arrest prevents a short-term “spike” in the request rate from immediately exhausting all the requests allowed in the longer Request Quota window.
For example, suppose you have a Request Quota of 1000 requests per minute and a Spike Arrest of 50 requests per 10 seconds. If 50 requests are received in 5 seconds, all requests received during the next 5 seconds are sent to the fallback branch, but they still count toward the Request Quota. The 1-minute window continues.
If clients send 100 requests in 10 seconds, the first 50 requests are passed, and the next 50 requests are sent to the fallback branch (but still count toward the quota). After 10 seconds, the available quota has been reduced by 100, to 900.
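The arithmetic in this example can be reproduced with a small simulation. The following Python sketch is illustrative only and is not the product's implementation: it assumes both counters use fixed, back-to-back windows that start at time 0, and the names FixedWindowCounter, RateLimiter, handle, and quota_remaining are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FixedWindowCounter:
    """Counts requests in fixed, back-to-back windows (an assumption)."""
    limit: int
    window: float  # window length in seconds
    start: float = 0.0
    count: int = 0

    def roll(self, now: float) -> None:
        # Begin a fresh window once the current one has expired.
        if now - self.start >= self.window:
            elapsed = int((now - self.start) // self.window)
            self.start += elapsed * self.window
            self.count = 0


class RateLimiter:
    """Hypothetical model of a Request Quota combined with a Spike Arrest."""

    def __init__(self, quota_limit, quota_window, spike_limit, spike_window):
        self.quota = FixedWindowCounter(quota_limit, quota_window)
        self.spike = FixedWindowCounter(spike_limit, spike_window)

    def handle(self, now: float) -> str:
        self.quota.roll(now)
        self.spike.roll(now)
        # Every request counts toward the quota, including requests
        # that end up on the fallback branch.
        self.quota.count += 1
        if self.quota.count > self.quota.limit:
            return "fallback"
        if self.spike.count >= self.spike.limit:
            return "fallback"
        self.spike.count += 1
        return "allow"

    def quota_remaining(self) -> int:
        return max(self.quota.limit - self.quota.count, 0)


# Request Quota: 1000 per 60 seconds. Spike Arrest: 50 per 10 seconds.
limiter = RateLimiter(1000, 60, 50, 10)

# 100 requests spread evenly over 10 seconds (t = 0.0, 0.1, ..., 9.9).
results = [limiter.handle(i / 10) for i in range(100)]

print(results.count("allow"))     # 50
print(results.count("fallback"))  # 50
print(limiter.quota_remaining())  # 900
```

In this model the Spike Arrest counter resets at the 10-second mark, so later requests pass again until either limit is reached, while the Request Quota counter keeps accumulating until its 1-minute window ends.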
Companies that provide APIs often charge for different levels of access to the API. By classifying API requests and using rate limiting, you can direct different classes of users to different quotas. You can use multiple Rate Limiting agents in a per-request policy to impose additional restrictions, as needed.
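As a rough illustration of directing different classes of users to different quotas, the following Python sketch maps a request class to a quota. The class names, limits, and the quota_for function are hypothetical values chosen for the example. They are not product defaults, and this is not how a per-request policy is actually configured.

```python
# Hypothetical access levels: (requests allowed, window in seconds).
TIER_QUOTAS = {
    "free": (100, 60),
    "standard": (1000, 60),
    "premium": (10000, 60),
}


def quota_for(request_class):
    """Return the quota for a request class.

    Unknown classes fall back to the most restrictive tier
    (a design assumption made for this sketch).
    """
    return TIER_QUOTAS.get(request_class, TIER_QUOTAS["free"])


for request_class in ("free", "premium", "unclassified"):
    limit, window = quota_for(request_class)
    print(f"{request_class}: {limit} requests per {window} seconds")
```

Defaulting unclassified requests to the most restrictive quota is one possible choice; a deployment could equally send them to a fallback branch.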