Inter-service Rate Limiting
Overview
When interacting with external APIs or services, adhering to the rate limits imposed by the service provider is crucial. Aperture can model these external rate limits and enforce them at the calling services of a distributed application. This approach not only prevents penalties for exceeding the limits, but also lets Aperture prioritize access to the external API so that essential workloads receive a fair share of the API quota.
Configuration
This policy is based on the Quota Scheduler blueprint. It uses a quota scheduler to regulate the control point some-external-api, which represents all outgoing requests made to an external API from services within a distributed application. The rate limit is applied per api_key label, with an interval of 1 second: the token bucket is replenished with 25 tokens every second, up to a maximum of 500 tokens. Lazy sync is configured with num_sync set to 4, which lets the rate limit counters on each Agent sync four times every interval (1 second) when it is turned on; in the values shown here it is disabled (enabled: false).
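The token bucket behavior described above can be sketched in a few lines of Python. This is an illustration only, not Aperture's implementation: the class names TokenBucket and PerKeyLimiter are hypothetical, the bucket is assumed to start full, and refills are assumed to happen in whole-interval steps.

```python
class TokenBucket:
    """Token bucket refilled once per interval (illustrative sketch)."""

    def __init__(self, now, capacity=500.0, fill_amount=25.0, interval=1.0):
        self.capacity = capacity        # bucket_capacity
        self.fill_amount = fill_amount  # tokens added per interval
        self.interval = interval        # seconds
        self.tokens = capacity          # assumption: bucket starts full
        self.last_fill = now

    def try_take(self, now, n=1.0):
        # Credit every whole interval that has elapsed since the last fill.
        elapsed = int((now - self.last_fill) // self.interval)
        if elapsed > 0:
            refill = elapsed * self.fill_amount
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last_fill += elapsed * self.interval
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


class PerKeyLimiter:
    """Keeps one independent bucket per label value (here, per api_key)."""

    def __init__(self):
        self.buckets = {}

    def allow(self, api_key, now):
        bucket = self.buckets.get(api_key)
        if bucket is None:
            bucket = TokenBucket(now)
            self.buckets[api_key] = bucket
        return bucket.try_take(now)


if __name__ == "__main__":
    limiter = PerKeyLimiter()
    burst = sum(limiter.allow("key-a", 0.0) for _ in range(600))
    later = sum(limiter.allow("key-a", 1.0) for _ in range(100))
    print(burst, later)
```

Under these assumptions, a single api_key can burst up to the bucket capacity and is then held to the fill amount per interval, while other keys draw from their own buckets.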
The WFQ Scheduler prioritizes interactive requests with a priority of 200 and background requests with a priority of 50, ensuring that interactive calls receive roughly four times the quota share of background requests.
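The four-to-one ratio follows from the priorities. As a rough sketch, assuming both workloads have more demand than the quota allows and that the scheduler divides tokens in proportion to priority (the function quota_shares is hypothetical and only does the arithmetic; it is not the scheduler itself):

```python
def quota_shares(priorities, tokens_per_interval):
    """Split the tokens of one interval in proportion to workload priority."""
    total = sum(priorities.values())
    return {
        name: tokens_per_interval * priority / total
        for name, priority in priorities.items()
    }


if __name__ == "__main__":
    shares = quota_shares(
        {"interactive": 200.0, "background": 50.0},
        tokens_per_interval=25.0,
    )
    print(shares)  # {'interactive': 20.0, 'background': 5.0}
```

With 25 tokens per second, this works out to about 20 interactive and 5 background requests per second under sustained contention. If one workload has less demand than its share, the other can use the remainder, so the observed ratio is only approximate.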
The values.yaml file below can be generated by following the steps in the Installation section.
- aperturectl values.yaml
# yaml-language-server: $schema=../../../../../../blueprints/quota-scheduling/base/gen/definitions.json
# Generated values file for quota-scheduling/base blueprint
# Documentation/Reference for objects and parameters can be found at:
# https://docs.fluxninja.com/reference/blueprints/quota-scheduling
blueprint: quota-scheduling/base
uri: ../../../../../../../blueprints
policy:
  policy_name: inter-service-rate-limiting
  quota_scheduler:
    alerter:
      alert_name: "More than 90% of requests are being rate limited"
    bucket_capacity: 500
    fill_amount: 25
    selectors:
      - control_point: some-external-api
    rate_limiter:
      limit_by_label_key: api_key
      interval: 1s
      lazy_sync:
        enabled: false
        num_sync: 4
    scheduler:
      workloads:
        - label_matcher:
            match_labels:
              call_type: background
          parameters:
            priority: 50.0
        - label_matcher:
            match_labels:
              call_type: interactive
          parameters:
            priority: 200.0
Generated Policy
apiVersion: fluxninja.com/v1alpha1
kind: Policy
metadata:
  labels:
    fluxninja.com/validate: "true"
  name: inter-service-rate-limiting
spec:
  circuit:
    components:
      - flow_control:
          quota_scheduler:
            in_ports:
              bucket_capacity:
                constant_signal:
                  value: 500
              fill_amount:
                constant_signal:
                  value: 25
            out_ports:
              accept_percentage:
                signal_name: ACCEPT_PERCENTAGE
            rate_limiter:
              interval: 1s
              lazy_sync:
                enabled: false
                num_sync: 4
              limit_by_label_key: api_key
            scheduler:
              workloads:
                - label_matcher:
                    match_labels:
                      call_type: background
                  parameters:
                    priority: 50
                - label_matcher:
                    match_labels:
                      call_type: interactive
                  parameters:
                    priority: 200
            selectors:
              - control_point: some-external-api
      - decider:
          in_ports:
            lhs:
              signal_name: ACCEPT_PERCENTAGE
            rhs:
              constant_signal:
                value: 90
          operator: gte
          out_ports:
            output:
              signal_name: ACCEPT_PERCENTAGE_ALERT
      - alerter:
          in_ports:
            signal:
              signal_name: ACCEPT_PERCENTAGE_ALERT
          parameters:
            alert_name: More than 90% of requests are being rate limited
    evaluation_interval: 1s
  resources:
    flow_control:
      classifiers: []
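In the generated policy, the quota scheduler emits the ACCEPT_PERCENTAGE signal, the decider compares it against the constant 90 using the gte operator, and the alerter consumes the resulting ACCEPT_PERCENTAGE_ALERT signal. The comparison step can be sketched as follows. This is a simplified illustration, not Aperture's code: the function decider and the OPERATORS table are hypothetical, and the sketch assumes the decider's output is 1.0 when the comparison holds and 0.0 otherwise.

```python
import operator

# Maps the policy's `operator` string to a comparison function.
# Only the operator used in this policy is listed (hypothetical table).
OPERATORS = {"gte": operator.ge}


def decider(lhs, rhs, op):
    """Return 1.0 when `lhs <op> rhs` holds, else 0.0 (assumed encoding)."""
    return 1.0 if OPERATORS[op](lhs, rhs) else 0.0


if __name__ == "__main__":
    # Sample ACCEPT_PERCENTAGE readings, one per evaluation interval.
    for accept_percentage in (42.0, 90.0, 97.5):
        alert_signal = decider(accept_percentage, 90.0, "gte")
        print(accept_percentage, alert_signal)
```

Each evaluation_interval (1s here), the circuit re-evaluates the signals, so the alert signal tracks the latest acceptance percentage.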
Installation
Generate a values file specific to the policy using the command below.
aperturectl blueprints values --name=quota-scheduling/base --version=v2.33.1 --output-file=values.yaml
Apply the policy using the aperturectl CLI or kubectl.
- aperturectl (Aperture Cloud)
aperturectl cloud blueprints apply --values-file=values.yaml
- aperturectl (self-hosted controller)
Pass the --kube flag with aperturectl to directly apply the generated policy on a Kubernetes cluster in the namespace where the Aperture Controller is installed.
aperturectl blueprints generate --values-file=values.yaml --output-dir=policy-gen
aperturectl apply policy --file=policy-gen/policies/inter-service-rate-limiting.yaml --kube
- kubectl (self-hosted controller)
Apply the generated policy YAML (Kubernetes Custom Resource) with kubectl.
aperturectl blueprints generate --values-file=values.yaml --output-dir=policy-gen
kubectl apply -f policy-gen/policies/inter-service-rate-limiting-cr.yaml -n aperture-controller
Policy in Action
The quota scheduler successfully ensures that the request rate remains within the specified limits, as seen below, with a steady state of 25 requests per second. Additionally, the workload decisions panel shows that interactive requests receive approximately four times the acceptance rate compared to background requests.
Circuit Diagram for this policy.