- Spring Boot `@Scheduled` polling (for illustration only)
- `WebClient` for non-blocking HTTP
- Resilience4j `RateLimiter` for token-bucket control
- Reactor `Mono` for reactive chaining
Why avoid hitting rate limits?
Rate-limited APIs (like Phrase’s) reject requests once you exceed a quota; for logged-in users at Phrase, that quota is 6000 requests per minute. Hitting the limit can cause:
- HTTP `429 Too Many Requests` responses
- Retries that worsen load (retry storms)
- Degraded service or, very rarely, temporary API bans
How to recover from hitting the limit
Even with controls in place, you might occasionally overshoot. You should:
- Detect `429` responses
- Retry once with jitter (a random delay)
- Suppress stack traces for expected rate-limit errors
- Never block threads
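
The detection and jitter steps above can be sketched in plain Java. This is only an illustration: the class name `JitterDelay`, the method names, and the 500 ms base delay with up to 250 ms of jitter are assumptions, not values taken from the article's code.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

class JitterDelay {
    /** Only a 429 status is treated as a retryable rate-limit signal. */
    static boolean isRateLimited(int httpStatus) {
        return httpStatus == 429;
    }

    /** Returns base plus a random extra in [0, maxJitter], so clients do not retry in lockstep. */
    static Duration withJitter(Duration base, Duration maxJitter) {
        long extraMillis = ThreadLocalRandom.current().nextLong(maxJitter.toMillis() + 1);
        return base.plusMillis(extraMillis);
    }

    public static void main(String[] args) {
        // Assumed values: 500 ms base delay, up to 250 ms of random jitter.
        if (isRateLimited(429)) {
            Duration delay = withJitter(Duration.ofMillis(500), Duration.ofMillis(250));
            System.out.println("retry once after " + delay.toMillis() + " ms");
        }
    }
}
```

Because every client draws a different delay, simultaneous failures do not turn into simultaneous retries.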
Our Example
This component polls the Phrase API periodically, respecting the rate limit and logging project names reactively. Listing project names is only an example; it was used while writing this article to verify that the code works as intended.
⚙️ WebClient + Resilience4j Setup
- The limiter allows `requestsPerMinute` API calls per minute (configurable).
- It fails immediately if the quota is exhausted (no queueing or waiting).
- No new threads are created; everything stays non-blocking.
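
To make the fail-fast behavior concrete, here is a plain-JDK sketch of a limiter that hands out a fixed number of permits per period and rejects immediately once they are used up. It is not Resilience4j's implementation; the class name `FailFastLimiter`, its constructor parameters, and the injectable clock are assumptions made for illustration.

```java
import java.util.function.LongSupplier;

/** Fail-fast limiter: a fixed number of permits per refresh period, never waits for a permit. */
class FailFastLimiter {
    private final int permitsPerPeriod;
    private final long periodNanos;
    private final LongSupplier clock;   // injectable so the behavior can be tested without sleeping
    private long periodStart;
    private int remaining;

    FailFastLimiter(int permitsPerPeriod, long periodNanos, LongSupplier clock) {
        this.permitsPerPeriod = permitsPerPeriod;
        this.periodNanos = periodNanos;
        this.clock = clock;
        this.periodStart = clock.getAsLong();
        this.remaining = permitsPerPeriod;
    }

    /** Returns true if a permit was available, false immediately otherwise (no queueing). */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - periodStart >= periodNanos) {
            // A new period has begun: refill the permits.
            periodStart = now;
            remaining = permitsPerPeriod;
        }
        if (remaining == 0) {
            return false;   // quota exhausted: reject instead of waiting
        }
        remaining--;
        return true;
    }
}
```

With `new FailFastLimiter(6000, 60_000_000_000L, System::nanoTime)`, at most 6000 calls per minute would be admitted, and every further `tryAcquire()` in that minute returns `false` straight away.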
The API call method
- Applies the rate limiter reactively using `transformDeferred(...)`.
- Uses `retryWhen(...)` to retry only on `429` errors.
- Adds jitter to avoid retry storms.
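
The retry policy described above (one retry, only for `429`, after a jittered delay) can be illustrated without Reactor. The sketch below uses the JDK's `CompletableFuture` instead of `Mono` and `retryWhen(...)`, so it shows the policy rather than the article's actual method; `RetryOnce429`, `HttpStatusException`, and the delay parameters are hypothetical names introduced here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class RetryOnce429 {
    /** Hypothetical stand-in for an HTTP error that carries a status code. */
    static class HttpStatusException extends RuntimeException {
        final int status;
        HttpStatusException(int status) {
            super("HTTP " + status);
            this.status = status;
        }
    }

    /** Runs the call; if it fails with 429, schedules exactly one retry after a jittered delay. */
    static <T> CompletableFuture<T> withSingleRetry(
            Supplier<CompletableFuture<T>> call, long baseDelayMs, long maxJitterMs) {
        return call.get()
            .<CompletableFuture<T>>handle((value, error) -> {
                if (error == null) {
                    return CompletableFuture.completedFuture(value);
                }
                Throwable cause = unwrap(error);
                if (!isRateLimited(cause)) {
                    return CompletableFuture.failedFuture(cause); // other errors are not retried
                }
                long delayMs = baseDelayMs + ThreadLocalRandom.current().nextLong(maxJitterMs + 1);
                // The delayed executor schedules the retry; the calling thread does not sleep.
                Executor later = CompletableFuture.delayedExecutor(delayMs, TimeUnit.MILLISECONDS);
                return CompletableFuture.supplyAsync(call, later).thenCompose(f -> f);
            })
            .thenCompose(f -> f);
    }

    private static boolean isRateLimited(Throwable t) {
        return t instanceof HttpStatusException && ((HttpStatusException) t).status == 429;
    }

    private static Throwable unwrap(Throwable error) {
        return (error instanceof CompletionException && error.getCause() != null)
            ? error.getCause() : error;
    }
}
```

If the retried call fails as well, the returned future completes exceptionally and the caller decides how to log it; no second retry is attempted.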
JSON → Project name extraction
Extracts each `"name"` from the `"content"` array in the JSON response, purely to keep the logged output readable.
Polling logic with logging & error handling
- The poller runs every 100 milliseconds by default (configurable via `phrase.poll.delay-ms`). While this is safely within the defined rate limit, it’s primarily for demonstration. Real-world applications will typically trigger API requests based on actual events, workflows, or user actions, not by polling a static endpoint in a tight loop.
- Subscribes to the `Mono<List<String>>` returned by `listProjects()`.
- Handles:
  - `RequestNotPermitted`: client-side rate limit exceeded (token bucket empty)
  - Other exceptions (e.g. HTTP errors)
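
A quick calculation shows why the default delay stays within the quota. The sketch below assumes one request per poll and uses the 6000 requests per minute figure quoted earlier; `PollBudget` and `maxCallsPerMinute` are names made up for this example.

```java
class PollBudget {
    /** Upper bound on calls per minute when at most one call starts every delayMs milliseconds. */
    static long maxCallsPerMinute(long delayMs) {
        return 60_000L / delayMs;
    }

    public static void main(String[] args) {
        long calls = maxCallsPerMinute(100);   // default poll delay from the article
        long limit = 6_000;                    // quota for logged-in users, per the article
        System.out.println(calls + " calls/min uses " + (100 * calls / limit) + "% of the quota");
    }
}
```

A 100 ms delay yields at most 600 calls per minute, about 10% of the quota. If the delay is measured from the end of one poll to the start of the next, the real rate is lower still, because each request's duration adds to the interval.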
What happens when the limit is hit?
There are two possible failure scenarios:
1. Client-side limit exceeded
- The `RateLimiter` detects that no tokens are left.
- It immediately fails with `RequestNotPermitted`.
- The `subscribe()` block catches it and logs a warning.
2. Server returns HTTP 429
- The server says “too many requests” via a `429 Too Many Requests` response.
- The `.retryWhen(...)` block triggers a single retry after a jittered delay.
- If it fails again, the error is logged as usual.
Summary
| Concern | This Example Handles It With |
|---|---|
| Avoiding rate limit | `RateLimiterOperator` with RPM config |
| Failing fast on quota hit | `.timeoutDuration(Duration.ofMillis(0))` |
| Retrying 429s | `.retryWhen(...).jitter(...).filter(...)` |
| Logging gracefully | `subscribe(…, error -> log.warn(…))` |
| Staying non-blocking | Fully reactive: `WebClient` + `Mono` + Operator |
Final Result
A minimal but robust setup for:
- Scheduled polling
- Reactive rate limiting
- Retry and error handling
- Clean logs and no blocking
WebClient, and you’re good to go.