Last April, a token bridge hack sent Ethereum gas prices spiking to 5,000 gwei. Payment processors that bragged about "99.99% uptime" started bouncing transactions with cryptic timeouts. Their infrastructure wasn't built for adversarial conditions. It was built for normal Tuesday afternoons.
Rate limiting isn't sexy. Nobody writes Medium posts titled "Our Rate Limiting Strategy Is Mediocre But It Works." But when the network catches fire, you find out immediately who designed their backpressure model for real chaos and who just ran load tests against their own servers.
Why Normal Rate Limiting Fails On-Chain
Traditional payment systems rate limit to protect databases. A database can handle maybe 10,000 writes per second. Exceed that and you get connection pool exhaustion. So you add a rate limiter. Request comes in, you check a counter, increment it, reject if above threshold. Done.
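That whole design fits in a few lines. A minimal sketch, assuming a single in-memory counter with a fixed one-second window (names are illustrative):

```typescript
// The naive approach: a fixed-window counter. Fine for protecting a database,
// blind to everything happening on-chain.
class CounterRateLimiter {
  private count = 0;
  private windowStart = Date.now();

  constructor(
    private readonly maxPerWindow: number,
    private readonly windowMs: number = 1_000,
  ) {}

  allow(): boolean {
    const now = Date.now();
    if (now - this.windowStart >= this.windowMs) {
      this.count = 0;           // new window, reset the counter
      this.windowStart = now;
    }
    if (this.count >= this.maxPerWindow) return false; // over threshold: reject
    this.count++;
    return true;
  }
}
```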
Blockchain payment infrastructure has to rate limit for three completely different reasons.
First, gas cost spikes. When a single transaction costs $500 instead of $5, your queue of pending payments suddenly represents $10 million of value you can no longer settle economically. You can't just let every request through and optimistically assume gas will drop. You need to shed load intelligently, not randomly.
Second, settlement latency explodes. A normal transaction confirms in 12 seconds on Ethereum. During network congestion, it takes 3 minutes. The mempool backs up. Your queue of pending payments goes from 100 items to 3,000 items in 30 seconds because the settlement side is moving at a fraction of its normal speed but the intake side doesn't know that yet.
Third, RPC endpoints lie or disappear entirely. When gas spikes, public RPC endpoints get hammered. Alchemy rate limits you. Infura times out. You're staring at a request you accepted from a merchant two minutes ago and you have no idea if the transaction was even mined. Your rate limiter doesn't know this yet. It just sees requests coming in.
A simple counter-based rate limiter handles none of these. You need backpressure that flows upstream.
Backpressure and Circuit Breakers For Load Shedding
Backpressure is when a system that's overloaded tells the upstream to slow down instead of accepting everything and failing silently. In payment infrastructure, you need three layers.
At the queue layer, you monitor settlement latency. If transactions are taking 10 times longer to confirm than the rolling 5-minute average, you reduce the acceptance rate by 50%. The next merchant to hit the API gets a rate-limit-remaining header showing they have 50% capacity left. They can queue their payment and retry, or they can back off gracefully. Either way, you're not accepting $10 million of transaction requests that will all time out in 3 minutes.
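A rough sketch of that queue-layer monitor. It assumes a recordConfirmation() hook fires on every settled transaction; the 10x trigger and 50% cut come from the description above, and the header name and percentage format are illustrative:

```typescript
// Queue-layer backpressure: track confirmation latency over a rolling
// 5-minute window and cut intake when the current latency blows past it.
class SettlementLatencyMonitor {
  private samples: { at: number; latencyMs: number }[] = [];

  recordConfirmation(latencyMs: number): void {
    const now = Date.now();
    this.samples.push({ at: now, latencyMs });
    // keep only the rolling 5-minute window
    this.samples = this.samples.filter(s => now - s.at <= 5 * 60_000);
  }

  rollingAverageMs(): number {
    if (this.samples.length === 0) return 0;
    return this.samples.reduce((sum, s) => sum + s.latencyMs, 0) / this.samples.length;
  }
}

function acceptanceRate(currentLatencyMs: number, monitor: SettlementLatencyMonitor): number {
  const avg = monitor.rollingAverageMs();
  if (avg > 0 && currentLatencyMs >= 10 * avg) return 0.5; // 10x slower: shed half the load
  return 1.0;
}

// At the API edge, surface remaining capacity so merchants can back off or queue.
function rateLimitHeaders(rate: number): Record<string, string> {
  return { 'RateLimit-Remaining': `${Math.round(rate * 100)}%` };
}
```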
At the gas price layer, you monitor on-chain conditions. If gas is above a certain threshold, you route payments to lower-congestion chains (Polygon, Arbitrum) or you queue them for off-peak settlement. You don't reject them outright. You tell the caller "your payment is accepted but will settle in 8 minutes instead of 30 seconds due to network conditions." The merchant can decide if that's acceptable or if they want to use a different settlement chain.
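One way the gas-price layer can look in code. The thresholds, chain choices, and ETAs below are illustrative assumptions, not fixed policy:

```typescript
// Gas-price layer: payments are never rejected outright. They're rerouted to a
// lower-congestion chain or deferred, and the caller is told why and for how long.
type SettlementPlan =
  | { route: 'primary'; etaSeconds: number }
  | { route: 'alt-chain'; chain: 'polygon' | 'arbitrum'; etaSeconds: number }
  | { route: 'deferred'; etaSeconds: number; reason: string };

function planSettlement(gasPriceGwei: number, altChainAvailable: boolean): SettlementPlan {
  if (gasPriceGwei <= 150) {
    return { route: 'primary', etaSeconds: 30 };
  }
  if (altChainAvailable) {
    // congestion on the primary chain: route to a lower-congestion chain
    return { route: 'alt-chain', chain: 'polygon', etaSeconds: 60 };
  }
  // no alternative available: accept, but defer to off-peak settlement and say so
  return {
    route: 'deferred',
    etaSeconds: 8 * 60,
    reason: `gas at ${gasPriceGwei} gwei, queuing for off-peak settlement`,
  };
}
```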
At the RPC layer, you run consensus between multiple providers. Your code calls Alchemy, Infura, and QuickNode simultaneously. If two out of three agree on the block height, you use that as truth. If they diverge, you trigger a circuit breaker. All downstream requests get a 503 Service Unavailable and a retry hint. You don't keep accepting payments while your RPC providers are showing you three different versions of the chain.
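A sketch of the consensus check, assuming plain JSON-RPC eth_blockNumber calls and the global fetch available in Node 18+; provider URLs and the tolerance window are placeholders:

```typescript
// Ask one provider for its current block height over JSON-RPC.
async function blockHeight(rpcUrl: string): Promise<number> {
  const res = await fetch(rpcUrl, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ jsonrpc: '2.0', id: 1, method: 'eth_blockNumber', params: [] }),
  });
  const { result } = await res.json();
  return parseInt(result, 16);
}

// Query all providers in parallel; require two of three to agree (within a couple of
// blocks, since healthy providers legitimately lag a little). No quorum means the
// caller should open the circuit and start returning 503s.
async function consensusHeight(rpcUrls: string[], toleranceBlocks = 2): Promise<number | null> {
  const heights = (await Promise.allSettled(rpcUrls.map(blockHeight)))
    .filter((r): r is PromiseFulfilledResult<number> => r.status === 'fulfilled')
    .map(r => r.value);

  for (const h of heights) {
    const agreeing = heights.filter(x => Math.abs(x - h) <= toleranceBlocks);
    if (agreeing.length >= 2) return Math.max(...agreeing);
  }
  return null;
}
```

Taking the highest of the agreeing heights, rather than trusting whichever provider answered first, keeps a single lagging node from dragging your view of the chain backwards.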
Circuit breakers are critical. When you detect an anomaly (RPC divergence, gas spike above 3x normal, queue backlog above 5,000 items), you don't gradually degrade. You open the circuit. New requests get a clear, fast error message. Existing payments stay queued. The system tries to auto-recover by checking RPC consensus or waiting for gas to normalize. When RPC providers agree again or gas drops back to normal, the circuit closes and you resume accepting new requests.
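A minimal circuit breaker around the intake path might look like the sketch below. The healthCheck callback (RPC consensus restored, gas back under threshold, queue drained) and the retry window are assumptions the operator supplies:

```typescript
type CircuitState = 'closed' | 'open';

class PaymentCircuitBreaker {
  private state: CircuitState = 'closed';
  private openedAt = 0;

  constructor(
    private readonly healthCheck: () => Promise<boolean>, // RPC consensus ok, gas normal, etc.
    private readonly retryAfterMs = 30_000,
  ) {}

  // Called by the anomaly detectors: RPC divergence, gas above 3x normal, queue > 5,000.
  trip(reason: string): void {
    this.state = 'open';
    this.openedAt = Date.now();
    console.warn(`circuit opened: ${reason}`); // existing queued payments stay where they are
  }

  // Called on every new request. Fails fast while open; once the retry window has
  // elapsed, a successful health probe closes the circuit and intake resumes.
  async allowNewRequest(): Promise<{ ok: true } | { ok: false; status: 503; retryAfterMs: number }> {
    if (this.state === 'closed') return { ok: true };
    if (Date.now() - this.openedAt >= this.retryAfterMs && (await this.healthCheck())) {
      this.state = 'closed';
      return { ok: true };
    }
    return { ok: false, status: 503, retryAfterMs: this.retryAfterMs };
  }
}
```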
Most systems don't have circuit breakers. They have timeouts. Timeouts set so long that merchants wait 5 minutes for a 503 error. Or timeouts set so short that retries stack on top of an already slow endpoint and turn one degraded dependency into a cascading failure.
A proper circuit breaker rejects new requests immediately (return a 503 in 50ms) while the system tries to fix itself. That costs you some transaction volume during the incident. But you're not turning an incident into a catastrophe.
For on-chain payments, build a stateful rate limiter aware of downstream health, not just request count. Run a health monitor that samples the blockchain every 10 seconds. Measure gas price, confirmation latency, and RPC consensus. Under normal conditions (100 gwei, 30s confirmation), accept 100% of requests. Under moderate stress (200 gwei, 60s confirmation), accept 70% and queue the rest. Under severe stress (500 gwei, 3min confirmation), accept 30%. Use token bucket algorithms to give fair access by tier. Bound your queue at 5,000 items; return 429 Too Many Requests with Retry-After beyond that threshold.
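Pulling those pieces together, here is one sketch of a health-aware limiter. The stress tiers and the 5,000-item queue bound mirror the numbers above; the token-bucket scaling and tier names are illustrative choices, not the only way to implement the acceptance fractions:

```typescript
type ChainHealth = { gasGwei: number; confirmationSeconds: number };

// Map sampled chain health to an acceptance fraction, per the tiers above.
function acceptanceFraction(h: ChainHealth): number {
  if (h.gasGwei >= 500 || h.confirmationSeconds >= 180) return 0.3; // severe stress
  if (h.gasGwei >= 200 || h.confirmationSeconds >= 60) return 0.7;  // moderate stress
  return 1.0;                                                        // normal conditions
}

class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();
  constructor(private readonly capacity: number, private ratePerSec: number) {
    this.tokens = capacity;
  }
  setRate(ratePerSec: number): void { this.ratePerSec = ratePerSec; }
  take(): boolean {
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.lastRefill) / 1000) * this.ratePerSec,
    );
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

class HealthAwareLimiter {
  private currentRate: number;
  private readonly queue: string[] = [];               // pending payment ids
  private readonly buckets = new Map<string, TokenBucket>(); // one bucket per tier

  constructor(private readonly baseRatePerSec: number, private readonly maxQueue = 5_000) {
    this.currentRate = baseRatePerSec;
  }

  // Re-run every ~10 seconds from the health monitor loop.
  updateHealth(h: ChainHealth): void {
    this.currentRate = this.baseRatePerSec * acceptanceFraction(h);
    for (const bucket of this.buckets.values()) bucket.setRate(this.currentRate);
  }

  submit(tier: string, paymentId: string): { status: 202 | 429; retryAfterSeconds?: number } {
    if (this.queue.length >= this.maxQueue) {
      return { status: 429, retryAfterSeconds: 60 }; // bounded queue: shed, don't buffer forever
    }
    let bucket = this.buckets.get(tier);
    if (!bucket) {
      bucket = new TokenBucket(this.baseRatePerSec, this.currentRate);
      this.buckets.set(tier, bucket);
    }
    if (!bucket.take()) {
      return { status: 429, retryAfterSeconds: 10 }; // over this tier's rate right now
    }
    this.queue.push(paymentId);
    return { status: 202 };
  }
}
```

Rather than literally queuing the rejected fraction, this sketch scales each tier's token refill rate by the acceptance fraction and asks callers to retry; the bounded queue and the 429 with Retry-After behave as described above.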
The Honest Conversation
Every vendor claims "99.99% uptime" and "enterprise-grade availability." This is a meaningless metric designed for sales presentations. What matters is what happens when the chain melts down.
Can you shed load gracefully? Does your backpressure model flow upstream? Do you have circuit breakers or just timeouts? Can you distinguish between "network is slow" and "RPC provider is lying"? Do you route payments to alternative chains when your primary chain is congested? Can you explain to a customer why their payment took 8 minutes instead of 30 seconds?
Most platforms can't answer these questions. They have rate limiters that protect their database but not their merchants' money. They have timeouts instead of circuit breakers. They have a single RPC endpoint with no fallback.
When gas spiked last April, the platforms that survived were the ones that had built for adversarial conditions from day one. Not because they predicted that specific attack, but because they understood that blockchains are public, decentralized, and subject to conditions you can't control. You can only control how gracefully you degrade when those conditions arrive.
Rate limiting is about accepting reality. The chain will melt down. Your job is to shed load intelligently, tell the upstream what happened, and recover without losing money.
That's engineering. Everything else is optimism.