Rate Limiting

Rate limiting rules and policies to prevent abuse and ensure fair usage of PayPerScrape API endpoints.

Overview

PayPerScrape implements rate limiting to:

  • Prevent abuse and cost attacks
  • Ensure fair usage across all users
  • Protect against automated spam
  • Maintain service stability

Rate limits are enforced using a sliding window algorithm, which provides smooth rate limiting without the "cliff effect" of fixed-window limits.

Note: Rate limits are configurable via environment variables. The limits shown below are the default production values.

Rate Limit Rules

/api/scrape

Main scraping endpoint (paid via x402)

Limits

Type           Limit                     Window
IP-based       120 requests per minute   60 seconds
Wallet-based   60 requests per minute    60 seconds

Algorithm

Sliding window

Notes

Both limits apply simultaneously. The stricter limit (wallet) takes precedence for paid requests. Additional domain-specific throttling may apply based on domain cost classification.

/api/feedback/submit

Programmatic feedback submission (free)

Limits

Type           Limit                 Window
IP-based       5 requests per hour   3600 seconds
Wallet-based   3 requests per hour   3600 seconds

Algorithm

Sliding window

Notes

Stricter limits apply to this endpoint to prevent spam. Feedback is only accepted for domains that were scraped within the last 24 hours.

/api/feedback/web

Web form feedback submission (free)

Limits

Type       Limit                 Window
IP-based   3 requests per hour   3600 seconds

Algorithm

Sliding window

Notes

A CAPTCHA is required. Feedback is only accepted for domains that were scraped within the last 7 days.

Sliding Window Algorithm

PayPerScrape uses a sliding window rate limiting algorithm instead of fixed-window limits. This provides several advantages:

  • No Cliff Effect: Fixed-window limits allow bursts at window boundaries (e.g., 20 requests at 0:59 and 20 more at 1:00). Sliding window prevents this.
  • Smooth Rate Limiting: Requests are tracked individually with timestamps, providing accurate rate limiting.
  • Fair Distribution: Rate limits are distributed evenly across the time window.

How It Works:

  1. Each request is stored with its timestamp in a Redis sorted set
  2. When checking rate limits, old requests outside the window are removed
  3. The count of remaining requests is compared against the limit
  4. If under the limit, the request is allowed and its timestamp is added
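Below is a minimal sketch of this check in TypeScript, assuming an ioredis client; the key naming scheme and the allowRequest helper are hypothetical, not PayPerScrape's actual implementation.

```typescript
import Redis from "ioredis";

const redis = new Redis(); // connection settings are an assumption

// Sliding-window check: returns true if the request is allowed.
// The key layout (e.g. "rl:ip:1.2.3.4") is hypothetical.
async function allowRequest(
  key: string,
  limit: number,
  windowMs: number
): Promise<boolean> {
  const now = Date.now();

  // 1. Remove requests that have fallen outside the window.
  await redis.zremrangebyscore(key, 0, now - windowMs);

  // 2. Count what remains inside the window.
  const count = await redis.zcard(key);
  if (count >= limit) return false;

  // 3. Under the limit: record this request's timestamp and allow it.
  //    A unique member avoids collisions when two requests share a millisecond.
  await redis.zadd(key, now, `${now}:${Math.random()}`);
  await redis.pexpire(key, windowMs); // let idle keys expire on their own
  return true;
}

// Example: the /api/scrape IP limit of 120 requests per 60 seconds.
// const allowed = await allowRequest(`rl:ip:${clientIp}`, 120, 60_000);
```

In production the remove/count/add steps would typically run atomically (a MULTI/EXEC transaction or a Lua script) so concurrent requests cannot slip past the limit; the sketch keeps them separate for readability.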

Rate Limit Headers

When rate limits are checked, the following headers are included in responses:

Header                  Description
X-RateLimit-Limit       Maximum number of requests allowed in the window
X-RateLimit-Remaining   Number of requests remaining in the current window
X-RateLimit-Reset       Unix timestamp when the rate limit resets
Retry-After             Seconds to wait before retrying (only present when the rate limit is exceeded)
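A hedged TypeScript sketch of reading these headers on the client follows; the endpoint URL and request payload are placeholders, not documented values.

```typescript
// Placeholder endpoint; substitute the real PayPerScrape base URL.
const API_URL = "https://api.payperscrape.example/api/scrape";

async function checkUsage(): Promise<void> {
  const res = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: "https://example.com" }), // payload shape is an assumption
  });

  const limit = res.headers.get("X-RateLimit-Limit");
  const remaining = res.headers.get("X-RateLimit-Remaining");
  const reset = res.headers.get("X-RateLimit-Reset"); // Unix timestamp

  // Warn before hitting the limit rather than after.
  if (remaining !== null && Number(remaining) < 5) {
    console.warn(`Only ${remaining}/${limit} requests left; window resets at ${reset}`);
  }
}
```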

Rate Limit Exceeded

Response (429)

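A representative 429 response body is sketched below. The exact field names are illustrative assumptions; the retry interval mirrors the Retry-After header described above.

```json
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please retry after 42 seconds.",
  "retryAfter": 42
}
```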

When a rate limit is exceeded:

  • HTTP status code 429 Too Many Requests is returned
  • The Retry-After header indicates seconds to wait
  • Rate limit headers show current status and reset time
  • Wait for the specified time before retrying

Domain Cost Throttling

In addition to standard rate limits, PayPerScrape implements domain cost throttling to protect against expensive scraping operations and ensure fair resource usage.

Domain Classification Tiers:

  • Tier 1 (Low Cost): Domains with low average processing cost. Full speed allowed, 1 retry permitted.
  • Tier 2 (Moderate Cost): Domains with moderate average processing cost. Limited to 20 requests/min per wallet, 1 retry, burst protection enabled.
  • Tier 3 (High Cost): Domains with high average processing cost or recent expensive requests. Limited to 5 requests/min per wallet, no retries, strict burst protection.
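The documented limits above could be summarized as a policy table; the structure below is a hypothetical TypeScript encoding of them (the cost thresholds that assign a domain to a tier are not published here).

```typescript
type TierPolicy = {
  perWalletPerMinute: number | null; // null = full speed
  retries: number;
  burstProtection: "none" | "standard" | "strict";
};

// Values taken from the tier descriptions above; the structure is illustrative.
const DOMAIN_TIERS: Record<1 | 2 | 3, TierPolicy> = {
  1: { perWalletPerMinute: null, retries: 1, burstProtection: "none" },
  2: { perWalletPerMinute: 20, retries: 1, burstProtection: "standard" },
  3: { perWalletPerMinute: 5, retries: 0, burstProtection: "strict" },
};
```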

Additional Protections:

  • User-Domain Hourly Limit: Maximum usage per hour per wallet per domain to prevent abuse
  • Domain Auto-Blocking: Domains with excessive processing costs are automatically blocked for 6 hours to maintain service stability
  • Burst Protection: Domain-wide usage limits prevent sudden cost spikes and ensure fair resource distribution

Failure Streak Protection

PayPerScrape includes a Failure Streak Brake System to protect users from infinite loops and billing issues when scrapes repeatedly return hard failures.

How It Works:

  • Hard Failures Tracked: 403 (Forbidden), 429 (Rate Limited), and 5xx (Server Errors) are counted as hard failures
  • Per Wallet+Domain: Failure streaks are tracked independently for each wallet and domain combination
  • Automatic Cooldown: After 3 consecutive hard failures within 5 minutes, requests are blocked for 5 minutes
  • Auto-Reset: Streak resets automatically on first successful scrape
  • No Charging: Hard failures do not result in charges, preventing billing for failed requests
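A sketch of how such a brake could be tracked follows, assuming Redis; the key names are hypothetical, while the 3-strike threshold, 5-minute window, and 5-minute cooldown come from the list above.

```typescript
import Redis from "ioredis";

const redis = new Redis();

const STREAK_LIMIT = 3;        // consecutive hard failures
const WINDOW_SECONDS = 5 * 60; // streak window and cooldown length

const isHardFailure = (status: number) =>
  status === 403 || status === 429 || status >= 500;

async function recordResult(wallet: string, domain: string, status: number) {
  const streakKey = `streak:${wallet}:${domain}`; // hypothetical key layout
  if (!isHardFailure(status)) {
    await redis.del(streakKey); // auto-reset on the first successful scrape
    return;
  }
  // Count consecutive hard failures within the 5-minute window.
  const streak = await redis.incr(streakKey);
  if (streak === 1) await redis.expire(streakKey, WINDOW_SECONDS);
  if (streak >= STREAK_LIMIT) {
    // Cooldown: block this wallet+domain pair for 5 minutes.
    await redis.set(`cooldown:${wallet}:${domain}`, "1", "EX", WINDOW_SECONDS);
  }
}

async function isBlocked(wallet: string, domain: string): Promise<boolean> {
  return (await redis.exists(`cooldown:${wallet}:${domain}`)) === 1;
}
```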

Error Response (429):
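The error code failure_streak_limit is the one referenced under Best Practices below; the surrounding fields in this sketch are illustrative assumptions.

```json
{
  "error": "failure_streak_limit",
  "message": "Too many consecutive hard failures for this domain. Retry in 5 minutes.",
  "retryAfter": 300
}
```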

This protection prevents your scripts from burning money in infinite loops when domains are blocked or PayPerScrape is experiencing issues.

Best Practices

  • Monitor Rate Limit Headers: Check X-RateLimit-Remaining to track your usage
  • Implement Exponential Backoff: When receiving 429 responses, wait for the Retry-After period before retrying (see the sketch after this list)
  • Respect Limits: Don't attempt to bypass rate limits by using multiple IPs or wallets
  • Handle Failure Streaks: If you receive failure_streak_limit errors, wait 5 minutes before retrying. This indicates the domain may be temporarily blocked.
  • Avoid Expensive Domains: Some domains (Amazon, YouTube, etc.) are automatically classified as Tier 3 and have stricter limits to protect costs
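A hedged sketch of the backoff recommendation, preferring the server's Retry-After over an exponential fallback; the endpoint and payload shape are placeholders.

```typescript
async function scrapeWithBackoff(target: string, maxAttempts = 5): Promise<Response> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Placeholder endpoint and payload shape.
    const res = await fetch("https://api.payperscrape.example/api/scrape", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: target }),
    });
    if (res.status !== 429) return res;

    // Prefer the server's Retry-After header; otherwise back off exponentially.
    const retryAfter = Number(res.headers.get("Retry-After"));
    const waitMs =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : 2 ** attempt * 1000;
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
  throw new Error("Still rate limited after maximum retry attempts");
}
```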