POST /api/scrape
The main endpoint for scraping URLs. Send a URL, get back the page content with intelligent strategy escalation.
Endpoint
POST /api/scrape
Request Body
The request body must be valid JSON and must not exceed 10 KB. It must include the url field; any other parameters are optional.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string (URL) | Yes | The URL to scrape. Must be a valid HTTP or HTTPS URL. Maximum length: 2048 characters. |
Request Example
Here's a simple example of a request to the /api/scrape endpoint using the x402 SDK (for more information, see the x402 documentation).
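As a minimal sketch of the client side, the snippet below validates a URL against the documented constraints (HTTP/HTTPS only, 2048-character limit, 10 KB body limit) and encodes the JSON body. The function name is illustrative, and x402 payment headers are handled by the SDK, so they are omitted here.

```python
import json
from urllib.parse import urlparse

MAX_URL_LEN = 2048
MAX_BODY_BYTES = 10 * 1024  # 10 KB request-body limit

def build_scrape_body(url: str) -> bytes:
    """Validate the URL client-side and encode the JSON body for POST /api/scrape."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("url must be a valid HTTP or HTTPS URL")
    if len(url) > MAX_URL_LEN:
        raise ValueError("url exceeds the 2048-character limit")
    body = json.dumps({"url": url}).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds 10 KB")
    return body
```

Send the resulting bytes as the POST body with Content-Type: application/json through your x402-enabled HTTP client.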
Response Formats
Success Response (200)
Partial Response (200)
Returned when some content was retrieved but it is incomplete (e.g., a login wall or geoblocking).
Error Response (400/403/429)
Blocked Host (403):
Target Site Forbidden (403):
Rate Limit / Throttling (429):
Domain Cost Limit (429):
User Domain Limit (429):
Failure Streak Limit (429):
Domain Blocked (429):
Metadata Fields
The metadata object contains detailed information about the scraping process.
| Field | Type | Description |
|---|---|---|
| domain | string | The normalized domain (without www) that was scraped. |
| country | "us" \| "eu" \| null | The country code used for the request. May be flipped automatically if geoblocked. |
| attempts | number | Total number of scraping attempts made (including escalations). |
| escalations | number | Number of times the scraping strategy was escalated (static → js → premium). |
| byte_size | number | Size of the returned HTML content in bytes. |
| is_partial | boolean | Whether the returned content is partial/incomplete. true if login wall detected or content incomplete. |
| domain_confidence | number | Confidence score (0-1) for the domain strategy. Higher means more reliable strategy. |
| classification_version | string | Version of the classification algorithm used (e.g., 'v1.3'). |
| decision_reason | string | Reason for the final decision. Common values: 'ok', 'hydration_skeleton', 'antibot_challenge', 'geoblocked', 'missing_semantics', 'too_small', etc. |
| report_url | string (optional) | URL to report issues with this domain. Only present in partial responses. |
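As an illustration of consuming these fields, the sketch below renders a one-line summary from a parsed metadata object. It uses only the documented field names; the helper itself is hypothetical.

```python
def summarize_metadata(meta: dict) -> str:
    """Render a one-line summary from the documented metadata fields."""
    parts = [
        f"domain={meta['domain']}",
        f"attempts={meta['attempts']}",
        f"escalations={meta['escalations']}",
        f"reason={meta['decision_reason']}",
    ]
    if meta.get("is_partial"):
        # report_url is only present in partial responses
        parts.append(f"partial (report: {meta.get('report_url', 'n/a')})")
    return " ".join(parts)
```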
Attempts and Escalations
The attempts array shows the sequence of scraping attempts made. Each entry follows the format: mode:reason
Example attempts array: ["static:hydration_skeleton", "js:ok"]
This indicates: the first attempt used static mode and detected a hydration_skeleton, then the system escalated to js mode, which succeeded (ok).
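The entries above can be unpacked client-side. This sketch splits each mode:reason entry and derives the escalation count as the number of mode changes between consecutive attempts, which is inferred from the description rather than stated explicitly.

```python
def parse_attempts(attempts: list[str]) -> list[tuple[str, str]]:
    """Split each 'mode:reason' entry into a (mode, reason) pair."""
    return [tuple(entry.split(":", 1)) for entry in attempts]

def escalation_count(attempts: list[str]) -> int:
    """Count how many times the mode changed between consecutive attempts."""
    modes = [mode for mode, _ in parse_attempts(attempts)]
    return sum(1 for a, b in zip(modes, modes[1:]) if a != b)
```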
Escalation Strategy:
- static → Basic HTTP request, no JavaScript rendering
- js → JavaScript rendering enabled for SPAs and dynamic content
- premium → Premium proxies and anti-bot bypass for protected sites
The system automatically escalates based on detected issues (hydration skeletons, anti-bot challenges, etc.). The escalations count shows how many times the strategy was upgraded.
Decision Reasons
The decision_reason field explains why the scraping succeeded, failed, or required escalation.
| Reason | Description |
|---|---|
| ok | Content is complete and valid |
| hydration_skeleton | Detected React/Next.js hydration skeleton - needs JS rendering |
| antibot_challenge | Detected anti-bot protection (Cloudflare, etc.) - needs premium mode |
| geoblocked | Content blocked by geographic restrictions |
| missing_semantics | Missing title or semantic HTML elements - may need JS rendering |
| under_baseline | Content size significantly below expected baseline |
| too_small | Content too small (<700 bytes) - likely incomplete |
| empty_redirect | Empty response with redirect header |
| empty_no_content | Empty or very small response (<100 bytes) |
| empty_response_max_escalation | Empty response even after maximum escalation |
| http_403 | HTTP 403 Forbidden response |
| http_429 | HTTP 429 Too Many Requests |
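For logging or alerting, the reasons in the table can be grouped client-side. The grouping below is a hypothetical interpretation based on the descriptions above, not a mapping the API itself defines.

```python
# Hypothetical grouping of decision_reason values, based on the table above.
NEEDS_JS = {"hydration_skeleton", "missing_semantics"}
NEEDS_PREMIUM = {"antibot_challenge"}
INCOMPLETE = {"under_baseline", "too_small", "empty_redirect",
              "empty_no_content", "empty_response_max_escalation"}

def classify_reason(reason: str) -> str:
    """Group a decision_reason into a coarse client-side category."""
    if reason == "ok":
        return "complete"
    if reason in NEEDS_JS:
        return "needs_js"
    if reason in NEEDS_PREMIUM:
        return "needs_premium"
    if reason in INCOMPLETE:
        return "incomplete"
    return "blocked"  # geoblocked, http_403, http_429, etc.
```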
Response Headers
| Header | Description |
|---|---|
| X-Strategy-Attempts | Number of attempts made during the scraping process. |
| Content-Type | application/json |
Pricing
Per-Request Pricing
PayPerScrape uses x402 (Coinbase's payment protocol) for per-request payments. No accounts, no API keys—just pay with your crypto wallet.
HTTP Status Codes
| Code | Status | Description |
|---|---|---|
| 200 | OK | Content retrieved successfully (may still be partial; check metadata.is_partial) |
| 206 | Partial Content | Content retrieved but marked as partial/incomplete (only in non-strict mode) |
| 400 | Bad Request | Invalid URL, malformed request body, or validation error |
| 401 | Unauthorized | Invalid x402 payment signature (if signature verification is enabled) |
| 402 | Payment Required | x402 payment needed to complete request |
| 403 | Forbidden | Domain is blocked (localhost, private networks) or restricted |
| 413 | Payload Too Large | Request body exceeds maximum size of 10 KB |
| 429 | Too Many Requests | Rate limit exceeded, domain cost throttling, or failure streak limit. Check Retry-After header and error message for details. |
How It Works
PayPerScrape uses intelligent strategy escalation to get you the best results at the lowest cost:
- Domain Strategy Cache: Each domain's optimal scraping strategy is cached and reused.
- Automatic Escalation: If static scraping fails or detects issues (hydration skeleton, anti-bot, etc.), the system automatically escalates to JS rendering or premium mode.
- Content Evaluation: The returned HTML is evaluated for completeness. If incomplete, the system tries a more advanced strategy.
- Efficiency Optimization: The system always tries the fastest strategy first (static), only escalating when necessary to ensure complete content.
No Platform, Just an Endpoint: Unlike traditional scraping platforms, PayPerScrape is a single API endpoint. No dashboard, no account management, no complex setup. Just send a request with x402 payment and get your content.
Throttling and Protection Systems
Domain Cost Throttling
PayPerScrape automatically classifies domains into cost tiers based on historical processing costs to protect against expensive scraping operations:
- Tier 1 (Low Cost): Domains with low average processing cost. Full speed, 1 retry allowed.
- Tier 2 (Moderate Cost): Domains with moderate average processing cost. Limited to 20 requests/min per wallet, burst protection enabled.
- Tier 3 (High Cost): Domains with high average processing cost. Limited to 5 requests/min per wallet, no retries, strict burst protection.
Domains with excessive processing costs are automatically blocked for 6 hours to maintain service stability and prevent resource drain.
Failure Streak Protection
To prevent infinite loops and billing issues, PayPerScrape tracks consecutive hard failures (403, 429, 5xx) per wallet+domain combination:
- After 3 consecutive hard failures within 5 minutes, requests are blocked for 5 minutes
- Hard failures do not result in charges (you don't pay for failed requests)
- Streak automatically resets on first successful scrape
- This protects your scripts from burning money when domains are blocked
User-Domain Limits
Additional per-wallet, per-domain limits prevent abuse:
- Hourly Limit: Maximum usage per hour per wallet per domain
- Per-Minute Throttling: Tier-based limits prevent rapid-fire requests to expensive domains
