Reindex Website
Re-crawl a website by starting a new crawl job. The job deletes the old pages before re-indexing and reuses the configuration from the original index request for any setting not overridden in the request body.
Authentication
Authorization
Bearer authentication of the form Bearer <token>, where token is your auth token.
Path Parameters
domain
The domain of the previously indexed website to re-crawl.
Request
This endpoint expects an object.
base_url
The base URL to re-crawl; old pages are deleted before re-indexing.
domain_filter
Domain to filter crawling (e.g., 'docs.example.com'). If not provided, uses previous config.
path_filter
Path prefix to restrict crawling (e.g., '/docs'). If not provided, uses previous config.
url_pattern
Regex pattern to filter URLs (e.g., 'https://example\.com/(docs|api)/.*'). If not provided, uses previous config.
chunk_size
Size of text chunks for splitting documents. If not provided, uses previous config.
chunk_overlap
Overlap between consecutive chunks. If not provided, uses previous config.
min_content_length
Minimum content length to index a page. If not provided, uses previous config.
max_pages
Maximum number of pages to crawl. If not provided, uses previous config.
delay
Delay in seconds between requests. If not provided, uses previous config.
version
Version to tag all indexed pages with. If not provided, uses previous config.
product
Product to tag all indexed pages with. If not provided, uses previous config.
authed
Whether indexed pages should be auth-gated. If not provided, uses previous config.
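As a rough sketch, the call below starts a re-crawl with only base_url set, so every other setting falls back to the previous configuration. The API host, the /reindex/{domain} path shape (inferred from the domain path parameter), and the token are placeholder assumptions for illustration, not values documented here.

```python
import requests

# Placeholder values: substitute your real API host, auth token, and indexed domain.
API_BASE = "https://api.example.com"   # assumption: actual host not documented above
TOKEN = "YOUR_AUTH_TOKEN"
DOMAIN = "docs.example.com"

# Only base_url is sent; omitted fields fall back to the configuration
# from the original index request.
payload = {
    "base_url": "https://docs.example.com",
    # "max_pages": 500,   # example override of a previously configured limit
    # "delay": 1.0,       # seconds to wait between requests
}

resp = requests.post(
    f"{API_BASE}/reindex/{DOMAIN}",  # assumed path shape, based on the 'domain' path parameter
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
job = resp.json()
print(job["job_id"], job["base_url"])
```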
Response
Successful Response
job_id
ID used to track the status of the re-crawl job
base_url
The base URL being re-crawled
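For orientation, a successful response carries the two fields above. The sketch below shows an illustrative body as a Python dict; both values, including the job_id format, are made-up placeholders.

```python
# Illustrative successful response body; both values are placeholders.
example_response = {
    "job_id": "reindex-5f2c1a",               # use this ID to track the re-crawl job's status
    "base_url": "https://docs.example.com",   # the base URL being re-crawled, echoed back
}
```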