Reindex Website

Re-crawl a website by starting a new crawl job. The job deletes the old pages before re-indexing and reuses the configuration from the original index request; any field supplied below overrides the stored value.

Authentication

Authorization (Bearer)

Bearer authentication of the form Bearer <token>, where <token> is your auth token.
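A minimal sketch of building the Authorization header described above. The helper name auth_headers and the token value are illustrative, not part of the API.

```python
def auth_headers(token: str) -> dict:
    """Return request headers using the Bearer scheme described above."""
    # The token is a placeholder; substitute your real auth token.
    return {"Authorization": f"Bearer {token}"}

headers = auth_headers("my-auth-token")
# headers["Authorization"] == "Bearer my-auth-token"
```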

Path Parameters

domain (string, Required)

Request

This endpoint expects an object.
base_url (string, Required)

The base URL to re-crawl (will delete old pages and re-index)

domain_filter (string or null, Optional)

Domain to filter crawling (e.g., 'docs.example.com'). If not provided, uses previous config.

path_filter (string or null, Optional)

Path prefix to restrict crawling (e.g., '/docs'). If not provided, uses previous config.

url_pattern (string or null, Optional)

Regex pattern to filter URLs (e.g., 'https://example\.com/(docs|api)/.*'). If not provided, uses previous config.

chunk_size (integer or null, Optional)

Size of text chunks for splitting documents. If not provided, uses previous config.

chunk_overlap (integer or null, Optional)

Overlap between consecutive chunks. If not provided, uses previous config.

min_content_length (integer or null, Optional)

Minimum content length to index a page. If not provided, uses previous config.

max_pages (integer or null, Optional)

Maximum number of pages to crawl. If not provided, uses previous config.

delay (double or null, Optional)

Delay in seconds between requests. If not provided, uses previous config.

version (string or null, Optional)

Version to tag all indexed pages with. If not provided, uses previous config.

product (string or null, Optional)

Product to tag all indexed pages with. If not provided, uses previous config.
authed (boolean or null, Optional)

Whether indexed pages should be auth-gated. If not provided, uses previous config.
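A sketch of assembling the request body from the fields above. Optional fields left unset are omitted entirely so the previous config is reused, as each field description notes. The helper name build_reindex_body and the sample values are illustrative assumptions, not part of the API.

```python
import json

# Optional body fields listed in the Request section above.
OPTIONAL_FIELDS = {
    "domain_filter", "path_filter", "url_pattern", "chunk_size",
    "chunk_overlap", "min_content_length", "max_pages", "delay",
    "version", "product", "authed",
}

def build_reindex_body(base_url, **overrides):
    """Build the JSON body; drop None-valued optional fields so the
    server falls back to the previous config for them."""
    body = {"base_url": base_url}
    for key, value in overrides.items():
        if key not in OPTIONAL_FIELDS:
            raise ValueError(f"unknown field: {key}")
        if value is not None:
            body[key] = value
    return body

# Example: re-crawl only the /docs section, capped at 500 pages.
payload = build_reindex_body(
    "https://example.com",
    path_filter="/docs",
    max_pages=500,
)
print(json.dumps(payload))
```

POST this payload, together with the Bearer Authorization header, to the reindex endpoint for your domain.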

Response

Successful Response
job_id (string)

ID to track the re-crawling job status

base_url (string)

The base URL being re-crawled
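A sketch of reading the response fields above. The response body shown is a hypothetical example shaped like the documented fields; the job_id value is made up.

```python
import json

# Hypothetical successful response, matching the fields documented above.
raw = '{"job_id": "job_123", "base_url": "https://example.com"}'
resp = json.loads(raw)

# Keep job_id to poll the re-crawling job's status later.
print(resp["job_id"], resp["base_url"])
```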

Errors