Reindex Website
Re-crawl a website by starting a new crawl job. The job deletes the previously indexed pages before re-indexing. Any setting you do not supply is taken from the configuration of the original index request.
Authentication
Authorization: Bearer
Bearer authentication of the form Bearer <token>, where token is your authentication token.
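As a minimal sketch of the header format described above, the helper below builds the Authorization header from a token. The function name bearer_headers is hypothetical, and the Content-Type value assumes the request object is sent as JSON, which this page does not state explicitly.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Build request headers for Bearer authentication.

    `bearer_headers` is a hypothetical helper, not part of any official SDK.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must be non-empty")
    return {
        # Format documented above: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        # Assumption: the request object is serialized as JSON.
        "Content-Type": "application/json",
    }
```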
Path parameters
domain
Request
This endpoint expects an object.
base_url
The base URL to re-crawl (will delete old pages and re-index)
domain_filter
Domain to filter crawling (e.g., ‘docs.example.com’). If not provided, uses previous config.
path_filter
Path prefix to restrict crawling (e.g., ‘/docs’). If not provided, uses previous config.
url_pattern
Regex pattern to filter URLs (e.g., https://example\.com/(docs|api)/.*). If not provided, uses previous config.
chunk_size
Size of text chunks for splitting documents. If not provided, uses previous config.
chunk_overlap
Overlap between consecutive chunks. If not provided, uses previous config.
min_content_length
Minimum content length to index a page. If not provided, uses previous config.
max_pages
Maximum number of pages to crawl. If not provided, uses previous config.
delay
Delay in seconds between requests. If not provided, uses previous config.
version
Version to tag all indexed pages with. If not provided, uses previous config.
product
Product to tag all indexed pages with. If not provided, uses previous config.
authed
Whether indexed pages should be auth-gated. If not provided, uses previous config.
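Because every optional field falls back to the previous configuration when it is not provided, a client can send only the fields it wants to change. The sketch below builds such a request object and runs a local sanity check on a url_pattern. The helper names build_reindex_body and pattern_accepts are hypothetical, the value types (integers for sizes and counts, a boolean for authed) are assumptions, and the server's actual regex matching semantics are not documented here, so the local check is only an approximation.

```python
import re
from typing import Any

# Optional fields listed in the request section above.
OPTIONAL_FIELDS = {
    "domain_filter", "path_filter", "url_pattern", "chunk_size",
    "chunk_overlap", "min_content_length", "max_pages", "delay",
    "version", "product", "authed",
}

def build_reindex_body(base_url: str, **overrides: Any) -> dict[str, Any]:
    """Return the request object, leaving out fields set to None.

    Omitted fields are left for the server to fill from the previous config.
    """
    unknown = set(overrides) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body: dict[str, Any] = {"base_url": base_url}
    body.update({k: v for k, v in overrides.items() if v is not None})
    return body

def pattern_accepts(url_pattern: str, url: str) -> bool:
    """Local check only; assumes the pattern must match the whole URL."""
    return re.fullmatch(url_pattern, url) is not None

# Example values are illustrative, not taken from a real deployment.
body = build_reindex_body(
    "https://example.com",
    url_pattern=r"https://example\.com/(docs|api)/.*",
    chunk_size=500,
    max_pages=None,  # None means "keep the previous setting"
)
```

Dropping None values instead of sending null keeps the request unambiguous, since this page only describes the behavior for fields that are not provided.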
Response
Successful Response
job_id
ID to track the re-crawling job status
base_url
The base URL being re-crawled
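The two response fields can be read into a small typed record so the job_id is easy to pass on to whatever status tracking you use. This is a sketch that assumes the response body is JSON and that both fields are strings. ReindexJob and parse_reindex_response are hypothetical names, and the sample payload in the usage lines is made up for illustration.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class ReindexJob:
    job_id: str    # ID to track the re-crawling job status
    base_url: str  # The base URL being re-crawled

def parse_reindex_response(raw: str) -> ReindexJob:
    """Parse a successful response body into a ReindexJob."""
    data = json.loads(raw)
    missing = [key for key in ("job_id", "base_url") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return ReindexJob(job_id=str(data["job_id"]), base_url=str(data["base_url"]))

# Made-up sample body, for illustration only.
sample = '{"job_id": "job-123", "base_url": "https://example.com"}'
job = parse_reindex_response(sample)
```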
Errors
422
Unprocessable Entity Error
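A client will usually want to separate a 422 from other failures, since it points at a problem with the submitted request object rather than with the service. The sketch below does that from a status code and body text. The names are hypothetical, the shape of the 422 error body is not documented on this page so it is passed through as plain text, and treating any 2xx status as success is an assumption.

```python
class ReindexValidationError(Exception):
    """Raised when the API answers 422 Unprocessable Entity."""

def raise_for_reindex_status(status_code: int, body_text: str = "") -> None:
    """Raise on failure; return None when the status looks successful."""
    if status_code == 422:
        # Error body format is not documented here, so keep it as raw text.
        raise ReindexValidationError(body_text or "request body failed validation")
    if not 200 <= status_code < 300:  # assumption: any 2xx means success
        raise RuntimeError(f"unexpected status {status_code}")
```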