Index Website | Fern Documentation

Start crawling and indexing a website. Returns a job_id to track the crawling progress.

Authorization (Bearer)

Bearer authentication of the form Bearer <token>, where token is your auth token.

domain (string, required)

This endpoint expects a JSON object in the request body with the following fields.

base_url (string, required)

The base URL to start indexing from (e.g., 'https://docs.example.com')

domain_filter (string or null, optional)

Domain to filter crawling (e.g., 'docs.example.com'). Defaults to base_url domain.

path_filter (string or null, optional)

Path prefix to restrict crawling (e.g., '/docs'). Only URLs starting with this will be crawled.

url_pattern (string or null, optional)

Regex pattern to filter URLs (e.g., https://example\.com/(docs|api)/.*).

chunk_size (integer or null, optional, defaults to 1000)

Size of text chunks for splitting documents

chunk_overlap (integer or null, optional, defaults to 200)

Overlap between consecutive chunks

min_content_length (integer or null, optional, defaults to 100)

Minimum content length to index a page

max_pages (integer or null, optional)

Maximum number of pages to crawl. A null value means unlimited.

delay (double or null, optional, defaults to 1)

Delay in seconds between requests

version (string or null, optional)

Version to tag all indexed pages with

product (string or null, optional)

Product to tag all indexed pages with

authed (boolean or null, optional)

Whether indexed pages should be auth-gated
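The filtering and chunking fields above can be explored locally before starting a job. The following is a rough sketch only, not the service's actual implementation. It assumes that url_pattern must match the whole URL, that domain_filter is compared against the URL's host exactly, and that chunk_size and chunk_overlap are measured in characters; none of these details are stated in this reference. The function names would_crawl and chunk_text are hypothetical.

```python
import re
from urllib.parse import urlparse


def would_crawl(url, base_url, domain_filter=None, path_filter=None, url_pattern=None):
    """Guess whether a URL would pass the three filters (assumed semantics)."""
    parsed = urlparse(url)
    # Documented: domain_filter defaults to the domain of base_url.
    domain = domain_filter or urlparse(base_url).netloc
    if parsed.netloc != domain:
        return False
    # Documented: path_filter is a path prefix.
    if path_filter and not parsed.path.startswith(path_filter):
        return False
    # Assumption: the regex has to match the entire URL.
    if url_pattern and not re.fullmatch(url_pattern, url):
        return False
    return True


def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # distance between consecutive chunk starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Under these assumptions and with the default sizes, a 2,500-character page would produce three chunks starting at offsets 0, 800, and 1600, each sharing 200 characters with its neighbor.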

Successful Response

job_id (string)

ID to track the indexing job status

base_url (string)

The base URL being indexed

422

Unprocessable Entity Error
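For clients that call this endpoint from code, the request, the job_id in the response, and the 422 error could be handled along the following lines. This is a minimal sketch using only the Python standard library. The helper names build_index_request and start_indexing are hypothetical, the URL is copied from the sample request on this page (including its literal domain path segment), and the shape of the 422 error body is not documented here, so the sketch does not attempt to parse it.

```python
import json
import urllib.request

# Same URL as the sample request on this page.
API_URL = "https://fai.buildwithfern.com/sources/website/domain/index"


def build_index_request(token, base_url, **options):
    """Build (but do not send) the POST request that starts an indexing job."""
    payload = {"base_url": base_url}
    # Leave out options set to None so the documented defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def start_indexing(token, base_url, **options):
    """Send the request and return the job_id from the successful response."""
    request = build_index_request(token, base_url, **options)
    # urlopen raises urllib.error.HTTPError for error statuses such as 422.
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["job_id"]
```

A caller might wrap start_indexing in a try/except for urllib.error.HTTPError and inspect its code attribute to tell a 422 validation failure apart from other errors.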

curl -X POST https://fai.buildwithfern.com/sources/website/domain/index \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "base_url": "https://docs.buildwithfern.com"
  }'