curl --request POST \
--url https://api.orq.ai/v2/chunking \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '
{
"text": "The quick brown fox jumps over the lazy dog. This is a sample text that will be chunked into smaller pieces. Each chunk will maintain context while respecting the maximum chunk size.",
"strategy": "semantic",
"chunk_size": 256,
"threshold": 0.8,
"embedding_model": "openai/text-embedding-3-small",
"dimensions": 512,
"mode": "window",
"similarity_window": 1,
"metadata": true
}
'

Response:

{
"chunks": [
{
"id": "01HQ3K4M5N6P7Q8R9SATBVCWDX",
"text": "The quick brown fox jumps over the lazy dog.",
"index": 0,
"metadata": {
"start_index": 0,
"end_index": 44,
"token_count": 10
}
},
{
"id": "01HQ3K4M5N6P7Q8R9SATBVCWDY",
"text": "This is a sample text that will be chunked into smaller pieces.",
"index": 1,
"metadata": {
"start_index": 45,
"end_index": 108,
"token_count": 12
}
}
]
}

Split large text documents into smaller, manageable chunks using different chunking strategies optimized for RAG (Retrieval-Augmented Generation) workflows. This endpoint supports multiple chunking algorithms, including token-based, sentence-based, recursive, semantic, and specialized strategies.
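The metadata offsets in the sample response can be checked against the original input text. A minimal sketch, assuming start_index and end_index are zero-based character offsets with the end exclusive, which is what the sample response suggests:

```python
# The text sent in the sample request above.
full_text = (
    "The quick brown fox jumps over the lazy dog. "
    "This is a sample text that will be chunked into smaller pieces. "
    "Each chunk will maintain context while respecting the maximum chunk size."
)

# The first two chunks as returned in the sample response.
chunks = [
    {"text": "The quick brown fox jumps over the lazy dog.",
     "metadata": {"start_index": 0, "end_index": 44}},
    {"text": "This is a sample text that will be chunked into smaller pieces.",
     "metadata": {"start_index": 45, "end_index": 108}},
]

for chunk in chunks:
    m = chunk["metadata"]
    # Slicing the source text with the offsets reproduces each chunk's text.
    assert full_text[m["start_index"]:m["end_index"]] == chunk["text"]
```

This makes the offsets useful for highlighting retrieved chunks in the original document without storing the chunk text separately.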
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Request payload for text chunking with strategy-specific configuration:

text: The text content to be chunked.
strategy: The chunking strategy to use (default: token).
metadata: Whether to include metadata for each chunk.
Return format: chunks (with metadata) or texts (plain strings). Options: chunks, texts.
chunk_size: Maximum number of tokens per chunk.
Chunk overlap: Number of tokens to overlap between chunks. Constraint: x >= 0.

200 response: Text successfully chunked. Each chunk carries:

text: The text content of the chunk.
index: The position index of this chunk in the sequence.
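The interaction between chunk size and chunk overlap can be illustrated with a minimal sliding-window sketch. This is an illustration of the general technique only, not the service's actual tokenizer or algorithm, and the function name is hypothetical:

```python
def sliding_chunks(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of at most chunk_size tokens,
    repeating chunk_overlap tokens between consecutive windows."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("require 0 <= chunk_overlap < chunk_size")
    # Each new window starts chunk_size - chunk_overlap tokens after
    # the previous one, so neighbours share chunk_overlap tokens.
    step = chunk_size - chunk_overlap
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), step)]

# 10 tokens, windows of 4 with 2 tokens shared between neighbours:
print(sliding_chunks(list(range(10)), chunk_size=4, chunk_overlap=2))
# → [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9], [8, 9]]
```

The overlap keeps context that straddles a chunk boundary available in both neighbouring chunks, at the cost of some duplicated tokens in the index.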