Chunking strategy

Configure how your data sources are cleaned and split into chunks.

Chunks are portions of a source document loaded into a Knowledge Base. When you add a new source to a Knowledge Base, you can decide how its content is chunked.

Larger chunks each carry more information, but they use more tokens when sent to a model.
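To make the trade-off concrete, here is a rough sketch that estimates how many tokens a set of retrieved chunks adds to a prompt. The function name, the number of retrieved chunks, and the default of about 4 characters per token are illustrative assumptions, not values taken from this product; real token counts depend on the model's tokenizer and on the language of the text.

```python
import math


def estimate_prompt_tokens(
    chunk_length_chars: int,
    chunks_retrieved: int,
    chars_per_token: float = 4.0,  # assumption: rough average for English text
) -> int:
    """Roughly estimate the tokens used by the retrieved chunks in one request."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    total_chars = chunk_length_chars * chunks_retrieved
    return math.ceil(total_chars / chars_per_token)


# Doubling the chunk length doubles the estimated token use.
small = estimate_prompt_tokens(chunk_length_chars=1000, chunks_retrieved=5)
large = estimate_prompt_tokens(chunk_length_chars=2000, chunks_retrieved=5)
print(small, large)
```

Under these assumptions, the estimate grows linearly with chunk length, which is why larger chunks cost more per request.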

Here are the options you can choose when configuring your chunks.

Default

Automatically sets the chunking and preprocessing rules. Recommended if you are unfamiliar with chunking.

Advanced

Maximum Chunk Length

Defines the maximum size of each chunk. The larger the value, the more information each chunk contains.

Chunk Overlap

Defines the number of characters shared between neighboring chunks. A higher value means more redundant information across chunks, but it also makes it more likely that relevant information is returned to the model, since a passage that straddles a chunk boundary is more likely to appear whole in at least one chunk.
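The interaction between the two Advanced settings can be illustrated with a minimal sketch. It assumes that Maximum Chunk Length is measured in characters, like Chunk Overlap, and that the text is cut at fixed positions; the function `chunk_text` is hypothetical and is not the Knowledge Base's actual implementation, which may also apply preprocessing or split on separators.

```python
def chunk_text(text: str, max_length: int, overlap: int) -> list[str]:
    """Split text into chunks of at most max_length characters.

    Each chunk repeats the last `overlap` characters of the previous one.
    Illustrative only: assumes both settings are counted in characters.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must be >= 0 and smaller than max_length")

    step = max_length - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", max_length=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
print(chunk_text("abcdefghij", max_length=4, overlap=0))
# ['abcd', 'efgh', 'ij']
```

With an overlap of 2, the same ten characters yield four chunks instead of three: more redundancy, but each boundary region appears in two chunks.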