Chunking strategy
Configure how your data sources are cleaned up and split into chunks.
Chunks are portions of a source document loaded within a Knowledge Base. When adding a new source to a Knowledge Base, you can decide how this source's information will be chunked.
Larger chunks hold more context, but they also use more tokens when sent to a model, which increases the generation cost.
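As a rough illustration of this trade-off, the sketch below estimates how many tokens the retrieved chunks add to a prompt. It assumes about 4 characters per token (a common rule of thumb for English text, not an exact figure) and that chunk size is measured in characters. The function name and the chunks_retrieved parameter are hypothetical and are not part of the product.

```python
import math

# Assumption: roughly 4 characters per token for English text.
# Real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4


def estimate_context_tokens(chunk_length_chars: int, chunks_retrieved: int) -> int:
    """Estimate the tokens added to a prompt by the retrieved chunks."""
    total_chars = chunk_length_chars * chunks_retrieved
    return math.ceil(total_chars / CHARS_PER_TOKEN)


# Five 500-character chunks versus five 2000-character chunks.
small = estimate_context_tokens(500, 5)
large = estimate_context_tokens(2000, 5)
print(small, large)
```

Under these assumptions, making the chunks four times longer makes the retrieved context about four times more expensive per request.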
Here are the options you can choose when configuring your chunks.
Default
Automatically sets the chunk and preprocessing rules. Recommended if you are new to chunking.
Advanced
Maximum Chunk Length
Defines the maximum size of each chunk. The larger the value, the more information each chunk contains.
Chunk Overlap
Defines the number of characters shared between neighboring chunks. A higher value means more redundant information across chunks, but it also makes it more likely that relevant information cut at a chunk boundary is returned to the model in full.
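To make the interaction between these two settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only, not the platform's actual splitter: it assumes both values are measured in characters, and the function name chunk_text is hypothetical.

```python
def chunk_text(text: str, max_chunk_length: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of at most max_chunk_length characters,
    where each chunk repeats the last chunk_overlap characters of the previous one."""
    if max_chunk_length <= 0:
        raise ValueError("max_chunk_length must be positive")
    if not 0 <= chunk_overlap < max_chunk_length:
        raise ValueError("chunk_overlap must be >= 0 and smaller than max_chunk_length")

    # Each new chunk starts this many characters after the previous one.
    step = max_chunk_length - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chunk_length])
        # Stop once the current chunk reaches the end of the text.
        if start + max_chunk_length >= len(text):
            break
        start += step
    return chunks


print(chunk_text("abcdefghij", max_chunk_length=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

In this example, "d" and "g" each appear in two neighboring chunks. Raising the overlap shrinks the step between chunks, so the same text produces more chunks with more repeated content.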