Creating a Knowledge Base
Use the + button in a chosen Project and select Knowledge Base > Internal.
Press Create Knowledge, and the following modal will appear:

With the Knowledge Bases API, it is now possible to create Knowledge Bases with code. To learn more, see Creating a Knowledge Base Programmatically.
Adding a source
You are then taken to the source management page. A source represents a document loaded within your Knowledge Base; its content will be used when referencing and querying the Knowledge Base. Documents need to be loaded ahead of time so that they can be parsed and cut into chunks. Language models then use the loaded information as a source for answering user queries. To load a new source, select the Add Source button. You can add documents in any of the following formats: TXT, PDF, DOCX, CSV, XML.
Chunk Settings and Strategies
Chunks are portions of a source document loaded within a Knowledge Base. When adding a new source to a Knowledge Base, you can decide how the source's information will be chunked. Larger chunks hold more relevant information but use more tokens when sent to a model, increasing generation cost.
Token
Splits text into chunks based on token count. Best for ensuring chunks fit within LLM context windows and maintaining consistent chunk sizes for embedding models.
| Parameter | Description | Default |
|---|---|---|
| chunk_size | Maximum tokens per chunk | 512 |
| chunk_overlap | Number of tokens to overlap between chunks | 0 |
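As an illustration, token-based chunking with overlap can be sketched in a few lines of Python. This is a simplified model only: it uses whitespace splitting as a stand-in for a real tokenizer, with parameter names mirroring the table above.

```python
def chunk_by_tokens(text, chunk_size=512, chunk_overlap=0):
    """Split text into chunks of at most chunk_size tokens, repeating
    chunk_overlap tokens between neighboring chunks."""
    tokens = text.split()  # stand-in for a real tokenizer
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With `chunk_size=4` and `chunk_overlap=2`, the text `"a b c d e f"` yields `["a b c d", "c d e f"]`: the last two tokens of each chunk reappear at the start of the next.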
Sentence
Splits text at sentence boundaries while respecting token limits. Ideal for maintaining semantic coherence and readability.
| Parameter | Description | Default |
|---|---|---|
| chunk_size | Maximum tokens per chunk | 512 |
| chunk_overlap | Number of overlapping tokens between chunks | 0 |
| min_sentences_per_chunk | Minimum number of sentences per chunk | 1 |
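A minimal sketch of the sentence strategy, assuming a simple regex sentence splitter and whitespace token counting as stand-ins for the real implementation:

```python
import re

def chunk_by_sentences(text, chunk_size=512, min_sentences_per_chunk=1):
    """Group whole sentences into chunks whose token count stays at or
    below chunk_size, keeping at least min_sentences_per_chunk
    sentences in each chunk."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())  # stand-in token count
        if current and current_len + n > chunk_size and len(current) >= min_sentences_per_chunk:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because chunks only break at sentence boundaries, each chunk stays readable on its own, at the cost of slightly less uniform chunk sizes than the Token strategy.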
Recursive
Recursively splits text using a hierarchy of separators (paragraphs, sentences, words). Versatile general-purpose chunker that preserves document structure.
| Parameter | Description | Default |
|---|---|---|
| chunk_size | Maximum tokens per chunk | 512 |
| separators | Hierarchy of separators to use | `["\n\n", "\n", " ", ""]` |
| min_characters_per_chunk | Minimum characters allowed per chunk | 24 |
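The separator hierarchy can be sketched as follows. This is a simplified model measured in characters rather than tokens; production recursive chunkers also merge small neighboring pieces back up toward chunk_size, which this sketch omits.

```python
def chunk_recursive(text, chunk_size=512, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator first, recursing with finer
    separators on any piece still longer than chunk_size."""
    if len(text) <= chunk_size or not separators:
        return [text]
    sep, finer = separators[0], separators[1:]
    parts = list(text) if sep == "" else text.split(sep)
    chunks = []
    for part in parts:
        if not part:
            continue  # drop empty strings from consecutive separators
        if len(part) <= chunk_size:
            chunks.append(part)
        else:
            chunks.extend(chunk_recursive(part, chunk_size, finer))
    return chunks
```

The hierarchy means paragraphs are preferred over lines, lines over words, and only oversized words are ever split mid-token, which is why this strategy tends to preserve document structure.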
Semantic
Groups semantically similar sentences using embeddings. Excellent for maintaining topic coherence and context within chunks.
| Parameter | Description | Default |
|---|---|---|
| chunk_size | Maximum tokens per chunk | 512 |
| embedding_model | Embedding model for similarity (required) | - |
| dimensions | Number of dimensions for embedding output | - |
| threshold | Similarity threshold (0-1) or "auto" | "auto" |
| mode | Chunking mode: "window" or "sentence" | "window" |
| similarity_window | Window size for similarity comparison | 1 |
Agentic
AI-powered intelligent chunking that uses an LLM to determine optimal split points. Best for complex documents requiring intelligent segmentation.
| Parameter | Description | Default |
|---|---|---|
| model | LLM model to use for chunking (required) | - |
| chunk_size | Maximum tokens per chunk | 1024 |
| candidate_size | Size of candidate splits for LLM evaluation | 128 |
| min_characters_per_chunk | Minimum characters allowed per chunk | 24 |
Fast
High-performance SIMD-optimized byte-level chunking. Best for large files (>1MB) where speed and memory efficiency are critical. 2x faster and 3x less memory than token-based chunking.
| Parameter | Description | Default |
|---|---|---|
| target_size | Target chunk size in bytes | 4096 |
| delimiters | Single-byte delimiters to split on (e.g., `"\n.?!"`) | `"\n.?"` |
| pattern | Multi-byte pattern for splitting (e.g., `"▁"` for SentencePiece) | - |
| prefix | Attach delimiter to start of next chunk | false |
| consecutive | Split at start of consecutive delimiter runs | false |
| forward_fallback | Search forward if no delimiter found backward | false |
When to use Fast: large files (>1MB), high-throughput ingestion, memory-constrained environments.

When NOT to use Fast: when you need precise token counts for embedding models, for small documents where speed isn't critical, or when semantic boundaries matter more than byte boundaries.
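The core idea, cutting near a byte budget and backtracking to the nearest delimiter, can be sketched in pure Python. This is only a model of the strategy: the real implementation is SIMD-optimized and also supports the pattern, prefix, consecutive, and forward_fallback options, which are omitted here.

```python
def chunk_fast(data: bytes, target_size=4096, delimiters=b"\n.?"):
    """Cut a byte stream near target_size, searching backward from the
    cut point for the nearest delimiter byte to cut after."""
    chunks, start = [], 0
    while len(data) - start > target_size:
        cut = start + target_size
        pos = cut
        while pos > start and data[pos - 1] not in delimiters:
            pos -= 1  # walk backward looking for a delimiter
        if pos == start:
            pos = cut  # no delimiter found: hard cut at target_size
        chunks.append(data[start:pos])
        start = pos
    chunks.append(data[start:])
    return chunks
```

Working on raw bytes with single-byte delimiters is what makes this approach fast: no tokenization or decoding happens at all, which is also why token counts per chunk are only approximate.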
| Use Case | Recommended Strategy |
|---|---|
| Large files (>1MB) | Fast - 2x faster, 3x less memory |
| RAG with precise tokens | Token or Recursive |
| Semantic search | Semantic |
| Complex document understanding | Agentic |
| General purpose | Recursive |
Default
Automatically sets chunking and preprocessing rules. Users unfamiliar with chunking are recommended to select this option.
Advanced
Maximum Chunk Length: defines the maximum size of each chunk. The larger the size, the more information each chunk contains.
Chunk Overlap: defines the number of characters shared between neighboring chunks. Higher values mean more redundant information between chunks, but also a higher likelihood that relevant information is sent back to the model.
Data Cleanup
You can choose to modify the data loaded within your sources; this is useful for cleaning chunks or anonymizing data. To activate a cleanup, simply toggle on the corresponding option within the Data Cleanup panel.
Summary and Cost Estimation
Once your document has been processed, the following summary will be displayed:
Retrieval Settings
You can configure these options on the Knowledge Settings page. Each option will yield different results, depending on your needs.
Search Methods
Vector Search
Vector search is the fastest method of searching through a database built from your Knowledge Sources. Our systems take the user query and look for the text segments whose vector representations are most similar to it. The search returns the preprocessed chunks from the sources most similar and relevant to the user's query.
Keyword Search
Keyword Search is a different method for retrieving relevant results within a Knowledge Base. In this method, the entire content is indexed, and the system searches for segments containing the words from the user’s query.
Hybrid Search
Hybrid search runs both of the previously mentioned Vector and Keyword searches, then combines the results and returns the most relevant chunks to the model.
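One common way to combine the two result sets is a weighted sum of normalized scores; the sketch below illustrates that idea. Note this is an assumption for illustration only: the exact fusion method used by the platform is not documented here.

```python
def hybrid_search(vector_scores, keyword_scores, alpha=0.5, limit=5):
    """Merge per-chunk scores from vector and keyword search with a
    weighted sum and return the top `limit` chunk ids, best first."""
    combined = {}
    for chunk_id, score in vector_scores.items():
        combined[chunk_id] = alpha * score
    for chunk_id, score in keyword_scores.items():
        combined[chunk_id] = combined.get(chunk_id, 0.0) + (1 - alpha) * score
    return sorted(combined, key=combined.get, reverse=True)[:limit]
```

A chunk that scores moderately on both methods can outrank a chunk that scores highly on only one, which is the main benefit of hybrid retrieval.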
Search Parameters
All previous search types can be configured with the following parameters:
Chunk limit
This parameter sets the maximum number of chunks returned, ranked by similarity to the user's query.
Threshold
This controls the relevance of the results on a scale from 0 to 1; results scoring below the threshold are excluded from retrieval. The closer to 1, the more relevant and narrow the results will be.
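Together, the two parameters act as a filter followed by a cutoff, which can be sketched as:

```python
def apply_retrieval_settings(results, threshold=0.5, chunk_limit=5):
    """Drop chunks scoring below the threshold, then keep only the top
    chunk_limit chunks by score."""
    kept = [r for r in results if r["score"] >= threshold]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:chunk_limit]
```

Raising the threshold narrows results by relevance, while lowering the chunk limit caps how much retrieved text is injected into the prompt.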
Reranking
Reranking invokes a model that analyzes your initial query and the results fetched by the Knowledge Base search. The model scores the similarity of each returned chunk to the user query and ranks the chunks accordingly, ensuring the results are the most relevant for your query.
Knowledge Settings
By choosing the Knowledge Settings button, you can configure the following settings.
Embedding Models
Here, you can configure which LLM to use to query the Knowledge Base. The configuration is similar to any model configuration within Playground, Experiment, Deployment, and Agent, and includes the usual parameters.
Agentic RAG

- Document Grading, which ensures relevant chunks are retrieved.
- Query Refinement, improving the query if needed.
Example
See the screenshot below for how the input query gets refined. The input query "is my suitcase too big?" is reformulated to "luggage size requirements and restrictions for carry-on and checked baggage".

Connecting an External Knowledge Base
To connect to an external Knowledge Base, choose the + button on the desired Project.


| Field | Description | Example |
|---|---|---|
| Key | Unique identifier, alphanumeric with hyphens/underscores | external_kb |
| Description | A short description of the Knowledge Base | External Knowledge Base |
| Name | Display Name | External Knowledge Base Name |
| API URL | URL to search knowledge base, must be HTTPS | https://api.example.org/search |
| API Key | Authentication API key for the API URL above. Orq will use a Bearer Authentication header to call your API. | <API_KEY> |
orq.ai will include the API key in the `Authorization: Bearer <API_KEY>` header when calling your endpoint. API keys are encrypted using workspace-specific keys (AES-256-GCM).
API Payloads
Here are example payloads for the request and response expected from your API.
Request Payload
Response Payload
The API must respond like a standard Knowledge Base search. To learn more about the expected payload, see our Search API.
Example Implementation for an External API
We've created example implementations of the External Knowledge Base API.
Python Implementation
An Example Python Server for External Knowledge Base
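Before diving into the full example server, the core of such an endpoint can be sketched as a plain Python search function over an in-memory corpus. The `text` and `scores.search_score` field names here are assumptions based on the troubleshooting notes below; check the Search API reference for the authoritative payload shape, and note that the scoring (word overlap) is a stand-in for a real vector search.

```python
DOCUMENTS = [
    "Carry-on luggage must fit in the overhead bin.",
    "Checked baggage may weigh up to 23 kg.",
]

def search(query: str, top_k: int = 3):
    """Score each document by word overlap with the query, normalized
    to [0, 1], and return results shaped like a search response."""
    query_words = set(query.lower().split())
    results = []
    for doc in DOCUMENTS:
        overlap = query_words & set(doc.lower().split())
        score = len(overlap) / max(len(query_words), 1)
        results.append({"text": doc, "scores": {"search_score": score}})
    results.sort(key=lambda r: r["scores"]["search_score"], reverse=True)
    return results[:top_k]
```

In the real server, a function like this would sit behind an HTTPS endpoint that validates the `Authorization: Bearer <API_KEY>` header before answering.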
Get the Code
Clone the Python example Server
Install Dependencies
Run the Server
Test the API
The API is running at http://localhost:8000. Dynamic documentation will be running at http://localhost:8000/docs.
Node.js Implementation
An Example Node Server for External Knowledge Base
Get the Code
Clone the Node example Server
Install Dependencies
Run the Server
Test the API
The API is running at http://localhost:8000. Dynamic documentation will be running at http://localhost:8000/doc.
Integrating Vector Database Providers
We support providers like Weaviate and Pinecone, as both platforms expose REST APIs that conform to the expected payload format documented above.
Integration Examples
Common Errors and Troubleshooting
| Scenario | Error Message |
|---|---|
| HTTP instead of HTTPS | "External knowledge base URL must use HTTPS protocol" |
| Local/private IP | "External knowledge base URL cannot point to local network" |
| API unreachable | "Failed to verify external knowledge base connectivity" |
| API timeout (>50s) | “External API request timed out” |
- Verify your API endpoint is publicly accessible via HTTPS
- Check your API logs for incoming requests from orq.ai IP addresses
- Verify your firewall/security groups allow inbound HTTPS traffic
- Verify the API key is correct and has not expired
- Check that your API expects Bearer authentication in the `Authorization` header
- Confirm your API key has the necessary permissions to perform searches
- Verify your API returns the expected response format (see Response Payload above)
- Check that `scores.search_score` values are between 0 and 1
- Test with different `threshold` values (lower threshold = more results)
- If using reranking, ensure both `search_score` and `rerank_score` are provided
- Verify your external vector database has sufficient indexed documents
- Monitor your external API response times
- Consider implementing caching for frequently searched queries
- Optimize your vector database indexes
- Check if your external API is rate limiting requests
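Several items in the checklist above concern malformed response payloads; a small self-check like the following can catch them before connecting. This is a sketch under the assumption (drawn from the checklist) that the response carries a list of results with `scores.search_score` fields; the Search API reference is authoritative.

```python
def validate_search_results(results):
    """Return a list of problems found in a search response: missing
    scores.search_score fields or scores outside [0, 1]."""
    problems = []
    for i, item in enumerate(results):
        score = item.get("scores", {}).get("search_score")
        if score is None:
            problems.append(f"result {i}: missing scores.search_score")
        elif not 0 <= score <= 1:
            problems.append(f"result {i}: search_score {score} outside [0, 1]")
    return problems
```

Running a check like this against your endpoint's output makes the "scores between 0 and 1" and "expected response format" items easy to verify mechanically.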
Configuring your External Knowledge Base
Datasource configuration is not accessible for External Knowledge Bases, as the data is hosted outside of Orq.ai. The following settings remain configurable:
- Agentic RAG, to handle knowledge base search refinement with an agent, through orq.ai’s system.
- Search retrieval parameters, Chunk Limit, Search Threshold
- Reranking Models
Once your External Knowledge Base is connected:
- Use it just like any other Knowledge Base, to learn more see Using a Knowledge Base in a Prompt.
- Your Knowledge Base can also be used with Agents; see Using Knowledge Bases with Agents.
- Your API will be called at runtime when the model needs to perform a search.
Retrieval Logs
When using a Knowledge Base in a Prompt within Playground, Experiment, Deployment, or Agent, logs are generated that transparently contain details of how Knowledge Bases were accessed. To find them, head to the Logs tab within the module you're in, then select a log to open its detail panel. The following panel will open.
User Message Augmentation
On the left side of the panel, you can see how the Knowledge Base variable is modified with the retrieval results, shown in blue. The blue parts of the messages are the retrieval results injected into the user message; they are then used as the data with which the model responds to the user query. Using the blue text, you can verify that the query is correct and that the expected chunks are loaded into the message.