
Creating a Knowledge Base

Use the + button in a chosen Project and select Knowledge Base > Internal. Press Create Knowledge, and the following modal will appear:

Here you can enter a unique Key that will be used to reference your Knowledge Base within Prompts and Deployments. Also enter a Name and select the Project this Knowledge Base belongs to. You can also choose an available Embedding Model to use during knowledge search.

You can only create a Knowledge Base once you have activated an embedding model within your Model Garden.
With the Knowledge Bases API, you can also create Knowledge Bases with code. To learn more, see Creating a Knowledge Base Programmatically.
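For orientation, creating a Knowledge Base over HTTP could look like the sketch below. This is a hypothetical sketch: the endpoint path, JSON field names, and model identifier are assumptions for illustration, and ORQ_API_KEY is a placeholder; the authoritative contract is in the Knowledge Bases API reference.

import os
import requests

# Hypothetical sketch: endpoint path and field names are assumptions,
# not the authoritative API contract (see the Knowledge Bases API reference).
resp = requests.post(
    "https://api.orq.ai/v2/knowledge",  # assumed endpoint path
    headers={"Authorization": f"Bearer {os.environ['ORQ_API_KEY']}"},
    json={
        "key": "product_docs",   # unique Key referenced in Prompts and Deployments
        "name": "Product Docs",  # display Name
        "project": "default",    # Project the Knowledge Base belongs to (assumed field)
        "embedding_model": "openai/text-embedding-3-small",  # an activated embedding model
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())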

Adding a Source

You are then taken to the source management page. A source represents a document that is loaded within your Knowledge Base. This document’s information will then be used when referencing and querying the Knowledge Base. Documents need to be loaded ahead of time so that they can be parsed and split into chunks. Language models will then use the loaded information as a source for answering user queries. To load a new source, select the Add Source button. Here you can add any document in one of the following formats: TXT, PDF, DOCX, CSV, XML.
While you can add any number of sources to a Knowledge Base, a single source document can be at most 10 MB.
Once you have selected files from your disk, you will be able to configure how each file is parsed and indexed within the Knowledge Base.

Chunk Settings and Strategies

Chunks are portions of a source document loaded within a Knowledge Base. When adding a new source to a Knowledge Base, you can decide how this source’s information will be chunked. Larger chunks hold more information but imply more token usage when sent to a model, which impacts generation cost.
Token: Splits text into chunks based on token count. Best for ensuring chunks fit within LLM context windows and maintaining consistent chunk sizes for embedding models.

| Parameter | Description | Default |
| --- | --- | --- |
| chunk_size | Maximum tokens per chunk | 512 |
| chunk_overlap | Number of tokens to overlap between chunks | 0 |
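As a minimal sketch of the idea (using whitespace-separated words as a stand-in for real tokenizer tokens), token chunking with overlap works roughly like this:

def token_chunks(text, chunk_size=512, chunk_overlap=0):
    # Whitespace "tokens" stand in for a real tokenizer such as tiktoken.
    tokens = text.split()
    step = chunk_size - chunk_overlap  # assumes chunk_overlap < chunk_size
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(" ".join(window))
        if start + chunk_size >= len(tokens):
            break  # last window already reached the end of the document
    return chunks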
Sentence: Splits text at sentence boundaries while respecting token limits. Ideal for maintaining semantic coherence and readability.

| Parameter | Description | Default |
| --- | --- | --- |
| chunk_size | Maximum tokens per chunk | 512 |
| chunk_overlap | Number of overlapping tokens between chunks | 0 |
| min_sentences_per_chunk | Minimum number of sentences per chunk | 1 |
Recursive: Recursively splits text using a hierarchy of separators (paragraphs, sentences, words). A versatile general-purpose chunker that preserves document structure.

| Parameter | Description | Default |
| --- | --- | --- |
| chunk_size | Maximum tokens per chunk | 512 |
| separators | Hierarchy of separators to use | ["\n\n", "\n", " ", ""] |
| min_characters_per_chunk | Minimum characters allowed per chunk | 24 |
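A simplified sketch of the recursive idea, measuring size in characters for brevity (the real chunker counts tokens and also merges small pieces back up to the size limit):

def recursive_split(text, chunk_size=512, separators=("\n\n", "\n", " ", "")):
    if len(text) <= chunk_size:
        return [text]  # base case: the piece already fits
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard split at the size limit.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks = []
    for part in text.split(sep):
        if len(part) > chunk_size:
            # Recurse with the next, finer-grained separator.
            chunks.extend(recursive_split(part, chunk_size, rest))
        elif part:
            chunks.append(part)
    return chunks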
Semantic: Groups semantically similar sentences using embeddings. Excellent for maintaining topic coherence and context within chunks.

| Parameter | Description | Default |
| --- | --- | --- |
| chunk_size | Maximum tokens per chunk | 512 |
| embedding_model | Embedding model for similarity (required) | - |
| dimensions | Number of dimensions for embedding output | - |
| threshold | Similarity threshold (0-1) or "auto" | "auto" |
| mode | Chunking mode: "window" or "sentence" | "window" |
| similarity_window | Window size for similarity comparison | 1 |
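A conceptual sketch of semantic chunking in its simplest form, comparing consecutive sentences (i.e. a similarity window of 1); embed() is a placeholder for whichever embedding model you configure:

import numpy as np

def semantic_chunks(sentences, embed, threshold=0.75):
    # embed(sentence) -> 1-D vector; placeholder for a real embedding model.
    vectors = [np.asarray(embed(s), dtype=float) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, vec, sent in zip(vectors, vectors[1:], sentences[1:]):
        cosine = vec @ prev / (np.linalg.norm(vec) * np.linalg.norm(prev) + 1e-9)
        if cosine < threshold:
            chunks.append(" ".join(current))  # similarity dropped: close the chunk
            current = [sent]
        else:
            current.append(sent)              # same topic: keep accumulating
    chunks.append(" ".join(current))
    return chunks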
Agentic: AI-powered intelligent chunking that uses an LLM to determine optimal split points. Best for complex documents requiring intelligent segmentation.

| Parameter | Description | Default |
| --- | --- | --- |
| model | LLM model to use for chunking (required) | - |
| chunk_size | Maximum tokens per chunk | 1024 |
| candidate_size | Size of candidate splits for LLM evaluation | 128 |
| min_characters_per_chunk | Minimum characters allowed per chunk | 24 |
Fast: High-performance SIMD-optimized byte-level chunking. Best for large files (>1MB) where speed and memory efficiency are critical: roughly 2x faster with 3x less memory than token-based chunking.

| Parameter | Description | Default |
| --- | --- | --- |
| target_size | Target chunk size in bytes | 4096 |
| delimiters | Single-byte delimiters to split on (e.g., "\n.?!") | "\n.?" |
| pattern | Multi-byte pattern for splitting (e.g., "▁" for SentencePiece) | - |
| prefix | Attach delimiter to start of next chunk | false |
| consecutive | Split at start of consecutive delimiter runs | false |
| forward_fallback | Search forward if no delimiter found backward | false |
When to use Fast: large files (>1MB), high-throughput ingestion, memory-constrained environments.
When NOT to use Fast: when you need precise token counts for embedding models, for small documents where speed isn’t critical, or when semantic boundaries matter more than byte boundaries.
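A plain-Python sketch of the byte-level idea behind Fast (the real implementation is SIMD-optimized, and options such as forward_fallback are omitted here): scan backward from the target boundary for the nearest delimiter byte, falling back to a hard cut.

def fast_chunks(data: bytes, target_size=4096, delimiters=b"\n.?"):
    chunks, start = [], 0
    while start < len(data):
        end = min(start + target_size, len(data))
        if end < len(data):
            # Search backward from the target boundary for any delimiter byte.
            cut = max(data.rfind(bytes([d]), start, end) for d in delimiters)
            if cut > start:
                end = cut + 1  # keep the delimiter at the end of the chunk
        chunks.append(data[start:end])
        start = end
    return chunks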
Strategy Selection Guide

| Use Case | Recommended Strategy |
| --- | --- |
| Large files (>1MB) | Fast (2x faster, 3x less memory) |
| RAG with precise tokens | Token or Recursive |
| Semantic search | Semantic |
| Complex document understanding | Agentic |
| General purpose | Recursive |
UI Configuration
Automatically sets chunk and preprocessing rules; users unfamiliar with chunking are recommended to select this option.

Maximum Chunk Length: Defines the maximum size of each chunk. The bigger the size, the more information each chunk contains.
Chunk Overlap: Defines the number of characters overlapping between neighboring chunks. The higher the value, the more redundant information neighboring chunks share, but the more likely relevant information will be sent back to models.

Data Cleanup

You can choose to modify the data loaded within your sources; this is useful for cleaning chunks or anonymizing data. To activate a cleanup, simply toggle on the option within the Data Cleanup panel.

Summary and Cost Estimation

Once your document has been processed, the following summary will be displayed:

Here you can see details of the data parsed into your Knowledge Base and estimate the cost of retrieval.

Retrieval Settings

You can configure these options on the Knowledge Settings page. Each option will yield different results, depending on your needs.

Search Methods

Search Parameters

All previous search types can be configured with the following parameters:
Top K: sets the number of chunks most similar to the user’s question that will be retrieved.
Threshold: controls the relevance of the results on a scale from 0 to 1. Results scoring below the threshold are excluded from retrieval; the closer to 1, the more relevant and narrow the results will be.
Setting too high a threshold can yield few or no results for a search.
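Taken together, the two parameters behave like this sketch: drop chunks scoring below the threshold, then return the Top K most similar.

def select_chunks(scored_chunks, top_k=5, threshold=0.5):
    # scored_chunks: list of (chunk_text, similarity_score in [0, 1]) pairs.
    relevant = [c for c in scored_chunks if c[1] >= threshold]  # threshold filter
    relevant.sort(key=lambda c: c[1], reverse=True)             # most similar first
    return relevant[:top_k]                                     # keep the Top K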

Reranking

Reranking invokes a model that analyzes your initial query and the results fetched by the Knowledge Base search. This model scores the similarity of each returned chunk to the user query and ranks the chunks accordingly, ensuring the results are the most relevant for your query.

You can choose a rerank model within your Knowledge Base settings; click on the model name to select one.

To use reranking within your Knowledge Base, you must enable at least one Reranking model within your Model Garden.
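Conceptually, reranking re-scores and reorders the retrieved chunks, mirroring the model, threshold, and top_k fields shown later in the rerank_config payload; rerank_score below is a placeholder standing in for the rerank model you enable.

def rerank(query, retrieved_chunks, rerank_score, top_k=10, threshold=0.0):
    # rerank_score(query, chunk) -> relevance in [0, 1]; a stand-in for a
    # rerank model enabled in your Model Garden.
    scored = [(chunk, rerank_score(query, chunk)) for chunk in retrieved_chunks]
    scored = [item for item in scored if item[1] >= threshold]  # drop weak matches
    scored.sort(key=lambda item: item[1], reverse=True)         # best chunks first
    return scored[:top_k]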

Knowledge Settings

By choosing the Knowledge Settings button, you can configure the following settings.

Embedding Models

Here, you can configure which LLM to use to query the Knowledge Base. Your configuration here is similar to any model configuration within Playground, Experiment, Deployment, and Agent, and includes the usual parameters.

Here you can define which model to use. You need to have activated Embedding models within your [Model Garden](/docs/model-garden/overview).

Agentic RAG

Settings > Agentic RAG

To enable Agentic RAG, head to the Settings of your Knowledge Base and toggle on Agentic RAG. You will then be able to configure the related model.

The agent performs two main actions:
  • Document Grading, which ensures relevant chunks are retrieved.
  • Query Refinement, improving the query if needed.
Example
See the screenshot below for how the input query gets refined. The input query “is my suitcase too big?” is reformulated to “luggage size requirements and restrictions for carry-on and checked baggage”.
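Conceptually (this is a sketch, not orq.ai’s internal implementation), the agent loops over retrieval, grading, and refinement until the retrieved chunks are judged relevant:

def agentic_rag(query, search, grade, refine, max_rounds=3):
    # search(query) -> chunks; grade(query, chunks) -> True if relevant;
    # refine(query) -> reformulated query. All three stand in for LLM-backed steps.
    for _ in range(max_rounds):
        chunks = search(query)
        if grade(query, chunks):   # Document Grading
            return query, chunks
        query = refine(query)      # Query Refinement, e.g. "is my suitcase too big?"
                                   # -> "luggage size requirements and restrictions ..."
    return query, search(query)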

Connecting an External Knowledge Base

To connect to an external Knowledge Base, choose the + button on the desired Project.

Choose `External` when connecting your Knowledge Base

The following modal opens to configure an external knowledge base.

Configuration Modal.

Configuration
| Field | Description | Example |
| --- | --- | --- |
| Key | Unique identifier; alphanumeric with hyphens/underscores | external_kb |
| Description | Description | External Knowledge Base |
| Name | Display name | External Knowledge Base Name |
| API URL | URL used to search the knowledge base; must be HTTPS | https://api.example.org/search |
| API Key | API key used to authenticate against the API URL above; Orq will use a Bearer Authorization header to call your API | <API_KEY> |
orq.ai will include the API Key in the Authorization: Bearer <API_KEY> header when calling your endpoint.
API keys are encrypted using workspace-specific keys (AES-256-GCM).
Select Connect to finalize connecting your external Knowledge Base.

API Payloads

Here are example payloads for the request and response expected from your API.

Request payload:
{
  "query": "<string>",
  "top_k": 50,
  "threshold": 0.5,
  "filter_by": {},
  "search_options": {
    "include_vectors": true,
    "include_metadata": true,
    "include_scores": true
  },
  "rerank_config": {
    "model": "cohere/rerank-multilingual-v3.0",
    "threshold": 0,
    "top_k": 10
  }
}
Response payload:
{
  "matches": [
    {
      "id": "<string>",
      "text": "<string>",
      "vector": [
        123
      ],
      "metadata": {},
      "scores": {
        "rerank_score": 123,
        "search_score": 123
      }
    }
  ]
}
The API must respond like a standard Knowledge Base search. To learn more about the expected payload, see our Search API.
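Before the full examples below, here is a minimal FastAPI sketch that accepts the request payload above and returns the expected response shape; the in-memory document list is a placeholder for your real vector store, and the /search path matches the API URL example.

from typing import Optional
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class SearchRequest(BaseModel):
    query: str
    top_k: int = 50
    threshold: float = 0.0
    filter_by: dict = {}
    search_options: dict = {}
    rerank_config: Optional[dict] = None

@app.post("/search")
def search(req: SearchRequest):
    # Stand-in corpus; replace with a real vector store lookup.
    documents = [{"id": "doc-1", "text": "Example chunk about luggage sizes.", "score": 0.91}]
    matches = [
        {
            "id": d["id"],
            "text": d["text"],
            "vector": [],   # populate when search_options.include_vectors is true
            "metadata": {},
            "scores": {"search_score": d["score"], "rerank_score": d["score"]},
        }
        for d in documents
        if d["score"] >= req.threshold  # scores must fall between 0 and 1
    ][: req.top_k]
    return {"matches": matches}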

Example Implementation for an External API

We’ve created example implementations of the External Knowledge Base API.

Get the Code

Install Dependencies

# Install dependencies
pip install -r requirements.txt

Run the Server

# Run the server
uvicorn main:app --reload

Test the API

The API is running at http://localhost:8000. Dynamic documentation will be running at http://localhost:8000/docs.

Get the Code

Install Dependencies

# Install dependencies
npm install

Run the Server

# Run the server
npm run dev

Test the API

The API is running at http://localhost:8000. Dynamic documentation will be running at http://localhost:8000/doc.

Integrating Vector Database Providers

We support providers like Weaviate and Pinecone, as both platforms expose REST APIs that conform to the expected payload format documented above. See our Integration Examples for details.

Common Errors and Troubleshooting

| Scenario | Error Message |
| --- | --- |
| HTTP instead of HTTPS | "External knowledge base URL must use HTTPS protocol" |
| Local/private IP | "External knowledge base URL cannot point to local network" |
| API unreachable | "Failed to verify external knowledge base connectivity" |
| API timeout (>50s) | "External API request timed out" |
Problem: Cannot connect to external API
  1. Verify your API endpoint is publicly accessible via HTTPS
  2. Check your API logs for incoming requests from orq.ai IP addresses
  3. Verify your firewall/security groups allow inbound HTTPS traffic
Problem: API key authentication failing
  1. Verify the API key is correct and has not expired
  2. Check that your API expects Bearer authentication in the Authorization header
  3. Confirm your API key has the necessary permissions to perform searches
Problem: No results returned or poor quality results
  1. Verify your API returns the expected response format (see Response Payload above)
  2. Check that scores.search_score values are between 0 and 1
  3. Test with different threshold values (lower threshold = more results)
  4. If using reranking, ensure both search_score and rerank_score are provided
  5. Verify your external vector database has sufficient indexed documents
Problem: Slow response times
  1. Monitor your external API response times
  2. Consider implementing caching for frequently searched queries
  3. Optimize your vector database indexes
  4. Check if your external API is rate limiting requests

Configuring your External Knowledge Base

Datasource configuration is not accessible for an External Knowledge Base, as the data is hosted outside of Orq.ai.
For the available configuration options, see the Knowledge Settings section above; those features apply to both internal and external Knowledge Bases.
Your External Knowledge Base is connected:

Retrieval Logs

When using a Knowledge Base in a Prompt within Playground, Experiment, Deployment, or Agent, logs are generated that transparently contain details of how Knowledge Bases were accessed. To find them, head to the Logs tab within the module you’re in, then select a log to open its detail panel. The following panel will open. On the right side of the screen, the Retrievals section details which Knowledge Base was used and how it was queried. The Query shows what was used to retrieve the relevant chunks from the Knowledge Base, and the Documents show the retrieved chunks, ordered by relevance score.

User Message Augmentation

On the left side of the panel, you can see how the Knowledge Base variable is modified with the retrieval results, shown in blue. The blue parts of the messages are the retrieval results injected into the user message; they are then used as the data with which the model responds to the user query.
Using the blue text, you can verify that the query is correct and that the expected chunks are loaded into the message.