
Creating a Knowledge Base

Use the + button in a chosen Project and select Knowledge Base > Internal. Press Create Knowledge, and the following modal will appear:
You can only create a Knowledge Base once you have activated an embedding model within your Model Garden.
With the Knowledge Bases API, you can also create Knowledge Bases with code. To learn more, see Creating a Knowledge Base Programmatically.

Adding a source

You are then taken to the source management page. A source represents a document loaded within your Knowledge Base. The document's information is then used when referencing and querying the Knowledge Base. Documents need to be loaded ahead of time so that they can be parsed and cut into chunks. Language models then use the loaded information as a source for answering user queries. To load a new source, select the Add Source button. You can add any document in the following formats: TXT, PDF, DOCX, CSV, or XML.
While you can add any number of sources to a Knowledge Base, a single source document must be no larger than 10 MB.
Once you have selected files from your disk, you will be able to configure how the file is parsed and indexed within the Knowledge Base.

Chunk Settings and Strategies

Chunks are portions of a source document loaded within a Knowledge Base. When adding a new source to a Knowledge Base, you can decide how its information will be chunked. Larger chunks hold more relevant information but imply more token use when sent to a model, increasing the generation cost.
Token

Splits text into chunks based on token count. Best for ensuring chunks fit within LLM context windows and maintaining consistent chunk sizes for embedding models.
  • chunk_size — Maximum tokens per chunk (default: 512)
  • chunk_overlap — Number of tokens to overlap between chunks (default: 0)
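The interplay of chunk_size and chunk_overlap can be sketched as follows; this is a minimal illustration that operates on a pre-tokenized list, not the platform's actual chunker (a real implementation would count tokens with the embedding model's tokenizer):

```python
# Minimal sketch: split a token list into fixed-size chunks, stepping
# back by `chunk_overlap` tokens so neighboring chunks share context.
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=0):
    chunks, start = [], 0
    while start < len(tokens):
        end = start + chunk_size
        chunks.append(tokens[start:end])
        if end >= len(tokens):
            break
        start = end - chunk_overlap  # step back to create the overlap
    return chunks

# With chunk_size=4 and chunk_overlap=1, each chunk repeats the last
# token of the previous one.
print(chunk_tokens(list(range(10)), chunk_size=4, chunk_overlap=1))
# → [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```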
Sentence

Splits text at sentence boundaries while respecting token limits. Ideal for maintaining semantic coherence and readability.
  • chunk_size — Maximum tokens per chunk (default: 512)
  • chunk_overlap — Number of overlapping tokens between chunks (default: 0)
  • min_sentences_per_chunk — Minimum number of sentences per chunk (default: 1)
Recursive

Recursively splits text using a hierarchy of separators (paragraphs, sentences, words). A versatile general-purpose chunker that preserves document structure.
  • chunk_size — Maximum tokens per chunk (default: 512)
  • separators — Hierarchy of separators to use (default: ["\n\n", "\n", " ", ""])
  • min_characters_per_chunk — Minimum characters allowed per chunk (default: 24)
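The recursive strategy can be sketched as follows; character counts stand in for token counts to keep the example self-contained, so this is an illustration of the approach rather than the platform's implementation:

```python
# Sketch of recursive splitting with a separator hierarchy: try the
# coarsest separator first, and recurse with finer separators whenever
# a piece is still too large.
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    if len(text) <= chunk_size:
        return [text]
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # last resort: hard split at chunk_size characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = current + sep + part if current else part
        if len(candidate) <= chunk_size:
            current = candidate  # part still fits: keep accumulating
        else:
            if current:
                chunks.append(current)
            if len(part) > chunk_size:
                # part alone is too big: recurse with finer separators
                chunks.extend(recursive_split(part, chunk_size, rest))
                current = ""
            else:
                current = part
    if current:
        chunks.append(current)
    return chunks

print(recursive_split("aaa bbb ccc", 7))  # → ['aaa bbb', 'ccc']
```

Because paragraph and sentence boundaries are tried before word boundaries, chunks tend to end at natural breaks in the document.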
Semantic

Groups semantically similar sentences using embeddings. Excellent for maintaining topic coherence and context within chunks.
  • chunk_size — Maximum tokens per chunk (default: 512)
  • embedding_model — Embedding model for similarity (required)
  • dimensions — Number of dimensions for embedding output
  • threshold — Similarity threshold (0-1) or "auto" (default: "auto")
  • mode — Chunking mode: "window" or "sentence" (default: "window")
  • similarity_window — Window size for similarity comparison (default: 1)
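The core idea can be sketched in a few lines; here `embed` is a stand-in for a real embedding model call (an assumption for illustration), and a new chunk starts whenever cosine similarity drops below the threshold:

```python
import math

# Sketch of semantic chunking: consecutive sentences stay in the same
# chunk while their embeddings remain similar to the previous sentence.
def semantic_chunks(sentences, embed, threshold=0.5):
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    chunks, current = [], [sentences[0]]
    prev = embed(sentences[0])
    for s in sentences[1:]:
        v = embed(s)
        if cos(prev, v) >= threshold:
            current.append(s)  # same topic: extend the current chunk
        else:
            chunks.append(" ".join(current))  # topic shift: new chunk
            current = [s]
        prev = v
    chunks.append(" ".join(current))
    return chunks

# Toy embedding: one axis per topic, just to show the grouping behavior.
embed = lambda s: [1.0, 0.0] if "cat" in s else [0.0, 1.0]
print(semantic_chunks(["cat a", "cat b", "dog c"], embed))
# → ['cat a cat b', 'dog c']
```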
Agentic

AI-powered intelligent chunking that uses an LLM to determine optimal split points. Best for complex documents requiring intelligent segmentation.
  • model — LLM model to use for chunking (required)
  • chunk_size — Maximum tokens per chunk (default: 1024)
  • candidate_size — Size of candidate splits for LLM evaluation (default: 128)
  • min_characters_per_chunk — Minimum characters allowed per chunk (default: 24)
Fast

High-performance SIMD-optimized byte-level chunking. Best for large files (>1MB) where speed and memory efficiency are critical: 2x faster and 3x less memory than token-based chunking.
  • target_size — Target chunk size in bytes (default: 4096)
  • delimiters — Single-byte delimiters to split on, e.g., "\n.?!" (default: "\n.?")
  • pattern — Multi-byte pattern for splitting, e.g., "▁" for SentencePiece
  • prefix — Attach the delimiter to the start of the next chunk (default: false)
  • consecutive — Split at the start of consecutive delimiter runs (default: false)
  • forward_fallback — Search forward if no delimiter is found backward (default: false)
When to use Fast: large files (>1MB), high-throughput ingestion, memory-constrained environments.
When not to use Fast: when you need precise token counts for embedding models, for small documents where speed isn't critical, or when semantic boundaries matter more than byte boundaries.
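The backward-search behavior can be sketched as follows; this is a simplified pure-Python stand-in for the SIMD-optimized strategy described above, covering only target_size and delimiters:

```python
# Sketch of byte-level chunking: aim for target_size bytes, then scan
# backward within the window for the nearest delimiter and cut there.
def fast_chunks(data: bytes, target_size=4096, delimiters=b"\n.?"):
    chunks, start = [], 0
    n = len(data)
    while start < n:
        end = min(start + target_size, n)
        if end < n:
            # backward search for the closest delimiter in the window
            cut = max((data.rfind(bytes([d]), start, end) for d in delimiters),
                      default=-1)
            if cut > start:
                end = cut + 1  # keep the delimiter at the end of the chunk
        chunks.append(data[start:end])
        start = end
    return chunks

print(fast_chunks(b"ab.cd.ef", target_size=4, delimiters=b"."))
# → [b'ab.', b'cd.', b'ef']
```

Because the cut points are byte offsets rather than token boundaries, chunk sizes are predictable but may not align with what a tokenizer would count, which is exactly the trade-off noted above.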
Strategy Selection Guide
  • Large files (>1MB): Fast (2x faster, 3x less memory)
  • RAG with precise tokens: Token or Recursive
  • Semantic search: Semantic
  • Complex document understanding: Agentic
  • General purpose: Recursive
UI Configuration
Automatically set chunk and preprocessing rules. Users unfamiliar with chunking are recommended to select this option.
  • Maximum Chunk Length — Defines the maximum size of each chunk. The bigger the size, the more information each chunk contains.
  • Chunk Overlap — Defines the number of characters overlapping between neighboring chunks. The higher the value, the more chunks will contain redundant information from one another, but the more likely relevant information will be sent back to models.

Data Cleanup

You can choose to modify the data loaded within your sources; this can be useful for cleaning chunks or anonymizing data. To activate each cleanup, simply toggle the option on within the Data Cleanup panel.

Summary and Cost Estimation

Once your document has been processed, the following summary will be displayed:

Retrieval Settings

You can configure these options on the Knowledge Settings page. Each option will yield different results, depending on your needs.

Search Methods

Search Parameters

All previous search types can be configured with the following parameters:
Top K: sets the number of chunks most similar to the user's question that are returned.
Threshold: controls the relevance of the results on a scale from 0 to 1. Results scoring below the threshold are excluded from retrieval. The closer to 1, the more relevant and narrow the results will be.
Setting too high a threshold can yield few or no results for a search.
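The combined effect of these two parameters can be sketched as a simple post-filter over scored chunks (an illustration of the semantics, not the platform's retrieval code):

```python
# Keep only chunks at or above the threshold, then return the top_k
# highest-scoring ones.
def select_chunks(results, top_k=5, threshold=0.5):
    kept = [(chunk, score) for chunk, score in results if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

results = [("chunk a", 0.9), ("chunk b", 0.3), ("chunk c", 0.7)]
print(select_chunks(results, top_k=1, threshold=0.5))
# → [('chunk a', 0.9)]
```

Note how "chunk b" is dropped by the threshold before top_k is even applied, which is why an overly high threshold can empty the result set.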

Reranking

Reranking invokes a model that analyzes your initial query and the results fetched by the Knowledge Base search. This model scores the similarity of each returned chunk to the user query and ranks the chunks accordingly, ensuring the results are the most relevant for your query.
To use reranking within your Knowledge Base, you must enable at least one Reranking model within your Model Garden.
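The retrieve-then-rerank pattern can be sketched as follows; the word-overlap scorer is a toy stand-in for a real reranking model from your Model Garden:

```python
# Toy relevance scorer: fraction of query words present in the chunk.
# A real deployment would call a reranking model here instead.
def overlap_score(query, chunk):
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

# Re-score the initially retrieved chunks against the query and keep
# the top_k best after sorting by the new score.
def rerank(query, chunks, score_fn, top_k=5):
    scored = [(chunk, score_fn(query, chunk)) for chunk in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

print(rerank("red apple", ["green pear", "red apple pie"], overlap_score, top_k=1))
# → [('red apple pie', 1.0)]
```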

Knowledge Settings

By choosing the Knowledge Settings button, you can configure the following settings.

Embedding Models

Here, you can configure which embedding model is used to query the Knowledge Base. The configuration is similar to any model configuration within Playground, Experiment, Deployment, and Agent, and includes the usual parameters.

Agentic RAG

The agent performs two main actions:
  • Document Grading, which ensures relevant chunks are retrieved.
  • Query Refinement, improving the query if needed.
See the screenshot below for how the input query gets refined. The input query is my suitcase too big? is reformulated to luggage size requirements and restrictions for carry-on and checked baggage.

Chunk Metadata

Each chunk in a Knowledge Base can carry a metadata object: a set of key-value pairs that describe the chunk’s origin, topic, or any custom attribute relevant to your use case. Metadata lets you store all your content in a single Knowledge Base while still scoping retrieval to exactly the right subset of chunks at query time. For example, you can tag each chunk with a client, source, or topic field, then pass a filter at search time to return only the chunks that match. Common use cases:
  • Multi-tenant RAG: tag chunks by client_id to isolate results per customer.
  • Source filtering: filter by filetype or source to restrict results to PDFs, support tickets, or a specific data feed.
  • Topic scoping: tag chunks by topic or category and filter queries to stay on a single subject.

Editing Chunk Metadata

Open a chunk from the datasource view to access the Edit Chunk panel. The panel has three sections:
  • Text: the chunk content.
  • Metadata: a JSON editor pre-filled with the current metadata, or {} if none has been set.
  • Enabled: toggle to enable or disable the chunk.
Edit the metadata JSON directly and save. The metadata object must follow these rules:
  • Valid JSON syntax, otherwise an error is shown.
  • All values must be strings, numbers, or booleans. Nested arrays or objects are not supported.
Pass a filter object to the search API to restrict results to chunks whose metadata matches the specified conditions. See Knowledge Base via the API for examples.
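A search request with a metadata filter might be built like this; the exact filter shape (metadata key mapped to a required value under filter_by) is an assumption based on the request payload shown later in this guide, so check Knowledge Base via the API for the authoritative format:

```python
import json

# Build the JSON body for a Knowledge Base search, optionally scoping
# results to chunks whose metadata matches the given filter.
def build_search_payload(query, top_k=10, metadata_filter=None):
    payload = {"query": query, "top_k": top_k}
    if metadata_filter:
        # Assumed shape: {"client_id": "acme"} keeps only chunks whose
        # metadata has client_id == "acme" (multi-tenant RAG use case).
        payload["filter_by"] = metadata_filter
    return payload

body = build_search_payload("refund policy", top_k=5,
                            metadata_filter={"client_id": "acme"})
print(json.dumps(body))
# Send with e.g. requests.post(search_url, headers=auth_headers,
#                              data=json.dumps(body))
```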

Searching a Knowledge Base

Once your Knowledge Base is populated, you can query it in four ways.
Test your Knowledge Base search directly in the AI Studio using the built-in search panel. This lets you experiment with different search modes, parameters, and queries before integrating the Knowledge Base into Deployments or Agents.
  1. Open Knowledge Settings: navigate to your Knowledge Base and click Knowledge Settings.
  2. Enter your search query: type your query in the Search query field in the right panel.
  3. View results: results appear below showing:
  • Document name (e.g., “Logistics FAQ.docx”)
  • Relevance score for each chunk (e.g., 0.49, 0.48)
  • Chunk content preview
Experiment with different search modes and threshold values to find the optimal configuration for your use case. Lower thresholds return more results but may include less relevant chunks.
Attach a Knowledge Base to a Deployment to automatically retrieve relevant chunks on every call.
  1. Open the Deployment’s configuration and go to Knowledge Bases.
  2. Select Knowledge Base and choose your Knowledge Base.
  3. Set the query type:
    • Last User Message: the user’s latest message is used as the search query automatically.
    • Query: use a predefined query. You can make it dynamic with an input variable such as {{query}}.
  4. Reference the retrieved chunks in your prompt with the {{knowledge_base_key}} syntax. If not explicitly referenced, the chunks are appended to the end of the system message.
Add a Knowledge Base as context to an Agent. Unlike Deployments, the Agent only queries the Knowledge Base when it determines it is necessary, using the query_knowledge_base tool automatically.
  1. In the Agent configuration, go to the Context section and click Add context.
  2. Select your Knowledge Base.
  3. In the Agent’s Instructions, explicitly tell it to use the Knowledge Base. For example:
“First use retrieve_knowledge_bases to see what knowledge sources are available, then use query_knowledge_base to find relevant information before answering.”
The Knowledge Base description must also be explicit so the Agent can identify the right source to query.
To learn more, see Knowledge Bases with Agents.
Query a Knowledge Base directly using the Search Knowledge Base API, without going through a Deployment or Agent.
curl 'https://my.orq.ai/v2/knowledge/{knowledge_id}/search' \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-raw '{
    "query": "Your search query",
    "top_k": 20,
    "search_options": {
        "include_vectors": true,
        "include_metadata": true,
        "include_scores": false
    }
}' \
--compressed
For advanced options including metadata filtering, see Knowledge Base via the API.

Connecting an External Knowledge Base

To connect to an external Knowledge Base, choose the + button on the desired Project.
The following modal opens to configure an external knowledge base.
Connect External KB
Configuration
  • Key — Unique identifier; alphanumeric with hyphens/underscores (e.g., external_kb)
  • Description — A description of the Knowledge Base (e.g., External Knowledge Base)
  • Name — Display name (e.g., External Knowledge Base Name)
  • API URL — URL used to search the knowledge base; must be HTTPS (e.g., https://api.example.org/search)
  • API Key — Authentication API key for the API URL above (e.g., <API_KEY>)
orq.ai will include the API key in the Authorization: Bearer <API_KEY> header when calling your endpoint.
API keys are encrypted using workspace-specific keys (AES-256-GCM).
Select Connect to finalize connecting your external Knowledge Base.

API Payloads

Here are example payloads for the request and response expected from your API.

Request:
{
  "query": "<string>",
  "top_k": 50,
  "threshold": 0.5,
  "filter_by": {},
  "search_options": {
    "include_vectors": true,
    "include_metadata": true,
    "include_scores": true
  },
  "rerank_config": {
    "model": "cohere/rerank-multilingual-v3.0",
    "threshold": 0,
    "top_k": 10
  }
}

Response:
{
  "matches": [
    {
      "id": "<string>",
      "text": "<string>",
      "vector": [
        123
      ],
      "metadata": {},
      "scores": {
        "rerank_score": 123,
        "search_score": 123
      }
    }
  ]
}
The API must respond like a standard Knowledge Base search. To learn more about the expected payload, see our Search API.
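The contract can be sketched framework-agnostically as a handler that maps the request body to the matches shape shown above; the word-overlap scoring and in-memory corpus are stand-ins for a real vector search behind your API:

```python
# Minimal sketch of the search endpoint an external Knowledge Base must
# expose. `corpus` is a list of {"id": ..., "text": ...} records; the
# scoring is a toy word-overlap stand-in for a real vector similarity.
def search_handler(body, corpus):
    query_terms = set(body["query"].lower().split())
    matches = []
    for doc in corpus:
        terms = set(doc["text"].lower().split())
        score = (len(query_terms & terms) / len(query_terms)
                 if query_terms else 0.0)
        if score >= body.get("threshold", 0.0):
            matches.append({
                "id": doc["id"],
                "text": doc["text"],
                "metadata": doc.get("metadata", {}),
                "scores": {"search_score": score},
            })
    # Highest-scoring chunks first, truncated to top_k.
    matches.sort(key=lambda m: m["scores"]["search_score"], reverse=True)
    return {"matches": matches[: body.get("top_k", 10)]}

corpus = [{"id": "1", "text": "carry on luggage size"},
          {"id": "2", "text": "refund policy"}]
body = {"query": "luggage size", "top_k": 1, "threshold": 0.5}
print(search_handler(body, corpus))
```

Wrapping this handler in an HTTPS route (e.g., with FastAPI or Express, as in the example implementations below) is all that is needed for orq.ai to query it.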

Example Implementation for an External API

We’ve created example implementations of the External Knowledge Base API.

Get the Code

Install Dependencies

# Install dependencies
pip install -r requirements.txt

Run the Server

# Run the server
uvicorn main:app --reload

Test the API

The API is running at http://localhost:8000. Dynamic documentation will be available at http://localhost:8000/docs.

Get the Code

Install Dependencies

# Install dependencies
npm install

Run the Server

# Run the server
npm run dev

Test the API

The API is running at http://localhost:8000. Dynamic documentation will be available at http://localhost:8000/doc.

Integrating Vector Database Providers

We support providers like Weaviate and Pinecone, as both platforms expose REST APIs that conform to the expected payload format documented above. See the Integration Examples.

Common Errors and Troubleshooting

  • HTTP instead of HTTPS: "External knowledge base URL must use HTTPS protocol"
  • Local/private IP: "External knowledge base URL cannot point to local network"
  • API unreachable: "Failed to verify external knowledge base connectivity"
  • API timeout (>50s): "External API request timed out"
Problem: Cannot connect to external API
  1. Verify your API endpoint is publicly accessible via HTTPS
  2. Check your API logs for incoming requests from orq.ai IP addresses
  3. Verify your firewall/security groups allow inbound HTTPS traffic
Problem: API key authentication failing
  1. Verify the API key is correct and has not expired
  2. Check that your API expects Bearer authentication in the Authorization header
  3. Confirm your API key has the necessary permissions to perform searches
Problem: No results returned or poor quality results
  1. Verify your API returns the expected response format (see Response Payload above)
  2. Check that scores.search_score values are between 0 and 1
  3. Test with different threshold values (lower threshold = more results)
  4. If using reranking, ensure both search_score and rerank_score are provided
  5. Verify your external vector database has sufficient indexed documents
Problem: Slow response times
  1. Monitor your external API response times
  2. Consider implementing caching for frequently searched queries
  3. Optimize your vector database indexes
  4. Check if your external API is rate limiting requests

Configuring your External Knowledge Base

Datasource configuration is not accessible for External Knowledge Bases, as the data is hosted outside of Orq.ai.
The available configurations are:
For detailed configuration options, scroll to the Knowledge Settings section above to see all available features for both internal and external Knowledge Bases.
Your External Knowledge Base is connected:

Retrieval Logs

When using a Knowledge Base in a prompt within Playground, Experiment, Deployment, or Agent, logs are generated that transparently contain details of how Knowledge Bases were accessed. To find logs, head to the Logs tab within the module you're in, then select a log to open its detail panel. On the right side of the screen, the Retrievals section details the Knowledge Base used and how it was queried. The Query shows what was used to retrieve the relevant chunks from the Knowledge Base, and the Documents show the retrieved chunks, ordered by relevance score.

User Message Augmentation

On the left side of the panel, you can see how the Knowledge Base variable is modified with the retrieval results in blue. The blue parts of the messages are the retrieval results injected into the user message. They will then be used as data with which the model can respond to the user query.
Using the blue text, you can verify that the query is correct and that the expected chunks are loaded into the message.