Searching a Knowledge Base
Introduction
Knowledge Base search in Orq.ai lets you query your uploaded documents and data using vector similarity search. You can run semantic searches across your content and apply metadata filters to narrow results to specific subsets of your data. This lets you build RAG (Retrieval-Augmented Generation) applications that retrieve relevant information from your knowledge base to enhance LLM responses.
You can search knowledge bases using the dedicated Search Knowledge Base API, which provides programmatic access to perform queries with optional metadata filtering and search options.
Basic Search
A basic search queries your knowledge base using semantic similarity to find the most relevant chunks for your query:
curl --location 'https://api.orq.ai/v2/knowledge/KNOWLEDGE_BASE_ID/search' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $ORQ_API_KEY" \
--data '{
"query": "What are the benefits of machine learning?"
}'
from orq_ai_sdk import Orq
import os
client = Orq(api_key=os.getenv("ORQ_API_KEY"))
results = client.knowledge.search(
knowledge_id="KNOWLEDGE_BASE_ID",
query="What are the benefits of machine learning?"
)
import { Orq } from '@orq-ai/node';
const orq = new Orq({
apiKey: process.env.ORQ_API_KEY,
});
const results = await orq.knowledge.search({
knowledgeId: 'KNOWLEDGE_BASE_ID',
query: 'What are the benefits of machine learning?'
});
The API returns chunks matching your query from the Knowledge Base.
{
"matches": [
{
"id": "01K2XYMZB25NC01T4BKWARQG5M",
"text": "Machine learning algorithms excel at identifying complex patterns within vast datasets, enabling computers to make predictions and decisions without explicit programming. From recommendation systems that suggest your next favorite movie to autonomous vehicles navigating city streets, ML transforms raw data into actionable insights. Deep neural networks mimic brain structures, while ensemble methods combine multiple models for enhanced accuracy and robustness across diverse applications."
},
{
"id": "01K2XYMZB25NC01T4BKWARQG5V",
"text": "The neural network's gradient descent algorithm struggled to converge during training, prompting Sarah to adjust the learning rate from 0.01 to 0.001. Her convolutional layers were overfitting on the image classification dataset, so she implemented dropout regularization and data augmentation techniques. After 50 epochs, the validation accuracy plateaued at 87%, suggesting she needed more diverse training samples or perhaps a deeper architecture with residual connections to break through the performance barrier."
}
...
]
}
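A common next step in a RAG flow is to join the returned chunk texts into a context string for an LLM prompt. The following is a minimal sketch that works on a plain dictionary in the response shape shown above. The `build_context` helper and its `max_chunks` parameter are hypothetical names used for illustration, and the sample texts are shortened.

```python
import json

# Sample response in the shape shown above (texts shortened for illustration).
raw_response = """
{
  "matches": [
    {"id": "01K2XYMZB25NC01T4BKWARQG5M",
     "text": "Machine learning algorithms excel at identifying complex patterns."},
    {"id": "01K2XYMZB25NC01T4BKWARQG5V",
     "text": "The neural network's gradient descent algorithm struggled to converge."}
  ]
}
"""


def build_context(response: dict, max_chunks: int = 3) -> str:
    """Join the text of the first `max_chunks` matches into one context string."""
    matches = response.get("matches", [])
    return "\n\n".join(match["text"] for match in matches[:max_chunks])


response = json.loads(raw_response)
context = build_context(response)
print(context)
```

The resulting string can be placed in the prompt alongside the user's question.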
Filter by metadata
Every chunk in the knowledge base can carry metadata key-value pairs that store additional information. You can add this metadata via the Chunks API or in the Studio.
When searching the knowledge base, you can include a metadata filter to limit the search to chunks matching a filter expression.
A search without a metadata filter ignores metadata and runs against the entire Knowledge Base.
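The difference between a filtered and an unfiltered search comes down to whether the request body carries a filter. The sketch below assembles such a body as a plain dictionary, using the `query`, `filter_by`, and `search_options` field names that appear in the request examples on this page. The `build_search_body` helper is a hypothetical name, not part of any SDK.

```python
import json
from typing import Optional


def build_search_body(
    query: str,
    filter_by: Optional[dict] = None,
    search_options: Optional[dict] = None,
) -> dict:
    """Build a search request body, adding optional fields only when given."""
    body = {"query": query}
    if filter_by:
        # With a filter, the search is limited to chunks matching the expression.
        body["filter_by"] = filter_by
    if search_options:
        body["search_options"] = search_options
    return body


# Unfiltered: searches the entire knowledge base.
plain_body = build_search_body("What are the benefits of machine learning?")

# Filtered: only chunks whose "edition" metadata is 2020 or later.
filtered_body = build_search_body(
    "What are the benefits of machine learning?",
    filter_by={"edition": {"gte": 2020}},
)
print(json.dumps(filtered_body, indent=2))
```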
Metadata types
Metadata payloads must be key-value pairs in a JSON object. Keys must be strings, and values can be one of the following data types:
- String
- Number
- Boolean
For example, the following would be valid metadata payloads:
{
"page_id": "page_x1i2j3",
"edition": 2020
}
{
"color": "blue",
"is_premium_info": true
}
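Before attaching metadata to a chunk, it can help to check a payload against these rules locally. The following is a minimal sketch, assuming a flat dictionary whose values must be strings, numbers, or booleans. The `validate_metadata` function is a hypothetical helper and does not reflect how the service itself validates payloads.

```python
def validate_metadata(metadata) -> list:
    """Return a list of problems found in a metadata payload (empty if valid)."""
    if not isinstance(metadata, dict):
        return ["metadata must be a JSON object (a dict in Python)"]

    errors = []
    for key, value in metadata.items():
        if not isinstance(key, str):
            errors.append(f"key {key!r} is not a string")
        # bool is a subclass of int in Python, so it is covered either way.
        if not isinstance(value, (str, int, float, bool)):
            errors.append(
                f"value for {key!r} has unsupported type {type(value).__name__}"
            )
    return errors


print(validate_metadata({"page_id": "page_x1i2j3", "edition": 2020}))
print(validate_metadata({"tags": ["a", "b"], "color": "blue"}))
```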
Metadata constraints
- Use metadata for concise, discrete filter attributes to maximize search performance.
- Avoid placing large text blobs in metadata; long strings result in slower queries.
- Keep each field’s data type consistent. Our system attempts to coerce mismatched values during ingestion; values that cannot be coerced are discarded and omitted from the chunk.
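To catch inconsistent field types before ingestion, you could scan your metadata payloads locally. The sketch below reports any field that appears with more than one data type. The `find_inconsistent_fields` helper is a hypothetical name, and the check is an illustration only; it does not reproduce the coercion the service performs.

```python
def kind(value) -> str:
    """Classify a metadata value as one of the supported data types."""
    # Check bool first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "unsupported"


def find_inconsistent_fields(payloads: list) -> dict:
    """Map each field seen with more than one type to its sorted type names."""
    seen = {}
    for payload in payloads:
        for key, value in payload.items():
            seen.setdefault(key, set()).add(kind(value))
    return {key: sorted(kinds) for key, kinds in seen.items() if len(kinds) > 1}


payloads = [
    {"page_id": "page_x1i2j3", "edition": 2020},
    {"page_id": "page_y1xiijas", "edition": "2021"},  # string instead of number
]
print(find_inconsistent_fields(payloads))
```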
Metadata filter expressions
Orq.ai Knowledge Base filtering is based on MongoDB’s query and projection operators. We currently support a subset of those operators:
Filter | Description | Example | Supported types |
---|---|---|---|
$eq | Search chunks with metadata values that are equal to a specified value. | {"page_id": {"eq": "page_x1i2j3"}} | Number, string, boolean |
$ne | Search chunks with metadata values that are not equal to a specified value. | {"page_id": {"ne": "page_x1i2j3"}} | Number, string, boolean |
$gt | Search chunks with metadata values that are greater than a specified value. | {"edition": {"gt": 2019}} | Number |
$gte | Search chunks with metadata values that are greater than or equal to a specified value. | {"edition": {"gte": 2020}} | Number |
$lt | Search chunks with metadata values that are less than a specified value. | {"edition": {"lt": 2022}} | Number |
$lte | Search chunks with metadata values that are less than or equal to a specified value. | {"edition": {"lte": 2020}} | Number |
$in | Search chunks with metadata values that are in a specified array. | {"page_id": {"in": ["page_x1i2j3", "page_y1xiijas"]}} | String, number, boolean |
$nin | Search chunks with metadata values that are not in a specified array. | {"page_id": {"nin": ["comedy", "documentary"]}} | String, number, boolean |
$and | Joins query clauses with a logical AND. | {"$and": [{"page_id": {"eq": "page_x1i2j3"}}, {"edition": {"gte": 2020}}]} | Object (array of filter expressions) |
$or | Joins query clauses with a logical OR. | {"$or": [{"page_id": {"eq": "page_x1i2j3"}}, {"edition": {"gte": 2020}}]} | Object (array of filter expressions) |
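To make the operator semantics concrete, the sketch below evaluates a filter expression against a single chunk's metadata in plain Python. It is a local illustration only, not the service's implementation. It accepts operator names with or without the leading `$`, and it assumes that a chunk missing the filtered field never matches; both choices are assumptions made for this example.

```python
import operator

# Comparison operators from the table above, keyed without the "$" prefix.
_COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "in": lambda value, options: value in options,
    "nin": lambda value, options: value not in options,
}


def matches(metadata: dict, expr: dict) -> bool:
    """Return True if `metadata` satisfies every clause in `expr`."""
    for key, condition in expr.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            # Assumption: a chunk without the field does not match.
            if key not in metadata:
                return False
            for op, operand in condition.items():
                compare = _COMPARATORS[op.lstrip("$")]
                if not compare(metadata[key], operand):
                    return False
    return True


chunk = {"page_id": "page_x1i2j3", "edition": 2020}
print(matches(chunk, {"edition": {"gte": 2020}}))
print(matches(chunk, {"$and": [{"page_id": {"eq": "other"}}, {"edition": {"gte": 2020}}]}))
```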
Search examples
curl --location 'https://api.orq.ai/v2/knowledge/01J58RKRX4AWMSBMJVPYY1N2CG/search' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $ORQ_API_KEY" \
--data '{
"query": "What are the top editions of science fiction books?",
"filter_by": {
"edition": {
"gte": 2020
}
},
"search_options":{
"include_metadata": true,
"include_vectors": true,
"include_scores": true
}
}'
from orq_ai_sdk import Orq
import os
client = Orq(api_key=os.getenv("ORQ_API_KEY"))
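results = client.knowledge.search(
knowledge_id="unique_knowledge_id",
query="What are the top editions of science fiction books?",
# The filter belongs in filter_by (as in the request body above), not in query.
filter_by={"edition": {"gte": 2020}},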
search_options={
"include_metadata": True,
"include_vectors": True,
"include_scores": True,
},
)
import { Orq } from '@orq-ai/node';
const orq = new Orq({
apiKey: process.env.ORQ_API_KEY,
});
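const results = await orq.knowledge.search({
knowledgeId: 'unique_knowledge_id',
query: 'What are the top editions of science fiction books?',
// The filter belongs in filterBy (filter_by in the REST body), not in query.
filterBy: { edition: { gte: 2020 } },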
searchOptions: {
includeMetadata: true,
includeVectors: true,
includeScores: true,
},
});