Searching a knowledge base
Filter by metadata
Every chunk in the knowledge base can carry metadata key-value pairs that store additional information. You can add this metadata via the Chunks API or via the Studio.
When searching the knowledge base, you can include a metadata filter to limit the search to chunks matching a filter expression.
A search without a metadata filter ignores metadata and searches the entire Knowledge Base.
Metadata types
Metadata payloads must be key-value pairs in a JSON object. Keys must be strings, and values can be one of the following data types:
- String
- Number
- Boolean
For example, the following would be valid metadata payloads:
{
"page_id": "page_x1i2j3",
"edition": 2020
}
{
"color": "blue",
"is_premium_info": true
}
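As a rough illustration of the rules above, the following sketch checks a payload locally before it is sent. The helper name `validate_metadata` is hypothetical and not part of any orq.ai SDK; it only encodes the stated constraints (string keys; string, number, or boolean values).

```python
def validate_metadata(payload):
    """Return a list of problems; an empty list means the payload looks valid.

    Hypothetical local helper, not an orq.ai API.
    """
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    problems = []
    for key, value in payload.items():
        if not isinstance(key, str):
            problems.append(f"key {key!r} is not a string")
        # Allowed value types: string, number (int or float), boolean.
        if not isinstance(value, (str, bool, int, float)):
            problems.append(
                f"value for {key!r} has unsupported type {type(value).__name__}"
            )
    return problems


print(validate_metadata({"page_id": "page_x1i2j3", "edition": 2020}))
print(validate_metadata({"tags": ["a", "b"]}))
```

Nested objects, arrays, and null values are reported as problems here because they fall outside the three listed types.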
Metadata constraints
- Use metadata for concise, discrete filter attributes to maximize search performance.
- Avoid placing large text blobs in metadata; long strings result in slower queries.
- Keep each field’s data type consistent. The system attempts to coerce mismatched values during ingestion; values that cannot be coerced are discarded and omitted from the chunk.
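One way to catch inconsistent field types before ingestion is a small local check like the one below. This is a sketch under assumptions: `find_inconsistent_fields` and `json_type` are hypothetical helpers, and the check does not reproduce the platform's coercion rules, which are not described here.

```python
def json_type(value):
    """Map a Python value to one of the supported metadata type names."""
    if isinstance(value, bool):  # bool must be tested before int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "unsupported"


def find_inconsistent_fields(chunks_metadata):
    """Return {field: set of type names} for fields seen with more than one type."""
    seen = {}
    for metadata in chunks_metadata:
        for key, value in metadata.items():
            seen.setdefault(key, set()).add(json_type(value))
    return {key: types for key, types in seen.items() if len(types) > 1}


chunks = [{"edition": 2020}, {"edition": "2021"}, {"color": "blue"}]
print(find_inconsistent_fields(chunks))
```

In this example `edition` is flagged because it appears once as a number and once as a string.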
Metadata filter expressions
orq.ai Knowledge Base filtering is based on MongoDB’s query and projection operators. A subset of those operators is currently supported:
Filter | Description | Example | Supported types |
---|---|---|---|
$eq | Search chunks with metadata values that are equal to a specified value. | {"page_id": {"eq": "page_x1i2j3"}} | Number, string, boolean |
$ne | Search chunks with metadata values that are not equal to a specified value. | {"page_id": {"ne": "page_x1i2j3"}} | Number, string, boolean |
$gt | Search chunks with metadata values that are greater than a specified value. | {"edition": {"gt": 2019}} | Number |
$gte | Search chunks with metadata values that are greater than or equal to a specified value. | {"edition": {"gte": 2020}} | Number |
$lt | Search chunks with metadata values that are less than a specified value. | {"edition": {"lt": 2022}} | Number |
$lte | Search chunks with metadata values that are less than or equal to a specified value. | {"edition": {"lte": 2020}} | Number |
$in | Search chunks with metadata values that are in a specified array. | {"page_id": {"in": ["page_x1i2j3", "page_y1xiijas"]}} | String, number, boolean |
$nin | Search chunks with metadata values that are not in a specified array. | {"page_id": {"nin": ["page_x1i2j3", "page_y1xiijas"]}} | String, number, boolean |
$and | Joins query clauses with a logical AND. | {"$and": [{"page_id": {"eq": "page_x1i2j3"}}, {"edition": {"gte": 2020}}]} | Object (array of filter expressions) |
$or | Joins query clauses with a logical OR. | {"$or": [{"page_id": {"eq": "page_x1i2j3"}}, {"edition": {"gte": 2020}}]} | Object (array of filter expressions) |
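To make the operator semantics in the table concrete, here is a minimal local evaluator that tests a single chunk's metadata against a filter expression. It is an illustrative sketch, not how orq.ai evaluates filters. Two assumptions are built in: a field missing from the metadata never matches, and comparison operators are accepted with or without a leading `$`. The function names are hypothetical.

```python
import operator

COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "in": lambda value, options: value in options,
    "nin": lambda value, options: value not in options,
}


def field_matches(metadata, field, condition):
    """Check one field against a condition such as {"gte": 2020}."""
    if field not in metadata:
        return False  # assumption: a missing field never matches
    for op, operand in condition.items():
        compare = COMPARATORS[op.lstrip("$")]
        try:
            if not compare(metadata[field], operand):
                return False
        except TypeError:
            return False  # e.g. comparing a string with a number
    return True


def matches(metadata, expression):
    """Return True if the metadata satisfies every clause of the expression."""
    for key, condition in expression.items():
        if key == "$and":
            ok = all(matches(metadata, clause) for clause in condition)
        elif key == "$or":
            ok = any(matches(metadata, clause) for clause in condition)
        else:
            ok = field_matches(metadata, key, condition)
        if not ok:
            return False
    return True


chunk = {"page_id": "page_x1i2j3", "edition": 2020}
print(matches(chunk, {"edition": {"gte": 2020}}))
print(matches(chunk, {"$and": [{"page_id": {"eq": "page_x1i2j3"}},
                               {"edition": {"gte": 2021}}]}))
```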
Search examples
curl --location 'http://localhost:4200/v2/knowledge/01J58RKRX4AWMSBMJVPYY1N2CG/search' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $ORQ_API_KEY" \
--data '{
"query": "What are the top editions of science fiction books",
"filter_by": {
"edition": {
"gte": 2020
}
},
"search_options":{
"include_metadata": true,
"include_vectors": true,
"include_scores": true
}
}'
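import os

from orq_ai_sdk import Orq

client = Orq(api_key=os.getenv("ORQ_API_KEY"))

# Mirrors the curl request above: `query` is the search text and
# `filter_by` carries the metadata filter expression.
client.knowledge.search(
    knowledge_id="unique_knowledge_id",
    query="What are the top editions of science fiction books",
    filter_by={"edition": {"gte": 2020}},
    search_options={
        "include_metadata": True,
        "include_vectors": True,
        "include_scores": True,
    },
)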
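import { Orq } from '@orq-ai/node';

// Read the API key from the environment rather than hard-coding it.
const orq = new Orq({
  apiKey: process.env.ORQ_API_KEY,
});

// Mirrors the curl request above: `query` is the search text and
// `filterBy` carries the metadata filter expression.
orq.knowledge.search({
  knowledgeId: 'unique_knowledge_id',
  query: 'What are the top editions of science fiction books',
  filterBy: { edition: { gte: 2020 } },
  searchOptions: {
    includeMetadata: true,
    includeVectors: true,
    includeScores: true,
  },
});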
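When a search needs more than one condition, the request body can be assembled programmatically. The sketch below builds a body with the same field names as the curl request (`query`, `filter_by`, `search_options`) and combines two conditions with `$and`. The helper `build_search_body` is hypothetical, and omitting `filter_by` for an unfiltered search is an assumption based on the description of searches without a filter.

```python
import json


def build_search_body(query, filter_by=None, include_metadata=True,
                      include_vectors=False, include_scores=True):
    """Assemble a search request body; hypothetical helper, not an SDK call."""
    body = {
        "query": query,
        "search_options": {
            "include_metadata": include_metadata,
            "include_vectors": include_vectors,
            "include_scores": include_scores,
        },
    }
    if filter_by:
        # Only send a filter when one is given; otherwise the search
        # runs over the entire knowledge base.
        body["filter_by"] = filter_by
    return body


compound_filter = {
    "$and": [
        {"page_id": {"in": ["page_x1i2j3", "page_y1xiijas"]}},
        {"edition": {"gte": 2020}},
    ]
}
body = build_search_body("science fiction editions", compound_filter)
print(json.dumps(body, indent=2))
```

The resulting JSON can be passed as the `--data` payload of the curl request.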