Searching a knowledge base

Filter by metadata

Every chunk in the knowledge base can carry metadata key-value pairs that store additional information. You can add this metadata through the Chunks API or in the Studio.

When searching the knowledge base, you can include a metadata filter to limit the search to chunks matching a filter expression.

📘

A search without a metadata filter ignores metadata and runs against the entire Knowledge Base.

Metadata types

Metadata payloads must be key-value pairs in a JSON object. Keys must be strings, and values can be one of the following data types:

  • String
  • Number
  • Boolean

For example, the following would be valid metadata payloads:

{
    "page_id": "page_x1i2j3",
    "edition": 2020
}

{
    "color": "blue",
    "is_premium_info": true
}
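
The type rules above can be checked locally before a payload is uploaded. The following is a minimal sketch, not part of the orq.ai SDK: `validate_metadata` is a hypothetical helper name, and it assumes that nested objects, arrays, and null are rejected because they do not appear in the list of supported types.

```python
from numbers import Real


def validate_metadata(payload):
    """Return a list of problems found in a metadata payload (empty if valid).

    Hypothetical helper, not part of the orq.ai SDK. It applies the rules
    stated above: a JSON object with string keys and string, number, or
    boolean values.
    """
    if not isinstance(payload, dict):
        return ["payload must be a JSON object (dict)"]

    problems = []
    for key, value in payload.items():
        if not isinstance(key, str):
            problems.append(f"key {key!r} is not a string")
        # bool is listed for readability; it is a subclass of int in Python.
        elif not isinstance(value, (str, bool, Real)):
            problems.append(
                f"value for {key!r} has unsupported type {type(value).__name__}"
            )
    return problems
```

Both example payloads above produce an empty list, while a payload such as `{"tags": ["a", "b"]}` yields one problem because arrays are not a supported value type.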

Metadata constraints

  • Use metadata for concise, discrete filter attributes to maximize search performance.
  • Avoid placing large text blobs in metadata; long strings result in slower queries.
  • Keep each field’s data type consistent. The system attempts to coerce mismatched values during ingestion; values that cannot be coerced are discarded and omitted from the chunk.
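
Because values that cannot be coerced are dropped at ingestion, it may help to detect mixed types before uploading. The sketch below is a hypothetical pre-ingestion check (the function name and the type labels are assumptions, not orq.ai behavior); it reports every metadata key that appears with more than one data type across a batch of chunks.

```python
from collections import defaultdict


def find_inconsistent_fields(payloads):
    """Map each metadata key to the type names seen, for keys with more than one.

    Hypothetical pre-ingestion check. int and float are both treated as
    "number", and bool is kept separate from number.
    """
    def type_name(value):
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            return "boolean"
        if isinstance(value, (int, float)):
            return "number"
        if isinstance(value, str):
            return "string"
        return type(value).__name__

    seen = defaultdict(set)
    for payload in payloads:
        for key, value in payload.items():
            seen[key].add(type_name(value))

    return {key: sorted(types) for key, types in seen.items() if len(types) > 1}
```

For instance, a batch in which `edition` is `2020` in one chunk and `"2021"` in another would be reported as `{"edition": ["number", "string"]}`.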

Metadata filter expressions

orq.ai Knowledge Base filtering is based on MongoDB’s query and projection operators. We currently support a subset of those operators:

| Filter | Description | Example | Supported types |
|--------|-------------|---------|-----------------|
| $eq | Search chunks with metadata values that are equal to a specified value. | {"page_id": {"eq": "page_x1i2j3"}} | Number, string, boolean |
| $ne | Search chunks with metadata values that are not equal to a specified value. | {"page_id": {"ne": "page_x1i2j3"}} | Number, string, boolean |
| $gt | Search chunks with metadata values that are greater than a specified value. | {"edition": {"gt": 2019}} | Number |
| $gte | Search chunks with metadata values that are greater than or equal to a specified value. | {"edition": {"gte": 2020}} | Number |
| $lt | Search chunks with metadata values that are less than a specified value. | {"edition": {"lt": 2022}} | Number |
| $lte | Search chunks with metadata values that are less than or equal to a specified value. | {"edition": {"lte": 2020}} | Number |
| $in | Search chunks with metadata values that are in a specified array. | {"page_id": {"in": ["page_x1i2j3", "page_y1xiijas"]}} | String, number, boolean |
| $nin | Search chunks with metadata values that are not in a specified array. | {"page_id": {"nin": ["comedy", "documentary"]}} | String, number, boolean |
| $and | Joins query clauses with a logical AND. | {"$and": [{"page_id": {"eq": "page_x1i2j3"}}, {"edition": {"gte": 2020}}]} | Object (array of filter expressions) |
| $or | Joins query clauses with a logical OR. | {"$or": [{"page_id": {"eq": "page_x1i2j3"}}, {"edition": {"gte": 2020}}]} | Object (array of filter expressions) |
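
To make the semantics of these operators concrete, here is a small local evaluator that applies a filter expression to one chunk's metadata. It is an illustrative sketch only, not how the service is implemented. Assumptions: comparison operators are accepted with or without the `$` prefix (the examples in this section use both spellings), a chunk that lacks the filtered field never matches, and the compared values have compatible types.

```python
import operator

# Comparison operators from the list above, keyed without the "$" prefix.
_COMPARISONS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "in": lambda value, options: value in options,
    "nin": lambda value, options: value not in options,
}


def matches(metadata, expression):
    """Return True if a chunk's metadata satisfies a filter expression.

    Local illustration only. Assumption: a chunk that lacks the filtered
    field never matches; the real service may behave differently.
    """
    for key, condition in expression.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in condition):
                return False
        else:
            if key not in metadata:
                return False
            # Every operator listed for a field must hold.
            for op, operand in condition.items():
                compare = _COMPARISONS[op.lstrip("$")]
                if not compare(metadata[key], operand):
                    return False
    return True
```

With a chunk whose metadata is `{"page_id": "page_x1i2j3", "edition": 2020}`, the call `matches(chunk, {"edition": {"gte": 2020}})` returns `True`, and `matches(chunk, {"edition": {"lt": 2020}})` returns `False`.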

Search examples

cURL

curl --location 'http://localhost:4200/v2/knowledge/01J58RKRX4AWMSBMJVPYY1N2CG/search' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $ORQ_API_KEY" \
--data '{
    "query": "What are the top editions of science fiction books?",
    "filter_by": {
        "edition": {
            "gte": 2020
        }
    },
    "search_options":{
        "include_metadata": true,
        "include_vectors": true,
        "include_scores": true
    }
}'
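
Python

import os

from orq_ai_sdk import Orq

client = Orq(api_key=os.getenv("ORQ_API_KEY"))

# Mirrors the cURL request: `query` is the search text and `filter_by`
# carries the metadata filter. This assumes the SDK exposes the same
# fields as the REST request body.
client.knowledge.search(
    knowledge_id="unique_knowledge_id",
    query="What are the top editions of science fiction books?",
    filter_by={"edition": {"gte": 2020}},
    search_options={
        "include_metadata": True,
        "include_vectors": True,
        "include_scores": True,
    },
)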
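
Node.js

import { Orq } from '@orq-ai/node';

// Read the API key from the environment instead of hard-coding it.
const orq = new Orq({
  apiKey: process.env.ORQ_API_KEY,
});

// Mirrors the cURL request: `query` is the search text and `filterBy`
// carries the metadata filter. This assumes the SDK exposes the same
// fields as the REST request body, in camelCase.
orq.knowledge.search({
  knowledgeId: 'unique_knowledge_id',
  query: 'What are the top editions of science fiction books?',
  filterBy: { edition: { gte: 2020 } },
  searchOptions: {
    includeMetadata: true,
    includeVectors: true,
    includeScores: true,
  },
});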
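
The examples above filter on a single field. As a rough sketch of how a combined filter could be placed in the same request body, the snippet below builds the JSON payload with the standard library only and sends nothing. `build_search_body` is a hypothetical helper, the field names `query`, `filter_by`, and `search_options` are taken from the cURL example, and whether the API accepts a `search_options` object with only `include_metadata` set is an assumption.

```python
import json


def build_search_body(query, filter_by=None, include_metadata=True):
    """Build the JSON body for a knowledge base search request.

    Hypothetical helper; the field names are copied from the cURL example.
    """
    body = {
        "query": query,
        "search_options": {"include_metadata": include_metadata},
    }
    # Omit the filter entirely to search the whole knowledge base.
    if filter_by is not None:
        body["filter_by"] = filter_by
    return body


body = build_search_body(
    "science fiction editions",
    filter_by={
        "$and": [
            {"edition": {"gte": 2020}},
            {"is_premium_info": {"eq": True}},
        ]
    },
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `--data` argument of the cURL request shown earlier.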