DSPy
Integrate Orq.ai with Stanford DSPy for programmatic prompting observability using OpenTelemetry
Getting Started
Stanford DSPy is a framework for algorithmically optimizing LM prompts and weights: you program your pipeline rather than hand-craft prompts. Tracing DSPy with Orq.ai gives you insight into signature execution, module performance, optimization runs, and few-shot learning effectiveness, so you can tune your programmatic LLM applications based on real execution data.
Prerequisites
Before you begin, ensure you have:
- An Orq.ai account and API Key
- Python 3.8+
- DSPy installed in your project
- API keys for your chosen LLM providers
Install Dependencies
# Core DSPy (formerly published as dspy-ai) and OpenTelemetry packages
pip install dspy opentelemetry-sdk opentelemetry-exporter-otlp
# OpenInference instrumentation for DSPy
pip install openinference-instrumentation-dspy
# LLM providers
pip install openai
Configure Orq.ai
Set up your environment variables to connect to Orq.ai's OpenTelemetry collector:
Unix/Linux/macOS:
export OTEL_EXPORTER_OTLP_ENDPOINT="https://api.orq.ai/v2/otel"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <ORQ_API_KEY>"
export OTEL_RESOURCE_ATTRIBUTES="service.name=dspy-app,service.version=1.0.0"
export OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"
Windows (PowerShell):
$env:OTEL_EXPORTER_OTLP_ENDPOINT = "https://api.orq.ai/v2/otel"
$env:OTEL_EXPORTER_OTLP_HEADERS = "Authorization=Bearer <ORQ_API_KEY>"
$env:OTEL_RESOURCE_ATTRIBUTES = "service.name=dspy-app,service.version=1.0.0"
$env:OPENAI_API_KEY = "<YOUR_OPENAI_API_KEY>"
Using .env file:
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.orq.ai/v2/otel
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <ORQ_API_KEY>
OTEL_RESOURCE_ATTRIBUTES=service.name=dspy-app,service.version=1.0.0
OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
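Note that a .env file is not read automatically: load it before the OpenTelemetry setup runs, for example with python-dotenv (one option among several):
from dotenv import load_dotenv

# Populate os.environ from .env before configuring OpenTelemetry
load_dotenv()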
Integration
DSPy uses OpenInference instrumentation for automatic OpenTelemetry tracing.
Set up the instrumentation in your application:
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
import os
# Configure tracer provider and register it globally
tracer_provider = TracerProvider(
    resource=Resource.create({"service.name": "dspy-app"})
)
trace.set_tracer_provider(tracer_provider)

# Set up the OTLP exporter; the Authorization header is parsed from
# OTEL_EXPORTER_OTLP_HEADERS ("Authorization=Bearer <ORQ_API_KEY>")
header_name, header_value = os.environ["OTEL_EXPORTER_OTLP_HEADERS"].split("=", 1)
otlp_exporter = OTLPSpanExporter(
    endpoint=f"{os.environ['OTEL_EXPORTER_OTLP_ENDPOINT']}/v1/traces",
    headers={header_name: header_value},
)
tracer_provider.add_span_processor(BatchSpanProcessor(otlp_exporter))

# Instrument DSPy
from openinference.instrumentation.dspy import DSPyInstrumentor

DSPyInstrumentor().instrument(tracer_provider=tracer_provider)
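The BatchSpanProcessor exports spans on a background schedule. In notebooks or short-lived scripts where you want spans delivered to Orq.ai immediately, you can flush the provider explicitly (optional):
# Push any buffered spans to Orq.ai now instead of waiting
# for the batch processor's export schedule
tracer_provider.force_flush()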
Use DSPy with automatic tracing:
import dspy
import os

# OpenTelemetry setup and DSPy instrumentation as shown above
# Modern DSPy syntax (v2.5+)
lm = dspy.LM('openai/gpt-4', api_key=os.getenv('OPENAI_API_KEY'))
dspy.configure(lm=lm)
# Define signature
class BasicQA(dspy.Signature):
    """Answer questions with helpful and accurate responses."""

    question = dspy.InputField()
    answer = dspy.OutputField(desc="A comprehensive answer to the question")
# Create predictor (automatically traced)
qa = dspy.Predict(BasicQA)
# Execute prediction
result = qa(question="What are the benefits of renewable energy?")
print(result.answer)
All DSPy signature executions and module operations will be automatically instrumented and exported to Orq.ai through the OTLP exporter. For more details, see Traces.
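You can also group several DSPy calls under one parent span so they appear as a single trace in Orq.ai. A minimal sketch using the globally registered tracer (the span name qa-session is just an example):
from opentelemetry import trace

tracer = trace.get_tracer("dspy-app")

# Both predictions become children of the "qa-session" span
with tracer.start_as_current_span("qa-session"):
    first = qa(question="What is solar power?")
    followup = qa(question="How efficient are modern solar panels?")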
Advanced Examples
Chain of Thought Reasoning
import dspy
import os

# OpenTelemetry setup done as shown in the Integration section above
lm = dspy.LM('openai/gpt-4', api_key=os.getenv('OPENAI_API_KEY'))
dspy.configure(lm=lm)
# Define signature; dspy.ChainOfThought adds a `reasoning` output field
# automatically, so the signature only declares the final solution
class ComplexProblem(dspy.Signature):
    """Solve complex problems with step-by-step reasoning."""

    problem = dspy.InputField(desc="The problem to solve")
    solution = dspy.OutputField(desc="Final solution")

# Use Chain of Thought
class ReasoningModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.think = dspy.ChainOfThought(ComplexProblem)

    def forward(self, problem):
        return self.think(problem=problem)
# Execute with tracing
reasoner = ReasoningModule()
problems = [
    "A farmer has 17 sheep. All but 9 die. How many are left?",
    "If it takes 5 machines 5 minutes to make 5 widgets, how long for 100 machines to make 100 widgets?"
]

for problem in problems:
    result = reasoner(problem=problem)
    print(f"Problem: {problem}")
    print(f"Reasoning: {result.reasoning}")
    print(f"Solution: {result.solution}\n")
Few-Shot Optimization
import dspy
import os

lm = dspy.LM('openai/gpt-4', api_key=os.getenv('OPENAI_API_KEY'))
dspy.configure(lm=lm)
# Define signature
class SentimentAnalysis(dspy.Signature):
    """Analyze sentiment of text."""

    text = dspy.InputField()
    sentiment = dspy.OutputField(desc="positive, negative, or neutral")
    confidence = dspy.OutputField(desc="Confidence score 0-1")

# Create module
class SentimentClassifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classify = dspy.ChainOfThought(SentimentAnalysis)

    def forward(self, text):
        return self.classify(text=text)
# Training examples
training_examples = [
    dspy.Example(
        text="I love this product! Amazing quality!",
        sentiment="positive",
        confidence="0.95"
    ).with_inputs("text"),
    dspy.Example(
        text="Terrible experience. Very disappointed.",
        sentiment="negative",
        confidence="0.9"
    ).with_inputs("text"),
    dspy.Example(
        text="It's okay, nothing special.",
        sentiment="neutral",
        confidence="0.8"
    ).with_inputs("text")
]
# Create optimizer (automatically traced)
classifier = SentimentClassifier()
optimizer = dspy.BootstrapFewShot(
    metric=lambda gold, pred, trace=None: gold.sentiment == pred.sentiment,
    max_bootstrapped_demos=3
)

# Compile optimized program
optimized_classifier = optimizer.compile(
    classifier,
    trainset=training_examples
)
# Test
result = optimized_classifier(text="This is fantastic!")
print(f"Sentiment: {result.sentiment}, Confidence: {result.confidence}")
Multi-Step RAG Pipeline
import dspy
import os
from typing import List

lm = dspy.LM('openai/gpt-4', api_key=os.getenv('OPENAI_API_KEY'))
dspy.configure(lm=lm)
# Define signatures
class GenerateQuery(dspy.Signature):
    """Generate search query from question."""

    question = dspy.InputField()
    search_query = dspy.OutputField(desc="Optimized search query")

class AnswerWithContext(dspy.Signature):
    """Answer question using context."""

    question = dspy.InputField()
    context = dspy.InputField(desc="Retrieved context")
    answer = dspy.OutputField(desc="Comprehensive answer")
    sources = dspy.OutputField(desc="Sources used")
# Build RAG system
class RAGPipeline(dspy.Module):
    def __init__(self, knowledge_base: List[dict]):
        super().__init__()
        self.knowledge_base = knowledge_base
        self.query_generator = dspy.Predict(GenerateQuery)
        self.answer_generator = dspy.ChainOfThought(AnswerWithContext)

    def retrieve(self, query: str, top_k: int = 3) -> List[str]:
        # Simple keyword matching against the in-memory knowledge base
        query_words = set(query.lower().split())
        scored = []
        for doc in self.knowledge_base:
            doc_words = set(doc['content'].lower().split())
            score = len(query_words.intersection(doc_words))
            if score > 0:
                scored.append((score, doc))
        # Sort by score only; comparing the doc dicts on ties would raise a TypeError
        scored.sort(key=lambda item: item[0], reverse=True)
        return [doc['content'] for _, doc in scored[:top_k]]

    def forward(self, question):
        # Generate search query
        query_result = self.query_generator(question=question)

        # Retrieve contexts
        contexts = self.retrieve(query_result.search_query)
        context_text = "\n\n".join(contexts)

        # Generate answer
        answer_result = self.answer_generator(
            question=question,
            context=context_text
        )

        return dspy.Prediction(
            query=query_result.search_query,
            answer=answer_result.answer,
            sources=answer_result.sources
        )
# Example knowledge base
kb = [
    {"id": "1", "content": "Machine learning is a subset of AI that enables systems to learn from data."},
    {"id": "2", "content": "Neural networks are computing systems inspired by biological neural networks."},
    {"id": "3", "content": "Natural Language Processing focuses on interaction between computers and human language."}
]
# Use RAG pipeline (automatically traced)
rag = RAGPipeline(kb)
result = rag(question="What is machine learning?")
print(f"Query: {result.query}")
print(f"Answer: {result.answer}")
print(f"Sources: {result.sources}")
DSPy is also compatible with our AI Gateway; to learn more, see DSPy.
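If you route traffic through the gateway, you point dspy.LM at an OpenAI-compatible base URL instead of calling the provider directly. A hypothetical sketch; <ORQ_GATEWAY_BASE_URL> is a placeholder, see the DSPy gateway page for the actual value and authentication details:
import dspy
import os

# Route requests through the gateway (placeholder base URL) while
# keeping the same OpenTelemetry instrumentation as above
lm = dspy.LM(
    'openai/gpt-4',
    api_base="<ORQ_GATEWAY_BASE_URL>",
    api_key=os.getenv('ORQ_API_KEY'),  # assumption: the gateway authenticates with your Orq API key
)
dspy.configure(lm=lm)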