PDF Input
This page describes features that extend the AI Proxy, which provides a unified API for accessing multiple AI providers. To learn more, see AI Proxy.
Quick Start
Send PDF documents directly in chat messages for analysis and content extraction.
```javascript
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v2/proxy",
});

// Read and base64-encode the PDF
const pdfBuffer = fs.readFileSync("contract.pdf");
const pdfBase64 = pdfBuffer.toString("base64");

const response = await openai.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Extract key terms and conditions from this contract",
        },
        {
          type: "file",
          file: {
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: "contract.pdf",
          },
        },
      ],
    },
  ],
});
```
Configuration
| Parameter | Type | Required | Description |
|---|---|---|---|
| `type` | `"file"` | Yes | Content type for file input |
| `file.file_data` | string | Yes | Data URI with base64-encoded PDF content |
| `file.filename` | string | Yes | Name of the file, provided to the model for context |

Format: `data:application/pdf;base64,{base64_content}`
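
The data URI and surrounding file part can be assembled with a small helper. A minimal sketch (the `pdfFilePart` helper is illustrative, not part of the API):

```javascript
import fs from "fs";
import path from "path";

// Build a `file` content part from a local PDF path
const pdfFilePart = (pdfPath) => ({
  type: "file",
  file: {
    file_data: `data:application/pdf;base64,${fs.readFileSync(pdfPath, "base64")}`,
    filename: path.basename(pdfPath),
  },
});
```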
Supported Models
| Provider | Model | PDF Support |
|---|---|---|
| OpenAI | gpt-4o | ✅ Native |
| OpenAI | gpt-4o-mini | ✅ Native |
| OpenAI | gpt-4-turbo | ✅ Native |
| Anthropic | claude-3-sonnet | ✅ Via conversion |
| Anthropic | claude-3-haiku | ✅ Via conversion |
Use Cases
| Scenario | Best Model | Example Prompt |
|---|---|---|
| Contract analysis | gpt-4o | "Extract key terms and obligations" |
| Invoice processing | gpt-4o-mini | "Extract amounts, dates, vendor info" |
| Research papers | gpt-4o | "Summarize methodology and findings" |
| Form extraction | gpt-4o-mini | "Convert form data to JSON" |
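
For instance, the invoice row above maps directly onto a request (reusing the `openai` client and `pdfBase64` encoding from Quick Start; the file name is illustrative):

```javascript
const response = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Extract amounts, dates, vendor info" },
        {
          type: "file",
          file: {
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: "invoice.pdf",
          },
        },
      ],
    },
  ],
});
```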
Code Examples
cURL:

```bash
curl -X POST https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Please analyze this PDF document and provide a summary"
          },
          {
            "type": "file",
            "file": {
              "file_data": "data:application/pdf;base64,YOUR_BASE64_ENCODED_PDF",
              "filename": "document.pdf"
            }
          }
        ]
      }
    ]
  }'
```
Python:

```python
from openai import OpenAI
import os
import base64

openai = OpenAI(
    api_key=os.environ.get("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/proxy"
)

# Read and encode your PDF file
with open("document.pdf", "rb") as pdf_file:
    pdf_base64 = base64.b64encode(pdf_file.read()).decode("utf-8")

response = openai.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Please analyze this PDF document and provide a summary"
                },
                {
                    "type": "file",
                    "file": {
                        "file_data": f"data:application/pdf;base64,{pdf_base64}",
                        "filename": "document.pdf"
                    }
                }
            ]
        }
    ]
)
```
Node.js:

```javascript
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v2/proxy",
});

// Read and encode your PDF file
const pdfBuffer = fs.readFileSync("document.pdf");
const pdfBase64 = pdfBuffer.toString("base64");

const response = await openai.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Please analyze this PDF document and provide a summary",
        },
        {
          type: "file",
          file: {
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: "document.pdf",
          },
        },
      ],
    },
  ],
});
```
File Handling
Reading PDF files:

```javascript
// Node.js
const fs = require("fs");
const pdfBase64 = fs.readFileSync("document.pdf", "base64");
```

```javascript
// Browser (file input): FileReader yields the data URI directly and avoids
// the stack overflow that String.fromCharCode(...bytes) can hit on large files
const fileInput = document.getElementById("pdf-upload");
const file = fileInput.files[0];
const dataUri = await new Promise((resolve, reject) => {
  const reader = new FileReader();
  reader.onload = () => resolve(reader.result); // "data:application/pdf;base64,..."
  reader.onerror = reject;
  reader.readAsDataURL(file);
});
```

```python
# Python
import base64
with open("document.pdf", "rb") as f:
    pdf_base64 = base64.b64encode(f.read()).decode("utf-8")
```
Size optimization:

```javascript
// Check file size before encoding
const maxSize = 20 * 1024 * 1024; // 20MB
if (pdfBuffer.length > maxSize) {
  throw new Error("PDF file too large. Consider compressing first.");
}
```
Best Practices
File preparation:
- Compress PDFs to reduce size (under 20MB recommended)
- Ensure text is selectable rather than scanned images (see the check after this list)
- Remove unnecessary pages for focused analysis
- Use clear, structured layouts for better extraction
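
A quick way to verify that a text layer exists is to extract text locally first. A rough sketch, assuming the third-party pdf-parse package (not part of the proxy); the 50-characters-per-page threshold is an arbitrary heuristic:

```javascript
import fs from "fs";
import pdfParse from "pdf-parse";

// Scanned-image PDFs yield little or no extractable text
const hasTextLayer = async (pdfPath) => {
  const data = await pdfParse(fs.readFileSync(pdfPath));
  return data.text.trim().length / data.numpages > 50;
};
```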
Prompt engineering:

```text
// Specific extraction
"Extract all dollar amounts and their associated line items as JSON"

// Structured analysis
"Provide a summary with these sections: Executive Summary, Key Findings, Recommendations"

// Data validation
"Verify if all required fields are present: name, date, signature, amount"
```
Error handling:

```javascript
// Assumes the `fs` import and `openai` client from the examples above
const processPDF = async (pdfPath, prompt) => {
  try {
    const pdfBase64 = fs.readFileSync(pdfPath, "base64");
    // 50M base64 characters decode to roughly 37MB of binary data
    if (pdfBase64.length > 50_000_000) {
      throw new Error("PDF too large for processing");
    }
    const response = await openai.chat.completions.create({
      model: "openai/gpt-4o",
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: prompt },
            {
              type: "file",
              file: {
                file_data: `data:application/pdf;base64,${pdfBase64}`,
                filename: pdfPath.split("/").pop(), // extract filename from path
              },
            },
          ],
        },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    if (error.message?.includes("context_length_exceeded")) {
      throw new Error("PDF too large. Try splitting into smaller sections.");
    }
    throw error;
  }
};
```
Troubleshooting
PDF not processing
- Verify base64 encoding is correct (see the sanity check below)
- Check file size against the model's context limit
- Ensure the MIME type in the data URI is `application/pdf`
- Try a different model
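
A quick local sanity check (a sketch; the helper name is illustrative): a correctly encoded PDF data URI has the right prefix and decodes to bytes starting with the `%PDF-` magic marker.

```javascript
const isLikelyValidPdfDataUri = (uri) => {
  const prefix = "data:application/pdf;base64,";
  if (!uri.startsWith(prefix)) return false;
  // Every valid PDF begins with the bytes "%PDF-"
  const head = Buffer.from(uri.slice(prefix.length, prefix.length + 8), "base64");
  return head.toString("latin1").startsWith("%PDF-");
};
```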
Poor extraction quality
- Use a more capable model (e.g., gpt-4o rather than gpt-4o-mini)
- Provide more specific prompts
- Break complex documents into sections
- Consider preprocessing scanned PDFs with OCR
Performance issues
- Compress PDFs before sending
- Extract only relevant pages
- Use streaming for large documents (see the sketch after this list)
- Cache results for repeated analysis
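
Streaming follows the standard OpenAI pattern; this sketch assumes the proxy passes streamed responses through unchanged.

```javascript
const stream = await openai.chat.completions.create({
  model: "openai/gpt-4o",
  stream: true,
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this document section by section" },
        {
          type: "file",
          file: {
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: "document.pdf",
          },
        },
      ],
    },
  ],
});

// Print tokens as they arrive instead of waiting for the full response
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```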
Limitations
| Limitation | Details | Workaround |
|---|---|---|
| File size | Model context limits | Split large PDFs (see the sketch below) |
| Scanned documents | Quality varies by model | Use OCR preprocessing |
| Complex layouts | Tables/charts may not extract well | Use structured prompts |
| Security | Sensitive documents are sent to the provider | Use on-premise models |
| Cost | Large files consume more tokens | Optimize file size |
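
Splitting can be done locally before upload. A sketch assuming the third-party pdf-lib package (not part of the proxy):

```javascript
import fs from "fs";
import { PDFDocument } from "pdf-lib";

// Copy the first `pageCount` pages of a PDF into a new document
const extractPages = async (pdfPath, pageCount) => {
  const source = await PDFDocument.load(fs.readFileSync(pdfPath));
  const part = await PDFDocument.create();
  const indices = source.getPageIndices().slice(0, pageCount);
  const pages = await part.copyPages(source, indices);
  pages.forEach((page) => part.addPage(page));
  return Buffer.from(await part.save()); // ready for base64 encoding
};
```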
Advanced Usage
Batch processing:

```javascript
const processPDFBatch = async (pdfPaths) => {
  const results = await Promise.allSettled(
    pdfPaths.map((path) => processPDF(path, "Extract key information")),
  );
  return results.map((result, index) => ({
    file: pdfPaths[index],
    success: result.status === "fulfilled",
    data: result.status === "fulfilled" ? result.value : null,
    error: result.status === "rejected" ? result.reason : null,
  }));
};
```
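
Note that `Promise.allSettled` fires every request at once; for large batches, a chunked variant is gentler on provider rate limits (a sketch; the helper name and chunk size are illustrative):

```javascript
const processPDFBatchLimited = async (pdfPaths, chunkSize = 3) => {
  const results = [];
  for (let i = 0; i < pdfPaths.length; i += chunkSize) {
    const chunk = pdfPaths.slice(i, i + chunkSize);
    // Only `chunkSize` requests are in flight at any time
    results.push(
      ...(await Promise.allSettled(
        chunk.map((path) => processPDF(path, "Extract key information")),
      )),
    );
  }
  return results;
};
```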
Progressive analysis:

```javascript
// Analyze in stages for large documents
const stages = [
  "Identify document type and structure",
  "Extract metadata (author, date, title)",
  "Summarize each section",
  "Extract actionable items",
];

for (const prompt of stages) {
  const result = await processPDF(pdfPath, prompt);
  console.log(`Stage: ${prompt}\nResult: ${result}\n`);
}
```
Content validation:

```javascript
const validateExtraction = async (pdfPath, expectedFields) => {
  const prompt = `Extract these fields as JSON: ${expectedFields.join(", ")}`;
  const result = await processPDF(pdfPath, prompt);
  try {
    // Models often wrap JSON in markdown code fences; strip them first
    const cleaned = result.replace(/^```(?:json)?\s*|\s*```$/g, "").trim();
    const data = JSON.parse(cleaned);
    // Use `in` so present-but-falsy values (0, "") don't count as missing
    const missing = expectedFields.filter((field) => !(field in data));
    return {
      valid: missing.length === 0,
      missing,
      data,
    };
  } catch (error) {
    return { valid: false, error: "Invalid JSON response" };
  }
};
```