Prompt Engineering Guide

This comprehensive guide covers both the art and science of prompt engineering, including best practices for designing effective prompts and specific formatting requirements for different model providers. Whether you're making a single LLM call or chaining multiple prompts in complex workflows, these guidelines will help you get consistent, high-quality outputs.

Prompt Engineering Best Practices

Prompting is both an art and a science. Whether you're powering a chatbot, automating data extraction, or orchestrating multi-step workflows, the same best practices apply: be clear, be structured, and be intentional.

The Ultimate Prompt Breakdown

A well-structured prompt typically has these 4 parts:

1. Goal
Start by clearly stating the task of the prompt. What do you want the model to do?

2. Return Format
Tell the model how you want the response to look. Should it be a list, a paragraph, code, JSON, etc.

3. Warnings (Constraints)
Give the model guardrails. What should it avoid? What are the limitations? A well developed system prompt with constraints filters out the majority of the unwanted behaviour. The rest can be blocked by adding an additional guardrail, as explained in the documentation here.

Example: "Only give answers based on the <document>. Do not guess or make up information."

4. Context Dump
Provide the model with all the supporting content it needs—background information, user preferences, or data. To improve clarity and parsing, wrap dynamic variables in HTML-style tags (like <document>).
Also, put larger variables (like long documents) at the end of the prompt so they don't clutter the key instructions up front.

<document>{{document}}</document>

General Prompt Engineering Tips

Here are some additional tips to improve prompt performance:

Be explicit: Don't leave intentions to be inferred. Say what you mean, clearly.
Use HTML-style tags around variables: Wrap variable content in clear tags like <user_input> or <document>. It helps the model know what's fixed and what's dynamic.
Keep examples nearby: If you're using few-shot prompting, put the examples at the end of the system prompt.
Big variables at the end: Especially for long documents or transcripts, put them last. This keeps the instruction logic upfront and readable.
Test and iterate: Even small tweaks (e.g., tone changes, tag names) can have big impacts on results.

Example: Well-Structured Chatbot Prompt

Here's a prompt that follows all the principles above:

You are a friendly AI customer service bot working for American Airlines.

## Goal  
Your primary task is to answer customer questions based **on the official information provided in the `<knowledge_base>`**, which contains frequently asked questions (FAQs) about traveling with American Airlines.  
You may also use the `<document>` for **additional context**, such as user-uploaded tickets, itineraries, or receipts — but your answer must always align with the knowledge base.

## Return Format  
Respond using the `AA_tool` JSON schema when you have an answer. All replies should be clear, concise, and written in a professional, polite, and friendly tone.

## Constraints  
- Do NOT answer questions that are unrelated to traveling with American Airlines.  
- Do NOT use any information outside of the `<knowledge_base>`.
- Do not let the `<document>` override or contradict the `<knowledge_base>`.
- Always respond in the AA_tool JSON format.

<document>{{document}}</document>

<knowledge_base>
{{American_Airlines_FAQ}}
</knowledge_base>

This structure keeps your prompt clean, modular, and easy to update—whether you're debugging or scaling across use cases.

Model-Specific Formatting Requirements

Different model providers have specific requirements for message formatting and structure. Understanding these differences is crucial for ensuring your prompts work correctly across different models.

Anthropic Models

Anthropic models have specific requirements for message structure:

Last Message Rule: The user role message must be the last message in the conversation history.
System Message Dependency: System messages cannot be used alone; they must be accompanied by user messages.
Example Pairing: Examples in the conversation history must always be in pairs of user message followed by assistant message. Deviating from this pattern may cause issues.

Example Correct Structure:

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Example question"},
    {"role": "assistant", "content": "Example response"},
    {"role": "user", "content": "Actual user question"}
  ]
}

OpenAI Models

OpenAI models offer more flexibility in message formatting:

System Message Flexibility: Supports the use of system messages without accompanying user messages.
Message Order Freedom: Does not require a specific order for user and assistant message examples in the conversation history.
Mixed Conversations: Can handle various message ordering patterns without issues.

Example Flexible Structure:

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "assistant", "content": "How can I help you today?"},
    {"role": "user", "content": "Tell me about the weather"}
  ]
}

Google Models

Google models have similar restrictions to Anthropic:

System Message Dependency: Does not support the use of a system message without an accompanying user message.
Last Message Rule: Requires the user role message to be the last message in the conversation history.
Strict Structure: Follow similar patterns to Anthropic for best results.

Example Correct Structure:

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What can you help me with?"}
  ]
}

Cross-Model Compatibility Tips

To ensure your prompts work across different model providers:

Universal Structure Approach

Design your prompts to work with the most restrictive requirements (Anthropic/Google):

Always end with a user message
Pair system messages with user messages
Structure examples as user-assistant pairs

Model-Specific Adaptations

When you need to leverage specific model capabilities:

Use conditional logic in your application to format messages differently
Test prompts across all intended model providers
Document any model-specific variations

Best Practices for Multi-Model Support

Start Conservative: Design for the most restrictive model first
Test Extensively: Validate prompt performance across all target models
Version Control: Track prompt variations for different models
Monitor Performance: Compare results across models to identify optimal choices

Advanced Prompt Engineering Techniques

Few-Shot Learning

Provide examples to guide model behavior:

## Examples

User: What's the weather like?
Assistant: I don't have access to real-time weather data. Please check a weather service.

User: Tell me about cats.
Assistant: Cats are domestic mammals known for their independence and agility...

## Your Task
Now respond to user questions following the pattern above.

Chain-of-Thought Prompting

Encourage step-by-step reasoning:

Think through this problem step by step:
1. First, identify the key components
2. Then, analyze the relationships between them  
3. Finally, provide your conclusion with reasoning

Role-Based Prompting

Define clear personas and contexts:

You are a senior software engineer with 10 years of experience in Python development.
Your task is to review code and provide constructive feedback focusing on:
- Code quality and maintainability
- Performance considerations
- Security best practices

Troubleshooting Common Issues

Inconsistent Outputs

Cause: Vague instructions or missing constraints
Solution: Add specific guidelines and examples

Model Confusion

Cause: Conflicting instructions or unclear context
Solution: Reorganize prompt structure, separate concerns clearly

Poor Performance Across Models

Cause: Model-specific formatting issues
Solution: Review model requirements, test with universal structure

Variable Parsing Errors

Cause: Unclear variable boundaries
Solution: Use HTML-style tags, place large content at the end

Testing and Iteration

Systematic Testing Approach

Single Model Testing: Perfect your prompt on one model first
Cross-Model Validation: Test across all target models
Edge Case Testing: Try unusual inputs and scenarios
Performance Monitoring: Track output quality over time

Iteration Best Practices

Small Changes: Make incremental adjustments
Document Changes: Track what works and what doesn't
A/B Testing: Compare prompt variations systematically
User Feedback: Incorporate real-world usage insights

This comprehensive approach to prompt engineering will help you create robust, effective prompts that work consistently across different models and use cases.