Pydantic AI is a library designed to make it easy to integrate AI models with Python applications. It is built on top of the Pydantic library, extending its data validation capabilities to handle AI outputs. This means developers can easily transform raw responses from AI models into structured data that's ready to be used in applications. Pydantic AI helps process both straightforward and complex AI model outputs, making them easier to manage and use across projects.
Installation
To use Pydantic AI, install it with pip:
pip install pydantic-ai
This command ensures that you have all the necessary components to start integrating AI into your applications with minimal effort.
Setting Up Your API Key
In this article, we'll use OpenAI as our provider for all examples. To access their services, you'll need to set up an API key. There are two ways to set your API key in Pydantic AI:
Setting Globally with the os Module:
You can set your API key as an environment variable, which can be accessed globally in your application:
import os
from pydantic_ai import Agent

os.environ["OPENAI_API_KEY"] = "my_key"

agent = Agent(
    "openai:gpt-4o-mini",
    ...
)
With this method, the API key is available for all modules using the OpenAI service in your application.
Using the OpenAIModel Class:
Alternatively, you can provide the API key directly when creating an OpenAI model instance. Although this example uses OpenAI, the same method can be applied to other AI providers:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIModel(
    model_name="gpt-4o-mini",
    provider=OpenAIProvider(api_key="my_key")
)

agent = Agent(
    model,
    output_type=str
)
This approach allows you to encapsulate the API key within the model instance, which can enhance security and portability in certain settings.
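A common middle ground is to combine the two approaches: read the key from the environment at startup and pass it to the provider explicitly. Here is a minimal sketch; the `get_openai_key` helper is our own convenience function, not part of Pydantic AI:

```python
import os

# Hypothetical helper (not part of Pydantic AI): read the key from the
# environment and fail loudly if it's missing, instead of letting the
# first API call fail with a less obvious error.
def get_openai_key() -> str:
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key

# The returned key can then be passed as OpenAIProvider(api_key=get_openai_key()).
```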
Unstructured Output: Managing Raw AI Responses
AI models often return unstructured outputs, typically in the form of strings. These raw responses can include anything from text descriptions to complex paragraphs, and managing them effectively can pose challenges.
Pydantic AI provides the tools to handle this unstructured output without convoluted processing logic. You can use the AgentRunResult class to encapsulate the responses, which makes it easier to manage and process the information generated by AI models. The class provides a structured way to handle and track the state and output of AI operations.
import asyncio
from pydantic_ai import Agent

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=str
    )
    result = await agent.run("What is the capital of France?")
    print("Raw Output:", result.output)

asyncio.run(main())
In the example above, an AgentRunResult instance holds the raw string response generated by the AI model. The output attribute of the AgentRunResult provides access to this response, ready for integration or analysis.
Furthermore, output_type isn't restricted to just str. Here's an example where a list of strings is used:
import asyncio
from pydantic_ai import Agent

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=list[str]
    )
    result = await agent.run("List the commonly used Python web frameworks!")
    print("Raw Output List:", result.output)

asyncio.run(main())
In this example, the AI model generates a list of strings, demonstrating that Pydantic AI can manage outputs in various formats efficiently.
Structured Output: Enforcing Consistency and Structure
If you need more than just raw string data, Pydantic AI steps up with its structured output feature, empowering you to bring consistency to AI responses. By defining a schema, you can specify the format of the output, ensuring it aligns with your application's requirements.
Here's how you use structured output with Pydantic AI:
import asyncio
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent

class GeneratedText(BaseModel):
    title: str
    paragraphs: List[str]

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=GeneratedText
    )
    result = await agent.run("Generate a structured article")
    structured_output = result.output
    print(structured_output.title)
    print(structured_output.paragraphs)

asyncio.run(main())
In this example, a GeneratedText class is defined, which includes a title and paragraphs. When the AI model runs, an AgentRunResult instance is returned whose output attribute provides access to the GeneratedText structured response. This structured approach makes the response easier to work with in your applications.
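Because GeneratedText is an ordinary Pydantic model, you can also validate plain dictionaries against the same schema without any AI call, which is handy for testing a schema before wiring it into an agent. A quick sketch using Pydantic directly:

```python
from typing import List
from pydantic import BaseModel, ValidationError

class GeneratedText(BaseModel):
    title: str
    paragraphs: List[str]

# A well-formed payload validates into a typed object.
article = GeneratedText.model_validate(
    {"title": "Sample", "paragraphs": ["First paragraph.", "Second paragraph."]}
)
print(article.title)

# A payload missing a required field raises ValidationError.
try:
    GeneratedText.model_validate({"title": "No body"})
except ValidationError:
    print("missing paragraphs rejected")
```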
Handling Incomplete AI Responses
Pydantic AI also provides a way to manage situations where the AI cannot provide a proper response. You can define a model class that incorporates optional fields to capture errors or incomplete results. For example:
import asyncio
from typing import Optional
from pydantic_ai import Agent
from pydantic import BaseModel

class Question(BaseModel):
    answer: Optional[str] = None
    error: Optional[str] = None

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=Question
    )
    result = await agent.run("What is the capital of Japan?")
    if result.output.answer:
        print("Answer:", result.output.answer)
    else:
        print("Error:", result.output.error)

asyncio.run(main())
This Question class incorporates two optional fields: answer and error. When the AI model returns a result, it will either populate the answer field if successful, or provide an error message in the error field if it cannot generate a suitable response. This approach ensures that your application can gracefully handle failures while processing AI outputs.
To trigger an error case in the example above, you could send a question the AI cannot answer or input meaningless text. For instance, asking "What is the capital of Mu?" or entering gibberish like "dafsdfsd" would likely result in the error field being populated, allowing you to handle these scenarios effectively.
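If this answer-or-error branching appears in several places, it can be factored into a small helper. The summarize function below is our own convenience sketch (not part of Pydantic AI), written with plain Pydantic so it runs without an API call:

```python
from typing import Optional
from pydantic import BaseModel

class Question(BaseModel):
    answer: Optional[str] = None
    error: Optional[str] = None

# Hypothetical helper: prefer the answer, fall back to the error message.
def summarize(result: Question) -> str:
    if result.answer:
        return f"Answer: {result.answer}"
    return f"Error: {result.error or 'no response'}"

print(summarize(Question(answer="Tokyo")))
print(summarize(Question(error="Cannot answer: 'Mu' is not a real country.")))
```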
System Messages: Guiding AI Interaction
System messages allow you to guide the behavior of the AI model by setting context or instructions before the AI execution. This can be incredibly useful for setting up constraints or focusing the AI on specific tasks or areas of expertise.
Here's a practical implementation using a system message:
import asyncio
from typing import Optional
from pydantic import BaseModel
from pydantic_ai import Agent

class CodeSnippet(BaseModel):
    code: Optional[str] = None
    error: Optional[str] = None

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=CodeSnippet,
        system_prompt="You are an AI assistant specializing in Python programming. Please provide clear and concise answers. Do not answer questions that are not related to Python programming. If you do not know the answer, return an error message.",
    )
    result = await agent.run("How do you write a loop in Python?")
    if result.output.code:
        print("Output with System Message:", result.output.code)
    else:
        print("Error:", result.output.error)

asyncio.run(main())
In this example, the system_prompt argument instructs the AI to act as an expert in Python programming. This context helps the AI generate more relevant and focused responses tailored to the given specialization. The CodeSnippet class handles both successful and error responses to ensure robustness: the error field is filled when the question is unrelated to Python or there is no valid answer.
Providing a Full Chat History
When working with conversational AI, it's often important to provide a full chat history to maintain context and flow in the conversation. Pydantic AI allows you to include a full conversation history, giving the AI more information to generate contextually relevant responses.
Here's an example demonstrating how you can provide a full chat history using Pydantic AI:
import asyncio
from pydantic_ai import Agent
from pydantic_ai.messages import ModelRequest, SystemPromptPart, UserPromptPart

async def main():
    # Define the full conversation history
    history = [
        ModelRequest(parts=[SystemPromptPart(content="You are a travel assistant with expertise in Europe.")]),
        ModelRequest(parts=[UserPromptPart(content="Hi, I recently visited Rome and Venice, and I loved them!")]),
        ModelRequest(parts=[UserPromptPart(content="I want to explore more places in Italy.")]),
    ]

    # Initialize the agent with the model
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=str
    )

    # Run the agent with the full message history
    result = await agent.run("Can you recommend where I should visit next?", message_history=history)

    # Print the current response and the whole message history
    print("Current Response:", result.output)
    print("Full Conversation History:", result.all_messages())

asyncio.run(main())
In this example, the conversation history is represented as a list of ModelRequest objects, each containing either a SystemPromptPart or a UserPromptPart. This history is passed to the run method as message_history, enabling the AI to reference previous exchanges. This approach is particularly useful in applications requiring a continuous dialogue thread, such as chatbots or interactive systems.
Tracking Token Usage
Tracking token usage can be critical for managing costs and optimizing the performance of your AI applications. Pydantic AI provides a straightforward way to access token usage information.
Here's how you can access the token usage info with Pydantic AI:
import asyncio
from pydantic_ai import Agent

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=str
    )
    result = await agent.run("Explain the laws of thermodynamics.")

    # Fetching token usage information
    usage = result.usage()
    print(f"Requests: {usage.requests}, Request Tokens: {usage.request_tokens}, Response Tokens: {usage.response_tokens}, Total Tokens: {usage.total_tokens}")

asyncio.run(main())
In this example, after running an AI query, you can call the usage() method on the result to retrieve information about token usage. The example extracts and formats the requests, request_tokens, response_tokens, and total_tokens attributes, helping you monitor and manage token expenditure so your AI operations remain efficient and cost-effective.
- Requests: Represents the total number of requests or API calls sent to the AI model. This number helps in tracking the frequency of interactions with the model. For example, if you ask one question and get one response, that's 1 request. If your agent makes multiple calls internally, this counts each one.
- Request Tokens: The number of tokens sent to the model in the prompt (input). It includes any initial instructions or historical data provided for context, ensuring accurate input monitoring.
- Response Tokens: The number of tokens generated by the model as output (response). Managing response tokens allows developers to anticipate and analyze the resource consumption per output block.
- Total Tokens: The sum of request and response tokens used in that request. It helps in a comprehensive understanding of resource use for each interaction, which is valuable for budgeting and optimization.
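Since input and output tokens are usually billed at different rates, these usage numbers translate directly into cost estimates. A minimal sketch; the per-million-token prices below are placeholders, not current OpenAI rates:

```python
# Placeholder per-million-token prices -- check your provider's pricing page.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def estimate_cost(request_tokens: int, response_tokens: int) -> float:
    """Estimate the dollar cost of one run from its token counts."""
    return (request_tokens * INPUT_PRICE_PER_M
            + response_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a run that sent 1,200 tokens and received 800 tokens:
print(f"${estimate_cost(1200, 800):.6f}")  # -> $0.000660
```

The same function can be fed the request_tokens and response_tokens values returned by result.usage() to track spend per call.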
Conclusion
Pydantic AI offers developers a simplified approach to incorporating AI into Python applications. By supporting both unstructured and structured output handling, it caters to varying needs, whether for raw text or organized data structures. Additionally, system messages let developers guide AI behavior according to specific requirements. This flexibility, combined with the ability to manage full chat histories, makes Pydantic AI a powerful tool in the developer's toolkit.