Pydantic AI is a library designed to make it easy to integrate AI models with Python applications. It is built on top of the Pydantic library, extending its data validation capabilities to handle AI outputs. This means developers can easily transform raw responses from AI models into structured data that's ready to be used in applications. Pydantic AI helps process both straightforward and complex AI model outputs, making them easier to manage and use across projects.
Installation
To use Pydantic AI, install it with pip:
pip install pydantic-ai
This command ensures that you have all the necessary components to start integrating AI into your applications with minimal effort.
Setting Up Your API Key
In this article, we'll use OpenAI as our provider for all examples. To access their services, you'll need to set up an API key. There are two ways to set your API key in Pydantic AI:
Setting Globally with the os Module:
You can set your API key as an environment variable, which can be accessed globally in your application:
import os
from pydantic_ai import Agent

os.environ["OPENAI_API_KEY"] = "my_key"

agent = Agent(
    "openai:gpt-4o-mini",
    ...
)
With this method, the API key is available for all modules using the OpenAI service in your application.
Using the OpenAIModel Class:
Alternatively, you can provide the API key directly when creating an OpenAI model instance. Although this example uses OpenAI, the same method can be applied to other AI providers:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIModel(
    model_name="gpt-4o-mini",
    provider=OpenAIProvider(api_key="my_key")
)

agent = Agent(
    model,
    output_type=str
)
This approach allows you to encapsulate the API key within the model instance, which can enhance security and portability in certain settings.
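A common middle ground is to combine the two approaches: read the key from the environment at startup and pass it to the provider explicitly. Here is a minimal sketch; the `get_openai_key` helper is our own convenience function, not part of Pydantic AI:

```python
import os

# Hypothetical helper (not part of Pydantic AI): read the key from the
# environment and fail loudly if it's missing, instead of letting the
# first API call fail with a less obvious error.
def get_openai_key() -> str:
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key

# The returned key can then be passed as OpenAIProvider(api_key=get_openai_key()).
```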
Unstructured Output: Managing Raw AI Responses
AI models often return unstructured outputs, typically in the form of strings. These raw responses can include anything from text descriptions to complex paragraphs, and managing them effectively can pose challenges.
Pydantic AI provides the tools to handle this unstructured output without convoluted processing logic. You can use the AgentRunResult class to encapsulate the responses, which makes it easier to manage and process the information generated by AI models. The class provides a structured way to handle and track the state and output of AI operations.
import asyncio
from pydantic_ai import Agent

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=str
    )
    result = await agent.run("What is the capital of France?")
    print("Raw Output:", result.output)

asyncio.run(main())
In the example above, an AgentRunResult instance holds the raw string response generated by the AI model. The output attribute of the AgentRunResult provides access to this response, ready for integration or analysis.
Furthermore, output_type isn't restricted to just str. Here's an example where a list of strings is used:
import asyncio
from pydantic_ai import Agent

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=list[str]
    )
    result = await agent.run("List the commonly used Python web frameworks!")
    print("Raw Output List:", result.output)

asyncio.run(main())
In this example, the AI model generates a list of strings, demonstrating that Pydantic AI can manage outputs in various formats efficiently.
Structured Output: Enforcing Consistency and Structure
If you need more than just raw string data, Pydantic AI steps up with its structured output feature, empowering you to bring consistency to AI responses. By defining a schema, you can specify the format of the output, ensuring it aligns with your application's requirements.
Here's how you use structured output with Pydantic AI:
import asyncio
from typing import List
from pydantic import BaseModel
from pydantic_ai import Agent

class GeneratedText(BaseModel):
    title: str
    paragraphs: List[str]

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=GeneratedText
    )
    result = await agent.run("Generate a structured article")
    structured_output = result.output
    print(structured_output.title)
    print(structured_output.paragraphs)

asyncio.run(main())
In this example, a GeneratedText class is defined, which includes a title and paragraphs. When the AI model runs, an AgentRunResult instance is returned whose output attribute provides access to the GeneratedText structured response. This structured approach makes the response easier to work with in your applications.
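Because GeneratedText is an ordinary Pydantic model, you can also validate plain dictionaries against the same schema without any AI call, which is handy for testing a schema before wiring it into an agent. A quick sketch using Pydantic directly:

```python
from typing import List
from pydantic import BaseModel, ValidationError

class GeneratedText(BaseModel):
    title: str
    paragraphs: List[str]

# A well-formed payload validates into a typed object.
article = GeneratedText.model_validate(
    {"title": "Sample", "paragraphs": ["First paragraph.", "Second paragraph."]}
)
print(article.title)

# A payload missing a required field raises ValidationError.
try:
    GeneratedText.model_validate({"title": "No body"})
except ValidationError:
    print("missing paragraphs rejected")
```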
Handling Incomplete AI Responses
Pydantic AI also provides a way to manage situations where the AI cannot provide a proper response. You can define a model class that incorporates optional fields to capture errors or incomplete results. For example:
import asyncio
from typing import Optional
from pydantic_ai import Agent
from pydantic import BaseModel

class Question(BaseModel):
    answer: Optional[str] = None
    error: Optional[str] = None

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=Question
    )
    result = await agent.run("What is the capital of Japan?")
    if result.output.answer:
        print("Answer:", result.output.answer)
    else:
        print("Error:", result.output.error)

asyncio.run(main())
This Question class incorporates two optional fields: answer and error. When the AI model returns a result, it will either populate the answer field if successful, or provide an error message in the error field if it cannot generate a suitable response. This approach ensures that your application can gracefully handle failures while processing AI outputs.
To trigger an error case in the example above, you could send a question the AI cannot answer or input meaningless text. For instance, asking "What is the capital of Mu?" or entering gibberish like "dafsdfsd" would likely result in the error field being populated, allowing you to handle these scenarios effectively.
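If this answer-or-error branching appears in several places, it can be factored into a small helper. The summarize function below is our own convenience sketch (not part of Pydantic AI), written with plain Pydantic so it runs without an API call:

```python
from typing import Optional
from pydantic import BaseModel

class Question(BaseModel):
    answer: Optional[str] = None
    error: Optional[str] = None

# Hypothetical helper: prefer the answer, fall back to the error message.
def summarize(result: Question) -> str:
    if result.answer:
        return f"Answer: {result.answer}"
    return f"Error: {result.error or 'no response'}"

print(summarize(Question(answer="Tokyo")))
print(summarize(Question(error="Cannot answer: 'Mu' is not a real country.")))
```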
System Messages: Guiding AI Interaction
System messages allow you to guide the behavior of the AI model by setting context or instructions before the AI execution. This can be incredibly useful for setting up constraints or focusing the AI on specific tasks or areas of expertise.
Here's a practical implementation using a system message:
import asyncio
from typing import Optional
from pydantic import BaseModel
from pydantic_ai import Agent

class CodeSnippet(BaseModel):
    code: Optional[str] = None
    error: Optional[str] = None

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=CodeSnippet,
        system_prompt="You are an AI assistant specializing in Python programming. Please provide clear and concise answers. Do not answer questions that are not related to Python programming. If you do not know the answer, return an error message.",
    )
    result = await agent.run("How do you write a loop in Python?")
    if result.output.code:
        print("Output with System Message:", result.output.code)
    else:
        print("Error:", result.output.error)

asyncio.run(main())
In this example, the system_prompt argument instructs the AI to act as an expert in Python programming. This context helps the AI generate more relevant and focused responses tailored to the given specialization. The CodeSnippet class handles both successful and error responses to ensure robustness: the error field is filled when the question is unrelated to Python or there is no valid answer.
Providing a Full Chat History
When working with conversational AI, it's often important to provide a full chat history to maintain context and flow in the conversation. Pydantic AI allows you to include a full conversation history, giving the AI more information to generate contextually relevant responses.
Here's an example demonstrating how you can provide a full chat history using Pydantic AI:
import asyncio
from pydantic_ai import Agent
from pydantic_ai.messages import ModelRequest, SystemPromptPart, UserPromptPart

async def main():
    # Define the full conversation history
    history = [
        ModelRequest(parts=[SystemPromptPart(content="You are a travel assistant with expertise in Europe.")]),
        ModelRequest(parts=[UserPromptPart(content="Hi, I recently visited Rome and Venice, and I loved them!")]),
        ModelRequest(parts=[UserPromptPart(content="I want to explore more places in Italy.")]),
    ]

    # Initialize the agent with the model
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=str
    )

    # Run the agent with the full message history
    result = await agent.run("Can you recommend where I should visit next?", message_history=history)

    # Print the current response and the whole message history
    print("Current Response:", result.output)
    print("Full Conversation History:", result.all_messages())

asyncio.run(main())
In this example, the conversation history is represented as a list of ModelRequest objects, each containing either a SystemPromptPart or a UserPromptPart. This history is passed to the run method as message_history, enabling the AI to reference previous exchanges. This approach is particularly useful in applications requiring a continuous dialogue thread, such as chatbots or interactive systems.
Tracking Token Usage
Tracking token usage can be critical for managing costs and optimizing the performance of your AI applications. Pydantic AI provides a straightforward way to access token usage information.
Here's how you can access the token usage info with Pydantic AI:
import asyncio
from pydantic_ai import Agent

async def main():
    agent = Agent(
        "openai:gpt-4o-mini",
        output_type=str
    )
    result = await agent.run("Explain the laws of thermodynamics.")

    # Fetching token usage information
    usage = result.usage()
    print(f"Requests: {usage.requests}, Request Tokens: {usage.request_tokens}, Response Tokens: {usage.response_tokens}, Total Tokens: {usage.total_tokens}")

asyncio.run(main())
In this example, after running an AI query, you can call the usage() method on the result to retrieve information about token usage. The example extracts and formats the requests, request_tokens, response_tokens, and total_tokens attributes, helping you monitor and manage token expenditure so your AI operations remain efficient and cost-effective.
- Requests: Represents the total number of requests or API calls sent to the AI model. This number helps in tracking the frequency of interactions with the model. For example, if you ask one question and get one response, that's 1 request. If your agent makes multiple calls internally, this counts each one.
- Request Tokens: The number of tokens sent to the model in the prompt (input). It includes any initial instructions or historical data provided for context, ensuring accurate input monitoring.
- Response Tokens: The number of tokens generated by the model as output (response). Managing response tokens allows developers to anticipate and analyze the resource consumption per output block.
- Total Tokens: The sum of request and response tokens used in that request. It helps in a comprehensive understanding of resource use for each interaction, which is valuable for budgeting and optimization.
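Since input and output tokens are usually billed at different rates, these usage numbers translate directly into cost estimates. A minimal sketch; the per-million-token prices below are placeholders, not current OpenAI rates:

```python
# Placeholder per-million-token prices -- check your provider's pricing page.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def estimate_cost(request_tokens: int, response_tokens: int) -> float:
    """Estimate the dollar cost of one run from its token counts."""
    return (request_tokens * INPUT_PRICE_PER_M
            + response_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a run that sent 1,200 tokens and received 800 tokens:
print(f"${estimate_cost(1200, 800):.6f}")  # -> $0.000660
```

The same function can be fed the request_tokens and response_tokens values returned by result.usage() to track spend per call.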
Conclusion
Pydantic AI offers developers a simplified approach to incorporating AI into Python applications. By supporting both unstructured and structured output handling, it caters to varying needs, whether for raw text or organized data structures. Additionally, system messages let developers guide AI behavior according to specific requirements. This flexibility, combined with the ability to manage full chat histories, makes Pydantic AI a powerful tool in the developer's toolkit.