Instructor

Structured outputs powered by LLMs.

Designed for simplicity, transparency, and control.


Available in Python, TypeScript, Ruby, PHP, and Elixir.

pip install -U instructor

Works with OpenAI, Anthropic, Litellm, Cohere, and more.



import instructor
from pydantic import BaseModel
from openai import OpenAI


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

# Extract structured data from natural language
user_info = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

print(user_info.name)
#> John Doe
print(user_info.age)
#> 30



Key features

Pydantic Models

Specify Pydantic models to define the structure of your LLM outputs

Validation

Ensure LLM responses conform to your expectations with Pydantic validation

Flexible Backends

Seamlessly integrate with various LLM providers beyond OpenAI

Retry Management

Easily configure the number of retry attempts for your requests

Why use instructor?

Pydantic over Raw Schema

I find many prompt-building tools overly complex and difficult to use: they may be simple to get started with on trivial examples, but once you need more control, you end up wishing they were simpler. Instructor does the least amount of work needed to get the job done.

Pydantic

Json Schema

var = {
    "$defs": {
        "Character": {
            "description": "Any character in a fictional story",
            "properties": {
                "name": {"title": "Name", "type": "string"},
                "age": {"title": "Age", "type": "integer"},
                "properties": {
                    "type": "array",
                    "items": {"$ref": "#/$defs/Property"},
                    "title": "Properties",
                },
                "role": {
                    "enum": ["protagonist", "antagonist", "supporting"],
                    "title": "Role",
                    "type": "string",
                },
            },
            "required": ["name", "age", "properties", "role"],
            "title": "Character",
            "type": "object",
        },
        "Property": {
            "properties": {
                "name": {
                    "description": "name of property in snake case",
                    "title": "Name",
                    "type": "string",
                },
                "value": {"title": "Value", "type": "string"},
            },
            "required": ["name", "value"],
            "title": "Property",
            "type": "object",
        },
    },
    "properties": {
        "characters": {
            "description": "A list of all characters in the story",
            "items": {"$ref": "#/$defs/Character"},
            "title": "Characters",
            "type": "array",
        }
    },
    "required": ["characters"],
    "title": "AllCharacters",
    "type": "object",
}
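The raw schema above is fully determined by a few short Pydantic class definitions; a sketch of the equivalent models, with class names and descriptions taken directly from the schema's titles:

```python
from typing import List, Literal

from pydantic import BaseModel, Field


class Property(BaseModel):
    name: str = Field(..., description="name of property in snake case")
    value: str


class Character(BaseModel):
    """Any character in a fictional story"""

    name: str
    age: int
    properties: List[Property]
    # Literal renders as an inline enum in the generated schema
    role: Literal["protagonist", "antagonist", "supporting"]


class AllCharacters(BaseModel):
    characters: List[Character] = Field(
        ..., description="A list of all characters in the story"
    )
```

Calling AllCharacters.model_json_schema() reproduces the schema shown above; the Pydantic side is what you write and maintain.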

Easy to try and install

The minimum viable API just adds response_model to the client; if you decide you don't want a model, it's very easy to remove it and continue building your application.

Instructor

OpenAI

import instructor
from openai import OpenAI
from pydantic import BaseModel

# Patch the OpenAI client with Instructor
client = instructor.from_openai(OpenAI())


class UserDetail(BaseModel):
    name: str
    age: int


# Function to extract user details
def extract_user() -> UserDetail:
    user = client.chat.completions.create(
        model="gpt-4-turbo-preview",
        response_model=UserDetail,
        messages=[
            {"role": "user", "content": "Extract Jason is 25 years old"},
        ],
    )
    return user

Cookbook

How are single and multi-label classifications done using enums?

How is AI self-assessment implemented with llm_validator?

How is batch classification done from user-provided classes?

How are exact citations retrieved using regular expressions and smart prompting?

How are search queries segmented through function calling and multi-task definitions?

How are knowledge graphs generated from questions?

Full Cookbook

Community

Discord

Github

Documentation