Observability for OpenAI SDK (Python)

Looking for the JS/TS version? Check it out here.

If you use the OpenAI Python SDK, you can use the Langfuse drop-in replacement to get full logging by changing only the import. This works with OpenAI and Azure OpenAI.

- import openai
+ from langfuse.openai import openai
 
Alternative imports:
+ from langfuse.openai import OpenAI, AsyncOpenAI, AzureOpenAI, AsyncAzureOpenAI

Langfuse automatically tracks:

All prompts/completions with support for streaming, async and functions
Latencies
API Errors (example)
Model usage (tokens) and cost (USD) (learn more)

In the Langfuse Console

How it works

Install Langfuse SDK

The integration is compatible with OpenAI SDK versions >=0.27.8. It supports async functions and streaming for OpenAI SDK versions >=1.0.0.

pip install langfuse openai
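The version ranges above can be checked programmatically before relying on async or streaming support. The following is a minimal sketch: `openai_sdk_support` is a hypothetical helper (not part of Langfuse or the OpenAI SDK), and it only compares the first three numeric components of the version string, so pre-release suffixes are ignored.

```python
import re


def openai_sdk_support(version):
    """Classify an OpenAI SDK version string against the ranges stated above.

    Returns "unsupported" below 0.27.8, "basic" from 0.27.8 up to (but not
    including) 1.0.0, and "full" (async and streaming) from 1.0.0 onwards.
    """
    # Keep only the leading numeric components, e.g. "1.35.10" -> (1, 35, 10).
    parts = tuple(int(p) for p in re.findall(r"\d+", version)[:3])
    if parts < (0, 27, 8):
        return "unsupported"
    if parts < (1, 0, 0):
        return "basic"
    return "full"
```

With the SDK installed, the installed version string can be obtained from the standard library via `importlib.metadata.version("openai")` and passed to the helper.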

Switch to Langfuse Wrapped OpenAI SDK

Add Langfuse credentials to your environment variables

.env

LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_HOST="https://cloud.langfuse.com" # 🇪🇺 EU region
# LANGFUSE_HOST="https://us.cloud.langfuse.com" # 🇺🇸 US region
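A missing or empty variable is easy to overlook, so a short check at startup can help. The sketch below is an illustration only: `missing_langfuse_env` is a hypothetical helper, not part of the Langfuse SDK, and it assumes that the two keys shown above are the settings that must be present.

```python
import os

# Variable names taken from the .env example above (assumed to be required).
REQUIRED_VARS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY")


def missing_langfuse_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_langfuse_env()
    if missing:
        print("Missing Langfuse settings:", ", ".join(missing))
    else:
        print("Langfuse keys are set.")
```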

Change import

- import openai
+ from langfuse.openai import openai
 
Alternative imports:
+ from langfuse.openai import OpenAI, AsyncOpenAI, AzureOpenAI, AsyncAzureOpenAI

Optionally, verify that the SDK can connect to the Langfuse server. This check is not recommended for production use.

openai.langfuse_auth_check()

Use OpenAI SDK as usual

No changes required.

Check out the notebook for end-to-end examples of the integration:

Example notebook Error tracking example

Troubleshooting

Queuing and batching of events

The Langfuse SDKs queue and batch events in the background to reduce the number of network requests and improve overall performance. In a long-running application, this works without any additional configuration.

If you are running a short-lived application, you need to flush Langfuse before the application exits so that all queued events are sent.

openai.flush_langfuse()

Learn more about queuing and batching of events here.
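To see why a short-lived script needs an explicit flush, consider the toy model below. It is a sketch of the general queue-and-batch pattern, not Langfuse's actual implementation: `BatchingQueue`, `send_batch` and `batch_size` are hypothetical names chosen for illustration. Events are sent by a daemon worker thread, which is stopped when the process exits, so anything still queued at that point would be lost unless `flush()` is called first.

```python
import queue
import threading


class BatchingQueue:
    """Toy event queue: a background worker sends events in small batches."""

    def __init__(self, send_batch, batch_size=3):
        self._queue = queue.Queue()
        self._send_batch = send_batch
        self._batch_size = batch_size
        # Daemon thread: it does not keep the process alive on exit.
        threading.Thread(target=self._run, daemon=True).start()

    def enqueue(self, event):
        """Add an event without waiting for it to be sent."""
        self._queue.put(event)

    def _run(self):
        while True:
            batch = [self._queue.get()]  # block until one event is available
            while len(batch) < self._batch_size:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            self._send_batch(batch)
            for _ in batch:
                self._queue.task_done()

    def flush(self):
        """Block until every enqueued event has been handed to send_batch."""
        self._queue.join()


sent_batches = []
events = BatchingQueue(sent_batches.append, batch_size=2)
for i in range(5):
    events.enqueue({"event": i})
events.flush()  # without this, a script exiting here could drop queued events
print(sent_batches)
```

How the five events are split into batches depends on thread timing, but after `flush()` returns all of them have been sent, in order.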

Assistants API

Tracing of the Assistants API is not supported by this integration, as OpenAI Assistants have server-side state that cannot easily be captured without additional API requests. We added more information on how to best track usage of the Assistants API in this FAQ.

Debug mode

If you are having issues with the integration, you can enable debug mode to get more information about the requests and responses.

openai.langfuse_debug=True

Streaming function / tool calls

Capturing input and output when streaming function / tool calls is currently not supported. Please upvote this feature request in the GitHub discussion if you would like to see this supported going forward.

Advanced usage

Custom trace properties

You can add the following properties to the openai method, e.g. openai.chat.completions.create(), to use additional Langfuse features:

Property	Description
`name`	Set `name` to identify a specific type of generation.
`metadata`	Set `metadata` with additional information that you want to see in Langfuse.
`session_id`	The current session.
`user_id`	The current user_id.
`tags`	Set tags to categorize and filter traces.
`trace_id`	See "Interoperability with Langfuse Python SDK" (below) for more details.
`parent_observation_id`	See "Interoperability with Langfuse Python SDK" (below) for more details.
`sample_rate`	Sample rate for tracing.

Example:

openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
      {"role": "system", "content": "You are a very accurate calculator. You output only the result of the calculation."},
      {"role": "user", "content": "1 + 1 = "}],
    name="test-chat",
    metadata={"someMetadataKey": "someValue"},
)
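The remaining properties from the table can be passed in the same way. The sketch below adds `session_id`, `user_id` and `tags` to the call; `traced_request_kwargs` and `ask` are hypothetical helpers introduced only for illustration, and the tag values are made up. Building the keyword arguments in a separate function keeps them easy to inspect without credentials or network access.

```python
def traced_request_kwargs(question, session_id, user_id):
    """Hypothetical helper: build kwargs for openai.chat.completions.create()."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "system",
                "content": "You are a very accurate calculator. "
                "You output only the result of the calculation.",
            },
            {"role": "user", "content": question},
        ],
        # Langfuse-specific properties from the table above.
        "name": "test-chat",
        "metadata": {"someMetadataKey": "someValue"},
        "session_id": session_id,
        "user_id": user_id,
        "tags": ["calculator", "example"],  # illustrative values
    }


def ask(question, session_id, user_id):
    """Send the request through the Langfuse-wrapped OpenAI SDK."""
    # Imported here so the helper above can be used without the SDK installed.
    from langfuse.openai import openai

    response = openai.chat.completions.create(
        **traced_request_kwargs(question, session_id, user_id)
    )
    return response.choices[0].message.content
```

Calling `ask("1 + 1 = ", "session-123", "user-456")` requires the environment variables from the setup section and an OpenAI API key.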

Use Traces

Langfuse Tracing groups multiple observations (any LLM or non-LLM call) into a single trace. By default, this integration creates a single trace for each OpenAI call. Working with traces explicitly lets you:

Add non-OpenAI related observations to the trace.
Group multiple OpenAI calls into a single trace while customizing the trace.
Have more control over the trace structure.
Use all Langfuse Tracing features.

New to Langfuse Tracing? Check out this introduction to the basic concepts.

You can use any of the following three options:

Python @observe() decorator
Set trace_id property, best if you have an existing id from your application.
Use the low-level SDK to create traces manually and add OpenAI calls to it.

Desired trace structure:

TRACE: capital_poem_generator(input="Bulgaria")
|
|-- GENERATION: get-capital
|
|-- GENERATION: generate-poem

Implementation:

from langfuse.decorators import observe
from langfuse.openai import openai
 
@observe()
def capital_poem_generator(country):
  capital = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "What is the capital of the country?"},
        {"role": "user", "content": country}],
    name="get-capital",
  ).choices[0].message.content
 
  poem = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a poet. Create a poem about this city."},
        {"role": "user", "content": capital}],
    name="generate-poem",
  ).choices[0].message.content
  return poem
 
capital_poem_generator("Bulgaria")
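The second option from the list, setting the `trace_id` property, could look roughly like the sketch below. It assumes that calls sharing the same `trace_id` are grouped into one trace, as the property table suggests; the UUID format of the generated id is also an assumption. The `create` parameter is a hypothetical addition that stands in for `openai.chat.completions.create` so the function can be exercised without network access.

```python
import uuid


def capital_poem_generator(country, create, trace_id=None):
    """Run both generations under one trace by passing the same trace_id.

    `create` is expected to behave like openai.chat.completions.create
    from langfuse.openai.
    """
    # Reuse an id from your application if you have one, else generate one.
    trace_id = trace_id or str(uuid.uuid4())

    capital = create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "What is the capital of the country?"},
            {"role": "user", "content": country},
        ],
        name="get-capital",
        trace_id=trace_id,
    ).choices[0].message.content

    poem = create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a poet. Create a poem about this city."},
            {"role": "user", "content": capital},
        ],
        name="generate-poem",
        trace_id=trace_id,
    ).choices[0].message.content

    return trace_id, poem
```

In an application this would be called as `capital_poem_generator("Bulgaria", openai.chat.completions.create)` after `from langfuse.openai import openai`. Returning the `trace_id` makes it possible to store it alongside your own records.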

OpenAI Beta APIs

Since OpenAI beta APIs are changing frequently across versions, we fully support only the stable APIs in the OpenAI SDK. If you are using a beta API, you can still use the Langfuse SDK by wrapping the OpenAI SDK manually with the @observe() decorator.

Structured Output

For structured output parsing, please use the response_format argument to openai.chat.completions.create() instead of the Beta API. This will allow you to set Langfuse attributes and metadata.

If you rely on parsing Pydantic definitions for your response_format, you can use the type_to_response_format_param utility function from the OpenAI Python SDK to convert the Pydantic definition to a response_format dictionary. This is the same function the OpenAI Beta API uses to convert Pydantic definitions to response_format dictionaries.

from langfuse.openai import openai
from openai.lib._parsing._completions import type_to_response_format_param
from pydantic import BaseModel
 
class CalendarEvent(BaseModel):
  name: str
  date: str
  participants: list[str]
 
 
completion = openai.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format=type_to_response_format_param(CalendarEvent),
)
 
print(completion)
 
openai.flush_langfuse()
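If you prefer not to depend on Pydantic or on a private SDK module, the response_format dictionary can also be written by hand. The sketch below shows an assumed equivalent for `CalendarEvent`, following the JSON Schema form used by OpenAI structured outputs; `calendar_event_format` and `parse_calendar_event` are hypothetical names, and the exact dictionary produced by type_to_response_format_param may differ in detail.

```python
import json

# Hand-written response_format for the CalendarEvent example (assumed shape).
calendar_event_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "CalendarEvent",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "date": {"type": "string"},
                "participants": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["name", "date", "participants"],
            "additionalProperties": False,
        },
    },
}


def parse_calendar_event(content):
    """Parse the JSON string found in completion.choices[0].message.content."""
    event = json.loads(content)
    required = calendar_event_format["json_schema"]["schema"]["required"]
    missing = [key for key in required if key not in event]
    if missing:
        raise ValueError(f"Missing fields in response: {missing}")
    return event
```

The dictionary would be passed as `response_format=calendar_event_format` in the call above, and the returned message content handed to `parse_calendar_event`.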

Assistants API

Tracing of the Assistants API is not supported by this integration, as OpenAI Assistants have server-side state that cannot easily be captured without additional API requests. Check out this notebook for an end-to-end example of how to best track usage of the Assistants API in Langfuse.

FAQ

How to trace the OpenAI Assistants API?
