🦙 LlamaIndex Integration

LlamaIndex (GitHub) is an advanced "data framework" tailored for augmenting Large Language Models (LLMs) with private data.

It streamlines the integration of diverse data sources and formats (APIs, PDFs, docs, SQL, etc.) through versatile data connectors and structures data into indices and graphs for LLM compatibility. The platform offers a sophisticated retrieval/query interface for enriching LLM inputs with context-specific outputs. Designed for both beginners and experts, LlamaIndex provides a user-friendly high-level API for easy data ingestion and querying, alongside customizable lower-level APIs for detailed module adaptation.

Langfuse offers a simple integration for automatic capture of traces and metrics generated in LlamaIndex applications. Any feedback? Let us know on Discord or GitHub. This is a new integration, and we'd love to hear your thoughts.

Currently, this integration supports only Python. If you are interested in an integration with LlamaIndex.TS, add your upvote or comments to this issue.

Example LlamaIndex trace in Langfuse. See a full video demo here.

Add Langfuse to your LlamaIndex application

Make sure you have both llama-index and langfuse installed.

pip install llama-index langfuse

At the root of your LlamaIndex application, register Langfuse's LlamaIndexCallbackHandler in the LlamaIndex Settings.callback_manager. When instantiating LlamaIndexCallbackHandler, make sure it is configured with your Langfuse API keys and the host URL, for example through the environment variables shown here.

.env

LANGFUSE_SECRET_KEY="sk-lf-..."
LANGFUSE_PUBLIC_KEY="pk-lf-..."
LANGFUSE_HOST="https://cloud.langfuse.com" # 🇪🇺 EU region
# LANGFUSE_HOST="https://us.cloud.langfuse.com" # 🇺🇸 US region

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler
 
langfuse_callback_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_callback_handler])

✨

Done! Traces and metrics from your LlamaIndex application are now automatically tracked in Langfuse. If you construct a new index or query an LLM with your documents in context, your traces and metrics are immediately visible in the Langfuse UI.
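If traces do not show up, one thing worth checking is whether the environment variables from the .env example are actually set in the process. The following is a minimal sketch using only the Python standard library. The helper name missing_langfuse_env is hypothetical and not part of the Langfuse SDK, and treating all three variables as required is an assumption of this sketch.

```python
import os

# Variable names as shown in the .env example.
# Assumption of this sketch: all three are treated as required.
REQUIRED_VARS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_HOST")


def missing_langfuse_env(env=None):
    """Return the names of the variables that are unset or empty.

    Hypothetical helper, not part of the Langfuse SDK.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_langfuse_env() before registering the handler returns an empty list when everything is configured, and otherwise the names that still need a value.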

Check out the notebook for end-to-end examples of the integration:

Example Notebook

Additional configuration

Queuing and flushing

The Langfuse SDKs queue and batch events in the background to reduce the number of network requests and improve overall performance. In a long-running application, this works without any additional configuration.

If you are running a short-lived application, such as a script or a serverless function, call flush() on the handler so that all queued events are sent before the application exits.

langfuse_callback_handler.flush()
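In a short-lived script it can help to make the flush unconditional, so that events are still sent when the workload raises an exception. The following is a minimal sketch of that pattern with try/finally. The run_with_flush helper and the RecordingHandler stand-in are hypothetical and only exist so the example runs on its own. In a real application you would pass the Langfuse handler instead.

```python
def run_with_flush(handler, work):
    """Run work() and flush the handler afterwards, even if work() raises."""
    try:
        return work()
    finally:
        # Runs on both success and failure, so queued events are not lost.
        handler.flush()


class RecordingHandler:
    """Hypothetical stand-in for the Langfuse handler; it only counts flush calls."""

    def __init__(self):
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
```

The exception from work() still propagates to the caller. The finally block only guarantees that flush() is called first.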

Learn more about queuing and batching of events here.

Custom trace parameters

You can update trace parameters at any time to add context to a trace, such as a user ID, session ID, or tags. All subsequent traces will include the parameters you set. See the Python SDK Trace documentation for more information.

Property	Description
`name`	Identify a specific type of trace, e.g. a use case or functionality.
`metadata`	Additional information that you want to see in Langfuse. Can be any JSON.
`session_id`	The current session.
`user_id`	The current user_id.
`tags`	Tags to categorize and filter traces.
`version`	The specified version to trace experiments.
`release`	The specified release to trace experiments.
`sample_rate`	Sample rate for tracing.

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler
 
# Instantiate a new LlamaIndexCallbackHandler and register it in the LlamaIndex Settings
langfuse_handler = LlamaIndexCallbackHandler()
Settings.callback_manager = CallbackManager([langfuse_handler])
 
def my_func():
  # Set trace parameters before executing your LlamaIndex code
  langfuse_handler.set_trace_params(
    user_id="user-123",
    session_id="session-abc",
    tags=["production"]
  )
 
  # Your LlamaIndex code, trace will include the set parameters

Notes

The params will be applied to all traces and spans created after the set_trace_params call. You can unset them by calling e.g. set_trace_params(user_id=None).

If you run this in a Jupyter Notebook, you need to run set_trace_params in the same cell as your LlamaIndex code.

When setting a root trace or span, this setting will have no effect as the root trace or span will be used. See next section for more information.
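Because the parameters stay in effect until they are unset, it can be convenient to scope them to a block of code. The following is a minimal sketch of such a scope as a context manager. It relies only on set_trace_params and on unsetting by passing None, as described in the notes. The scoped_trace_params helper and the RecordingHandler stand-in are hypothetical and not part of the Langfuse SDK.

```python
from contextlib import contextmanager


@contextmanager
def scoped_trace_params(handler, **params):
    """Set trace params for the duration of a block, then unset them again."""
    handler.set_trace_params(**params)
    try:
        yield handler
    finally:
        # Unset by passing None for every parameter that was set above.
        handler.set_trace_params(**{name: None for name in params})


class RecordingHandler:
    """Hypothetical stand-in that records each set_trace_params call."""

    def __init__(self):
        self.calls = []

    def set_trace_params(self, **params):
        self.calls.append(params)
```

With a real handler, the LlamaIndex code would go inside a block such as `with scoped_trace_params(langfuse_handler, user_id="user-123"):`, which also keeps the call and the traced code in the same Jupyter cell.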

Interoperability with Langfuse SDK

The Langfuse Python SDK is fully interoperable with the LlamaIndex integration.

This is useful when your LlamaIndex executions are part of a larger application and you want to link all traces and spans together. This can also be useful when you'd like to group multiple LlamaIndex executions to be part of the same trace or span.

When using the Langfuse @observe() decorator, langfuse_context.get_current_llama_index_handler() exposes a callback handler scoped to the current trace context, in this case llama_index_fn(). Pass it to the LlamaIndex Settings.callback_manager to trace subsequent LlamaIndex executions.

from langfuse.decorators import langfuse_context, observe
from llama_index.core import Document, VectorStoreIndex
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
 
@observe()
def llama_index_fn(question: str):
    # Set callback manager for LlamaIndex, will apply to all LlamaIndex executions in this function
    langfuse_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_handler])
 
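    # Run application; doc1 and doc2 are placeholder documents for illustration
    doc1 = Document(text="Placeholder text for the first document.")
    doc2 = Document(text="Placeholder text for the second document.")
    index = VectorStoreIndex.from_documents([doc1, doc2])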
    response = index.as_query_engine().query(question)
    return response

Notes

The LlamaIndex integration will not make any changes to your provided root trace or span. If you want to add additional context or input/output to your root trace or span, you can do so via the Python SDK.

This relies on Python context variables (contextvars), so in Jupyter it works reliably when the decorated function and the LlamaIndex code run in the same cell.
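To illustrate why the current trace context matters, here is a toy sketch of how a decorator can expose "the active trace" through Python's contextvars module. It is not Langfuse's implementation. The names observe_sketch, get_current_trace, and _current_trace are hypothetical and chosen only for this example.

```python
import contextvars
import functools

# Hypothetical variable holding the name of the active trace; None outside any trace.
_current_trace = contextvars.ContextVar("current_trace", default=None)


def observe_sketch(fn):
    """Toy decorator: marks fn as the active trace while it runs."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        token = _current_trace.set(fn.__name__)
        try:
            return fn(*args, **kwargs)
        finally:
            # Restore the previous value so nested or later calls are unaffected.
            _current_trace.reset(token)

    return wrapper


def get_current_trace():
    """Return the name of the active trace, or None if there is none."""
    return _current_trace.get()


@observe_sketch
def llama_index_fn():
    # Code called from here sees this function as the active trace.
    return get_current_trace()
```

Anything that looks up the context variable while the decorated function is running sees that function as the active trace, and the value is reset once the function returns. This is the general reason a handler obtained inside the decorated function is scoped to that function's trace.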

🛠️ Beta: Observability based on LlamaIndex instrumentation module

⚠️ For production use cases, we recommend using the callback-based LlamaIndex integration with Langfuse as described above until this integration is stable.

The new LlamaIndex instrumentation module allows for seamless instrumentation of LlamaIndex applications. In particular, you can handle events and track spans using custom logic as well as the handlers offered in the module. You can also define your own events and specify where and when in the code they should be emitted.

Langfuse offers an experimental integration with the LlamaIndex instrumentation module. This integration allows for the automatic capture of traces and metrics generated in LlamaIndex applications. To enable this integration, add the following code snippet to your LlamaIndex application.

NOTE: This integration is in beta and under active development. Behavior and APIs may change in future releases.

import llama_index.core.instrumentation as instrument
from langfuse.llama_index import LlamaIndexSpanHandler
 
langfuse_span_handler = LlamaIndexSpanHandler()
instrument.get_dispatcher().add_span_handler(langfuse_span_handler)

Please report issues and feedback to us on GitHub.
