Trace Tools
Overview
In Arize, traces represent end-to-end executions of your LLM application, composed of individual spans. Each span captures a single operation (LLM call, retrieval, tool use, etc.) with its input, output, latency, and status.
In arize_toolkit, the Client exposes helpers for:
- Discovering available span column names for a model
- Listing recent traces (root spans) for a model in a time window
- Retrieving all spans for a specific trace with full attributes
- Exporting trace data as pandas DataFrames for analysis
| Operation | Helper |
|---|---|
| Discover available columns | get_span_columns |
| List traces for a model | list_traces |
| Get all spans for a trace | get_trace |
Trace Operations
get_span_columns
columns: list[str] = client.get_span_columns(
    model_name="my-agent",
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-01-02T00:00:00Z",
)
Discovers all available span column names for a model by querying tracingSchema.spanProperties. Returns column names in attributes.* format ready to pass to get_trace().
Parameters
- model_name (Optional[str]): Name of the model. Either model_name or model_id is required.
- model_id (Optional[str]): ID of the model (base64-encoded). Either model_name or model_id is required.
- start_time (Optional[datetime | str]): Start of the time window. Defaults to 7 days ago.
- end_time (Optional[datetime | str]): End of the time window. Defaults to now.
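Since start_time and end_time accept either a datetime or a string, it can help to build the window explicitly rather than rely on the 7-day default. The following is a minimal sketch, assuming timezone-aware UTC datetimes and ISO 8601 strings in the style of the signature above are accepted; utc_window and to_iso_z are hypothetical helpers, not part of arize_toolkit.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utc_window(days: int = 7, now: Optional[datetime] = None) -> tuple[datetime, datetime]:
    """Return (start, end) as timezone-aware UTC datetimes covering the last `days` days."""
    end = now or datetime.now(timezone.utc)
    return end - timedelta(days=days), end


def to_iso_z(dt: datetime) -> str:
    """Format a timezone-aware datetime as e.g. 2025-01-01T00:00:00Z."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Fixed "now" so the printed result is reproducible (hypothetical value).
start, end = utc_window(days=1, now=datetime(2025, 1, 2, tzinfo=timezone.utc))
print(to_iso_z(start), to_iso_z(end))
# 2025-01-01T00:00:00Z 2025-01-02T00:00:00Z
```

Either the datetime objects or the formatted strings could then be passed as start_time and end_time.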
Returns
A list of column name strings, e.g. ["attributes.input.value", "attributes.output.value", "attributes.llm.model_name", ...].
Example
columns = client.get_span_columns(model_name="business-intel-agent")
print(columns)
# ['attributes.input.value', 'attributes.output.value', 'attributes.llm.model_name', ...]
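The returned list can be long, so grouping it by the segment after attributes. makes it easier to scan. This is a small sketch that works on plain strings only; group_by_namespace is a hypothetical helper, and the sample list stands in for a real get_span_columns() result.

```python
from collections import defaultdict


def group_by_namespace(columns: list[str]) -> dict[str, list[str]]:
    """Group 'attributes.<namespace>.<rest>' names by <namespace>."""
    groups: dict[str, list[str]] = defaultdict(list)
    for col in columns:
        parts = col.split(".")
        # Names that do not follow the attributes.* format go under "other".
        is_attr = len(parts) > 1 and parts[0] == "attributes"
        groups[parts[1] if is_attr else "other"].append(col)
    return dict(groups)


# Hypothetical sample; in practice this would come from client.get_span_columns(...).
sample = [
    "attributes.input.value",
    "attributes.output.value",
    "attributes.llm.model_name",
    "attributes.llm.token_count.total",
]
for namespace, cols in group_by_namespace(sample).items():
    print(namespace, cols)
```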
list_traces
traces: list[dict] = client.list_traces(
    model_name="my-agent",
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-01-02T00:00:00Z",
    count=20,
    sort_direction="desc",
)
Lists root spans (one per trace) for a model within a time window. Use this to discover trace IDs for further inspection. By default, it requests attributes.input.value and attributes.output.value as structured columns.
Parameters
- model_name (Optional[str]): Name of the model. Either model_name or model_id is required.
- model_id (Optional[str]): ID of the model (base64-encoded). Either model_name or model_id is required.
- start_time (Optional[datetime | str]): Start of the time window. Defaults to 7 days ago.
- end_time (Optional[datetime | str]): End of the time window. Defaults to now.
- count (int): Number of traces per page. Default 20.
- sort_direction (str): Sort direction: "desc" or "asc". Default "desc".
- to_dataframe (bool): If True, return a pandas DataFrame with flattened attributes as columns. Default False.
Returns
When to_dataframe=False (default), a list of dictionaries — one per trace — containing:
- traceId: Unique trace identifier
- name: Root span name
- spanKind: Span kind (e.g. CHAIN, LLM, AGENT)
- statusCode: Status (OK, ERROR, UNSET)
- startTime: When the trace started
- latencyMs: End-to-end latency in milliseconds
- spanId: Root span identifier
- parentId: Always None for root spans
- attributes: JSON string containing all span attributes
- columns: Structured column values for requested column names
- traceTokenCounts: Aggregate token counts (prompt, completion, total)
When to_dataframe=True, a pandas DataFrame with the above fields plus all attributes flattened as attributes.<key> columns.
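To illustrate what flattening into attributes.<key> names could look like when working with the dictionary form, here is a minimal sketch. It assumes the attributes JSON string decodes to an object that may be nested; the library's own flattening logic is not shown in this document, and flatten_attributes and the sample record are hypothetical.

```python
import json
from typing import Any


def flatten_attributes(attributes_json: str, prefix: str = "attributes") -> dict[str, Any]:
    """Parse an attributes JSON string and flatten nested objects into dotted keys."""
    flat: dict[str, Any] = {}

    def walk(node: Any, path: str) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}")
        else:
            # Leaves (including lists) are stored unchanged.
            flat[path] = node

    walk(json.loads(attributes_json), prefix)
    return flat


# Hypothetical trace record containing only the fields needed here.
trace = {
    "traceId": "abc123-def456",
    "attributes": '{"input": {"value": "hi"}, "llm": {"model_name": "m"}}',
}
print(flatten_attributes(trace["attributes"]))
# {'attributes.input.value': 'hi', 'attributes.llm.model_name': 'm'}
```

If the JSON already uses dotted keys such as "input.value", the same function simply prepends the prefix.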
Example
from arize_toolkit import Client
client = Client(organization="my-org", space="my-space")
# List the 10 most recent traces
traces = client.list_traces(model_name="business-intel-agent", count=10)
for t in traces:
    print(f"[{t['statusCode']}] {t['name']} - {t['latencyMs']:.0f}ms - {t['traceId']}")
# Get traces as a DataFrame for analysis
df = client.list_traces(model_name="business-intel-agent", count=50, to_dataframe=True)
print(df[["traceId", "name", "latencyMs", "attributes.input.value"]].head())
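Once a list of trace dictionaries is in hand, a quick health summary needs only the documented statusCode, latencyMs, and traceId fields. The sketch below assumes latencyMs is always numeric; summarize_traces and the sample records are hypothetical.

```python
from collections import Counter


def summarize_traces(traces: list[dict]) -> dict:
    """Count traces per statusCode and report the slowest trace by latencyMs."""
    status_counts = Counter(t["statusCode"] for t in traces)
    slowest = max(traces, key=lambda t: t["latencyMs"], default=None)
    return {
        "status_counts": dict(status_counts),
        "slowest_trace_id": slowest["traceId"] if slowest is not None else None,
        "slowest_latency_ms": slowest["latencyMs"] if slowest is not None else None,
    }


# Hypothetical records shaped like the list_traces() output described above.
sample_traces = [
    {"traceId": "t1", "statusCode": "OK", "latencyMs": 120.0},
    {"traceId": "t2", "statusCode": "ERROR", "latencyMs": 950.5},
    {"traceId": "t3", "statusCode": "OK", "latencyMs": 310.2},
]
print(summarize_traces(sample_traces))
```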
get_trace
spans: list[dict] = client.get_trace(
    trace_id="abc123-def456",
    model_name="my-agent",
)
Retrieves all spans for a specific trace with full attributes and structured column data. When column_names is not specified, all available columns are auto-discovered via get_span_columns().
Parameters
- trace_id (str): The trace ID to look up.
- model_name (Optional[str]): Name of the model. Either model_name or model_id is required.
- model_id (Optional[str]): ID of the model (base64-encoded). Either model_name or model_id is required.
- start_time (Optional[datetime | str]): Start of the time window. Defaults to 7 days ago.
- end_time (Optional[datetime | str]): End of the time window. Defaults to now.
- column_names (Optional[list[str]]): Column names to include (e.g. ["attributes.input.value"]). If None (default), all available columns are auto-discovered.
- count (int): Number of spans per page. Default 20.
- to_dataframe (bool): If True, return a pandas DataFrame with flattened attributes as columns. Default False.
Returns
When to_dataframe=False (default), a list of dictionaries — one per span — containing:
- spanId: Span identifier
- traceId: Parent trace identifier
- name: Span name
- spanKind: Span kind (e.g. LLM, RETRIEVER, TOOL)
- statusCode: Status (OK, ERROR, UNSET)
- parentId: Parent span ID (None for root)
- startTime: When the span started
- latencyMs: Span latency in milliseconds
- attributes: JSON string of all span attributes
- columns: Structured column values for requested column names
- traceTokenCounts: Aggregate token counts (prompt, completion, total)
When to_dataframe=True, a pandas DataFrame with span fields plus attributes flattened as attributes.<key> columns. Structured column values take precedence over parsed attributes.
Column Names
Column names use the attributes.* prefix format. Use get_span_columns() to discover available columns, or refer to common ones below:
| Category | Column Names |
|---|---|
| Core | attributes.input.value, attributes.output.value |
| LLM Messages | attributes.llm.input_messages, attributes.llm.output_messages |
| Token Counts | attributes.llm.token_count.prompt, attributes.llm.token_count.completion, attributes.llm.token_count.total |
| Metadata | attributes.llm.model_name, attributes.llm.provider |
Example
# Get all spans with all available columns (auto-discovered)
spans = client.get_trace(
    trace_id="abc123-def456",
    model_name="business-intel-agent",
)
for s in spans:
    indent = " " if s["parentId"] else ""
    print(f"{indent}{s['name']} ({s['spanKind']}) - {s['latencyMs']:.0f}ms")
# Get a DataFrame with specific columns
df = client.get_trace(
    trace_id="abc123-def456",
    model_name="business-intel-agent",
    column_names=[
        "attributes.input.value",
        "attributes.output.value",
        "attributes.llm.token_count.total",
    ],
    to_dataframe=True,
)
print(df[["name", "spanKind", "latencyMs", "attributes.input.value"]].to_string())
# To choose column_names, discover the available columns first
columns = client.get_span_columns(model_name="business-intel-agent")
print(f"Available columns: {columns}")
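The loop above indents every non-root span by the same amount. For deeper traces, the parentId links can be used to indent each span by its depth. This is a minimal sketch that relies only on the documented spanId, parentId, name, spanKind, and startTime fields and assumes startTime values are mutually comparable (for example, ISO 8601 strings); render_span_tree and the sample spans are hypothetical.

```python
from collections import defaultdict
from typing import Optional


def render_span_tree(spans: list[dict]) -> list[str]:
    """Return one line per span, indented by its depth in the parentId hierarchy."""
    span_ids = {s["spanId"] for s in spans}
    children: dict[Optional[str], list[dict]] = defaultdict(list)
    for s in spans:
        # Spans whose parent is absent from the list are treated as roots.
        parent = s["parentId"] if s["parentId"] in span_ids else None
        children[parent].append(s)

    lines: list[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for s in sorted(children.get(parent_id, []), key=lambda s: s["startTime"]):
            lines.append(f"{'  ' * depth}{s['name']} ({s['spanKind']})")
            walk(s["spanId"], depth + 1)

    walk(None, 0)
    return lines


# Hypothetical spans shaped like the get_trace() output described above.
sample_spans = [
    {"spanId": "a", "parentId": None, "name": "agent", "spanKind": "AGENT", "startTime": "2025-01-01T00:00:00Z"},
    {"spanId": "c", "parentId": "b", "name": "search", "spanKind": "TOOL", "startTime": "2025-01-01T00:00:02Z"},
    {"spanId": "b", "parentId": "a", "name": "plan", "spanKind": "LLM", "startTime": "2025-01-01T00:00:01Z"},
]
print("\n".join(render_span_tree(sample_spans)))
```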