Development
Setup
The following instructions are for setting up a development environment for the arize_toolkit.
Before you begin, make sure to clone the repository and navigate to the arize_toolkit
directory:
git clone https://github.com/duncankmckinnon/arize_toolkit.git
cd arize_toolkit
To set up a development environment for this project, first run the bootstrap script to create a named virtual environment and install the dependencies using uv:
sh ./bin/bootstrap.sh
Then activate the virtual environment:
source arize-toolkit-venv/bin/activate
You're ready to develop! The virtual environment will be created in the current directory with the name "arize-toolkit-venv".
Base Classes Explained
1. BaseVariables
BaseVariables is the base class for all query variables. It provides a structure for defining the variables for a query and ensures that the variables are validated and serialized correctly.
It inherits from Dictable
, which is a utility class that wraps a Pydantic BaseModel. This interface allows for consistent type conversions between graphql friendly dictionaries and objects. All the model types in the arize_toolkit eventually inherit from Dictable
so that they can be used in the same way.
The endCursor
field is used in pagination throughout Arize graphql, so it is included as an optional field by default.
class BaseVariables(Dictable):
"""Base class for all query variables"""
endCursor: Optional[str] = None
Purpose:
- Validates input parameters for GraphQL queries using Pydantic
- Ensures type safety for query variables
- Provides automatic validation and serialization
- Includes pagination support via
endCursor
Implementation in BaseQuery
The BaseQuery
class requires a Variables
class that often inherits from BaseVariables
. This allows for the variables to be validated and serialized correctly. When defining the Variables
class you simply need to define the fields and types of variables used as input to the query.
Example Usage:
class GetModelQuery(BaseQuery):
class Variables(BaseVariables):
space_id: str
model_name: str
The BaseVariables class is a convenient tool for validating the input to the query, but in situations where the input to the mutation is already represented by a model type, it may be more convenient to override the BaseVariables class in the BaseQuery with the model type definition instead.
Example of a mutation using an existing model type:
class Thing(GraphQLModel):
id: str
name: str
class CreateThingMutation(BaseQuery):
class Variables(Thing):
pass
2. BaseResponse
BaseResponse is the base class for all query responses. It provides a structure for defining the response for a query, and ensures that the response is validated and serialized correctly.
Like BaseVariables, it inherits from Dictable
, which is a utility class that wraps a Pydantic BaseModel.
class BaseResponse(Dictable):
"""Base class for all query responses"""
pass
Purpose:
- Defines the structure and type validation for query and mutation responses
- Ensures consistent response handling and error messages
Example Usage:
class GetModelQuery(BaseQuery):
class QueryResponse(BaseResponse):
id: str
name: str
As with BaseVariables, the BaseResponse class is a convenient tool for validating the response from the query, but in situations where the response is a model type, it may be more convenient to override the BaseResponse class in the BaseQuery with the model type definition instead.
Example of a mutation using an existing model type:
class Thing(GraphQLModel):
id: str
name: str
class GetThingQuery(BaseQuery):
class QueryResponse(Thing):
pass
3. ArizeAPIException
All exceptions in the arize_toolkit are subclasses of ArizeAPIException. This allows for consistent error handling across all queries. It also allows for custom exception types per query, and handling for common exceptions related to the API.
The keyword_exceptions class variable is used to define the exceptions that are common to all queries, but don't provide useful information about the error. The ArizeAPIException class uses a keyword search to determine if a raised exception is related to a common issue, and if so, it will use more specific and actionable error messages defined in the keyword exception classes.
class ArizeAPIException(Exception):
"""Base class for all API exceptions"""
keyword_exceptions = [RateLimitException, RetryException]
message: str = "An error occurred while running the query"
details: Optional[str] = None
Example Usage:
class GetModelQuery(BaseQuery):
class QueryException(ArizeAPIException):
message: str = "Error getting the id of a named model in the space"
4. BaseQuery
BaseQuery is the base class for all queries and mutations. It provides a structure for defining the query, variables, exception, parsing, and response. All the base classes are inherited and used in the query logic, so the specific implementations only need to define:
- The GraphQL query
- The variables for the query
- The exception for the query
- The response for the query
- The logic for parsing the response
The base query handles logic around:
- Executing queries or mutations
- Validating the variables
- Handling the response
- Handling errors
- Iterating over pages
- Rate limiting
So you will rarely need to add any additional functionality in your query implementations outside of the setup and parsing logic.
class BaseQuery:
"""Base class for all queries"""
graphql_query: str
query_description: str
class Variables(BaseVariables):
# Define the variables for the query
pass
class QueryException(ArizeAPIException):
# Define the exception for the query
pass
class QueryResponse(BaseResponse):
# Define the response for the query
pass
@classmethod
def _graphql_query(
cls, client: GraphQLClient, **kwargs
) -> Tuple[BaseResponse, bool, Optional[str]]:
try:
query = gql(cls.graphql_query)
# Relies on the QueryVariables class to validate the variables
result = client.execute(
query,
variable_values=cls.QueryVariables(**kwargs).to_dict(
exclude_none=False
),
)
# Relies on the QueryResponse class to parse the result
return cls._parse_graphql_result(result)
except Exception as e:
# Relies on the QueryException class to handle the exception
raise cls.QueryException(details=str(e))
Implementing Patterns for Queries and Mutations
GraphQL Model Types
Parsing
The base query handles parsing of the response from the API. This is done by the _parse_graphql_result
method.
For queries that retrieve a single item by its id, the base query will handle the parsing of the response into the model type.
For other queries and mutations, you will need to implement the _parse_graphql_result
method in your query implementation.
The _parse_graphql_result
method takes in the graphql query result as a dictionary and returns a tuple containing a list of the parsed response(s), a boolean indicating if there are more pages, and an optional endCursor to be used for pagination. For queries that retrieve a single item by its id, the base query will handle the parsing of the response into the model type.
Base Implementation for Queries of Objects by Id
For any query that retrieves a single item by its id, the base query will handle the parsing of the response into the model type. This is the base implementation because regardless of the object type, the response is always the same format:
{
"node": {
"id": "123",
"name": "Thing",
...
}
}
class GetThingQuery(BaseQuery):
...
@classmethod
def _parse_graphql_result(
cls, result: dict
) -> Tuple[List[BaseResponse], bool, Optional[str]]:
# Default behavior for queries of objects by id
if "node" in result and result["node"] is not None:
result_node = result["node"]
return [cls.QueryResponse(**result_node)], False, None
else:
cls.raise_exception("Object not found")
Parsing Queries that Retrieve a List of Items
For queries that retrieve a list of items, the base query will handle the parsing of the response into a list of model types.
The form of these queries is often the same in Arize GraphQL, with an endCursor
marker for pagination and a flag indicating if there are more pages to retrieve.
Example of a query that retrieves a list of items:
class GetThingsQuery(BaseQuery):
# Typical form of a query that retrieves a list of items - the node is the object type that is being retrieved
graphql_query = (
"""
query getAllThings($space_id: ID!, $endCursor: String) {
node(id: $space_id) {
... on Space {
things (first: 10, after: $endCursor) {
edges {
node {"""
+ Thing.to_graphql_fields()
+ """ }
}
pageInfo {
hasNextPage
endCursor
}
}
}
}
}
"""
)
...
@classmethod
def _parse_graphql_result(
cls, result: dict
) -> Tuple[List[BaseResponse], bool, Optional[str]]:
# Default behavior for queries of objects by id
if (
"edges" in result["node"]["things"]
and result["node"]["things"]["edges"] is not None
):
edges = result["node"]["things"]["edges"]
things = [cls.QueryResponse(**edge["node"]) for edge in edges]
# Check if there are more pages to retrieve
page_info = result["node"]["things"]["pageInfo"]
hasNextPage = page_info["hasNextPage"]
endCursor = page_info["endCursor"]
return things, hasNextPage, endCursor
else:
cls.raise_exception("No things found")
Adding Functions to the Client
The client provides a clean interface to run queries and retrieve data from the api. It is the main interface for the arize_toolkit. Under the hood, each function exposed by the client uses base query classes to interact with the API and handle the response parsing and error handling.
Example of a client function:
class Client:
def get_model(self, model_name: str) -> Dict:
results, _, _ = GetModelQuery.run_graphql_query(
self._graphql_client, space_id=self._space_id, model_name=model_name
)
# The results are a list of the model type defined in the QueryResponse class
return results[0].to_dict()
While there is flexibility in how client functions are defined, there are some conventions that are used throughout the arize_toolkit
Key Features
- Type Safety: Uses Pydantic models for request/response validation
- Pagination: Built-in support through
iterate_over_pages
- Error Handling: Structured exceptions for each query type
- Separation of Concerns:
- Query definition (GraphQL)
- Parameter validation
- Response parsing
- Error handling
Example Flow
- Client makes a request:
client.get_model("my_model")
-
Query execution:
-
Variables validated through
BaseVariables
- GraphQL query executed
- Response parsed and validated
-
Typed response returned to client
-
Error handling:
-
Network errors caught
- Invalid responses caught
- Custom exceptions raised with context
This pattern makes it easy to:
- Add new queries
- Maintain type safety
- Handle errors consistently
- Support pagination where needed
- Test individual components