Python SDK

The OpenMetadata Python SDK provides typed helpers for common OpenMetadata API operations. SDK operations live on plural facade classes such as Tables, Databases, and Users. The singular generated entity classes, such as metadata.generated.schema.entity.data.table.Table, are Pydantic models and do not expose SDK CRUD methods.

Installation

Install the OpenMetadata Python SDK using pip:
pip install openmetadata-ingestion

Quick Start

Basic Connection

from metadata.sdk import configure

# Configure with your host and JWT token
configure(
    host="http://localhost:8585/api",
    jwt_token="your-jwt-token"
)

You can also configure the SDK from environment variables. Set OPENMETADATA_HOST or OPENMETADATA_SERVER_URL for the host and OPENMETADATA_JWT_TOKEN or OPENMETADATA_API_KEY for the token, then call configure() with no arguments:

from metadata.sdk import configure

# Reads OpenMetadata host and token environment variables automatically
configure()
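
Before calling configure() without arguments, it can help to check that the expected environment variables are present. The following is a minimal sketch: the variable names come from the text above, while missing_settings is a hypothetical helper and not part of the SDK. It only reports which setting groups are unset; it does not assume any precedence between the alternative names.

```python
import os
from typing import List, Mapping, Optional

# Variable names as listed in this page; the helper below is hypothetical.
HOST_VARS = ("OPENMETADATA_HOST", "OPENMETADATA_SERVER_URL")
TOKEN_VARS = ("OPENMETADATA_JWT_TOKEN", "OPENMETADATA_API_KEY")


def missing_settings(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Describe each setting group that has no non-empty variable set."""
    env = os.environ if environ is None else environ
    missing = []
    for group in (HOST_VARS, TOKEN_VARS):
        # A group is satisfied when at least one of its variables is non-empty.
        if not any(env.get(name) for name in group):
            missing.append(" or ".join(group))
    return missing
```

If missing_settings() returns an empty list, calling configure() with no arguments should find both a host and a token; otherwise the returned strings name what still needs to be set.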

Working with Entities

from metadata.sdk import Tables

# Get one page of tables
tables = Tables.list(limit=10, fields=["columns"])
print(f"Found {len(tables.entities)} tables")

# Get a specific table by name
table = Tables.retrieve_by_name("your-service.your-database.your-schema.your-table")

if table:
    print(f"Table: {table.name}")
    print(f"Columns: {len(table.columns) if table.columns else 0}")

Core Functionality

Entity Management

The Python SDK facade classes provide common CRUD operations for supported OpenMetadata entities:

Create or Update Entities

from metadata.sdk import Tables
from metadata.generated.schema.api.data.createTable import CreateTableRequest
from metadata.generated.schema.entity.data.table import Column, DataType

# Create table request
create_table = CreateTableRequest(
    name="sample_table",
    databaseSchema="service.database.schema",
    columns=[
        Column(name="id", dataType=DataType.BIGINT),
        Column(name="status", dataType=DataType.VARCHAR, dataLength=255),
    ],
    description="Sample table created via Python SDK"
)

# Create the table
table = Tables.create(create_table)

Retrieve Entities

from metadata.sdk import Tables

# Get by ID
table = Tables.retrieve("uuid-here")

# Get by fully qualified name
table = Tables.retrieve_by_name("service.database.schema.table")

# Get with specific fields
table = Tables.retrieve_by_name(
    "service.database.schema.table",
    fields=["owners", "tags"]
)
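
When an identifier arrives from user input, it may be unclear whether it is a UUID or a fully qualified name. The sketch below picks between retrieve and retrieve_by_name based on whether the string parses as a UUID. Both looks_like_uuid and retrieve_any are hypothetical helpers, not SDK functions; the facade is passed in as an argument so the logic can be tried without a running server.

```python
import uuid
from typing import Any


def looks_like_uuid(value: str) -> bool:
    """Return True when the string parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def retrieve_any(facade: Any, identifier: str) -> Any:
    """Call facade.retrieve for UUIDs and facade.retrieve_by_name otherwise."""
    if looks_like_uuid(identifier):
        return facade.retrieve(identifier)
    return facade.retrieve_by_name(identifier)
```

With the SDK configured, retrieve_any(Tables, "service.database.schema.table") would go through retrieve_by_name. Note that uuid.UUID also accepts 32 hexadecimal characters without hyphens, so an entity name of exactly that shape would be treated as an ID.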

List All Entities

from metadata.sdk import Tables

# Auto-pagination for large datasets
for table in Tables.list_all(batch_size=100):
    print(f"Processing table: {table.name}")

List with Filters

from metadata.sdk import Tables

# List with filters and field selection
page = Tables.list(
    limit=50,
    fields=["owners", "tags"],
    filters={"databaseSchema": "service.database.schema"},
)

for table in page.entities:
    print(table.fullyQualifiedName)

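if page.after:
    # Reuse the same fields and filters so the next page matches the first
    next_page = Tables.list(
        limit=50,
        fields=["owners", "tags"],
        filters={"databaseSchema": "service.database.schema"},
        after=page.after,
    )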
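
The cursor-following step above can be wrapped in a loop. This is a minimal sketch, assuming only what the example shows: a list call that accepts limit and after and returns a page with entities and after attributes. iter_pages is a hypothetical helper; list_all() already covers the common case, so a loop like this is mainly useful when you want control over each page request.

```python
from typing import Any, Callable, Iterator, Optional


def iter_pages(list_page: Callable[..., Any], limit: int = 50, **kwargs: Any) -> Iterator[Any]:
    """Yield entities page by page, following the `after` cursor until it is empty."""
    after: Optional[str] = None
    while True:
        # Extra keyword arguments (fields, filters) are repeated on every page.
        page = list_page(limit=limit, after=after, **kwargs)
        yield from page.entities
        after = page.after
        if not after:
            break
```

A call such as iter_pages(Tables.list, limit=50, fields=["owners", "tags"]) would then yield tables across pages while sending the same fields on each request.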

Update Entities

from metadata.sdk import Tables
from metadata.generated.schema.type.basic import Markdown

table = Tables.retrieve_by_name("service.database.schema.table")
updated_table = table.model_copy(deep=True)
updated_table.description = Markdown("Updated description")
updated = Tables.update(updated_table)

Delete Entities

from metadata.sdk import Tables

# Soft delete
Tables.delete("uuid-here")

# Hard delete with recursive removal of children
Tables.delete("uuid-here", recursive=True, hard_delete=True)

Entity References

from metadata.sdk import to_entity_reference, Tables

# Retrieve the entity first, then get a reference
table = Tables.retrieve_by_name("service.database.schema.table")
ref = to_entity_reference(table)

# Use in other entity creation
if ref:
    print(f"Table reference ID: {ref['id']}")

Advanced Features

Error Handling

from metadata.ingestion.ometa.client import APIError
from metadata.sdk import Tables

try:
    table = Tables.retrieve("table-id")
except APIError as e:
    if e.status_code == 404:
        print("Table not found")
    elif e.status_code == 401:
        print("Authentication failed")
    else:
        print(f"Error: {e}")
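
A common variation is to treat "not found" as a normal result and let every other error propagate. The sketch below shows that pattern under the assumption, taken from the example above, that the exception exposes a status_code attribute. retrieve_or_none is a hypothetical helper; the retrieve callable and the exception type are passed in so that the logic does not depend on a live server.

```python
from typing import Any, Callable, Optional, Type


def retrieve_or_none(
    retrieve: Callable[[str], Any],
    identifier: str,
    error_type: Type[Exception],
) -> Optional[Any]:
    """Return the entity, or None when the call fails with HTTP 404."""
    try:
        return retrieve(identifier)
    except error_type as exc:
        # status_code is read defensively in case the exception lacks it.
        if getattr(exc, "status_code", None) == 404:
            return None
        # Authentication failures and other errors are re-raised unchanged.
        raise
```

With the imports from the example above, this could be called as retrieve_or_none(Tables.retrieve, "table-id", APIError).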

Common Use Cases

Data Discovery

from metadata.sdk import Tables

# Iterate all tables and filter for a keyword
matching_tables = [
    table for table in Tables.list_all(batch_size=100)
    if "customer" in table.name.lower()
]

for table in matching_tables:
    print(f"Found customer table: {table.fullyQualifiedName}")
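
The same filter can be written as a generator so that matches are produced while pages are still being fetched, instead of building the full list first. This is a sketch with hypothetical helpers (plain_name, filter_by_keyword). It assumes the name attribute may be either a plain string or a wrapper object exposing the string on a root attribute, and handles both.

```python
from typing import Any, Iterable, Iterator


def plain_name(entity: Any) -> str:
    """Return the entity name as a plain string, unwrapping `.root` if present."""
    raw = entity.name
    return str(getattr(raw, "root", raw))


def filter_by_keyword(entities: Iterable[Any], keyword: str) -> Iterator[Any]:
    """Lazily yield entities whose name contains the keyword, ignoring case."""
    needle = keyword.lower()
    for entity in entities:
        if needle in plain_name(entity).lower():
            yield entity
```

For example, filter_by_keyword(Tables.list_all(batch_size=100), "customer") would yield matching tables one at a time as the underlying iterator advances.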

Metadata Automation

from metadata.sdk import Tables
from metadata.generated.schema.type.basic import Markdown

# Bulk update table descriptions
for table in Tables.list_all(batch_size=100):
    if not table.description:
        updated_table = table.model_copy(deep=True)
        updated_table.description = Markdown(f"Production table: {table.name}")
        Tables.update(updated_table)
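
Because a bulk update touches many entities, it can be useful to separate deciding what to change from applying the change, so the plan can be reviewed first. The sketch below does only the planning half. plan_missing_descriptions is a hypothetical helper that relies solely on the description attribute used in the loop above.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple


def plan_missing_descriptions(
    entities: Iterable[Any],
    describe: Callable[[Any], str],
) -> Iterator[Tuple[Any, str]]:
    """Yield (entity, proposed description) for entities that have no description."""
    for entity in entities:
        # Both None and an empty description count as missing.
        if not entity.description:
            yield entity, describe(entity)
```

A dry run would iterate the plan and print each proposed description; the real run would apply each pair with the copy, assign, and Tables.update steps shown above.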

Lineage Management

from metadata.sdk.api import Lineage

lineage = Lineage.get_lineage(
    "service.database.schema.table",
    upstream_depth=1,
    downstream_depth=1
)

if lineage:
    print(f"Upstream entities: {len(lineage.upstreamEdges or [])}")
    print(f"Downstream entities: {len(lineage.downstreamEdges or [])}")

API Reference

The Python SDK API follows the OpenMetadata data model. The main classes, functions, and methods are summarized below.

Core Classes

  • Entity facade classes (Tables, Databases, DatabaseSchemas, DatabaseServices, Users, etc.): plural static-method interfaces, no instantiation required
  • configure(): One-time global setup for host and JWT token
  • to_entity_reference(entity): Convert a retrieved entity to an EntityReference for use in relationships
  • Entity Request Classes: Pydantic-based typed request objects (e.g., CreateTableRequest)

Key Methods

Each facade class exposes the same interface. In the list below, EntityClass stands for a facade such as Tables:
  • EntityClass.create(request): Create a new entity
  • EntityClass.retrieve(entity_id): Retrieve entity by UUID
  • EntityClass.retrieve_by_name(fqn, fields=[]): Retrieve entity by fully qualified name
  • EntityClass.list(limit=10, after=None, before=None, fields=None, filters=None): List one page of entities as an EntityList
  • EntityClass.list_all(batch_size=100, fields=None, filters=None): Fetch all pages
  • EntityClass.update(entity): Update an existing entity by reading entity.id
  • EntityClass.delete(entity_id, recursive=False, hard_delete=False): Delete an entity
The facade classes do not expose patch(entity_id, json_patch) helpers. For ordinary partial updates, retrieve the entity, mutate a copy, and call update(entity). For lower-level patch flows, use the OpenMetadata client returned by metadata.sdk.client().

Current Facade Coverage

The SDK currently exports facade classes for common data assets, services, governance, team/user, and data quality entities, including Tables, Databases, DatabaseSchemas, DatabaseServices, Dashboards, Pipelines, Glossaries, GlossaryTerms, Classifications, Tags, Teams, Users, TestCases, TestDefinitions, and TestSuites. If a facade does not exist for an entity yet, use the underlying OpenMetadata client returned by metadata.sdk.client() or the ingestion client APIs directly.

Type Safety

The Python SDK is built on generated Pydantic models, providing:
  • Type hints for better IDE support
  • Runtime validation of data structures
  • Auto-completion for entity properties
  • Earlier detection of type errors through static type checkers

from metadata.sdk import Tables
from metadata.generated.schema.entity.data.table import Table

# Type-safe retrieval with IDE auto-completion and type checking
table: Table = Tables.retrieve_by_name("service.database.schema.table")
if table:
    columns_count: int = len(table.columns) if table.columns else 0

Best Practices

  1. Configure Once: Call configure() once at application startup and reuse globally; no need to pass a client object around
  2. Error Handling: Catch APIError around SDK calls and branch on status_code (for example, 404 for a missing entity and 401 for failed authentication)
  3. Pagination: Use list_all() for full auto-pagination or list() with after/before cursors for page-by-page control
  4. Performance: Specify only required fields when fetching entities (e.g., fields=["owners", "tags"])