Python Quickstart

A complete async example using the Yoda Python bindings (the yoda package). The bindings are compatible with both asyncio and anyio.

Install

pip install yoda
# or with uv:
uv pip install yoda

Build from source (requires a Rust toolchain and maturin):

uv run maturin develop           # debug build
uv run maturin develop --release # optimised build

End-to-End Example

The snippet below is a complete, runnable program. It creates an in-memory engine, registers a users table, writes rows via OLTP, syncs CDC events to OLAP, and then runs an aggregate query returning PyArrow batches.

python

import asyncio
import pyarrow as pa
import yoda

async def main():
    # 1. Configure the engine.
    #    olap_backend="datafusion"  — default, pure-Rust, no C deps.
    #    storage_mode="inmemory":   OLAP data is not written to disk; ideal for demos/tests.
    config = yoda.HtapConfig(
        oltp_path="quickstart.db",
        olap_backend="datafusion",
        storage_mode="inmemory",
        sync_mode="destructive",
    )

    # 2. Create the engine (synchronous constructor — Tokio runtime starts here).
    engine = yoda.HtapEngine(config)

    # 3. Register a table schema.
    #    columns is a list of (name, type_string) tuples.
    #    Supported scalar types: int8/16/32/64, uint8/16/32/64,
    #    float32/64, utf8/string/text, bool/boolean, binary/bytes.
    schema = yoda.TableSchema(
        name="users",
        columns=[
            ("id",   "int64"),
            ("name", "utf8"),
            ("age",  "int32"),
        ],
        pk=["id"],
    )
    await engine.register_table(schema)

    # 4. Write rows — routed to SQLite OLTP.
    await engine.execute("INSERT INTO users VALUES (1, 'Alice', 30)")
    await engine.execute("INSERT INTO users VALUES (2, 'Bob', 25)")
    await engine.execute("INSERT INTO users VALUES (3, 'Carol', 35)")

    # 5. Flush CDC events from _yoda_cdc_log to the OLAP mirror.
    result = await engine.sync_now()
    print(f"Synced {result.events_processed} events "
          f"({result.rows_inserted} inserted)")

    # 6. Analytical query: GROUP BY routes automatically to OLAP.
    batches = await engine.query(
        "SELECT age, COUNT(*) AS n FROM users GROUP BY age"
    )
    # batches is a list of pyarrow.RecordBatch objects.
    table = pa.Table.from_batches(batches)
    print(table.to_pandas())  # to_pandas() requires pandas to be installed

    # 7. Force a query to OLAP explicitly (skip the router).
    batches2 = await engine.query_olap(
        "SELECT AVG(age) AS avg_age FROM users"
    )
    print(pa.Table.from_batches(batches2).to_pandas())

asyncio.run(main())
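
The type strings listed in step 3 can be checked before the schema is handed to the engine. The sketch below is a hypothetical pre-flight helper, not part of the yoda API: check_columns and SUPPORTED_TYPES are names invented for illustration, and the set of types is copied from the comment in the example above, so the engine itself may accept spellings that this check rejects.

```python
# Type strings copied from the quickstart comment; the real engine may accept more.
SUPPORTED_TYPES = frozenset(
    [f"int{bits}" for bits in (8, 16, 32, 64)]
    + [f"uint{bits}" for bits in (8, 16, 32, 64)]
    + ["float32", "float64", "utf8", "string", "text",
       "bool", "boolean", "binary", "bytes"]
)


def check_columns(columns, pk):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    names = [name for name, _ in columns]
    # Every column must use one of the documented type strings.
    for name, type_string in columns:
        if type_string not in SUPPORTED_TYPES:
            problems.append(f"column {name!r}: unsupported type {type_string!r}")
    # Every primary-key entry must name a declared column.
    for key in pk:
        if key not in names:
            problems.append(f"primary key {key!r} is not a declared column")
    return problems


columns = [("id", "int64"), ("name", "utf8"), ("age", "int32")]
print(check_columns(columns, pk=["id"]))
```

Running the check on the users columns from the example reports no problems; a typo such as "int65" or a primary key that names an undeclared column would show up in the returned list before any engine call is made.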

anyio compatibility

The bindings work transparently with anyio — just replace asyncio.run(main()) with anyio.run(main). The Tokio runtime is managed internally; no configuration is needed on the Python side.
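
One detail is easy to miss when switching: asyncio.run takes a coroutine object, while anyio.run takes the async function itself. The sketch below illustrates the difference with a stand-in main() that does not touch yoda, and it assumes anyio may or may not be installed.

```python
import asyncio


async def main():
    # Stand-in for the quickstart's main(); the real one would drive the engine.
    await asyncio.sleep(0)
    return "done"


# asyncio.run expects a coroutine object, so main is called here.
result_asyncio = asyncio.run(main())
print(result_asyncio)

try:
    import anyio  # third-party package; optional for this sketch
except ImportError:
    anyio = None

if anyio is not None:
    # anyio.run expects the async function itself, uncalled.
    result_anyio = anyio.run(main)
    print(result_anyio)
```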

Batch Writes

For loading many rows, execute_batch wraps all statements in a single SQLite transaction — substantially faster than individual execute calls because the per-commit fsync is amortized over the whole batch (see execute_batch in the Python API guide):

python

# Run inside an async function, with `engine` created as in the example above.
statements = [
    f"INSERT INTO users VALUES ({i}, 'User{i}', {20 + i % 30})"
    for i in range(4, 1004)
]
await engine.execute_batch(statements)
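
Building statements with f-strings works for the generated names above but breaks as soon as a value contains an apostrophe. The quickstart does not show whether execute_batch accepts bound parameters, so the sketch below takes the conservative route and escapes values into SQL literals by doubling single quotes, which is the standard SQL rule. Both sql_literal and insert_statements are hypothetical helpers, not yoda functions; if the bindings offer parameter binding, that would be preferable.

```python
def sql_literal(value):
    """Hypothetical helper: render None, int, or str as a SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        # bool is a subclass of int; reject it rather than guess a spelling.
        raise TypeError("bool literals are not handled in this sketch")
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Standard SQL escapes a single quote by doubling it.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def insert_statements(table, rows):
    """Build one INSERT statement per row, in column order."""
    return [
        f"INSERT INTO {table} VALUES ({', '.join(sql_literal(v) for v in row)})"
        for row in rows
    ]


rows = [(4, "O'Brien", 41), (5, "Dana", None)]
for statement in insert_statements("users", rows):
    print(statement)
```

The resulting list can be passed to execute_batch in place of the hand-built statements. The table name is interpolated as-is, so it should come from trusted code, never from user input.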

Durable Parquet Storage

To persist OLAP data across restarts, switch the storage mode:

python

config = yoda.HtapConfig(
    oltp_path="app.db",
    olap_backend="datafusion",
    storage_mode="parquet",
    storage_path="/var/lib/myapp/olap",
    sync_mode="destructive",
)

Accepted storage_mode values: "inmemory", "arrow_ipc", "parquet". The latter two require storage_path.
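
The rule above can be enforced in application code before the config is built. This is a minimal sketch under stated assumptions: storage_kwargs is a hypothetical helper, and it only encodes the three mode names and the storage_path requirement given in this section. How the engine treats a storage_path passed alongside "inmemory" is not documented here, so the helper leaves the path out in that case.

```python
PATH_REQUIRED = frozenset({"arrow_ipc", "parquet"})
STORAGE_MODES = frozenset({"inmemory"}) | PATH_REQUIRED


def storage_kwargs(storage_mode, storage_path=None):
    """Hypothetical helper: validate and return storage-related config kwargs."""
    if storage_mode not in STORAGE_MODES:
        raise ValueError(
            f"unknown storage_mode {storage_mode!r}; "
            f"expected one of {sorted(STORAGE_MODES)}"
        )
    kwargs = {"storage_mode": storage_mode}
    if storage_mode in PATH_REQUIRED:
        if not storage_path:
            raise ValueError(f"storage_mode {storage_mode!r} requires storage_path")
        kwargs["storage_path"] = storage_path
    return kwargs


print(storage_kwargs("inmemory"))
print(storage_kwargs("parquet", "/var/lib/myapp/olap"))
```

The returned dictionary is meant to be splatted into the constructor, for example yoda.HtapConfig(oltp_path="app.db", olap_backend="datafusion", sync_mode="destructive", **storage_kwargs("parquet", "/var/lib/myapp/olap")).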

Schema Evolution

Add or drop columns on a live engine without recreating it:

python

# Run inside an async function, with `engine` created as in the example above.
await engine.add_column("users", "email", "utf8")
await engine.drop_column("users", "age")

Next Steps

Architecture — CDC pipeline and engine internals
Configuration Reference — all HtapConfig fields
Sidecar Mode — follow an existing Postgres/SQLite DB
Python API Reference — full method signatures and types
