Configuration Reference
HtapConfig is the single configuration struct passed to HtapEngine::new. It implements Default, so you only need to set the fields that differ from the defaults.
```rust
use yoda::HtapConfig;
use std::time::Duration;

let config = HtapConfig {
    oltp_path: "/var/lib/myapp/oltp.db".to_string(),
    olap_in_memory: false,
    olap_path: Some("/var/lib/myapp/olap".to_string()),
    sync_interval: Some(Duration::from_millis(200)),
    ..HtapConfig::default()
};
```

Core Engine Fields
| Field | Type | Default | Description |
|---|---|---|---|
| oltp_path | String | "yoda.db" | Path to the SQLite database file. Created on first use. WAL mode and PRAGMA synchronous=NORMAL are applied automatically. |
| olap_in_memory | bool | true | Keep the OLAP engine entirely in memory. Set to false for durability across restarts. |
| olap_path | Option<String> | None | Filesystem path for durable OLAP storage. For DuckDB: a .duckdb file. For DataFusion: a directory. Ignored when olap_in_memory = true. |
| olap_backend | OlapBackendType | DataFusion | Which OLAP engine to instantiate. See OLAP Backends. |
| read_pool_size | usize | 4 | Number of SQLite read connections. Increase for workloads with heavy concurrent OLTP reads. Values below 1 are clamped to 1. |
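
Two rules in the table above are easy to misread: read_pool_size is clamped to a minimum of 1, and olap_path is only consulted when olap_in_memory is false. The sketch below restates them as plain functions. Both helpers are hypothetical and written only for illustration; they are not part of the yoda API, and the table does not say what the engine does when olap_in_memory is false and olap_path is None.

```rust
/// Hypothetical helper mirroring the documented rule:
/// values below 1 are clamped to 1.
fn effective_read_pool_size(requested: usize) -> usize {
    requested.max(1)
}

/// Hypothetical helper: the path only matters when the OLAP engine is
/// not kept in memory. Returns None when the path would be ignored
/// or was never set.
fn effective_olap_path(olap_in_memory: bool, olap_path: Option<&str>) -> Option<&str> {
    if olap_in_memory {
        None // documented: olap_path is ignored when olap_in_memory = true
    } else {
        olap_path
    }
}
```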
CDC Sync Fields
| Field | Type | Default | Description |
|---|---|---|---|
| sync_interval | Option<Duration> | None | When Some(d), a background loop polls CDC every d. When None, you call sync_now() manually. Values of 100–500 ms are typical for balanced HTAP workloads. |
| sync_batch_size | u32 | 1000 | Maximum CDC events consumed per sync cycle. Larger values increase throughput at the cost of per-cycle latency. For append-heavy workloads, 5 000–50 000 is common. |
| sync_mode | SyncMode | Destructive | How CDC events are applied to OLAP. Destructive mirrors current state; Temporal (SCD Type 2) appends a history row per change. See Sync Modes. |
| prune_after_sync | bool | true | Delete processed events from _yoda_cdc_log after each successful cycle. Set to false only for debugging or replay scenarios. |
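
To make the interplay between sync_interval and sync_batch_size concrete, the following sketch estimates how long a backlog of CDC events takes to drain. It assumes each cycle consumes one full batch and ignores the time spent applying events to OLAP, so treat the result as a rough bound rather than a measurement. cycles_to_drain and worst_case_drain_time are hypothetical helpers, not part of the yoda API.

```rust
use std::time::Duration;

/// Number of sync cycles needed to consume `backlog` events when each
/// cycle takes at most `batch_size` events (ceiling division).
fn cycles_to_drain(backlog: u64, batch_size: u32) -> u64 {
    let batch = u64::from(batch_size.max(1)); // guard against a zero batch size
    backlog / batch + u64::from(backlog % batch != 0)
}

/// Rough drain time: one cycle per `interval`, ignoring the cost of
/// applying the events. Saturates instead of overflowing.
fn worst_case_drain_time(backlog: u64, batch_size: u32, interval: Duration) -> Duration {
    let cycles = cycles_to_drain(backlog, batch_size);
    interval.saturating_mul(u32::try_from(cycles).unwrap_or(u32::MAX))
}
```

Under these assumptions, a backlog of 50 000 events with the default batch size of 1000 and a 200 ms interval needs 50 cycles, roughly 10 seconds. Raising sync_batch_size to 5000 reduces that to 10 cycles, roughly 2 seconds.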
DataFusion Storage Mode
Available when the datafusion-backend feature is enabled (the default).
| Field | Type | Default | Description |
|---|---|---|---|
| datafusion_storage | StorageMode | InMemory | Controls how DataFusion persists tables between queries. Ignored when olap_backend = DuckDb. |
StorageMode Variants
DataFusion supports five storage modes — InMemory, ArrowIpc { path }, Parquet { path }, S3Parquet { url }, and GcsParquet { url }. The cloud variants require the cloud-storage feature. See OLAP Backends → DataFusion storage modes for the full comparison (durability, write speed, predicate pushdown, UPDATE/DELETE characteristics).
UPDATE/DELETE on cloud backends
S3Parquet and GcsParquet perform a full read-modify-write cycle for every UPDATE or DELETE. They are designed for append-heavy analytics, not high-frequency point mutations.
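
The cost difference can be illustrated with a toy model. This is not yoda's storage code: ImmutableObject is a hypothetical in-memory stand-in for a stored Parquet object, and the assumption that appends upload only the new rows is ours. The point is only that an update has to rewrite every row, while an append does not.

```rust
/// Hypothetical stand-in for an immutable stored object; rows are (id, value).
struct ImmutableObject {
    rows: Vec<(u64, i64)>,
    rows_written: usize, // total rows uploaded so far
}

impl ImmutableObject {
    fn new() -> Self {
        Self { rows: Vec::new(), rows_written: 0 }
    }

    /// Append: only the new rows are uploaded (assumption of this model).
    fn append(&mut self, new_rows: &[(u64, i64)]) {
        self.rows.extend_from_slice(new_rows);
        self.rows_written += new_rows.len();
    }

    /// Update: read everything, modify in memory, write everything back.
    fn update(&mut self, id: u64, value: i64) {
        let mut snapshot = self.rows.clone(); // read the whole object
        for row in snapshot.iter_mut().filter(|r| r.0 == id) {
            row.1 = value; // modify the matching row(s)
        }
        self.rows_written += snapshot.len(); // write: every row is re-uploaded
        self.rows = snapshot;
    }
}
```

In this model, changing one row in a three-row object uploads three rows, the same as the initial load, and the cost grows with table size rather than with the number of rows changed.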
Schema Registry Persistence
| Field | Type | Default | Description |
|---|---|---|---|
| schema_registry_path | Option<String> | None | Path to a JSON file where registered table schemas are persisted across restarts. When set, register_table writes the updated registry atomically after each call. On restart, all previously registered tables are restored automatically, so there is no need to call register_table again. |
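
The reference does not say how the atomic write is implemented. A common way to get that guarantee is to write a sibling temporary file and rename it over the target, sketched below with the standard library only. write_atomically and the .tmp naming are assumptions made for illustration, not yoda internals.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Replace `path` with `contents` so that readers see either the old
/// file or the new one, never a partially written file.
fn write_atomically(path: &Path, contents: &str) -> io::Result<()> {
    // Hypothetical temp-file naming: same directory, ".tmp" extension.
    let tmp = path.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents.as_bytes())?;
        file.sync_all()?; // flush data to disk before the rename
    }
    // On POSIX systems a rename within one filesystem replaces the
    // target atomically.
    fs::rename(&tmp, path)
}
```

Keeping the temporary file in the same directory matters: a rename across filesystems is not atomic and may fail outright.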
RocksDB CDC Buffer
Available when the rocksdb-cdc feature is enabled.
| Field | Type | Default | Description |
|---|---|---|---|
| rocksdb_cdc_path | Option<String> | None | Path to a RocksDB directory used as a durable CDC event buffer. SQLite triggers still fire into _yoda_cdc_log; a bridge drains them atomically into RocksDB on every poll cycle. The sync engine then reads exclusively from RocksDB, giving crash-durable event buffering. Ignored in sidecar mode. |
Sidecar Mode Fields
Available when the sidecar feature is enabled. Set sidecar: Some(SidecarConfig { … }) to switch into sidecar mode.
| Field | Type | Default | Description |
|---|---|---|---|
| sidecar | Option<SidecarConfig> | None | Sidecar CDC configuration. When Some, CDC events come from an external SQLite or PostgreSQL database via timestamp polling instead of local triggers. |
SidecarConfig Fields
| Field | Type | Description |
|---|---|---|
| source | SidecarSource | External database to follow. Use SidecarSource::Sqlite(path) or SidecarSource::Postgres(conn_str). |
| timestamp_config | TimestampCdcConfig | Per-table polling settings: which tables, timestamp columns, primary keys, batch size, and delete detection. See Sidecar Mode. |
| enable_local_oltp | bool | false by default. Set to true to also create a local SQLite write path at oltp_path. |
| watermark_path | Option<String> | Path to a RocksDB directory for persisting the CDC watermark between restarts. Without this, the watermark is in-memory only and polling restarts from scratch after a process restart. |
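
The role of the watermark in timestamp polling can be sketched in a few lines. This is an illustration under stated assumptions, not yoda's implementation: SourceRow, Watermark, and poll_changes are hypothetical names, timestamps are plain integers, and the watermark is modelled as a (timestamp, primary key) pair so that rows sharing a timestamp are not skipped at a batch boundary.

```rust
/// Hypothetical row from the followed table; `updated_at` is the
/// timestamp column (epoch milliseconds in this sketch).
#[derive(Debug, Clone, PartialEq)]
struct SourceRow {
    id: u64,
    updated_at: u64,
}

/// Hypothetical watermark: (timestamp, primary key) of the last row seen.
type Watermark = (u64, u64);

/// One polling pass: return up to `batch_size` rows that sort after
/// `watermark`, oldest first, together with the advanced watermark.
fn poll_changes(
    rows: &[SourceRow],
    watermark: Watermark,
    batch_size: usize,
) -> (Vec<SourceRow>, Watermark) {
    let mut changed: Vec<SourceRow> = rows
        .iter()
        .filter(|r| (r.updated_at, r.id) > watermark)
        .cloned()
        .collect();
    changed.sort_by_key(|r| (r.updated_at, r.id));
    changed.truncate(batch_size);
    // If nothing changed, the watermark stays where it was.
    let next = changed.last().map_or(watermark, |r| (r.updated_at, r.id));
    (changed, next)
}
```

If the watermark lives only in memory, a restarted process begins again from the initial value, (0, 0) in this sketch, and sees every row as changed. That is the "polling restarts from scratch" behaviour that setting watermark_path avoids.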
TOML Configuration (yd serve)
When running the engine as a service via yd serve --config config/htap.toml, all fields map to TOML keys under [engine]:
```toml
[engine]
oltp_path = "/var/lib/myapp/oltp.db"
olap_in_memory = false
olap_path = "/var/lib/myapp/olap"
sync_interval_ms = 200          # milliseconds
sync_batch_size = 5000
read_pool_size = 4
sync_mode = "destructive"       # or "temporal"
schema_registry_path = "/var/lib/myapp/schema_registry.json"
log_format = "json"             # "text" (default) or "json"
metrics_port = 9100             # Prometheus endpoint (metrics-exporter feature)
```

See CLI Reference for the full service configuration and signal handling.
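
Two TOML keys differ in shape from their struct counterparts: sync_interval_ms is an integer while sync_interval is an Option<Duration>, and sync_mode is a lowercase string while the struct field is an enum. The sketch below shows one plausible mapping. It is an assumption, not the loader's actual code: SyncModeSketch is a local stand-in for the crate's SyncMode, and treating an absent sync_interval_ms key as None is a guess.

```rust
use std::time::Duration;

/// Local stand-in for the crate's SyncMode, for illustration only.
#[derive(Debug, PartialEq)]
enum SyncModeSketch {
    Destructive,
    Temporal,
}

/// Map the TOML string to a mode; unknown strings yield None.
fn parse_sync_mode(s: &str) -> Option<SyncModeSketch> {
    match s {
        "destructive" => Some(SyncModeSketch::Destructive),
        "temporal" => Some(SyncModeSketch::Temporal),
        _ => None,
    }
}

/// Map an optional millisecond count to an optional Duration.
/// Assumption: an absent key means "no background loop" (None).
fn interval_from_ms(ms: Option<u64>) -> Option<Duration> {
    ms.map(Duration::from_millis)
}
```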