Storage: ClickHouse Schema

Events Table

The events table is the primary storage for all Aires events. It uses the MergeTree engine with date-based partitioning and bloom filter indexes.

CREATE TABLE IF NOT EXISTS aires.events
(
    -- Identity
    id               String,
    timestamp        DateTime64(9, 'UTC'),
    service          LowCardinality(String),
    environment      LowCardinality(String),
    host             String,
    instance         String,

    -- Severity
    severity         LowCardinality(String),

    -- Content
    message          String,
    display_text     String,
    body             String,

    -- Tracing
    trace_id         String,
    span_id          String,
    parent_span_id   String,
    subtrace_id      String,

    -- Session / User / Agent
    session_id       String,
    user_id          String,
    agent_id         String,

    -- Source Location
    source_file      String,
    source_line      UInt32,
    source_function  String,

    -- Categorization
    category         LowCardinality(String),
    kind             LowCardinality(String),
    tags             Array(String),

    -- HTTP
    http_method      LowCardinality(String),
    http_path        String,
    http_status_code UInt16,
    http_duration_ms Int64,

    -- Metrics
    metric_name      LowCardinality(String),
    metric_value     Float64,

    -- Errors
    error_type       LowCardinality(String),
    error_message    String,
    error_stack      String,
    error_handled    Bool,

    -- SDK Metadata
    sdk_name         LowCardinality(String),
    sdk_version      LowCardinality(String),
    sdk_language     LowCardinality(String),

    -- Indexes
    INDEX idx_trace_id trace_id TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_span_id span_id TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_session_id session_id TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_message message TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY toDate(timestamp)
ORDER BY (service, severity, timestamp)
TTL toDateTime(timestamp) + INTERVAL 30 DAY
SETTINGS index_granularity = 8192;

Column Reference

Identity Columns

Column	Type	Description
`id`	`String`	Unique event ID (UUID v7, time-ordered). Generated by the SDK.
`timestamp`	`DateTime64(9, 'UTC')`	Event timestamp with nanosecond precision. Set by the SDK at event creation time.

Service Columns

Column	Type	Description
`service`	`LowCardinality(String)`	Service name (e.g. `"workforce-api"`). Part of the primary sort key.
`environment`	`LowCardinality(String)`	Environment (e.g. `"production"`, `"staging"`).
`host`	`String`	Hostname or pod name.
`instance`	`String`	Instance/replica identifier.

Content Columns

Column	Type	Description
`severity`	`LowCardinality(String)`	Log level: `"trace"`, `"debug"`, `"info"`, `"warn"`, `"error"`, `"fatal"`. Part of the primary sort key.
`message`	`String`	Human-readable log message. Indexed with token bloom filter for full-text search.
`display_text`	`String`	Formatted display text (may include ANSI codes).
`body`	`String`	Raw body content (for binary payloads, stored as base64).

Tracing Columns

Column	Type	Description
`trace_id`	`String`	Distributed trace ID. Bloom filter indexed for fast lookups.
`span_id`	`String`	Span ID within a trace. Bloom filter indexed.
`parent_span_id`	`String`	Parent span ID (for building span trees).
`subtrace_id`	`String`	Sub-trace ID for grouping nested operations.

Session Columns

Column	Type	Description
`session_id`	`String`	User/agent session ID. Bloom filter indexed.
`user_id`	`String`	Authenticated user identifier.
`agent_id`	`String`	AI agent identifier.

Source Location Columns

Column	Type	Description
`source_file`	`String`	Source file path where the event originated.
`source_line`	`UInt32`	Line number in the source file.
`source_function`	`String`	Function or method name.

Categorization Columns

Column	Type	Description
`category`	`LowCardinality(String)`	Event category: `"http"`, `"db"`, `"auth"`, `"ai"`, `"k8s"`, etc.
`kind`	`LowCardinality(String)`	Event kind: `"log"`, `"span"`, `"metric"`, `"request"`, `"response"`.
`tags`	`Array(String)`	Free-form tags. Queryable with `has(tags, 'value')`.

HTTP Columns

Column	Type	Description
`http_method`	`LowCardinality(String)`	HTTP method (`"GET"`, `"POST"`, etc.).
`http_path`	`String`	URL path (e.g. `"/api/tasks"`).
`http_status_code`	`UInt16`	HTTP response status code.
`http_duration_ms`	`Int64`	Request duration in milliseconds.

Metric Columns

Column	Type	Description
`metric_name`	`LowCardinality(String)`	Metric name (e.g. `"http.request.duration_ms"`).
`metric_value`	`Float64`	Metric value.

Error Columns

Column	Type	Description
`error_type`	`LowCardinality(String)`	Error class name (e.g. `"DatabaseError"`, `"TimeoutError"`).
`error_message`	`String`	Error message text.
`error_stack`	`String`	Stack trace.
`error_handled`	`Bool`	Whether the error was caught (`true`) or unhandled (`false`).

SDK Metadata Columns

Column	Type	Description
`sdk_name`	`LowCardinality(String)`	SDK identifier (`"aires-sdk-rust"`, `"aires-sdk-ts"`, `"aires-sdk-python"`).
`sdk_version`	`LowCardinality(String)`	SDK version string.
`sdk_language`	`LowCardinality(String)`	SDK language (`"rust"`, `"typescript"`, `"python"`).

Index Strategy

Primary Key (Sort Order)

ORDER BY (service, severity, timestamp)

The primary sort key optimizes the most common query pattern: filter by service, then severity, then time range. ClickHouse can skip entire granules (8192 rows) when these columns appear in WHERE clauses.

Queries that benefit from the primary key:

-- Exact prefix match on service + severity + time (fastest)
WHERE service = 'my-api' AND severity = 'error' AND timestamp > now() - INTERVAL 1 HOUR

-- Prefix match on service + time
WHERE service = 'my-api' AND timestamp > now() - INTERVAL 1 HOUR

-- Prefix match on service only
WHERE service = 'my-api'

Bloom Filter Indexes

INDEX idx_trace_id trace_id TYPE bloom_filter(0.01) GRANULARITY 1
INDEX idx_span_id span_id TYPE bloom_filter(0.01) GRANULARITY 1
INDEX idx_session_id session_id TYPE bloom_filter(0.01) GRANULARITY 1

Bloom filters enable fast point lookups on high-cardinality string columns. The 0.01 false positive rate means 1% of granules may be read unnecessarily — a good tradeoff for columns with millions of unique values.

Queries that benefit from bloom filters:

WHERE trace_id = 'abc-123'
WHERE session_id = 'sess-789'
WHERE span_id = 'span-001'

Token Bloom Filter on Message

INDEX idx_message message TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 1

The token bloom filter indexes individual tokens (words) in the message column. This enables substring-like search:

-- Token search (uses the index)
WHERE message LIKE '%timeout%'
WHERE hasToken(message, 'timeout')

Partitioning

PARTITION BY toDate(timestamp)

One partition per calendar day. Benefits:

Time-range queries skip entire partitions for dates outside the range
TTL cleanup drops entire partitions efficiently (no row-by-row deletion)
Backup/restore can operate on individual day partitions

LowCardinality Columns

Columns marked LowCardinality(String) use dictionary encoding. ClickHouse maintains a dictionary of unique values and stores integer indexes instead of the full string. This dramatically reduces storage for columns with fewer than ~10,000 unique values:

service — typically 10-100 unique services
environment — 3-5 values
severity — 6 values
category — 10-20 values
kind — 5-10 values
http_method — 9 values (GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS, TRACE, CONNECT)
error_type — typically 20-100 unique error classes
sdk_name, sdk_version, sdk_language — handful of values

Storage Efficiency

Typical compression ratios for observability data in ClickHouse:

Column	Raw Size	Compressed	Ratio
`timestamp`	8 bytes/row	~0.5 bytes/row	16x
`service` (LowCardinality)	~20 bytes/row	~0.1 bytes/row	200x
`severity` (LowCardinality)	~5 bytes/row	~0.05 bytes/row	100x
`message`	~100 bytes/row	~30 bytes/row	3x
`trace_id`	36 bytes/row	~5 bytes/row	7x
Overall	~500 bytes/row	~50 bytes/row	10x

At 10x compression, 1 billion events (~500 GB raw) fits in ~50 GB of disk.