ClickHouse C client
clickhouse-c is a header-only C client for the ClickHouse native protocol.
The source and per-header reference are in the GitHub repository.
Unlike the higher-level clients, it does little for you on purpose. The core header decodes and
encodes Native format blocks over an I/O callback you supply. You own
the socket, TLS context, allocator, retries, and connection pooling. That makes it small enough to
embed: including clickhouse.h alone pulls in no link-time dependencies beyond libc.
This library is under active development. v1 decodes core ClickHouse types. Report limitations or missing functionality through the issue tracker, but note that some functionality is omitted by design.
What the library doesn't do
These are deliberate non-goals. Handle them in your application or with a sibling library:
- HTTP protocol. Wrap libcurl directly for the HTTP interface.
- DNS resolution, endpoint failover, connection pooling, retry, and backoff.
- TLS context lifecycle. The OpenSSL backend uses an SSL you have already connected.
- Threading. Each chc_client is single-threaded by design.
- Async I/O inside the library. The blocking client calls chc_io.read synchronously. For an event-loop client that performs no I/O itself, use the ioless client.
How the library is organized
clickhouse-c ships as a flat set of headers. Each header holds both declarations and implementation,
guarded by a sentinel macro. Pick the headers your build needs.
| Header | Purpose | Link flags |
|---|---|---|
| clickhouse.h | Core: types, errors, allocator, I/O vtable, type-name parser, block reader, and writer | none |
| clickhouse-client.h | TCP packet loop: Hello, Query, Data, EndOfStream, Exception, Progress, Pong | none |
| clickhouse-async.h | Ioless client: the same packet loop driven by caller byte submission, no socket | none |
| clickhouse-compression.h | Compressed-frame layout, CityHash128, codec dispatch, and LZ4/ZSTD adapters | -llz4 -lzstd |
| clickhouse-posix-io.h | I/O backend over blocking read(2)/write(2) | none |
| clickhouse-openssl.h | I/O backend over SSL_read/SSL_write | -lssl -lcrypto |
Required server setting
The decoder reads printable type names from the wire, so they must be encoded as text. ClickHouse writes them as text by default, but pin the setting on your queries so a server or session profile that sets it to binary can't break decoding:
Adding it to your project
There's no package to install. Vendor the headers into your tree with a git submodule or a plain copy.
Exactly one translation unit defines CHC_IMPLEMENTATION and pulls in the implementation;
every other unit includes the same headers for declarations only.
Define CHC_PROVIDE_STDLIB_ALLOC before including clickhouse.h to use chc_alloc_stdlib.
Define CHC_NO_LZ4 or CHC_NO_ZSTD for clickhouse-compression.h to drop the lz4 or zstd dependency, respectively.
Connecting over TCP
To talk to a ClickHouse server you set up the socket yourself, wrap it in a chc_io, and hand that
to chc_client_init, which runs the Hello handshake synchronously. The library does no DNS,
failover, reconnection, or pooling — those are caller concerns.
Each chc_client is single-threaded and wraps one connection. The library calls the chc_io
callbacks synchronously; what those callbacks do underneath (epoll, io_uring,
WaitLatchOrSocket) is up to you.
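Since name resolution and connecting are left to the caller, here is a minimal sketch of that step using only POSIX calls. dial_tcp is a hypothetical helper, not part of clickhouse-c. Wrapping the returned descriptor in a chc_io is not shown, because the backend's constructor is not described in this section.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve host:port and return a connected blocking TCP socket, or -1.
 * Hypothetical helper for illustration; not a clickhouse-c function. */
int dial_tcp(const char *host, const char *port)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int fd = -1;

    assert(port != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1; /* resolution failed */

    /* Try each resolved address in order until one connects. */
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break; /* connected; keep this descriptor */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

A caller would use it as `int fd = dial_tcp(host, port);` with its own host and port, then add whatever failover or retry policy it needs around that call.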
Running a query
Send the query, then drain packets until CHC_PKT_END_OF_STREAM. Use chc_client_send_query_ex to
attach the required server setting; the bare chc_client_send_query sends an
empty settings list and inherits whatever the server defaults to.
Server exceptions arrive as CHC_PKT_EXCEPTION packets, not as a non-OK return from
chc_client_recv_packet. Only transport-level failures return non-OK. The first CHC_PKT_DATA
packet of a result is a header block describing the schema with zero rows; data blocks follow.
chc_packet_clear frees the packet's block or exception. To take ownership instead, null those
fields on the packet before calling it.
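The control flow of that drain loop can be sketched without the library. The model below is an assumption-laden stand-in: the demo_ types and DEMO_PKT_ constants are hypothetical and only mirror the packet kinds named above, and an array of packets takes the place of repeated chc_client_recv_packet calls. It also assumes that nothing useful follows an exception packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring CHC_PKT_DATA, CHC_PKT_EXCEPTION, etc. */
enum demo_pkt_kind {
    DEMO_PKT_DATA,
    DEMO_PKT_PROGRESS,
    DEMO_PKT_EXCEPTION,
    DEMO_PKT_END_OF_STREAM
};

struct demo_packet {
    enum demo_pkt_kind kind;
    size_t n_rows;   /* meaningful for DEMO_PKT_DATA only */
    int server_code; /* meaningful for DEMO_PKT_EXCEPTION only */
};

struct demo_result {
    size_t rows;     /* rows counted across data blocks */
    int saw_header;  /* 1 once the zero-row header block was seen */
    int server_code; /* 0 when the server raised no exception */
    int complete;    /* 1 once the stream ended */
};

/* Drain a packet sequence the way a query loop would. */
struct demo_result demo_drain(const struct demo_packet *pkts, size_t n)
{
    struct demo_result r = {0, 0, 0, 0};

    for (size_t i = 0; i < n && !r.complete; i++) {
        switch (pkts[i].kind) {
        case DEMO_PKT_DATA:
            if (!r.saw_header) {
                /* First data packet: schema only, no rows. */
                assert(pkts[i].n_rows == 0);
                r.saw_header = 1;
            } else {
                r.rows += pkts[i].n_rows;
            }
            break;
        case DEMO_PKT_EXCEPTION:
            /* A server error is a packet, not a transport failure. */
            r.server_code = pkts[i].server_code;
            r.complete = 1; /* assumption: stop reading after an exception */
            break;
        case DEMO_PKT_END_OF_STREAM:
            r.complete = 1;
            break;
        case DEMO_PKT_PROGRESS:
            break; /* informational; ignored in this sketch */
        }
    }
    return r;
}
```

In a real loop, a non-OK return from the receive call would be handled separately from the exception branch, since the two report different kinds of failure.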
Reading column data
Blocks are column-oriented. Each column has a physical layout, returned by chc_column_layout, that
you dispatch on; its declared type comes from chc_block_column_type. Composite layouts nest, so
reading a Nullable(Array(String)) means unwrapping the nullable, walking the array offsets, then
slicing the string data.
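To make the nesting concrete, the sketch below walks a Nullable(Array(String)) column that has already been unpacked into plain arrays. struct demo_nested and demo_nested_get are hypothetical, and the 64-bit offset width is an assumption; check the header for the actual accessor return types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of a Nullable(Array(String)) column. */
struct demo_nested {
    const uint8_t  *null_map;      /* one byte per row, 1 = NULL */
    const uint64_t *array_offsets; /* cumulative exclusive ends, one per row */
    const uint64_t *str_offsets;   /* cumulative exclusive ends, one per element */
    const char     *str_data;      /* packed string bytes, no terminators */
};

/* Fetch element `elem` of the array stored in `row`.
 * Returns 1 and fills *ptr and *len on success; returns 0 when the row
 * is NULL or `elem` is past the end of that row's array. */
int demo_nested_get(const struct demo_nested *col, size_t row, size_t elem,
                    const char **ptr, size_t *len)
{
    assert(col != NULL && ptr != NULL && len != NULL);

    /* Step 1: unwrap the nullable. */
    if (col->null_map[row])
        return 0;

    /* Step 2: walk the array offsets to find this row's element range. */
    uint64_t first = row ? col->array_offsets[row - 1] : 0;
    uint64_t end = col->array_offsets[row];
    if (elem >= end - first)
        return 0;

    /* Step 3: slice the string data for that element. */
    uint64_t idx = first + elem;
    uint64_t start = idx ? col->str_offsets[idx - 1] : 0;
    *ptr = col->str_data + start;
    *len = (size_t)(col->str_offsets[idx] - start);
    return 1;
}
```

The returned slice is not NUL-terminated, so callers should carry the length alongside the pointer.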
| Layout | Accessors |
|---|---|
| CHC_COL_FIXED | chc_column_fixed_data(c, &elem_size): n_rows * elem_size little-endian bytes |
| CHC_COL_STRING | chc_column_string_data(c), chc_column_string_offsets(c): offsets[i] is row i's exclusive end in host byte order; row 0 starts at 0 |
| CHC_COL_NULLABLE | chc_column_null_map(c) (one byte per row, 1 = NULL), chc_column_nullable_inner(c) |
| CHC_COL_ARRAY | chc_column_array_offsets(c) (cumulative ends), chc_column_array_values(c); Map decodes as Array(Tuple(K, V)) |
| CHC_COL_TUPLE | chc_column_tuple_arity(c), chc_column_tuple_child(c, i): each child has the same row count |
| CHC_COL_LOW_CARDINALITY | chc_column_lc_key_size(c) (1/2/4/8), chc_column_lc_keys(c), chc_column_lc_dict(c); dictionary slot 0 is the default value |
A reader for plain numeric, string, and nullable columns:
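A minimal sketch of such a reader follows, assuming the accessors have already handed back raw pointers. The demo_ functions are hypothetical, the offset width is assumed to be 64 bits, and the fixed-width path assumes a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum a UInt32 column, skipping NULL rows.
 * `fixed` is the CHC_COL_FIXED payload (n_rows * elem_size bytes);
 * `null_map` may be NULL for a non-nullable column.
 * Assumes a little-endian host, so the bytes can be copied as-is. */
uint64_t demo_sum_u32(const uint8_t *fixed, size_t elem_size,
                      const uint8_t *null_map, size_t n_rows)
{
    uint64_t sum = 0;

    assert(elem_size == sizeof(uint32_t));
    for (size_t i = 0; i < n_rows; i++) {
        uint32_t v;
        if (null_map != NULL && null_map[i])
            continue; /* 1 = NULL: the slot holds a placeholder value */
        memcpy(&v, fixed + i * elem_size, sizeof v); /* safe for unaligned data */
        sum += v;
    }
    return sum;
}

/* Return a pointer to row `row` of a string column and store its length.
 * offsets[i] is row i's exclusive end; row 0 starts at 0. */
const char *demo_string_row(const char *data, const uint64_t *offsets,
                            size_t row, size_t *len)
{
    uint64_t start = row ? offsets[row - 1] : 0;

    *len = (size_t)(offsets[row] - start);
    return data + start; /* not NUL-terminated */
}
```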
CHC_COL_FIXED data is little-endian on the wire; on big-endian hosts you byte-swap multi-byte
integers yourself. Offsets and LowCardinality keys are already swapped to host order at decode time.
UUIDs are two little-endian UInt64 halves, IPv4 is a 4-byte little-endian integer, and IPv6 is
network byte order. DateTime64 ticks are UTC — the timezone in the type is metadata only.
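One way to stay correct on both little- and big-endian hosts is to assemble each value from individual bytes instead of casting the buffer. The helpers below are a sketch with hypothetical names. The UUID struct keeps the two halves in wire order and makes no claim about which half is the high one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit value from 4 bytes, on any host. */
uint32_t demo_load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Decode a little-endian 64-bit value from 8 bytes, on any host. */
uint64_t demo_load_le64(const uint8_t *p)
{
    return (uint64_t)demo_load_le32(p)
         | (uint64_t)demo_load_le32(p + 4) << 32;
}

/* A UUID as it sits in a 16-byte fixed column: two UInt64 halves. */
struct demo_uuid {
    uint64_t first;  /* bytes 0..7 of the element */
    uint64_t second; /* bytes 8..15 of the element */
};

struct demo_uuid demo_load_uuid(const uint8_t *p)
{
    struct demo_uuid u;

    assert(p != NULL);
    u.first = demo_load_le64(p);
    u.second = demo_load_le64(p + 8);
    return u;
}
```

On little-endian hosts compilers typically reduce these helpers to plain loads, so the portable form usually costs nothing there.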
When ingesting from an untrusted peer, call chc_column_validate on each column before traversing
it. chc_block_read doesn't validate cross-field invariants such as array offsets and
LowCardinality keys, so a forged block could otherwise read past inner-column bounds.
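To illustrate the kind of cross-field invariant involved, the sketch below checks cumulative offsets against an inner length and LowCardinality keys against a dictionary size. It is not the implementation of chc_column_validate, whose exact checks are not described here; the demo_ names and the 64-bit offset width are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when offsets never decrease and never exceed inner_len. */
int demo_offsets_valid(const uint64_t *offsets, size_t n_rows,
                       uint64_t inner_len)
{
    uint64_t prev = 0;

    assert(offsets != NULL || n_rows == 0);
    for (size_t i = 0; i < n_rows; i++) {
        if (offsets[i] < prev || offsets[i] > inner_len)
            return 0; /* a forged offset would index outside the inner column */
        prev = offsets[i];
    }
    return 1;
}

/* Read key i from a host-order key array of width key_size. */
static uint64_t demo_key_at(const void *keys, size_t key_size, size_t i)
{
    const uint8_t *p = (const uint8_t *)keys + i * key_size;

    switch (key_size) {
    case 1: return *p;
    case 2: { uint16_t v; memcpy(&v, p, sizeof v); return v; }
    case 4: { uint32_t v; memcpy(&v, p, sizeof v); return v; }
    case 8: { uint64_t v; memcpy(&v, p, sizeof v); return v; }
    default: return UINT64_MAX; /* unreachable after the width check below */
    }
}

/* Returns 1 when every key indexes a slot inside the dictionary. */
int demo_lc_keys_valid(const void *keys, size_t key_size, size_t n_rows,
                       uint64_t dict_len)
{
    if (key_size != 1 && key_size != 2 && key_size != 4 && key_size != 8)
        return 0;
    for (size_t i = 0; i < n_rows; i++) {
        if (demo_key_at(keys, key_size, i) >= dict_len)
            return 0;
    }
    return 1;
}
```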
Inserting data
Build a block with chc_block_builder, then hand it to chc_client_send_data. The builder records
pointers rather than copying, so the column slabs must outlive the send. An INSERT sends the query,
waits for the server's header block, sends one or more data blocks, then sends an empty block to
terminate the stream.
chc_block_builder_append_fixed takes n_rows * elem_size little-endian bytes;
chc_block_builder_append_string takes cumulative exclusive end offsets in host byte order over a
packed slab. Routing the builder through chc_client_send_data rather than the lower-level
chc_block_write lets the client set the block options from the negotiated revision and apply
compression.
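Preparing the inputs for a string column amounts to packing the values back to back and recording where each one ends. The helper below is a sketch with a hypothetical name, and it assumes 64-bit offsets; confirm the element type that chc_block_builder_append_string expects before passing the arrays on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack n C strings into `slab` without terminators and write each
 * string's cumulative exclusive end into offsets[i].
 * Returns the total bytes written, or SIZE_MAX if slab_cap is too small. */
size_t demo_pack_strings(const char *const *strs, size_t n,
                         char *slab, size_t slab_cap, uint64_t *offsets)
{
    size_t total = 0;

    assert(strs != NULL && slab != NULL && offsets != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(strs[i]);
        if (len > slab_cap - total)
            return SIZE_MAX; /* would overflow the slab */
        memcpy(slab + total, strs[i], len);
        total += len;
        offsets[i] = (uint64_t)total; /* end of row i, start of row i + 1 */
    }
    return total;
}
```

Because the builder records pointers rather than copying, both the slab and the offsets array have to stay alive and unmodified until the send has finished.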
Compression
Pass a compression mode and a filled codec in chc_client_opts. The client decompresses incoming
Data packets and compresses outgoing ones. The compression header ships LZ4 and ZSTD adapters;
each init only fills its own slots, so call both to support either.
To wire a compression library the project doesn't ship a binding for, fill the four chc_codec
function pointers yourself — see
examples/custom_codec.c.
TLS
clickhouse-openssl.h provides a chc_io backend over SSL_read/SSL_write. You drive OpenSSL:
the library never creates an SSL_CTX, verifies certificates, sets SNI, or calls SSL_connect /
SSL_shutdown. By the time chc_io.read fires, the handshake must be complete.
ClickHouse Cloud and other TLS-enabled deployments use the native protocol on
port 9440. Both backends accept an optional check_cancel callback, polled between reads, and a
read deadline via chc_openssl_io_set_deadline / chc_posix_io_set_deadline.
Ioless (async) client
clickhouse-async.h is an ioless variant of the TCP client for event loops. It never touches a
socket: you submit the bytes you've received and drain the bytes it wants to send, driving epoll,
io_uring, or WaitLatchOrSocket yourself. The options, packet types, and block builder are the
same as the blocking client.
chc_async_client_init does no I/O and can't block. The handshake runs afterward as a resumable
state machine, as does every send and receive. When a parse runs past the bytes you've submitted, the
call returns CHC_WOULD_BLOCK instead of blocking — submit more inbound bytes and call again, and the
parser resumes mid-block.
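The resumable behaviour can be modelled with a toy parser for frames made of a one-byte length followed by that many payload bytes. Everything here is hypothetical, including DEMO_WOULD_BLOCK, which only mirrors CHC_WOULD_BLOCK; the real client parses ClickHouse packets, not this format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_OK = 0, DEMO_WOULD_BLOCK = 1 };

struct demo_parser {
    uint8_t inbox[64];    /* submitted bytes not yet parsed */
    size_t  in_len;       /* bytes stored in inbox */
    size_t  in_pos;       /* read cursor into inbox */
    int     have_len;     /* 1 once the length prefix was parsed */
    uint8_t need;         /* payload length taken from the prefix */
    uint8_t payload[255]; /* payload bytes collected so far */
    size_t  got;          /* number of payload bytes collected */
};

/* Hand received bytes to the parser. Returns 0, or -1 if the inbox is full. */
int demo_submit(struct demo_parser *p, const uint8_t *bytes, size_t n)
{
    if (p->in_pos == p->in_len)
        p->in_pos = p->in_len = 0; /* everything was parsed: reuse the space */
    if (n > sizeof p->inbox - p->in_len)
        return -1;
    memcpy(p->inbox + p->in_len, bytes, n);
    p->in_len += n;
    return 0;
}

/* Try to complete one frame. On DEMO_WOULD_BLOCK all progress is kept in
 * *p, so the next call continues from the same point inside the frame. */
int demo_recv_frame(struct demo_parser *p)
{
    if (!p->have_len) {
        if (p->in_pos == p->in_len)
            return DEMO_WOULD_BLOCK;
        p->need = p->inbox[p->in_pos++];
        p->have_len = 1;
        p->got = 0;
    }
    while (p->got < p->need) {
        if (p->in_pos == p->in_len)
            return DEMO_WOULD_BLOCK; /* ran past the submitted bytes */
        p->payload[p->got++] = p->inbox[p->in_pos++];
    }
    p->have_len = 0; /* frame complete; ready for the next one */
    return DEMO_OK;
}
```

The caller's loop is then: call, and on DEMO_WOULD_BLOCK wait for the socket to become readable, submit what arrived, and call again.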
Your pump moves bytes both ways. Outbound, chc_async_pending_out hands back a pointer and length
into the queued bytes; after the socket accepts some, call chc_async_consume_out with that count. A
partial write is fine. Inbound, feed socket reads to chc_async_submit. Sends never block or apply
backpressure, so watch the pending-out length and stop issuing sends when it grows too large.
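The outbound half of the pump can be sketched with a small queue that exposes the same pending and consume shape. The demo_ names are hypothetical, and the actual signatures of chc_async_pending_out and chc_async_consume_out may differ. demo_short_write stands in for a non-blocking socket that accepts at most three bytes per call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_outq {
    uint8_t buf[1024];
    size_t head; /* first unsent byte */
    size_t tail; /* one past the last queued byte */
};

/* Queue bytes for sending. Never blocks; -1 means the demo buffer is full. */
int demo_outq_push(struct demo_outq *q, const void *bytes, size_t n)
{
    if (n > sizeof q->buf - q->tail)
        return -1;
    memcpy(q->buf + q->tail, bytes, n);
    q->tail += n;
    return 0;
}

/* Pointer and length of the bytes still waiting to be written. */
const uint8_t *demo_outq_pending(const struct demo_outq *q, size_t *len)
{
    *len = q->tail - q->head;
    return q->buf + q->head;
}

/* Drop the first n pending bytes after the socket accepted them. */
void demo_outq_consume(struct demo_outq *q, size_t n)
{
    assert(n <= q->tail - q->head);
    q->head += n;
    if (q->head == q->tail)
        q->head = q->tail = 0; /* empty: rewind to reuse the buffer */
}

typedef long (*demo_write_fn)(void *ctx, const uint8_t *p, size_t n);

/* One pump step: offer all pending bytes, consume only what was taken.
 * Returns bytes written, 0 if nothing was pending, or the writer's error. */
long demo_pump_out(struct demo_outq *q, demo_write_fn write_fn, void *ctx)
{
    size_t len;
    const uint8_t *p = demo_outq_pending(q, &len);

    if (len == 0)
        return 0;
    long n = write_fn(ctx, p, len);
    if (n > 0)
        demo_outq_consume(q, (size_t)n); /* partial writes are fine */
    return n;
}

/* Stand-in socket: accepts at most 3 bytes per call into a sink buffer. */
struct demo_sink { uint8_t data[64]; size_t len; };

long demo_short_write(void *ctx, const uint8_t *p, size_t n)
{
    struct demo_sink *s = ctx;

    if (n > 3)
        n = 3;
    if (n > sizeof s->data - s->len)
        return -1;
    memcpy(s->data + s->len, p, n);
    s->len += n;
    return (long)n;
}
```

A caller could compare the pending length against its own high-water mark before queuing another send, which is the backpressure check the library leaves to the application.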
A working liburing driver is in
test/test_async_uring.c.
Memory and the allocator
Every entry point takes a chc_alloc vtable, so allocation rides on whatever scheme the host uses.
Define CHC_PROVIDE_STDLIB_ALLOC before including clickhouse.h and call chc_alloc_stdlib() for a
standard malloc-backed allocator.
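The general shape of an allocator vtable can be sketched as a pair of function pointers plus a context pointer. The struct below is hypothetical and its fields are not taken from chc_alloc; consult clickhouse.h for the real layout. The example wraps malloc with counters, which is one way a host might track library allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical vtable; the real chc_alloc fields may differ. */
struct demo_alloc {
    void *(*alloc_fn)(void *ctx, size_t n);
    void  (*free_fn)(void *ctx, void *p);
    void  *ctx; /* passed back to both callbacks */
};

struct demo_stats {
    size_t live;  /* allocations not yet freed */
    size_t total; /* allocations made so far */
};

static void *demo_counting_alloc(void *ctx, size_t n)
{
    struct demo_stats *s = ctx;
    void *p = malloc(n);

    if (p != NULL) {
        s->live++;
        s->total++;
    }
    return p;
}

static void demo_counting_free(void *ctx, void *p)
{
    struct demo_stats *s = ctx;

    if (p != NULL) {
        s->live--;
        free(p);
    }
}

/* Build a vtable whose callbacks update *stats. */
struct demo_alloc demo_counting_allocator(struct demo_stats *stats)
{
    struct demo_alloc a;

    assert(stats != NULL);
    a.alloc_fn = demo_counting_alloc;
    a.free_fn = demo_counting_free;
    a.ctx = stats;
    return a;
}
```

The same pattern extends to arena or pool allocators by pointing ctx at the arena instead of a counter struct.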
Errors and server exceptions
Functions return CHC_OK (0) or a nonzero CHC_ERR_* code. The code is the return value; a
caller-stack-allocated chc_err carries the human-readable message. The library never heap-allocates
an error.
Server-side query errors aren't chc_err failures. They arrive on the packet stream as
CHC_PKT_EXCEPTION, carrying the server's code, display_text, and stack_trace. Reserve
chc_err checking for transport, protocol, and decode failures.
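The return-code-plus-stack-message convention can be sketched in isolation. The names below are hypothetical and the layout of the real chc_err is not shown in this section; the point is only that the caller owns the error storage and the callee formats into it without allocating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { DEMO_OK = 0, DEMO_ERR_TRUNCATED = 1 };

/* Caller-owned error record: a fixed buffer, so no heap allocation. */
struct demo_err {
    char msg[96];
};

/* Read a little-endian UInt32 at `pos`. The return value is the error
 * code; err->msg is filled only on failure. */
int demo_read_u32le(const uint8_t *buf, size_t len, size_t pos,
                    uint32_t *out, struct demo_err *err)
{
    assert(buf != NULL && out != NULL && err != NULL);
    if (pos > len || len - pos < 4) {
        size_t have = pos > len ? 0 : len - pos;
        snprintf(err->msg, sizeof err->msg,
                 "truncated input: need 4 bytes at offset %zu, have %zu",
                 pos, have);
        return DEMO_ERR_TRUNCATED;
    }
    *out = (uint32_t)buf[pos]
         | (uint32_t)buf[pos + 1] << 8
         | (uint32_t)buf[pos + 2] << 16
         | (uint32_t)buf[pos + 3] << 24;
    return DEMO_OK;
}
```

A caller declares `struct demo_err err;` on its stack, branches on the return code, and reads err.msg only when that code is nonzero.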
Supported data types
The block reader decodes:
- Int8–Int256, UInt8–UInt256
- Float32, Float64, BFloat16
- Bool
- Decimal32, Decimal64, Decimal128, Decimal256
- Date, Date32, DateTime, DateTime64, Time, Time64
- String, FixedString(N)
- UUID, IPv4, IPv6
- Enum8, Enum16
- Nullable(T), Array(T), Tuple(...), Map(K, V), Nested(...)
- LowCardinality(T)
- Interval
- QBit(...)
- Point, Ring, Polygon, MultiPolygon
- SimpleAggregateFunction(f, T), which decodes as its inner T
- JSON and Object('json'), as String columns under string serialization (see below)
JSON and Object('json') decode only when the query sets output_format_native_write_json_as_string=1.
Each row arrives as one JSON document in a CHC_COL_STRING column, so the string accessors read it;
the builder writes the same shape with chc_block_builder_append_json_string. Any other JSON
serialization version returns CHC_ERR_TYPE naming the setting.
Variant, Dynamic, and AggregateFunction aren't yet decoded and return CHC_ERR_TYPE;
cast them to String server-side as a fallback.