Building a Client
This guide walks through everything needed to build an interoperable p2p-mes client: how to authenticate, how to derive chat identifiers, the message lifecycle, and how to layer end-to-end encryption on top. For the exact schema of every endpoint, see the API Overview and the generated API Reference; for the design rationale behind these choices, see Design Philosophy.
A client never talks to the peer-to-peer network directly. It speaks HTTP to a single node, which signs nothing on the client's behalf except the network clock (see the trust note below). Everything else -- authorship, chat membership, encryption -- is the client's responsibility.
Prerequisites
To talk to a node you need three things:
- An ECDSA secp256k1 keypair. The user's identity (their address) is derived from the public key exactly as in Ethereum.
- The node's HTTP API base URL (for example http://localhost:3000).
- The node's PeerId (Base58). Every request is bound to a specific node, so the client must know which node it is addressing.
Identity and addresses
A user is identified by a 20-byte address derived from their public key:
address = keccak256(uncompressed_pubkey[1..])[12..32] // last 20 bytes
This is identical to Ethereum address derivation. Addresses appear in the API
as 0x-prefixed hex (42 characters). See
Cryptography & Authentication for the full derivation.
Authentication: signing every request
Every endpoint requires a signature. The node verifies it by recovering the
signer's address from the signature and comparing it to the X-User header --
so there is no session, token, or password. Each request is signed
independently.
Required headers
| Header | Value |
|---|---|
| X-User | Signer's address, 0x-prefixed hex (20 bytes) |
| X-Ts | Unix timestamp in milliseconds; must be within +/- 30 s of the node's clock |
| X-Node | Base58 PeerId of the node being addressed (must match that node) |
| X-Sig | 65-byte signature as hex: r[32] \|\| s[32] \|\| v[1] (130 hex chars) |
| X-Sig-Version | Protocol tag; currently p2p-mes-v1 (the default if omitted) |
What you sign
You do not sign the raw request bytes. You sign a canonical string built from the request, so the signature is stable regardless of JSON key order or whitespace. The string is:
p2p-mes-v1
METHOD:{UPPERCASE_METHOD}
PATH:{path}
QUERY:{canonical_query}
BODY:{canonical_body}
TS:{timestamp_ms}
NODE:{node_peer_id_base58}
canonical_query and canonical_body are produced the same way:
- Reduce the input to a list of (key, value) pairs.
  - Query string: parse as URL-encoded pairs.
  - JSON body: flatten to dot notation -- {"a":{"b":1}} becomes a.b=1; arrays use a [] suffix -- {"t":[1,2]} becomes t[]=1, t[]=2.
  - Empty body or query: the result is the empty string.
- Sort the pairs by key, then value.
- Percent-encode every key and every value with the NON_ALPHANUMERIC set -- that is, everything except A-Z a-z 0-9 is escaped, including ., -, _, and ~. The = and & joiners are not escaped.
- Join as key1=value1&key2=value2.
The full normative rules (including form bodies and binary payloads) are in Cryptography & Authentication.
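The sort, encode, and join steps above can be sketched in a few lines of std-only Rust. This is a minimal illustration, not the node's implementation: the helper names are hypothetical, the input is assumed to be already flattened into (key, value) pairs, and sorting is assumed to be a byte-wise comparison of the raw strings before encoding. Check any real implementation against the published test vectors.

```rust
/// Percent-encode every byte that is not an ASCII letter or digit.
/// Hex digits are uppercase, matching the worked example (%2C, %20, %21).
fn percent_encode_non_alnum(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        if byte.is_ascii_alphanumeric() {
            out.push(byte as char);
        } else {
            out.push_str(&format!("%{:02X}", byte));
        }
    }
    out
}

/// Canonicalize already-flattened (key, value) pairs.
/// Assumption: pairs are sorted on their raw form, key first, then value.
fn canonicalize_pairs(pairs: &[(&str, &str)]) -> String {
    let mut sorted: Vec<(&str, &str)> = pairs.to_vec();
    sorted.sort(); // tuple ordering compares the key, then the value
    sorted
        .iter()
        .map(|(k, v)| {
            format!(
                "{}={}",
                percent_encode_non_alnum(k),
                percent_encode_non_alnum(v)
            )
        })
        .collect::<Vec<String>>()
        .join("&")
}
```

Note that the keys are encoded too, so a flattened key such as a.b is emitted as a%2Eb and t[] as t%5B%5D.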
How you sign
- Build the canonical string above.
- Compute msg_hash = keccak256(utf8_bytes(canonical_string)).
- Produce a recoverable ECDSA signature over msg_hash.
- Serialize as r[32] || s[32] || v[1] and hex-encode into X-Sig. The recovery byte v may be 0/1 or the Ethereum-style 27/28; both are accepted.
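The serialization step is easy to get wrong by a byte, so here is a small std-only sketch of it. The function name is hypothetical, the signing itself is left to whatever secp256k1 library you use, and normalizing v to 0/1 is a choice of this sketch (the node accepts 27/28 as well). The 0x prefix follows the request example and encoding table in this guide.

```rust
/// Serialize a recoverable signature as 0x-prefixed hex of r[32] || s[32] || v[1].
/// Returns None if `v` is not one of 0, 1, 27, 28.
fn encode_x_sig(r: &[u8; 32], s: &[u8; 32], v: u8) -> Option<String> {
    // Normalize the Ethereum-style 27/28 to 0/1 (a choice of this sketch).
    let v = match v {
        0 | 1 => v,
        27 | 28 => v - 27,
        _ => return None,
    };
    let mut out = String::with_capacity(2 + 130);
    out.push_str("0x");
    for byte in r.iter().chain(s.iter()).chain(std::iter::once(&v)) {
        out.push_str(&format!("{:02x}", byte));
    }
    Some(out)
}
```

The result is always 65 bytes, that is 130 hex characters plus the 0x prefix.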
Worked example
Sending {"text":"Hello, world!"} as a DM. The canonical body is
text=Hello%2C%20world%21 (comma, space, and ! are escaped). With no query
string, the string to sign is:
p2p-mes-v1
METHOD:POST
PATH:/dialogs/0xabcdef1234567890abcdef1234567890abcdef12/messages
QUERY:
BODY:text=Hello%2C%20world%21
TS:1700000000000
NODE:12D3KooWExampleNodePeerId
Hash it with Keccak-256, sign, and send the signature in X-Sig.
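As a rough sketch, the string above could be assembled as follows. The function name is hypothetical, and two details are assumptions rather than statements of the spec: lines are joined with a single "\n", and there is no trailing newline. Confirm both against the reference test vectors before relying on them.

```rust
/// Assemble the canonical string to be hashed and signed.
/// `canonical_query` and `canonical_body` must already be canonicalized.
/// Assumption: lines are separated by "\n" with no trailing newline.
fn build_canonical_string(
    method: &str,
    path: &str,
    canonical_query: &str,
    canonical_body: &str,
    ts_ms: u64,
    node_peer_id: &str,
) -> String {
    [
        "p2p-mes-v1".to_string(),
        format!("METHOD:{}", method.to_ascii_uppercase()),
        format!("PATH:{}", path),
        format!("QUERY:{}", canonical_query),
        format!("BODY:{}", canonical_body),
        format!("TS:{}", ts_ms),
        format!("NODE:{}", node_peer_id),
    ]
    .join("\n")
}
```

The same ts_ms and node_peer_id values must also be sent in X-Ts and X-Node, otherwise the node recovers a different signer address.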
Common pitfalls
- Signing raw bytes instead of the canonical string. Re-serialize through the canonicalization rules; do not hash the JSON you happened to send.
- Wrong timestamp unit. X-Ts is milliseconds, and the node rejects anything more than 30 seconds from its own clock.
- Wrong or missing X-Node. The node rejects requests addressed to a different PeerId.
- Aggressive percent-encoding. NON_ALPHANUMERIC escapes far more than a typical URL encoder; verify against the worked example.
Reference test vectors
Signing is the easiest thing to get subtly wrong, so the site ships
machine-readable vectors at test-vectors.json. Each entry
pairs a request with its exact canonical_string, the Keccak-256
message_hash_keccak256, the resulting x_sig, and the full headers to send.
They are produced from a fixed test key (0x1111...1111, address
0x19e7e376e7c213b7e7e7e46cc70a5dd086daff2a) using the same code the node
verifies with, and a test re-checks every one.
To validate your client, replay a vector: rebuild the canonical string from its
request, confirm it matches byte-for-byte, then hash, sign, and verify your
signature recovers to the signer address. The POST /dialogs/{peer}/messages
vector is the worked example above with its signature filled in.
Deriving chat identifiers
Chat IDs are 32 bytes and are computed by the client, not assigned by the server.
Direct messages -- derived from the two participant addresses, order-independent, so both parties compute the same value with no coordination:
chat_id = blake3("p2p-mes:chat:dm:v1:" || min(a,b) || max(a,b))
min/max are taken over the raw 20-byte addresses. No membership is stored
for DMs: the ability to compute the ID is the access control.
Groups -- derived from the creator's address and a random 16-byte nonce the client generates at creation time:
chat_id = blake3("p2p-mes:chat:group:v1:" || admin_address || nonce)
The nonce is sent in the create operation, and the node verifies the derivation. See Cryptography & Authentication.
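To make the two derivations concrete, the sketch below builds only the hash inputs (preimages); feeding them to BLAKE3 is left to a BLAKE3 library of your choice. The function names are hypothetical. Two assumptions are made: the domain tags are hashed as plain ASCII bytes, and min/max means lexicographic comparison of the raw 20-byte addresses.

```rust
const DM_TAG: &[u8] = b"p2p-mes:chat:dm:v1:";
const GROUP_TAG: &[u8] = b"p2p-mes:chat:group:v1:";

/// Preimage for a DM chat id: tag || min(a, b) || max(a, b).
/// Assumption: min/max compare the raw address bytes lexicographically.
fn dm_chat_preimage(a: &[u8; 20], b: &[u8; 20]) -> Vec<u8> {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    let mut out = Vec::with_capacity(DM_TAG.len() + 40);
    out.extend_from_slice(DM_TAG);
    out.extend_from_slice(lo);
    out.extend_from_slice(hi);
    out
}

/// Preimage for a group chat id: tag || admin_address || nonce.
fn group_chat_preimage(admin: &[u8; 20], nonce: &[u8; 16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(GROUP_TAG.len() + 36);
    out.extend_from_slice(GROUP_TAG);
    out.extend_from_slice(admin);
    out.extend_from_slice(nonce);
    out
}
```

Because the DM preimage orders the two addresses before hashing, both participants obtain the same bytes regardless of who is the sender.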
The message lifecycle
Sending is fire-and-forget. When you POST a message, the node validates it,
adds no signature of its own, publishes it to the gossip network, queues it for
storage, and returns 200 before the write is durable. A success response
means "accepted and broadcast by this node," not "durably replicated
everywhere." Convergence across nodes happens asynchronously through
anti-entropy sync.
Reading is served from the queried node's local store and returns immediately -- under full replication every node stores everything, so there is no network round-trip on read. A node that is still catching up simply returns its partial local view. Treat reads as eventually consistent; see Operational notes for client authors at the end of this guide for handling incompleteness, ordering, retries, and node selection.
Sending a direct message
POST /dialogs/0xPEER.../messages
X-User: 0xSENDER...
X-Ts: 1699900000000
X-Node: 12D3KooW...
X-Sig: 0x<130 hex chars>
Content-Type: application/json
{ "text": "Hello, world!" }
Response:
{
"chat_id": "0x<32-byte hex>",
"msg_id": "0x<32-byte hex>",
"ts": 1699900000000
}
Group messages are the same against POST /groups/{chat_id}/messages; the node
rejects the send if the sender is not a member.
Reading history (and decoding messages)
GET /dialogs/{peer}/messages (and the group equivalent) accept from/to
(millisecond bounds), limit (1-1000), and an opaque after cursor. The
response is a page of messages plus a next_after cursor:
{
"items": [
{ "key": "0x<hex key>", "msg_cbor": "0x<hex-encoded CBOR>" }
],
"next_after": "0x<opaque cursor>"
}
Two things to note:
- Message bodies are returned as hex-encoded CBOR (msg_cbor). The client must hex-decode, then CBOR-decode, to obtain the message fields (sender, timestamp, text, msg_type, control). The message structure is documented in Gossip Protocol and Types.
- Cursors are opaque. Pass next_after back as after to fetch the next page; do not parse or construct cursors yourself.
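A paging loop built on these rules might look like the sketch below. It is transport-agnostic: fetch_page is a hypothetical closure standing in for your signed GET request, and Page is a simplified stand-in that keeps only each item's msg_cbor string. How the last page is signalled is not stated here, so the sketch assumes paging ends on an empty page or a missing next_after, and it caps the number of pages defensively.

```rust
/// Simplified page: the msg_cbor strings plus the opaque cursor.
struct Page {
    items: Vec<String>,
    next_after: Option<String>,
}

/// Follow next_after cursors until the history is exhausted.
/// Assumption: an empty page or an absent cursor marks the end.
fn fetch_all<F>(mut fetch_page: F, max_pages: usize) -> Vec<String>
where
    F: FnMut(Option<&str>) -> Page,
{
    let mut all = Vec::new();
    let mut cursor: Option<String> = None;
    for _ in 0..max_pages {
        // The cursor is passed back verbatim; it is never parsed.
        let page = fetch_page(cursor.as_deref());
        if page.items.is_empty() {
            break;
        }
        all.extend(page.items);
        match page.next_after {
            Some(next) => cursor = Some(next),
            None => break,
        }
    }
    all
}
```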
Reference: decoding msg_cbor
Each msg_cbor is the hex of a CBOR map. Decode in two steps: hex -> bytes,
then CBOR -> fields. The keys and value types:
| Key | CBOR type | Meaning |
|---|---|---|
| schema | uint | wire schema version (currently 1) |
| msg_id | array of 32 uints | message id |
| chat_id | array of 32 uints | chat id |
| sender | array of 20 uints | sender address |
| hlc | uint (u64) | packed HLC (physical_ms << 16 \| logical) |
| origin_wall_ts | uint (u64) | sender wall-clock ms, for display |
| seq | uint | per-chat sequence number |
| text | text string | message text ("" for pure control messages) |
| msg_type | uint | client-defined; 0 = regular text |
| control | array of uints | optional (omitted when absent): opaque Layer-2 payload |
| kind | map | {"t": 0\|1\|2, "d": {...}} -- DM / Group / Channel (see TYPES.md) |
Gotcha: byte fields are CBOR arrays, not byte strings.
msg_id, chat_id, sender (and control, when present) encode as CBOR arrays (major type 4) of u8 integers -- not CBOR byte strings (major type 2), because the wire format uses no serde_bytes annotation. A decoder that expects byte strings will fail to parse.
Worked example. A DM with text = "Hello, world!", msg_type = 0, and no
control payload encodes as the following msg_cbor (also checked by
cargo test -p db, test msg_cbor_reference_vector, so it cannot drift):
aa66736368656d6101666d73675f69649820111111111111111111111111111111111111111111111111111111111111111167636861745f69649820182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218226673656e646572941833183318331833183318331833183318331833183318331833183318331833183318331833183363686c631b018bcfe5680000006e6f726967696e5f77616c6c5f74731b0000018bcfe56800637365710164746578746d48656c6c6f2c20776f726c6421686d73675f7479706500646b696e64a2617461306164a164706565729418441844184418441844184418441844184418441844184418441844184418441844184418441844
It decodes to:
- schema = 1
- msg_id = 0x1111...11 (32 bytes)
- chat_id = 0x2222...22 (32 bytes)
- sender = 0x3333...33 (20 bytes)
- hlc = 111411200000000000 (physical 1700000000000 ms, logical 0; that is, 1700000000000 << 16, hex 0x018bcfe568000000 in the vector)
- origin_wall_ts = 1700000000000
- seq = 1
- text = "Hello, world!"
- msg_type = 0
- kind = { "t": 0, "d": { "peer": 0x4444...44 } } (a DM)
Reference decoder (Rust; any CBOR library works -- the keys are plain strings):
use serde::Deserialize;

#[derive(Deserialize)]
struct Message {
    schema: u8,
    msg_id: [u8; 32],
    chat_id: [u8; 32],
    sender: [u8; 20],
    hlc: u64, // packed: (physical_ms << 16) | logical
    origin_wall_ts: u64,
    seq: u32,
    text: String,
    #[serde(default)]
    msg_type: u8,
    #[serde(default)]
    control: Option<Vec<u8>>,
    // `kind` omitted here; unknown CBOR keys are skipped by default.
}

// Step 1: hex -> bytes (tolerating the 0x prefix). Step 2: CBOR -> fields.
fn decode_message(msg_cbor: &str) -> Result<Message, Box<dyn std::error::Error>> {
    let bytes = hex::decode(msg_cbor.trim_start_matches("0x"))?;
    let msg: Message = serde_cbor::from_slice(&bytes)?;
    Ok(msg)
}
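Once decoded, the packed hlc field usually needs to be split back into its two parts. The helpers below are a small sketch of the packing described in the table above (physical milliseconds in the upper 48 bits, logical counter in the lower 16); the function names are hypothetical.

```rust
/// Split a packed HLC into (physical_ms, logical).
fn unpack_hlc(hlc: u64) -> (u64, u16) {
    (hlc >> 16, (hlc & 0xFFFF) as u16)
}

/// Inverse of `unpack_hlc`: (physical_ms << 16) | logical.
fn pack_hlc(physical_ms: u64, logical: u16) -> u64 {
    (physical_ms << 16) | logical as u64
}
```

For the reference vector, the 8-byte value 0x018bcfe568000000 unpacks to a physical time of 1700000000000 ms and a logical counter of 0.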
Read progress and unread counts
Mark progress with POST /dialogs/{peer}/messages/read carrying the highest
sequence number you have read:
{ "seq": 123 }
Unread counts are not stored: GET /conversations derives each chat's unread
by comparing its latest sequence against your stored read progress.
Groups
Group membership is managed through a single compound endpoint,
POST /groups/{chat_id}/ops, which bundles one or more membership operations
and optional accompanying messages (for example, encryption handshake data) in
one call:
{
"ops": [
{ "op_type": "create", "target": "0xADMIN...", "role": 1, "sig": "0x..." },
{ "op_type": "add", "target": "0xMEMBER...", "role": 0, "sig": "0x..." }
],
"messages": [],
"nonce": "0x<16-byte hex>"
}
nonce is required whenever the batch contains a create op. List members
with GET /groups/{chat_id}/members (returns address and role, where
0 = participant and 1 = admin), and leave with
DELETE /groups/{chat_id}/membership.
The second signature
Group operations require two distinct signatures, and confusing them is the most common group-related bug:
- The request signature in X-Sig, over the canonical string (as for any request).
- A per-operation signature inside each op's sig field, over the raw binary message chat_id[32] || target[20] || op_type[1], hashed with Keccak-256. The op_type byte is 0 for add, 1 for remove, 2 for create -- even though the JSON field spells it "add"/"remove"/"create".
The per-op signature lets every node independently verify who authorized each
membership change as it propagates over gossip. Leaving a group
(DELETE .../membership) carries the same kind of per-op signature over
chat_id || sender || 1 (a self-remove).
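The per-op message is fixed-width binary, which makes it straightforward to build by hand. The sketch below only assembles the 53 bytes; hashing with Keccak-256 and signing are left to your crypto library. The OpType enum and function name are hypothetical, chosen to mirror the byte values listed above.

```rust
/// Numeric op_type byte used in the signed message
/// (the JSON field uses the strings "add" / "remove" / "create").
#[derive(Clone, Copy)]
enum OpType {
    Add = 0,
    Remove = 1,
    Create = 2,
}

/// Build chat_id[32] || target[20] || op_type[1], the input to Keccak-256.
fn op_signing_message(chat_id: &[u8; 32], target: &[u8; 20], op: OpType) -> [u8; 53] {
    let mut msg = [0u8; 53];
    msg[..32].copy_from_slice(chat_id);
    msg[32..52].copy_from_slice(target);
    msg[52] = op as u8;
    msg
}
```

For a self-remove (leaving a group), target is the sender's own address and the op is OpType::Remove.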
Identity blobs
A user may publish one opaque identity blob (for example, a public-key bundle)
with PUT /identity, and anyone may fetch it with GET /identity/{address}.
The blob is base64 in JSON and capped at 1024 bytes; the node stores it
last-write-wins and never inspects it.
{ "identity": "SGVsbG8gV29ybGQ=" }
Layer 2: end-to-end encryption
The node is a transport: it never reads message contents for meaning. To add end-to-end encryption (or any other client protocol), use the opaque Layer 2 fields:
- msg_type (u8) -- you define the meaning (text, handshake, key rotation, ...). The node treats every value as opaque.
- control -- an opaque payload sent via the control endpoints (POST /dialogs/{peer}/messages/control, POST /groups/{chat_id}/messages/control). It is base64 in JSON (note: addresses, IDs, and signatures are hex, but control and identity blobs are base64).
A typical E2EE client performs a key-exchange handshake over control
messages, then sends ciphertext as ordinary messages, encrypting on the client
and decrypting after the CBOR decode on read. Because the node cannot interpret
any of this, two clients can agree on any scheme without node support.
Encoding conventions
| Data | Encoding in JSON / headers |
|---|---|
| Addresses, chat IDs, msg IDs, cursors | 0x-prefixed hex |
| Signatures (X-Sig, op sig) | 0x-prefixed hex |
| Message bodies on read (msg_cbor) | 0x-prefixed hex of CBOR |
| control payloads, identity blobs | base64 |
| Timestamps | integer milliseconds |
Error handling
Errors are returned with a conventional HTTP status and a JSON body:
{ "error": "forbidden" }
| Status | Meaning |
|---|---|
| 400 | Bad input -- malformed hex/base64, wrong length, invalid field |
| 401 | Authentication failed -- bad signature, stale X-Ts, wrong node |
| 403 | Forbidden -- not a member, or not authorized for a membership op |
| 404 | Not found |
| 500 | Internal error |
Validation failures (400) return a structured fields map identifying each
offending field and why, in addition to the top-level error.
Operational notes for client authors
This section is the honest current state of the protocol from a client's point of view: what works today, the sharp edges, and how to cope with them. Several items here are limitations the protocol intends to address; they are called out so you can design around them now.
Real-time delivery and background
There is no push, WebSocket, or SSE today. The only way to learn of new
messages is to poll GET /conversations (cheap: reverse-time, carries unread
counts) and then GET .../messages for chats that changed. Poll on a backoff
while foregrounded. Background delivery on iOS/Android does not work -- there
is no push gateway (APNs/FCM), so a backgrounded app will not receive messages
until it next polls. Do not emulate typing/presence with normal messages: they
would replicate and persist for the whole retention window. Real-time transport
and push are the largest planned additions.
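One possible shape for "poll on a backoff" is sketched below. The policy and its constants (a 2-second floor, a 60-second ceiling, doubling while idle) are purely illustrative assumptions, not values recommended by the protocol; tune them for your app.

```rust
use std::time::Duration;

/// Pick the delay before the next GET /conversations poll.
/// Hypothetical policy: reset to the floor when something changed,
/// otherwise double the previous delay up to a ceiling.
fn next_poll_delay(current: Duration, saw_activity: bool) -> Duration {
    const MIN: Duration = Duration::from_secs(2);
    const MAX: Duration = Duration::from_secs(60);
    if saw_activity {
        MIN
    } else {
        (current * 2).clamp(MIN, MAX)
    }
}
```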
Which node, and trusting it
A client speaks HTTP to a single node it does not run (a phone cannot hold a full replica of the whole network). There is no node discovery, health, or failover endpoint yet: pin a node URL (or a short operator-provided list) and handle transport errors with retry/backoff. You trust that node's operator with all of your metadata -- see Privacy.
What 200 means; delivery state
A 200 means the node accepted and broadcast the message, not that it is
durably stored or delivered (the write is queued after the response). There are
no delivery receipts, and the only read signal is coarse per-chat read
progress. Build optimistic UI and reconcile by reading back; do not present
"delivered" as a guarantee.
Idempotency and retries
The node computes msg_id from its own HLC, so resending the same text after
a timeout produces a different msg_id -- a duplicate, not a dedup. There is
no client idempotency key and no anti-replay nonce yet. Until there is: prefer
waiting for the response (its body carries the real msg_id) before retrying;
if you must retry blind, dedup on the client by (sender, text, approximate time) and reconcile when the real msg_id arrives.
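The client-side dedup heuristic could be as simple as the following sketch. MessageView, the function name, and the time window are all hypothetical; the heuristic can also produce false positives (a user really sending the same text twice), so treat a match as "probably a retry" rather than proof.

```rust
/// Minimal view of a message for dedup purposes (hypothetical type).
struct MessageView {
    sender: [u8; 20],
    text: String,
    ts_ms: u64,
}

/// Heuristic: same sender, same text, and timestamps within `window_ms`.
fn is_probable_duplicate(a: &MessageView, b: &MessageView, window_ms: u64) -> bool {
    a.sender == b.sender && a.text == b.text && a.ts_ms.abs_diff(b.ts_ms) <= window_ms
}
```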
Ordering and timestamps
Each message carries two times: hlc (the network-consistent stamp that defines
storage and sync order) and origin_wall_ts (the sender's wall clock, for
display). Sort by hlc for stable, cross-node-consistent ordering. Show
origin_wall_ts as the human time, but treat it as untrusted -- the sender sets
it and nothing validates it, so clamp obvious outliers and never use it for
ordering. Note a known gap: a malicious relay can rewrite hlc without
invalidating the client signature, so hlc is not cryptographically
authoritative; prefer a node you trust until this is closed.
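A minimal sketch of this ordering and display policy follows. Msg is a hypothetical trimmed-down message type, the tie-break on msg_id for equal hlc values is an assumption of this sketch, and clamping the display time to a window around the HLC's physical part is just one way to "clamp obvious outliers".

```rust
/// Trimmed-down message for ordering and display (hypothetical type).
struct Msg {
    msg_id: [u8; 32],
    hlc: u64,
    origin_wall_ts: u64,
}

/// Order by hlc; break ties by msg_id so the order is deterministic.
fn sort_by_hlc(msgs: &mut [Msg]) {
    msgs.sort_by(|a, b| a.hlc.cmp(&b.hlc).then_with(|| a.msg_id.cmp(&b.msg_id)));
}

/// Human-facing time: origin_wall_ts, clamped to within `max_skew_ms`
/// of the physical part of the HLC (upper 48 bits).
fn display_time_ms(msg: &Msg, max_skew_ms: u64) -> u64 {
    let physical_ms = msg.hlc >> 16;
    let lo = physical_ms.saturating_sub(max_skew_ms);
    let hi = physical_ms.saturating_add(max_skew_ms);
    msg.origin_wall_ts.clamp(lo, hi)
}
```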
seq, read progress, and switching nodes
seq is assigned locally by each node (last_seq + 1 at write time), not
globally. Because nodes can apply the same messages in different orders (gossip
vs. anti-entropy sync), the seq of one message can differ between nodes -- and
so can pagination cursors, which embed the storage key. Consequences:
- Use msg_id (deterministic BLAKE3) as the stable, cross-node message identity.
- Treat seq, cursors, and read progress (which is keyed by seq) as node-relative. Marking read on node A does not map cleanly onto node B.
- For consistent unread/read state, keep one identity pinned to one node until globally consistent sequencing lands.
Pagination and the message tail
GET .../messages returns ascending HLC with from/to/after -- forward
only. There is no backward (before) cursor, so "load the newest N, scroll up
to older" is not directly supported. Today: page forward and cache locally,
using /conversations (last_ts, unread) to know there is a new tail. Since
reads never block on the network, a node that is behind returns a partial page
with no "is this complete?" flag -- show a soft "syncing" state rather than
implying the history is final.
Clock skew and X-Ts
X-Ts must be within +/- 30 s of the node's clock or the request is rejected
(401). There is no server-time endpoint yet, so sync the device clock (NTP);
if you see unexplained 401s, suspect clock drift before signature bugs.
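The two helpers below sketch how a client might produce X-Ts and sanity-check skew when it happens to know a node-side time (for example, from a ts field in a recent response). Function names are hypothetical; the 30-second bound is taken from the rule above and treated as inclusive.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time in Unix milliseconds, for the X-Ts header.
/// Falls back to 0 if the system clock is set before the Unix epoch.
fn x_ts_now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}

/// True if the two millisecond timestamps are at most 30 s apart.
fn within_skew(client_ts_ms: u64, node_ts_ms: u64) -> bool {
    client_ts_ms.abs_diff(node_ts_ms) <= 30_000
}
```

A timestamp accidentally sent in seconds fails this check by a wide margin, which is one quick way to spot the unit mistake.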
Key custody and recovery
Every request is signed by the user's secp256k1 key, so the key is the
identity. Store it in the platform secure store (iOS Secure Enclave / Android
Keystore). Note that recoverable ECDSA with hardware-backed keys is fiddly: the
signer may not report the recovery byte v, so you must work it out yourself
(the node tolerates a wrong v by trying both values). There is no key backup or recovery -- losing the key permanently
loses the identity. Design enrollment and backup UX accordingly.
Multi-device
One keypair is one identity. Multi-device is not specified: sharing the private key across devices authenticates fine, but read progress is node-relative (see above) and any Layer 2 session state is yours to coordinate. Treat the protocol as single-device today.
Identity blobs and key trust
PUT /identity is signed, so a node can attest who published a blob (the
address owner), and the blob propagates network-wide (last-write-wins by HLC).
But the blob's content is opaque -- there is no protocol-level
signature-over-content, version, or fingerprint. Your Layer 2 must add its own
versioning and a fingerprint/verification step (trust-on-first-use plus
out-of-band verification) before trusting a key bundle.
Groups: rekey is not atomic
A compound ops call can bundle membership changes with accompanying messages
(e.g. an MLS Commit), but there is no atomicity between them: a Remove can
apply while the rekey message is lost, leaving the group in a broken crypto
state. Until atomic membership+rekey exists, detect the gap (an expected rekey
never arrives) and recover by re-issuing it.
Text length and large messages
text is validated as 1-1000 Unicode scalar values (Rust chars), not
bytes -- an emoji counts as one or more chars, and percent-encoding during
canonicalization does not change the count. There is no server-side chunking:
messages longer than 1000 must be split client-side, and reassembly is your
Layer 2 concern.
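Counting and splitting by Rust chars can be sketched as follows. The function names are hypothetical, and the split is deliberately naive: it cuts on scalar-value boundaries, so a multi-scalar emoji or a combining sequence can be divided between two chunks. A production client would likely split on grapheme or word boundaries and add its own Layer 2 sequencing for reassembly.

```rust
const MAX_CHARS: usize = 1000;

/// True if `text` has between 1 and 1000 Unicode scalar values.
fn is_valid_text(text: &str) -> bool {
    let n = text.chars().count();
    (1..=MAX_CHARS).contains(&n)
}

/// Split `text` into chunks of at most 1000 scalar values each.
/// Naive: may cut through a multi-scalar grapheme cluster.
fn split_text(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(MAX_CHARS)
        .map(|chunk| chunk.iter().collect::<String>())
        .collect()
}
```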
Feature scope today
- No media/attachments. Only text plus a small opaque control payload. Large binaries are out of scope under full replication.
- No edit or delete. Messages are append-only and leave only via the retention window. Edits/deletes, if you need them, are a client-side Layer 2 convention (e.g. tombstone control messages), not a protocol feature.
- No fetch-by-msg_id. Only ranges; to deep-link to a single message you currently fetch its range and filter client-side.
- channel chats are reserved -- no channel create/post/subscribe endpoints exist yet.