Building a Client

This guide walks through everything needed to build an interoperable p2p-mes client: how to authenticate, how to derive chat identifiers, the message lifecycle, and how to layer end-to-end encryption on top. For the exact schema of every endpoint, see the API Overview and the generated API Reference; for the design rationale behind these choices, see Design Philosophy.

A client never talks to the peer-to-peer network directly. It speaks HTTP to a single node, which signs nothing on the client's behalf except the network clock (see the trust note below). Everything else -- authorship, chat membership, encryption -- is the client's responsibility.

Prerequisites

To talk to a node you need three things:

An ECDSA secp256k1 keypair. The user's identity (their address) is derived from the public key exactly as in Ethereum.
The node's HTTP API base URL (for example http://localhost:3000).
The node's PeerId (Base58). Every request is bound to a specific node, so the client must know which node it is addressing.

Identity and addresses

A user is identified by a 20-byte address derived from their public key:

address = keccak256(uncompressed_pubkey[1..])[12..32]   // last 20 bytes

This is identical to Ethereum address derivation. Addresses appear in the API as 0x-prefixed hex (42 characters). See Cryptography & Authentication for the full derivation.
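The slicing in the formula above can be sketched as follows. This is a minimal illustration, not the reference implementation: the Keccak-256 function is passed in as a parameter (`keccak256` here is a hypothetical stand-in for whichever hashing crate you use), and the lowercase hex output is an assumption, since this guide does not say whether mixed-case checksummed addresses are expected.

```rust
/// Derive the 20-byte address from a 65-byte uncompressed public key.
/// `keccak256` is supplied by the caller (hypothetical parameter).
fn address_from_pubkey<H>(uncompressed_pubkey: &[u8; 65], keccak256: H) -> [u8; 20]
where
    H: Fn(&[u8]) -> [u8; 32],
{
    // Skip the leading format byte, so exactly 64 bytes are hashed.
    let digest = keccak256(&uncompressed_pubkey[1..]);
    let mut address = [0u8; 20];
    // Keep bytes 12..32 of the digest, i.e. the last 20 bytes.
    address.copy_from_slice(&digest[12..32]);
    address
}

/// Render an address as 0x-prefixed hex (42 characters).
/// Lowercase output is an assumption.
fn to_hex_address(address: &[u8; 20]) -> String {
    let mut out = String::from("0x");
    for byte in address {
        out.push_str(&format!("{:02x}", byte));
    }
    out
}
```

Passing the hash function in keeps the sketch independent of any particular crate; in a real client it would be a thin wrapper around your Keccak-256 implementation.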

Authentication: signing every request

Every endpoint requires a signature. The node verifies it by recovering the signer's address from the signature and comparing it to the X-User header -- so there is no session, token, or password. Each request is signed independently.

Required headers

Header	Value
`X-User`	Signer's address, `0x`-prefixed hex (20 bytes)
`X-Ts`	Unix timestamp in milliseconds; must be within +/- 30 s of node
`X-Node`	Base58 PeerId of the node being addressed (must match that node)
`X-Sig`	65-byte signature as hex: `r[32] || s[32] || v[1]` (130 hex chars)
`X-Sig-Version`	Protocol tag; currently `p2p-mes-v1` (the default if omitted)

What you sign

You do not sign the raw request bytes. You sign a canonical string built from the request, so the signature is stable regardless of JSON key order or whitespace. The string is:

p2p-mes-v1
METHOD:{UPPERCASE_METHOD}
PATH:{path}
QUERY:{canonical_query}
BODY:{canonical_body}
TS:{timestamp_ms}
NODE:{node_peer_id_base58}

canonical_query and canonical_body are produced the same way:

Reduce the input to a list of (key, value) pairs.
- Query string: parse as URL-encoded pairs.
- JSON body: flatten to dot notation -- {"a":{"b":1}} becomes a.b=1; arrays use a [] suffix -- {"t":[1,2]} becomes t[]=1, t[]=2.
- Empty body or query: the result is the empty string.
Sort the pairs by key, then value.
Percent-encode every key and every value with the NON_ALPHANUMERIC set -- that is, everything except A-Z a-z 0-9 is escaped, including ., -, _, and ~. The = and & joiners are not escaped.
Join as key1=value1&key2=value2.

The full normative rules (including form bodies and binary payloads) are in Cryptography & Authentication.
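The sort, escape, and join steps above can be sketched with the standard library alone. This sketch assumes the input has already been reduced to flat (key, value) string pairs (the JSON flattening step is not shown), that sorting is a bytewise comparison of the raw, unescaped strings, and that escapes use uppercase hex as in the worked example. The function names are illustrative, not part of the protocol.

```rust
/// Percent-encode every byte that is not an ASCII letter or digit,
/// mirroring the NON_ALPHANUMERIC set described above.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        if byte.is_ascii_alphanumeric() {
            out.push(byte as char);
        } else {
            out.push_str(&format!("%{:02X}", byte));
        }
    }
    out
}

/// Canonicalize already-flattened (key, value) pairs:
/// sort by key then value, escape both sides, join with `=` and `&`.
fn canonicalize(pairs: &[(&str, &str)]) -> String {
    let mut sorted: Vec<(&str, &str)> = pairs.to_vec();
    sorted.sort(); // tuples compare by key first, then by value
    sorted
        .iter()
        .map(|(k, v)| format!("{}={}", percent_encode(k), percent_encode(v)))
        .collect::<Vec<_>>()
        .join("&")
}
```

Note that values are compared as strings here, so "10" sorts before "2"; confirm numeric array ordering against the test vectors before relying on it.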

How you sign

Build the canonical string above.
msg_hash = keccak256(utf8_bytes(canonical_string)).
Produce a recoverable ECDSA signature over msg_hash.
Serialize as r[32] || s[32] || v[1] and hex-encode into X-Sig. The recovery byte v may be 0/1 or the Ethereum-style 27/28; both are accepted.
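The final serialization step can be sketched as below. Producing the signature itself needs a secp256k1 library and is not shown; this only covers laying out r, s, and v. Normalizing v to 0/1 is a choice made for this sketch (the node accepts 27/28 as well), and lowercase hex is an assumption. The function names are illustrative.

```rust
/// Map a recovery byte to 0/1. Returns None for anything other
/// than 0, 1, 27, or 28.
fn normalize_v(v: u8) -> Option<u8> {
    match v {
        0 | 1 => Some(v),
        27 | 28 => Some(v - 27),
        _ => None,
    }
}

/// Serialize r || s || v as 0x-prefixed hex for the X-Sig header.
fn encode_x_sig(r: &[u8; 32], s: &[u8; 32], v: u8) -> Option<String> {
    let v = normalize_v(v)?;
    let mut out = String::with_capacity(2 + 130);
    out.push_str("0x");
    for byte in r.iter().chain(s.iter()).chain(std::iter::once(&v)) {
        out.push_str(&format!("{:02x}", byte));
    }
    Some(out)
}
```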

Worked example

Sending {"text":"Hello, world!"} as a DM. The canonical body is text=Hello%2C%20world%21 (comma, space, and ! are escaped). With no query string, the string to sign is:

p2p-mes-v1
METHOD:POST
PATH:/dialogs/0xabcdef1234567890abcdef1234567890abcdef12/messages
QUERY:
BODY:text=Hello%2C%20world%21
TS:1700000000000
NODE:12D3KooWExampleNodePeerId

Hash it with Keccak-256, sign, and send the signature in X-Sig.
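Assembling the string to sign can be sketched as follows. Two details are assumptions that this guide does not state explicitly: lines are joined with a single "\n", and there is no trailing newline. Check both against the reference test vectors. `signing_string` is an illustrative name.

```rust
/// Build the canonical string to sign from its already-canonical parts.
fn signing_string(
    method: &str,
    path: &str,
    canonical_query: &str,
    canonical_body: &str,
    ts_ms: u64,
    node_peer_id: &str,
) -> String {
    [
        "p2p-mes-v1".to_string(),
        format!("METHOD:{}", method.to_uppercase()),
        format!("PATH:{}", path),
        format!("QUERY:{}", canonical_query),
        format!("BODY:{}", canonical_body),
        format!("TS:{}", ts_ms),
        format!("NODE:{}", node_peer_id),
    ]
    .join("\n") // assumed separator; no trailing newline
}
```

Calling it with the values from the worked example reproduces the seven lines shown above; the result is what gets UTF-8 encoded and hashed.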

Common pitfalls

Signing raw bytes instead of the canonical string. Re-serialize through the canonicalization rules; do not hash the JSON you happened to send.
Wrong timestamp unit. X-Ts is milliseconds, and the node rejects anything more than 30 seconds from its own clock.
Wrong or missing X-Node. The node rejects requests addressed to a different PeerId.
Aggressive percent-encoding. NON_ALPHANUMERIC escapes far more than a typical URL encoder; verify against the worked example.

Reference test vectors

Signing is the easiest thing to get subtly wrong, so the site ships machine-readable vectors at test-vectors.json. Each entry pairs a request with its exact canonical_string, the Keccak-256 message_hash_keccak256, the resulting x_sig, and the full headers to send. They are produced from a fixed test key (0x1111...1111, address 0x19e7e376e7c213b7e7e7e46cc70a5dd086daff2a) using the same code the node verifies with, and a test re-checks every one.

To validate your client, replay a vector: rebuild the canonical string from its request, confirm it matches byte-for-byte, then hash, sign, and verify your signature recovers to the signer address. The POST /dialogs/{peer}/messages vector is the worked example above with its signature filled in.

Deriving chat identifiers

Chat IDs are 32 bytes and are computed by the client, not assigned by the server.

Direct messages -- derived from the two participant addresses, order-independent, so both parties compute the same value with no coordination:

chat_id = blake3("p2p-mes:chat:dm:v1:" || min(a,b) || max(a,b))

min/max are taken over the raw 20-byte addresses. No membership is stored for DMs: the ability to compute the ID is the access control.

Groups -- derived from the creator's address and a random 16-byte nonce the client generates at creation time:

chat_id = blake3("p2p-mes:chat:group:v1:" || admin_address || nonce)

The nonce is sent in the create operation, and the node verifies the derivation. See Cryptography & Authentication.
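The inputs to both derivations can be sketched as below. The sketch builds only the byte string that is fed to BLAKE3; the hash call itself is left to your BLAKE3 library. It assumes the prefix is plain ASCII, that "||" means concatenation with no separators, and that min/max is a bytewise (lexicographic) comparison of the raw addresses. Function names are illustrative.

```rust
const DM_PREFIX: &[u8] = b"p2p-mes:chat:dm:v1:";
const GROUP_PREFIX: &[u8] = b"p2p-mes:chat:group:v1:";

/// Bytes to hash for a DM chat ID; the result is the same
/// regardless of argument order.
fn dm_preimage(a: &[u8; 20], b: &[u8; 20]) -> Vec<u8> {
    // Arrays compare bytewise, which orders the raw addresses.
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    let mut out = Vec::with_capacity(DM_PREFIX.len() + 40);
    out.extend_from_slice(DM_PREFIX);
    out.extend_from_slice(lo);
    out.extend_from_slice(hi);
    out
}

/// Bytes to hash for a group chat ID.
fn group_preimage(admin: &[u8; 20], nonce: &[u8; 16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(GROUP_PREFIX.len() + 36);
    out.extend_from_slice(GROUP_PREFIX);
    out.extend_from_slice(admin);
    out.extend_from_slice(nonce);
    out
}
```

The 32-byte BLAKE3 digest of either byte string is the chat_id.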

The message lifecycle

Sending is fire-and-forget. When you POST a message, the node validates it, adds no signature of its own, publishes it to the gossip network, queues it for storage, and returns 200 before the write is durable. A success response means "accepted and broadcast by this node," not "durably replicated everywhere." Convergence across nodes happens asynchronously through anti-entropy sync.

Reading is served from the queried node's local store and returns immediately -- under full replication every node stores everything, so there is no network round-trip on read. A node that is still catching up simply returns its partial local view. Treat reads as eventually consistent; see Operational notes for client authors at the end of this guide for handling incompleteness, ordering, retries, and node selection.

Sending a direct message

POST /dialogs/0xPEER.../messages
X-User: 0xSENDER...
X-Ts: 1699900000000
X-Node: 12D3KooW...
X-Sig: 0x<130 hex chars>
Content-Type: application/json

{ "text": "Hello, world!" }

Response:

{
  "chat_id": "0x<32-byte hex>",
  "msg_id":  "0x<32-byte hex>",
  "ts": 1699900000000
}

Group messages are the same against POST /groups/{chat_id}/messages; the node rejects the send if the sender is not a member.

Reading history (and decoding messages)

GET /dialogs/{peer}/messages (and the group equivalent) accepts from/to (millisecond bounds), limit (1-1000), and an opaque after cursor. The response is a page of messages plus a next_after cursor:

{
  "items": [
    { "key": "0x<hex key>", "msg_cbor": "0x<hex-encoded CBOR>" }
  ],
  "next_after": "0x<opaque cursor>"
}

Two things to note:

Message bodies are returned as hex-encoded CBOR (msg_cbor). The client must hex-decode, then CBOR-decode, to obtain the message fields (sender, timestamp, text, msg_type, control). The message structure is documented in Gossip Protocol and Types.
Cursors are opaque. Pass next_after back as after to fetch the next page; do not parse or construct cursors yourself.
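The cursor handling above can be sketched as a loop that feeds next_after straight back as after. The HTTP call is abstracted as a closure (`fetch_page` is hypothetical), and the `Page` struct is a simplified stand-in for the JSON response. How the last page is signalled is not specified in this guide, so the sketch assumes it is either a missing next_after or an empty items list, and adds a page cap as a safety net.

```rust
/// Simplified stand-in for one response page.
struct Page {
    items: Vec<String>,         // msg_cbor values, still hex-encoded
    next_after: Option<String>, // opaque cursor, never inspected
}

/// Page forward until the node stops returning a cursor or items.
fn fetch_all<F>(mut fetch_page: F, max_pages: usize) -> Vec<String>
where
    F: FnMut(Option<&str>) -> Page,
{
    let mut all = Vec::new();
    let mut cursor: Option<String> = None;
    for _ in 0..max_pages {
        let page = fetch_page(cursor.as_deref());
        let got_items = !page.items.is_empty();
        all.extend(page.items);
        match page.next_after {
            // Pass the cursor back unchanged as the next `after`.
            Some(next) if got_items => cursor = Some(next),
            _ => break, // assumed end-of-history signal
        }
    }
    all
}
```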

Reference: decoding `msg_cbor`

Each msg_cbor is the hex of a CBOR map. Decode in two steps: hex -> bytes, then CBOR -> fields. The keys and value types:

Key	CBOR type	Meaning
`schema`	uint	wire schema version (currently `1`)
`msg_id`	array of 32 uints	message id
`chat_id`	array of 32 uints	chat id
`sender`	array of 20 uints	sender address
`hlc`	uint (u64)	packed HLC (`physical_ms << 16 | logical`)
`origin_wall_ts`	uint (u64)	sender wall-clock ms, for display
`seq`	uint	per-chat sequence number
`text`	text string	message text (`""` for pure control messages)
`msg_type`	uint	client-defined; `0` = regular text
`control`	array of uints	optional (omitted when absent): opaque Layer-2 payload
`kind`	map	`{"t": 0|1|2, "d": {...}}` -- DM / Group / Channel (see TYPES.md)

Gotcha: byte fields are CBOR arrays, not byte strings. msg_id, chat_id, sender (and control, when present) encode as CBOR arrays (major type 4) of u8 integers -- not CBOR byte strings (major type 2), because the wire format uses no serde_bytes annotation. A decoder that expects byte strings will fail to parse.
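To make the difference concrete, here is a minimal hand-rolled decoder for just this shape: a definite-length CBOR array whose items are unsigned integers that fit in a byte. It is an illustrative sketch, not a general CBOR parser (a real client would use a CBOR library), and `decode_u8_array` is a hypothetical helper name.

```rust
/// Decode a definite-length CBOR array of uints 0..=255 that starts at
/// `input[0]`. Returns the bytes plus the number of input bytes consumed,
/// or None if the data is not such an array.
fn decode_u8_array(input: &[u8]) -> Option<(Vec<u8>, usize)> {
    let head = *input.first()?;
    if head >> 5 != 4 {
        return None; // major type 4 = array; 2 would be a byte string
    }
    let (len, mut pos) = match head & 0x1f {
        n @ 0..=23 => (n as usize, 1),      // length stored in the head byte
        24 => (*input.get(1)? as usize, 2), // length in the following byte
        _ => return None,                   // longer forms not needed here
    };
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        match *input.get(pos)? {
            item @ 0..=23 => {
                out.push(item); // small uint encoded directly
                pos += 1;
            }
            24 => {
                out.push(*input.get(pos + 1)?); // 0x18 then a one-byte value
                pos += 2;
            }
            _ => return None, // not a uint that fits in a u8
        }
    }
    Some((out, pos))
}
```

This is why a 20-byte address can occupy up to 41 bytes on the wire: every value of 24 or more costs two bytes, plus the array header.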

Worked example. A DM with text = "Hello, world!", msg_type = 0, and no control payload encodes as the following msg_cbor (also checked by cargo test -p db, test msg_cbor_reference_vector, so it cannot drift):

aa66736368656d6101666d73675f69649820111111111111111111111111111111111111111111111111111111111111111167636861745f69649820182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218221822182218226673656e646572941833183318331833183318331833183318331833183318331833183318331833183318331833183363686c631b018bcfe5680000006e6f726967696e5f77616c6c5f74731b0000018bcfe56800637365710164746578746d48656c6c6f2c20776f726c6421686d73675f7479706500646b696e64a2617461306164a164706565729418441844184418441844184418441844184418441844184418441844184418441844184418441844

It decodes to:

schema = 1
msg_id = 0x1111...11 (32 bytes)
chat_id = 0x2222...22 (32 bytes)
sender = 0x3333...33 (20 bytes)
hlc = 111411200000000000 (0x018bcfe568000000: physical 1700000000000 ms << 16, logical 0)
origin_wall_ts = 1700000000000
seq = 1
text = "Hello, world!"
msg_type = 0
kind = { "t": "0", "d": { "peer": 0x4444...44 } } (a DM; in this vector the tag t is encoded as the one-character text string "0", bytes 61 30)
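Splitting the packed hlc into its two parts is a shift and a mask. The sketch below assumes the logical counter occupies exactly the low 16 bits, as the `physical_ms << 16 | logical` layout implies; the function names are illustrative.

```rust
/// Split a packed HLC into (physical milliseconds, logical counter).
fn unpack_hlc(hlc: u64) -> (u64, u16) {
    (hlc >> 16, (hlc & 0xffff) as u16)
}

/// Inverse of `unpack_hlc`.
fn pack_hlc(physical_ms: u64, logical: u16) -> u64 {
    (physical_ms << 16) | logical as u64
}
```

Applied to the hlc bytes in the vector above (0x018bcfe568000000), `unpack_hlc` yields a physical time of 1700000000000 ms and a logical counter of 0.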

Reference decoder (Rust; any CBOR library works -- the keys are plain strings):

// Assumed Cargo dependencies: serde (with the "derive" feature), serde_cbor, hex.
use serde::Deserialize;

#[derive(Deserialize)]
struct Message {
    schema: u8,
    msg_id: [u8; 32],
    chat_id: [u8; 32],
    sender: [u8; 20],
    hlc: u64,            // packed: (physical_ms << 16) | logical
    origin_wall_ts: u64,
    seq: u32,
    text: String,
    #[serde(default)]
    msg_type: u8,
    #[serde(default)]
    control: Option<Vec<u8>>,
    // `kind` omitted here; unknown CBOR keys are skipped by default.
}

// Decode one `msg_cbor` value from a history page: hex -> bytes, then CBOR -> fields.
fn decode_message(msg_cbor: &str) -> Result<Message, Box<dyn std::error::Error>> {
    let bytes = hex::decode(msg_cbor.trim_start_matches("0x"))?;
    let msg: Message = serde_cbor::from_slice(&bytes)?;
    Ok(msg)
}

Read progress and unread counts

Mark progress with POST /dialogs/{peer}/messages/read carrying the highest sequence number you have read:

{ "seq": 123 }

Unread counts are not stored: GET /conversations derives each chat's unread by comparing its latest sequence against your stored read progress.

Groups

Group membership is managed through a single compound endpoint, POST /groups/{chat_id}/ops, which bundles one or more membership operations and optional accompanying messages (for example, encryption handshake data) in one call:

{
  "ops": [
    { "op_type": "create", "target": "0xADMIN...", "role": 1, "sig": "0x..." },
    { "op_type": "add",    "target": "0xMEMBER...", "role": 0, "sig": "0x..." }
  ],
  "messages": [],
  "nonce": "0x<16-byte hex>"
}

nonce is required whenever the batch contains a create op. List members with GET /groups/{chat_id}/members (returns address and role, where 0 = participant and 1 = admin), and leave with DELETE /groups/{chat_id}/membership.

The second signature

Group operations require two distinct signatures, and confusing them is the most common group-related bug:

The request signature in X-Sig, over the canonical string (as for any request).
A per-operation signature inside each op's sig field, over the raw binary message chat_id[32] || target[20] || op_type[1], hashed with Keccak-256. The op_type byte is 0 for add, 1 for remove, 2 for create -- even though the JSON field spells it "add"/"remove"/"create".

The per-op signature lets every node independently verify who authorized each membership change as it propagates over gossip. Leaving a group (DELETE .../membership) carries the same kind of per-op signature over chat_id || sender || 1 (a self-remove).
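The 53-byte message covered by the per-operation signature can be sketched as below. Only the byte layout is shown; hashing it with Keccak-256 and signing is the same process as for X-Sig and needs your crypto library. `OpType` and `op_message` are illustrative names.

```rust
/// Wire values for the op_type byte (the JSON field uses the names instead).
#[derive(Clone, Copy)]
enum OpType {
    Add = 0,
    Remove = 1,
    Create = 2,
}

/// Build chat_id[32] || target[20] || op_type[1].
fn op_message(chat_id: &[u8; 32], target: &[u8; 20], op: OpType) -> [u8; 53] {
    let mut msg = [0u8; 53];
    msg[..32].copy_from_slice(chat_id);
    msg[32..52].copy_from_slice(target);
    msg[52] = op as u8;
    msg
}
```

For a self-remove when leaving a group, the target is the sender's own address and the op is `OpType::Remove`.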

Identity blobs

A user may publish one opaque identity blob (for example, a public-key bundle) with PUT /identity, and anyone may fetch it with GET /identity/{address}. The blob is base64 in JSON and capped at 1024 bytes; the node stores it last-write-wins and never inspects it.

{ "identity": "SGVsbG8gV29ybGQ=" }

Layer 2: end-to-end encryption

The node is a transport: it never reads message contents for meaning. To add end-to-end encryption (or any other client protocol), use the opaque Layer 2 fields:

msg_type (u8) -- you define the meaning (text, handshake, key rotation, ...). The node treats every value as opaque.
control -- an opaque payload sent via the control endpoints (POST /dialogs/{peer}/messages/control, POST /groups/{chat_id}/messages/control). It is base64 in JSON (note: addresses, IDs, and signatures are hex, but control and identity blobs are base64).

A typical E2EE client performs a key-exchange handshake over control messages, then sends ciphertext as ordinary messages, encrypting on the client and decrypting after the CBOR decode on read. Because the node cannot interpret any of this, two clients can agree on any scheme without node support.

Encoding conventions

Data	Encoding in JSON / headers
Addresses, chat IDs, msg IDs, cursors	`0x`-prefixed hex
Signatures (`X-Sig`, op `sig`)	`0x`-prefixed hex
Message bodies on read (`msg_cbor`)	`0x`-prefixed hex of CBOR
`control` payloads, `identity` blobs	base64
Timestamps	integer milliseconds

Error handling

Errors are returned with a conventional HTTP status and a JSON body:

{ "error": "forbidden" }

Status	Meaning
`400`	Bad input -- malformed hex/base64, wrong length, invalid field
`401`	Authentication failed -- bad signature, stale `X-Ts`, wrong node
`403`	Forbidden -- not a member, or not authorized for a membership op
`404`	Not found
`500`	Internal error

Validation failures (400) return a structured fields map identifying each offending field and why, in addition to the top-level error.

Operational notes for client authors

This section is the honest current state of the protocol from a client's point of view: what works today, the sharp edges, and how to cope with them. Several items here are limitations the protocol intends to address; they are called out so you can design around them now.

Real-time delivery and background

There is no push, WebSocket, or SSE today. The only way to learn of new messages is to poll GET /conversations (cheap: reverse-time, carries unread counts) and then GET .../messages for chats that changed. Poll on a backoff while foregrounded. Background delivery on iOS/Android does not work -- there is no push gateway (APNs/FCM), so a backgrounded app will not receive messages until it next polls. Do not emulate typing/presence with normal messages: they would replicate and persist for the whole retention window. Real-time transport and push are the largest planned additions.
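One way to "poll on a backoff" is to double the interval while nothing changes and reset it when a poll finds new activity. The bounds below (2 s and 60 s) are arbitrary assumptions for illustration, not values recommended by the protocol, and `next_poll_delay` is a hypothetical helper.

```rust
use std::time::Duration;

// Assumed bounds; tune for your app.
const MIN_DELAY: Duration = Duration::from_secs(2);
const MAX_DELAY: Duration = Duration::from_secs(60);

/// Pick the delay before the next GET /conversations poll.
fn next_poll_delay(current: Duration, saw_changes: bool) -> Duration {
    if saw_changes {
        MIN_DELAY // activity: poll quickly again
    } else {
        (current * 2).clamp(MIN_DELAY, MAX_DELAY) // idle: back off
    }
}
```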

Which node, and trusting it

A client speaks HTTP to a single node it does not run (a phone cannot hold a full replica of the whole network). There is no node discovery, health, or failover endpoint yet: pin a node URL (or a short operator-provided list) and handle transport errors with retry/backoff. You trust that node's operator with all of your metadata -- see Privacy.

What `200` means; delivery state

A 200 means the node accepted and broadcast the message, not that it is durably stored or delivered (the write is queued after the response). There are no delivery receipts, and the only read signal is coarse per-chat read progress. Build optimistic UI and reconcile by reading back; do not present "delivered" as a guarantee.

Idempotency and retries

The node computes msg_id from its own HLC, so resending the same text after a timeout produces a different msg_id: the resend is stored as a second message rather than deduplicated. There is no client idempotency key and no anti-replay nonce yet. Until there is: prefer waiting for the response (its body carries the real msg_id) before retrying; if you must retry blind, dedup on the client by (sender, text, approximate time) and reconcile when the real msg_id arrives.
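A client-side check on (sender, text, approximate time) could look like the sketch below. The `Pending` struct and the size of the time window are assumptions for illustration; the heuristic can produce false positives when a user deliberately sends the same text twice, so treat a match as a hint to reconcile, not as proof.

```rust
/// A message this client sent but has not yet seen in history.
struct Pending {
    sender: [u8; 20],
    text: String,
    sent_at_ms: u64,
}

/// True if a message read back from the node looks like `pending`.
fn is_probable_duplicate(
    pending: &Pending,
    sender: &[u8; 20],
    text: &str,
    ts_ms: u64,
    window_ms: u64,
) -> bool {
    pending.sender == *sender
        && pending.text == text
        && pending.sent_at_ms.abs_diff(ts_ms) <= window_ms
}
```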

Ordering and timestamps

Each message carries two times: hlc (the network-consistent stamp that defines storage and sync order) and origin_wall_ts (the sender's wall clock, for display). Sort by hlc for stable, cross-node-consistent ordering. Show origin_wall_ts as the human time, but treat it as untrusted -- the sender sets it and nothing validates it, so clamp obvious outliers and never use it for ordering. Note a known gap: a malicious relay can rewrite hlc without invalidating the client signature, so hlc is not cryptographically authoritative; prefer a node you trust until this is closed.
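The ordering and display rules can be sketched as below. Two choices here are assumptions, not protocol rules: ties on hlc are broken by msg_id so every client gets the same order, and "clamp obvious outliers" is implemented by keeping origin_wall_ts within a caller-chosen distance of the HLC's physical time. The `Msg` struct is a trimmed stand-in for the decoded message.

```rust
struct Msg {
    msg_id: [u8; 32],
    hlc: u64,
    origin_wall_ts: u64,
}

/// Order by hlc; break ties by msg_id (assumed tie-break).
fn sort_for_display(msgs: &mut [Msg]) {
    msgs.sort_by(|a, b| a.hlc.cmp(&b.hlc).then_with(|| a.msg_id.cmp(&b.msg_id)));
}

/// Human-facing time: the sender's wall clock, clamped to within
/// `max_skew_ms` of the HLC physical time.
fn display_time_ms(msg: &Msg, max_skew_ms: u64) -> u64 {
    let physical_ms = msg.hlc >> 16;
    let lo = physical_ms.saturating_sub(max_skew_ms);
    let hi = physical_ms.saturating_add(max_skew_ms);
    msg.origin_wall_ts.clamp(lo, hi)
}
```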

`seq`, read progress, and switching nodes

seq is assigned locally by each node (last_seq + 1 at write time), not globally. Because nodes can apply the same messages in different orders (gossip vs. anti-entropy sync), the seq of one message can differ between nodes -- and so can pagination cursors, which embed the storage key. Consequences:

Use msg_id (deterministic BLAKE3) as the stable, cross-node message identity.
Treat seq, cursors, and read progress (which is keyed by seq) as node-relative. Marking read on node A does not map cleanly onto node B.
For consistent unread/read state, keep one identity pinned to one node until globally consistent sequencing lands.

Pagination and the message tail

GET .../messages returns ascending HLC with from/to/after -- forward only. There is no backward (before) cursor, so "load the newest N, scroll up to older" is not directly supported. Today: page forward and cache locally, using /conversations (last_ts, unread) to know there is a new tail. Since reads never block on the network, a node that is behind returns a partial page with no "is this complete?" flag -- show a soft "syncing" state rather than implying the history is final.

Clock skew and `X-Ts`

X-Ts must be within +/- 30 s of the node's clock or the request is rejected (401). There is no server-time endpoint yet, so sync the device clock (NTP); if you see unexplained 401s, suspect clock drift before signature bugs.
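Producing X-Ts and reasoning about the window can be sketched as follows. `within_window` mirrors the check the node is described as making; the client cannot run it against the node's real clock (there is no server-time endpoint), so it is mainly useful in tests and diagnostics. Treating exactly 30 s as accepted is an assumption.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

const MAX_SKEW_MS: u64 = 30_000;

/// Current Unix time in milliseconds, for the X-Ts header.
fn now_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0) // clock set before 1970; the request will be rejected anyway
}

/// True if `x_ts_ms` is no more than 30 s from `node_now_ms`.
fn within_window(x_ts_ms: u64, node_now_ms: u64) -> bool {
    x_ts_ms.abs_diff(node_now_ms) <= MAX_SKEW_MS
}
```

A timestamp sent in seconds instead of milliseconds fails this check by a wide margin, which is the most common cause of an immediate 401.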

Key custody and recovery

Every request is signed by the user's secp256k1 key, so the key is the identity. Store it in the platform secure store (iOS Secure Enclave / Android Keystore). Note that recoverable ECDSA with hardware-backed keys is fiddly: such signers typically return only (r, s), so you must determine the recovery byte v yourself (the node tolerates a wrong v by trying both). There is no key backup or recovery -- losing the key permanently loses the identity. Design enrollment and backup UX accordingly.

Multi-device

One keypair is one identity. Multi-device is not specified: sharing the private key across devices authenticates fine, but read progress is node-relative (see above) and any Layer 2 session state is yours to coordinate. Treat the protocol as single-device today.

Identity blobs and key trust

PUT /identity is signed, so a node can attest who published a blob (the address owner), and the blob propagates network-wide (last-write-wins by HLC). But the blob's content is opaque -- there is no protocol-level signature-over-content, version, or fingerprint. Your Layer 2 must add its own versioning and a fingerprint/verification step (trust-on-first-use plus out-of-band verification) before trusting a key bundle.

Groups: rekey is not atomic

A compound ops call can bundle membership changes with accompanying messages (e.g. an MLS Commit), but there is no atomicity between them: a Remove can apply while the rekey message is lost, leaving the group in a broken crypto state. Until atomic membership+rekey exists, detect the gap (an expected rekey never arrives) and recover by re-issuing it.

Text length and large messages

text is validated as 1-1000 Unicode scalar values (Rust chars), not bytes -- an emoji counts as one or more chars, and percent-encoding during canonicalization does not change the count. There is no server-side chunking: messages longer than 1000 must be split client-side, and reassembly is your Layer 2 concern.
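Counting and splitting by Unicode scalar values can be sketched as below. The split is a naive one that cuts every 1000 chars; it can separate a multi-char emoji sequence across two messages, so a production client may prefer to split on grapheme or word boundaries. Function names are illustrative, and reassembly metadata is left to your Layer 2.

```rust
const MAX_CHARS: usize = 1000;

/// True if `text` has between 1 and 1000 Unicode scalar values.
fn is_valid_text(text: &str) -> bool {
    let n = text.chars().count(); // chars, not bytes
    (1..=MAX_CHARS).contains(&n)
}

/// Split `text` into pieces of at most 1000 chars each.
fn split_text(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(MAX_CHARS)
        .map(|chunk| chunk.iter().collect())
        .collect()
}
```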

Feature scope today

No media/attachments. Only text plus a small opaque control payload. Large binaries are out of scope under full replication.
No edit or delete. Messages are append-only and leave only via the retention window. Edits/deletes, if you need them, are a client-side Layer 2 convention (e.g. tombstone control messages), not a protocol feature.
No fetch-by-msg_id. Only ranges; to deep-link to a single message you currently fetch its range and filter client-side.
Channel chats are reserved -- no channel create/post/subscribe endpoints exist yet.

p2p-mes Protocol Documentation