Trivial dedupe
Repeated content is translated once. No background dedupe job to maintain; it's a property of the data model.
Same content, same fingerprint, translated once. A graph that's invariant to where your content lives.
Every translatable unit — a paragraph, a button label, a JSON-LD property — gets a 12-character fingerprint derived from blake3(unit_type || content). The fingerprint is the address. Look it up; get the translations.
Same content always produces the same fingerprint. So if "Add to cart" appears on 500 products, it has one fingerprint — and one translation per locale, shared across all 500.
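As a rough illustration of the fingerprinting described above, here is a minimal Python sketch. Two things in it are assumptions rather than the published spec: `hashlib.blake2b` stands in for BLAKE3 (which is not in the Python standard library), and the 12 characters are taken from a Crockford-style base32 alphabet, since the encoding is not stated here. The `fingerprint` helper and `ALPHABET` constant are hypothetical names.

```python
import hashlib

# Assumption: a Crockford-style base32 alphabet; the real encoding is not specified here.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def fingerprint(unit_type: str, content: str) -> str:
    """Return a 12-character fingerprint of unit_type || content.

    Stand-in: blake2b replaces BLAKE3, so the values will not match
    fingerprints produced by a real BLAKE3 implementation.
    """
    digest = hashlib.blake2b((unit_type + content).encode("utf-8")).digest()
    # Keep the top 60 bits of the digest: 12 symbols of 5 bits each.
    bits = int.from_bytes(digest[:8], "big") >> 4
    return "".join(ALPHABET[(bits >> shift) & 31] for shift in range(55, -1, -5))


# 500 product pages that all carry the same button label.
pages = [("button", "Add to cart") for _ in range(500)]
unique = {fingerprint(unit_type, content) for unit_type, content in pages}
print(len(unique))  # 1: one fingerprint, so one translation per locale
```

Because the unit type is part of the hashed input, the same text used as a paragraph and as a button label gets two different fingerprints.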
<button data-pg="A3F5C9E27K2P">Add to cart</button>
Añadir al carrito at /es/
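The substitution step can be pictured with a small Python sketch. It is illustrative only: the `TRANSLATIONS` dictionary and the `localize` function are hypothetical, and the regular expression handles only simple elements with plain-text bodies. A production edge worker would presumably use a real HTML parser or streaming rewriter instead.

```python
import re

# Hypothetical lookup table: (fingerprint, locale) -> approved translation.
TRANSLATIONS = {("A3F5C9E27K2P", "es"): "Añadir al carrito"}

# Matches a simple element that carries data-pg and whose body is plain text.
TAGGED = re.compile(
    r'(<(\w+)[^>]*\bdata-pg="([0-9A-Z]{12})"[^>]*>)([^<]*)(</\2>)'
)


def localize(html: str, locale: str) -> str:
    """Swap the text of each data-pg element for its translation, if one exists."""

    def swap(match: re.Match) -> str:
        opening, _tag, fp, source_text, closing = match.groups()
        # Fall back to the source text when no translation is stored.
        translated = TRANSLATIONS.get((fp, locale), source_text)
        return f"{opening}{translated}{closing}"

    return TAGGED.sub(swap, html)


page = '<button data-pg="A3F5C9E27K2P">Add to cart</button>'
print(localize(page, "es"))
# <button data-pg="A3F5C9E27K2P">Añadir al carrito</button>
```

The lookup key is the fingerprint alone, so the same call works for every page that contains the button.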
Edit a paragraph and its content hash changes, so its fingerprint changes and only that unit is re-translated. The cache key is derived from the content itself, so nothing else on the page is invalidated.
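A minimal sketch of that behavior, under the same assumptions as any stand-in here: `hashlib.blake2b` substitutes for BLAKE3, the base32 alphabet is a guess, and `fingerprint` and `units_to_translate` are hypothetical helper names. The sample page content is made up for illustration.

```python
import hashlib

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # assumed encoding alphabet


def fingerprint(unit_type: str, content: str) -> str:
    # Stand-in hash: blake2b from the standard library instead of BLAKE3.
    digest = hashlib.blake2b((unit_type + content).encode("utf-8")).digest()
    bits = int.from_bytes(digest[:8], "big") >> 4  # top 60 bits
    return "".join(ALPHABET[(bits >> shift) & 31] for shift in range(55, -1, -5))


def units_to_translate(
    page: list[tuple[str, str]], known: set[str]
) -> list[tuple[str, str]]:
    """Return the units whose fingerprint has no stored translation yet."""
    return [unit for unit in page if fingerprint(*unit) not in known]


before = [("paragraph", "Free shipping over $50."), ("button", "Add to cart")]
known = {fingerprint(*unit) for unit in before}  # already translated

# The paragraph is edited; the button is untouched.
after = [("paragraph", "Free shipping over $75."), ("button", "Add to cart")]
print(units_to_translate(after, known))
# [('paragraph', 'Free shipping over $75.')]
```

No explicit invalidation step appears anywhere: the edited paragraph simply has a fingerprint that has never been seen before.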
The graph isn't tied to a CMS schema. Move WordPress → Sanity → Payload; the same fingerprints match the same content.
-- One row per unique source unit per project
CREATE TABLE source_units (
  id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id  uuid NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
  unit_type   text NOT NULL,   -- 'paragraph' | 'button' | 'product_name' | ...
  fingerprint text NOT NULL,   -- blake3(unit_type || content), 12-char prefix
  content     text NOT NULL,
  embedding   vector(3072),    -- OpenAI text-embedding-3-large
  word_count  int NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now(),
  UNIQUE (project_id, fingerprint)
);

-- Per source_unit × locale × branch
CREATE TABLE translations (
  id             uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  source_unit_id uuid NOT NULL REFERENCES source_units(id) ON DELETE CASCADE,
  locale         text NOT NULL,
  branch_id      uuid REFERENCES branches(id),
  content        text NOT NULL,
  model          text NOT NULL,   -- 'deepl-pro' | 'claude-sonnet' | ...
  quality_score  numeric(3,2),    -- 0.00–1.00 from Haiku judge
  status         text NOT NULL DEFAULT 'pending',
                                  -- 'pending' | 'approved' | 'rejected' | 'tm_match'
  approved_by    uuid REFERENCES users(id),
  approved_at    timestamptz,
  created_at     timestamptz NOT NULL DEFAULT now(),
  UNIQUE (source_unit_id, locale, branch_id)
);

<button>Add to cart</button>.
fingerprint = blake3("button" || "Add to cart") = A3F5C9E27K2P. Inserts (idempotently) into source_units.
translations get inserted with status='tm_match' and cost=0.
data-pg attribute. The Worker substitutes at the edge per locale.

Open spec: published with the marketing site, not behind a sales call.
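The idempotent insert into source_units mentioned in the walkthrough above can be sketched as follows. This is an illustration under stated assumptions, not the product's code: an in-memory SQLite database stands in for Postgres, the table is trimmed to the columns the example needs, and `upsert_source_unit` and the `"demo"` project id are hypothetical. The `ON CONFLICT ... DO NOTHING` clause relies on the `UNIQUE (project_id, fingerprint)` constraint from the schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Trimmed SQLite stand-in for the Postgres table: only the columns used here.
db.executescript("""
CREATE TABLE source_units (
    id          INTEGER PRIMARY KEY,
    project_id  TEXT NOT NULL,
    unit_type   TEXT NOT NULL,
    fingerprint TEXT NOT NULL,
    content     TEXT NOT NULL,
    UNIQUE (project_id, fingerprint)
);
""")


def upsert_source_unit(project_id: str, unit_type: str, fp: str, content: str) -> int:
    """Insert the unit if its fingerprint is new; return its row id either way."""
    db.execute(
        "INSERT INTO source_units (project_id, unit_type, fingerprint, content) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (project_id, fingerprint) DO NOTHING",
        (project_id, unit_type, fp, content),
    )
    row = db.execute(
        "SELECT id FROM source_units WHERE project_id = ? AND fingerprint = ?",
        (project_id, fp),
    ).fetchone()
    return row[0]


first = upsert_source_unit("demo", "button", "A3F5C9E27K2P", "Add to cart")
again = upsert_source_unit("demo", "button", "A3F5C9E27K2P", "Add to cart")
count = db.execute("SELECT COUNT(*) FROM source_units").fetchone()[0]
print(first == again, count)  # True 1
```

Running the insert twice leaves a single row, which is why no separate dedupe job is needed: the uniqueness constraint does the work at write time.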