You have a string, or a small object of strings, that needs to be in another language right now – a form label, a notification, a short block of UI copy a user is waiting on. You do not want to run a webhook endpoint or poll a job for a single round-trip. You want to send the text and read the translation back.
That is what the synchronous Localize endpoint is for: one request, translated data back in the same shape. You POST key-value content with a source and target locale, the call blocks while your engine translates, and the response hands back the same object with its values translated and its structure untouched. There is no job to track and no second call to make.
The translation is not a generic model call. It runs through the localization engine you configured – its glossary, brand voice, instructions, and per-locale model selection – the same engine the async API uses. The difference is only in shape: async fans one request out to many locales and delivers results as they land; this call does one locale pair and returns it inline.
One locale pair and a blocking call are fine? You are on the right page.
Reach for this endpoint when you need a single locale pair and can wait for one round-trip. When you have many target locales, long content, or want failures isolated per locale, the async Localization API takes one request, returns a 202 immediately, and runs each locale as an independent durable background workflow. One more difference beyond latency: the localization pipeline – pre-edit, human review, back-translation, and the other optional stages – runs on async jobs only. This synchronous call ignores pipeline configuration.
Request
POST /process/localize
Authenticate with the X-API-Key header. Keys are organization-scoped and reach every engine in your organization – see Authentication for where to generate one, and Errors and status codes for the full error model.
| Parameter | Type | Description |
|---|---|---|
| `engineId` | string (optional) | The localization engine ID (`eng_...`). Uses your organization's default engine if omitted. |
| `sourceLocale` | string | BCP-47 source locale (e.g. `en`). |
| `targetLocale` | string | BCP-47 target locale (e.g. `de`). |
| `data` | object | Key-value content to translate. Nested objects and arrays are supported; the response mirrors whatever shape you send. |
| `context` | string (optional) | Broad context for this translation payload, such as the product surface, audience, or purpose. Applies to the whole request. |
| `hints` | object (optional) | Contextual hints per key. Keys match `data` keys; values are arrays of breadcrumb strings (e.g. `{ "nav.home": ["Navbar", "Home link"] }`) that tell the engine where a string lives, so it disambiguates short or overloaded text. |
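For typed clients, the parameter table can be mirrored as a TypeScript shape. The following is a minimal sketch and not an official SDK type: `LocalizableValue`, `LocalizeRequest`, and `buildLocalizeRequest` are hypothetical names introduced here, and the sketch assumes leaf values are strings, since the table does not say how non-string leaves are handled.

```typescript
// Assumption: leaves are strings; nesting and arrays follow the "data" row.
type LocalizableValue =
  | string
  | LocalizableValue[]
  | { [key: string]: LocalizableValue };

// Hypothetical type transcribed from the parameter table.
interface LocalizeRequest {
  engineId?: string; // omitted: the organization's default engine is used
  sourceLocale: string; // BCP-47, e.g. "en"
  targetLocale: string; // BCP-47, e.g. "de"
  data: { [key: string]: LocalizableValue };
  context?: string; // applies to the whole request
  hints?: { [key: string]: string[] }; // breadcrumb strings per data key
}

// Hypothetical helper: rejects blank locales locally, before any network call,
// and leaves optional fields out of the body when they were not supplied.
function buildLocalizeRequest(
  sourceLocale: string,
  targetLocale: string,
  data: { [key: string]: LocalizableValue },
  options: { engineId?: string; context?: string; hints?: { [key: string]: string[] } } = {},
): LocalizeRequest {
  if (sourceLocale.trim() === "" || targetLocale.trim() === "") {
    throw new Error("sourceLocale and targetLocale are required");
  }
  return { sourceLocale, targetLocale, data, ...options };
}
```

Passing the result through `JSON.stringify` yields a body in the shape the table describes.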
{
  "engineId": "eng_abc123",
  "sourceLocale": "en",
  "targetLocale": "de",
  "data": {
    "greeting": "Hello, world!",
    "cta": "Get started"
  },
  "hints": {
    "cta": ["Landing page", "Primary button"]
  }
}
Response
The response carries the translated content in the same shape you sent, plus the model that produced it and the per-request cost. The same keys come back, in the same nesting – your code can read the translation out of the structure it already knows.
| Field | Type | Description |
|---|---|---|
| `sourceLocale` | string | BCP-47 source locale, echoed from the request. |
| `targetLocale` | string | BCP-47 target locale, echoed from the request. |
| `data` | object | Translated key-value content, matching the input shape. |
| `model` | string (optional) | LLM model that produced this translation, formatted `provider/model` (e.g. `anthropic/claude-sonnet-4.5`). Read it to know which model in your fallback chain actually ran. Absent when no LLM call was made – see the callout below. |
| `usage` | object (optional) | Token counts and per-request cost in USD. Absent when no LLM call was made. |
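Because `data` comes back in the shape it was sent, a client can assert that before reading translations out of it. The following is a rough sketch of such a check; `sameShape` and `isPlainObject` are hypothetical helpers and not part of any Lingo.dev SDK.

```typescript
// Hypothetical helper: true for non-null, non-array objects.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: true when `received` has the same keys, nesting, and
// array lengths as `sent`. Leaf values only need to share a type, because
// the text itself is expected to differ after translation.
function sameShape(sent: unknown, received: unknown): boolean {
  if (Array.isArray(sent)) {
    if (!Array.isArray(received) || received.length !== sent.length) return false;
    const receivedItems: unknown[] = received;
    return sent.every((item, index) => sameShape(item, receivedItems[index]));
  }
  if (isPlainObject(sent)) {
    if (!isPlainObject(received)) return false;
    const sentObject: Record<string, unknown> = sent;
    const receivedObject: Record<string, unknown> = received;
    const sentKeys = Object.keys(sentObject);
    return (
      sentKeys.length === Object.keys(receivedObject).length &&
      sentKeys.every(
        (key) => key in receivedObject && sameShape(sentObject[key], receivedObject[key]),
      )
    );
  }
  // Leaf: a string sent should come back as a string.
  return !Array.isArray(received) && !isPlainObject(received) && typeof sent === typeof received;
}
```

A check like this is cheap to run on every response, and it flags a structural mismatch before application code reads a missing key.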
The usage object itemizes the cost of the call, so you can attribute spend without a separate billing lookup:
| Field | Type | Description |
|---|---|---|
| `inputTokens` | number | Total input tokens consumed across all chunks. |
| `outputTokens` | number | Total output tokens generated across all chunks. |
| `cacheReadTokens` | number | Input tokens served from the provider's prompt cache, when the model reports them. |
| `cacheWriteTokens` | number | Input tokens written to the provider's prompt cache, when the model reports them. |
| `llmCost` | number | Upstream LLM provider cost in USD. 0 when no cost was reported. |
| `localizationCost` | number | Lingo.dev's per-token cost in USD, computed from `outputTokens`. |
| `cost` | number | Total request cost in USD (`llmCost + localizationCost`). |
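The cost fields lend themselves to simple client-side bookkeeping. The sketch below checks the documented identity `cost = llmCost + localizationCost` and totals spend across several calls. `UsageCost`, `costAddsUp`, and `totalCost` are hypothetical names, and the tolerance is an assumption that absorbs floating-point rounding.

```typescript
// Only the three cost fields from the usage table are needed here.
interface UsageCost {
  llmCost: number; // upstream provider cost in USD
  localizationCost: number; // Lingo.dev per-token cost in USD
  cost: number; // documented as llmCost + localizationCost
}

// Hypothetical helper: compares with a small tolerance, because sums of
// decimal fractions are not exact in floating point.
function costAddsUp(usage: UsageCost, tolerance = 1e-9): boolean {
  return Math.abs(usage.llmCost + usage.localizationCost - usage.cost) <= tolerance;
}

// Hypothetical helper: totals spend over several calls. An entry is undefined
// when a response omitted usage (no LLM call was made) and counts as zero.
function totalCost(usages: Array<UsageCost | undefined>): number {
  return usages.reduce<number>((sum, usage) => sum + (usage?.cost ?? 0), 0);
}

// Figures taken from the sample response on this page.
const sampleUsage: UsageCost = { llmCost: 0.02129, localizationCost: 0.001722, cost: 0.023012 };
console.log(costAddsUp(sampleUsage)); // true
console.log(totalCost([sampleUsage, undefined]).toFixed(6)); // "0.023012"
```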
{
  "sourceLocale": "en",
  "targetLocale": "de",
  "data": {
    "greeting": "Hallo, Welt!",
    "cta": "Jetzt starten"
  },
  "model": "anthropic/claude-sonnet-4.5",
  "usage": {
    "inputTokens": 2789,
    "outputTokens": 861,
    "cacheReadTokens": 0,
    "cacheWriteTokens": 0,
    "llmCost": 0.02129,
    "localizationCost": 0.001722,
    "cost": 0.023012
  }
}
When `model` and `usage` are absent
If data is empty – no keys to translate – the endpoint short-circuits without calling an LLM, and the response omits model and usage. This is the one case where the cost fields are missing, and the reason is that there was no cost: nothing was translated, so nothing was spent. Every request that triggers a translation includes both fields. Treat them as optional in your parser, and you will not be surprised by the empty-input case.
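In TypeScript, treating both fields as optional might look like the following sketch. `LocalizeResult` and `summarizeCall` are hypothetical names; the interface lists only the fields this example reads.

```typescript
// Hypothetical response type: model and usage are optional, per the note above.
interface LocalizeResult {
  sourceLocale: string;
  targetLocale: string;
  data: Record<string, unknown>;
  model?: string;
  usage?: { inputTokens: number; outputTokens: number; cost: number };
}

// Hypothetical helper: one log line per call that handles the empty-input
// case explicitly instead of reading fields that may be missing.
function summarizeCall(result: LocalizeResult): string {
  const pair = `${result.sourceLocale} -> ${result.targetLocale}`;
  if (result.model === undefined || result.usage === undefined) {
    return `${pair}: no LLM call, nothing spent`;
  }
  return `${pair}: ${result.model}, ${result.usage.outputTokens} output tokens, $${result.usage.cost.toFixed(6)}`;
}
```

With the optional markers in place, the compiler rejects any direct read of `usage.cost` that skips the undefined check.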
Examples
The same call in five languages. Each sends a flat object for clarity; data accepts nested objects and arrays too, and the response comes back in whatever shape you send.
const response = await fetch(
  "https://api.lingo.dev/process/localize",
  {
    method: "POST",
    headers: {
      "X-API-Key": "your_api_key",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      engineId: "eng_abc123",
      sourceLocale: "en",
      targetLocale: "de",
      data: {
        greeting: "Hello, world!",
        cta: "Get started",
      },
    }),
  }
);
// Fail loudly on a non-2xx status instead of destructuring an error body.
if (!response.ok) {
  throw new Error(`Localize request failed: HTTP ${response.status}`);
}
const { data } = await response.json();
// { greeting: "Hallo, Welt!", cta: "Jetzt starten" }
What happens during localization
A single POST hides a sequence of steps, and it is worth knowing what they are – because they are why the output is consistent with the rest of your localized content rather than a one-off model guess. When a request hits the endpoint, the engine applies its full configuration in order:
1. Model selection – Selects the highest-priority LLM model matching the locale pair. Locale-specific models take precedence over wildcard (`*`) models. If the primary model fails, the engine falls through to the next ranked model automatically.
2. Brand voice – Loads the brand voice for the target locale, falling back to the wildcard brand voice if no locale-specific one exists.
3. Instructions – Loads every instruction matching the target locale, including wildcard instructions.
4. Glossary lookup – Splits input values into searchable chunks, generates embeddings, and runs a vector similarity search against the engine's glossary. Matched terms enforce exact translations, or mark terms as non-translatable so they pass through verbatim.
5. Generation – Sends the composed prompt to the selected model, then parses and validates the JSON response.
These are the same engine steps the async API runs per job; only the optional pipeline stages described earlier are exclusive to async. Calling sync instead of async changes the delivery shape, not how a translation is produced – so a string translated here and the same string translated in an async job land on the same glossary terms and the same voice.
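The model-selection and fallback behaviour can be pictured with a small conceptual sketch. This is not the engine's implementation. `ModelRule`, its field names, the `rank` convention (lower runs first), and both helper functions are assumptions made for illustration, and the real call would be asynchronous.

```typescript
// Hypothetical rule shape; the engine's actual schema is not documented here.
interface ModelRule {
  model: string; // "provider/model"
  sourceLocale: string; // a locale, or "*" for any
  targetLocale: string; // a locale, or "*" for any
  rank: number; // assumption: lower rank is tried first
}

// A rule is locale-specific when neither side is a wildcard.
function isSpecific(rule: ModelRule): boolean {
  return rule.sourceLocale !== "*" && rule.targetLocale !== "*";
}

// Orders matching rules: locale-specific before wildcard, then by rank.
function rankModels(rules: ModelRule[], sourceLocale: string, targetLocale: string): string[] {
  const matches = rules.filter(
    (rule) =>
      (rule.sourceLocale === "*" || rule.sourceLocale === sourceLocale) &&
      (rule.targetLocale === "*" || rule.targetLocale === targetLocale),
  );
  return [...matches]
    .sort((a, b) => Number(isSpecific(b)) - Number(isSpecific(a)) || a.rank - b.rank)
    .map((rule) => rule.model);
}

// Tries each model in order and reports which one produced the result.
// Synchronous only to keep the sketch short.
function firstSuccessful<T>(
  ranked: string[],
  attempt: (model: string) => T,
): { model: string; result: T } {
  let lastError: unknown = new Error("no model matches this locale pair");
  for (const model of ranked) {
    try {
      return { model, result: attempt(model) };
    } catch (error) {
      lastError = error; // fall through to the next ranked model
    }
  }
  throw lastError;
}
```

The returned `model` plays the same role as the `model` field in the response: it records which entry in the chain actually ran.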
Model fallback is automatic, and the response tells you which one ran
If the primary model fails, the engine attempts the next model in rank order. This happens transparently – the response shape is identical regardless of which model produced the translation. The one signal of a fallback is the model field in the response: read it when you need to know exactly which model in your chain handled a given request.
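On the client side, spotting a fallback comes down to comparing the `model` field with the model you expect to run first. The following is a minimal sketch, assuming you already know your engine's primary model identifier; `classifyModel` and `ModelOutcome` are hypothetical names.

```typescript
type ModelOutcome = "primary" | "fallback" | "no-llm-call";

// Hypothetical helper: classifies a response by its model field.
// primaryModel is supplied by the caller, e.g. from engine configuration.
function classifyModel(response: { model?: string }, primaryModel: string): ModelOutcome {
  if (response.model === undefined) return "no-llm-call"; // empty data short-circuit
  return response.model === primaryModel ? "primary" : "fallback";
}
```

Counting the `"fallback"` outcomes over time gives a rough signal of how often the primary model is failing for a given locale pair.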
