Pre-localization AI edit

A typo in your source is a typo you only get to fix once – before it multiplies. An async job fans one source payload out to every target locale, and each locale translates the text it was handed. So a misspelling, a dropped word, or a broken sentence in the source doesn't stay one problem. It becomes one problem in German, the same problem in French, and the same problem in every other locale the job touches – each one now needing its own correction after the fact.

Pre-localization AI edit (preEdit) stops that multiplication at the source. It is the first stage of the async localization pipeline: before the core translate step runs, an AI agent reviews the source payload and corrects typos, grammar mistakes, and spelling errors. The cleaned source is what gets translated – so you fix it once, before it fans out, instead of catching the same error in a dozen outputs.

This is an async pipeline stage, so it runs only on jobs created through the Async Localization API. The synchronous /localize endpoint runs the core translate step alone and ignores pipeline settings.

What the stage does

preEdit operates on the source, not the translation. An AI agent reads your source payload and rewrites it to remove surface errors – typos, grammar, spelling – then hands the corrected text to the core localization step. Every target locale translates from that cleaned source.

Its scope is deliberately narrow, and that is the point. This is a copy-cleaning pass, not a rewrite: it targets the kind of surface noise that makes source text ambiguous to a translation model, so the model spends its attention on translating rather than guessing what a mangled sentence meant. Cleaner source produces more consistent translations across locales – because every locale starts from the same corrected text instead of each model independently interpreting the same error.
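The ordering described above can be sketched in a few lines. This is a minimal illustration, not the service's implementation: `run_pipeline`, `toy_pre_edit`, and `toy_translate` are hypothetical names, and the toy functions stand in for the AI agent and the translation model.

```python
from typing import Callable, Dict, List


def run_pipeline(
    source: str,
    locales: List[str],
    pre_edit: Callable[[str], str],
    translate: Callable[[str, str], str],
) -> Dict[str, str]:
    """Clean the source once, then translate the cleaned text for every locale."""
    cleaned = pre_edit(source)  # runs once, before the fan-out
    return {locale: translate(cleaned, locale) for locale in locales}


def toy_pre_edit(text: str) -> str:
    # Hypothetical stand-in for the AI cleanup: fixes one known typo.
    return text.replace("teh", "the")


def toy_translate(text: str, locale: str) -> str:
    # Hypothetical stand-in for translation: tags the text with its locale.
    return f"[{locale}] {text}"


outputs = run_pipeline(
    "Add teh item to your cart", ["de", "fr", "ja"], toy_pre_edit, toy_translate
)
```

Because the cleanup happens before the dictionary comprehension, every locale receives the same corrected string; none of them ever sees the typo.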

Idiomatic, native-sounding output – rewriting the translation itself so it reads as if a native copywriter wrote it – is the job of a different stage. See Rephrase for natural copy. preEdit cleans the input; rephrase polishes the output.

It can't make the job worse

The first question a careful engineer asks about an AI step that edits their content before translation is the right one: what happens when that step gets it wrong, or doesn't run at all?

preEdit is a non-critical stage. If the pre-edit call fails or times out, the original source is passed through unchanged and the job continues to the translate step exactly as if the stage were off. A failure here costs you the cleanup on that job – not the job. The translation still ships.

What if pre-edit fails or times out?

The job does not fail. Non-critical stages fall back to their input: on a preEdit failure, the unedited source is translated as-is, and the job runs to completion. The job status becomes completed_with_warnings, the preEdit step is recorded as failed, and the reason lands in the job's warnings array – so you can see it happened without it blocking delivery. Reading those step records is covered on Observe pipeline runs.
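The fallback rule can be sketched as a small wrapper. This is an illustration under stated assumptions, not the service's code: `JobRecord` and `run_non_critical_stage` are hypothetical, the field names mirror the status, step record, and warnings array named above, and the `"completed"` step value for the success path is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class JobRecord:
    # Hypothetical stand-in for the job object; fields follow the prose above.
    status: str = "completed"
    steps: Dict[str, str] = field(default_factory=dict)
    warnings: List[str] = field(default_factory=list)


def run_non_critical_stage(
    name: str, stage: Callable[[str], str], payload: str, job: JobRecord
) -> str:
    """Run a stage; on any failure, record it and return the input unchanged."""
    try:
        result = stage(payload)
    except Exception as exc:  # covers timeouts as well as other errors
        job.steps[name] = "failed"
        job.warnings.append(f"{name}: {exc}")
        job.status = "completed_with_warnings"
        return payload  # fall back to the unedited input
    job.steps[name] = "completed"  # assumed success value
    return result


def flaky_pre_edit(text: str) -> str:
    # Simulates a pre-edit call that times out.
    raise TimeoutError("pre-edit timed out")


job = JobRecord()
source = "Add teh item to your cart"
to_translate = run_non_critical_stage("preEdit", flaky_pre_edit, source, job)
```

After the call, `to_translate` is the original source, so the translate step has something to work with, and the failure is visible on the job record without having raised.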

So the floor, stated honestly: enabling preEdit cannot make a job fail that would otherwise have succeeded. The worst case is that it doesn't help on a given job and quietly steps aside.

What it is not

It is worth stating plainly, right where you might wish the stage did more: preEdit is best-effort. It is a copy-cleaning pass over surface errors – not a proofreader that understands your domain or a fact-checker that validates your claims. It corrects typos, grammar, and spelling. It does not verify that a price is right, that a product name is current, or that a sentence says what you meant it to say. If your source is factually wrong, preEdit will faithfully clean the grammar of a wrong sentence, and the pipeline will translate it cleanly into every locale.

For terms that must stay exactly as written regardless of any AI pass – product names, trademarks, code identifiers – pin them at the source. Mark them non-translatable in your engine's glossary, or, for structural fields in a specific payload, exclude them with lockedKeys. Those are guarantees about the data; preEdit is a best-effort cleanup around them.
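The difference between a guarantee and a best-effort pass can be shown conceptually. The sketch below is not how the service implements lockedKeys: `apply_edit_with_locked_keys` and `toy_edit` are hypothetical, and it assumes a flat key-to-string payload with locked keys given as a list of top-level key names.

```python
from typing import Callable, Dict, List


def apply_edit_with_locked_keys(
    payload: Dict[str, str], locked_keys: List[str], edit: Callable[[str], str]
) -> Dict[str, str]:
    """Apply `edit` to every value except those under a locked key."""
    locked = set(locked_keys)
    return {
        key: (value if key in locked else edit(value))
        for key, value in payload.items()
    }


def toy_edit(text: str) -> str:
    # Hypothetical stand-in for an AI cleanup; deliberately blunt so that
    # it would also "correct" an identifier containing the same letters.
    return text.replace("recieve", "receive")


payload = {"body": "You will recieve an email.", "sku": "recieve-kit-01"}
result = apply_edit_with_locked_keys(payload, ["sku"], toy_edit)
unlocked = apply_edit_with_locked_keys(payload, [], toy_edit)
```

With `sku` locked, the identifier survives byte for byte while the body is cleaned. Without the lock, the same edit rewrites the identifier, which is the failure the lock exists to rule out.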

When to enable it

preEdit earns its extra pass when your source is likely to carry noise, and it's redundant when your source is already clean.

Enable it when source content is user-generated, machine-extracted, scraped, OCR'd, or otherwise authored outside an editorial process – the cases where surface errors are common and the multiply-across-locales cost is real.
Skip it for curated content that has already passed editorial or human review. If the source is clean, there is nothing for the stage to correct, and you would be paying for an AI pass that has no work to do. Each enabled stage is one more step the job runs and one more line on its cost – worth it where source quality is uncertain, wasted where it isn't.

That is the whole trade: spend one pass up front, on the jobs where source quality is uncertain, to fix it once before it fans out – instead of correcting the same error across every locale after delivery.

You toggle preEdit on the engine's Pipeline tab, where it applies to every async job routed to that engine, or override it for a single submission with pipelineConfig on the create-jobs request. Both layers, and how an omitted stage inherits the engine default, are covered on Configure the pipeline.
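The two layers can be pictured as a simple merge. This is a sketch of the inheritance rule only: `resolve_stages` is a hypothetical helper, and representing each stage as a boolean keyed by stage name is an assumption, not the documented shape of pipelineConfig.

```python
from typing import Dict, Optional


def resolve_stages(
    engine_defaults: Dict[str, bool],
    pipeline_config: Optional[Dict[str, bool]] = None,
) -> Dict[str, bool]:
    """Per-request values win; any stage omitted from the request inherits the engine default."""
    resolved = dict(engine_defaults)  # copy so the defaults are not mutated
    resolved.update(pipeline_config or {})
    return resolved


# Assumed shape: stage name -> enabled flag.
engine_defaults = {"preEdit": False, "rephrase": True}

# This request turns preEdit on and says nothing about rephrase.
per_request = resolve_stages(engine_defaults, {"preEdit": True})

# This request sends no pipelineConfig at all.
inherited = resolve_stages(engine_defaults)
```

Here `per_request` enables preEdit for one submission while rephrase keeps the engine's setting, and `inherited` is identical to the engine defaults.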

Next steps

Configure the pipeline

Enable preEdit as an engine default or override it per request with pipelineConfig

Observe pipeline runs

Read the preEdit step record and find warnings when a non-critical stage falls back

Rephrase for natural copy

The output-side counterpart – rewrites the translation to read native rather than cleaning the input

Localization Pipeline

All the stages that wrap the core translate step, and how they fit together