How String Catalog Localization Works

Learn how String Catalog automates iOS and Android localization, from .xcstrings parsing and state tracking to automated pull request delivery.

String Catalog runs a parser-aware localization pipeline for .xcstrings, Android strings.xml, and TestFlight WhatToTest.<locale>.txt files. It scans your project (or accepts uploaded files), extracts strings and tracks their state, queues translations that are missing or need review, and writes normalized output back to disk. For repository projects, those changes can be delivered through an automated pull request workflow.

The pipeline is built to be:

  • Parser-aware (XCStrings, Android XML, TestFlight)
  • Rename-aware
  • Review-state aware
  • Locale-normalization aware
  • Safe to write back into source control

This page explains the full localization workflow end-to-end.

High-Level Pipeline

For repository-based projects, the pipeline runs as follows:

  1. Clone the configured repository and check out the target branch.
  2. Locate supported localization files using consistent parser detection.
  3. Validate file safety and encoding before parsing.
  4. Parse base-language and localized entries.
  5. Store extracted keys and track translation state over time.
  6. Detect changes, renames, and stale keys (if enabled).
  7. Queue missing or needs-review translations.
  8. Run quality checks (if enabled).
  9. Write normalized files back to disk.
  10. Commit, push, and open or update a pull request.

For manual upload projects, the flow is similar but excludes repository cloning and pull request behavior.
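
As a rough illustration of how the two flows differ, the sketch below builds an ordered step list for each project type. The step names, the project type strings, and the `plan_steps` function are hypothetical labels chosen for this example, not String Catalog identifiers.

```python
from typing import List

# Hypothetical step names; repository-only steps are skipped for uploads.
REPO_ONLY = {"clone", "commit_and_pr"}


def plan_steps(project_type: str, quality_checks: bool = False) -> List[str]:
    """Return the ordered pipeline steps for a project type (illustrative)."""
    steps = [
        "clone",
        "discover",
        "validate",
        "parse",
        "track_state",
        "detect_changes",
        "queue_translations",
    ]
    if quality_checks:
        # Optional step, only present when explicitly enabled.
        steps.append("quality_checks")
    steps += ["write_back", "commit_and_pr"]
    if project_type == "manual_upload":
        steps = [s for s in steps if s not in REPO_ONLY]
    return steps
```

Calling `plan_steps("manual_upload")` yields the same sequence as the repository flow minus the clone and pull request steps.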

Parser Detection & File Discovery

String Catalog selects a parser based on file type and structure:

  • .xcstrings
  • strings.xml
  • TestFlight WhatToTest.<locale>.txt files

Repository projects discover files by scanning the checked-out working tree. Manual upload projects operate on the uploaded file payload. In both cases, parser selection is consistent so the same inputs produce the same outputs.

Each parser is responsible for:

  • Determining base-language records
  • Identifying localized variants
  • Preserving format-specific structure and markup
  • Handling file-structure-specific edge cases

Only supported and validated files enter the pipeline.
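
A minimal sketch of name-based parser selection follows. It assumes detection can start from the file name alone; the real selection also considers file structure, which is not modeled here. The parser labels and the `detect_parser` function are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Assumed shape of TestFlight notes, for example WhatToTest.en-US.txt.
_WHAT_TO_TEST = re.compile(r"^WhatToTest\.([A-Za-z0-9_-]+)\.txt$")


def detect_parser(path: str) -> Optional[str]:
    """Return an illustrative parser label for a path, or None if unsupported."""
    name = PurePosixPath(path).name
    if name.endswith(".xcstrings"):
        return "xcstrings"
    if name == "strings.xml":
        return "android_xml"
    if _WHAT_TO_TEST.match(name):
        return "testflight"
    return None
```

Because the function depends only on its input, the same path always maps to the same parser, which is the property the text above describes.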

Validation & Safety Model

Before parsing, each file is validated to ensure safe and predictable processing.

Validation includes:

  • Encoding and structure safety checks
  • Non-text and unsupported payload detection
  • XML safety checks for unsafe constructs
  • Complexity and size guardrails

Files that fail validation are rejected before parsing begins.
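
The checks above could look roughly like the following pre-parse gate. This is a sketch under assumptions: the size limit, the UTF-8 requirement, and the specific constructs rejected (DOCTYPE and ENTITY declarations) are illustrative choices, not documented String Catalog rules.

```python
from typing import Optional

MAX_BYTES = 5 * 1024 * 1024  # hypothetical size guardrail


def validate_payload(data: bytes, is_xml: bool) -> Optional[str]:
    """Return a rejection reason, or None when the payload looks safe to parse."""
    if len(data) > MAX_BYTES:
        return "too large"
    if b"\x00" in data:
        # NUL bytes are a cheap signal for non-text content.
        return "binary content"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    if is_xml and ("<!DOCTYPE" in text or "<!ENTITY" in text):
        # Declarations like these enable entity-expansion and external-entity attacks.
        return "unsafe XML construct"
    return None
```

Running the gate before any parser touches the bytes keeps a malformed or hostile file from reaching parsing code at all.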

State Tracking Model

String Catalog maintains structured state for each string entry:

  • Base-language value
  • Localized values
  • Translation status
  • Needs-review flag
  • Stale marker (if applicable)

This state model allows the system to reason about:

  • When translations are missing
  • When base content changes
  • When keys are renamed
  • When localized values require review
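
One way to picture this state is as a small record per key. The class names, field names, and status values below are hypothetical and only mirror the fields listed above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    MISSING = "missing"
    TRANSLATED = "translated"
    NEEDS_REVIEW = "needs_review"


@dataclass
class LocalizedValue:
    text: str = ""
    status: Status = Status.MISSING


@dataclass
class StringEntry:
    key: str
    base_value: str
    localized: Dict[str, LocalizedValue] = field(default_factory=dict)
    stale: bool = False

    def pending_locales(self, targets: List[str]) -> List[str]:
        """Locales that are missing or flagged for review, in target order."""
        pending = []
        for locale in targets:
            value = self.localized.get(locale)
            if value is None or value.status is not Status.TRANSLATED:
                pending.append(locale)
        return pending
```

With a record like this, "what needs translating" becomes a query over state instead of a diff of files.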

Rename-Aware Retranslation

When enabled, rename-aware retranslation detects key identifier changes where the underlying value is carried forward.

This is especially important when localization files are modified outside of Xcode or other IDE tooling, where rename metadata may not be preserved.

Behavior:

  • If a key is renamed but its value is carried forward, translations can often be preserved.
  • If the value meaningfully changes, translations are requeued.
  • Detection is based on structural comparison rather than relying solely on editor-generated metadata.

This helps prevent translation loss during refactors while maintaining integrity when semantics change.
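
The sketch below shows one simple form of structural rename detection: a removed key and an added key are paired when they carry an identical base value and the match is unambiguous. Exact value equality is an assumption made for this example; the actual comparison may be more tolerant.

```python
from typing import Dict, List


def detect_renames(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, str]:
    """Map removed keys to added keys whose base value is identical.

    Only one-to-one matches are reported; ambiguous values are skipped.
    """
    removed = {k: v for k, v in old.items() if k not in new}
    added = {k: v for k, v in new.items() if k not in old}

    removed_by_value: Dict[str, List[str]] = {}
    for key, value in removed.items():
        removed_by_value.setdefault(value, []).append(key)
    added_by_value: Dict[str, List[str]] = {}
    for key, value in added.items():
        added_by_value.setdefault(value, []).append(key)

    renames = {}
    for value, old_keys in removed_by_value.items():
        new_keys = added_by_value.get(value, [])
        if len(old_keys) == 1 and len(new_keys) == 1:
            renames[old_keys[0]] = new_keys[0]
    return renames
```

Keys reported here can keep their existing translations under the new identifier, while anything unmatched falls through to normal requeueing.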

Needs-Review Propagation

When a base-language value changes, non-base locales may be marked as NeedsReview.

This ensures:

  • Outdated translations are surfaced.
  • Review state propagates consistently from base to non-base locales.
  • Changes do not silently overwrite existing translations.

Internal safeguards help reduce redundant review triggers and duplicate processing.
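
A minimal sketch of the propagation rule, assuming three hypothetical status strings (`translated`, `missing`, `needs_review`): only existing translations are flagged, and an unchanged base value triggers nothing.

```python
from typing import Dict


def propagate_needs_review(
    old_base: str, new_base: str, statuses: Dict[str, str]
) -> Dict[str, str]:
    """Return updated per-locale statuses after a base-language edit."""
    if old_base == new_base:
        # Nothing changed, so avoid a redundant review trigger.
        return dict(statuses)
    updated = {}
    for locale, status in statuses.items():
        # Missing entries are already queued; flagged ones stay flagged.
        updated[locale] = "needs_review" if status == "translated" else status
    return updated
```

Note that the translated text itself is never touched here; only the status changes, which is what keeps existing translations from being silently overwritten.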

Locale Normalization & Preferred Codes

Locale variants are normalized internally to ensure consistent tracking.

For example:

  • en, en-US, and en_US are canonicalized for state tracking.
  • Preferred codes are preserved when writing back to disk (for example en-US).
  • Locale matching remains predictable across parsers.

Normalization prevents duplicate locale entries while maintaining compatibility with platform expectations.
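
The following sketch separates the canonical tracking key from the preferred spelling used on write-back. It only unifies separator and case; whether a bare `en` is folded into a regional variant is not specified above and is left out. Both function names are hypothetical.

```python
from typing import Dict, Iterable


def canonical_locale(code: str) -> str:
    """Canonical tracking key: hyphen separator, lowercase."""
    return code.strip().replace("_", "-").lower()


def group_by_canonical(codes: Iterable[str]) -> Dict[str, str]:
    """Map each canonical key to the first spelling seen (the preferred code)."""
    preferred: Dict[str, str] = {}
    for code in codes:
        preferred.setdefault(canonical_locale(code), code)
    return preferred
```

Tracking by the canonical key while remembering the preferred spelling means `en-US` and `en_US` share one state record but the file on disk keeps the code the project already uses.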

Stale Key Handling

When enabled, stale key removal can delete entries marked as stale.

Stale behavior differs by project type:

  • Repository projects generally preserve localized rows unless source keys are removed from the base language.
  • Manual upload projects can prune localized rows that disappear from incoming file payloads.

Stale removal is optional and only runs when explicitly enabled.
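
As a simplified illustration, stale detection can be treated as a set difference between tracked keys and the keys present in the incoming base language, with removal gated behind an explicit flag. The function names and the `enabled` parameter are assumptions for this sketch, and the per-project-type differences described above are not modeled.

```python
from typing import Dict, Set


def find_stale_keys(tracked: Set[str], incoming: Set[str]) -> Set[str]:
    """Keys that are tracked but no longer present in the incoming base keys."""
    return tracked - incoming


def prune(entries: Dict[str, str], stale: Set[str], enabled: bool) -> Dict[str, str]:
    """Drop stale entries only when removal is explicitly enabled."""
    if not enabled:
        return dict(entries)
    return {k: v for k, v in entries.items() if k not in stale}
```

Keeping detection and deletion as separate steps lets stale markers be recorded even when removal is switched off.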

Write-Back Normalization

After translation and review processing, files are rewritten in normalized form.

Normalization ensures:

  • Correct encoding
  • Proper escaping
  • Preservation of inline XML markup (including elements such as xliff:g)
  • Platform-compliant structure for XCStrings and Android XML

String Catalog avoids double encoding and unintended structural mutations.
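
To make "avoids double encoding" concrete, here is a small sketch that escapes only bare ampersands and leaves existing entities and inline markup untouched. It covers a single escaping rule and is an illustration of the idea, not the product's writer; `escape_ampersands` is a hypothetical name.

```python
import re

# Matches "&" that does not already begin an entity or character reference.
_BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)")


def escape_ampersands(text: str) -> str:
    """Escape bare ampersands without re-escaping existing entities."""
    return _BARE_AMP.sub("&amp;", text)
```

Applying the function twice gives the same result as applying it once, so repeated syncs do not turn `&amp;` into `&amp;amp;`, and elements such as xliff:g pass through unchanged.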

Repository Commit & Pull Request Flow

For repository-based projects:

  • Changes are committed to the configured branch.
  • Pull requests are created or updated automatically.
  • Repeat syncs aim to avoid unnecessary churn when output is already normalized.
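
One common way to avoid churn, shown here as a sketch rather than a description of the actual implementation, is to skip the write when the normalized output is byte-identical to what is already on disk. With no file changes there is nothing to commit. The `write_if_changed` helper is hypothetical.

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content only when it differs from the file on disk.

    Returns True when the file was written, False when it was left alone.
    """
    new_bytes = content.encode("utf-8")
    if path.exists() and path.read_bytes() == new_bytes:
        return False
    path.write_bytes(new_bytes)
    return True
```

A sync can collect the return values and skip the commit and pull request update entirely when every call returns False.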

Background Processing

Sync and translation work runs asynchronously.

This ensures:

  • Large projects do not block UI interactions.
  • Translation work can be retried safely.
  • Processing remains consistent across concurrent runs.
  • Repeat runs aim to stay consistent for the same inputs and avoid duplicate work.
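
Avoiding duplicate work across repeat runs is often done by giving each unit of work a deterministic key. The sketch below assumes a job is identified by project, string key, base value, and target locale; the `job_key` function and `JobQueue` class are hypothetical, and the in-memory set stands in for whatever durable store a real system would use.

```python
import hashlib
import json
from typing import List, Set, Tuple


def job_key(project_id: str, key: str, base_value: str, locale: str) -> str:
    """Deterministic identifier for one translation job."""
    payload = json.dumps([project_id, key, base_value, locale], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class JobQueue:
    """In-memory queue that ignores jobs it has already accepted."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.pending: List[Tuple[str, str, str, str]] = []

    def enqueue(self, project_id: str, key: str, base_value: str, locale: str) -> bool:
        ident = job_key(project_id, key, base_value, locale)
        if ident in self._seen:
            return False  # same inputs were already queued
        self._seen.add(ident)
        self.pending.append((project_id, key, base_value, locale))
        return True
```

Because the base value is part of the key, editing the source string produces a new job, while a retry with unchanged inputs is a no-op.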

Summary

String Catalog provides a parser-aware localization automation pipeline that:

  • Understands platform-specific file formats.
  • Tracks translation state and review status.
  • Detects renames and base-language changes.
  • Normalizes locale handling.
  • Safely writes back into source-controlled projects.

The result is automated localization that preserves structure, reduces churn, and integrates cleanly into modern development workflows.