Metadata-Version: 2.1
Name: moz.l10n
Version: 0.5.6
Summary: Mozilla tools for localization
Author-email: Mozilla <l10n-drivers@mozilla.org>, Eemeli Aro <eemeli@mozilla.com>
License: Apache-2.0
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Localization
Classifier: Topic :: Software Development :: Testing
Requires-Python: ~=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fluent.syntax~=0.19.0
Requires-Dist: gitignorant~=0.3.1
Requires-Dist: iniparse~=0.5
Requires-Dist: polib~=1.2
Requires-Dist: tomli>=1.1.0; python_version < "3.11"
Provides-Extra: xml
Requires-Dist: lxml~=5.0; extra == "xml"

# moz.l10n
This is a library of Python tools and utilities for working with localization files,
primarily built for internal use at Mozilla.
The core idea here is to establish [Message](./moz/l10n/message.py) and [Resource](./moz/l10n/resource.py)
as format-independent representations of localizable and localized messages and resources,
so that operations like linting and transforms can be applied to them.
The Message and Resource representations are drawn from work done for the
Unicode [MessageFormat 2 specification](https://github.com/unicode-org/message-format-wg/tree/main/spec).
Support for XML formats (`android`, `xliff`) is an optional extra;
to support them, install as `moz.l10n[xml]`.
## Command-line Tools
For usage details, use each command's `--help` argument.
### `l10n-build`
Build localization files for release.
Iterates source files as defined by `--config`, reads localization sources from `--base`, and writes to `--target`.
Trims out all comments and messages not in the source files for each of the `--locales`.
Adds empty files for any source files missing from the target locale.
### `l10n-build-file`
Build one localization file for release.
Uses the `--source` file as a baseline, applying `--l10n` localizations to build `--target`.
Trims out all comments and messages not in the source file.
### `l10n-compare`
Compare localizations to their `source`, which may be
- a directory (using `L10nDiscoverPaths`),
- a TOML config file (using `L10nConfigPaths`), or
- a JSON file containing a mapping of file paths to arrays of messages.
### `l10n-fix`
Fix the formatting for localization resources.
If `paths` is a single directory, it is iterated with `L10nConfigPaths` if `--config` is set, or `L10nDiscoverPaths` otherwise.
If `paths` is not a single directory, its values are treated as glob expressions, with `**` support.
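
As a rough illustration of what `**` support means for glob expressions, the sketch below uses the standard library's `glob` module with `recursive=True`. The `expand_paths` helper and the file names are hypothetical, and this is not necessarily how `l10n-fix` expands its arguments.

```python
import glob
import os
import tempfile


def expand_paths(patterns):
    """Expand glob patterns, letting `**` match any depth of directories.

    Hypothetical helper for illustration; not part of moz.l10n.
    """
    found = set()
    for pattern in patterns:
        found.update(glob.glob(pattern, recursive=True))
    return sorted(found)


# Build a small throwaway tree with one top-level and one nested file.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "browser", "locales")
    os.makedirs(nested)
    for path in (os.path.join(root, "top.ftl"), os.path.join(nested, "deep.ftl")):
        open(path, "w").close()

    # `**` matches zero or more directory levels, so both files are found.
    matches = expand_paths([os.path.join(root, "**", "*.ftl")])
    names = [os.path.basename(m) for m in matches]
```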
## moz.l10n.paths
### L10nConfigPaths
Wrapper for localization config files.
Supports a subset of the format specified at:
Differences:
- `[build]` is ignored
- `[[excludes]]` are not supported
- `[[filters]]` are ignored
- `[[paths]]` must always include both `reference` and `l10n`
Does not consider `.l10n-ignore` files.
### L10nDiscoverPaths
Automagical localization resource discovery.
Given a root directory, finds the likeliest reference and target directories.
The reference directory has a name like `templates`, `en-US`, or `en`,
and contains files with extensions that appear localizable.
The localization target root is a directory with subdirectories named as
BCP 47 locale identifiers, i.e. like `aa`, `aa-AA`, `aa-Aaaa`, or `aa-Aaaa-AA`.
An underscore may also be used as a separator, as in `en_US`.
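
The directory-name shapes listed above can be approximated with a regular expression. This is a minimal sketch, not the pattern `L10nDiscoverPaths` actually uses: the name `looks_like_locale` is hypothetical, and allowing two- or three-letter language subtags is an assumption.

```python
import re

# Assumed shape: language, optional four-letter script, optional two-letter
# region, separated by "-" or "_". Hypothetical; the library may differ.
LOCALE_DIR = re.compile(r"[a-z]{2,3}(?:[-_][A-Z][a-z]{3})?(?:[-_][A-Z]{2})?")


def looks_like_locale(name: str) -> bool:
    """Return True if a directory name has a locale-like shape."""
    return LOCALE_DIR.fullmatch(name) is not None


candidates = ["en", "en-US", "sr-Cyrl", "zh-Hant-TW", "en_US", "templates"]
locale_like = [name for name in candidates if looks_like_locale(name)]
```

With this pattern, `templates` is rejected, which matches its role as a reference directory name rather than a locale.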
## moz.l10n.resources
Parsers and serializers are provided for a number of formats,
using common and well-established libraries to take care of the details.
A unified API for these is provided,
such that `FORMAT_parse(text)` will always accept `str` input,
and `FORMAT_serialize(resource)` will always provide a `str` iterator.
All the serializers accept a `trim_comments` argument
which leaves out comments from the serialized result,
but additional input types and options vary by format.
The library currently supports the following resource formats:
- `android`: Android string resources (strings.xml)
- `dtd`: .dtd
- `fluent`: Fluent (.ftl)
- `inc`: .inc
- `ini`: .ini
- `plain_json`: Plain JSON (.json)
- `po`: Gettext (.po, .pot)
- `properties`: .properties
- `webext`: WebExtensions (messages.json)
- `xliff`: XLIFF 1.2, including Xcode customizations (.xlf, .xliff)
### add_entries
```python
def add_entries(
    target: Resource,
    source: Resource,
    *,
    use_source_entries: bool = False
) -> int
```
Modifies `target` by adding entries from `source` that are not already present in `target`.
Standalone comments are not added.
If `use_source_entries` is set,
entries from `source` override those in `target` when they differ,
and section comments and metadata are also updated from `source`.
Entries are not copied, so further changes will be reflected in both resources.
Returns a count of added or changed entries and sections.
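
The note that entries are not copied is easy to miss. The toy model below shows the consequence using plain dictionaries in place of `Resource` objects. `ToyEntry` and `toy_add_entries` are hypothetical stand-ins that ignore sections, comments, and metadata; they are not the library's implementation.

```python
from dataclasses import dataclass


@dataclass
class ToyEntry:
    """Hypothetical stand-in for a resource entry."""
    id: str
    value: str


def toy_add_entries(target, source, *, use_source_entries=False):
    """Add missing entries from source to target, returning a change count."""
    changed = 0
    for key, entry in source.items():
        if key not in target:
            target[key] = entry  # the same object, not a copy
            changed += 1
        elif use_source_entries and target[key] != entry:
            target[key] = entry  # source overrides a differing target entry
            changed += 1
    return changed


target = {"hello": ToyEntry("hello", "Hei")}
source = {
    "hello": ToyEntry("hello", "Hello"),
    "bye": ToyEntry("bye", "Goodbye"),
}

added = toy_add_entries(target, source)  # only "bye" is missing from target

# Because the entry object is shared, editing it through `source`
# is also visible through `target`.
source["bye"].value = "Bye"
shared_value = target["bye"].value
```

If the shared state is unwanted, one option is to deep-copy the source before adding its entries.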
### detect_format
```python
def detect_format(name: str | None, source: bytes | str) -> Format | None
```
Detect the format of the input based on its file extension
and/or contents.
Returns a `Format` enum value, or `None` if the input is not recognized.
### iter_resources
```python
def iter_resources(
    root: str,
    dirs: list[str] | None = None,
    ignorepath: str = ".l10n-ignore"
) -> Iterator[tuple[str, Resource[Message, str] | None]]
```
Iterate through localizable resources under the `root` directory.
Use `dirs` to limit the search to only some subdirectories under `root`.
Yields `(str, Resource | None)` tuples,
with the file path and the corresponding `Resource`,
or `None` for files that could not be parsed as localization resources.
To ignore files, include a `.l10n-ignore` file in `root`,
or some other location passed in as `ignorepath`.
This file uses a git-ignore syntax,
and is always based in the `root` directory.
### l10n_equal
```python
def l10n_equal(a: Resource, b: Resource) -> bool
```
Compares the localization-relevant content
(id, comment, metadata, message values) of two resources.
Sections with no message entries are ignored,
and the order of sections, entries, and metadata is ignored.
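
To make the ordering and empty-section rules concrete, here is a simplified model of such a comparison. It represents a resource as a list of `(section_id, entries)` pairs and compares only ids and values; comments and metadata, which the real function also compares, are left out. `toy_l10n_equal` is a hypothetical name, not the library's code.

```python
def normalize(resource):
    """Drop empty sections and discard ordering by converting to dicts."""
    return {
        section_id: dict(entries)
        for section_id, entries in resource
        if entries  # sections with no message entries are ignored
    }


def toy_l10n_equal(a, b):
    """Compare two toy resources, ignoring section and entry order."""
    return normalize(a) == normalize(b)


a = [
    ("menu", {"open": "Open", "save": "Save"}),
    ("empty", {}),
    ("dialog", {"ok": "OK"}),
]
# Same content with sections and entries reordered and no empty section.
b = [
    ("dialog", {"ok": "OK"}),
    ("menu", {"save": "Save", "open": "Open"}),
]
# One message value differs.
c = [
    ("dialog", {"ok": "Okay"}),
    ("menu", {"save": "Save", "open": "Open"}),
]

same = toy_l10n_equal(a, b)
different = toy_l10n_equal(a, c)
```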
### parse_resource
```python
def parse_resource(
    input: Format | str | None,
    source: str | bytes | None = None
) -> Resource[Message, str]
```
Parse a Resource from its string representation.
The first argument may be an explicit Format,
the file path as a string, or None.
For the latter two types,
an attempt is made to detect the appropriate format.
If the first argument is a string path,
the `source` argument is optional,
as the file will be opened and read.
### serialize_resource
```python
def serialize_resource(
    resource: Resource[str, str] | Resource[Message, str],
    format: Format | None = None,
    trim_comments: bool = False
) -> Iterator[str]
```
Serialize a Resource as its string representation.
If `format` is set, it overrides the `resource.format` value.
With `trim_comments`,
all standalone and attached comments are left out of the serialization.