One of the core features Transifex provides is handling files with translatable content (resources) in various localization formats, like XML or PO files.
Part of that functionality is importing such files into its internal storage and exporting them whenever users request them, either to ship with their software or to translate the file into another language on their local computer. Although both operations may use customized code for certain formats (especially when importing resources), each one follows a specific set of steps.
Whenever you upload a file with strings in it, Transifex will try to parse it, extract the necessary information and then store that information in the database.
Since each format is different, there are specialized parsers for each one. In some cases, Transifex uses a third-party parser, like polib for PO files. In other cases, we have developed custom parsers.
The main responsibility of a parser is to extract the necessary information from the imported file.
If the file is the source file (that is, the file with the strings in the source language), we are interested in three things:
- The keys (for example, the msgid entries in a PO file). The keys are used to uniquely match the strings in the source language with those in the translations. We also generate a unique hash for each key as an identifier.
- The source strings (for example, the msgstr entries in a PO file). These are the actual strings of the source language.
- The template, that is, the structure of the file with the hash of each key standing in for its string. It is used later when exporting.
If the file is a translation of the resource into another language, we are only interested in the translations; any other changes in the file are ignored.
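The extraction step can be illustrated with a minimal sketch. It is not the Transifex parser: the regular expression only handles single-line msgid/msgstr pairs, and the names `key_hash`, `parse_source` and `parse_translation`, the use of MD5, and the rule of dropping unknown keys from a translation file are all assumptions made for illustration.

```python
import hashlib
import re

# Matches one single-line msgid/msgstr pair. Multi-line strings, plurals,
# comments and escaped quotes are deliberately out of scope for this sketch.
ENTRY_RE = re.compile(r'msgid "(?P<key>[^"]*)"\nmsgstr "(?P<string>[^"]*)"')


def key_hash(key):
    """Return a stable identifier for a key (MD5 is an assumption)."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def parse_source(content):
    """Return ({hash: (key, source string)}, template) for a source file."""
    strings = {}

    def to_placeholder(match):
        key = match.group("key")
        digest = key_hash(key)
        strings[digest] = (key, match.group("string"))
        # Keep the key in the template and swap the string for the hash.
        return f'msgid "{key}"\nmsgstr "{digest}"'

    template = ENTRY_RE.sub(to_placeholder, content)
    return strings, template


def parse_translation(content, known_hashes):
    """Return {hash: translation}, keeping only keys known from the source."""
    translations = {}
    for match in ENTRY_RE.finditer(content):
        digest = key_hash(match.group("key"))
        if digest in known_hashes:
            translations[digest] = match.group("string")
    return translations


source = 'msgid "greeting"\nmsgstr "Hello"\n\nmsgid "farewell"\nmsgstr "Goodbye"\n'
strings, template = parse_source(source)
```

Passing the dictionary returned by `parse_source` as `known_hashes` is enough, since only membership of the hash is checked.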
As soon as we have the necessary information from the previous step, we store it in the database as source entities, translations and templates.
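As a rough sketch of what storing this information could look like, the example below uses an in-memory SQLite database. The table and column names, the `store_source` helper, and the choice to keep source strings as translations in the source language are hypothetical; the actual Transifex schema is not described here.

```python
import sqlite3

# Hypothetical schema: one table per kind of stored information.
SCHEMA = """
CREATE TABLE source_entity (hash TEXT PRIMARY KEY, key TEXT NOT NULL);
CREATE TABLE translation (
    hash TEXT NOT NULL REFERENCES source_entity(hash),
    language TEXT NOT NULL,
    string TEXT NOT NULL,
    PRIMARY KEY (hash, language)
);
CREATE TABLE template (resource TEXT PRIMARY KEY, content TEXT NOT NULL);
"""


def store_source(conn, resource, language, strings, template):
    """Persist parsed source data; `strings` maps hash -> (key, string)."""
    with conn:  # commits on success, rolls back on error
        for digest, (key, string) in strings.items():
            conn.execute(
                "INSERT OR REPLACE INTO source_entity VALUES (?, ?)",
                (digest, key),
            )
            conn.execute(
                "INSERT OR REPLACE INTO translation VALUES (?, ?, ?)",
                (digest, language, string),
            )
        conn.execute(
            "INSERT OR REPLACE INTO template VALUES (?, ?)",
            (resource, template),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store_source(
    conn,
    "app.po",
    "en",
    {"abc123": ("greeting", "Hello")},
    'msgid "greeting"\nmsgstr "abc123"\n',
)
```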
Whenever a user asks to download a translation file in a particular language, the file has to be exported from the database.
The procedure is largely the same for all formats. After fetching the template and the translation strings in the requested language, we do a search-and-replace in the template, replacing each hash in it with the actual string that corresponds to it. Next, any format-specific operations are performed (like adding the translator copyrights in PO files) and the result is delivered to the user.
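The export step can be sketched in a few lines. This is an illustration under assumptions, not the Transifex exporter: hashes are assumed to be 32-character hexadecimal strings, untranslated entries are assumed to come out empty, and the `export` function with its `translator_credit` parameter and comment header is a hypothetical stand-in for the format-specific step.

```python
import re

# Assumption: every placeholder in the template is a 32-character hex hash.
HASH_RE = re.compile(r"\b[0-9a-f]{32}\b")


def export(template, translations, translator_credit=None):
    """Fill a template with translations given as {hash: string}."""

    def lookup(match):
        # Entries without a translation are exported as empty strings.
        return translations.get(match.group(0), "")

    # A function is used as the replacement so that backslashes in the
    # translated strings are inserted literally.
    body = HASH_RE.sub(lookup, template)

    # Format-specific step (hypothetical): prepend a translator comment.
    if translator_credit:
        body = f"# Translators:\n# {translator_credit}\n" + body
    return body


placeholder = "a" * 32
template = f'msgid "greeting"\nmsgstr "{placeholder}"\n'
french = export(template, {placeholder: "Bonjour"}, "Jane Doe <jane@example.com>")
```

Doing the substitution in a single regular-expression pass, rather than one `str.replace` call per hash, also clears placeholders that have no translation yet.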
You can find more details about the storage engine of Transifex in the docs.