Localization of PO and MO files
PO and MO files are the files of the Gettext library, which is commonly used in free software. Besides the standard implementation for C and C++, there are implementations of the library for many programming languages: PHP, Python, Perl, Pascal, Java and many others. The description of the Gettext library is available at: http://www.gnu.org/software/gettext/manual/gettext.html

Strings in PO and MO files are stored as lists of entries. Each entry contains the fields msgid (the original string) and msgstr (the translation). The first entry has an empty msgid, and its msgstr holds the header. The header is a set of fields; the Content-Type field contains the name of the file's encoding.

PO files are text files; MO files are binary. MO files are obtained by compiling PO files. MO files contain a subset of the fields from the PO files: they do not store flags, comments, references or obsolete entries.

PO files do not contain the BOM (Byte Order Mark) signature at the beginning of the file; therefore, the UTF-16 and UTF-32 encodings are not supported in PO files or, consequently, in MO files.

The file header does not include information on the language; therefore, in Radialix the msgid and msgstr languages must be set in the file properties dialog on the Source Resources tab.

Plurals

Support for plurals is the primary difference between the Gettext library and other localization tools. Support for plurals improves the perception of the text. For example, instead of the string "%d day(s) ago" you could specify two strings: "%d day ago" and "%d days ago". Other languages may need more forms; in Russian, for instance, the form "%d день назад" is used for 1, 21, 31, etc.

In the PO file, the original string is stored in the entry:

msgid "%d day ago"

and the rule for choosing the form is stored in the header:

"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%"

where nplurals is the number of plural forms, and plural is the formula for calculating the index of the form.

Format Specifiers

Another distinctive feature of PO files is the support for several format specifier types for each string. The specifier types are listed in the flags field, which may include several types at once; that is, the same string can be used simultaneously by several applications written in different languages. Those languages, of course, must support similar specifier formats. When validating, the strings must be checked against all the specified formats.

Support for PO and MO files in Radialix

Radialix supports all the features of Gettext library files described above, as detailed in the following sections.
String Extraction Settings

Settings for extracting strings from PO and MO files are displayed on the Source Resources tab in the file properties dialog.
Radialix lets you select which strings are extracted, using the msgid and msgstr options. If both fields are selected, the translation is imported when the resources are extracted and when new strings are added during a resource update. Radialix can detect the encoding of the file automatically by reading the file header: select the <Auto> item in the Encoding list. The detected encoding name appears in the same list after the <Auto> tag. The result of the string conversion appears in the Preview grid, and the error count appears under Character conversion errors. The initial string extraction settings can be made in the new project properties dialog, in the Parsers>Gettext Files section of the Project>Default Project Properties menu.

Target File Settings
Radialix can create both MO and PO files; the format is selected with the Target Build Action item on the Target Settings tab of the file properties dialog.
In the target settings, you can enable automatic filling of the header with data from the project information (the Options tab), and configure the plural forms and the file encoding (the Plurals and Encodings tabs respectively). The Byte order option is used only when creating MO files. The <Default> value stands for the byte order of the original file if it is an MO file, or for the LittleEndian order otherwise. The plural form settings (the number of forms and the formula for calculating the string index) are set in the Language-Formula grid. To obtain the settings for the required language, Radialix searches the grid beginning with the first record. When a matching language is found, or when the entry is tagged <Any Language>, the search stops and the program uses the parameters set in that entry. If no suitable entry is found, an error message is displayed when the target files are created.
For new files, the encoding is also set in a grid with Language as the first column, and Radialix searches this grid for the encoding in the same way. The <Default> encoding stands for the encoding of the original file if that file does not use UTF-8; otherwise, it stands for the default encoding for the target language.
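The first-match search through the grids can be illustrated with a short sketch. It is only an illustration of the rule described above, not Radialix's actual code: the function name find_settings, the grid layout as a list of (language, settings) pairs, and the sample rows are all assumptions made for the example.

```python
ANY_LANGUAGE = "<Any Language>"


def find_settings(grid, language):
    """Return the settings of the first row that matches `language`
    or is tagged as the catch-all <Any Language> row."""
    for row_language, settings in grid:
        if row_language == language or row_language == ANY_LANGUAGE:
            return settings
    # Mirrors the documented behaviour: no suitable entry is an error.
    raise LookupError(f"no grid entry for language {language!r}")


# Hypothetical Language-Formula grid: (language, (nplurals, formula)).
plural_grid = [
    ("Japanese", (1, "0")),
    (ANY_LANGUAGE, (2, "n != 1")),
]

print(find_settings(plural_grid, "Japanese"))
print(find_settings(plural_grid, "French"))
```

Because the search stops at the first suitable row, the order of rows matters: a catch-all row placed first would hide every language-specific row below it.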
Just as with the string extraction settings, the initial values for new files can be edited in the new project properties dialog, in the Parsers>Gettext Files section of the Project>Default Project Properties menu.

Plural Formula Editor

The plural formula editor is used to set the number of forms and the formula for calculating the index of the msgstr string. The formula is a C expression with one integer parameter n, which is the numeric argument substituted into the string. For example, the string msgstr[0] "%d день назад" is used when n is 1 ("1 день назад") and also when n is 21, 31, 41, etc.
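To show how such a formula selects a form, here is a minimal sketch in Python. It assumes the three-form rule for Russian as listed in the GNU gettext manual, rewritten by hand as a Python function. The names plural_index and format_days_ago and the list of translations are hypothetical and serve only this example.

```python
def plural_index(n: int) -> int:
    """Hand-written Python equivalent of the C expression
    n%10==1 && n%100!=11 ? 0
      : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
    (assumed here: the Russian rule from the GNU gettext manual)."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# Hypothetical msgstr[0..2] translations of msgid "%d day ago".
forms = ["%d день назад", "%d дня назад", "%d дней назад"]


def format_days_ago(n: int) -> str:
    """Pick the form by index and substitute the number into it."""
    return forms[plural_index(n)] % n


for n in (1, 2, 5, 11, 21):
    print(format_days_ago(n))
```

Note that 11 falls into the last form, not the first one: the n%100!=11 condition excludes it from index 0.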
Editing Strings in PO and MO Files

In PO and MO files, strings appear in the strings grid in the ENTRIES resource; obsolete strings appear in the OBSOLETE resource. The OBSOLETE strings are not stored in MO files and have the Read-only attribute by default. Comments and references are displayed in the respective columns of the grid. To view a reference, double-click the cell in the Reference column. Strings that have plural forms are displayed as substrings of the same string, separated from one another by the zero character \0x00. Such strings can be edited directly in the cells of the strings grid.
However, it is more convenient to edit them in the string editor, which opens when you press the F2 key or select the corresponding command on the popup menu. In the editor, the translation is entered in the String column of the Translation grid. The # column displays the index of the msgstr string, and the n column shows the value of the string selection parameter. Additional editor commands (copying a string, inserting characters, etc.) are available on the popup menu.
Just as for regular strings, inserting hot-key markers, preserving the first and last characters of the string, and automatic translation are supported for plural strings. During automatic translation, the program determines the index of the msgstr string that corresponds to the singular and inserts the translation of the msgid into it. Into the remaining strings, it inserts the automatic translation of the msgid_plural string. All forms of a string are stored in the translation memory as a single entry consisting of substrings separated by the zero character \0x00.

Plural Validation

The option for checking the number of plurals is available in the project validation settings, in the Strings/Text section. This option is always enabled and cannot be edited.
The validation checks each string against the number of plural forms specified in the target file settings, in the file properties.
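The check can be pictured with a small sketch. It assumes only what is described above, namely that the plural forms of one string are kept as substrings separated by the zero character. The function names and the sample data are hypothetical and do not reflect how Radialix implements the validation.

```python
SEPARATOR = "\x00"  # the zero character between plural forms


def split_forms(stored: str) -> list:
    """Split a stored plural string into its individual forms."""
    return stored.split(SEPARATOR)


def join_forms(forms: list) -> str:
    """Join individual forms back into one stored string."""
    return SEPARATOR.join(forms)


def has_expected_plural_count(stored: str, nplurals: int) -> bool:
    """True if the string holds exactly `nplurals` forms."""
    return len(split_forms(stored)) == nplurals


# Hypothetical translation with three forms, as for nplurals=3.
stored = join_forms(["%d день назад", "%d дня назад", "%d дней назад"])

print(has_expected_plural_count(stored, 3))
print(has_expected_plural_count(stored, 2))
```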