XML Localization Interchange File Format (XLIFF), Translation Memory eXchange (TMX), and Term Base eXchange (TBX). Do any of these sound familiar to you? These are file formats widely used in the translation and localization industry.

The types of files that commonly use these formats are:

  • Files prepared for translation/translated files
  • Translation memories
  • Terminology databases

Because they are written in XML, these formats can be used across multiple translation software programs and tools without any file conversion. Let’s say you have a Trados translation memory (TM) that you need to use in Catalyst. You can export your TM from Trados as an XML file (in this case, a TMX file) and then import that same file into Catalyst. Both software programs will be able to read, write, and use the information in the exported file.
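To make the export/import idea concrete, here is a minimal sketch in Python of one program writing a TMX document and another reading it back. It is not the actual Trados or Catalyst behaviour: the function names `export_tmx` and `import_tmx` and the header values are hypothetical placeholders, and a real TMX file usually carries far more metadata.

```python
import xml.etree.ElementTree as ET

# ElementTree's internal name for the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def export_tmx(pairs, src_lang, tgt_lang):
    """Build a small TMX 1.4 document from (source, target) text pairs."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values below are illustrative placeholders, not real tool data.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-exporter",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "example",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def import_tmx(tmx_text):
    """Read a TMX document back into a list of {language: segment} dicts."""
    root = ET.fromstring(tmx_text)
    memory = []
    for tu in root.iterfind("./body/tu"):
        memory.append({tuv.get(XML_LANG): tuv.findtext("seg")
                       for tuv in tu.iterfind("tuv")})
    return memory


tmx_text = export_tmx([("Hello", "Bonjour")], "en", "fr")
memory = import_tmx(tmx_text)
```

The point of the sketch is that the second function needs no knowledge of the first: it only relies on the shared tag and attribute names.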

How Does XML Work in Translation and Localization?

Software programs read XML files in the same way one would methodically look for information in folders: open the first folder, read what’s inside, close the folder and open the next folder.

Folder Organisation

We could compare the tags (in red below) to folder names. When a software program meets a tag such as <xliff …>, it opens the “xliff” folder. It closes the folder when it meets the </xliff> closing tag. Meanwhile, it has read all that’s between these two tags.
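The folder analogy can be traced with a short Python sketch that logs each opening and closing tag as the parser meets it. The sample document is a stripped-down, hypothetical XLIFF-like fragment (no namespace, minimal attributes), and `trace_folders` is a name chosen for illustration only.

```python
import io
import xml.etree.ElementTree as ET

# A deliberately tiny, simplified sample; real XLIFF files declare a namespace.
SAMPLE = b"""<xliff version="1.2">
  <file original="hello.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def trace_folders(xml_bytes):
    """Return 'open <tag>' / 'close <tag>' steps in the order the parser sees them."""
    steps = []
    for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
        action = "open" if event == "start" else "close"
        steps.append(f"{action} {elem.tag}")
    return steps


steps = trace_folders(SAMPLE)
for step in steps:
    print(step)
```

The first step printed is "open xliff" and the last is "close xliff", with every nested element opened and closed in between, just like folders inside folders.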

XLIFF file preview

An XLIFF file contains many translation units, each of which pairs a source segment with its target equivalent. Extra information is added through the attributes (in green). That’s how you know the name, the approved status of the translation unit, or the language of each element.
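As an illustration of how a program might read those attributes, the following Python sketch parses a small hand-written fragment in the style of XLIFF 1.2. The sample content and the helper name `read_units` are assumptions made for this example; real files typically contain more elements and metadata.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of XLIFF 1.2 (illustrative content only).
XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="ui.properties" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1" resname="greeting" approved="yes">
        <source xml:lang="en">Hello</source>
        <target xml:lang="fr">Bonjour</target>
      </trans-unit>
      <trans-unit id="2" resname="farewell" approved="no">
        <source xml:lang="en">Goodbye</source>
        <target xml:lang="fr">Au revoir</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_units(xliff_text):
    """Collect name, approval status, and language-tagged text for each unit."""
    root = ET.fromstring(xliff_text)
    units = []
    for tu in root.iterfind(".//x:trans-unit", NS):
        source = tu.find("x:source", NS)
        target = tu.find("x:target", NS)
        units.append({
            "name": tu.get("resname"),                 # name of the unit
            "approved": tu.get("approved") == "yes",   # approval status
            "source": (source.get(XML_LANG), source.text),
            "target": (target.get(XML_LANG), target.text),
        })
    return units


units = read_units(XLIFF)
```

Each entry in `units` now holds the unit's name, whether it is approved, and the language and text of its source and target.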

More about XLIFF, TMX and TBX

These three file formats are fairly standard. There are specific tags for each of them.
Here is some information specific to each:

  • XLIFF (translation file): matching translation memory segments, match quality, translator contact details…
  • TMX (translation memory): information about the usage of the term, about how to translate a specific proper name, formatting information…
  • TBX (terminology base): date and author of the term translation entry, everything about the term (entry, definition, field, context…)
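To give a feel for the terminology side, here is a hedged Python sketch that reads a simplified fragment modeled on TBX element names (`termEntry`, `langSet`, `tig`, `term`, `descrip`). The fragment omits the header and wrapper elements a complete TBX file would have, and both the sample data and the `read_terms` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified fragment: a complete TBX file also has a header and outer wrappers.
TBX_FRAGMENT = """<body>
  <termEntry id="c1">
    <descrip type="subjectField">localization</descrip>
    <langSet xml:lang="en">
      <descrip type="definition">A database of previously translated segments.</descrip>
      <tig>
        <term>translation memory</term>
        <descrip type="context">Export the translation memory before migrating.</descrip>
      </tig>
    </langSet>
    <langSet xml:lang="fr">
      <tig>
        <term>mémoire de traduction</term>
      </tig>
    </langSet>
  </termEntry>
</body>"""


def read_terms(tbx_text):
    """Return each entry's id, subject field, and terms grouped by language."""
    root = ET.fromstring(tbx_text)
    entries = []
    for entry in root.iter("termEntry"):
        # Entry-level descrip: the subject field applies to all languages.
        subject = entry.findtext("descrip[@type='subjectField']")
        terms = {}
        for lang_set in entry.iterfind("langSet"):
            terms[lang_set.get(XML_LANG)] = [t.text for t in lang_set.iter("term")]
        entries.append({"id": entry.get("id"), "field": subject, "terms": terms})
    return entries


entries = read_terms(TBX_FRAGMENT)
```

One concept entry groups the equivalent terms for every language, which is why a terminology base is organised by concept rather than by segment.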

You now know a little bit more about these XML formats and the purpose of file conversion in translation and localization processes. You are able to better understand the structure of their files, and how their versatile nature is key to enabling rapid and efficient localization.

Source: XML in Localisation: A practical analysis