File Engineering

File Engineering is a rather exotic term for preparing files for translation and client delivery. You may receive files in formats that translators will baulk at, including XML, HTML, PHP, and various data files with no internal structure. If you need a way to deliver text to translators and then recreate the original document in the target language, I can probably help.

I have a lot of experience handling these kinds of file and preparing their text for translation.

EXAMPLE 1 – The file below is a block of PHP variables. The text for translation is in the second pair of single quotation marks after the equals sign. Because the file is plain text with no structural mark-up, grabbing and isolating that text is tricky. I use a combination of regular expressions to do this and then provide the translator with an Excel file.

Sample PHP variable file
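
As a rough illustration of the regular-expression step, here is a minimal Python sketch. It is not the set of expressions used on the actual job. It assumes each entry looks like $lang['key'] = 'text'; (a hypothetical layout, as are the sample strings and the names ENTRY, extract_strings and to_csv), and it writes CSV, which Excel can open, rather than a native Excel file.

```python
import csv
import io
import re

# Hypothetical sample; the layout of a real file may differ.
SAMPLE_PHP = r"""<?php
$lang['search_title'] = 'Search in progress';
$lang['wait_message'] = 'Please wait, it\'s nearly done';
"""

# Group 1: the key inside the first pair of single quotes.
# Group 2: the translatable text inside the second pair, after the equals sign.
# (?:[^'\\]|\\.)* matches a single-quoted string body, including escaped quotes.
ENTRY = re.compile(
    r"""^\s*\$\w+\['((?:[^'\\]|\\.)*)'\]\s*=\s*'((?:[^'\\]|\\.)*)'\s*;""",
    re.MULTILINE,
)


def extract_strings(php_source):
    """Return (key, text) pairs with PHP single-quote escapes undone."""
    pairs = []
    for match in ENTRY.finditer(php_source):
        key, text = match.group(1), match.group(2)
        # In single-quoted PHP strings only \' and \\ are escape sequences.
        text = re.sub(r"\\(['\\])", r"\1", text)
        pairs.append((key, text))
    return pairs


def to_csv(pairs):
    """Build a three-column sheet: key, source text, empty target column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["key", "source", "target"])
    for key, text in pairs:
        writer.writerow([key, text, ""])
    return buffer.getvalue()


pairs = extract_strings(SAMPLE_PHP)
print(to_csv(pairs))
```

Keeping the key in its own column is a design choice that makes it possible to write the translated text back into the right variable afterwards.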

 

EXAMPLE 2 – The sample below is from a ‘header’ file for the programming language C++. To display properly in the application, any non-Latin characters had to be converted to a special hexadecimal format.

The original Russian translation looked like this: Идет поиск – пожалуйста, подождите…

This first had to be converted to hexadecimal notation:
1104180434043504420020043F043E04380441043A002020130020043F
043E04360430043B04430439044104420430002C0020043F043E043404
3E04360434043804420435002E002E002E00
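
The hex above can be reproduced with a short Python sketch, offered as an illustration and not as the tool originally used. It rests on two observations from the listing itself: each character appears as a four-digit UTF-16 code unit with the high byte first, and the final ellipsis is encoded as three full stops (2E three times). The leading 11 and trailing 00 are not explained in the text, so the sketch simply copies them as opaque framing; the names to_hex_code_units, message and framed are illustrative.

```python
def to_hex_code_units(text):
    """Hex-encode text as big-endian UTF-16 code units (uppercase digits)."""
    return text.encode("utf-16-be").hex().upper()


# The message with three full stops, matching what the hex listing encodes.
message = "Идет поиск – пожалуйста, подождите..."
body = to_hex_code_units(message)

# The three lines of the listing, joined into one string.
listed = (
    "1104180434043504420020043F043E04380441043A002020130020043F"
    "043E04360430043B04430439044104420430002C0020043F043E043404"
    "3E04360434043804420435002E002E002E00"
)

# Assumption: 11 and 00 are framing bytes whose meaning is not given here.
framed = "11" + body + "00"
print(framed)
print(framed == listed)
```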

The next step was to convert that string into a format that the C++ language would understand. This was done by writing a Word macro to convert the plain hex above into the following:

\x11\x04\x18\x04\x34\x04\x35\x04\x42\x00\x20\x04\x3F\x04\x3E
\x04\x38\x04\x41\x04\x3A\x00\x20\x20\x13\x00\x20\x04\x3F
\x04\x3E\x04\x36\x04\x30\x04\x3B\x04\x43\x04\x39\x04\x41\x04\x42
\x04\x30\x00\x2C\x00\x20\x04\x3F\x04\x3E\x04\x34\x04\x3E\x04
\x36\x04\x34\x04\x38\x04\x42\x04\x35\x00\x2E\x00\x2E\x00\x2E\x00
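
The same conversion can be sketched in Python instead of a Word macro; this is a stand-in for illustration, not the macro itself. The function name hex_to_c_escapes and the bytes_per_line parameter are hypothetical, and the line width is an arbitrary choice that does not match the wrapping shown above.

```python
def hex_to_c_escapes(hex_string, bytes_per_line=15):
    """Turn plain hex digits into \\xNN escapes, wrapped into lines."""
    # Drop line breaks and spaces so a wrapped listing can be pasted in.
    digits = "".join(hex_string.split())
    # bytes.fromhex raises ValueError for odd lengths or non-hex characters.
    data = bytes.fromhex(digits)
    escapes = ["\\x{:02X}".format(byte) for byte in data]
    lines = [
        "".join(escapes[i:i + bytes_per_line])
        for i in range(0, len(escapes), bytes_per_line)
    ]
    return "\n".join(lines)


listed_hex = """
1104180434043504420020043F043E04380441043A002020130020043F
043E04360430043B04430439044104420430002C0020043F043E043404
3E04360434043804420435002E002E002E00
"""

escaped = hex_to_c_escapes(listed_hex)
print(escaped)
```

Because every byte is written as its own escape, no literal hex digit ever follows a \x sequence, so the compiler cannot read an escape as longer than two digits.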

 

© Steve Sutcliffe 2012.