Alignment - FAQ

Frequently asked questions:

When should I use alignment? Always?

Not always! Use alignment only when necessary, and only once you have made sure that the quality meets your expectations.

If you create a Beebox project for the purpose of aligning a lot of files, you would of course enable alignment by default (no need to create instructions files). Please note that alignment may be less reliable for some language pairs, because corresponding phrases can differ considerably in length between the source language and the target language.

If you integrate the Beebox with a CMS, you should use instructions files so that the CMS tells the Beebox exactly which translated content needs to be aligned: the content that was actually edited or annotated by the user.

When and how does the Beebox align?

When the Beebox receives a source file it proceeds as follows:

1. The source file is segmented and pre-translated from the Beebox memories.

When you also include translated files and the instructions file, the Beebox adds these steps:

2. The translated files are aligned. For the alignment algorithm to work more precisely, a “training” file is created on the fly. The file includes all the pre-translations found in step 1 plus any alignment dictionaries that were optionally configured in the Beebox project.
3. The pre-translations of step 1 are now replaced by the aligned translations. If alignment is not possible for some pieces of the content, the pre-translations of step 1 are kept.

The next steps are the same whether you align or not:

4. The Beebox selects all unapproved translations (segments). These are sent to MT and/or a TMS. Approved translations are considered “final” and are not sent to MT or a TMS.
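
The flow above can be sketched in a few lines of Python. This is only an illustration of the order of operations, not the Beebox implementation: the `Segment` class, the dictionary-based memory and all function names are assumptions made for the example, and the alignment itself is represented by a ready-made lookup table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    source: str
    translation: Optional[str] = None
    approved: bool = False


def pretranslate(sources, memory):
    """Pre-translate each source segment from the memory.

    `memory` is a hypothetical dict: source text -> (translation, approved).
    """
    segments = []
    for src in sources:
        translation, approved = memory.get(src, (None, False))
        segments.append(Segment(src, translation, approved))
    return segments


def apply_alignment(segments, aligned):
    """Overlay aligned translations on the pre-translations.

    `aligned` is a hypothetical dict: source text -> aligned translation.
    Segments the aligner could not match keep their pre-translation.
    """
    result = []
    for seg in segments:
        if seg.source in aligned:
            # Simplification: aligned translations are treated as unapproved.
            # The exception for approved, identical pre-translations is
            # covered under the question on approval.
            result.append(Segment(seg.source, aligned[seg.source], approved=False))
        else:
            result.append(seg)
    return result


def select_for_translation(segments):
    """Only unapproved segments go to MT and/or the TMS."""
    return [seg for seg in segments if not seg.approved]


memory = {"Hello": ("Bonjour", True), "Goodbye": ("Au revoir", True)}
segments = pretranslate(["Hello", "Goodbye", "Thanks"], memory)
segments = apply_alignment(segments, {"Goodbye": "Adieu"})
to_send = [seg.source for seg in select_for_translation(segments)]
```

In this sketch “Hello” keeps its approved pre-translation and is not sent anywhere, while “Goodbye” (aligned) and “Thanks” (no translation at all) are selected.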

When does the Beebox approve aligned content?

How does the Beebox decide if translations are approved or not? There are two rules:

By default, aligned translations are flagged as “unapproved”. This is a design decision, since alignment may not be perfect and may require human approval in the TMS.
If an aligned translation (step 2) is identical to the pre-translation (step 1) and the pre-translation itself is approved, then the aligned translation is considered “approved” too.
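
Expressed as code, the two rules reduce to a single check. The function below is a minimal sketch: its name and parameters are hypothetical, and it assumes that “identical” means an exact string match, which may differ from the comparison the Beebox actually performs.

```python
def is_approved(aligned, pretranslation, pretranslation_approved):
    """Return True if an aligned translation counts as approved."""
    # Rule 2: identical to an approved pre-translation -> approved.
    if pretranslation is not None and pretranslation_approved:
        return aligned == pretranslation
    # Rule 1: everything else stays unapproved by default.
    return False


same_as_approved = is_approved("Bonjour", "Bonjour", True)
same_as_unapproved = is_approved("Bonjour", "Bonjour", False)
differs = is_approved("Salut", "Bonjour", True)
no_pretranslation = is_approved("Bonjour", None, False)
```

Only the first call yields an approved segment; the other three cases fall back to the default and need human review.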

If these rules sound too technical, the following two examples may add some clarity:

If you send a source file with its translated file and there is no memory yet in the Beebox, the translated segments come exclusively from the translated file (step 1 yields no pre-translations).
All segments are “unapproved” and a human will have to approve or fix the results.

You have already translated a source file in the past. You now receive a new version of the source file plus its translations. At this point, either the source file or the translated file, or both, may have been edited.
The Beebox will align the files and identify all changes, both in the source and in the translations (with respect to the previously done translations). All changes are flagged as “unapproved”. All content that did not change is flagged as “approved”.
The final result: Only the changed segments will be sent to the TMS.
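
Both examples can be replayed with a small amount of data. The sketch below is illustrative only: the function name, the sample sentences and the dictionary standing in for the approved memory are all invented for the example.

```python
def segments_for_tms(memory, new_pairs):
    """Return the source segments that would need review in the TMS.

    memory:    hypothetical dict of approved pairs from an earlier run,
               source text -> translation
    new_pairs: list of (source, aligned translation) from the new files
    """
    to_tms = []
    for source, translation in new_pairs:
        unchanged = source in memory and memory[source] == translation
        if not unchanged:
            # Either the source changed (no memory hit) or the
            # translation differs from the approved one.
            to_tms.append(source)
    return to_tms


memory = {
    "Open the file.": "Ouvrez le fichier.",
    "Save your work.": "Enregistrez votre travail.",
}
new_pairs = [
    ("Open the file.", "Ouvrez le fichier."),           # untouched
    ("Save your work.", "Sauvegardez votre travail."),  # translation edited
    ("Close the window.", "Fermez la fenêtre."),        # source edited
]

# First example: empty memory, so every segment needs review.
first_run = segments_for_tms({}, new_pairs)
# Second example: only the changed segments need review.
second_run = segments_for_tms(memory, new_pairs)
```

With an empty memory all three segments are returned; with the memory in place only “Save your work.” and “Close the window.” are.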

What technology do you use?

The Beebox integrates the “hunalign” open source library for alignment purposes. The library home page is: http://mokk.bme.hu/resources/hunalign/

We improve alignment quality with mechanisms that intelligently incorporate translation memory technology and the pre-translations available for each file to be aligned.
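
For readers who want to experiment with hunalign directly, the sketch below shows one way pre-translations and a project dictionary could be turned into a hunalign dictionary file and a command line. How the Beebox actually builds its “training” file and calls the library is not documented here, so the helper names, file names and the idea of passing segment pairs as dictionary items are assumptions. The `target @ source` item format and the `-text` and `-utf` switches come from the hunalign documentation.

```python
def build_dictionary_lines(pairs):
    """Format (source, target) pairs as hunalign dictionary items.

    hunalign expects one item per line in the form 'target @ source',
    that is, with the target-language phrase first.
    """
    lines = []
    for source, target in pairs:
        # Collapse whitespace so that each item stays on one line.
        source = " ".join(source.split())
        target = " ".join(target.split())
        if source and target:
            lines.append(f"{target} @ {source}")
    return lines


def hunalign_command(dictionary_path, source_path, target_path, binary="hunalign"):
    """Build the argument list for a text-mode, UTF-8 hunalign run."""
    return [binary, "-text", "-utf", dictionary_path, source_path, target_path]


# Hypothetical inputs: one pre-translated segment and one dictionary entry.
pretranslations = [("Open the file.", "Ouvrez le fichier.")]
project_dictionary = [("file", "fichier")]

lines = build_dictionary_lines(pretranslations + project_dictionary)
command = hunalign_command("training.dic", "source.txt", "target.txt")
```

Writing `lines` to `training.dic` and passing `command` to `subprocess.run` would perform the alignment, provided hunalign is installed and the two text files contain one sentence per line.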