When setting up a file format configuration for XLIFF files, you can choose from many options to ensure that extraction is successful. This page explains the configuration options for XLIFF files in Wordbee Translator.
Wordbee supports many of the XLIFF 2.0 features. Read all the details on the dedicated page with XLIFF 2 information.
To view and edit the XLIFF filter options, go to Translation Settings > Document Formats and select XLIFF files. The XLIFF files - Configuration window opens. Here you can configure the following:
The General tab contains options for extracting content, defining the file as HTML, handling whitespaces and symbols, excluding content, and text segmentation. The options are described below:
- Content - Extract XLIFF existing translations (if any) and set segment status to 'Translated' in the translated XLIFF file.
- Comments - Extract XLIFF notes on segment level to Wordbee comments and write new comments added during translation work in the translated file.
- HTML Content - Inform the system that the content is HTML, set up a configuration for HTML extraction, and split text at HTML break tags.
- Whitespaces and Symbols - Do not show leading and trailing whitespaces, do not show preceding and trailing markup, do not translate texts containing neither letters nor digits, and always preserve whitespaces by default.
- Text Segmentation - Split segments at XLIFF segmentation boundaries, enable SRX rules for text segmentation, and select "Always split text at line breaks".
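To make the Content and Comments options more concrete, the following sketch reads existing translations, their state, and segment-level notes from a small XLIFF 1.2 fragment. It is an illustration only, not Wordbee's extraction logic: the sample file, the `read_units` helper, and the dictionary layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written XLIFF 1.2 sample, for illustration only.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello</source>
        <target state="translated">Bonjour</target>
        <note>Greeting on the home page</note>
      </trans-unit>
      <trans-unit id="2">
        <source>Goodbye</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def read_units(xliff_text):
    """Hypothetical helper: collect source, target, state and notes per unit."""
    root = ET.fromstring(xliff_text)
    units = []
    for tu in root.iterfind(".//x:trans-unit", NS):
        target = tu.find("x:target", NS)
        units.append({
            "id": tu.get("id"),
            "source": tu.findtext("x:source", default="", namespaces=NS),
            # An existing translation is optional in XLIFF.
            "target": target.text if target is not None else None,
            "state": target.get("state") if target is not None else None,
            # Segment-level notes, which the Comments option maps to comments.
            "notes": [n.text for n in tu.findall("x:note", NS)],
        })
    return units


units = read_units(SAMPLE)
for unit in units:
    print(unit)
```

The first unit carries an existing translation and a note, while the second has neither, which is the situation the "(if any)" wording in the Content option covers.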
When enabling the option "Always split text at line breaks" in the XLIFF configuration, consider the following scenarios:
Scenario 1. If there is no HTML content in the XLIFF file:
No line breaks are produced when the Content is HTML option is enabled, because the HTML parser removes the whitespace that XLIFF treats as line breaks from the segments.
Scenario 2. If there is HTML content in the XLIFF file:
Line breaks are produced when the Content is HTML option is enabled, provided that the HTML content contains HTML breaking tags (e.g. <p>, <div>). The HTML parser removes whitespace from the segments unless the HTML breaking tags are included in both the HTML content of the XLIFF file and the HTML configuration attached to the XLIFF configuration. See XLIFF 2 Information.
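The two scenarios can be illustrated with a small sketch built on Python's standard `html.parser` module. It is not Wordbee's parser: the `SegmentSplitter` class, the `split_segments` helper, and the `BREAK_TAGS` set are hypothetical names that only mimic the described behavior, namely that whitespace (including newlines) is collapsed and text is split only at breaking tags listed in the configuration.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the breaking tags listed in the HTML
# configuration attached to the XLIFF configuration.
BREAK_TAGS = {"p", "div", "br"}


class SegmentSplitter(HTMLParser):
    """Collects text and starts a new segment at each configured break tag."""

    def __init__(self, break_tags):
        super().__init__()
        self.break_tags = break_tags
        self.segments = []
        self._buffer = []

    def flush(self):
        # Collapse every whitespace run, newlines included, to one space.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.break_tags:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.break_tags:
            self.flush()

    def handle_data(self, data):
        self._buffer.append(data)


def split_segments(content, break_tags=BREAK_TAGS):
    parser = SegmentSplitter(break_tags)
    parser.feed(content)
    parser.close()   # hands any pending text to handle_data
    parser.flush()   # emit the final segment
    return parser.segments


# Scenario 1: no HTML in the content, so the newline is collapsed away.
print(split_segments("First line\nSecond line"))

# Scenario 2: breaking tags present in content and configuration.
print(split_segments("<p>First line</p>\n<p>Second line</p>"))

# Breaking tags present in the content but missing from the configuration.
print(split_segments("<p>First line</p>\n<p>Second line</p>", break_tags=set()))
```

In this sketch only the second call yields two segments. The first and third calls return a single segment, because nothing tells the parser where to split once the whitespace has been collapsed.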
Do not translate settings
Exclude Content - Configure which content is translated or left untranslated, based on texts or regular expression patterns that the system looks for.
- Words or terms
- Attributes and comments
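As a rough illustration of pattern-based exclusion, the sketch below marks a text as non-translatable when it contains no letters or digits (the rule mentioned under Whitespaces and Symbols) or when it matches an exclusion pattern. The patterns, the `is_translatable` helper, and the use of Python's regular expression syntax are all assumptions for this example; the syntax accepted by Wordbee may differ.

```python
import re

# Hypothetical exclusion patterns: template placeholders and product codes.
EXCLUDE_PATTERNS = [
    re.compile(r"^\{\{\s*\w+\s*\}\}$"),   # e.g. {{ user_name }}
    re.compile(r"^[A-Z]{2,5}-\d+$"),      # e.g. SKU-1234
]


def has_letters_or_digits(text):
    """True if at least one character is a letter or a digit."""
    return any(ch.isalnum() for ch in text)


def is_translatable(text, patterns=EXCLUDE_PATTERNS):
    """Hypothetical check combining both exclusion rules."""
    stripped = text.strip()
    if not has_letters_or_digits(stripped):
        return False
    return not any(p.search(stripped) for p in patterns)


for sample in ["Save changes", "***", "{{ user_name }}", "SKU-1234"]:
    print(repr(sample), is_translatable(sample))
```

Anchoring the patterns with `^` and `$` keeps a sentence that merely mentions a product code translatable. Whether partial matches should exclude a whole segment is a design choice to verify with the test function of the file format configuration.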
SDL XLIFF settings
The SDL XLIFF tab may be used to load advanced properties when XLIFF files have been produced by other CAT tools.
- Extract Origin of Translations - The SDL 'origin' attribute specifies the origin of the translation: 'tm' for translation memory, 'mt' for machine translation, etc. The SDL 'percent' attribute tells whether a pretranslation is exact, fuzzy or perfect. These fields are mapped to the respective fields in Wordbee Translator, and the Wordbee word count then takes these values into account.
Pass over restrictions on the size of the segments to highlight issues when performing quality assurance checks.
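To show what the 'origin' and 'percent' attributes look like in practice, the sketch below reads them from a simplified, hand-written SDL XLIFF fragment. Several details are assumptions rather than confirmed behavior: the namespace URI, the reduced file structure (real SDL XLIFF files contain more markup), the `read_origins` helper, and the labeling rule, which treats 100 as exact and anything lower as fuzzy. How Wordbee identifies a perfect match is not modeled here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; verify against a real SDL XLIFF file.
NS = {
    "x": "urn:oasis:names:tc:xliff:document:1.2",
    "sdl": "http://sdl.com/FileTypes/SdlXliff/1.0",
}

# Simplified, hand-written sample for illustration only.
SAMPLE = """<xliff version="1.2"
       xmlns="urn:oasis:names:tc:xliff:document:1.2"
       xmlns:sdl="http://sdl.com/FileTypes/SdlXliff/1.0">
  <file original="demo.txt" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="tu1">
        <source>Hello</source>
        <target>Bonjour</target>
        <sdl:seg-defs>
          <sdl:seg id="1" origin="tm" percent="100"/>
          <sdl:seg id="2" origin="tm" percent="85"/>
          <sdl:seg id="3" origin="mt"/>
        </sdl:seg-defs>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_origins(xliff_text):
    """Hypothetical helper: list origin, percent and a match label per segment."""
    root = ET.fromstring(xliff_text)
    result = []
    for seg in root.iterfind(".//sdl:seg", NS):
        raw = seg.get("percent")
        percent = int(raw) if raw is not None else None
        if percent is None:
            label = None            # no match score recorded
        elif percent == 100:
            label = "exact"         # assumed rule for this sketch
        else:
            label = "fuzzy"
        result.append({
            "id": seg.get("id"),
            "origin": seg.get("origin"),
            "percent": percent,
            "match": label,
        })
    return result


origins = read_origins(SAMPLE)
for entry in origins:
    print(entry)
```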
Web preview settings
Wordbee supports web preview with XSLT stylesheets, which convert XML files into HTML for easier, customizable previewing in the web browser. The stylesheet must convert the XML file to HTML (find more details on the XSLT help page). You can keep a library of all your stylesheets in the platform by uploading them to a specific folder in My Company > Documents.
You can find more help on how to set this up on the parser page itself.
View our XLIFF file format Questions and Answers section to learn how to perform common file format customizations. These examples cover the questions our support team answers most frequently.
To learn more about working with file format configurations, see the following pages:
- Viewing file format configurations
- Modifying file format configurations
- Creating file format configurations
- Test and validate file format configurations