
Getting started with the Terminology Extraction module

Learn how to quickly get started with Wordbee Terminology Extraction. Term extraction is a dedicated module that helps your teams prepare terminology for any project at any time. It features a term extraction tool that detects possible terms in documents and presents them in term collections. The module is integrated into the Wordbee Translator platform, so you can preview and filter the extraction results and create termbases from the related terms directly in the platform. It also offers export and import capabilities for standard data exchange and for teams that want to continue their terminology work offline: using TBX files or Excel templates, term collections can be prepared offline and imported back into the system to ultimately create terminology databases. For more details, see the main features and benefits of Wordbee Term Extraction.

How to access the Terminology Extraction module

  • Go to Settings and select Licenses & Usage from the drop-down menu to check whether the Terminology Extraction module is activated.

  • If the Terminology Extraction module is activated, go to Resources and click on Terminology Extraction to create your first term collection.

How to run a term extraction

Follow the steps below to extract terminology from your working files:

  1. Go to the 2. Documents tab of any project and prepare your file(s) for online translation.

  2. Then, select the file or files you want to extract terms from and click Tools > Extract terminology from files.

  3. Review all details before launching the term extraction request. Make sure the languages and the reference are right so you can easily find the results later in the system. Click Submit request when you are ready.

  4. The request is now being processed and a term collection is added to the system to present the results of the extraction*. Depending on the size of the files, the results should be available within minutes.

    *Find all your extraction requests in the Term extraction section of the platform (Go to Resources > Term extraction).

  5. Check the status of your request in the list of term collections. Once it is updated to “Ready”, you will be able to see the terms detected in your files. Click “select” to see more details for any collection.

Every time you submit documents for term extraction, you will get a reminder of the features and credits included in your subscription plan.

This is what a sample term collection created from an extraction request looks like

When to use term extraction

As a project manager, you can use term extraction to get term candidates detected in your working files and build a list of what you consider should be the terminology for your project. The whole process can be customized to your needs, so follow the steps below and adapt them where required.

You can benefit from bilingual term candidates if:

  • the files you want to work on show some pre-translations in the Wordbee Editor, and

  • both languages are supported by the extractor. See the full list of languages supported by the extractor here.

Work cycle with the Terminology Extraction module

1. Run the extraction

Run the extraction as explained in the “How to run a term extraction” section above.

2. Narrow down the results

Once the results are ready in your collection, narrow down the list of candidates by using the filters.

  • Use the preview and the statistics to get a rough idea of where to define your threshold and decide what you will focus on.

  • Remove filtered terms if you need to get rid of some of the raw data.
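If you prefer to explore a threshold offline before filtering in the platform, the idea can be sketched in a few lines of Python. This is only an illustration: the rows, the field names `term` and `frequency`, and the helper functions are hypothetical and do not reflect the actual statistics or column names used by Wordbee.

```python
from collections import Counter

# Hypothetical candidate rows. The field names "term" and "frequency"
# are assumptions for illustration, not actual Wordbee column names.
candidates = [
    {"term": "term collection", "frequency": 12},
    {"term": "termbase", "frequency": 7},
    {"term": "click", "frequency": 2},
    {"term": "file", "frequency": 1},
]


def frequency_histogram(rows):
    """Count how many candidates occur at each frequency value."""
    return Counter(row["frequency"] for row in rows)


def keep_at_or_above(rows, threshold):
    """Return only the candidates whose frequency meets the threshold."""
    return [row for row in rows if row["frequency"] >= threshold]


# Look at the distribution first, then pick a threshold.
print(sorted(frequency_histogram(candidates).items()))
kept = keep_at_or_above(candidates, threshold=5)
print([row["term"] for row in kept])
```

Looking at the distribution first mirrors the advice above: use the statistics to decide where the cut-off belongs, then remove what falls below it.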

3. Export candidates

Prepare and validate the terminology offline. Export* your collection to Excel for further analysis and clean-up. Validate the results of your work with your stakeholders.
*If you export the collection in TBX, you can import the clean results as a terminology database directly into the related section of the system. See the Learn more section below for more details.

4. Import final terms

Work offline and import your work into the collection as you go to keep it central. The terms you keep in your online collection will be the ones used for creating a database once you decide the work for each language pair is done. Import your Excel as many times as you need to get to the list of terms that are worth documenting.

While doing the clean-up, you may want to document several terms that are related to each other, especially if they refer to the same concept. Use Excel as your working template. For example, the lemma column originally shows the base form of a term. You can adjust your Excel file and use that column as a central element to organize your terminology work.

Tip: use the exported Excel as your working template

Linking related terms makes it possible to index them under the same concept later, when creating a database in Wordbee.

  • Repurpose the lemma column in Excel to link terms together so later you can create concept-oriented terminology databases.

  • Delete or add rows in the Excel file and import it to reduce or extend the term collection.
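The two tips above can also be scripted if the clean-up involves many rows. The following is a minimal sketch, assuming the export has been saved as CSV with `term` and `lemma` columns; those column names, the sample data, and the `clean_rows` helper are hypothetical and only illustrate the idea of relinking lemmas and dropping unwanted rows.

```python
import csv
import io

# Hypothetical export saved as CSV. The column names are assumptions,
# not the actual layout of the Wordbee Excel template.
exported = """term,lemma
termbases,termbase
terminology database,terminology database
click,click
"""

# Terms to link under a shared lemma (same concept), and terms to drop.
relink = {"terminology database": "termbase"}
drop = {"click"}


def clean_rows(text, relink, drop):
    """Drop unwanted terms and point related terms at a shared lemma."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if row["term"] in drop:
            continue  # deleting a row shrinks the collection on import
        row["lemma"] = relink.get(row["term"], row["lemma"])
        rows.append(row)
    return rows


cleaned = clean_rows(exported, relink, drop)
for row in cleaned:
    print(row["term"], "->", row["lemma"])
```

Here "termbases" and "terminology database" end up sharing the lemma "termbase", which is what allows them to be grouped under one concept later.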

Excel file ready for import

5. Create termbase*

Focus on one of the languages of your collection and create a new database with your terminology work. This operation can be done:

  • for the source language on its own

  • for each target language.

As a result, you can create monolingual or bilingual terminology databases with your terms and their contextual examples for later reference. These are stored in standard TBX fields.

6. Attach it to your project(s)

Add the new termbase to your project(s) so it can be referenced during the translation process. Activate the QA flag in the project if you want to flag inconsistencies when QA checks are done.

*By using the option “Aggregate terms with identical lemma” when creating a termbase, terms that have the same lemma can be mapped into the same concept. These can be further updated online in the Terminology Management tool.
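To make the effect of aggregation concrete, here is a small conceptual sketch of grouping terms that share a lemma into one concept. It is not Wordbee's implementation: the row layout, the field names, and the `aggregate_by_lemma` function are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical cleaned rows. Field names are assumptions for illustration.
rows = [
    {"term": "termbase", "lemma": "termbase"},
    {"term": "termbases", "lemma": "termbase"},
    {"term": "terminology database", "lemma": "termbase"},
    {"term": "term collection", "lemma": "term collection"},
]


def aggregate_by_lemma(rows):
    """Group terms with an identical lemma under one concept key."""
    concepts = defaultdict(list)
    for row in rows:
        concepts[row["lemma"]].append(row["term"])
    return dict(concepts)


concepts = aggregate_by_lemma(rows)
for lemma, terms in concepts.items():
    print(lemma, "->", terms)
```

In this sketch, four rows collapse into two concepts, one holding three terms and one holding a single term.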

Sample termbase created with the Excel above
