
Word Counting

What is word counting?

Word counting is the most common way to quantify the work involved in translation and localization projects. The word is the standard unit of measure and the basis for financial and planning calculations; other units of measure are characters, lines and pages. When you create a project in a translation environment system, import a document for translation and attach a translation memory, the system automatically parses and analyses the text and calculates the total number of words in the source document.

Additionally, translation tools apply a so-called weighted word count, which means that they also count the repetitions in the source document and the different types of matches (leverage) against the translation memories and terminology databases that are available in the project.
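To make the idea of a weighted word count concrete, here is a minimal sketch in Python. The match bands and the percentages in WEIGHT_PERCENT are illustrative assumptions, not Wordbee's actual categories or rates; in practice the bands come from the analysis and the weights are agreed per project or supplier.

```python
# Hypothetical match bands and weights (in percent of a full word).
# These values are assumptions for illustration only.
WEIGHT_PERCENT = {
    "new": 100,        # no match in the translation memory
    "fuzzy": 60,       # partial TM match
    "exact": 25,       # exact TM match
    "repetition": 10,  # segment repeated inside the source document
}

def weighted_word_count(counts, weights=WEIGHT_PERCENT):
    """Return the word count after applying a weight to each match band."""
    return sum(words * weights[band] for band, words in counts.items()) / 100

# Hypothetical analysis result: raw source words per match band.
analysis = {"new": 1000, "fuzzy": 400, "exact": 300, "repetition": 200}

raw_total = sum(analysis.values())
weighted_total = weighted_word_count(analysis)
print(raw_total, weighted_total)
```

With these assumed figures, 1,900 raw words are reduced to 1,335 weighted words, which is the number a quote or a deadline estimate would typically be based on.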

Word counting algorithms in Wordbee

Here is an example of a word count analysis in a standard project:

In Wordbee you can choose between two types of counting algorithms:

  1. A default counting algorithm that counts the source words similarly to other translation tools.

  2. A counting algorithm compatible with Microsoft Word. Counts produced by the MS Word-compatible algorithm are generally lower.

Even for the same file with the same translation memory attached, word counts and TM match values may differ from tool to tool, because there is no standard definition of what a “word” is. For example, Wordbee can count a date as one word, whereas other tools may count it as three. The TM match values are also influenced by several factors, such as the project and file filter settings, the segmentation rules and the TM fuzzy matching technology. This has implications for the financial calculations as well: if the project manager and the supplier use different translation tools, their price calculations may differ slightly.
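The date example can be reproduced with two deliberately simple tokenizers. Neither is the algorithm used by Wordbee or by any other tool; both functions are hypothetical and only show how the definition of a “word” changes the total.

```python
import re

def count_space_separated(text):
    """Count every whitespace-delimited chunk as one word."""
    return len(text.split())

def count_alphanumeric_runs(text):
    """Count every run of letters or digits as one word."""
    return len(re.findall(r"\w+", text))

date = "12/05/2023"
sentence = "Delivered on 12/05/2023."

print(count_space_separated(date), count_alphanumeric_runs(date))
print(count_space_separated(sentence), count_alphanumeric_runs(sentence))
```

The first rule sees the date as one word and the sentence as three; the second sees the date as three words and the sentence as five. Applied to a long document, differences of this kind add up.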

What is a “word”?

A word is generally defined as a unit of language consisting of one or more spoken sounds or their written representation. Words are composed of one or more morphemes and, in writing, are usually separated by spaces or punctuation characters. In computer-aided translation, punctuation characters, tags, symbols and numbers may be counted as words as well. Check the Advanced count options in Word Count Settings to see which elements Wordbee can count as a “word”.
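As a rough illustration of how such options affect a count, the sketch below uses two switches. The parameter names count_numbers and count_punctuation are hypothetical and do not correspond to actual Wordbee settings; tags and other symbols are left out to keep the example short.

```python
import re

# One alternative per token type: digits, letters, single punctuation mark.
TOKEN = re.compile(r"(?P<number>\d+)|(?P<word>[^\W\d_]+)|(?P<punct>[^\w\s])")

def count_words(text, count_numbers=True, count_punctuation=False):
    """Count tokens, optionally including numbers and punctuation marks."""
    counted = {
        "word": True,
        "number": count_numbers,
        "punct": count_punctuation,
    }
    return sum(1 for match in TOKEN.finditer(text) if counted[match.lastgroup])

sample = "Version 2 ships in 3 days!"
print(count_words(sample))                          # letters and numbers
print(count_words(sample, count_numbers=False))     # letters only
print(count_words(sample, count_punctuation=True))  # everything
```

For this sample the three calls give 6, 4 and 7 respectively, so the same sentence has three defensible word counts depending on the options chosen.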

What is a “segment”?

A segment in computer-assisted translation is a source-text sentence or sentence-like unit (e.g. a heading, a title or an item in a list). Each tool has segmentation rules that define the boundaries of a segment. For example, a segment could end at a punctuation mark, a line break or a tag. In Wordbee you can configure the segmentation rules in Settings > Translation Settings > Segmentation rules.
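A segmentation rule of this kind can be sketched with a single regular expression. The rule below is an assumption for illustration (break after ".", "!" or "?" followed by whitespace, and at every line break); it is not Wordbee's rule set, and real rules also handle abbreviations, tags and language-specific punctuation.

```python
import re

# Hypothetical rule: a segment ends after sentence-final punctuation
# followed by whitespace, or at one or more line breaks.
SEGMENT_BREAK = re.compile(r"(?<=[.!?])\s+|\n+")

def segment(text):
    """Split text into segments and drop empty pieces."""
    return [part.strip() for part in SEGMENT_BREAK.split(text) if part.strip()]

text = "Installation guide\nOpen the file. Click Save! Is it done?"
for number, piece in enumerate(segment(text), start=1):
    print(number, piece)
```

The heading and the three sentences each become one segment. Changing the rule, for example by no longer breaking at line breaks, changes the number of segments and therefore the matches found in the translation memory.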

How to access the Word Counting configuration

You can view and customize the default word count settings for each type of project or client in the Translation Settings area of Wordbee.

  • Go to Settings.

  • Search for Word Counting.

  • Click on Configure to access the word count settings.

In the next subsections, you will learn about the different word counting options and how to customize the word count profiles in Standard and Codyt projects.

Learn more

Check the articles in this section to learn how to view, edit and configure the word count profiles in Standard and Codyt projects.
