Machine Translation with LLMs

To create a machine translation profile that uses a Large Language Model (LLM), follow the detailed instructions provided here. The overall process is similar, but some settings are specific to LLM integrations.

Credentials

Because credentials for Large Language Models (LLMs) are complex, they are managed in dedicated profiles, which are then linked to the relevant machine translation (MT) profile. Learn more about LLM providers and their credential requirements here.

Custom Prompt

Every MT request sent to an LLM starts with a preset prompt. Users can also apply a custom prompt to influence the LLM’s tone, style, or content domain, which may improve translation quality. Keep prompts concise, as overly detailed prompts can reduce translation quality, and experiment with different prompts, because every LLM provider processes them differently. LLMs may not support all languages; for less commonly used languages, check compatibility directly with the LLM provider.

Request Limits

LLM providers set various limits on API usage to prevent service overload. To help requests succeed and to keep users from being blocked for misuse, Wordbee lets you configure specific usage restrictions in MT profiles.

Tokens per minute

Each LLM specifies the maximum number of tokens it can handle in a single request. This limit is either preset or customizable in the LLM system. Wordbee Translator uses the token limit to divide larger documents into multiple machine translation requests, so that entire documents are processed. Set the token limit in Wordbee Translator to match that of the LLM or lower, as requests exceeding the limit may be declined. Although a higher token count per minute can accelerate processing, exceeding these limits can lengthen processing times, reduce efficiency, and cause requests to fail.
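The splitting behavior described above can be illustrated with a minimal Python sketch. It is not Wordbee Translator's actual implementation: the function names split_into_requests and estimate_tokens are hypothetical, and the whitespace-based token count is only a rough stand-in for a provider's real tokenizer, which will usually report different numbers.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # Real LLM tokenizers differ per provider and usually count more tokens.
    return max(1, len(text.split()))


def split_into_requests(
    segments: List[str],
    token_limit: int,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> List[List[str]]:
    """Group segments into batches whose estimated size stays within token_limit."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for segment in segments:
        cost = count_tokens(segment)
        if cost > token_limit:
            # A single segment that cannot fit would be declined by the LLM.
            raise ValueError("segment exceeds the token limit on its own")
        if current and used + cost > token_limit:
            # The current batch is full: close it and start a new request.
            batches.append(current)
            current, used = [], 0
        current.append(segment)
        used += cost
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    document = ["one two three", "four five", "six", "seven eight nine ten"]
    for number, batch in enumerate(split_into_requests(document, token_limit=5), 1):
        print(number, batch)
```

Lowering token_limit in this sketch produces more, smaller requests, which mirrors the advice to stay at or below the limit configured on the LLM side.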

Maximum wait time

When multiple requests reach the same LLM system at the same time, some might be rejected due to system overload. To address this issue, LLM providers have implemented safety mechanisms that automatically accept retries of the requests that were initially rejected. While systems allow for several retries, too many retry attempts can result in throttling or even blocking of the requesting systems.

If a request fails, Wordbee Translator resends it a predetermined number of times. This number is specific to each LLM provider and based on its API guidelines. The maximum wait time is the total period over which Wordbee Translator spreads these retry requests. A longer wait time spreads the retries over a longer period, whereas a shorter wait time triggers the same number of retries within a shorter period.

For businesses that often send many requests in a short span, extending the maximum wait time can help reduce system overload. For less frequently used LLMs, a shorter wait time can speed up processing, because retries are triggered sooner.
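To make the relationship between the number of retries and the maximum wait time concrete, here is a minimal Python sketch. It rests on assumptions: the text above does not say how Wordbee Translator spaces its retries, so the exponentially growing delays, the growth factor, and the function names retry_delays and send_with_retries are all hypothetical. The only property taken from the description is that a fixed number of retries is spread over the configured maximum wait time.

```python
import time
from typing import Callable, List


def retry_delays(max_wait_seconds: float, retries: int, growth: float = 2.0) -> List[float]:
    """Return one delay per retry; the delays add up to max_wait_seconds."""
    if retries <= 0:
        return []
    # Assumption: each delay is `growth` times longer than the previous one.
    weights = [growth ** i for i in range(retries)]
    total = sum(weights)
    return [max_wait_seconds * weight / total for weight in weights]


def send_with_retries(
    send: Callable[[], bool],
    delays: List[float],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() once, then retry after each delay until it succeeds."""
    if send():
        return True
    for delay in delays:
        sleep(delay)
        if send():
            return True
    return False


if __name__ == "__main__":
    # Same number of retries, spread over a long and a short wait time.
    print(retry_delays(70, 3))
    print(retry_delays(14, 3))
```

In this sketch, changing max_wait_seconds never changes how many retries are made, only how far apart they are, which matches the trade-off described above between reducing overload and speeding up processing.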