Integrated OCR and PDF to Word Converter
Translate images with OCR
Wordbee includes a tool that extracts text from images in formats such as PNG, JPG, TIFF, ICO, BMP and many more.
You get text back as an HTML file, ready to be edited or translated.
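If you prepare a folder of files before uploading, a small script can pick out the images. The following is only a sketch: it knows just the formats named above, the extra spellings `.jpeg` and `.tif` are an assumption, and the names `NAMED_IMAGE_EXTENSIONS` and `select_images` are made up for illustration. The tool itself accepts more formats than this list.

```python
from pathlib import Path

# Assumption: only the formats named in the text, plus the common
# alternative spellings .jpeg and .tif. The OCR tool accepts more.
NAMED_IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".tif", ".ico", ".bmp"}


def select_images(filenames):
    """Return the file names whose extension is one of the named image formats."""
    return [
        name
        for name in filenames
        if Path(name).suffix.lower() in NAMED_IMAGE_EXTENSIONS
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(select_images(["scan.PNG", "notes.docx", "page.tiff"]))
```

The comparison is case-insensitive, so `scan.PNG` and `scan.png` are treated alike.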
Image to text sample
Let us look at a sample image and the text extracted from it. Yes, you can copy and paste the text on the right: it is no longer an image.
Uploaded image: | Extracted text: 好一朵美丽的茉莉花 |
Convert image files to text
Go to a project and open the document library. Upload or drag & drop your image file:
Click the images to select one or more of them. Then click the OCR link above the files. The OCR tool opens:
Choose one of the OCR systems and click "Process files". The results are saved next to the images:
To quickly check whether the text was recognized properly, right-click one of the HTML files and select "Open With" > "Web Browser":
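If you have downloaded several result files and want to review the recognized text outside the browser, the visible text can be pulled out of the HTML with Python's standard library. This is a minimal sketch, not part of Wordbee: the class `TextExtractor` and the function `html_to_text` are hypothetical names, and the sample markup is an assumption about what a result file might look like.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the content of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, one text run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


if __name__ == "__main__":
    # Assumed sample markup; real result files may be structured differently.
    sample = "<html><body><p>好一朵美丽的茉莉花</p></body></html>"
    print(html_to_text(sample))
```

To check a real file, read it into a string first (for example with `pathlib.Path(...).read_text(encoding="utf-8")`, assuming the file is UTF-8 encoded) and pass the string to `html_to_text`.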
Handwritten text (English only)
One of the integrated OCR systems, provided by Microsoft, can extract handwritten English text.
Sample image:
Converted to:
chapter
Mr. Sherlock Holmes
In the year 1878 I took my degree of Doctor
of medicine of the university of London
and proceeded to Netley to go through
the course precribed for surgeons . Having
completed my studies there . I was duly attached
to the Fifth Northumberland Fusiliers as
Assistant
Surgeon
Enabling OCR systems
The OCR technology is provided by Google and Microsoft (more systems may be added in the future). You need to obtain credentials from at least one of them. Both offer free plans that allow you to convert roughly 5000 images per month. Beyond that you are charged, but the cost is very reasonable.
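To estimate whether a month's volume stays within a free plan, the arithmetic can be sketched as follows. The figure of 5000 is the approximate one given above; the actual quota and how it is counted depend on the provider, and `billable_images` is a hypothetical helper name.

```python
# Approximate free allowance taken from the text above; the exact quota
# is an assumption and should be checked against the provider's plan.
FREE_IMAGES_PER_MONTH = 5000


def billable_images(images_this_month, free_quota=FREE_IMAGES_PER_MONTH):
    """Return how many images fall outside the free monthly allowance."""
    if images_this_month < 0:
        raise ValueError("images_this_month must not be negative")
    return max(0, images_this_month - free_quota)


if __name__ == "__main__":
    for volume in (4200, 6250):
        print(volume, "images:", billable_images(volume), "billable")
```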
Go to Settings > Image to Text (OCR) and enable the systems you need:
This page provides all the information you need.
The sign-up process with Google and Microsoft is not the most intuitive. If you have any questions, please contact our support team.
What about PDF files?
If the PDF was created with a word processor and is not a scan, you can use the PDF converter tool in the project page.
If the PDF contains scanned pages, you can save the individual pages as image files. This can be done with a screenshot tool.
We may add a direct PDF scan to text feature at a later time.
PDF to Word Converter
Thanks to its PDF to Word Converter option, Wordbee Translator enables you to easily translate searchable PDF documents. There is no more need to convert with a third-party tool: you can now process your documents in one go!
The mechanism is simple. With this option you can upload your searchable PDF files to your Wordbee platform and process them like any other file. The PDF Converter instantly converts the documents into Word files (.docx) in order to extract the content to translate.
The output format can be '.docx' or '.pdf'. Note that layout information may be lost. A DTP task is recommended to make sure the layout of the source document is respected in the translated one.
Text contained in images is not extracted, because the converter does not perform optical character recognition (OCR).