Microsoft has announced that the document translation feature built into Azure Translator can now scan and translate PDF documents. The company says users no longer need to pre-process documents through an OCR engine before trying to translate them.
The Document Translation feature, first launched a year ago, can translate multiple documents into more than 110 languages and dialects at once. Today's update means PDF files are now fully supported, alongside Word and PowerPoint files. According to the company, the ability to translate PDFs that contain scanned image content has been in high demand.
Explaining some of the features, Microsoft has said:
The File Translation Service now has the intelligence to
- Identify whether a PDF document contains scanned image content,
- Route PDFs containing scanned image content to the internal OCR engine to extract text,
- Reconstruct the translated content into regular text PDFs while preserving the original layout and structure.
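The routing decision described in the list above can be sketched in a few lines. This is only an illustration, not Microsoft's implementation: the `Page` summary, the "images but no extractable text" heuristic, and the route names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical page summary: extractable text length and embedded image count."""
    text_chars: int
    image_count: int


def has_scanned_content(pages: List[Page]) -> bool:
    # Assumed heuristic: a page with images but no extractable text is treated as a scan.
    return any(p.image_count > 0 and p.text_chars == 0 for p in pages)


def route(pages: List[Page]) -> str:
    # Scanned PDFs go through OCR before translation; text PDFs are translated directly.
    return "ocr_then_translate" if has_scanned_content(pages) else "translate"
```

A document with one scanned page among text pages would be routed through OCR under this heuristic, while a purely text-based PDF would skip it.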
While document translation is available in 110 languages and dialects, the new scanning feature is only available in 68 source languages and 87 target languages. Microsoft has promised to add support for more "in due course."
Microsoft says no code changes are required to start using the new feature, and PDFs can be submitted to Translator immediately. The new capability comes at no additional cost to customers. Document translation through Azure is available under two pricing plans: a pay-as-you-go plan and a D3 volume discount plan for higher volumes.
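Because no code changes are needed, a scanned PDF is submitted the same way as any other document. The sketch below only builds the JSON body for a batch request, assuming the v1.0 Document Translation REST path and its `inputs`/`source`/`targets` layout. The storage URLs are hypothetical placeholders, and the helper name `build_batch_request` is invented for this example.

```python
import json
from typing import Dict

# Assumed path of the batch endpoint (Document Translation v1.0); the host is your own resource.
BATCH_PATH = "/translator/text/batch/v1.0/batches"


def build_batch_request(source_url: str, target_urls: Dict[str, str]) -> dict:
    """Build the request body: one source container and one target container per language."""
    targets = [
        {"targetUrl": url, "language": lang}
        for lang, url in target_urls.items()
    ]
    return {"inputs": [{"source": {"sourceUrl": source_url}, "targets": targets}]}


if __name__ == "__main__":
    # Hypothetical container URLs; real ones would point at your Azure Blob Storage containers.
    body = build_batch_request(
        "https://example.blob.core.windows.net/source",
        {"fr": "https://example.blob.core.windows.net/target-fr"},
    )
    print(json.dumps(body, indent=2))
```

The body would then be sent as a POST to the resource's endpoint plus `BATCH_PATH`, with the subscription key in the `Ocp-Apim-Subscription-Key` header. PDFs in the source container need no special flag.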