AI Pronunciation Trainer

Patricia Arquette
2024-12-30 00:46:10

In this article I present a project I'm currently working on: AI Pronunciation Trainer (online here), a tool designed to help you improve your pronunciation using artificial intelligence. The project is a refactor of Thiagohgl's original AI Pronunciation Trainer, to which I made several improvements that make the tool more effective and easier to use.

What it is and what it does

AI Pronunciation Trainer is a tool that uses artificial intelligence to evaluate your pronunciation and provide feedback, helping you improve and be understood more clearly. It uses the Silero STT / TTS models for speech-to-text and text-to-speech, and the pronunciation assessment relies on them.
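To make the idea of evaluating pronunciation from a speech-to-text result more concrete, here is a minimal sketch that compares the sentence you were asked to read with the text a recognizer returned. It is an illustration only and not the scoring method used by the project: the function names (normalize, word_feedback, accuracy) are hypothetical, the comparison is a plain word-level match using Python's difflib, and the recognized text is assumed to come from some speech-to-text step that is not shown.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-zäöüß']+", text.lower())


def word_feedback(expected: str, recognized: str) -> list[tuple[str, bool]]:
    """Pair every expected word with True if it was recognized in order."""
    expected_words = normalize(expected)
    recognized_words = normalize(recognized)
    matcher = difflib.SequenceMatcher(
        a=expected_words, b=recognized_words, autojunk=False
    )
    matched = [False] * len(expected_words)
    # Each matching block is a run of identical words in both sequences.
    for block in matcher.get_matching_blocks():
        for index in range(block.a, block.a + block.size):
            matched[index] = True
    return list(zip(expected_words, matched))


def accuracy(feedback: list[tuple[str, bool]]) -> float:
    """Fraction of expected words that were recognized (0.0 for no words)."""
    if not feedback:
        return 0.0
    return sum(ok for _, ok in feedback) / len(feedback)


if __name__ == "__main__":
    # Hypothetical recognizer output where one word was not understood.
    result = word_feedback("Hello, how are you today?", "hello how you today")
    for word, ok in result:
        print(f"{word}: {'ok' if ok else 'try again'}")
    print(f"accuracy: {accuracy(result):.0%}")
```

A word-level match like this only tells you which words the recognizer understood. A finer assessment would need to compare sounds rather than whole words, which this sketch does not attempt.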

Refactor: update of the Frontend and Backend Libraries

I updated the backend libraries, bringing PyTorch in particular to version 2.5.x. I also changed the version of the German Speech-to-Text model to fix a bug that prevented the use of PyTorch versions later than 1.13.x.
Other changes, mostly on the frontend:

  • Updated the JavaScript libraries to the latest versions of jQuery (3.7.1) and Bootstrap (5.3.3)
  • New frontend based on Gradio 5.x
  • Added E2E tests with Playwright
  • Added the ability to write, read and evaluate a sentence of your own choice
  • Guided tour for new users with driver.js and custom CSS/JavaScript inside Gradio blocks
  • Playback of individual words in the recording followed by the 'ideal' pronunciation of the same word read by the Text-to-Speech engine
  • Added an in-browser Text-to-Speech feature (on Windows 11 it only works if the English and German language packs are installed)
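As an illustration of the single-word playback mentioned in the list, the sketch below cuts one word out of a recording given its start and end time. It rests on assumptions the article does not confirm: that word timestamps are available from some alignment step, that the recording is a flat sequence of mono samples, and that a small padding around each word sounds better. The names WordSpan and slice_word are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class WordSpan:
    """A word and its position in the recording, in seconds (assumed input)."""
    word: str
    start_s: float
    end_s: float


def slice_word(
    samples: Sequence[float],
    sample_rate: int,
    span: WordSpan,
    padding_s: float = 0.05,
) -> Sequence[float]:
    """Return the samples for one word, with a little padding on each side."""
    # Convert seconds to sample indices and clamp them to the recording.
    start = max(0, round((span.start_s - padding_s) * sample_rate))
    end = min(len(samples), round((span.end_s + padding_s) * sample_rate))
    return samples[start:end]


if __name__ == "__main__":
    # One second of dummy audio at 16 kHz; real samples would come from a file.
    recording = [0.0] * 16000
    clip = slice_word(recording, 16000, WordSpan("hello", 0.25, 0.5))
    print(f"'hello' clip: {len(clip)} samples")
```

The clip for the recorded word could then be played back to back with the same word synthesized by the Text-to-Speech engine, so that the two can be compared by ear.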

Online version: the demo in the HuggingFace space

You can try my project online on my HuggingFace Space. This online demo lets you experiment with the tool's capabilities without any installation or configuration. Please be patient: the Space can be a little slow, and it goes to sleep if no one has used it for a while (locally it is much faster, especially if you have a powerful computer). There is also an embedded version of the HuggingFace Space.

Future Work

While it works quite well, there is obviously room for improvement. Here are some of the future improvements I plan to implement:

  • Receive feedback from the author of the original work on my documentation and changes
  • Ask the author of the original work for some explanations on the architectural and functional choices he made
  • Evaluate the transition from PyTorch to ONNX Runtime
  • Add more E2E tests with Playwright

Conclusion

I believe that AI Pronunciation Trainer is a useful tool for anyone who wants to improve their pronunciation independently. With the improvements made during the refactor, the tool gives feedback on your pronunciation to help you speak more clearly and confidently. I invite you to try the HuggingFace Space demo and see how this project can help you on your path to better pronunciation.