Roadmap
With advancements in AI model accuracy and improvements in device performance, it is now possible to run certain models locally to solve tasks that were previously considered challenging.
Language barriers will remain a major obstacle to communication between regions for the foreseeable future.
DuRT will continue to be updated and maintained.
Task List
- Support non-streaming Whisper models to enable voice recognition in more languages (Highest Priority)
- Add functionality to generate complete subtitle files for local video or audio files (High Priority). Since Whisper-based recognition is already implemented, this feature should not be difficult to achieve.
- Utilize better translation models for improved results (High Priority)
- Resolve the issue where some models produce recognition results without punctuation (Medium-High Priority)
- Adapt more streaming speech recognition models to improve accuracy and support additional languages (Medium Priority)
- Add more translation methods, such as platform translation APIs and large model APIs (Medium Priority)
- Make Whisper-based speech recognition run in near real time (Medium Priority)
- Add microphone source selection, such as choosing a headset's microphone (Medium Priority)
- Support Apple Speech Recognition on macOS (Medium Priority)
- Support macOS's built-in translation features (Medium Priority)
- Optimize the overall process using large models (Medium Priority)
- Support AppleScript automation (Medium-Low Priority)
- Enable Whisper to automatically detect languages without manual selection (Medium-Low Priority)
- Add internal audio source selection, such as recognizing sound from a specific app only (Low Priority). Feasibility is uncertain.
- Support custom recognition models (Lowest Priority). This feature will likely be needed by only a small number of users, since it requires familiarity with AI models.
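To illustrate why the subtitle file item in the list above is expected to be straightforward, here is a minimal Python sketch that turns timestamped recognition segments into SRT text. It is not DuRT's code. The `Segment` type and the `format_timestamp` and `to_srt` helpers are hypothetical names introduced for this example, and it assumes the recognizer already yields start and end times in seconds for each piece of text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical recognized segment: times in seconds plus the text."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT: index line, time range, text, blank separator."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0.0, 1.25, "Hello"), Segment(1.25, 2.0, "World")]
    print(to_srt(demo))
```

Most of the remaining work for this item would likely be decoding the audio track of the input file and feeding it to the recognizer. Writing the subtitle file itself is simple text formatting.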
Discussions and Suggestions
If you have ideas or suggestions, feel free to reach us via Contact Us.
Open Source Acknowledgments
This project has drawn inspiration from many open-source projects. We are grateful to the following: