| | talk | whisper-live-transcription |
|---|---|---|
| Mentions | 3 | 3 |
| Stars | 559 | 107 |
| Growth | - | - |
| Activity | 8.1 | 7.9 |
| Last commit | 8 months ago | 4 months ago |
| Language | TypeScript | Python |
| License | - | - |
Stars: the number of stars that a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
talk
-
ChatGPT can now see, hear, and speak – openai.com
Also curious to hear about your setup. Using whisper too? When I was experimenting with it there was still a lot of annoyance about hallucinations, and I was hard-coding some "if last phrase is 'thanks for watching', ignore last phrase" rules.
I was just googling a bit to see what's out there now for whisper/llama combos and came across this: https://github.com/yacineMTB/talk
There's a demo linked on the GitHub page that seems relatively fast at responding conversationally, but still maybe 1-2 seconds at times. Impressive that it's entirely offline.
- Is anyone doing always-on voice to text with a local llama at home?
-
Giving LLM’s a <Backspace> Token
Here’s a project attempting to do just this!
https://github.com/yacineMTB/talk
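The first comment in this list describes hard-coding a check that drops a trailing "thanks for watching" phrase hallucinated by Whisper. A minimal sketch of that kind of filter might look like the following. The function names and the blocklist are hypothetical, not taken from either project, and only the one phrase comes from the comment.

```python
import re

# Hypothetical blocklist; "thanks for watching" is the only phrase
# mentioned in the comment. Extend it with whatever you observe.
HALLUCINATED_PHRASES = ("thanks for watching",)


def _normalize(phrase: str) -> str:
    """Lowercase and strip punctuation so 'Thanks for watching!' still matches."""
    return re.sub(r"[^a-z0-9 ]+", "", phrase.lower()).strip()


def drop_trailing_hallucination(phrases: list[str]) -> list[str]:
    """Return the phrases without the last one if it is on the blocklist."""
    if phrases and _normalize(phrases[-1]) in HALLUCINATED_PHRASES:
        return phrases[:-1]
    return phrases
```

Only the last phrase is checked, mirroring the rule as the commenter states it. A real transcript may need a longer blocklist or fuzzier matching.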
whisper-live-transcription
-
OpenAI releases Whisper v3, new generation open source ASR model
I implemented a dummy real-time (tested on Mac M1) transcription approach with Whisper. You can find the project here: https://github.com/gaborvecsei/whisper-live-transcription
The idea was to provide transcription results as fast as you can, and you can refine it along the way by providing more and more context.
-
ChatGPT can now see, hear, and speak – openai.com
Here's a link to a project that claims half second latency for the transcription part: https://github.com/gaborvecsei/whisper-live-transcription
- Show HN: Live Transcription with Whisper in a client-server setup
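The first comment in this list describes returning a transcription quickly and then refining it as more context arrives. One possible reading of that idea is to re-transcribe a growing, bounded audio buffer every time a new chunk comes in. The sketch below is an illustration under that assumption, not the project's actual implementation. `RollingTranscriber`, `push`, and `max_samples` are hypothetical names, and `transcribe` stands in for whatever model call you use.

```python
from typing import Callable, List


class RollingTranscriber:
    """Re-transcribes the most recent audio each time a chunk arrives."""

    def __init__(self, transcribe: Callable[[List[float]], str], max_samples: int):
        # transcribe: any function mapping audio samples to text (assumed).
        # max_samples: cap on retained context; must be a positive integer.
        self.transcribe = transcribe
        self.max_samples = max_samples
        self.buffer: List[float] = []

    def push(self, chunk: List[float]) -> str:
        """Append a chunk, trim to the newest max_samples, and transcribe."""
        self.buffer.extend(chunk)
        self.buffer = self.buffer[-self.max_samples:]
        return self.transcribe(self.buffer)
```

Each call to `push` returns a fresh transcript of the whole retained buffer, so earlier guesses get overwritten as context grows. The cap keeps the cost per call bounded.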
What are some alternatives?
llama_farm - Use local llama LLM or openai to chat, discuss/summarize your documents, youtube videos, and so on.
chatcraft.org - Developer-oriented ChatGPT clone
willow - Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
openWakeWord - An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
whisper-dictation - Dictation app based on the OpenAI speech to text models
awesome-talking-head-generation
faster-whisper-dictation - Dictation app based on the Faster Whisper transcription with CTranslate2
vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
nerd-dictation - Simple, hackable offline speech to text - using the VOSK-API.