| talk | awesome-talking-head-generation
---|---|---
Mentions | 3 | 2
Stars | 559 | 1,182
Growth | - | -
Activity | 8.1 | 6.8
 | 8 months ago | 29 days ago
Language | TypeScript | -
 | - | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
talk
- ChatGPT can now see, hear, and speak – openai.com
Also curious to hear about your setup. Are you using Whisper too? When I was experimenting with it, hallucinations were still a big annoyance, and I was hard-coding rules like "if the last phrase is 'thanks for watching', ignore the last phrase".
I was just googling a bit to see what's out there now for whisper/llama combos and came across this: https://github.com/yacineMTB/talk
There's a demo linked on the GitHub page that seems relatively fast at responding conversationally, though it still takes maybe 1-2 seconds at times. Impressive that it's entirely offline.
- Is anyone doing always-on voice to text with a local llama at home?
- Giving LLM’s a <Backspace> Token
Here’s a project attempting to do just this!
https://github.com/yacineMTB/talk
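The hard-coded workaround described in the first post above (ignoring a trailing "thanks for watching" that Whisper hallucinated) could look roughly like the sketch below. This is only an illustration, not code from the talk project: the function names and the phrase set are hypothetical, and only the phrase "thanks for watching" comes from the post.

```python
import string
from typing import List

# Hypothetical blocklist; only "thanks for watching" is taken from the post.
HALLUCINATED_PHRASES = {"thanks for watching"}


def _normalize(phrase: str) -> str:
    """Lowercase the phrase and trim surrounding whitespace and punctuation."""
    return phrase.strip().strip(string.punctuation + " ").lower()


def drop_trailing_hallucination(phrases: List[str]) -> List[str]:
    """Return the phrases without a final entry that matches the blocklist."""
    if phrases and _normalize(phrases[-1]) in HALLUCINATED_PHRASES:
        return phrases[:-1]
    return phrases


if __name__ == "__main__":
    transcript = ["turn on the lights", "Thanks for watching!"]
    print(drop_trailing_hallucination(transcript))
```

Only the last phrase is checked, and only for an exact match after normalization, so a genuine sentence that merely contains those words is left alone.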
awesome-talking-head-generation
- Ask HN: How does Heygen AI video generation works?
I assume it's a SOTA version of "talking head generation" or something related.
https://paperswithcode.com/task/talking-head-generation
https://github.com/harlanhong/awesome-talking-head-generatio...
- ChatGPT can now see, hear, and speak – openai.com
As soon as they release the API, we can build an AI "bartender". Combine the voice input and output with NeRF talking heads such as those from Diarupt or https://github.com/harlanhong/awesome-talking-head-generatio....
You will now be able to feed it images and responses from the customers. Give it a function to call, such as complementaryDrink(customerId).
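The "give it a function to call" idea from the post above can be sketched without tying it to any particular vendor API. The example assumes the model emits a JSON object with a `name` and an `arguments` field; that message shape, the `TOOLS` registry, and the `dispatch_tool_call` helper are all hypothetical, and only `complementaryDrink(customerId)` comes from the post.

```python
import json
from typing import Callable, Dict


def complementary_drink(customer_id: str) -> str:
    """Hypothetical stand-in for the bar's real ordering system."""
    return f"Free drink queued for customer {customer_id}"


# Registry mapping the function name from the post to a local handler.
# The keyword name customerId mirrors the signature quoted in the post.
TOOLS: Dict[str, Callable[..., str]] = {
    "complementaryDrink": lambda customerId: complementary_drink(customerId),
}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse an assumed {"name": ..., "arguments": {...}} message and run it."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


if __name__ == "__main__":
    message = '{"name": "complementaryDrink", "arguments": {"customerId": "c-42"}}'
    print(dispatch_tool_call(message))
```

Keeping the registry explicit means the model can only trigger functions that were deliberately exposed, which matters once the call has real-world side effects.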
What are some alternatives?
llama_farm - Use local llama LLM or openai to chat, discuss/summarize your documents, youtube videos, and so on.
chatcraft.org - Developer-oriented ChatGPT clone
willow - Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
CVPR2022-DaGAN - Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
whisper-live-transcription - Live-Transcription (STT) with Whisper PoC
nerd-dictation - Simple, hackable offline speech to text - using the VOSK-API.
awesome-talking-head-generatio