Openai-whisper-cpu Alternatives

Similar projects and alternatives to openai-whisper-cpu

text-generation-webui

876 37,023 9.9 Python openai-whisper-cpu VS text-generation-webui

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
whisper

345 61,790 6.4 Python openai-whisper-cpu VS whisper

Robust Speech Recognition via Large-Scale Weak Supervision
whisper.cpp

187 31,817 9.8 C openai-whisper-cpu VS whisper.cpp

Port of OpenAI's Whisper model in C/C++
FlexGen

39 9,035 3.5 Python openai-whisper-cpu VS FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.
whisperX

24 9,391 8.4 Python openai-whisper-cpu VS whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
buzz

21 10,177 8.5 Python openai-whisper-cpu VS buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
intel-extension-for-pytorch

16 1,380 9.7 Python openai-whisper-cpu VS intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
deepsparse

21 2,895 9.5 Python openai-whisper-cpu VS deepsparse

Sparsity-aware deep learning inference runtime for CPUs
BentoML

16 6,627 9.8 Python openai-whisper-cpu VS BentoML

The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
frogbase

14 754 4.3 Python openai-whisper-cpu VS frogbase

Discontinued. Transform audio-visual content into navigable knowledge.
kernl

8 1,472 1.5 Jupyter Notebook openai-whisper-cpu VS kernl

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
WAAS

12 1,755 7.0 JavaScript openai-whisper-cpu VS WAAS

Whisper as a Service (GUI and API with queuing for OpenAI Whisper)
whisper-asr-webservice

11 1,718 7.8 Python openai-whisper-cpu VS whisper-asr-webservice

OpenAI Whisper ASR Webservice API
serve

11 3,985 9.5 Java openai-whisper-cpu VS serve

Serve, optimize and scale PyTorch models in production (by pytorch)
modal-examples

9 585 9.5 Python openai-whisper-cpu VS modal-examples

Examples of programs built using Modal
whisper-playground

7 763 6.1 Python openai-whisper-cpu VS whisper-playground

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/
llama-cpp-python

56 6,725 9.8 Python openai-whisper-cpu VS llama-cpp-python

Python bindings for llama.cpp
transformer-deploy

8 1,626 6.8 Python openai-whisper-cpu VS transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
yt-whisper

3 1,320 0.0 Python openai-whisper-cpu VS yt-whisper

Using OpenAI's Whisper to automatically generate YouTube subtitles
coriander

3 835 0.0 LLVM openai-whisper-cpu VS coriander

Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better openai-whisper-cpu alternative or higher similarity.

openai-whisper-cpu reviews and mentions

Posts with mentions or reviews of openai-whisper-cpu. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-05-14.

How to run Llama 13B with a 6GB graphics card
12 projects | news.ycombinator.com | 14 May 2023

I feel the same.
For example, some stats from Whisper [0] (audio transcription) show the following for the medium model (see other models in the link):
---
GPU medium fp32 Linear 1.7s
CPU medium fp32 nn.Linear 60.7s
CPU medium qint8 (quant) nn.Linear 23.1s
---
So the same model runs 35.7 times faster on GPU, and still 13.6 times faster when compared to a CPU-optimized model.
I was expecting around an order of magnitude of improvement. Then again, I do not know if in the case of this article the entire model was in the GPU, or just a fraction of it (22 layers), which might explain the result.
[0] https://github.com/MiscellaneousStuff/openai-whisper-cpu
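The ratios quoted in this comment can be checked with a few lines of Python. This is only a sketch of the arithmetic: the three timings are the ones given in the comment, taken to be in seconds, and the dictionary keys are labels chosen here for illustration.

```python
# Timings for the medium model as quoted in the comment (assumed to be seconds).
timings_s = {
    "GPU fp32": 1.7,
    "CPU fp32": 60.7,
    "CPU qint8": 23.1,
}

gpu_time = timings_s["GPU fp32"]

# How many times faster the GPU run is than each CPU run.
gpu_speedup = {
    name: t / gpu_time for name, t in timings_s.items() if name != "GPU fp32"
}

# How much the qint8 quantization helps on CPU alone.
cpu_quant_gain = timings_s["CPU fp32"] / timings_s["CPU qint8"]

for name, ratio in gpu_speedup.items():
    print(f"GPU vs {name}: {ratio:.1f}x")
print(f"CPU qint8 vs CPU fp32: {cpu_quant_gain:.1f}x")
```

The last ratio is not stated in the comment; it is derived here from the same two CPU timings.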
Whispers AI Modular Future
14 projects | news.ycombinator.com | 20 Feb 2023

According to https://github.com/MiscellaneousStuff/openai-whisper-cpu the medium model needs 1.7 seconds to transcribe 30 seconds of audio when run on a GPU.
[P] Transcribe any podcast episode in just 1 minute with optimized OpenAI/whisper
4 projects | /r/MachineLearning | 6 Nov 2022

There is a very simple method built-in to PyTorch which can give you over 3x speed improvement for the large model, which you could also combine with the method proposed in this post. https://github.com/MiscellaneousStuff/openai-whisper-cpu
[D] How to get the fastest PyTorch inference and what is the "best" model serving framework?
8 projects | /r/MachineLearning | 28 Oct 2022

For CPU inference, model quantization is a very easy to apply method with great average speedups which is already built-in to PyTorch. For example, I applied dynamic quantization to the OpenAI Whisper model (speech recognition) across a range of model sizes (ranging from tiny which had 39M params to large which had 1.5B params). Refer to the below table for performance increases:
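As a rough illustration of the dynamic quantization mentioned in this post, the sketch below applies PyTorch's `torch.quantization.quantize_dynamic` to the `nn.Linear` layers of a small stand-in model. The `TinyMLP` class and its layer sizes are hypothetical and exist only to keep the example self-contained; the post applied the technique to Whisper checkpoints, which are not loaded here. It assumes a PyTorch build with a quantized CPU backend available.

```python
import torch
from torch import nn


class TinyMLP(nn.Module):
    """Hypothetical stand-in model; not part of Whisper."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 128)
        self.act = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))


model_fp32 = TinyMLP().eval()

# Replace every nn.Linear with a dynamically quantized version:
# weights are stored as int8, activations are quantized at run time.
# By default this returns a converted copy and leaves model_fp32 untouched.
quantized_model = torch.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 64)
with torch.no_grad():
    out_fp32 = model_fp32(x)
    out_int8 = quantized_model(x)

print(type(quantized_model.fc1))
print("max abs difference:", (out_fp32 - out_int8).abs().max().item())
```

Any speedup depends on the model, the CPU, and the PyTorch build, so timings should be measured on the target machine rather than inferred from this toy model.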
[P] OpenAI Whisper - 3x CPU Inference Speedup
1 project | /r/MachineLearning | 27 Oct 2022

GitHub