Whisper Showdown - Better Programming
OpenAI's Whisper, an automatic speech recognition (ASR) model, has evolved significantly since 2022. It initially required expensive GPUs, but developers have since adapted it to run on regular CPUs and even Android phones. This progress matters for large language model applications such as ChatGPT and GPT-4, because transcription turns audio and video into text those applications can work with.
Whisper is a powerful tool for transcribing YouTube channels, a feature essential for creating an app that can "chat" with these channels. Although YouTube captions are an option, they can be unreliable, and it's preferable not to be locked into the YouTube platform.
Whisper's performance varies across model sizes and systems. Tests were conducted on an M1 Pro laptop, an M2 Pro mini, and a PC with an older Intel Core i9-9900K and an Nvidia RTX 2080 Ti. The article evaluates the speed and cost of each setup and shares the benchmarks and test results.
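To make the speed comparison concrete, here is a minimal sketch of how one might time a transcription run. It is not the article's benchmark code. The helper name real_time_factor is hypothetical, and the metric (processing time divided by audio duration) is an assumed way to normalize results across machines.

```python
import time
from typing import Callable


def real_time_factor(
    transcribe: Callable[[str], object],
    audio_path: str,
    audio_seconds: float,
) -> float:
    """Return processing time divided by audio duration (lower is faster).

    `transcribe` is any callable that takes a file path and runs a
    transcription; `audio_seconds` is the known length of the recording.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    transcribe(audio_path)  # the result is discarded; only timing matters here
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


# Possible usage with the openai-whisper package (assumed to be installed;
# "episode.mp3" and its 600-second length are placeholders):
#
#   import whisper
#   model = whisper.load_model("small")
#   print(real_time_factor(model.transcribe, "episode.mp3", 600.0))
```

A value below 1.0 would mean the machine transcribes faster than real time. Running the same file through each model size on each machine gives numbers that can be compared directly.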
Whisper performs well on both CPU and GPU setups, although the CPU-based implementations show some limitations. The M1 Pro and M2 Pro mini offer the lowest running cost for transcribing videos, but the upfront hardware cost can be significant depending on the configuration. The cloud-based RTX 4090 and A100, on the other hand, provide better performance at a higher cost because of rental fees.
The choice of architecture depends on the desired balance between cost and performance. For those on a tight budget, the M1 Pro and M2 Pro mini save energy and money. For those who prioritize speed, the RTX 4090 and A100 are better options, albeit more expensive.
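The trade-off between running cost, rental fees, and upfront cost can be sketched as simple arithmetic. The function names and every number below are hypothetical placeholders, not figures from the article. The model also assumes that local hardware costs only electricity and that cloud hardware costs only the rental fee.

```python
def cost_per_audio_hour(
    real_time_factor: float,
    power_watts: float = 0.0,
    electricity_per_kwh: float = 0.0,
    rental_per_hour: float = 0.0,
) -> float:
    """Cost of transcribing one hour of audio.

    real_time_factor is compute hours needed per hour of audio.
    """
    energy_cost = real_time_factor * (power_watts / 1000.0) * electricity_per_kwh
    rental_cost = real_time_factor * rental_per_hour
    return energy_cost + rental_cost


def breakeven_audio_hours(
    upfront_cost: float, local_cost: float, cloud_cost: float
) -> float:
    """Hours of audio after which buying hardware beats renting."""
    saving = cloud_cost - local_cost  # saved per hour of audio by running locally
    if saving <= 0:
        return float("inf")  # renting is never more expensive, so no break-even
    return upfront_cost / saving


# Placeholder inputs, chosen only to show the calculation.
local = cost_per_audio_hour(0.5, power_watts=40, electricity_per_kwh=0.30)
cloud = cost_per_audio_hour(0.1, rental_per_hour=2.00)
print(f"local: {local:.4f}  cloud: {cloud:.4f}")
print(f"break-even after {breakeven_audio_hours(1500.0, local, cloud):.0f} audio hours")
```

With real measurements in place of the placeholders, the break-even figure indicates how much audio has to be transcribed before owned hardware pays for itself.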
The RTX 2080 Ti, despite being older, offers competitive performance, especially with larger models. It becomes a viable option when transcribing longer videos with models larger than the small model.
In conclusion, the choice of hardware for video transcription should be based on individual needs and budget. Consider how often you transcribe, how fast you need the results, and how much you can spend. New technology may offer better and cheaper options in the future, so it is worth staying up to date.
The original article: https://betterprogramming.pub/whisper-showdown-427ce5f486ea