Dublin Core
Title
Quantitative Analysis of Voice Recognition Models
Abstract
With the growing adoption of virtual communication and voice-driven applications, the need for accurate, real-time, and privacy-conscious transcription tools has become critical. Existing solutions largely rely on cloud infrastructure, introducing concerns around latency, cost, and data privacy. This project investigates whether modern speech recognition models can perform competitively in fully offline environments while maintaining accuracy and responsiveness.
To this end, we conducted a comparative evaluation of four voice transcription models: Whisper, Faster-Whisper, Wav2Vec2, and Vosk, using the AMI Meeting Corpus. Each model was assessed on four key metrics: Word Error Rate (WER), Character Error Rate (CER), BLEU, and ROUGE-L. Our findings demonstrate that Faster-Whisper outperforms the others in both accuracy and latency, making it a strong candidate for edge deployment.
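To make the evaluation concrete, the following is a minimal sketch of how the four metrics could be computed for a single reference/hypothesis pair. The use of the jiwer, sacrebleu, and rouge-score packages is an assumption for illustration; the original text does not name the tooling used.

# Illustrative metric computation for one reference/hypothesis pair.
# Assumed tooling (not stated in the abstract): pip install jiwer sacrebleu rouge-score
from jiwer import wer, cer
import sacrebleu
from rouge_score import rouge_scorer

reference = "okay so let us move on to the next agenda item"
hypothesis = "okay so lets move on to the next agenda item"

# Word and character error rates (lower is better).
print("WER:", wer(reference, hypothesis))
print("CER:", cer(reference, hypothesis))

# BLEU on a single sentence pair (higher is better).
bleu = sacrebleu.corpus_bleu([hypothesis], [[reference]])
print("BLEU:", bleu.score)

# ROUGE-L F-measure (higher is better).
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
print("ROUGE-L:", scorer.score(reference, hypothesis)["rougeL"].fmeasure)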
Building upon this analysis, a lightweight desktop application was developed using Python and PyQt5. The app captures microphone input in real time, applies Voice Activity Detection (VAD) and loudness filtering to reduce noise, and transcribes valid segments using Faster-Whisper. Additionally, the tool integrates Ollama, a local LLM engine, to optionally generate intelligent responses to transcribed text.
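As an illustration of the described pipeline, here is a minimal sketch of the capture, filter, and transcribe loop. The model size, thresholds, and the choice of sounddevice and webrtcvad are assumptions for the sketch, not details given above.

# Minimal sketch of the capture -> VAD/loudness filter -> transcribe loop.
# Assumed tooling: pip install sounddevice webrtcvad faster-whisper numpy
import numpy as np
import sounddevice as sd
import webrtcvad
from faster_whisper import WhisperModel

SAMPLE_RATE = 16000
FRAME_MS = 30                       # webrtcvad accepts 10/20/30 ms frames
FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000
RMS_THRESHOLD = 0.01                # loudness gate (assumed value)

vad = webrtcvad.Vad(2)              # aggressiveness 0-3 (assumed setting)
model = WhisperModel("small", device="cpu", compute_type="int8")

def record_segment(seconds=3.0):
    """Capture a short mono chunk from the default microphone."""
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()
    return audio.ravel()

def is_valid(audio):
    """Keep the chunk only if it is loud enough and mostly speech."""
    if np.sqrt(np.mean(audio ** 2)) < RMS_THRESHOLD:
        return False
    pcm = (audio * 32767).astype(np.int16).tobytes()
    step = FRAME_LEN * 2            # bytes per 30 ms frame of 16-bit PCM
    frames = [pcm[i:i + step] for i in range(0, len(pcm), step)]
    voiced = sum(vad.is_speech(f, SAMPLE_RATE) for f in frames if len(f) == step)
    return voiced > len(frames) // 2

while True:                         # Ctrl+C to stop
    chunk = record_segment()
    if is_valid(chunk):
        segments, _ = model.transcribe(chunk, language="en")
        print(" ".join(s.text.strip() for s in segments))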
This work contributes a dual outcome: a detailed empirical evaluation of modern transcription models on realistic meeting audio, and a functional, privacy-preserving voice assistant prototype for local systems. The results highlight the feasibility and value of running sophisticated voice AI tools on personal machines without cloud dependency, paving the way for secure adoption in sensitive domains such as legal, healthcare, and enterprise communication.
Keywords
speech recognition, Whisper, Faster-Whisper, transcription models, real-time, privacy, WER, PyQt5