Please use this identifier to cite or link to this item:
http://dspace.aiub.edu:8080/jspui/handle/123456789/2661
Title: | Enhancing Bangla Local Speech-to-Text Conversion Using Fine-Tuning Wav2vec 2.0 with OpenSLR and Self-Compiled Datasets Through Transfer Learning |
Authors: | Hossain, S.K. Muktadir; Rihan, Rahat; Ahmed, Imtiaz; Boni, Pritam Khan; Gomes, Dipta |
Keywords: | Bangla Speech Recognition; wav2vec 2.0; Transfer Learning; Speech Technology; Automatic Speech Recognition (ASR) |
Issue Date: | 15-Mar-2025 |
Publisher: | IEOM Society International |
Abstract: | This paper presents an improved method for speech-to-text conversion of standard and local Bangla speech. The wav2vec 2.0 model was fine-tuned on OpenSLR data together with additional self-compiled datasets. The findings show gains in transcription accuracy of as much as eleven percent, which is notable for a low-resource language and demonstrates the merits of transfer learning and fine-tuning. The research aims to expand the knowledge base on applying novel deep learning algorithms to low-resource languages in speech technology. Evaluation used Word Error Rate (WER) and Character Error Rate (CER); the fine-tuned model achieved an overall WER of 11.27% and a CER of 6.03%. Comparative analysis with previous work shows a significant improvement over baseline models, highlighting the efficacy of wav2vec 2.0 in leveraging large and diverse datasets. The experiments ran on a cluster computing environment with NVIDIA CUDA-compatible GPUs, underscoring the computational resources required for effective Automatic Speech Recognition (ASR) model training. The results demonstrate substantial advances in ASR performance for Bengali, with the fine-tuned model outperforming previous benchmarks and showcasing the benefits of self-supervised learning approaches. |
Description: | None |
URI: | http://dspace.aiub.edu:8080/jspui/handle/123456789/2661 |
ISBN: | 979-8-3507-4443-9 |
ISSN: | 2169-8767 |
Appears in Collections: | Publications: Journals |
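The abstract reports results as Word Error Rate (WER) and Character Error Rate (CER). As a rough illustration of how such scores are commonly computed, the sketch below divides the Levenshtein edit distance between a reference transcript and a model hypothesis by the reference length. It is not the authors' evaluation code: the function names, the whitespace tokenization, the choice to count spaces as characters in CER, and the English example strings are all assumptions made for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # Assumption: spaces count as characters; some toolkits strip them first.
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example strings, not taken from the paper's datasets.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

Under this convention, a WER of 11.27% means that roughly 11 word-level edits are needed per 100 reference words. Because the same definitions operate on Unicode strings, they apply to Bangla text as well, although text normalization choices before scoring can change the numbers.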
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Enhancing Bangla Local STT.pdf | First paper Manuscript | 151.85 kB | Adobe PDF |