Article Open Access http://dx.doi.org/10.26855/er.2024.12.008
Research on Personalized Speech Synthesis Model for Korean Language Learners
Xinxin Zhao1,*, Yunning Wang2
1International Commerce, Graduate School of International Studies (GSIS), Seoul National University, Seoul 08826, South Korea.
2Department of Anthropology, Seoul National University, Seoul 08826, South Korea.
*Corresponding author: Xinxin Zhao
Published: December 27, 2024
Abstract
This research explores and implements a personalized speech synthesis model tailored to Korean language learners. Despite significant advances in general-purpose speech synthesis, producing high-quality, natural-sounding synthesized speech for Korean language learners remains challenging. In this study, we apply deep learning techniques and combine research on facial muscle movements with speech learning to design a framework for personalized speech synthesis. First, a substantial amount of speech data from Korean language learners is collected, preprocessed, and annotated. We then construct a personalized synthesis model based on deep neural networks that targets pronunciation correction and fluency improvement for individual learners. The novelty of this research lies in integrating facial muscle movements with speech learning to optimize personalized speech synthesis. This approach has practical implications for improving Korean language learners' pronunciation and communication skills.
How to cite this paper: Xinxin Zhao, Yunning Wang. (2024). Research on Personalized Speech Synthesis Model for Korean Language Learners. The Educational Review, USA, 8(12), 1465-1470.
DOI: http://dx.doi.org/10.26855/er.2024.12.008