Paper Information - A Hybrid Model for the Prediction of the Linguistic Origin of Surnames

Predicting the linguistic origin of surnames is a basic capability required in the design of high-quality multilingual speech synthesizers. Assigning a string that represents a surname to a specific language typically relies on a set of rules that are hard to write down explicitly. The approach we propose addresses this problem by combining a rule-based system with a module based on evidential reasoning and a module based on neural networks. The resulting hybrid system combines these sources of information, merging knowledge from linguistic experts with knowledge acquired automatically by learning from examples. The system has been validated on a large database of surnames belonging to four different languages, showing its effectiveness for real-world applications.
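The abstract does not say how the three modules are fused. As a minimal sketch, assuming that "evidential reasoning" refers to Dempster-Shafer theory, the example below combines two belief assignments over candidate languages with Dempster's rule of combination. The language labels `A` to `D`, the mass values, and the function name `combine_dempster` are hypothetical and are not taken from the paper.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: the product of masses goes to the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Contradictory evidence: accumulate the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}


# Hypothetical frame of discernment: four unnamed candidate languages.
LANGS = frozenset({"A", "B", "C", "D"})

# Hypothetical evidence from a rule-based module; the mass on LANGS is "don't know".
rules = {frozenset({"A"}): 0.6, LANGS: 0.4}

# Hypothetical evidence derived from a neural network's outputs.
network = {frozenset({"A"}): 0.5, frozenset({"B"}): 0.3, LANGS: 0.2}

fused = combine_dempster(rules, network)

# Pick the single language with the highest combined mass.
best = max((k for k in fused if len(k) == 1), key=fused.get)
print(sorted(best), round(fused[best], 4))
```

Mass assigned to the full set `LANGS` expresses ignorance rather than support for any one language, which is what lets a weak or silent module defer to a more confident one.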
