Control of a Speech Robot via an Optimum Neural-Network-Based Internal Model With Constraints

An optimum internal model with constraints, based on human-like behavior, is proposed and discussed for the control of a speech robot. The main idea of the study is that the robot carries out its movements so as to minimize the length of the path traveled in the internal space, subject to external acoustical and mechanical constraints. This optimum strategy defines the internal model, which is responsible for the robot's task planning. First, an exact analytical treatment of the problem is proposed. Next, using some empirical findings, an approximate solution for the internal model is developed. Finally, this solution is implemented and applied to the control of a speech robot, yielding results on task-planning strategies, task anticipation (namely, speech coarticulation), and the influence of force on the accuracy of executed tasks.
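
As a reading aid, the constrained path-length minimization described above can be written as a standard variational problem. This is only a sketch under assumptions that the abstract does not state: the internal-space coordinates $q_i(t)$, the Euclidean measure of path length, the constraint functions $g_k$ (standing in for the acoustical and mechanical constraints), and the multipliers $\lambda_k(t)$ are all hypothetical notation. The paper's own formulation may differ.

```latex
% Assumed notation: q(t) = (q_1, ..., q_n) is the internal-space trajectory,
% g_k(q) = 0 are hypothetical holonomic constraints, \lambda_k(t) are multipliers.
\begin{align}
  \min_{q(\cdot)} \; J[q] &= \int_{t_0}^{t_1} \sqrt{\sum_{i=1}^{n} \dot{q}_i(t)^2}\; dt
  \quad \text{subject to} \quad g_k\bigl(q(t)\bigr) = 0, \qquad k = 1, \dots, m, \\
  F(q, \dot{q}, \lambda) &= \sqrt{\sum_{i=1}^{n} \dot{q}_i^2}
  + \sum_{k=1}^{m} \lambda_k(t)\, g_k(q), \\
  \frac{d}{dt}\frac{\partial F}{\partial \dot{q}_i} - \frac{\partial F}{\partial q_i} = 0
  \;&\Longrightarrow\;
  \frac{d}{dt}\!\left(\frac{\dot{q}_i}{\lVert \dot{q} \rVert}\right)
  - \sum_{k=1}^{m} \lambda_k(t)\, \frac{\partial g_k}{\partial q_i} = 0,
  \qquad i = 1, \dots, n.
\end{align}
```

With no active constraints ($\lambda_k = 0$), the last equation says that the unit tangent $\dot{q}/\lVert \dot{q} \rVert$ is constant, so the path is a straight line in the internal space. The constraints bend the path away from that straight line.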

[102]  Yohan Payan,et al.  Modeling the production of VCV sequences via the inversion of a biomechanical model of the tongue , 2006, INTERSPEECH.