Mechanisms of Speech Recognition

Author: W. A. Ainsworth
Publisher: Elsevier
Total Pages: 153
Release: 2014-05-18
Genre: Medical
ISBN: 1483137929

Mechanisms of Speech Recognition explores the mechanisms underlying speech recognition. Topics covered include the auditory system, speech production, auditory psychophysics, speech synthesis and analysis, vowel and consonant recognition, and perception of prosodic features and of distorted speech. Automatic speech recognition and models of speech recognition are also given consideration. This volume consists of 11 chapters and begins with an overview of speech recognition, communication, and production. More specifically, it examines the way in which the organs of the vocal apparatus are employed to transform a message consisting of a string of linguistic units, such as words or phonemes, into a wave of continuous sounds which are recognized as speech. The auditory system and its parts are then described, from the ears to the organ of Corti and nerve cells. The chapters that follow focus on the behavior of the hearing system, the various techniques of analyzing speech sounds, and speech synthesizers such as vocoders. The mechanisms underlying the recognition of vowels and consonants are also described, along with the physical parameters of the speech wave which signal the prosody of an utterance, the effects of distortions in the speech wave on speech perception, and tools used in automatic speech recognition. The book concludes with an evaluation of models of speech recognition. This book will be of interest to phoneticians, linguists, physiologists, psychologists, and physicists.

Mechanisms of Speech Recognition

Author: William Anthony Ainsworth
Publisher: Pergamon
Total Pages: 139
Release: 1976-01-01
Genre: Auditory perception
ISBN: 9780080203942

Describes the acoustics of speech production and the mechanisms of the ear. Introduces psychological techniques to show the sensitivity and limits of hearing. Describes methods of analysing and synthesizing speech sounds. Gives an introduction to the machine recognition of speech.

Speech Production and Speech Modelling

Author: W.J. Hardcastle
Publisher: Springer Science & Business Media
Total Pages: 454
Release: 2012-12-06
Genre: Language Arts & Disciplines
ISBN: 9400920377

Speech sound production is one of the most complex human activities; it is also one of the least well understood. This is perhaps not altogether surprising, as many of the complex neurological and physiological processes involved in the generation and execution of a speech utterance remain relatively inaccessible to direct investigation and must be inferred from careful scrutiny of the output of the system, that is, from details of the movements of the speech organs themselves and the acoustic consequences of such movements. Such investigations of the speech output have received considerable impetus during the last decade from major technological advancements in computer science and biological transducing, making it possible now to obtain large quantities of quantitative data on many aspects of speech articulation and acoustics relatively easily. Keeping pace with these advancements in laboratory techniques have been developments in theoretical modelling of the speech production process. There is now a wide variety of different models available, reflecting the different disciplines involved: linguistics, speech science and technology, engineering and acoustics. The time seems ripe to attempt a synthesis of these different models and theories and thus provide a common forum for discussion of the complex problem of speech production. Such an activity would seem particularly timely also for those colleagues in speech technology seeking better, more accurate phonetic models as components in their speech synthesis and automatic speech recognition systems.

The Speech Chain

Author: Peter B. Denes
Publisher: Waveland Press
Total Pages: 256
Release: 2015-07-10
Genre: Education
ISBN: 1478631074

Speech is usually taken for granted, and its fundamental importance is often overlooked. Communication by speech sets humans apart from other animals: it facilitates our ability to think abstractly, it allows us to coordinate our efforts with one another, and it contributes significantly to the development of human societies. Spoken communication is an extremely intricate process. A complex chain of events links speaker to listener, a chain that involves not only physics and acoustics, but also anatomy, physiology, linguistics, and psychology. The Speech Chain explains simply and clearly the basic mechanisms involved in spoken communication, from the speaker’s production of words, to the transmission of sound, to the listener’s perception of what has been said. The Speech Chain has long been well known as an easy-to-read introduction to the fundamentals of spoken communication. The book has now been thoroughly revised and updated to give a state-of-the-art description of each link in the speech chain. Included are new chapters on the digital processing of speech and on the use of computers for the generation of synthetic speech and for automatic speech recognition. Professionals, teachers, students, and others interested in how we communicate with one another will find The Speech Chain a useful introduction to this uniquely human capability. This interdisciplinary account is also accessible to persons with no previous knowledge of the fields involved.

Dynamics of Speech Production and Perception

Author: P.L. Divenyi
Publisher: IOS Press
Total Pages: 388
Release: 2006-09-20
Genre: Language Arts & Disciplines
ISBN: 1607502038

The idea that speech is a dynamic process is a tautology: whether from the standpoint of the talker, the listener, or the engineer, speech is an action, a sound, or a signal continuously changing in time. Yet, because phonetics and speech science are offspring of classical phonology, speech has been viewed as a sequence of discrete events: positions of the articulatory apparatus, waveform segments, and phonemes. Although this perspective has been mockingly referred to as "beads on a string", from the time of Henry Sweet's 19th-century treatise almost to the present day, specialists in speech science and speech technology have continued to conceptualize the speech signal as a sequence of static states interleaved with transitional elements reflecting the quasi-continuous nature of vocal production. This book, a collection of papers each of which looks at speech as a dynamic process and highlights one of its particularities, is dedicated to the memory of Ludmilla Andreevna Chistovich. At the outset, it was planned to be a Chistovich festschrift but, sadly, she passed away a few months before the book went to press. The 24 chapters of this volume testify to the enormous influence that she and her colleagues have had over the four decades since the publication of their 1965 monograph.

Speech Processing in the Auditory System

Author: Steven Greenberg
Publisher: Springer Science & Business Media
Total Pages: 487
Release: 2006-05-09
Genre: Science
ISBN: 0387215751

Although speech is the primary behavioral medium by which humans communicate, its auditory basis is poorly understood, a gap that has profound implications for efforts to ameliorate the behavioral consequences of hearing impairment and for the development of robust algorithms for computer speech recognition. In this volume, the authors provide an up-to-date synthesis of recent research in the area of speech processing in the auditory system, bringing together a diverse range of scientists to present the subject from an interdisciplinary perspective. Of particular concern is the ability to understand speech in uncertain, potentially adverse acoustic environments, currently the bane of both hearing aid and speech recognition technology. There is increasing evidence that the perceptual stability characteristic of speech understanding is due, at least in part, to elegant transformations of the acoustic signal performed by auditory mechanisms. As a comprehensive review of speech's auditory basis, this book will interest physiologists, anatomists, psychologists, phoneticians, computer scientists, biomedical and electrical engineers, and clinicians.

Mechanisms of Speech Recognition

Author: William Anthony Ainsworth
Publisher: Pergamon
Total Pages: 160
Release: 1976
Genre: Auditory perception
ISBN: 9780080203959

Describes the acoustics of speech production and the mechanisms of the ear. Introduces psychological techniques to show the sensitivity and limits of hearing. Describes methods of analysing and synthesizing speech sounds. Gives an introduction to the machine recognition of speech.

Introduction to Digital Speech Processing

Author: Lawrence R. Rabiner
Publisher: Now Publishers Inc
Total Pages: 212
Release: 2007
Genre: Computers
ISBN: 1601980701

Provides the reader with a practical introduction to the wide range of important concepts that comprise the field of digital speech processing. Students of speech research and researchers working in the field can use this as a reference guide.

Dynamic Speech Models

Author: Li Deng
Publisher: Springer Nature
Total Pages: 105
Release: 2022-05-31
Genre: Technology & Engineering
ISBN: 3031025555

Speech dynamics refer to the temporal characteristics in all stages of the human speech communication process. This speech “chain” starts with the formation of a linguistic message in a speaker's brain and ends with the arrival of the message in a listener's brain. Given the intricacy of the dynamic speech process and its fundamental importance in human communication, this monograph is intended to provide comprehensive material on mathematical models of speech dynamics and to address the following issues: How do we make sense of the complex speech process in terms of its functional role of speech communication? How do we quantify the special role of speech timing? How do the dynamics relate to the variability of speech that has often been said to seriously hamper automatic speech recognition? How do we put the dynamic process of speech into a quantitative form to enable detailed analyses? And finally, how can we incorporate the knowledge of speech dynamics into computerized speech analysis and recognition algorithms? The answers to all these questions require building and applying computational models for the dynamic speech process. What are the compelling reasons for carrying out dynamic speech modeling? We provide the answer in two related aspects. First, scientific inquiry into the human speech code has been relentlessly pursued for several decades. As an essential carrier of human intelligence and knowledge, speech is the most natural form of human communication. Embedded in the speech code are linguistic (as well as para-linguistic) messages, which are conveyed through four levels of the speech chain. Underlying the robust encoding and transmission of the linguistic messages are the speech dynamics at all four levels. Mathematical modeling of speech dynamics provides an effective tool in the scientific methods of studying the speech chain.
Such scientific studies help us understand why humans speak as they do and how humans exploit redundancy and variability by way of multitiered dynamic processes to enhance the efficiency and effectiveness of human speech communication. Second, advancement of human language technology, especially in automatic recognition of natural-style human speech, is also expected to benefit from comprehensive computational modeling of speech dynamics. The limitations of current speech recognition technology are serious and well known. A commonly acknowledged and frequently discussed weakness of the statistical model underlying current speech recognition technology is the lack of adequate dynamic modeling schemes to provide correlation structure across the temporal speech observation sequence. Unfortunately, for a variety of reasons, the majority of current research activities in this area favor only incremental modifications and improvements to the existing HMM-based state of the art. For example, while dynamic and correlation modeling is known to be an important topic, most systems nevertheless employ only an ultra-weak form of speech dynamics, e.g., differential or delta parameters. Strong-form dynamic speech modeling, which is the focus of this monograph, may serve as an ultimate solution to this problem. After the introductory chapter, the main body of this monograph consists of four chapters. They cover various aspects of theory, algorithms, and applications of dynamic speech models, and provide a comprehensive survey of the research work in this area spanning the past 20 years. This monograph is intended as advanced material on speech and signal processing for graduate-level teaching, for professionals and engineering practitioners, and for seasoned researchers and engineers specializing in speech processing.