Development of Digital Signal Processing and Statistical Classification Methods for Distinguishing Nasal Consonants

Release: 2003

For almost half a century, researchers have sought efficient classifiers to distinguish two nasal sounds, / / from / /, uttered by a single speaker. Since the middle of the last decade, research on this topic has made little progress. In recent years, we, researchers of the Voice I/O Group in the Department of Computer Science at North Carolina State University, have conducted new trials on this classical problem, and this thesis briefly summarizes them. Instead of relying only on the Fourier transform to produce spectra, as was usual in the past, we use other kinds of transforms to extract more feature differences between / / and / /. The new features can be alternatives to frequencies, such as singular values or eigenvalues, or can come from other transforms such as wavelets, which deal with non-stationary systems quite well. We combine the old and new features into a larger feature vector that carries more classification information. We collect multiple voice samples from a single speaker, calculate these feature representations, and use them as input to popular statistical classification techniques such as Principal Component Analysis (PCA), Discriminant Analysis (DA), and Support Vector Machines (SVM). Using one training process, one testing process, and one heuristic scheme, we can identify the nasals with low error rates.
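The kind of pipeline this abstract describes can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's method: the frames are synthetic two-tone signals rather than recorded nasals, the sample rate, frame length, tone frequencies, and all function names are assumptions, the singular values are taken from a trajectory (Hankel) matrix as one plausible choice, the wavelet is a plain Haar transform, and a nearest-centroid rule stands in for the DA and SVM classifiers named in the abstract.

```python
import numpy as np

FS = 8000    # sample rate in Hz (assumed)
FRAME = 256  # samples per frame (assumed)

def spectral_features(x, n_bands=16):
    """'Old' features: log of the mean Fourier magnitude in n_bands bands."""
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.log(np.array([b.mean() for b in np.array_split(mag, n_bands)]) + 1e-12)

def singular_value_features(x, window=32, k=8):
    """Top-k singular values of the frame's trajectory (Hankel) matrix."""
    rows = len(x) - window + 1
    idx = np.arange(rows)[:, None] + np.arange(window)[None, :]
    return np.linalg.svd(x[idx], compute_uv=False)[:k]

def haar_energy_features(x, levels=4):
    """Log energy of the Haar wavelet detail coefficients at each level."""
    approx, feats = x.astype(float), []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2)
        approx = (even + odd) / np.sqrt(2)
        feats.append(np.log(np.mean(detail ** 2) + 1e-12))
    return np.array(feats)

def feature_vector(x):
    """Combine old and new features into one larger vector."""
    return np.concatenate([spectral_features(x),
                           singular_value_features(x),
                           haar_energy_features(x)])

def synth_frame(rng, f2):
    """Synthetic stand-in for one frame: two tones plus noise (not real speech)."""
    t = np.arange(FRAME) / FS
    p1, p2 = rng.uniform(0, 2 * np.pi, 2)
    return (np.sin(2 * np.pi * 300 * t + p1)
            + np.sin(2 * np.pi * f2 * t + p2)
            + 0.3 * rng.standard_normal(FRAME))

def make_set(rng, n_per_class):
    """Class 0 uses a 1000 Hz second tone, class 1 a 1400 Hz one (both hypothetical)."""
    X = np.array([feature_vector(synth_frame(rng, f2))
                  for f2 in (1000, 1400) for _ in range(n_per_class)])
    return X, np.repeat([0, 1], n_per_class)

rng = np.random.default_rng(0)
X_train, y_train = make_set(rng, 40)  # training process
X_test, y_test = make_set(rng, 40)    # testing process

# Standardize with training statistics so no feature group dominates the PCA.
mu, sd = X_train.mean(axis=0), X_train.std(axis=0) + 1e-12
Z_train, Z_test = (X_train - mu) / sd, (X_test - mu) / sd

# PCA via SVD of the (already centered) training matrix; keep 5 components.
_, _, Vt = np.linalg.svd(Z_train, full_matrices=False)
W = Vt[:5].T
P_train, P_test = Z_train @ W, Z_test @ W

# Nearest-centroid classification in the PCA subspace.
centroids = np.array([P_train[y_train == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(P_test[:, None, :] - centroids[None, :, :], axis=2)
y_pred = dists.argmin(axis=1)
error_rate = float(np.mean(y_pred != y_test))
print(f"test error rate: {error_rate:.3f}")
```

The features are standardized before PCA because the singular values and the log energies live on very different scales; without that step the principal components would mostly track the largest-magnitude feature group rather than the class difference.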

Introduction to Digital Speech Processing

Author: Lawrence R. Rabiner
Publisher: Now Publishers Inc
Total Pages: 212
Release: 2007
Genre: Computers
ISBN: 1601980701

Provides the reader with a practical introduction to the wide range of important concepts that comprise the field of digital speech processing. Students of speech research and researchers working in the field can use this as a reference guide.

Speech and Audio Signal Processing

Author: Ben Gold
Publisher: John Wiley & Sons
Total Pages: 684
Release: 2011-08-23
Genre: Technology & Engineering
ISBN: 0470195363

When Speech and Audio Signal Processing was published in 1999, it stood out from its competition in its breadth of coverage and its accessible, intuition-based style. The book was aimed at individual students and engineers excited about the broad span of audio processing and curious to understand the available techniques. Since then, with the advent of the iPod in 2001, the field of digital audio and music has exploded, leading to a much greater interest in the technical aspects of audio processing. This Second Edition updates and revises the original book, augmenting it with new material on both the enabling technologies of digital music distribution (most significantly the MP3) and a range of exciting new research areas in automatic music content processing (such as automatic transcription and music similarity) that have emerged in the past five years, driven by the digital music revolution. New chapter topics include:
* Psychoacoustic Audio Coding, describing MP3 and related audio coding schemes based on psychoacoustic masking of quantization noise.
* Music Transcription, including automatically deriving notes, beats, and chords from music signals.
* Music Information Retrieval, primarily focusing on audio-based genre classification, artist/style identification, and similarity estimation.
* Audio Source Separation, including multi-microphone beamforming, blind source separation, and the perception-inspired techniques usually referred to as Computational Auditory Scene Analysis (CASA).

Practical Digital Signal Processing

Author: Edmund Lai
Publisher: Elsevier
Total Pages: 299
Release: 2003-10-21
Genre: Technology & Engineering
ISBN: 0080473849

The aim of this book is to introduce the general area of Digital Signal Processing from a practical point of view with a working minimum of mathematics. The emphasis is placed on the practical applications of DSP: implementation issues, tricks and pitfalls. Intuitive explanations and appropriate examples are used to develop a fundamental understanding of DSP theory, laying a firm foundation for the reader to pursue the matter further. The reader will develop a clear understanding of DSP technology in a variety of fields from process control to communications.
* Covers the use of DSP in different engineering sectors, from communications to process control
* Ideal for a wide audience wanting to take advantage of the strong movement towards digital signal processing techniques in the engineering world
* Includes numerous practical exercises and diagrams covering many of the fundamental aspects of digital signal processing

Machine Learning for Audio, Image and Video Analysis

Author: Francesco Camastra
Publisher: Springer
Total Pages: 564
Release: 2015-07-21
Genre: Computers
ISBN: 144716735X

This second edition focuses on audio, image and video data, the three main types of input that machines deal with when interacting with the real world. A set of appendices provides the reader with self-contained introductions to the mathematical background necessary to read the book. The book is divided into three main parts. The first, From Perception to Computation, introduces methodologies aimed at representing the data in forms suitable for computer processing, especially when it comes to audio and images. The second part, Machine Learning, includes an extensive overview of statistical techniques aimed at addressing three main problems: classification (automatically assigning a data sample to one of the classes belonging to a predefined set), clustering (automatically grouping data samples according to the similarity of their properties) and sequence analysis (automatically mapping a sequence of observations into a sequence of human-understandable symbols). The third part, Applications, shows how the abstract problems defined in the second part underlie technologies capable of performing complex tasks such as the recognition of hand gestures or the transcription of handwritten data. Machine Learning for Audio, Image and Video Analysis is suitable for students who want to acquire a solid background in machine learning as well as for practitioners who want to deepen their knowledge of the state of the art. All application chapters are based on publicly available data and free software packages, thus allowing readers to replicate the experiments.

Speech, Audio, Image and Biomedical Signal Processing using Neural Networks

Author: Bhanu Prasad
Publisher: Springer Science & Business Media
Total Pages: 419
Release: 2008-01-03
Genre: Computers
ISBN: 3540753974

Humans are remarkable at processing speech, audio, image and some biomedical signals. Artificial neural networks have proved successful in performing several cognitive, industrial and scientific tasks. This peer-reviewed book presents some recent advances and surveys on the applications of artificial neural networks in the areas of speech, audio, image and biomedical signal processing. Its chapters are prepared by reputed researchers and practitioners from around the globe.