Speech Synthesis and Speech Recognition

Master in Computer Science
Specialization: Methods and Models in Artificial Intelligence
SYLLABUS

Academic year 2011-2012




Code: MMIA214
Course: Prof. Dr. N. Tandareanu
Cycle 2, Year II
Sem. 1: Course: 28h, Lab: 28h
Credits: 8
Profile: computer science
Type: optional
Objectives:
  • Assimilation of the concepts involved in voice recognition and voice synthesis
  • Familiarization with the implementation of voice applications in Java
I. Apache Ant.
  1. Generalities about this product
  2. Installing Apache Ant
  3. Projects, properties, tags
  4. Build files. Example.
II. Voice interfaces
  1. Voice applications
  2. The use of voice applications
  3. Designing voice applications
  4. Voice technology
  5. Speech synthesis
III. Speech synthesis with the Java Speech API
  1. What is JSAPI?
  2. Speech engine, properties
  3. The states of a speech engine
  4. Locating, Selecting and Creating Engines
  5. Speech Events
  6. The synthesiser as an engine
  7. Speech Synthesis: javax.speech.synthesis
  8. Sending text to be spoken
IV. Voice recognition
  1. Generalities
  2. Architecture of Sphinx
  3. FrontEnd module
  4. Linguist module
  5. Recognizers. Selecting a recognizer.
V. Java Speech Grammar Format
  1. Introduction
  2. Definitions
    2.1 Grammar Names and Package Names
    2.2 Rulenames
    2.3 Tokens
    2.4 Comments
  3. Grammar Header
    3.1 Self-Identifying Header
    3.2 Grammar Name Declaration
    3.3 Import
  4. Grammar Body
    4.1 Rule Definitions
    4.2 Rule Expansions
    4.3 Composition
    4.4 Grouping
    4.5 Unary Operators
    4.6 Tags
    4.7 Precedence
    4.8 Recursion
    4.9 Uses of NULL and VOID
  5. Examples
Bibliography:
  1. Java Speech Grammar Format Specification - JSGF documentation
  2. The CMU-Cambridge Statistical Language Modeling Toolkit v2,
    http://svr-www.eng.cam.ac.uk/~prc14/toolkit_documentation.html
  3. Willie Walker, Paul Lamere, Philip Kwok, Bhiksha Raj, Rita Singh, Evandro Gouvea, Peter Wolf, Joe Woelfel - Sphinx-4: A Flexible Open Source Framework for Speech Recognition, SMLI TR2004-0811, © 2004 Sun Microsystems Inc.
Practical works
Documentation
Speech synthesis (Course Notes)
Speech recognition (Course Notes)
Apache Ant
FreeTTS
Sphinx 4-1.0
JSAPI.html

Last update: Sept. 2010