Speech and Image Technology

Higher education teachers: Mihelič France
Credits: 12
Semester: summer
Subject code: 64154

Subject description


Prerequisites:

  • enrolment in the 3rd year of study

Content (Syllabus outline):

  • Introduction: description of the field, short outline of the historical development of speech and image technologies.
  • Basic characteristics of visual and auditory perception and human speech-based communication. Representation of speech and image patterns.
  • Pattern recognition: structural description, pattern recognition systems in general, feature extraction, learning, classification and clustering in pattern recognition systems.
  • Speech processing: acquisition and preprocessing, speech features, speech signal segmentation, databases of speech.
  • Speech recognition: types of speech-recognition systems, statistical modelling, acoustic and language modelling, semantic analysis of speech.
  • Artificial speech: systems for speech synthesis in general, grapheme-to-phoneme conversion, prosody modelling, speech-synthesis procedures.
  • Dialogue: automated dialogue systems in general, approaches to designing human-computer dialogue systems, assessment of dialogue systems.
  • Image technologies: terminology, use-cases, basic image transformations, color images and color spaces, image coding.
  • Image processing: image processing in the spatial and frequency domains, noise models and image restoration, morphological operations and algorithms, edge detection.
  • Advanced algorithms, local descriptors and their applications, object detection in images, object recognition from image data, subspaces for data representation.
  • Image segmentation: clustering techniques and their application to image segmentation, mean shift.

Objectives and competences:

The aim of this course is to acquaint students with the field of speech and image technologies and to introduce the algorithms, techniques, and methods used to accomplish tasks in this field.

Intended learning outcomes:

Knowledge about the representation, description, synthesis and recognition of speech and image signals. Understanding the complexity and interdisciplinarity of the field of speech and image technologies. Knowledge and understanding of the structure and capabilities of speech- and image-based technologies.

Learning and teaching methods:

  • lectures,
  • interactive teaching,
  • practical assignments.

Study materials

  1. Rabiner L., Schafer R., Theory and Applications of Digital Speech Processing, 1st ed., Prentice Hall, 2010
  2. Gonzalez R. C., Woods R. E., Digital Image Processing, 3rd ed., Prentice Hall, 2007
  3. Gonzalez R. C., Woods R. E., Eddins S. L., Digital Image Processing Using MATLAB, 2nd ed., Gatesmark Publishing, 2009

Study in which the course is carried out

  • 2nd year - 1st cycle - Multimedia
  • 3rd year - 1st cycle - Multimedia