Guest Lecture about Perceptually Motivated Audio Loss Functions for Deep Learning

In audio and speech coding, perceptually weighted error functions are commonly used. In audio coding, a perceptual model controls the quantization step size of the subband signals; in speech coding, a perceptual error weighting filter plays this role. In deep learning for audio signals, a similar principle can be applied to the loss function used to train a network. This talk gives an overview of existing approaches and an outlook on possible future directions, specifically a psychoacoustic loss function based on a psychoacoustic model from audio coding.

Gerald Schuller’s bio:
Gerald Schuller has been a full professor at the Institute for Media Technology of the Technical University of Ilmenau since 2008. He was head of the Audio Coding for Special Applications group of the Fraunhofer Institute for Digital Media Technology in Ilmenau, Germany, from January 2002 until 2008, and is now a member of Fraunhofer IDMT.
Before joining the Fraunhofer Institute, he was a Member of Technical Staff at Bell Laboratories, Lucent Technologies, and Agere Systems, a Lucent spin-off, from 1998 to 2001. There he worked in the Multimedia Communications Research Laboratory.
He received his Diplom degree in Electrical Engineering from the Technical University of Berlin in 1989 and his Ph.D. (Dr.-Ing.) degree from the University of Hanover in 1997. He also studied at the Massachusetts Institute of Technology in 1989/90 and at the Georgia Institute of Technology in 1993.
He was Associate Editor of the IEEE Transactions on Speech and Audio Processing from 2002 until 2006, of the IEEE Transactions on Signal Processing from 2006 to 2009, and of the IEEE Transactions on Multimedia from 2008 to 2011. He is a recipient of the 2006 IEEE Best Paper Award in the Audio and Electroacoustics Area.
His research interests are in filter banks, audio coding, music signal processing, and deep learning for multimedia. He is probably best known for his work on low-delay filter banks, which became part of the MPEG-4 AAC-ELD audio coding standard. That standard is now part of the iOS and Android operating systems and is used, for instance, in the FaceTime application.

Admission is free and open to the public.

___________________________________

Visit the School of Music's website to view the schedule of concerts and events for this semester.

Follow the School of Music on Facebook and Instagram to keep up with our news and events.

Details

Tuesday, Mar 26, 2024, 11:00am - 12:00pm

Location: West Village 275

Status

Workflow status: Published
Created by: tma98
Created: 03/13/2024
Modified By: tma98
Modified: 03/13/2024

Georgia Institute of Technology
