Title: Source separation
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Uhle C, Driedger J, Edler B, Ewert S, Graf F, Kubin G, Müller M, Ono N, Pardo B, Serrà J
Editor: S. Müller NM, Schuller B.
Conference Name: Dagstuhl Seminar 13451: Computational Audio Analysis
Conference Location: Wadern, Germany
Date Published: 04/11/2013

Our group attracted participants from various research areas to discuss aspects of source separation for audio signals, performed either blindly or by using additional knowledge about the underlying sources or the mixing process. Many source separation approaches assume that the sources are independent, uncorrelated, and do not overlap in a given representation. One also often presupposes that the mixing process is linear and time-invariant. In practice, however, these assumptions are often violated. In addition, sound sources may influence or interact with each other, so that the separated source signals may sound unnatural or different from how they sound in isolation. Examples are the coupling between piano strings and the Lombard effect, which describes how a speaker adapts to noisy environments. Further fundamental problems in source separation are the unmasking of undesired sounds (e.g., FM noise or audio coding artifacts), shortcomings of objective evaluation metrics, and the sound quality of the separated signals (e.g., due to the phase reconstruction problem). Last but not least, even the definition of a source is ambiguous: a source can be a physical entity that emits sound, an object or event that is perceived by a human listener (a stream), or a musical voice in a polyphonic sound mixture. Various applications motivate ongoing research in source separation, including remixing and upmixing, karaoke applications, speech enhancement for hearing aids and communication, dialogue enhancement, audio editing, and audio content analysis. Beyond these applications, source separation is a fascinating, intellectual, and interdisciplinary research area that both requires and provides a deep understanding of the underlying audio material, in aspects ranging from physical processes to cognition.