Most music signals in use today are digitally recorded, stored and processed during production. The digitally available music signal is very often a superposition of several separate signals, each dedicated to an isolated component such as an instrument or another sound source. These individual signals, as well as the mixed signal, can be digitally post-processed as often as desired in order to improve individual musical elements or to modify recorded musical passages.
However, if only the mixed signal is available, targeted post-processing or extraction of the signal of individual components can only be carried out in exceptional cases, because targeted post-processing requires separating the mixed signal into the partial signal to be changed and the unchanged residual signal.
The separation of the mixed signal into its individual components, shown schematically in the following sketch for the case of an orchestra, remains a great challenge. This is especially true for monaural music signals, which are recorded via only one microphone and therefore do not contain any additional information such as the location of the sound sources.
This research project therefore focuses on the separation of monaural music signals for the polyphonic case, i.e. music signals with several individual sources. In the separation of musical elements, the challenge lies in the superposition of the spectra of simultaneously occurring elements, which is typical for polyphonic music signals. In most cases, a suitable time-frequency representation of the mixed signal, such as the short-time Fourier transform, is the basis for the separation process. The methods to be developed should provide the separated partial signals of the individual components in such a way that targeted post-processing can be easily implemented.
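To illustrate what a time-frequency representation offers as a starting point, the following is a minimal sketch and not the project's method. It computes a short-time Fourier transform with NumPy and splits a synthetic two-tone mixture with a fixed binary frequency mask. The sample rate, frame length, hop size, tone frequencies and the 1000 Hz cutoff are all assumptions chosen for the example. The two tones deliberately do not overlap in frequency; in real polyphonic music the spectra do overlap, which is exactly why such a simple mask is not sufficient there.

```python
import numpy as np

def stft(x, frame_len=1024, hop=256):
    """Short-time Fourier transform with a Hann window.

    Returns a complex array of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

# Synthetic monaural "mixture" of two sources (assumed values).
sr = 8000                                   # sample rate in Hz
t = np.arange(sr) / sr                      # one second of audio
low = np.sin(2 * np.pi * 220 * t)           # source 1: 220 Hz tone
high = 0.5 * np.sin(2 * np.pi * 1760 * t)   # source 2: 1760 Hz tone
mix = low + high

# Time-frequency representation of the mixture.
X = stft(mix)
freqs = np.fft.rfftfreq(1024, d=1 / sr)     # centre frequency of each bin

# Hypothetical binary mask: bins below 1000 Hz go to source 1.
mask = freqs < 1000
X_low = X * mask
X_high = X * ~mask

# Dominant frequency of each separated spectrogram.
peak_low = freqs[np.argmax(np.abs(X_low).sum(axis=0))]
peak_high = freqs[np.argmax(np.abs(X_high).sum(axis=0))]
print(peak_low, peak_high)
```

Because the two masks are complementary, the masked spectrograms add up to the spectrogram of the mixture again, so one part can be modified while the other stays as the unchanged residual.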