[Home] [Publications] [Research] [Teaching] [Short Bio] [Demo & Data] [Codes]


Free Software Toolboxes



Sparse Representation & Dictionary Learning Algorithms with Applications in Denoising, Separation, Localisation and Tracking


SimCO: sparse synthesis model based dictionary learning algorithms and their applications in image denoising

The Matlab codes and details can be found here.

Key references:


Analysis SimCO: sparse analysis model based dictionary learning algorithms and their applications in image denoising

The Matlab codes can be found here.

Key references:


Sparse analysis model based multiplicative noise removal

The Matlab codes can be found on Jing Dong's GitHub page here.

Key references:


Consistent dictionary learning for audio declipping

The Matlab codes can be found on Lucas Rencker's GitHub page or personal page.

Key references:


Audio-visual dictionary learning and its application to multimodal source separation

Download the Matlab codes: slim version (5.7 MB, including core codes and a test example using synthetic data) or full version (1.3 GB, including core codes, test examples, and the comprehensive test results shown in the paper).

Key references:



Deep Learning Algorithms with Applications in Classification, Detection, Tagging, and Separation


Deep learning baselines for DCASE challenge 2016

The Python codes implementing the deep learning baselines can be downloaded from Qiuqiang Kong's GitHub page:

  • Python codes for task 1: acoustic scene classification (more about this task can be found here).

  • Python codes for task 2: event detection in synthetic audio (more about this task can be found here).

  • Python codes for task 3: event detection in real-life audio (more about this task can be found here).

  • Python codes for task 4: audio tagging (more about this task can be found here).

Key references: Q. Kong, I. Sobieraj, W. Wang and M. D. Plumbley, "Deep Neural Network Baseline for DCASE Challenge 2016," in DCASE2016 workshop. [PDF]


Hierarchical DNN for acoustic scene classification

The C/C++ codes can be found on Yong Xu's GitHub page here.

Key references:


Fully deep neural networks for audio tagging

The C/C++ codes can be found on Yong Xu's GitHub page here.

Key references:


Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

The Python codes can be found on Yong Xu's GitHub page here.

Key references:


Deep learning for stereo speech separation

The Matlab codes can be found here (to be added soon).

Key references: Y. Yu, W. Wang, and P. Han, "Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks", EURASIP Journal on Audio, Speech, and Music Processing, 2016:7, 18 pages, DOI 10.1186/s13636-016-0085-x, 2016. [PDF]



Particle Filtering, PHD Filtering, & Particle Flow Algorithms with Applications in Multimodal Fusion and Tracking


Adaptive particle filtering for audio-visual tracking of multiple speakers

The Matlab codes, demos, and details can be found here.

Key references: V. Kilic, M. Barnard, W. Wang, and J. Kittler, "Audio assisted robust visual tracking with adaptive particle filtering", IEEE Transactions on Multimedia, vol. 17, no. 2, pp. 186-200, 2015. [PDF]


PHD filtering, Mean-Shift PHD filtering, and Sparse Sampling MS-PHD filtering for audio-visual tracking of multiple speakers

The Matlab codes, demos, and details can be found here.

Key references: V. Kilic, M. Barnard, W. Wang, A. Hilton, and J. Kittler, "Mean-Shift and Sparse Sampling Based SMC-PHD Filtering for Audio Informed Visual Speaker Tracking", IEEE Transactions on Multimedia, vol. 18, no. 10, October 2016. [PDF]



Convolutive ICA, NMF, and Time-Frequency Masking with Applications in Blind Source Separation & Computational Auditory Scene Analysis


Underdetermined speech source separation based on sparse coding and dictionary learning

The Matlab codes and details can be found here.

Key references:


Convolutive speech source separation based on probabilistic time-frequency masking

The Matlab codes and details can be found here (to be added soon).

Key references:



Spatial Audio


Sparse L1-Optimal Multi-Loudspeaker Panning

The Matlab codes and details can be found here.

Key references:



Last updated on 14 March 2018
First created on 14 March 2018