The Python code implementing the deep learning baselines can be downloaded from Qiuqiang Kong's GitHub page:
Python code for Task 4: audio tagging (more about this task can be found here)
Key references: Q. Kong, I. Sobieraj, W. Wang and M. D. Plumbley, "Deep Neural Network Baseline for DCASE Challenge 2016," in DCASE 2016 Workshop. [PDF]
Hierarchical DNN for acoustic scene classification
Key references:
- Y. Xu, Q. Huang, W. Wang, M. D. Plumbley, "Hierarchical learning for DNN-based acoustic scene classification," in DCASE 2016 Workshop. [PDF]
Fully deep neural networks for audio tagging
Key references:
- Y. Xu, Q. Huang, W. Wang, P. J. B. Jackson and M. D. Plumbley, "Fully DNN-Based Multi-Label Regression for Audio Tagging," in DCASE 2016 Workshop. [PDF]
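The idea behind the fully DNN-based approach can be illustrated with a minimal sketch: a network whose output layer uses sigmoid units, so each audio tag gets an independent probability (multi-label regression rather than single-label classification). This is not the released code; the layer sizes, weights, and the `tag_probabilities` helper below are illustrative assumptions, shown as a plain NumPy forward pass.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tag_probabilities(features, W1, b1, W2, b2):
    """Forward pass of a tiny two-layer DNN: ReLU hidden layer, then a
    sigmoid output giving an independent probability for each tag.
    (Illustrative sketch only, not the released baseline code.)"""
    h = np.maximum(0.0, features @ W1 + b1)   # hidden-layer activations
    return sigmoid(h @ W2 + b2)               # per-tag probabilities in [0, 1]

# Hypothetical sizes: 40-dim acoustic features, 16 hidden units, 7 tags
rng = np.random.default_rng(0)
n_features, n_hidden, n_tags = 40, 16, 7
W1 = rng.standard_normal((n_features, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_tags)) * 0.1
b2 = np.zeros(n_tags)

x = rng.standard_normal(n_features)          # one frame of input features
p = tag_probabilities(x, W1, b1, W2, b2)     # probability per tag
active = p > 0.5                             # simple decision threshold
```

Because each output unit is an independent sigmoid, several tags can be active at once, which is what distinguishes multi-label tagging from ordinary softmax classification.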
Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging
Key references:
- Y. Xu, Q. Huang, W. Wang, P. Foster, S. Sigtia, P. J. B. Jackson, and M. D. Plumbley, "Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging," IEEE/ACM Transactions on Audio Speech and Language Processing, February 2017. [PDF] (in press)
Deep learning for stereo speech separation
Key references: Y. Yu, W. Wang, and P. Han, "Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks", EURASIP Journal on Audio Speech and Music Processing, 2016:7, 18 pages, DOI 10.1186/s13636-016-0085-x, 2016. [PDF]
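The time-frequency masking at the heart of this approach can be sketched in a few lines: build spectrograms of the mixture, then keep only the bins where the target source dominates. The sketch below uses an oracle (ideal binary) mask on synthetic tones purely for illustration; the `stft` helper and all signal parameters are assumptions, and the actual method estimates the mask from localization cues and a DNN rather than from the clean sources.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Minimal STFT: Hann-windowed frames -> complex spectrogram.
    (Illustrative helper, not the paper's implementation.)"""
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * win
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

t = np.arange(4096) / 16000.0
s1 = np.sin(2 * np.pi * 440 * t)     # stand-in for source 1
s2 = np.sin(2 * np.pi * 1500 * t)    # stand-in for source 2
X = stft(s1 + s2)                    # mixture spectrogram
S1, S2 = stft(s1), stft(s2)

mask = np.abs(S1) > np.abs(S2)       # ideal binary mask (oracle, for illustration)
S1_hat = mask * X                    # estimated source-1 spectrogram
```

In the actual system the binary decision per time-frequency bin comes from probabilistic localization cues, not from access to the clean sources as in this oracle demonstration.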
Particle Filtering, PHD Filtering, & Particle Flow Algorithms with Applications in Multimodal Fusion and Tracking
Adaptive particle filtering for audio-visual tracking of multiple speakers
Key references: V. Kilic, M. Barnard, W. Wang, and J. Kittler, "Audio assisted robust visual tracking with adaptive particle filtering", IEEE Transactions on Multimedia, vol. 17, no. 2, pp. 186-200, 2015. [PDF]
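The predict-weight-resample cycle that underlies particle filtering can be sketched for a toy 1-D tracking problem. This is a plain bootstrap particle filter, not the adaptive audio-visual filter of the paper; the state model, noise levels, and the `particle_filter_1d` helper are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def particle_filter_1d(observations, n_particles=2000,
                       proc_std=0.5, obs_std=1.0, seed=0):
    """Bootstrap particle filter for a 1-D random-walk state observed in
    Gaussian noise. (Toy sketch; the paper's adaptive audio-visual
    filter is considerably more involved.)"""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 5.0, n_particles)   # diffuse prior
    estimates = []
    for z in observations:
        particles = particles + rng.normal(0.0, proc_std, n_particles)  # predict
        w = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)             # likelihood
        w /= w.sum()
        estimates.append(np.sum(w * particles))                         # posterior mean
        idx = rng.choice(n_particles, n_particles, p=w)                 # resample
        particles = particles[idx]
    return np.array(estimates)

rng = np.random.default_rng(42)
true_state = np.cumsum(rng.normal(0.0, 0.5, 50))    # hidden random walk
obs = true_state + rng.normal(0.0, 1.0, 50)         # noisy observations
est = particle_filter_1d(obs)
```

Resampling after every update keeps the particle set concentrated where the posterior has mass; the adaptive variant in the paper additionally steers the proposal using audio cues.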
PHD filtering, Mean-Shift PHD filtering, and Sparse Sampling MS-PHD filtering for audio-visual tracking of multiple speakers
Key references: V. Kilic, M. Barnard, W. Wang, A. Hilton, and J. Kittler, "Mean-Shift and Sparse Sampling Based SMC-PHD Filtering for Audio Informed Visual Speaker Tracking", IEEE Transactions on Multimedia, vol. 18, no. 10, October 2016. [PDF]
Convolutive ICA, NMF, Time-Frequency Masking with applications in Blind Source Separation & Computational Auditory Scene Analysis
Underdetermined speech source separation based on sparse coding and dictionary learning
Key references:
- T. Xu, W. Wang, and W. Dai, "Sparse Coding with Adaptive Dictionary Learning for Underdetermined Blind Speech Separation", Speech Communication, vol. 55, no. 3, pp. 432-450, 2013. [PDF]
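The sparse-coding step used in this line of work can be illustrated with a generic iterative shrinkage-thresholding (ISTA) solver for the lasso problem: given a dictionary D, find coefficients that reconstruct the signal while staying sparse. The `ista` helper, dictionary sizes, and regularization value below are illustrative assumptions, not the paper's adaptive dictionary-learning algorithm.

```python
import numpy as np

def ista(x, D, lam=0.05, n_iter=200):
    """ISTA: minimise 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.
    (Generic sparse-coding sketch, not the paper's algorithm.)"""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)         # gradient of the quadratic term
        a = a - grad / L                 # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(3)
D = rng.standard_normal((32, 64))        # overcomplete dictionary: 64 atoms in R^32
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
a_true = np.zeros(64)
a_true[[5, 20]] = [1.0, -0.8]            # signal uses only two atoms
x = D @ a_true
a_hat = ista(x, D)
```

With only two active atoms the recovered code is sparse and the dominant coefficients fall on the correct atoms, which is what makes sparse codes useful for separating underdetermined mixtures.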
Convolutive speech source separation based on probabilistic time-frequency masking
Key references:
- A. Alinaghi, P. Jackson, Q. Liu, and W. Wang, "Joint Mixing Vector and Binaural Model Based Stereo Source Separation", IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 22, no. 9, pp. 1434-1448, 2014. [PDF]
Spatial Audio
Sparse L1-Optimal Multi-Loudspeaker Panning
Key references:
- A. Franck, W. Wang, F. M. Fazi, "Sparse, L_1-Optimal Multi-Loudspeaker Panning and its Relation to Vector Base Amplitude Panning", IEEE/ACM Transactions on Audio Speech and Language Processing, February 2017. [PDF]
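The vector base amplitude panning (VBAP) formulation that this paper relates to can be sketched for a single 2-D loudspeaker pair: the loudspeaker direction vectors form a base, and the gains are the coefficients that express the source direction in that base. The `vbap_pair_gains` helper and its unit-energy normalization are an illustrative assumption of the standard 2-D VBAP construction, not code from the paper.

```python
import numpy as np

def vbap_pair_gains(source_az_deg, spk_az_deg):
    """Gains g for a 2-D loudspeaker pair such that g1*l1 + g2*l2 points
    at the source direction (vector base amplitude panning), normalised
    to unit energy. (Textbook 2-D VBAP sketch.)"""
    src = np.array([np.cos(np.radians(source_az_deg)),
                    np.sin(np.radians(source_az_deg))])
    # Columns of L are the loudspeaker unit vectors (the "vector base")
    L = np.array([[np.cos(np.radians(a)), np.sin(np.radians(a))]
                  for a in spk_az_deg]).T
    g = np.linalg.solve(L, src)          # solve src = L @ g for the gains
    return g / np.linalg.norm(g)         # unit-energy normalisation

# Source straight ahead between a standard +/-30 degree stereo pair:
g_centre = vbap_pair_gains(0.0, [-30.0, 30.0])   # equal gains by symmetry
g_left = vbap_pair_gains(-30.0, [-30.0, 30.0])   # source on a loudspeaker
```

A centred source gets equal gains on both loudspeakers, and a source aligned with one loudspeaker gets all its energy there; the paper's sparse L1-optimal panning generalizes this pairwise construction to many loudspeakers.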
Last updated on 14 March 2018
First created on 14 March 2018