Research

[Home] [Publications] [Research] [Teaching] [Short Bio] [Demo & Data] [Codes]

Funded Projects (Grants)

I appreciate the financial support for my research from the following bodies (since 2008): Engineering and Physical Science Research Council (EPSRC), Ministry of Defence (MoD), Defence Science and Technology Laboratory (Dstl), Department of Defence (DoD), Home Office (HO), Royal Academy of Engineering (RAENG), European Commission (EC), National Natural Science Foundation of China (NSFC), Shenzhen Science and Technology Innovation Council (SSTIC) of China, the University Research Support Fund (URSF), and the Ohio State University (OSU), and UK/EU industries including BBC, Atlas, Kaon, Huawei, Tencent, NPL, and Samsung. [Total award to Surrey where I am a Principal Investigator (PI) or Co-Investigator (CI): £13M+ (as PI £2.2M, as CI £14M+). As PI/CI, on a total grant award portfolio: £30M+]

04/2022-04/2026, PI, "Uncertainty modelling and quantification for heterogeneous sensor/effector networks", SAAB (Surrey investigators: Wenwu Wang (PI), Pei Xiao (CI)) Project partner: SAAB.
04/2022-04/2026, PI, "Cooperative sensor fusion and management for distributed sensor swarms", SAAB (Surrey investigators: Wenwu Wang (PI), Pei Xiao (CI)) Project partner: SAAB.
01/2022-01/2026, CI, "Differentiable particle filters for data-driven sequential inference", EPSRC & NPL (iCASE). (Surrey investigators: Yunpeng Li (PI), Wenwu Wang (CI)) Project partner: National Physical Laboratory.
07/2021-07/2026, CI, "BBC Prosperity Partnership: AI for Future Personalised Media Experiences", EPSRC (Prosperity Partnership scheme). (Surrey investigators: Adrian Hilton (PI, project lead), CIs: Philip Jackson, Armin Mustafa, Jean-Yves Guillemaut, Marco Volina, and Wenwu Wang; project manager: Elizabeth James ) [The project is led by University of Surrey, jointly with BBC and University of Lancaster, with supports from 10+ industrial partners.] (project website)
07/2021-07/2024, CI, "Uncertainty Quantification for Robust AI through Optimal Transport", University of Surrey (project-based Doctoral College studentship competition). (Surrey investigators: Yunpeng Li (PI), CI: Wenwu Wang.
02/2021-02/2023, PI, "Automated Captioning of Image and Audio for Visually and Hearing Impaired", British Council (Newton Institutional Links Award). (Surrey investigators: Wenwu Wang) [The project is led by University of Surrey, jointly with Izmir Katip Celebi University (IKCU) @ Turkey (Volkan Kilic).]
04/2021-04/2024, CI, "Multimodal video search by examples", EPSRC (responsive mode). (Surrey investigators: PI: Josef Kittler, CIs: Miroslaw Bober, Wenwu Wang and Mark Plumbley) [The project is led by Ulster University (Hui Wang), jointly with Univ. of Surrey and Univ of Cambridge (Mark Gales).]
01/2021-10/2022, PI, "Acoustic surveillance", MoD (DASA call on countering drones). (Surrey investigators: Wenwu Wang) [The project is led by Airspeed Electronics.]
11/2020-11/2023, PI, "SIGNetS: signal and information gathering for networked surveillance", DoD & MoD (UDRC phase 3 call on the application theme Signal and Information Processing for Decentralized Intelligence, Surveillance, and Reconnaissance). (Surrey investigators: Wenwu Wang (PI) and Pei Xiao (CI)) [The project is a collaboration between University of Cambridge (Simon Godsill, project lead), University of Surrey, and University of Sheffield (Lyudmila Mihaylova).] (project website)
08/2020-08/2023, PI, "Particle flow PHD filtering for audio-visual multi-speaker speech tracking", Tencent (Rhino-Bird funding scheme). (Surrey investigator: Wenwu Wang) [Industry partner: Yong Xu @ Tencent AI Lab]
01/2020-10/2022, PI, "Deep embedding techniques for audio scene analysis and source separation", ASEM-DUO (Duo-India Professor Fellowship). [jointly with Dr Vipul Arora at Indian Institute of Technology (IIT) Kanpur] (Surrey investigator: Wenwu Wang).
03/2017-01/2023, CI, "Audio Visual Media Research", EPSRC (Platform grant). [Surrey investigators: PI: Adrian Hilton, CIs: Mark Plumbley, Josef Kittler, Wenwu Wang, John Collomosse, Philip Jackson, and Jean-Yves Guillemaut.]
------
08/2020-05/2021, PI, "Audio tagging for meta data generation of media for programme recommendation", EPSRC (impact acceleration account). (Surrey investigators: Wenwu Wang (PI) and Mark Plumbley (CI))
09/2018-03/2019, PI, "Array optimisation with sensor failure", EPSRC (impact acceleration account). [jointly with Kaon] [Surrey investigators: Wenwu Wang.]
02/2018-12/2018, PI, "Speech detection, separation and localisation with acoustic vector sensor", Huawei (HIRP). [Surrey investigators: Wenwu Wang.]
01/2017-01/2020, PI, "Improving the Robustness of UWAN Data Transmitting and Receiving Utilize Deep Learning and Statistical Model", NSFC (Youth Science Foundation). [Surrey investigators: Wenwu Wang.]
02/2016-02/2019, CI, "ACE-CReAte: Audio Commons", EC (Horizon 2020). [jointly wih Universitat Pompeu Fabra, Queen Mary University of London, Jamendo SA, AudioGaming, and Waves Audio Ltd.] [Surrey investigators: PI: Mark Plumbley, CIs: Wenwu Wang, Tim Brookes, and David Plans.] (project website)
02/2016-02/2019, PI, "Marine environment surveillance technology based on underwater acoustic signal processing", SSTIC ('international collaboration' call). [jointly wih Harbin Institute of Technology at Shenzhen] [Surrey investigators: Wenwu Wang.]
01/2016-01/2019, CI, "Making sense of sounds", EPSRC ('making sense from data' call). [jointly wih Salford University] [Surrey investigators: Mark Plumbley (PI), CIs: Wenwu Wang, Philip Jackson and David Frohlich.] (project website)
01/2015-01/2019, CI, "MacSeNet: machine sensing training network", EC (Horizon 2020, Marie Curie Actions - Innovative Training Network). [jointly with INRIA (France), University of Edinburgh (UK), Technical University of Muenchen (Germany), EPFL (Switzerland), Computer Technology Institute (Greece), Institute of Telecommunications (Portugal), Tampere University of Technology (Finland), Fraunhofer IDMT (Germany), Cedar Audio Ltd (Cambridge, UK), Audio Analytic (Cambridge, UK), VisioSafe SA (Switzerland), and Noiseless Imaging Oy (Finland)] [Surrey investigators: Mark Plumbley (PI) and Wenwu Wang (CI)] (project website)
10/2014-10/2018, CI, "SpaRTaN: Sparse representation and compressed sensing training network", EC (FP7, Marie Curie Actions - Initial Training Network). [jointly with University of Edinburgh (UK), EPFL (Switzerland), Institute of Telecommunications (Portugal), INRIA (France), VisioSafe SA (Switzerland), Noiseless Imaging Oy (Finland), Tampere University of Technology (Finland), Cedar Audio Ltd (Cambridge, UK), and Fraunhofer IDMT (Germany)] [Surrey investigators: Mark Plumbley (PI) and Wenwu Wang (CI).] (project website)
01/2014-01/2019, CI, "S3A: future spatial audio for an immersive listener experience at home", EPSRC (programme grant). [jointly with University of Southampton, University of Salford, and BBC.] [Surrey investigators: PI: Adrian Hilton, CIs: Philip Jackson, Wenwu Wang, Tim Brookes, and Russell Mason.] (project website)
04/2013-06/2018, PI, "Signal processing solutions for a networked battlespace", EPSRC and Dstl ('signal processing' call). [jointly with Loughborough University, University of Strathclyde, and Cardiff University.] [Surrey investigators: Wenwu Wang (PI), Josef Kittler (CI), and Philip Jackson (CI)] (project website)
09/2015-06/2016, PI, "Array processing exploiting sparsity for submarine hull mounted arrays", Atlas Electronik & MoD (MarCE scheme) [Surrey investigators: Wenwu Wang.]
03/2015-09/2015, PI, "Speech enhancement based on lip tracking", EPSRC (impact acceleration account). [jointly with SAMSUNG (UK)] [Surrey investigators: Wenwu Wang.]
10/2013-03/2014, PI, "Enhancing speech quality using lip tracking", SAMSUNG (industrial grant). [Surrey investigators: Wenwu Wang.]
12/2012-12/2013, PI, "Audio-visual cues based attention switching for machine listening", MILES and EPSRC (feasibility study). [jointly with School of Psychology and Department of Computing.] [Surrey investigators: PI: Wenwu Wang, CIs: Mandeep Dhami, Shujun Li, and Anthony Ho.]
11/2012-07/2013, PI, "Audio-visual blind source separation", NSFC (international collaboration scheme). [jointly with Nanchang University, China.] [Surrey investigators: Wenwu Wang.]
12/2011-03/2012, PI, "Enhancement of audio using video", HO (pathway to impact). [jointly with University of East Anglia.] [Surrey investigators: Wenwu Wang and Richard Bowden (CI).]
10/2010-10/2013, CI, "Audio and video based speech separation for multiple moving sources within a room environment", EPSRC (responsive mode). [jointly with Loughborough University.] [Surrey investigators: Josef Kittler (PI) and Wenwu Wang (CI).]
10/2009-10/2012, PI, "Multimodal blind source separation for robot audition", EPSRC and Dstl ('signal processing' call). [Surrey investigators: PI: Wenwu Wang, CIs: Josef Kittler and Philip Jackson.] (project website)
05/2008-06/2008, PI, "Convolutive non-negative sparse coding", RAENG (international travel grant).[Surrey investigators: Wang.]
02/2008-06/2008, PI, "Convolutive non-negative matrix factorization", URSF (small grant). [Surrey investigators: Wang.]
02/2008-03/2008, PI, "Computational audition", OSU (visiting scholarship). [Surrey investigators: Wang.] (Collaborator: Prof Deliang Wang)

Research Team

Postdoc Research Fellows

Dr Jianyuan Sun (09/2021 -): Deep learning for audio classfication and captioning

Dr Syed Ahmad Soleymani (01/2023 -): Sensor fusion with autonomous sensor management

------

Dr Shidrokh Goudarzi (09/2021 -01/2023): Q-learning for autonomous sensor management

Dr Saeid Safavi (02/2021-07/2022): Machine learning for audio detection and localization

Dr Gishantha Thantulage (09/2021 - 03/2022): Machine learning (Co-supervisor. Co-supervised with Prof Anil Fernando)

Dr Oluwatobi Baiyekusi (09/2021 - 03/2022): Deep learning for media content analysis (Co-supervisor. Co-supervised with Prof Anil Fernando)

Dr Tassadaq Hussain (03/2021 - 07/2021): Audio tagging for program recommendation (Main supervisor. Co-supervised with Prof Mark Plumbley)

Dr Lam Pham (08/2020 - 02/2021): Audio tagging for meta data generation for program recommendation (Main supervisor. Co-supervised with Prof Mark Plumbley)

Dr Yin Cao (09/2018 - 12/2020): Audio scene classification, event detection and audio tagging (Main supervisor. Co-supervised with Prof Mark Plumbley)

Dr Saeid Safavi (03/2018 - 06/2020): Machine learning for predicting perceptual reverberation (Main supervisor. Co-supervised with Prof Mark Plumbley)

Dr Mark Barnard (08/2018 - 03/2019): Array optimisation with sensor failure
Dr Qingju Liu (04/2014 - 05/2019): Source separation and objectification for future spatial audio (Primary supervisor. Co-supervised with Dr Philip Jackson and Prof Adrian Hilton)

Dr Cemre Zor (04/2013 - 02/2019): Statistical anomaly detection (Primary supervisor. Co-supervised with Prof Josef Kittler)

Dr Qiang Huang (04/2016 - 10/2018): Semantic Audio-Visual Processing and Interaction (Co-supervisor. Co-supervised with Dr Philip Jackson and Prof Mark Plumbley)

Dr Yong Xu (04/2016 - 05/2018): Machine Listening (Main supervisor. Co-supervised with Prof Mark Plumbley and Dr Philip Jackson)

Dr Viet Hung Tran (08/2017 - 06/2018): Acoustic source localisation and separation

Dr Mark Barnard (09/2014 - 08/2018): Underwater acoustic signal processing (major) / Visual tracking for future spatial audio (minor) (Main supervisor. Co-supervised with Prof Adrian Hilton and Dr Philip Jackson)

Dr Lu Ge (03/2015 - 09/2015): Audio-visual signal processing

Dr Swati Chandna (05/2013 - 11/2014): Bootstrapping for robust source separation (Primary supervisor. Co-supervised with Dr Philip Jackson)

Dr Mark Barnard (10/2010 - 12/2013): Audio-visual speech separation of multiple moving sources (Primary supervisor. Co-supervised with Prof Josef Kittler. External Collaborators: Prof Jonathon Chambers, Loughborough University; Dr Sangarapillai Lambotharan, Loughborough University; Prof Christian Jutten, Grenoble, France, and Dr Rivet Bertrand, Grenoble, France)

Dr Qingju Liu (01/2013 - 03/2014): Words spotting from noisy mixtures & Lip-tracking for voice enhancement

PhD Students

Peng Zhang: Audio-visual scene analysis (Primary supervisor. Co-supervised with Prof Philip Jackson, & collaborators Dr Pablo Martinez-Nuevo and Dr Sven Ewan Shepstone from B&O)
Yang Xiang: Direct and indirect generation of room acoustic embeddings (Co-supervisor. Co-supervised with Prof Philip Jackson)
Liting Gao: Multimodal signal processing for digital fishery (Primary supervisor. Co-supervised with Dr Jean-Yves Guillemaut)
Qingyu Luo: Intelligent audio rendering (Co-supervisor. Co-supervised with Prof Philip Jackson, & collaborators Dr Pablo Martinez-Nuevo and Dr Sven Ewan Shepstone from B&O)
Ozkan Cayli: Distributed sensing and learning (Primary supervisor. Co-supervised with Prof Pei Xiao, & collaborators Dr Daniel Fransson and Dr Jarryd Braithwaite from Saab)
Yuelan Cheng: Audio question answering and reasoning (Primary supervisor. Co-supervised with Prof Guoping Lian from Unilever/CPE)
Xinran Liu: Cross-modality generation (Co-supervisor. Co-supervised with Dr Diptesh Kanojia and Dr Zhenhua Feng)
John-Joseph Brady: Differentiable particle filtering (Co-supervisor. Co-supervised with Dr Yunpeng Li)
Zhi Qin Tan: Bayesian machine learning (Co-supervisor. Co-supervised with Dr Yunpeng Li. Zhi Qin has moved to Kings College London with Yunpeng)
Junqi Zhao: Audio restoration with generative models (Primary supervisor. Co-supervised with Prof Mark Plumbley)
Haojie Chang: Audio-visual analysis of fish behaviour (Primary supervisor. Co-supervised with Dr Lian Liu from CPE Department)
Jiaxi Li: Statistical machine learning (Co-supervisor. Co-supervised with Dr Xilu Wang)
Yi Yuan: Deep learning for intelligent sound generation (Primary supervisor. Co-supervised with Prof Mark Plumbley)
Yaru Chen: Multimodal learning and analysis of fish behaviour (Primary supervisor. Co-supervised with Prof Tao Chen from CPE Department)
Meng Cui: Machine learning for multimodal analysis of fish behaviour (Primary supervisor. Co-supervised with Prof Guoping Lian from Unilever/CPE and Prof Tao Chen from CPE Department)
Haohe Liu (PhD thesis writing up): Audio tagging (Co-supervisor. Co-supervised with Prof Mark Plumbley)
Yanze Xu (PhD thesis writing up): Recognition of paralinguistic features for singing voice description (Co-supervisor. Co-supervised with Prof Mark Plumbley)
James King (PhD thesis writing up): Information theoretic learning for sound analysis (Co-supervisor. Co-supervised with Prof Mark Plumbley)
Jinzheng Zhao (written up and waiting for viva): Audio-visual multi-speaker tracking (Primary supervisor. Co-supervised with Prof Mark Plumbley, and Dr Yong Xu (Tencent AI Lab, USA)
Andrew Bailey (PhD awarded in February 2025): Machine Learning for Depression Detection and Emotion Recognition (Co-supervisor. Co-supervised with Prof Mark Plumbley)
Xubo Liu (PhD awarded in January 2025): Audio Source Separation and Creation Empowered by Natural Language Intelligence (Primary supervisor. Co-supervised with Prof Mark Plumbley)
Peipei Wu (PhD awarded in September 2024): Distributed Audio-Visual Multi-Target Tracking (Primary supervisor. Co-supervised with Dr Philip Jackson)
Xinhao Mei (PhD awarded in June 2024): Sound to Text: Automated Audio Captioning using Deep Learning (Primary supervisor. Co-supervised with Dr Yunpeng Li and Prof Mark Plumbley)
Buddhiprabha Erabadda (PhD awarded in June 2023): Machine Leaning for Video Coding and Quality Assessment (Co-supervisor. Co-supervised with Prof Anil Fernando)
Mukunthan Tharmakulasingam (PhD awarded in April 2023): Interpretable Machine Learning Models to Predict Antimicrobial Resistance (Co-supervisor. Co-supervised with Prof Anil Fernando and Prof Roberto La Ragione)
Jingshu Zhang (PhD awarded in December 2022): Phase Aware Speech Enhancement and Dereverberation (Co-supervisor. Co-supervised with Prof Mark Plumbley)
Shuoyang Li (PhD awarded in June 2022): Sketching and Streaming based Subspace Clustering for Large-scale Data Classification (Primary supervisor. Co-supervised with Dr Philip Jackson, and Dr Yuantao Gu from Tsinghua University, China)
Jayasingam Adhuran (PhD awarded in April 2022): QoE Aware VVC Based Omnidirectional and Screen Content Coding (Co-supervisor. Co-supervised with Prof Anil Fernando)
Turab Iqbal (PhD awarded in December 2021): Noisy Web Supervision for Audio Classification (Primary supervisor. Co-supervised with Prof Mark Plumbley)
Lucas Rencker (PhD awarded in August 2020): Sparse Signal Recovery From Linear and Nonlinear Compressive Measurements (Primary supervisor. Co-supervised with Prof Mark Plumbley, and Prof Francis Bach, INRIA, France) [Lucas was a Marie Curie Early Stage Researcher]
Iwona Sobieraj (PhD awarded in June 2020): Environmental Audio Analysis by Non-negative Matrix Factorization (Co-supervisor. Co-supervised with Prof Mark Plumbley) [Iwona was a Marie Curie Early Stage Researcher]
Alfredo Zermini (PhD awarded in April 2020): Deep learning for speech separation (Primary supervisor. Co-supervised with Prof Mark Plumbley, and Prof Francis Bach, INRIA, France) [Alfredo was a Marie Curie Early Stage Researcher]
Yang Liu (PhD awarded in February 2020): Particle Flow PHD Filtering for Audio-Visual Multi-Speaker Tracking (Primary supervisor. Co-supervised with Prof Adrian Hilton)
Hanne Stenzel (PhD awarded in December 2019): Influences on perceived horizontal audio-visual spatial alignment (Co-supervisor. Co-supervised with Dr Philip Jackson)
Cian O'Brien (PhD awarded in November 2019): Low rank modelling for polyphonic music analysis (Co-supervisor. Co-supervised with Prof Mark Plumbley) [Cian was a Marie Curie Early Stage Researcher]
Qiuqiang Kong (PhD awarded in September 2019): Sound event detection with weakly labelled data (Co-supervisor. Co-supervised with Prof Mark Plumbley)
Luca Remaggi (PhD awarded in August 2017): Estimation of Room Reflection Parameters for a Reverberant Spatial Audio Object (Co-supervisor. Co-supervised with Dr Philip Jackson)

Pengming Feng (PhD awarded in November 2016): Enhanced particle PHD filtering for multiple human tracking (Co-supervisor. Co-supervised with Prof Jonathon Chambers and Dr Syed Mohsen Naqvi, Newcastle University)

Atiyeh Alinaghi (PhD awarded in October 2016): Blind convolutive stereo speech separation and dereverberation (Co-supervisor. Co-supervised with Dr Philip Jackson)

Jing Dong (PhD awarded in July 2016): Sparse Analysis Model Based Dictionary Learning and Signal Reconstruction (Primary supervisor. Co-supervised with Dr Philip Jackson; External Collaborator: Dr Wei Dai, Imperial College London)

Shahrzad Shapoori (PhD awarded in April 2016): Detection of medial temporal brain discharges from EEG signals using joint source separation-dictionary learning (Co-supervisor. Co-supervised with Dr Saeid Sanei, Department of Computing)

Volkan Kilic (PhD awarded in January 2016): Audio visual tracking of multiple moving sources (Primary supervisor. Co-supervised with Prof Josef Kittler and Dr Mark Barnard)

Marek Olik (PhD awarded in January 2015): Personal sound zone reproducation with room reflections (Co-supervisor. Co-supervised with Dr Philip Jackson)

Syed Zubair (PhD awarded in June 2014): Dictionary learning for signal classification (Primary supervisor. Co-supervised with Dr Philip Jackson; Internal collaborator: Dr Fei Yan; External collaborator: Dr Wei Dai, Imperial College London)

Philip Coleman (PhD awarded in May 2014): Loudspeaker array processing for personal sound zone reproduction (Co-supervisor. Co-supervised with Dr Philip Jackson)

Qingju Liu (PhD awarded in October 2013): Multimodal blind source separation for robot audition (Primary supervisor. Co-supervised with Dr Philip Jackson, Prof Josef Kittler; External collaborator: Prof Jonathon Chambers, Loughborough University)
Tao Xu (PhD awarded in June 2013): Dictionary learning for sparse representations with applications to blind source separation (Primary supervisor. Co-supervised with Dr Philip Jackson; External collaborator: Dr Wei Dai, Imperial College London)

Rakkrit Duangsoithong (PhD awarded in Oct 2012): Feature selection and causal discovery for ensemble classifiers (Co-supervisor; Co-supervised with Dr Terry Windeatt)

Tariqullah Jan (PhD awarded in Feb 2012): Blind convolutive speech separation and dereverberation (Primary Supervisor; Co-Supervised with Prof Josef Kittler; External collaborator: Prof DeLiang Wang, The Ohio State University)

Academic Visitors

Mr Yiming Zhang (01/2024-): PhD Student, BUPT, China. Topic: Audio captioning.

Mr Jinhua Liang (11/2022-): PhD Student, Queen Mary University of London, UK. Topic: Audio classification.

------

Mr Xuenan Xu (09/2023-04/2024): PhD Student, Shanghai Jiaotong University, China. Topic: Audio captioning.

Dr Vipul Arora (09/2022 - ): Associate Professor, Indian Institute of Technology, India. Topic: Audio source separation and scene analysis.

Mr Bin Lin (08/2019 - 08/2020): Senior Research Engineer, China Academy of Space Technology, China. Topic: Sparse analysis model based dictionary learning from nonlinear measurements.

Dr Takahiro Murakami (04/2019 - 04/2020): Assistant Professor, Meiji University, Japan. Topic: Microphone array calibration.

Dr Shiyong Lan (10/2018 -10/2019): Associate Professor, Sichuan University, Chengdu, China. Topic: Multimodal tracking.

Prof Jinjia Wang (07/2017 - 07/2018): Professor, Yanshan University, Qinghuangdao, China. Topic: Deep sparse learning.

Dr Ning Li (08/2017 - 08/2018): Associate Professor, Harbin Engineering University, Harbin, China. Topic: Blind sparse inverse filtering and deconvolution.

Dr Yang Chen (08/2017 - 08/2018): Associate Professor, Changzhou University, China. Topic: Acoustic source localisation and separation.

Dr Yina Guo (09/2016 - 02/2017): Associate Professor, Taiyuan University of Science and Technology, Taiyuan, China. Topic: Blind source separation.

Dr Ronan Hamon (06/2016-11/2016), Postdoctoral Researcher, QARMA Team (LIF - Aix-Marseille Universite), France. Topic: Perceptual and objective measure of musical noise in audio source separation & audio impainting. (Co-supervisor. Co-supervised with Prof Mark Plumbley.)
Dr Zongxia Xie (01/2016 - 01/2017): Associate Professor, Tianjin University, Tianjin, China. Topic: Sparse representation for big uncertain data classification.

Mr Jian Guan (10/2014 - 01/2017): PhD student, Harbin Institute of Techonology, Shenzhen Graduate School, Shenzhen, China. Topic: Blind sparse deconvolution and dereverberation.

Dr Jesper Rindom Jensen (04/2016 - 04/2016): Postdoctoral Research Fellow, Aalborg University, Denmark. Topic: Audio-visual speech processing.

Dr Xiaorong Shen (02/2015 - 12/2015): Associate Professor, Beihang University, Beijing, China. Topic: Audio-visual source detection, localization and tracking.

Mr Luc Guy (06/2015 - 09/2015): MSc student, Polytech Montpellier, France. Topic: Music audio source separation.

Mr Hatem Deif (02/2015 - 02/2015): PhD student, Brunel University, London, UK. Topic: Single channel audio source separation.

Dr Yang Yu (04/2014 - 04/2015): Associate Professor, Northwestern Polytechnical University, Xi'an, China. Topic: Underwater acoustic source localisation and tracking with sparse array and deep learning.

Mr Jamie Corr (10/2014 - 10/2014): PhD student, Strathclyde Univeristy, Glasgow, UK. Topic: Underwater acoustic data processing with polynomial matrix decomposition.

Dr Xionghu Zhong (07/2014 - 07/2014): Independent Research Fellow, Nanyang Technological University, Singapore. Topic: Acoustic source tracking.

Xiaoyi Chen (10/2012 - 09/2013 ): PhD student, Northwestern Polytechnical University, Xi'an, China. Topic: Convolutive blind source separation of underwater acoustic mixtures.

Dr Ye Zhang (12/2012 - 08/2013): Associate Professor, Nanchang University, Nanchang, China. Topic: Analysis dictionary learning and source separation.

Victor Popa (04/2013 - 07/2013), PhD student, University Politehnica of Bucharest, Bucharest, Romania. Topic: Audio source separation.

Dr Stefan Soltuz (10/2008 -07/2009), Research Scientist, Tiberiu Popoviciu Institute of Numerical Analysis, Romania. Topic: Non-negative matrix factorization for music audio separation (Primary supervisor. Co-supervised with Dr Philip Jackson)

Yanfeng Liang (MSc, 05/2009), MSc Student: Harbin Engineering University, Harbin, China. Topic: Adaptive signal processing for clutter removal in radar images (Co-supervisor. Co-supervised with Prof Jonathon Chambers, Loughborough University)

MSc Students

Miss Nanqin Luo (MSc, 2021); Project: Sound source localization

Mr Yue Zhu (MSc, 2021); Project: Detection and localisation of drones with a microphone array

Mr Umar Abdulkadir Isa (MSc, 2021); Project: Attention neural networks for audio event detection

Mr Yi Zhou (MSc, 2021); Project: Audio tagging

Mr Tong Cheng (MSc, 2021); Project: Audio context analysis for multimodal search and retrieval

Mr Tugay Karakaya (MSc, 2021); Project: Distributed sensor fusion for source tracking

Mr Hou Ruilin (MSc, 2021); Project: Machine Audio Generation

Ms Siyuan Chang (MSc, 2021); Project: Learning to recognise sounds using web data

Mr Yifan Zhang (MSc, 2021); Project: Automated Audio Captioning

Mr Venkata Phaninder Esrapu (MSc, 2021); Project: Sound Event Detection and Localization

Miss Joana Ann James (MSc, 2021); Project: Augmenting Acoustic Scene Classification with Motion Sensors

------

Mr Xinhao Mei (MSc, 2020); Project: Deep Learning for Large-scale Speaker Verification in the Wild

Mr Peipei Wu (MSc, 2020); Project: Multi-camera tracking of multiple targets

Mr Minwoo Ju (MSc, 2020); Project: Music playlist sorting based on beat rate detection

Mr Chengqiao Hu (MSc, 2020); Project: Deep learning for speech separation

Mr Andrea Angiolini (MSc, 2020); Project: Deep learning for audio scene classification

Mr Colin Furner (MSc, 2020); Project: Music instrument recognition

Mr Feiyu Yan (MSc, 2020); Project: Learning with short utterances for speaker verification

Mr Yi Zhou, (MSc, 2020); Project: Source tracking with particle flow PhD filter

Mr Martyn Allen (MSc, 2020); Project: Dementia detection from speech

Mr Han Li (MSc, 2020); Project: Describing video by sound

Mr Meghana Santhosh (MSc, 2020); Project: Audio tagging

Mr Shaoyang Shi (MSc, 2020); Project: Sound event detection and localisation

Mr Albert De Mello (MSc, 2020); Project: Visual recognition of cued speech

Mr Zhijian He (MSc, 2019); Project: Deep learning for speech separation

Mr Jo Leung (MSc, 2019); Project: Stereo speech source separation

Miss Baoer Wang (MSc, 2018); Project: Visual recognition of cued speech

Mr Jintong Shen (MSc, 2018); Project: Automated audio description

Mr Thorin Farnsworth (MSc, 2018); Project: Audio segmentation and content categorisation

Mr Yichong Jiang (MSc, 2018); Project: Speech to text conversion

Mr Hanyu Liu (MSc, 2018); Project: Audio event detection

Mr Xinchun Yuan (MSc, 2018); Project: Sparse array

Mr Minghan Zhang (MSc, 2018); Project: Audio restoration

Mr Haoliang Wen (MSc, 2018); Project: Audio declipping

Mr Bangzheng Wu (MSc, 2018); Project: Detection of failed sensors in an acoustic array

Mr Haochen Yuan (MSc, 2018); Project: Spatial audio reproduction

Mr George Walker (MSc, 2016); Project: Sound event detection

Miss Mahalakshmi Venkatesh (MSc, 2016); Project: Audio classification with deep neural networks

Mr Han Hsueh Hsin (MSc, 2016); Project: Audio super-resolution

Mr Duanshun Li (MSc, 2016); Project: Image despeckling

Melih Engin (MSc, 2015); Project: Kernel sparse representation for image denoising

Yuting Liu (MSc, 2015); Project: Beamforming for sparse array

Bingxin Xia (MSc, 2015, awarded Distinction); Project: Real-time speech separation demonstration

Jiajian Liang (MSc, 2015); Project: Sound localization from incomplete microphone samples

Feiwen Min (MSc, 2015); Project: Audio-super-resolution

Yuan Zhang (MSc, 2015); Project: Speaker tracking with PhD filter

Ahmad Shehu (MSc, 2015); Project: Music instrument recognition

Denise Chew (MSc, 2014, awarded Distinction); Project: Audio impainting

Yan Yin (MSc, 2014); Project: Audio super-resolution

Dalton Whyte (MSc, 2014); Project: Audio retrieval using deep learning

Dan Hua (MSc, 2013, awarded Distinction) Project: Super-resolution audio based on sparse signal processing

Dichao Lu (MSc, 2013) Project: Polyphonic pitch tracking of music

Xiao Han (MSc, 2012, awarded Distinction); Project: Underdetermined reverberant speech separation

Jian Han (MSc, 2012, awarded Distinction); Project: Microphone array based acoustic tracking of multiple moving speakers (co-supervised with Dr Mark Barnard)

Tianyu Feng (MSc, 2012); Project: Multi-pitch estimation and tracking

Yuli Ling (MSc, 2012); Project: Audio event detection from sound mixtures

Danyang Shen (MSc, 2012); Project: Audio-visual tracking of multiple moving speakers (co-supervised with Dr Mark Barnard)

Kai Song (MSc, 2012); Project: Environment recognition from sound scenes (co-supervised with Dr Fei Yan)

Xinpu Han (MSc, 2012); Project: Compressed sensing for natural image coding

Steven Grima (MSc, 2011, awarded Distinction); Project: Multimodal tracking of multiple moving sources (co-supervised with Dr Mark Barnard)

Anil Lal (MSc, 2011, awarded Distinction); Project: Monaural music sound separation using spectral envelop template and isolated note information

Xi Luo (MSc, 2011, awarded Distinction); Project: Reverberant speech enhancement

Yunyi Wang (MSc, 2011); Project: Compressed sensing for image coding

Ritesh Agarwal (MSc, 2011); Project: Multiple pitch tracking

Yichen Li (MSc, 2011); Project: Environmental sound recognition (co-supervised with Dr Fei Yan)

Tengxu Yang (MSc, 2011); Project: Ideal binary mask estimation in computational auditory scene analysis

Jin Ke (MSc, 2011); Project: Audio-visual tracking and localisation of moving speakers (co-supervised with Dr Mark Barnard)

Zijian Zhang (MSc, 2011); Project: Convolutive blind source separation of speech mixtures

Hafiz Mustafa (MSc, 2010); Project: Single channel music sound separation

BSc Students

------

Mr Alexander Cochrane (BSc, 2020); Project: Dynamic RGB LED Music Mapping

Xiao Cao (BSc, 2014); Project: Real-time speech separation demonstration

Research Collaborations

Academic:

UK: Cambridge, Sheffield, Loughborough, Cardiff, Imperial, Strathclyde, Southampton, Queen Mary, Salford, UEA, etc.
EU: INRIA (France), Gipsa-lab (France), UPF (Spain), EPFL (Switzerland), CTI (Greece), IT (Portugal), TUT (Finland), Fraunhofer IDMT (Germany)
China: Peking, Tianjin, Qingdao, Hunan, etc.
Australia: RMIT.
Japan: RIKEN, JAIST, etc.
US: OSU
Singapore: NTU
Turkey: IKCU
India: IIT

Industrial:

UK: Atlas, Kaon, BBC, NPL, Qinetiq, Thales, Cedar Audio, Audio Analytic, Airspeed, Stellar, Digital Barriers, Selex Galileo, PrismTech, Steepest Ascent, etc.
EU: VisioSafe SA (Switzerland), Noiseless Imaging Oy (Finland), and Fraunhofer IDMT (Germany)
China: Tencent, Huawei
South Korea: Samsung
US: Texas Instruments

Current Opportunities

Postdoctoral Research Fellows

Vacancy available: Research Fellow position in "Machine Learning for Audio Captioning" available. (Closing date for applications: 10/06/2021)
Vacancy available: Research Fellow position in "Research Fellow in Autonomous Sensor Management and Fusion for Distributed Sensor Networks" available. (Closing date for applications: 16/03/2021) CLOSED
Vacancy available: Research Fellow in Acoustic Signal Processing and Machine Learning" available. (Closing date for applications: 03/01/2021) CLOSED
Vacancy available: Research Fellow in Advanced Machine Learning for Audio Tagging" available. (Closing date for applications: 27/06/2020) CLOSED
Vacancy available: Research Fellow position in "Deep Learning for Speech Source Separation" available. (Closing date for applications: 01/05/2018) CLOSED
Vacancy available: Research Fellow position in "Machine Listening" available. (Closing date for applications: 01/05/2018) CLOSED
Vacancy available: Research Fellow position in "Source Separation and Localisation" available. (Closing date for applications: 27/03/2017) CLOSED
Vacancy available: Research Fellow position in "Research Software Developer (Experimental Officer) ON "Making Sense of Sounds" available. (Closing date for applications: January 17, 2016) (CLOSED)
Vacancy available: Research Fellow position in "Machine Listening" available. (Closing date for applications: December 13, 2015) (CLOSED)
Vacancy available: Research Fellow position in "Semantic Audio-Visual Processing and Interaction" available. (Closing date for applications: December 13, 2015) (CLOSED)
Vacancy available: Research Fellow position in "Audio-Visual Signal Processing" available. (Closing date for applications: January 27, 2015) (CLOSED)
Vacancy available: Four research fellow positions in "Spatial Audio & Vision" available. Click here for more details. (Closing date for applications: February 2, 2014) (CLOSED)
Vacancy available: Research Fellow position in "Low-Complexity Source Separation Algorithms" (Fixed-term contract for three years. Closing date for applications: March 17, 2013) (CLOSED)
Vacancy available: Research Fellow position in "Statistical anomaly detection" (Closing date for applications: February 28, 2013) (CLOSED)
Vacancy available: Research Fellow position in "Audio and Video Based Speech Separation for Multiple Moving Sources Within a Room Environment" (Closing date for applications: August 9, 2010) (CLOSED)

Marie Curie Early Stage Researchers

Vacancy available: MacSeNet: Marie Curie Early Stage Researcher position in "Audio Restoration and Inpainting" available. (Closing date for applications: April 30, 2015) (CLOSED)
Vacancy available: MacSeNet: Marie Curie Early Stage Researcher position in "Sound Scene Analysis" available. (Closing date for applications: April 30, 2015) (CLOSED)
Vacancy available: SpaRTan: Marie Curie Early Stage Researcher position in "Sparse Time-Frequency Methods for Audio Source Separation" available. (Closing date for applications: January 25, 2015) (CLOSED)
Vacancy available: SpaRTan: Marie Curie Early Stage Researcher position in "Automatic Music Transcription Using Structured Sparsity" available. (Closing date for applications: January 25, 2015) (CLOSED)

PhD Students

If you wish to join CVSSP and work with me as a PhD student, please check the topic list and feel free to contact me if you have further inquiries. Students with background in engineering, mathematics, computing, physics or other related subjects are all welcome to apply. New project ideas, if not included in the list, are also encouraged to propose.

Vacancy available: PhD Studentship in Uncertainty Quantification for Robust AI through Optimal Transport (Closing date for applications: May 19, 2021)
Vacancy available: PhD Studentship in Multimodal BSS for Robot Audition (Closing date for applications: August 7, 2009) (CLOSED)
Vacancy available: PhD Studentship in Signal Processing for Machine Audition and Perception (Closing date for applications: August 8, 2008) (CLOSED)

Visiting Scholar

I welcome collaborations nationally and internationally. Please do not hesitate to contact me if you are interested in joining CVSSP as a visiting scholar.

Current Topics

Unsupervised learning techniques (including independent component analysis, independent vector analysis, latent variable analysis, sparse component analysis, non-negative matrix/tensor factorisation, low-rank representation, manifold learning, and subspace clustering)
Supervised learning techniques (including deep learning, dictionary learning, multimodal learning, and learning with priors and signal properties)
Computational auditory scene analysis (audio scene recognition, audio event detection, audio tagging, and audio captioning)
Audio signal separation (convolutive audio source separation, underdetermined audio source separation including monaural source separation)
Audio feature extraction and perception (including pitch detection, onset detection, rhythm detection, music transcription and low bit-rate audio coding)
Sound source localisation (using audio, video, depth information, with particle filtering, PHD filtering, and/or particle flow filtering)
Multimodal speech source separation (audio/visual source separation with modelled based techniques such as Gaussian mixture model and learning-based method such as audio-visual dictionary learning)
Sparse representation and compressed sensing (synthesis model and analysis model based dictionary learning for sparse represenation, with applications to audio source separation, speech enhancement, audio inpainiting, and image enhancement)
Cocktail party processing (using techniques such as independent component analysis, blind source separation, computational auditory scene analysis, sparse representation/dictionary learning, Gaussian mixture modelling and expectation maximisation, and multimodal fusion)
Non-negative sparse coding of audio signals (including sparsity constrained non-negative matrix factorisation for audio analysis)
3D positional audio technology (including head-related transfer functions, binaural modelling, multiple loudspeaker panning, and room geometry estimation)
Approximate joint diagonalization for source separation (including unitary or non-unitary constrained joint diagonalization approaches)
Robust solutions for permutation problem of frequency domain independent component analysis (including approaches using filter constraints, statistical characteristics of signals, and beamforming)
Convex and non-convex optimisation (gradient descent, Newton methods, interior point method, ADMM, etc.)
Psychoacoustics motivationed signal processing and machine learning methods (e.g. time-frequency masking, perceptually informed speech separation/enhancement, intelligibility adaptive speech separation algorithms)

... More information about my current research may be found in my publications.

Past Projects

During the period of 1997 to 2007, I worked on a number of projects in both academic institutes and industrial companies including:

OpenSL ES (Led by Creative, jointly with other Khronos Group's member companies, such as Nokia, Samsung, Beatnik, Sonaptic, NVIDIA, Symbian, Texas Instruments, Ericsson, etc.)
3D Positional Audio for Mobile Devices (Sensaura, Creative Technology Ltd)
Video Encoder/Decoder for SSEYO miniMIXA (Tao Group Ltd, jointly with Samsung Electronics Research Institute)
Audio Distortion Generator for Intent Sound System (Tao Group Ltd)
Floating/Fixed-Point Audio (Ogg Vorbis) Encoder/Decoder (Tao Group Ltd)
Fixed-Point Pitch to MIDI Converter (Tao Group Ltd)
Fixed-Point Sampling Rate Converter (Tao Group Ltd)
Blind Signal Processing for Multichannel Speech Enhancement (initially with King's College London, then transferred to Cardiff University, jointly with Laboratory for Advanced Brain Signal Processing, RIKEN, Japan)
Room Acoustics Parameters from Music (Cardiff University, jointly with Salford University and Manchester Metropolitan University)
Video Assisted Speech Source Separation (Cardiff School of Engineering, jointly with Cardiff School of Computer Science)
GPS/Celestial/Inertia Integrated Navigation System (Harbin Engineering University)
Submarine Voyage Training Simulator (Harbin Engineering University, jointly with Qingdao Submarine Academy, and Jiujiang Branch of China State Shipbuilding Corp. )
Electronic Chart Display and Information System (Harbin Engineering University)
SCM Communication System by Carrier Wave (Harbin Engineering University)

Academic Activities

Editorial Activities

Senior Area Editor, IEEE Transactions on Signal Processing, 2019-present
Associate Editor, IEEE/ACM Transactions on Audio Speech and Language Processing, 2020-present.
Specialty Editor in Chief, Frontiers in Signal Processing, 2021-present.
Associate Editor, EURASIP Journal on Audio Speech and Music Processing, 2019-present.
Associate Editor, IEEE Transactions on Signal Processing, 2014-2018.
Associate Editor, The Scientific World Journal: Signal Processing (Hindawi), 2014-2016
Technical Committee Activities
Elected Member, IEEE Signal Processing Theory and Methods Technical Committee, 2021-present
Elected Member, IEEE Machine Learning for Signal Processing Technical Committe, 2021-present
Elected Member, International Steering Committee of Latent Variable Analysis and Signal Separation (LVA/ICA), 2019-present
Selected Conference Activities
Satellite Workshop Co-Chair, 2022 Interspeech Conference (INTERSPEECH 2022), Incheon, Korea.
Publication Co-Chair, 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), Brighton, UK.
Local Arrangement Co-Chair, 2013 IEEE International Conference on Machine Learning for Signal Processing (MLSP 2013), Southampton, UK.
Publicity Co-Chair, 2009 IEEE International Workshop on Statistical Signal Processing (SSP 2009), Cardiff, UK.
Session Chairs, for 10+ conferences such as ICASSP 2021, IJCAI 2019, DSP 2015, ISP 2015, DSP 2013, ICASSP 2012, SSPD 2012, EUSIPCO 2012, EUSIPCO 2011, and WCCI 2008.
Regular Technical/Program Committees
I am a regular (or irregular) technical and program committee members of some major signal processing and machine learning conferences, such as:
ICASSP, Interspeech, MLSP, SSP, EUSIPCO, WASPAA, SSPD, DSP, MMSP, NeuIPS, IJCAI, ICML, AAAI, BMVC, ICPR, UKCI, WCCI, ISNN, etc.
External PhD Examination
2025.09.26, University of Southampton, PhD Thesis: Improving Robustness to Domain Shift in Machine Learning.
2025.08.23, University of Wollongong (Australia), PhD Thesis: Learning to Efficiently Generate and Evaluating the Quality of Ambisonic Room Impulse Responses for Virtual Spatial Audio.
2024.08.23, Imperial College London, PhD Thesis: Wideband Signal Direction of Arrival Estimation via Convex/Nonconvex Optimization.
2024.06.28, Ghent University (Belgium), PhD Thesis: Advancing Machine Listening: Understanding Acoustic Scenes and Events and the Emotions They Evoke.
2024.06.20, Oxford University, PhD Thesis: Spatial Audio and Spatial Audio-Visual Learning.
2024.05.13, Nanyang Technological University (Singapore), PhD Thesis: Artificial Intelligence for Urban Soundscape Augmentation: a Benchmark Dataset, Probabilistic Models, and Real-life Validation.
2023.12.12, Sheffield University, PhD Thesis: Bistatic Automotive Radar Sensing: Signal Processing for Motion Parameter Estimation.
2023.05.24, Leicester University, PhD Thesis: Big Data Analytics and Machine Learning Tools for Space and Earth Observation.
2023.05.03, University College London, PhD Thesis: Optimal Transport for Latent Variable Models.
2023.04.24, Queen Mary University of London, PhD Thesis: Deep Learning Methods for Instrument Separation and Recognition.
2023.03.21, Sheffield University, PhD Thesis: Machine Learning Methods for Autonomous Classification and Decision Making.
2023.02.10, Aalborg University (Denmark), PhD Thesis: Data-driven Speech Enhancement: from Non-negative Matrix Factorization to Deep Representation Learning.
2023.01.11, Multimedia University (Malaysia), PhD Thesis: Automated Detection of profanities for film censorship using deep learning.
2022.04.20, Brunel University, PhD Thesis: Fast embedding for image classification & retrieval and its application to the hostel industry.
2022.01.18, Edinburgh University, PhD Thesis: Data aware sparse non-negative signal processing.
2021.12.10, International Islamic University (Pakistan), PhD Thesis: Optimized implementation of multi-layer convolutional sparse coding framework for high dimensional data.
2021.11.25, Newcastle University, PhD Thesis: Advanced informatics for event detection and temporal localization.
2021.09.20, Imperial College London, PhD Thesis: Super-resolved localization in multipath environments.
2021.06.29. Nanyang Technological University (Singapore), PhD Thesis: Audio intelligence and domain adaptation for deep learning models at the edge in smart cities.
2021.04.12, Leceister University, PhD Thesis: Learning and generalisation for high-dimensional data.
2021.03.20, National Institute of Technology Meghalaya (India), PhD Thesis: Building robust acoustic models for an automatic speech recognition system.
2020.02.15, Nanyang Technological University (Singapore), PhD Thesis: Complex-valued mixing matrix estimation for blind separation of acoustic convolutive mixtures
2019.12.02, Oxford University, PhD Thesis: Recurrent neural networks for time series prediction.
2019.12.05, Queen Mary University of London, PhD Thesis: Intelligent control of dynamic range compressor.
2019.11.20, Imperial College London, PhD Thesis: Deep dictionary learning for image enhancement.
2019.11.25, Newcastle University, PhD Thesis: Signal processing and machine learning techniques for automatic image-based facial expression recognition.
2019.03.08, University of East Anglia, PhD Thesis: Audio speech enhancement using masks derived from visual speech.
2017.01.12, Loughborough University, PhD Thesis: Enhanced Independent Vector Analysis for Speech Separation in Room Environments.
2016.10.28, Queen Mary University of London, PhD Thesis: Music transcription using NMF.
2016.07.29, Southampton University, PhD Thesis: Source separation in underwater acoustic problems.
2016.03.04, Aalborg University (Denmark), PhD Thesis: Enhancement of speech signals - with a focus on voiced speech models.
2015.10.02, Edinburgh University, PhD Thesis: "Acoustic source localization and tracking using microphone arrays".
2015.09.15, Loughborough University, PhD Thesis: "Loughborough University Spontaneous Expression Database and Baseline Results for Automatic Emotion Recognition".
2014.11.21, Cardiff University, MPhil Thesis: "Joint EEG-fMRI signal model for EEG separation and localization".
2013.11.20, Loughborough University, PhD Thesis: "Enhanced independent vector analysis for audio separation in a room environment".
2012.12.10, Queen Mary University of London, PhD Thesis: "Sparse approximation and dictionary learning with applications to audio signals".
2010.10.13, University of Edinburgh, PhD Thesis: "Acoustic source localization and tracking using microphone arrays".

[Home] [Publications] [Research] [Teaching] [Short Bio] [Demo & Data] [Codes]

Last Modified in May 2023

First created in May 2007

Research

Funded Projects (Grants)

------

Research Team

Postdoc Research Fellows

------

PhD Students

------

Academic Visitors

------

MSc Students

------

BSc Students

------

Research Collaborations

Academic:

Industrial:

Current Opportunities

Postdoctoral Research Fellows

Marie Curie Early Stage Researchers

PhD Students

Visiting Scholar

Current Topics

Past Projects

Academic Activities

Editorial Activities

Technical Committee Activities

Selected Conference Activities

Regular Technical/Program Committees

External PhD Examination