| Session  | Room   | Chair |
|----------|--------|-------|
| Tutorial | Room 1 | -     |

| Date  | Time        | Title | Speakers |
|-------|-------------|-------|----------|
| 3-Dec | 09:30-11:30 | [T01] EEG Signal Processing and Machine Learning | Saeid (Saeed) Sanei |
| 3-Dec | 13:00-15:00 | [T03] Human-Centric RF Sensing: Pose Estimation, ECG Monitoring and Self-Supervised Learning | Yan Chen, Dongheng Zhang, Zhi Lu |
| 3-Dec | 15:30-17:30 | [T04] Emerging Topics for Speech Synthesis: Versatility and Efficiency | Yuki Saito, Shinnosuke Takamichi, Wataru Nakata |
| Session  | Room   | Chair |
|----------|--------|-------|
| Tutorial | Room 2 | -     |

| Date  | Time        | Title | Speakers |
|-------|-------------|-------|----------|
| 3-Dec | 09:30-11:30 | [T02] From Statistical to Causal Inferences for Time-Series and Tabular Data | Pavel Loskot |
Identifying brain states (such as sleep, pain, mental fatigue, and emotions), abnormalities (such as autism, cognitive impairment, and ageing), diseases (such as seizure), and responses to various stimuli, all highly applicable to brain-computer interfacing, is a popular and demanding topic in the signal processing and machine learning domains. The first step is to understand brain function and the manifestation of brain malfunction. The second step is to design the algorithms needed to highlight such functions and model the changes or abnormalities, and the third step is to score, identify, or classify the brain state or the severity of the problem. Tensor factorization, beamforming, connectivity/causality estimation, evaluation of brain dynamics, distributed sensor networks (with application to multi-subject or hyperscanning BCI for rehabilitation purposes), detection of event-related potentials (ERPs), and other methods for detecting biomarkers will be briefly reviewed. The focus will be on the application of tensors, deep learning, and regularized beamforming to epileptic seizures. Detection and localization of interictal epileptiform discharges (IEDs), as well as so-called delayed responses to deep brain stimulation (DBS), from scalp, intracranial, and joint scalp-intracranial EEG recordings will be emphasized. It will also be demonstrated how uncertainty in data labelling can be incorporated into the IED detection algorithm.
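As a concrete illustration of the tensor-factorization theme in this abstract, the following is a minimal sketch of a CP (PARAFAC) decomposition of an epoched EEG tensor using the tensorly library. The array shapes, rank, and random stand-in data are illustrative assumptions, not material from the tutorial.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Illustrative epoched EEG tensor: 32 channels x 256 samples x 60 trials
# (random data standing in for real recordings).
rng = np.random.default_rng(0)
eeg = rng.standard_normal((32, 256, 60))

# CP factorization into R rank-1 components. Each component yields a
# spatial signature (over channels), a temporal signature (over samples),
# and per-trial loadings that can feed a downstream classifier.
R = 5
cp = parafac(tl.tensor(eeg), rank=R, normalize_factors=True)
spatial, temporal, trial_loadings = cp.factors

print(spatial.shape, temporal.shape, trial_loadings.shape)
# (32, 5) (256, 5) (60, 5)
```

In a seizure-detection setting of the kind described above, the trial-mode loadings would typically be the features scored or classified in the third step.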
SAEID SANEI, a Fellow of the British Computer Society (FBCS), received his PhD in biomedical signal processing from Imperial College London, UK. He is an internationally known researcher in electroencephalography (EEG) signal processing and machine learning. His current research focuses on the application of adaptive and nonlinear signal processing, subspace analysis, and tensor factorization to EEG, BCI, speech, and medical signals and images. He has published five books (monographs), several edited books and book chapters, and over 440 papers in peer-reviewed journals and conference proceedings. He has served as an Associate Editor for IEEE Signal Processing Letters, IEEE Signal Processing Magazine, and the Journal of Computational Intelligence and Neuroscience. He has also served on the IEEE MLSP and SPTM Technical Committees, and as the Chair and Organizer of many prestigious IEEE conferences, including ICASSP 2019 in Brighton, UK. Currently, he is a Visiting Academic in digital health at Imperial College London, UK.
The vast majority of generated data are longitudinal, as they are produced by all modern systems, which are becoming increasingly intelligent and autonomous. These systems require monitoring of various quantities in order to infer past as well as future changes in their inner state, and also to detect the associated internal and external events. Longitudinal data are often represented as multivariate time-series, and are predominantly stored as tabular data in Excel spreadsheets, CSV files, and other types of log-files. The traditional statistical methods for processing longitudinal data rely on Gaussianity, the Markovian property, and auto-regressive modeling. These methods are attractive for their low complexity and full interpretability; however, they fail to describe more complex spatio-temporal statistical dependencies, and their assumptions often cannot be satisfied. Only very recently have deep learning techniques become effective at processing time-series data and capturing their small-scale and large-scale dependencies. The most promising strategy is to combine the traditional statistical methods with more sophisticated deep learning methods, although many challenges remain unsolved. For instance, the unique properties of tabular data make statistical learning rather difficult. Likewise, there are procedures for embedding categorical variables in high-dimensional vector spaces, but they ignore the case of categorical random processes. Moreover, it is often important to infer causal explanations from data, i.e., to relate causes to effects in order to understand why a certain event occurred, or which event is likely to occur in the near future. The two main objectives of this tutorial are: (1) to explain the key ideas in performing statistical inferences for time-series and tabular data, contrasting traditional statistical with modern deep-learning methods, and (2) to explain how to perform causal inferences for time-series data by formulating the relevant statistical inference problems, and how to recognize when causal relationships cannot be identified from the given data.
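To make the causal-inference objective more tangible, here is a minimal sketch of a Granger-causality test on synthetic bivariate data using statsmodels; the data-generating process and lag choice are illustrative assumptions, not the presenter's material.

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

# Synthetic bivariate series: y depends on past x, so x should
# "Granger-cause" y (illustrative data only).
rng = np.random.default_rng(1)
n = 500
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * x[t - 1] + 0.2 * y[t - 1] + 0.1 * rng.standard_normal()

# Test whether the second column helps predict the first one.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=3, verbose=False)
for lag, (tests, _) in results.items():
    print(lag, f"ssr F-test p-value: {tests['ssr_ftest'][1]:.4g}")
```

Note that Granger causality is a statement about predictive ability, not about true cause and effect; as the abstract stresses, recognizing when causal relationships cannot be identified from the given data is itself part of the analysis.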
Pavel Loskot joined the ZJU-UIUC Institute in January 2021 as an Associate Professor after 14 years at Swansea University in the UK as a Senior Lecturer. He received his PhD degree in Wireless Communications from the University of Alberta in Canada, and his MSc and BSc degrees in Radioelectronics and Biomedical Electronics, respectively, from the Czech Technical University in Prague in the Czech Republic. Over the past 25 years, he has been involved in numerous collaborative research and development projects with academia and industry, and has also held a number of consultancy contracts. He is a Senior Member of the IEEE, a Fellow of the Higher Education Academy in the UK, and a Recognized Research Supervisor of the UK Council for Graduate Education. His current research interests focus on mathematical and probabilistic modeling, statistical signal processing, and machine learning for multi-sensor data.
This tutorial contributes to the evolving domain of human-centric RF sensing by integrating deep learning methodologies to address human sensing tasks. Amidst rapid advances in RF and deep learning technologies, it is timely and essential for advancing non-intrusive, privacy-respecting human-centric applications. The tutorial presents state-of-the-art techniques in multi-person 3D pose estimation and ECG monitoring, enabled by large-scale, real-world RF sensing datasets. Additionally, we delve into novel combinations of signal processing and deep learning, such as temporal-spatial attention and self-supervised learning for unlabeled RF data, highlighting their impact on enhancing RF sensing capabilities. By surveying these cutting-edge methodologies, the tutorial aims to bridge current gaps, spur further research in RF-based human sensing, and guide the signal processing community through the burgeoning landscape of RF sensing applications, thereby catalyzing innovation and expanding the utility of RF sensing in real-world scenarios.
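The self-supervised-learning component mentioned above can be illustrated with a generic SimCLR-style contrastive loss over embeddings of two augmented views of the same unlabeled RF windows, written in PyTorch. This is a minimal sketch of the general technique, not the presenters' actual model; the encoder, augmentations, and dimensions are assumed.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1):
    """SimCLR-style NT-Xent loss: pull two views of the same RF window
    together, push apart views of different windows."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)            # (2B, D) stacked views
    sim = z @ z.t() / temperature             # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))         # a view is not its own positive
    b = z1.shape[0]
    # For row i, the positive is the other view of the same window.
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)])
    return F.cross_entropy(sim, targets)

# Toy usage: embeddings of two "augmentations" of 8 unlabeled RF windows.
z1, z2 = torch.randn(8, 64), torch.randn(8, 64)
print(nt_xent(z1, z2).item())
```

In practice, the embeddings would come from an RF encoder (e.g., one with temporal-spatial attention) applied to two differently augmented copies of each unlabeled window.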
Dr. Chen is a Full Professor and the Vice Dean of the School of Cyber Science and Technology at USTC. He received his B.S., M.Phil., and Ph.D. degrees from USTC, HKUST, and the University of Maryland, College Park, respectively. He has authored 200+ papers and 2 books. He is an Associate Editor for IEEE TSIPN and TNSE, and has held leadership roles in multiple conferences, including WCSP, PCM, APSIPA ASC, and ACM MM. His awards include Best Paper Awards at GLOBECOM 2013 and APSIPA ASC 2020, Best Student Paper Awards at PCM 2017 and ICASSP 2016, and an Honorable Mention at MMSP 2022.
Dr. Zhang is currently a research associate professor at USTC. He received his Ph.D. from UESTC. His research focuses on smart sensing technologies. He has authored 40+ papers in the field.
Dr. Lu is a postdoctoral researcher at USTC. He received his Ph.D. from UESTC. His research focuses on smart sensing and multimedia.
Speech synthesis is an essential technology that enables natural speech communication between humans and robots. With the advancement of deep learning technologies, the naturalness of synthetic speech has improved remarkably, and the range of reproducible voice qualities is becoming increasingly diverse. This improvement stems from the development of massive speech corpora and sophisticated machine learning algorithms, such as deep generative modeling and self-supervised learning. In this tutorial, we introduce the foundational technologies that support state-of-the-art speech synthesis methods and discuss the aspects that need further research for additional advances. We first give a lecture on the basics of speech synthesis. We then survey work on deep learning-based speech synthesis from the perspectives of data, learning, and evaluation. Specifically, we focus on two aspects of building foundation models for speech synthesis: versatility and efficiency. Finally, we discuss possible applications of recent speech synthesis technologies, such as conversational agents, and the impact of these applications on human society.
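As a pointer to the basics covered in this tutorial, the sketch below computes the log-mel spectrogram that many neural text-to-speech systems use as an intermediate acoustic representation, using librosa on a synthetic tone; the parameter values (FFT size, hop length, number of mel bands) are common choices assumed here for illustration.

```python
import numpy as np
import librosa

# One second of a 220 Hz tone as a stand-in waveform (22.05 kHz sampling rate).
sr = 22050
t = np.linspace(0, 1.0, sr, endpoint=False)
wav = 0.5 * np.sin(2 * np.pi * 220.0 * t).astype(np.float32)

# Mel spectrogram: short-time Fourier transform followed by a mel filterbank.
mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_fft=1024,
                                     hop_length=256, n_mels=80)
log_mel = np.log(np.maximum(mel, 1e-5))  # floor avoids log(0)
print(log_mel.shape)  # (80, frames): 80 mel bands over time
```

In a typical pipeline, an acoustic model predicts such a representation from text, and a neural vocoder then converts it back to a waveform.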
Yuki Saito received the Ph.D. degree in information science and
technology from the Graduate School of Information Science and
Technology, The University of Tokyo, Japan, in 2021. His research
interests include speech synthesis, voice conversion, and machine
learning. He is a member of the Acoustical Society of Japan, IEEE SPS,
and the Institute of Electronics, Information and Communication
Engineers. He was a recipient of eight paper awards, including the 2020
IEEE SPS Young Author Best Paper Award.
Shinnosuke Takamichi received the Ph.D. degree from the Graduate School
of Information Science, Nara Institute of Science and Technology, Japan,
in 2016. He is currently an Associate Professor with Keio University,
Japan. He was the recipient of more than ten paper/achievement awards,
including the 3rd IEEE SPS Japan Young Author Best Paper Award.
Wataru Nakata received the B.E. degree from The University of Tokyo, Tokyo, Japan, in 2023. He is now a master's student at The University of Tokyo. His research interests include speech synthesis, natural language processing, and deep learning. He is a Student Member of the Acoustical Society of Japan. He received the Best Student Presentation Award of the ASJ in 2022.