Seizure Detection from EEG Signals Using a Multimodal Deep Learning Model
MirHosseini, SeyedHasan (2025)
Master's Programme in Mathematics and Statistical Data Analytics
Faculty of Information Technology and Communication Sciences
Date of acceptance
2025-07-31
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202507317975
Abstract
Seizures are neurobiological events affecting millions of individuals across the globe, making their accurate detection from EEG signals crucial for diagnosis and for improving patient outcomes. Compared with traditional machine learning approaches, which typically require manual feature extraction, deep learning has advanced seizure detection by automating feature learning. However, most studies rely on unimodal EEG data and fail to capture the full complexity of brain dynamics.
To address this limitation, this thesis proposes a multimodal deep learning model that integrates raw EEG signals, processed by LSTM networks, with spectrograms analyzed by Convolutional Neural Networks (CNNs). The outputs of the two modalities are fused and passed through a Transformer architecture to model cross-channel dependencies and classify the signals. To evaluate the effectiveness of the proposed model, two unimodal baselines were also developed: an LSTM-Transformer model using raw EEG signals and a CNN-Transformer model using spectrograms.
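The fusion idea described above can be sketched in PyTorch as follows. This is a minimal illustration rather than the thesis implementation: the electrode count, hidden sizes, two-token fusion scheme, and six-class output head are assumptions made for the example.

```python
# Minimal sketch of an LSTM + CNN + Transformer fusion model.
# All layer sizes and the class count are illustrative assumptions.
import torch
import torch.nn as nn

class MultimodalSeizureNet(nn.Module):
    def __init__(self, n_eeg_channels=19, n_classes=6, d_model=128):
        super().__init__()
        # Temporal branch: LSTM over raw EEG, input shape (batch, time, channels)
        self.lstm = nn.LSTM(input_size=n_eeg_channels, hidden_size=d_model,
                            batch_first=True)
        # Spectral branch: small CNN over spectrograms, input shape (batch, 1, freq, time)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )
        # Fusion: treat the two modality embeddings as a 2-token sequence
        # and let a Transformer encoder model their interaction.
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                                   batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, raw_eeg, spectrogram):
        _, (h_n, _) = self.lstm(raw_eeg)                    # (1, batch, d_model)
        temporal = h_n[-1]                                  # (batch, d_model)
        spectral = self.cnn(spectrogram)                    # (batch, d_model)
        tokens = torch.stack([temporal, spectral], dim=1)   # (batch, 2, d_model)
        fused = self.transformer(tokens).mean(dim=1)        # (batch, d_model)
        return self.classifier(fused)

# Example shapes: a 4-sample batch, 10 s of 19-channel EEG at 200 Hz,
# and a 64x128 spectrogram per sample.
model = MultimodalSeizureNet()
logits = model(torch.randn(4, 2000, 19), torch.randn(4, 1, 64, 128))
print(logits.shape)  # torch.Size([4, 6])
```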
The results showed that while the unimodal CNN-Transformer model achieved the highest overall accuracy (90%), highlighting the effectiveness of spectral features, the proposed multimodal model exhibited superior class-specific performance, with the highest recall for LPD (93%) and the highest precision for seizure (91%). The findings indicate that combining temporal and spectral modalities can enhance sensitivity and specificity in seizure detection. Future research could explore alternative fusion strategies along with more diverse classes to improve the model's generalizability.
