Theoretical Analysis of CNNs for Automatic Seizure Detection in EEG Signals

Official Publication: For the complete published thesis, please visit the Publications page to download from the UCF STARS Digital Repository.

Overview

This project represents my Undergraduate Honors Thesis at UCF, where I bridged the gap between deep learning application and mathematical theory to detect seizure activity in EEG signals. The research was presented at the Burnett Honors College Family Weekend and is currently under review for publication.

Status: Published

Presented: September 2025

Defended: Fall 2025

Research Question

Can we rigorously analyze the stability and interpretability of 1D CNNs for automatic seizure detection in EEG signals using Lipschitz bounds and frequency domain analysis?

Technical Architecture

Data Pipeline

University of Bonn EEG Dataset
         ↓
Butterworth Bandpass Filter (0.5-50 Hz)
         ↓
Discrete Fourier Transform (DFT)
         ↓
Feature Engineering
         ↓
SMOTE (Class Balancing)
         ↓
1D CNN Model (PyTorch)

Model Architecture

1D Convolutional Neural Network

Input Layer: 4097 time points (23.6 seconds @ 173.6 Hz)
Conv1: 32 filters, kernel size 7, ReLU activation
MaxPool1: Kernel size 2
Conv2: 64 filters, kernel size 5, ReLU activation
MaxPool2: Kernel size 2
Conv3: 128 filters, kernel size 3, ReLU activation
Flatten + FC Layers: 128 → 64 → 2 (Seizure/Non-Seizure)
Regularization: Dropout (0.5), L2 Weight Decay

Training Configuration:

Optimizer: Adam (lr=0.001)
Loss Function: Cross-Entropy
Batch Size: 32
Epochs: 50 with early stopping
Device: Apple Silicon (MPS)

Key Results

Performance Metrics

Metric	Value
Accuracy	97.0%
AUC-ROC	0.99
Precision	96.8%
Recall	97.2%
F1-Score	97.0%

Scientific Discoveries

Primary Learned Feature: The model predominantly learned the 22 Hz beta-wave band as the key discriminative feature for seizure detection
Lipschitz Stability: Established theoretical bounds with best stability L = 24.72 (B vs. E task), ensuring robust predictions under EEG signal noise
Frequency Domain Interpretation: DFT analysis revealed specific frequency bands (18-26 Hz) driving classification decisions

Theoretical Contributions

Lipschitz Bound Analysis

For a function f: ℝⁿ → ℝᵐ, the Lipschitz constant K satisfies:

‖f(x₁) - f(x₂)‖ ≤ K‖x₁ - x₂‖

Applied to CNNs:

Bounded the Lipschitz constant of each layer (convolutions, pooling, activation)
Proved compositional stability: L_total ≤ L₁ · L₂ · … · Lₙ
Estimated network bounds across 7 classification tasks (L = 24.72 to 58.32)
Best stability: L = 24.72 for B vs. E (Healthy, Eyes Closed vs. Ictal)
Most challenging: L = 58.32 for combined (A+B+C+D) vs. E task

Practical Implication: The model remains stable even with EEG signal noise and artifacts common in clinical settings. Lower Lipschitz bounds indicate better robustness to input perturbations.

Implementation Highlights

Butterworth Filtering Pipeline

from scipy.signal import butter, filtfilt

def butterworth_filter(signal, lowcut=0.5, highcut=50, fs=173.6, order=5):
    """
    Applies Butterworth bandpass filter to remove physiological noise
    """
    nyquist = 0.5 * fs
    low = lowcut / nyquist
    high = highcut / nyquist
    b, a = butter(order, [low, high], btype='band')
    return filtfilt(b, a, signal)

Rationale: Removes DC drift (<0.5 Hz) and high-frequency noise (>50 Hz) while preserving epileptic activity (typically 3-30 Hz).

SMOTE for Class Imbalance

from imblearn.over_sampling import SMOTE

smote = SMOTE(sampling_strategy='auto', random_state=42, k_neighbors=5)
X_train_resampled, y_train_resampled = smote.fit_resample(X_train, y_train)

Impact: Balanced dataset from 60/40 split to 50/50, improving recall from 89% → 97%.

Frequency Domain Analysis (DFT)

import numpy as np

def compute_power_spectrum(signal, fs=173.6):
    """
    Computes power spectral density using FFT
    """
    fft_vals = np.fft.fft(signal)
    fft_freq = np.fft.fftfreq(len(signal), 1/fs)
    power = np.abs(fft_vals) ** 2
    return fft_freq[:len(fft_freq)//2], power[:len(power)//2]

Finding: Seizure signals showed 3.2x higher power in the 18-26 Hz range compared to non-seizure signals.

Dataset

University of Bonn EEG Dataset

Source: Epilepsy Center, University of Bonn, Germany
Channels: 100 single-channel EEG recordings
Classes:
- Set Z: Healthy subjects, eyes open (n=100)
- Set S: Seizure activity (n=100)
Duration: 23.6 seconds per recording
Sampling Rate: 173.6 Hz

Challenges Overcome

Small Dataset: Mitigated with SMOTE and aggressive data augmentation
High Dimensionality: Reduced via frequency domain feature extraction
Computational Constraints: Optimized for Apple Silicon (MPS) with mixed-precision training
Theoretical Rigor: Bridged gap between empirical ML and formal mathematical analysis

Future Work

Multi-Channel Analysis: Extend to full 10-20 EEG montages (19+ channels)
Real-Time Deployment: Optimize for edge devices (Raspberry Pi, mobile)
Clinical Validation: Test on larger datasets (CHB-MIT, TUH EEG)
Explainability: Integrate Grad-CAM for time-domain visualization
Transfer Learning: Pre-train on TUH, fine-tune on patient-specific data

Impact & Recognition

✅ Thesis Defense: Successfully defended before faculty committee
🎤 Presentation: Selected for Burnett Honors College Family Weekend (Sept. 2025)
📄 Published: Available in UCF STARS Digital Repository
🏆 Honors Candidate: Recognized for bridging theory and application

Links & Resources

💻 GitHub Repository
📧 Contact for Collaboration

Technologies Used

Python • PyTorch • NumPy • SciPy • Scikit-learn • SMOTE • Jupyter • Matplotlib

Citation

If you use this work in your research, please cite:

```bibtex @thesis{small2025cnn, author = {Small, Jackson T.}, title = {Theoretical Analysis of CNNs for Automatic Seizure Detection in EEG Signals}, school = {University of Central Florida}, year = {2025}, type = {Honors Undergraduate Theses}, number = {462}, url = {https://stars.library.ucf.edu/hut2024/462} } ```

Jackson Small