SeizurEShield - Empowering Epilepsy Care

Project Statement

About 1 in 10 people will experience a seizure in their lifetime, and for some, these seizures become a chronic problem called epilepsy. Epilepsy is characterized by recurrent seizures that can range in both frequency and severity and currently affects an estimated 50 million people worldwide, making it one of the most common neurological disorders globally next to Alzheimer’s (55 million) and Parkinson’s (10 million). It’s caused by a variety of things like traumatic brain injuries, strokes, genetic inheritance, and even severe infections, making it difficult to prevent.

However, what makes epilepsy so difficult to live with is its unpredictability. While approximately 65% of people with generalized epilepsy report “getting auras,” or feelings that can warn you if a seizure is about to happen, many people don’t experience those – there’s also 4 other types of epilepsy in addition. This means that at any given point in time a seizure could just happen with very little warning, and the impact of that seizure could range from time loss (absence seizures) to convulsions, complete unconsciousness, and immediate hospitalization (grand-mal seizures).

Ultimately, this problem presents both quality of life (always needing to be accompanied, restrictions on driving/etc.) and safety concerns (what if you’re standing when a seizure hits and you fall/get injured?) that can infringe on a person’s ability to even do something like going outside for a walk by themselves.

Most medicines do not have a complete 100% success rate for treating medical problems, and medications used to treat epilepsy are no exception to this rule. Thus, we’d like to propose the use of deep or machine learning techniques to first attempt to detect, and then possibly predict, seizure activity from EEG recordings of the human brain.

Project Outline

Our project, SeizurEShield, focuses on developing an algorithm to detect seizure activity from EEG recordings using advanced machine learning techniques. Key features of our project include:

Data Preprocessing Pipeline: Extracting and normalizing features from raw EEG signals.
Machine Learning Models: Implementing logistic regression, SVM, CNN, and/or LSTM models for seizure detection.
Frontend Development: Allowing users to upload EEG files and view results on our website.
Backend API Development: Creating a Flask application to enable endpoints for real-time data processing and seizure detection.
Evaluation and Optimization: Using performance metrics to tune and optimize our models.

Our project aims to provide accurate seizure detection insights to aid in timely medical intervention and reduce the manual effort required by neurologists.

Technical Approach

Data

The data set used included 4000+ different recordings housed in separate EDF files. These files also included annotations denoting when there was a seizure or not. The data came from people of all ages and was evenly distributed between ages and genders. The data pipeline we created was to first convert these EDF files into CSV files, included in this process was creating timestamps and then using these timestamps to join to the annotated files. Once the data was in that format we then proceeded to standardize the montage pairs so that we are able to get the readings at the individual nodes.

Model Development

For the features which are the most important are the standardized readings at each specific timestamp. We tested a few different approaches for modeling this data, we tested a SVM model approach, a RNN model approach, and a RESNET Transfer Learning approach. We ended up selecting the RNN model approach since this had the best metrics that we were evaluating against. We found that the model worked well in identifying when there was not a seizure since we had a very low false positive rate with a high accuracy which was a function of most of the data not including seizures. Where the model didn’t work as well was in identifying seizures which was illustrated by the higher false negative rate which we were able to get to a 35%.

When training the model we included a test train split and evaluated both the train and the test set for a list of common metrics, the metrics we cared about were: accuracy, loss, false positive rate, and false negative rate. We compared against different models that we were testing and selected the best final model while prioritizing the false negative rate. We prioritized this metric due to discussions with people in industry saying this would be the most important metric that they care about in our case. Our initial approaches had a higher false negative rate and we would focus on minimizing this rate when selecting the final model.

These model insights are very helpful to neurologists studying these EDF files where we are able to get insights very quickly and save them a lot of time when analyzing these studies. This model output is a good start for identifying when there is a seizure or not and allows for much faster turnaround to analyze these EDF files which would allow for more access to care for disadvantaged people.

Challenges

There were a few different challenges we faced during this, the first one was figuring out how to get the data into a format that a ML tool could be trained on. We attempted adding all of these files into a single CSV but found that was not practical, we then settled on converting each EDF into a different annotated CSV file. Our next technical challenge was figuring out how to standardize each of the different montages since there were different columns for each different montage. The technical innovation for both of these approaches is different to what other groups have done before in previous work since other groups would take one of the two things we worked on but instead we combined these approaches which is novel in comparison to other approaches. The challenge we ran into was minimizing the false negative rate, we found that our false negative rate was quite high so we set out to minimize this. To address this we added a layer to the RNN model where we added a LSTM layer which was able to drastically reduce the false negative rate. This is an innovation since previous work only included the RNN model without any LSTM.

Next Steps

If we would have more time and money we would focus on minimizing the false negative rate through 2 approaches. The first would be to increase the sequence length that the RNN is considering and performing some tuning on this. This would allow the model to have more context for the predictions that it is trying to take into account. The next thing we would have wanted to do is perform more in depth hyperparameter tuning, we would have considered different optimizers, changing the learning rate, and finally testing different number of layers that are included in the model.

Our Team

Bo He

Role: Infrastructure & Data Engineering Lead

Bio: Bo is a life long learner currently works as an investment engineer at Bridgewater. Outside of work and school, he enjoys reading, swimming and traveling.

Anna Li

Role: EDA Lead

Bio: Anna is an aspiring data scientist who currently works as a data analyst/data engineer at Boeing. Outside of work and school, she enjoys climbing, listening to music, and eating 🙂

Alek Lichucki

Role: ML Engineer Lead

Background: Alek is a data consultant with Slalom working across data science, data engineering, and data visualization. Outside of work and school he enjoys traveling, climbing, video games, and catering to his dog Bagels many needs.

Maegan Kornexl

Role: Project Manager & SME

Bio: Maegan is a data analyst and machine learning engineer currently working on chemical detection projects at UES Inc. She loves eating tasty food and playing exploration-based video games, and owes her continued sanity to her two cats, Daisy and Pippin.

Arshia Sharma

Role: Chief Facilitator & MVP App Developer

Bio: Arshia is a data scientist at The Walt Disney Company and works on multiple projects across the data science & software development space. When she has time outside of work & school, she is an avid reader and enjoys finding good coffee shops.

Demo

Contact Us

If you have any questions or would like to learn more about our project, please reach out to us by selecting on our names above!

Our Mission

Welcome to SeizurEShield, where we harness the power of machine learning to provide advanced solutions for epilepsy care.