Data model to understand the recovery phase in epilepsy patients

Data-driven approach

Data Scientist: Dr. Juan Ignacio Barrios, MD MSc

Doctoral Thesis by Susana Lara. Germany - Barcelona, 2021

DESCRIPTIVE ANALYTICS

Distribution analysis CT with respect to sex and age

Distribution analysis CT with respect to laterality

Distribution analysis CT with respect to Behavior before

Distribution analysis CT with respect to sex

Correlation comparison for main features

Transforming - Adding age groups
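As a minimal sketch of this transformation, the snippet below bins a numeric age column into labelled groups with pandas. The column name `age`, the bin edges, and the group labels are hypothetical choices made for illustration; they are not taken from the thesis data.

```python
import pandas as pd

# Toy ages for illustration only (not patient data).
df = pd.DataFrame({"age": [7, 15, 23, 38, 52, 67, 81]})

# Hypothetical bin edges and labels; adjust to the groups used in the study.
bins = [0, 18, 40, 65, 120]
labels = ["0-17", "18-39", "40-64", "65+"]

# right=False makes each interval closed on the left: [0, 18), [18, 40), ...
df["age_group"] = pd.cut(df["age"], bins=bins, labels=labels, right=False)

print(df)
```

The new `age_group` column is categorical, so it can be used directly for grouping or for the distribution plots in the descriptive section.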

Machine Learning section - Supervised training

Predicting Conscious time using Random Forest Regressor

Tree Map - Hierarchical main features array (location, sex, age)

Unsupervised algorithms - K-means clustering method

Feature importances from k-means data

Hierarchical clustering

Agglomerative hierarchical clustering differs from k-means in a key way. Rather than choosing a number of clusters and starting out with random centroids, we begin with every point in our dataset as its own cluster. We then find the two closest clusters and merge them into one, find the next closest pair and merge those, and repeat the process until only a single cluster containing every point remains.
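The merge process described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the notebook's original code: the function name `single_linkage_merges`, the toy points, and the choice of single linkage (distance between clusters = distance between their closest members) are all assumptions.

```python
from itertools import combinations
from math import dist


def single_linkage_merges(points):
    """Return the merges made by a naive agglomerative clustering.

    Each merge is a tuple (cluster_a, cluster_b, distance), where clusters
    are tuples of point indices. Single linkage is an assumption made for
    this sketch.
    """
    # Start with every point as its own cluster.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        best = None
        for a, b in combinations(clusters, 2):
            d = min(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Toy 2-D points for illustration only (not the thesis data).
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
merges = single_linkage_merges(toy_points)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

With n points the loop always performs n - 1 merges. Recording the distance of each merge is what a dendrogram visualises, and a large jump between consecutive merge distances is a common heuristic for choosing where to cut the tree.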

Now that we know the number of clusters for our dataset (four), the next step is to assign each data point to one of them. To do so, we use the AgglomerativeClustering class from the sklearn.cluster module.
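A minimal sketch of this step is shown here, assuming scikit-learn and NumPy are installed. The feature matrix `X` is synthetic (four well-separated blobs) and stands in for the patient features, which are not shown in this section; Ward linkage is scikit-learn's default and is an assumption about the original analysis.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Synthetic stand-in for the real feature matrix: four separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]])
X = np.vstack([c + rng.normal(scale=0.5, size=(25, 2)) for c in centers])

# n_clusters=4 follows the text; linkage="ward" is the library default.
model = AgglomerativeClustering(n_clusters=4, linkage="ward")
labels = model.fit_predict(X)

# Number of points assigned to each cluster label.
print(np.bincount(labels))
```

`fit_predict` returns one integer label per row of `X`. The label values themselves are arbitrary, so they are best used for grouping rather than interpreted as an ordering.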

Using SelectKBest to select the most representative feature in the dataset
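As a hedged illustration, and assuming "Kbest" refers to scikit-learn's SelectKBest, the sketch below scores each feature against a numeric target and keeps the single best one. The data are synthetic, the feature names `f0`, `f1`, `f2` are placeholders, and the `f_regression` scoring function is an assumption; the scoring function used in the original analysis is not stated.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic data for illustration only: three placeholder features.
rng = np.random.default_rng(0)
n_samples = 200
feature_names = ["f0", "f1", "f2"]
X = rng.normal(size=(n_samples, 3))

# By construction, the synthetic target depends only on the first feature.
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n_samples)

# Keep the k=1 feature with the highest univariate F-score.
selector = SelectKBest(score_func=f_regression, k=1)
selector.fit(X, y)

mask = selector.get_support()
selected = [name for name, keep in zip(feature_names, mask) if keep]
print(selected)
print(dict(zip(feature_names, selector.scores_.round(2))))
```

SelectKBest scores each feature independently of the others, so it can miss features that are only informative in combination.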

Calculating the incidence of main features within groups