Research on Machine Learning / Deep Learning Applications to Human Diseases and Natural Language
These are ongoing projects from my PhD studies
Supervised feature selection for high-dimensional biological data is a critical step in developing accurate diagnostic and prognostic molecular classifiers for complex diseases. Wrapper methods and embedded techniques, which are closely tied to the learning algorithm, have been widely applied to this task, whereas feature selection methods that incorporate prior biological knowledge are less commonly used. Yet these knowledge-based methods have the potential to improve both classification performance and model interpretability.
Post-Traumatic Stress Disorder (PTSD) is a psychiatric disorder whose symptoms include anxiety, flashbacks, hypervigilance and cognitive deficits. It affects around 6.8% of the US adult population, with prevalence as high as 22% among Iraq and Afghanistan veterans. PTSD is currently diagnosed with CAPS-5, which surveys the critical PTSD symptoms and bases the assessment on the total score. This diagnosis can be inaccurate because the information is collected and interpreted imprecisely. It is therefore necessary to develop a molecular-level diagnostic model for PTSD and to identify biomarkers.
Human genomes are complex and regulated at multiple levels. Integrating multiple omic data types provides complementary information for deciphering coherent biological signatures across these levels, and it has the potential to extract biologically meaningful information of clinical relevance. Subgrouping patients makes it possible to treat them more efficiently, a step toward personalized medicine. No patient subgrouping based on omic data integration has yet been applied to PTSD.
Deep learning (also called deep neural networks, DNNs) belongs to a broad family of machine learning approaches for learning data representations. It is the application of artificial neural networks (ANNs) with more than one hidden layer to learning tasks. Deep learning algorithms use a layered, hierarchical architecture for learning and representing data, in which lower layers capture more general features and higher layers capture more abstract, encompassing information. This layer-based nonlinear feature extraction often yields better machine learning results. Because it can extract high-level abstractions from heterogeneous, high-dimensional data sets, deep learning has been applied successfully in biological and medical research.
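As a rough illustration of the layered, nonlinear feature extraction described above, here is a minimal sketch of a forward pass through a network with two hidden layers, written in plain Python. The layer sizes, the random weights, the ReLU and softmax choices, and the helper names (`dense`, `init_layer`, `forward`) are all assumptions made for this example; it is not the architecture used in these projects, and no training is shown.

```python
import math
import random


def dense(inputs, weights, biases):
    """Affine map: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Elementwise nonlinearity applied after each hidden layer."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn output scores into class probabilities (numerically stable)."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation; a real model would be trained."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Pass x through every hidden layer, then the output layer."""
    hidden = x
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))


rng = random.Random(0)
# 6 input features -> 4 hidden units -> 3 hidden units -> 2 classes
layers = [init_layer(6, 4, rng), init_layer(4, 3, rng), init_layer(3, 2, rng)]
probs = forward([0.1, 0.4, -0.2, 0.9, 0.0, 0.3], layers)
print(probs)
```

Each hidden layer re-represents the output of the layer below it, which is the sense in which the representation is hierarchical.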
Tumor stage is a clinical indicator of survival. In this project, however, we intend to predict cancer survival directly from genomic data. We use deep learning models to automatically extract features and train a classification model for tumor stage. We then use this model to build a Cox proportional hazards (CoxPH) model. In this way, we can predict survival and identify high- and low-risk patients.
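The final step can be sketched as follows, assuming the network has already produced a small feature vector for each patient. The sketch evaluates the Cox partial likelihood for a given coefficient vector and splits patients into high- and low-risk groups at the median risk score. The feature values, survival times, event flags, coefficients and the median cutoff are hypothetical illustration values, not project data or the project's actual procedure. In practice the coefficients would be fitted by minimising this loss.

```python
import math
from statistics import median


def risk_scores(beta, features):
    """Linear predictor beta . x for each patient."""
    return [sum(b * x for b, x in zip(beta, row)) for row in features]


def cox_neg_log_partial_likelihood(beta, features, times, events):
    """Negative Cox partial log-likelihood, assuming no tied event times."""
    scores = risk_scores(beta, features)
    nll = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored patients appear only inside risk sets
        # Risk set: patients still under observation at time t_i.
        at_risk = sum(math.exp(scores[j])
                      for j, t_j in enumerate(times) if t_j >= t_i)
        nll -= scores[i] - math.log(at_risk)
    return nll


def split_by_risk(beta, features):
    """Label patients above the median risk score as high risk."""
    scores = risk_scores(beta, features)
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical learned features (one per patient), follow-up times, event flags.
features = [[0.2], [0.9], [0.1], [1.5]]
times = [5, 3, 8, 1]
events = [1, 1, 0, 1]  # 1 = death observed, 0 = censored

nll_zero = cox_neg_log_partial_likelihood([0.0], features, times, events)
nll_one = cox_neg_log_partial_likelihood([1.0], features, times, events)
labels = split_by_risk([1.0], features)
print(nll_zero, nll_one, labels)
```

In this toy data the patients with larger feature values die earlier, so a positive coefficient gives a lower loss than a zero coefficient.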
This is a project from my CS MS program