Cancer Subtyping Detection using Biomarker Discovery in Multi-Omics Tensor Datasets
| dc.contributor.advisor | Gemperline, Paul | |
| dc.contributor.advisor | Tabrizi, M. H. N | |
| dc.contributor.author | Koleini, Farnoosh | |
| dc.contributor.department | Chemistry | |
| dc.contributor.department | Computer Science | |
| dc.date.accessioned | 2023-09-14T13:04:04Z | |
| dc.date.available | 2023-09-14T13:04:04Z | |
| dc.date.created | 2023-07 | |
| dc.date.issued | 2023-08-10 | |
| dc.date.submitted | July 2023 | |
| dc.date.updated | 2023-09-12T17:51:31Z | |
| dc.degree.department | Chemistry | |
| dc.degree.department | Computer Science | |
| dc.degree.discipline | MS-Chemistry | |
| dc.degree.grantor | East Carolina University | |
| dc.degree.level | Masters | |
| dc.degree.name | M.S. | |
| dc.description.abstract | This thesis begins with a thorough review of research trends from 2015 to 2022, examining the challenges and issues related to biomarker discovery in multi-omics datasets. The review covers areas of application, proposed methodologies, evaluation criteria used to assess performance, and limitations and drawbacks that require further investigation and improvement. This overview provides a deeper understanding of the current state of research in this field and of the opportunities for future research. It will be particularly useful for those who are interested in this area of study and seeking to expand their knowledge. In the second part of this thesis, a novel methodology is proposed for the identification of significant biomarkers in a multi-omics colon cancer dataset. The integration of clinical features with biomarker discovery has the potential to facilitate the early identification of mortality risk and the development of personalized therapies for a range of diseases, including cancer and stroke. Recent advancements in "omics" technologies have opened up new avenues for researchers to identify disease biomarkers through system-level analysis. Machine learning methods, particularly those based on tensor decomposition techniques, have gained popularity because the complexity of biological systems makes integrative analysis of multi-omics data challenging. Despite extensive efforts towards discovering disease-associated biomolecules by analyzing data from various "omics" experiments, such as genomics, transcriptomics, and metabolomics, the poor integration of diverse forms of "omics" data has made the integrative analysis of multi-omics data a daunting task. Our research applies ANOVA simultaneous component analysis (ASCA) and Tucker3 modeling to analyze a multivariate dataset with an underlying experimental design.
By comparing the spaces spanned by different model components, we showed how the two methods can be used for confirmatory analysis and provide complementary information. We demonstrated the novel use of ASCA to analyze the residuals of Tucker3 models in order to find the optimal model. Increasing the model complexity to more factors removed the last remaining ASCA-detectable structure in the residuals. Bootstrap analysis of the core matrix values of the Tucker3 models was used to check that additional triads of eigenvectors were needed to describe the remaining structure in the residuals. We also developed a simple, novel strategy for aligning Tucker3 bootstrap models with the Tucker3 model of the original data so that the eigenvectors of the three modes, the order of the values in the core matrix, and their algebraic signs match the original Tucker3 model without the need for complicated bookkeeping strategies or rotational transformations. Additionally, to avoid an overparameterized Tucker3 model, we used the bootstrap method to determine 95% confidence intervals of the loadings and core values. Important variables for classification were identified by inspection of the loading confidence intervals. The experimental results obtained using the colon cancer dataset demonstrate that our proposed methodology is effective in improving the performance of biomarker discovery in a multi-omics cancer dataset. Overall, our study highlights the potential of integrating multi-omics data with machine learning methods to gain deeper insights into the complex biological mechanisms underlying cancer and other diseases. The experimental results using the NIH colon cancer dataset demonstrate that the successful application of our proposed methodology in cancer subtype classification provides a foundation for further investigation into its utility in other disease areas. | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | http://hdl.handle.net/10342/13136 | |
| dc.language.iso | en | |
| dc.publisher | East Carolina University | |
| dc.subject | Biomarker Discovery | |
| dc.subject | Multi-omics data | |
| dc.subject | Tensor Decompositions | |
| dc.subject.lcsh | Colon (Anatomy)--Cancer | |
| dc.subject.lcsh | Biochemical markers | |
| dc.subject.lcsh | Cancer--Diagnosis | |
| dc.title | Cancer Subtyping Detection using Biomarker Discovery in Multi-Omics Tensor Datasets | |
| dc.type | Master's Thesis | |
| dc.type.material | text | 
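
As a rough illustration of the Tucker3 modeling and model-alignment ideas mentioned in the abstract, the sketch below fits a Tucker3 model to a three-way array with a truncated higher-order SVD and flips component signs so that a second model matches a reference model. It is a generic NumPy sketch under stated assumptions, not the thesis's procedure: the abstract does not describe how the Tucker3 models were fitted or how the bootstrap alignment works, and the function names (unfold, tucker3_hosvd, reconstruct, align_signs) and the random demo data are hypothetical.

```python
import numpy as np

def unfold(X, mode):
    """Matricize a 3-way array so that rows index the chosen mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def tucker3_hosvd(X, ranks):
    """Truncated higher-order SVD: orthonormal loadings per mode plus a core array.

    Assumption: HOSVD stands in for whatever fitting algorithm the thesis used.
    """
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U[:, :r])
    A, B, C = factors
    # Project the data onto the three loading matrices to obtain the core.
    core = np.einsum("ijk,ip,jq,kr->pqr", X, A, B, C)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the model approximation of the data from core and loadings."""
    A, B, C = factors
    return np.einsum("pqr,ip,jq,kr->ijk", core, A, B, C)

def align_signs(core, factors, ref_factors):
    """Flip component signs so each loading correlates positively with the reference.

    The same flips are applied to the core, so the reconstruction is unchanged.
    """
    core = core.copy()
    aligned = []
    for mode, (F, R) in enumerate(zip(factors, ref_factors)):
        s = np.sign(np.sum(F * R, axis=0))
        s[s == 0] = 1.0
        aligned.append(F * s)
        shape = [1, 1, 1]
        shape[mode] = -1
        core = core * s.reshape(shape)
    return core, aligned

# Hypothetical demo data: 6 samples x 5 variables x 4 conditions.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 5, 4))
core_demo, factors_demo = tucker3_hosvd(X_demo, (2, 2, 2))
residual_demo = X_demo - reconstruct(core_demo, factors_demo)
print("core shape:", core_demo.shape)
print("residual norm:", np.linalg.norm(residual_demo))
```

The residual array computed at the end is the kind of quantity the abstract describes inspecting for leftover structure. A full bootstrap would also need to match resampled rows and component order against the reference model, which this sketch leaves out.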
Files
Original bundle
- Name: KOLEINI-MASTERSTHESIS-2023.pdf
- Size: 8.38 MB
- Format: Adobe Portable Document Format
