A 2-Level Granular Network Model for Biological Data Mining Analysis

Title: A 2-Level Granular Network Model for Biological Data Mining Analysis
Author: Gadish, Moshe
Department: School of Computer Science
Program: Computer Science
Advisor: Chiu, David
Abstract: The discovery of relationships between variables in a biological system is pivotal in bioinformatics and systems biology. Biological network inference is complex, and new integrative techniques still need to be explored. In this thesis, a 2-level granular approach is proposed to model, analyze, and compare biological networks. The system comprises a base data layer and two levels of granulation. In the base layer, the raw data are transformed into a common discrete type. The first level of analysis detects pairwise relationships based on normalized statistically significant expected mutual information (NSEMI). The second level of analysis detects convergent associations, which identify highly associative nodes, based on two different measures of convergent mutual information: cumulative connectivity and high connectivity. The proposed approach is tested using benchmark data with known network structures from yeast and E. coli. The level 1 analysis produces predictions with accuracies above 95%. The level 2 analysis, as expected, results in significant improvements in accuracy due to significant reductions in false positives. Next, the benchmark data are used to test two additional network discovery methods used for Bayesian networks: K2 and BDeu. Our approach outperforms both Bayesian methods. The level 2 analysis is then integrated into two well-cited and widely adopted information-based algorithms: CLR and ARACNE. The results demonstrate similar reductions in false positives and improvements in accuracy. Finally, the approach is applied to the Enviropig™ datasets, using data collected from the muscle tissue of both transgenic and conventional samples. Analyses reveal visible differences between the sexes, as well as between the transgenic and conventional pig lines. Also, strong associations are evident among the minerals and the fatty acids, regardless of sex and pig line. The detected association patterns are observed to be stronger among males and conventional pigs.
Key Words: systems biology, granular computing, transgenic, significant expected mutual information, Enviropig™, mutual associations, convergent associations.
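The two-level idea in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not define NSEMI or the connectivity measures, so the normalization by joint entropy, the fixed threshold standing in for the statistical significance test, and all function names (`normalized_mi`, `level1_edges`, `level2_cumulative`) are assumptions made purely for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equal-length discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def normalized_mi(x, y):
    """Mutual information divided by joint entropy.

    Assumption: this normalization is chosen for illustration only;
    the abstract does not state how NSEMI is normalized.
    """
    h_xy = entropy(list(zip(x, y)))
    return mutual_information(x, y) / h_xy if h_xy > 0 else 0.0


def level1_edges(data, threshold):
    """Level 1 (sketch): keep variable pairs scoring at or above a threshold.

    Assumption: a fixed threshold stands in for the significance test.
    """
    names = sorted(data)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = normalized_mi(data[a], data[b])
            if score >= threshold:
                edges[(a, b)] = score
    return edges


def level2_cumulative(edges):
    """Level 2 (sketch): sum the edge scores incident on each node."""
    totals = Counter()
    for (a, b), score in edges.items():
        totals[a] += score
        totals[b] += score
    return dict(totals)


# Hypothetical discretized data: g2 copies g1, g3 is independent of both.
data = {
    "g1": [0, 0, 1, 1, 0, 1, 0, 1],
    "g2": [0, 0, 1, 1, 0, 1, 0, 1],
    "g3": [0, 1, 0, 1, 1, 0, 0, 1],
}
edges = level1_edges(data, threshold=0.5)
node_scores = level2_cumulative(edges)
```

In this toy input only the `g1`/`g2` pair survives level 1, and level 2 then ranks nodes by their accumulated association strength. A realistic version would replace the fixed threshold with a proper significance test on the mutual information estimate.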
Date: 2014-01-31

Files in this item

File                           Size      Format   Description
Gadish_Moshe_201401_PhD.pdf    3.014Mb   PDF      Thesis
