Integrating data and knowledge to identify functional modules of genes: a multilayer approach

  1. Author:
    Liang, Lifan
    Chen, Vicky
    Zhu, Kunju
    Fan, Xiaonan
    Lu, Xinghua
    Lu, Songjian
  2. Author Address

    Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA; Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc, Frederick, USA; Clinical Medicine Research Institute, Jinan University, Guangzhou, 51063, Guangdong, China; Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, 710072, Shanxi, China; Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA. songjian@pitt.edu
    1. Year: 2019
    2. Date: May 02
    3. Epub Date: 2019 05 02
  1. Journal: BMC bioinformatics
    1. Volume: 20
    2. Issue: 1
    3. Pages: 225
  2. Type of Article: Article
  3. Article Number: 225
  4. ISSN: 1471-2105
  1. Abstract:

    Characterizing the modular structure of cellular networks is an important way to identify novel genes for targeted therapeutics. This has been made possible by the rise of high-throughput technologies. Unfortunately, computational methods for identifying functional modules have been limited by the data quality issues of high-throughput techniques. This study aims to integrate knowledge extracted from the literature to further improve the accuracy of functional module identification. Our new model and algorithm were applied to both yeast and human interactomes. The predicted functional modules covered over 90% of the proteins in both organisms while maintaining comparable overall accuracy. We found that combining mRNA expression information with biomedical knowledge greatly improved the performance of functional module identification, outperforming approaches that use only a protein interaction network weighted with transcriptomic data or with literature knowledge, or simply an unweighted protein interaction network. Our new algorithm also achieved better performance than some other well-known methods, especially in terms of positive predictive value (PPV), which indicates the confidence of novel discoveries. The higher PPV of the multiplex approach suggests that information from both sources has been effectively integrated to reduce false positives. With protein coverage higher than 90%, our algorithm is able to generate more novel biological hypotheses with higher confidence.

External Sources

  1. DOI: 10.1186/s12859-019-2800-y
  2. PMID: 31046665
  3. WOS: 000466873500004
  4. PII: 10.1186/s12859-019-2800-y

Library Notes

  1. Fiscal Year: FY2018-2019
NCI at Frederick
