Detecting genomic deletions from high-throughput sequence data with unsupervised learning

  1. Author:
    Li, Xin
    Wu, Yufeng
  2. Author Address:

    Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, USA. xin.li4@nih.gov
    Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc, Frederick, MD, 21702, USA. xin.li4@nih.gov
    Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, 06269, USA.
    1. Year: 2023
    2. Date: Jan 27
    3. Epub Date: 2023-01-27
  1. Journal: BMC Bioinformatics
    1. Volume: 23
    2. Issue: Suppl 8
    3. Pages: 568
  2. Type of Article: Article
  3. Article Number: 568
  1. Abstract:

    Structural variation (SV), which ranges from 50 bp to ~3 Mb in size, is an important type of genetic variation. A deletion is a type of SV in which part of a chromosome or a DNA sequence is lost during DNA replication. Three types of signals (discordant read-pairs, read depth and split reads) are commonly used for SV detection from high-throughput sequence data. Many tools have been developed for detecting SVs by using one or more of these signals. In this paper, we develop a new method called EigenDel for detecting germline submicroscopic genomic deletions. EigenDel first uses discordant read-pairs and clipped reads to obtain initial deletion candidates, and then clusters similar candidates by using unsupervised learning methods. After that, EigenDel uses a purpose-built approach to call true deletions from each cluster. We conduct various experiments to evaluate the performance of EigenDel on low coverage sequence data. Our results show that EigenDel outperforms other major methods at balancing accuracy and sensitivity and at reducing bias. EigenDel can be downloaded from https://github.com/lxwgcool/EigenDel . © 2023. The Author(s).
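
    The candidate-then-cluster-then-call flow described in the abstract can be illustrated with a minimal Python sketch. This is not EigenDel's implementation: the abstract does not specify the clustering method or the calling rule, so the sketch substitutes a simple gap-based grouping and a minimum-support rule. The names `ReadPair`, `discordant_pairs`, `cluster_candidates`, `call_deletion` and all thresholds are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical, simplified view of an aligned read pair."""
    left_end: int     # rightmost aligned base of the left mate
    right_start: int  # leftmost aligned base of the right mate
    insert_size: int  # observed outer distance between the mates


def discordant_pairs(pairs: List[ReadPair], mean_insert: float,
                     sd_insert: float, k: float = 3.0) -> List[ReadPair]:
    """Keep pairs whose insert size is unusually large (assumed cutoff:
    mean + k * sd), the classic read-pair signal of a deletion."""
    cutoff = mean_insert + k * sd_insert
    return [p for p in pairs if p.insert_size > cutoff]


def cluster_candidates(pairs: List[ReadPair],
                       max_gap: int = 100) -> List[List[ReadPair]]:
    """Stand-in for the unsupervised step: sort by left_end and start a
    new cluster whenever the gap to the previous pair exceeds max_gap."""
    clusters: List[List[ReadPair]] = []
    for p in sorted(pairs, key=lambda q: q.left_end):
        if clusters and p.left_end - clusters[-1][-1].left_end <= max_gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters


def call_deletion(cluster: List[ReadPair],
                  min_support: int = 2) -> Optional[Tuple[int, int]]:
    """Call a deletion only if enough pairs support it; the interval is
    the region spanned by none of the supporting mates."""
    if len(cluster) < min_support:
        return None
    start = max(p.left_end for p in cluster)
    end = min(p.right_start for p in cluster)
    return (start, end) if end > start else None


pairs = [
    ReadPair(1000, 1900, 1200),
    ReadPair(1020, 1880, 1180),
    ReadPair(5000, 5100, 400),   # concordant, filtered out
    ReadPair(9000, 9800, 1100),  # discordant but unsupported
]
candidates = discordant_pairs(pairs, mean_insert=400, sd_insert=50)
calls = [call_deletion(c) for c in cluster_candidates(candidates)]
```

    In this toy input, two pairs support one deletion and a lone discordant pair is rejected for lack of support. A real pipeline would read alignments from a BAM file and would also use clipped reads to refine breakpoints.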

External Sources

  1. DOI: 10.1186/s12859-023-05139-w
  2. PMID: 36707775
  3. PMCID: PMC9881243
  4. PII: 10.1186/s12859-023-05139-w

Library Notes

  1. Fiscal Year: FY2022-2023
NCI at Frederick
