
An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses

  1. Author:
    Wells, Daria
    Guo, Amber
    Shao, Wei
    Bale, Michael
    Coffin, John M
    Hughes, Stephen
    Wu, Xiaolin
  2. Author Address

    Cancer Research Technology Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, PO Box B, Frederick, MD, 21702, USA.
    Advanced Biomedical Computational Science, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, MD, USA.
    HIV Dynamics and Replication Program, National Cancer Institute Frederick, National Institutes of Health, Frederick, MD, USA.
    Department of Molecular Biology and Microbiology, Tufts University, Boston, MA, USA.
    Cancer Research Technology Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, PO Box B, Frederick, MD, 21702, USA. forestwu@mail.nih.gov
    1. Year: 2020
    2. Date: Mar 09
    3. Epub Date: 2020 03 09
  1. Journal: BMC genomics
    1. Volume: 21
    2. Issue: 1
    3. Pages: 216
  2. Type of Article: Article
  3. Article Number: 216
  4. ISSN: 1471-2164
  1. Abstract:

    Background: All retroviruses, including human immunodeficiency virus (HIV), must integrate a DNA copy of their genomes into the genome of the infected host cell to replicate. Although integrated retroviral DNA, known as a provirus, can be found at many sites in the host genome, integration is not random. The adaption of linker-mediated PCR (LM-PCR) protocols for high-throughput integration site mapping, using randomly-sheared genomic DNA and Illumina paired-end sequencing, has dramatically increased the number of mapped integration sites. Analysis of samples from human donors has shown that there is clonal expansion of HIV infected cells and that clonal expansion makes an important contribution to HIV persistence. However, analysis of HIV integration sites in samples taken from patients requires extensive PCR amplification and high-throughput sequencing, which makes the methodology prone to certain specific artifacts.

    Results: To address the problems with artifacts, we use a comprehensive approach involving experimental procedures linked to a bioinformatics analysis pipeline. Using this combined approach, we are able to reduce the number of PCR/sequencing artifacts that arise and identify the ones that remain. Our streamlined workflow combines random cleavage of the DNA in the samples, end repair, and linker ligation in a single step. We provide guidance on primer and linker design that reduces some of the common artifacts. We also discuss how to identify and remove some of the common artifacts, including the products of PCR mispriming and PCR recombination, that have appeared in some published studies. Our improved bioinformatics pipeline rapidly parses the sequencing data and identifies bona fide integration sites in clonally expanded cells, producing an Excel-formatted report that can be used for additional data processing.

    Conclusions: We provide a detailed protocol that reduces the prevalence of artifacts that arise in the analysis of retroviral integration site data generated from in vivo samples and a bioinformatics pipeline that is able to remove the artifacts that remain.


External Sources

  1. DOI: 10.1186/s12864-020-6647-4
  2. PMID: 32151239
  3. PMCID: PMC7063773
  4. WOS: 000521046800001
  5. PII: 10.1186/s12864-020-6647-4

Library Notes

  1. Fiscal Year: FY2019-2020
NCI at Frederick
