SAVI, in silico generation of billions of easily synthesizable compounds through expert-system type rules

  1. Author:
    Patel, Hitesh [ORCID]
    Ihlenfeldt, Wolf-Dietrich [ORCID]
    Judson, Philip N [ORCID]
    Moroz, Yurii S [ORCID]
    Pevzner, Yuri [ORCID]
    Peach, Megan [ORCID]
    Delannee, Victorien [ORCID]
    Tarasova, Nadya [ORCID]
    Nicklaus, Marc [ORCID]
  2. Author Address

    Computer-Aided Drug Design Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD, 21702, USA.
    Xemistry GmbH, Schwalbenweg 5, D-61479, Glashütten, Germany.
    Heather Lea, Bland Hill, Norwood, Harrogate, HG3 1TE, England.
    Enamine Ltd, 78 Chervonotkatska Street, Suite 1, Kyiv, 02094, Ukraine and Chemspace LLC, 85 Chervonotkatska Street, Suite 1, Kyiv, 02094, Ukraine.
    AbbVie, Inc., North Chicago, IL, 60064, USA.
    Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA.
    Synthetic Biologics and Drug Discovery Group, Laboratory of Cancer Immunometabolism, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD, 21702, USA.
    Computer-Aided Drug Design Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD, 21702, USA. mn1@mail.nih.gov
    1. Year: 2020
    2. Date: NOV 11
    3. Epub Date: 2020 11 11
  1. Journal: Scientific data
    1. Volume: 7
    2. Issue: 1
    3. Pages: 384
  2. Type of Article: Article
  3. Article Number: 384
  4. ISSN: 2052-4463
  1. Abstract:

    We have made available a database of over 1 billion compounds predicted to be easily synthesizable, called Synthetically Accessible Virtual Inventory (SAVI). They have been created by a set of transforms based on an adaptation and extension of the CHMTRN/PATRAN programming languages describing chemical synthesis expert knowledge, which originally stem from the LHASA project. The chemoinformatics toolkit CACTVS was used to apply a total of 53 transforms to about 150,000 readily available building blocks (enamine.net). Only single-step, two-reactant syntheses were calculated for this database even though the technology can execute multi-step reactions. The possibility to incorporate scoring systems in CHMTRN allowed us to subdivide the database of 1.75 billion compounds in sets according to their predicted synthesizability, with the most-synthesizable class comprising 1.09 billion synthetic products. Properties calculated for all SAVI products show that the database should be well-suited for drug discovery. It is being made publicly available for free download from https://doi.org/10.35115/37n9-5738.

External Sources

  1. DOI: 10.1038/s41597-020-00727-4
  2. PMID: 33177514
  3. WOS: 000593906800006
  4. PII: 10.1038/s41597-020-00727-4

Library Notes

  1. Fiscal Year: FY2020-2021
NCI at Frederick
