NextMove Software
General Inquiries: info@nextmovesoftware.com
Support: support@nextmovesoftware.com

Sugar & Splice

Cheminformatics of oligopeptides, oligosaccharides and oligonucleotides

Biological macromolecules, such as peptides, proteins, RNA aptamers, carbohydrates and drug-antibody conjugates, pose unique challenges that stretch typical cheminformatics systems to their limits. Their size, complexity and repeated substructures make many of the standard approaches used for small molecules, such as fingerprint-based similarity or all-atom 2D co-ordinate depiction, inappropriate for these classes of compounds. As a result, pharmaceutical and biotechnology companies working with biologics often use different registration, compound management, ELN and analysis systems from those used for small molecule chemistry.
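
The effect of repeated substructures on set-based similarity can be shown with a small sketch. It is an analogy only: real chemical fingerprints hash atom-level paths or circular environments, whereas this toy uses residue pairs taken from a one-letter sequence as its features. The function names `kmer_set` and `tanimoto` are hypothetical and are not part of any NextMove Software product.

```python
def kmer_set(sequence, k=2):
    """Return the set of length-k residue windows in a sequence (a toy 'fingerprint')."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two different peptides built from the same repeating Gly-Ala unit.
short_peptide = "GAGAGA"
long_peptide = "GAGAGAGAGAGA"

# Both contain exactly the same features, so a presence/absence
# fingerprint cannot tell them apart.
similarity = tanimoto(kmer_set(short_peptide), kmer_set(long_peptide))
print(similarity)
```

Because a presence/absence fingerprint records only which features occur, not how often or in what order, the two peptides score as identical even though one is twice as long as the other.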

NextMove Software's Sugar & Splice toolkit and suite of tools are designed to bridge the gulf between cheminformatics and bioinformatics by providing functionality for seamlessly integrating the representations used in each domain. All-atom representations such as SMILES or MOL can be converted to IUPAC condensed line-notations or sequences (and vice versa). Non-standard residues are supported and the list can be extended by the user. This approach allows BLAST searching and sequence alignment of a chemical database stored as SMILES or MDL connection tables, or SMARTS substructure searching of protein databases, including non-standard amino acids and post-translational modifications.
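
To make the idea of interconversion concrete, here is a minimal sketch that is not the Sugar & Splice API and does not reflect how it works internally. It assumes a small hand-written table of L-amino acid SMILES fragments and converts between an IUPAC-style condensed notation (e.g. `H-Gly-Ala-OH`) and SMILES by plain string concatenation. The names `RESIDUES`, `condensed_to_smiles` and `smiles_to_condensed` are hypothetical. The reverse direction only recognises SMILES written in exactly this order; a real toolkit has to perceive residues from the molecular graph, since the same molecule can be written as many different SMILES strings.

```python
# Toy residue table: three-letter code -> SMILES fragment "N...C(=O)".
# Illustrative only; a real residue dictionary is far larger.
RESIDUES = {
    "Gly": "NCC(=O)",
    "Ala": "N[C@@H](C)C(=O)",
    "Ser": "N[C@@H](CO)C(=O)",
    "Phe": "N[C@@H](Cc1ccccc1)C(=O)",
    "Lys": "N[C@@H](CCCCN)C(=O)",
}


def condensed_to_smiles(notation, residues=RESIDUES):
    """Convert 'H-Xaa-...-OH' into SMILES by concatenating residue fragments."""
    parts = notation.split("-")
    if len(parts) < 3 or parts[0] != "H" or parts[-1] != "OH":
        raise ValueError("expected the form H-Xaa-...-OH")
    # The trailing 'O' completes the C-terminal carboxylic acid.
    return "".join(residues[name] for name in parts[1:-1]) + "O"


def smiles_to_condensed(smiles, residues=RESIDUES):
    """Recover 'H-Xaa-...-OH' from SMILES written in the order produced above."""
    if not smiles.endswith("O"):
        raise ValueError("expected a free C-terminal acid")
    body = smiles[:-1]
    # Try longer fragments first so a short fragment never shadows a longer one.
    by_length = sorted(residues.items(), key=lambda kv: len(kv[1]), reverse=True)
    names, pos = [], 0
    while pos < len(body):
        for name, fragment in by_length:
            if body.startswith(fragment, pos):
                names.append(name)
                pos += len(fragment)
                break
        else:
            raise ValueError(f"unrecognised residue at position {pos}")
    if not names:
        raise ValueError("no residues found")
    return "-".join(["H", *names, "OH"])


# A user-extended table with a non-standard residue (norleucine).
extended = dict(RESIDUES, Nle="N[C@@H](CCCC)C(=O)")
smiles = condensed_to_smiles("H-Phe-Nle-Lys-OH", extended)
print(smiles)
print(smiles_to_condensed(smiles, extended))
```

Passing the residue table as a parameter mirrors the point that the list of supported residues can be extended by the user. Once structures are expressed as sequences in this way, standard sequence tools can in principle be applied to them.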

Sugar & Splice also supports the generation of depictions based on IUPAC (Fig. 1) or FDA recommendations for peptides, and the SNFG guidelines for sugars.


Figure 1. Sugar & Splice depiction of Iturelix given its SMILES string as input

Further info
  • A presentation describing the combination of LeadMine and Sugar & Splice to identify and extract peptides from PubMed Abstracts presented at the ACS meeting in Boston, August 2018 [PDF]
  • A presentation describing the use of Sugar & Splice to identify and analyse biologics in PubChem presented at the ACS meeting in Washington DC, August 2017 [PDF]
  • A presentation describing Sugar & Splice's naming of named peptide derivatives presented at the ACS meeting in Boston, August 2015 [PDF]
  • A presentation describing Sugar & Splice's naming of unusual backbones and sidechain bridges presented at the ACS meeting in San Francisco, August 2014 [PDF]
  • A presentation describing Sugar & Splice's naming of non-standard amino acids presented at the ACS meeting in Dallas, March 2014 [PDF]
  • A presentation describing Sugar & Splice presented at the American Chemical Society (ACS) National Meeting in New Orleans, April 2013 [PDF]
  • A presentation describing Sugar & Splice's peptide perception and depiction presented at the American Chemical Society (ACS) National Meeting in San Diego, March 2012 [PDF]

Arthor provides fast state-of-the-art substructure and chemical similarity search capabilities for ultra-large databases of hundreds of millions of compounds, using SMARTS optimization, Just-In-Time compilation and/or GPUs.
CaffeineFix is used to rapidly match chemical names or terms against a dictionary or grammar (e.g. a grammar for IUPAC names). As well as use in text-mining, it can be used to provide autocomplete functionality and spell-correction.
Casandra is a server for delivering real-time safety warnings of experimental hazards straight to pharmaceutical electronic laboratory notebooks (ELNs).
HazELNut is a suite of tools used to extract, normalize and analyse information in Electronic Lab Notebooks (ELNs). This can be used to implement a search interface, find/eliminate duplicates, find similar reactions and so on.
LeadMine extracts chemical names and terms from text. It incorporates NextMove's CaffeineFix technology to find terms that match appropriate dictionaries or grammars. It has enhanced functionality to handle the patent literature.
Matsy is a set of tools for creating and analysing Matched Molecular Series (the general form of Matched Molecular Pairs). In particular, it can be used to suggest what compound to make next in a Medicinal Chemistry program.
MPSearch rapidly searches a database to find Matched Pairs related to a query molecule. This type of search is used to explore previous medicinal chemistry strategies.
NameRXN is used to classify and name reactions. It is particularly useful in the context of ELN analysis but also as a plugin to chemical drawing software. NameRXN builds on NextMove Software's Patsy technology.
Patsy is used to speed up SMARTS pattern matching by creating optimized SMARTS patterns or source code. Speed gains are particularly large when multiple SMARTS patterns are matched against a single structure.
Pistachio is a reaction dataset browser providing loading, querying, and analytics of chemical reactions. With over 9 million chemical reactions extracted from US & EPO patents, it demonstrates an AI interface to faceted (structure) search.
SmallWorld is an index of chemical space based on more than 230 billion molecular substructures. It can be used to measure similarity based on graph-edit distance, find the MCS of two or more molecules, analyse HTS results and much more.
Sugar & Splice can be used to perceive and depict biopolymer structure. It makes it easy to interconvert between small-molecule representations (e.g. SMILES, MOL) and biopolymer representations (HELM, IUPAC line notation).