On the other other hand

My earlier “On the other hand” blog post considered some of the issues of representing D-amino acids. In this post, I discuss the representation of amino acids with sidechain stereochemistry in nomenclature and peptide registration systems. Handling chiral sidechains is potentially tricky, as indicated by the Pistoia Alliance’s HELM editor, which restricts the user to only 17 (of the 19) standard D-form amino acids, explicitly prohibiting the specification of D-threonine and D-isoleucine.

Threonine (Thr) and Isoleucine (Ile)

The most frequently encountered cases of sidechain stereochemistry occur in the naturally occurring amino acids threonine and isoleucine, which each contain a chiral carbon atom at their beta carbon position.

L-Thr aka (2S,3R)  PDB Code: THR  CID6288
L-Ile aka (2S,3S)  PDB Code: ILE  CID6306

By convention, the D-forms of these amino acids flip both stereocenters.

D-Thr aka (2R,3S)  PDB Code: DTH  CID69435
D-Ile aka (2R,3R)  PDB Code: DIL  CID76551

The forms of these amino acids where just the sidechain stereochemistry is inverted are referred to as “allo-” forms, allothreonine (written aThr or alloThr) and alloisoleucine (written aIle or alloIle).

L-aThr aka (2S,3S)  PDB Code: ALO  CID99289
D-aThr aka (2R,3R)  PDB Code: 2TL  CID90624
L-aIle aka (2S,3R)  PDB Code: IIL  CID99288
D-aIle aka (2R,3S)  PDB Code: ???  CID94206
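The eight fully specified forms listed so far follow a simple rule, which can be sketched in a few lines of Python. This is only an illustration of the convention, not code from any registration system; the names `L_BETA`, `FLIP` and `symbol` are my own, and the sketch assumes that (2S) corresponds to the L-form, which holds for Thr and Ile but not for every amino acid.

```python
# Beta-carbon descriptor of the plain (non-allo) L-form, taken from the lists above.
L_BETA = {"Thr": "R", "Ile": "S"}
FLIP = {"R": "S", "S": "R"}

def symbol(code, alpha, beta):
    """Map fully specified (2alpha,3beta) descriptors to a symbol such as 'D-aThr'."""
    # Assumption: (2S) means L and (2R) means D, true for Thr and Ile.
    form = "L" if alpha == "S" else "D"
    # The D-form inverts both centers, so its reference beta descriptor flips too.
    reference = L_BETA[code] if alpha == "S" else FLIP[L_BETA[code]]
    allo = "" if beta == reference else "a"
    return f"{form}-{allo}{code}"

for code in ("Thr", "Ile"):
    for alpha in "SR":
        for beta in "RS":
            print(f"(2{alpha},3{beta}) -> {symbol(code, alpha, beta)}")
```

Inverting both centers keeps a residue in the same allo or non-allo family, while inverting only one of them switches family.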

Things really get interesting when stereochemistry is unspecified (either a racemate or an unresolved chiral center) at either of these stereocenters. This is not uncommon when working with SMILES strings or MOL files, but almost always indicates some loss of information, as the biology/chemistry will nearly universally refer to one of the four fully specified stereoisomers above.

Perhaps the easiest case to denote is unspecified tetrahedral stereochemistry at the alpha carbon, for which the “DL-” prefix is conventionally used.

DL-Thr aka (2?,3R)  CID17757244
DL-aThr aka (2?,3S)  CID17757249
DL-Ile aka (2?,3S)  CID10396882
DL-aIle aka (2?,3R)  CID17757247

A less widely appreciated convention is the use of the Greek letter xi (ξ) in amino acid and natural product nomenclature for chiral centers of unknown configuration (3AA-4.5). Here I propose the use of the prefix “xi” or “xi-” in an identical way to “allo” or “allo-” to produce xi-threonine (xiThr) and xi-isoleucine (xiIle) when the beta carbon stereochemistry is undefined/unspecified.

L-xiThr aka (2S,3?)  CID11768555
D-xiThr aka (2R,3?)  CID6399258
DL-xiThr aka (2?,3?)  CID205
L-xiIle aka (2S,3?)  CID5351546
D-xiIle aka (2R,3?)  CID11051686
DL-xiIle aka (2?,3?)  CID791
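To show that the “DL-”, “allo” and proposed “xi” prefixes combine unambiguously, here is a small Python sketch that reads a symbol back into its pair of descriptors, using “?” for an unspecified center. The function name `descriptors` and the regular expression are hypothetical choices of mine, and the (2S) = L assumption again only applies to Thr and Ile.

```python
import re

# Beta-carbon descriptor of the plain (non-allo) L-form, from the lists above.
L_BETA = {"Thr": "R", "Ile": "S"}
FLIP = {"R": "S", "S": "R"}
# Accepts e.g. 'L-Thr', 'D-aIle', 'L-allo-Thr', 'DL-xiThr'.
PATTERN = re.compile(r"^(DL|D|L)-(allo|a|xi)?-?(Thr|Ile)$")

def descriptors(symbol):
    """Return (alpha, beta) descriptors, with '?' for an unspecified center."""
    match = PATTERN.match(symbol)
    if match is None:
        raise ValueError(f"unrecognised symbol: {symbol}")
    form, modifier, code = match.groups()
    # Assumption: L is (2S) and D is (2R), true for Thr and Ile.
    alpha = {"L": "S", "D": "R", "DL": "?"}[form]
    if modifier == "xi":
        return alpha, "?"
    # Plain L- and DL- symbols share the L reference; plain D- uses its mirror image.
    beta = FLIP[L_BETA[code]] if form == "D" else L_BETA[code]
    if modifier in ("a", "allo"):
        beta = FLIP[beta]
    return alpha, beta

for name in ("L-Thr", "D-aThr", "DL-aIle", "L-xiThr", "DL-xiIle"):
    alpha, beta = descriptors(name)
    print(f"{name} aka (2{alpha},3{beta})")
```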

4-Hydroxyproline, Hyp

An example of a non-standard (but frequently occurring) amino acid with sidechain stereochemistry is “4-hydroxyproline”. Here the symbol Hyp is understood to refer to the more common trans- form, so the prefix “cis” or “cis-” is used to refer to the alternate configuration, as in the symbol “cis-Hyp”.

L-Hyp aka (2S,4R)  PDB Code: HYP  CID5810
L-cisHyp aka (2S,4S)  PDB Code: HZP  CID440015
D-Hyp aka (2R,4S)  PDB Code: ???  CID440074
D-cisHyp aka (2R,4R)  PDB Code: ???  CID440014

Once again, unspecified configurations at the alpha- and gamma-carbon locants of Hyp can be described by “DL-” and “xi-” prefixes as before.

DL-Hyp aka (2?,4R)  CID54196981
DL-cisHyp aka (2?,4S)  CID21353534
L-xiHyp aka (2S,4?)  CID69248
D-xiHyp aka (2R,4?)  CID5318330
DL-xiHyp aka (2?,4?)  CID825
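The nine Hyp symbols above can be generated from the two descriptors with a short function. As before this is a minimal sketch of the convention rather than anyone's production code; `hyp_symbol` is a name I made up, and “?” stands for an unspecified center.

```python
def hyp_symbol(alpha, gamma):
    """Name 4-hydroxyproline from its C2 and C4 descriptors ('R', 'S' or '?')."""
    # From the lists above: (2S) is L, (2R) is D, unspecified is DL.
    form = {"S": "L", "R": "D", "?": "DL"}[alpha]
    if gamma == "?":
        return f"{form}-xiHyp"
    # The default trans form is (2S,4R) for L and (2R,4S) for D;
    # DL- symbols are written relative to the L reference.
    trans_gamma = "S" if alpha == "R" else "R"
    cis = "" if gamma == trans_gamma else "cis"
    return f"{form}-{cis}Hyp"

for alpha in "SR?":
    for gamma in "RS?":
        print(f"(2{alpha},4{gamma}) -> {hyp_symbol(alpha, gamma)}")
```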

Note that although a few sources refer to names such as “cis-D-Hyp”, it is more usual to order terms consistently (where possible) with the “D-”, “L-” or “DL-” prefix at the start and the “allo”, “xi”, “cis”, “nor” and “homo” prefixes adjacent to the three-letter code.
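A registration system could apply this ordering automatically when it meets a variant spelling. The sketch below is one hypothetical way to do so (the function `normalise` is my own invention); it assumes hyphen-separated prefixes, exactly one of “D”, “L” or “DL”, and leaves any parenthesised substituent untouched.

```python
FORMS = {"D", "L", "DL"}

def normalise(symbol):
    """Move the D/L/DL prefix to the front and attach other prefixes to the code."""
    # Keep any parenthesised substituent, e.g. '(R-O)', out of the hyphen split.
    head, paren, tail = symbol.partition("(")
    parts = head.split("-")
    code = parts[-1]
    forms = [p for p in parts[:-1] if p in FORMS]
    modifiers = [p for p in parts[:-1] if p not in FORMS]
    if len(forms) != 1:
        raise ValueError(f"expected exactly one D/L/DL prefix in {symbol!r}")
    return f"{forms[0]}-{''.join(modifiers)}{code}{paren}{tail}"

print(normalise("cis-D-Hyp"))
print(normalise("L-allo-Thr"))
```

Symbols that are already in the preferred order, such as “D-cisHyp”, pass through unchanged.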

Methionine sulfoxide, Met(O)

A simpler case of sidechain stereochemistry occurs when the amino acid name doesn’t imply a default stereochemistry. In these cases, the usual Cahn-Ingold-Prelog (CIP) rules can be used to assign R and S (or E and Z) descriptors appropriately. A simple example of this is methionine sulfoxide, which is commonly represented by the symbol “Met(O)”. In this case, the sulfur atom bearing the oxygen substituent may adopt one of two configurations, requiring an “R-” or “S-” prefix to the substituent suffix.

L-Met(O)  CID158980
D-Met(O)  CID148508
DL-Met(O)  CID847
L-Met(R-O)  CID10062737
L-Met(S-O)  CID10909908
D-Met(R-O)  CID11829787
D-Met(S-O)  CID9577091
DL-Met(R-O)  CID ???
DL-Met(S-O)  CID57148329
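Because there is no default configuration to compare against, building these symbols is just string assembly. A minimal sketch, with the hypothetical helper `met_sulfoxide` and “?” marking an unspecified sulfur center:

```python
def met_sulfoxide(form, sulfur="?"):
    """Build a symbol such as 'L-Met(R-O)'; sulfur is 'R', 'S' or '?'."""
    if form not in ("L", "D", "DL"):
        raise ValueError(f"unknown alpha-carbon prefix: {form!r}")
    if sulfur == "?":
        # Unspecified sulfur configuration: no descriptor inside the parentheses.
        return f"{form}-Met(O)"
    if sulfur not in ("R", "S"):
        raise ValueError(f"unknown sulfur descriptor: {sulfur!r}")
    return f"{form}-Met({sulfur}-O)"

for form in ("L", "D", "DL"):
    for sulfur in ("?", "R", "S"):
        print(met_sulfoxide(form, sulfur))
```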

Image credit: EmsiProduction on Flickr