What is a med chemist’s favourite phenyl substituent?

In the course of preparing a talk for the recent ACS meeting (more on this later), I thought it would be interesting to give an overview of the ChEMBL data on substituted phenyls. I took all matched series* with associated IC50 data that contain 4 or more phenyl substituents, and then counted how often each particular phenyl occurs.

In other words, when a medicinal chemist was trying to optimize the substituents around a phenyl ring, which were the most frequent groups tested?
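The counting step described above can be sketched roughly as follows. This is a minimal illustration rather than the code actually used for the analysis: the record layout, the identifiers and the `count_phenyls` helper are all hypothetical, and a series is keyed on (assay, scaffold) on the assumption that an assay identifier implies a single paper.

```python
from collections import Counter, defaultdict

# Hypothetical records: (assay_id, scaffold, phenyl_substituent).
# All identifiers and labels here are made up for illustration.
records = [
    ("assay1", "scaffoldA", "4-OMe"),
    ("assay1", "scaffoldA", "4-Cl"),
    ("assay1", "scaffoldA", "4-F"),
    ("assay1", "scaffoldA", "4-Me"),
    ("assay2", "scaffoldB", "4-OMe"),
    ("assay2", "scaffoldB", "4-Cl"),
    ("assay2", "scaffoldB", "3-Cl"),
    ("assay2", "scaffoldB", "2-F"),
    ("assay2", "scaffoldB", "4-Cl"),  # repeat measurement, counted once per series
    ("assay3", "scaffoldC", "4-OMe"),
    ("assay3", "scaffoldC", "4-F"),   # series too small, will be skipped
]


def count_phenyls(records, min_size=4):
    """Count each substituted phenyl across matched series that contain
    at least `min_size` distinct phenyl substituents."""
    # Group the distinct substituents seen for each (assay, scaffold) pair.
    series = defaultdict(set)
    for assay_id, scaffold, substituent in records:
        series[(assay_id, scaffold)].add(substituent)

    # Each qualifying series contributes one count per substituent it contains.
    counts = Counter()
    for substituents in series.values():
        if len(substituents) >= min_size:
            counts.update(substituents)
    return counts


counts = count_phenyls(records)
for substituent, n in counts.most_common():
    print(substituent, n)
```

Counting each substituent once per series (rather than once per molecule or measurement) is a design choice made here for simplicity; the original analysis may have counted differently.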

The most popular substituted phenyls

The order of popularity at the 4 position is OMe > Cl > F > Me, while at the 2 and 3 positions it’s Cl > OMe > F > Me. For these groups, the frequencies by ring position are in general in the order 4 >> 3 > 2. It would be interesting to know whether this corresponds to the ease of synthesis of these groups (in the general case) or whether other factors are at play.

In response to a query about whether the preferences have changed over time, I’ve generated the following image, which shows the frequencies for the period 1990-2013 (the x-axis). The y-axis shows each frequency divided by the total number of substituted phenyls that year.

Changes in frequencies over time
It’s a bit hard to draw any conclusions, but possibly 4-nitrile is becoming more popular, along with 3-F, while 2-NO2 and 2,3,4-OMe are going down.
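For reference, the per-year normalization used for the y-axis can be sketched as below. Again this is only an illustration under assumptions: the `yearly_fractions` helper and the toy observations are hypothetical, not the real ChEMBL data or the code behind the plot.

```python
from collections import Counter, defaultdict


def yearly_fractions(observations):
    """Given (year, substituent) pairs, return {year: {substituent: fraction}},
    where each fraction is the count divided by that year's total."""
    per_year = defaultdict(Counter)
    for year, substituent in observations:
        per_year[year][substituent] += 1

    fractions = {}
    for year, counts in per_year.items():
        total = sum(counts.values())  # all substituted phenyls seen that year
        fractions[year] = {s: n / total for s, n in counts.items()}
    return fractions


# Toy observations, made up for illustration only.
observations = [
    (1990, "4-OMe"), (1990, "4-OMe"), (1990, "4-Cl"), (1990, "2-NO2"),
    (2013, "4-CN"), (2013, "3-F"),
]

for year, fracs in sorted(yearly_fractions(observations).items()):
    print(year, fracs)
```

Dividing by the yearly total means the curves show relative preference, so a general increase in the amount of published data does not by itself make every substituent look more popular.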

*A matched (molecular) series is a series of analogs with the same scaffold but different R groups (all at the same position). In this context, each matched series contains only molecules from the same assay and paper.

See you at the ACS?

NextMovers will be present and presenting at the 248th ACS National Meeting in San Francisco, which starts this Sunday.

As well as having a booth (come by if interested in picking up an evaluation copy of Matsy), we will be giving the following presentations:

Sunday
11 – Classification, representation, and analysis of cyclic peptides and peptide-like analogs
Authors: Dr Roger A Sayle, Dr Daniel M Lowe, Dr Noel M O’Boyle
Division: CINF: Division of Chemical Information
Date/Time: Sunday, August 10, 2014 – 08:35 AM
Session Info: Computational Methods and the Development/Production of Biologics and Biosimilars (08:30 AM – 09:35 AM)
Location: Palace Hotel
Room: California Parlor

6 – Chemistry and reactions from non-US patents
Authors: Dr Daniel M Lowe, Dr Roger A Sayle
Division: CINF: Division of Chemical Information
Date/Time: Sunday, August 10, 2014 – 09:20 AM
Session Info: Hunting for Hidden Treasures: Chemistry Text Mining in Patents and Other Documents (08:40 AM – 12:00 PM)
Location: Palace Hotel
Room: Presidio

22 – Revising the Topliss decision tree based on 30 years of medicinal chemistry literature
Authors: Noel M O’Boyle, Jonas Boström, Roger A Sayle, Adrian Gill
Division: MEDI: Division of Medicinal Chemistry
Date/Time: Sunday, August 10, 2014 – 11:30 AM
Session Info: General Oral Session (08:30 AM – 12:10 PM)
Location: Moscone Center, West Bldg.
Room: 3008

Tuesday
394 – Using matched series to predict R groups that improve biological activity
Authors: Noel M O’Boyle, Jonas Boström, Roger A Sayle, Adrian Gill
Division: COMP: Division of Computers in Chemistry
Date/Time: Tuesday, August 12, 2014 – 06:00 PM
Session Info: Poster Session (06:00 PM – 08:00 PM)
Location: San Francisco Marriott Marquis
Room: Golden Gate Section A/B