In collaboration with Novartis (with particular thanks to Nadine Schneider), we have published a paper on the analysis of reactions that we text-mined from 40 years of US medicinal chemistry patents.
The paper covers the evolution of common reaction types over time, using NameRxn to provide the reaction classification. The classification is hierarchical, allowing a reaction to be classified at various levels of granularity. For example, a Chloro Suzuki coupling is a Suzuki coupling, which in turn is a C-C bond formation reaction. The properties of the reaction products were also analysed, revealing trends such as an increase in the number of rings over time.
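To make the idea of granularity concrete, here is a minimal sketch of how a hierarchical class can be rolled up to its parent levels. It assumes the class is written as a dot-separated code; the specific codes and the `LABELS` table below are illustrative assumptions made for this example, not output taken from NameRxn, and the helper names are hypothetical.

```python
# Illustrative labels only: the codes are assumed for this sketch,
# they are not read from NameRxn itself.
LABELS = {
    "3": "C-C bond formation",
    "3.1": "Suzuki coupling",
    "3.1.2": "Chloro Suzuki coupling",
}


def levels(code):
    """Return the code at every level of the hierarchy, coarsest first."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def describe(code):
    """Map each level of a code to its label, or 'unknown' if not listed."""
    return [LABELS.get(level, "unknown") for level in levels(code)]


# Walk from the broadest class down to the most specific one.
for depth, name in enumerate(describe("3.1.2"), start=1):
    print(depth, name)
```

Counting reactions at the first level gives broad trends (all C-C bond formations), while counting at the deepest level separates, for example, the individual Suzuki coupling variants.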
The reactions were extracted using a workflow that relies on LeadMine to identify and normalize chemicals and physical quantities. One quantity of particular interest that is extracted and associated with each reaction is the yield. This allowed us to identify reaction types with consistently low or high yields, and revealed a trend towards slightly lower yields over time.
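The kind of yield summary described above can be sketched as a simple group-and-aggregate step. This is only an illustration of the idea, not the analysis code used for the paper: the record layout, the `median_yield_by` helper, and every number in `records` are invented for the example.

```python
from collections import defaultdict
from statistics import median

# Invented example records: (reaction type, year, yield in percent).
# These values are made up and do not come from the paper.
records = [
    ("Suzuki coupling", 1995, 80.0),
    ("Suzuki coupling", 2010, 70.0),
    ("Amide formation", 1995, 90.0),
    ("Amide formation", 2010, 85.0),
    ("Amide formation", 2010, 75.0),
]


def median_yield_by(rows, key_index):
    """Group rows by the chosen column and return the median yield per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: median(values) for key, values in groups.items()}


by_type = median_yield_by(records, 0)  # compare reaction types
by_year = median_yield_by(records, 1)  # look for a trend over time
print(by_type)
print(by_year)
```

The median is used here because a handful of very low or very high reported yields would otherwise dominate a mean; that choice is an assumption of this sketch.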
Greg Landrum has kindly hosted interactive versions of some of the graphs from the paper here. In the Pipeline has also blogged positively about the paper here.