Using PubChem’s REST interface from Python

Version 1.0 of PubChem’s REST interface (PUG REST) went live on Sept 13. The tutorial gives examples of use, while full details are in the spec.

The following code shows illustrates basic usage from Python. If you want to simultaneously query large numbers of molecules, or do a substructure search, you need to perform a query “in the background” using the ListKey approach shown by SubstructureToCids:

import os
import sys
import time

import urllib2

def getresult(url):
        connection = urllib2.urlopen(url)
    except urllib2.HTTPError, e:
        return ""

def NameToSmiles(name):
    return getresult("" % name)

def NameToCid(name):
    return getresult("" % name)

def CidToSmiles(cid):
    return getresult("" % cid)

def SubstructureToCids(smiles):
    result = getresult("" % smiles)
    if not result:
        return []
    listkey = ""
    for line in result.split("\n"):
        if line.find("ListKey")>=0:
            listkey = line.split("ListKey>")[1][:-2]
    assert listkey

    url = "" % listkey
    delta = 2
    while True:
        time.sleep(delta) # ...perchance to dream
        if delta < 8:
            delta += 1
        result = getresult(url).split("\n")
        if result[0] != "Your request is running":
    return result

if __name__ == "__main__":
    print NameToSmiles("Gleevec")
    print NameToCid("Gleevec")
    print CidToSmiles("123596")
    print SubstructureToCids("CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)NC4=NC=CC(=N4)C5=CN=CC=C5")

One thought on “Using PubChem’s REST interface from Python”

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.