Version 1.0 of PubChem’s REST interface (PUG REST) went live on Sept 13. The tutorial gives examples of use, while full details are in the spec.
The following code illustrates basic usage from Python. If you want to query large numbers of molecules at once, or do a substructure search, you need to run the query “in the background” using the ListKey approach shown by SubstructureToCids:
import os
import sys
import time
import urllib2

def getresult(url):
    try:
        connection = urllib2.urlopen(url)
    except urllib2.HTTPError, e:
        return ""
    else:
        return connection.read().rstrip()

def NameToSmiles(name):
    return getresult("http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/%s/property/IsomericSMILES/TXT" % name)

def NameToCid(name):
    return getresult("http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/%s/cids/TXT" % name)

def CidToSmiles(cid):
    return getresult("http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/%s/property/IsomericSMILES/TXT" % cid)

def SubstructureToCids(smiles):
    result = getresult("http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/substructure/smiles/%s/XML" % smiles)
    if not result:
        return []
    listkey = ""
    for line in result.split("\n"):
        if line.find("ListKey")>=0:
            listkey = line.split("ListKey>")[1][:-2]
    assert listkey
    url = "http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/listkey/%s/cids/TXT" % listkey
    delta = 2
    while True:
        time.sleep(delta) # ...perchance to dream
        if delta < 8:
            delta += 1
        result = getresult(url).split("\n")
        if result[0] != "Your request is running":
            break
    return result

if __name__ == "__main__":
    print NameToSmiles("Gleevec")
    print NameToCid("Gleevec")
    print CidToSmiles("123596")
    print SubstructureToCids("CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)NC4=NC=CC(=N4)C5=CN=CC=C5")
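The listing above is Python 2. For anyone on Python 3, here is a rough sketch of the three pieces that are easiest to get wrong: encoding a name for the URL, pulling the ListKey out of the XML, and the polling loop. It is not part of the original code. The helper names (`name_url`, `extract_listkey`, `poll_listkey`), the `max_tries` cutoff, the `https` scheme and the shape of the sample XML in the usage note are all assumptions for illustration; only the URL paths and the "Your request is running" message are taken from the listing above.

```python
import time
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Assumption: same paths as the listing above, but over https.
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound"


def name_url(name, operation="cids"):
    """Build a name-lookup URL, percent-encoding the name (spaces etc.)."""
    return "%s/name/%s/%s/TXT" % (BASE, quote(name, safe=""), operation)


def extract_listkey(xml_text):
    """Return the text of the first ListKey element, or None if absent."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Compare only the local name so any XML namespace is ignored.
        if element.tag.rsplit("}", 1)[-1] == "ListKey":
            return (element.text or "").strip() or None
    return None


def poll_listkey(fetch, listkey, sleep=time.sleep, max_tries=20):
    """Poll the listkey URL until the result is ready.

    fetch(url) is any callable returning the response body as text.
    The delay starts at 2 s and grows by 1 s up to 8 s, as in the
    original loop; max_tries is a hypothetical safety limit.
    """
    url = "%s/listkey/%s/cids/TXT" % (BASE, listkey)
    delay = 2
    for _ in range(max_tries):
        sleep(delay)
        delay = min(delay + 1, 8)
        lines = fetch(url).split("\n")
        if lines[0] != "Your request is running":
            return lines
    raise RuntimeError("gave up waiting for listkey %s" % listkey)
```

Passing `fetch` and `sleep` in as arguments means the loop can be tried out without touching the network, for instance with `poll_listkey(lambda url: "1\n2\n3", "12345", sleep=lambda s: None)`. For real use, `fetch` could be a small wrapper around `urllib.request.urlopen` that decodes and strips the body, like `getresult` above. `extract_listkey` should cope with a response along the lines of `<Waiting><ListKey>12345</ListKey></Waiting>`, with or without a namespace on the root element.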
I took this idea a bit further: https://github.com/mcs07/PubChemPy