{"id":74,"date":"2012-10-22T10:44:45","date_gmt":"2012-10-22T09:44:45","guid":{"rendered":"http:\/\/nextmovesoftware.com\/blog\/?p=74"},"modified":"2012-10-22T10:44:45","modified_gmt":"2012-10-22T09:44:45","slug":"using-pubchems-rest-interface-from-python","status":"publish","type":"post","link":"https:\/\/nextmovesoftware.com\/blog\/2012\/10\/22\/using-pubchems-rest-interface-from-python\/","title":{"rendered":"Using PubChem&#8217;s REST interface from Python"},"content":{"rendered":"<p>Version 1.0 of PubChem&#8217;s REST interface (PUG REST) went live on Sept 13. The <a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/\/pug_rest\/\">tutorial<\/a> gives examples of use, while full details are in the <a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/pug_rest\/PUG_REST.html\">spec<\/a>.<\/p>\n<p>The following code illustrates basic usage from Python. If you want to simultaneously query large numbers of molecules, or do a substructure search, you need to perform a query &#8220;in the background&#8221; using the ListKey approach shown by SubstructureToCids:<\/p>\n<style type=\"text\/css\">\n<!--\npre { font-family: monospace; color: #000000; background-color: #ffffff; }\n.Comment { color: #0000ff; }\n.Special { color: #6a5acd; }\n.Constant { color: #ff00ff; }\n.Identifier { color: #008080; }\n.Statement { color: #804040; font-weight: bold; }\n.PreProc { color: #a020f0; }\n-->\n<\/style>\n<pre>\r\n<span class=\"PreProc\">import<\/span> os\r\n<span class=\"PreProc\">import<\/span> sys\r\n<span class=\"PreProc\">import<\/span> time\r\n\r\n<span class=\"PreProc\">import<\/span> urllib2\r\n\r\n<span class=\"Statement\">def<\/span> <span class=\"Identifier\">getresult<\/span>(url):\r\n    <span class=\"Statement\">try<\/span>:\r\n        connection = urllib2.urlopen(url)\r\n    <span class=\"Statement\">except<\/span> urllib2.HTTPError, e:\r\n        <span class=\"Statement\">return<\/span> <span class=\"Constant\">&quot;&quot;<\/span>\r\n    <span 
class=\"Statement\">else<\/span>:\r\n        <span class=\"Statement\">return<\/span> connection.read().rstrip()\r\n\r\n<span class=\"Statement\">def<\/span> <span class=\"Identifier\">NameToSmiles<\/span>(name):\r\n    <span class=\"Statement\">return<\/span> getresult(<span class=\"Constant\">&quot;<a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/name\/%s\/property\/IsomericSMILES\/TXT\">http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/name\/%s\/property\/IsomericSMILES\/TXT<\/a>&quot;<\/span> % name)\r\n\r\n<span class=\"Statement\">def<\/span> <span class=\"Identifier\">NameToCid<\/span>(name):\r\n    <span class=\"Statement\">return<\/span> getresult(<span class=\"Constant\">&quot;<a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/name\/%s\/cids\/TXT\">http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/name\/%s\/cids\/TXT<\/a>&quot;<\/span> % name)\r\n\r\n<span class=\"Statement\">def<\/span> <span class=\"Identifier\">CidToSmiles<\/span>(cid):\r\n    <span class=\"Statement\">return<\/span> getresult(<span class=\"Constant\">&quot;<a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/cid\/%s\/property\/IsomericSMILES\/TXT\">http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/cid\/%s\/property\/IsomericSMILES\/TXT<\/a>&quot;<\/span> % cid)\r\n\r\n<span class=\"Statement\">def<\/span> <span class=\"Identifier\">SubstructureToCids<\/span>(smiles):\r\n    result = getresult(<span class=\"Constant\">&quot;<a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/substructure\/smiles\/%s\/XML\">http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/substructure\/smiles\/%s\/XML<\/a>&quot;<\/span> % smiles)\r\n    <span class=\"Statement\">if<\/span> <span class=\"Statement\">not<\/span> result:\r\n        <span class=\"Statement\">return<\/span> []\r\n    listkey = <span class=\"Constant\">&quot;&quot;<\/span>\r\n    <span class=\"Statement\">for<\/span> line <span class=\"Statement\">in<\/span> 
result.split(<span class=\"Constant\">&quot;<\/span><span class=\"Special\">\\n<\/span><span class=\"Constant\">&quot;<\/span>):\r\n        <span class=\"Statement\">if<\/span> line.find(<span class=\"Constant\">&quot;ListKey&quot;<\/span>)&gt;=<span class=\"Constant\">0<\/span>:\r\n            listkey = line.split(<span class=\"Constant\">&quot;ListKey&gt;&quot;<\/span>)[<span class=\"Constant\">1<\/span>][:-<span class=\"Constant\">2<\/span>]\r\n    <span class=\"Statement\">assert<\/span> listkey\r\n\r\n    url = <span class=\"Constant\">&quot;<a href=\"http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/listkey\/%s\/cids\/TXT\">http:\/\/pubchem.ncbi.nlm.nih.gov\/rest\/pug\/compound\/listkey\/%s\/cids\/TXT<\/a>&quot;<\/span> % listkey\r\n    delta = <span class=\"Constant\">2<\/span>\r\n    <span class=\"Statement\">while<\/span> <span class=\"Identifier\">True<\/span>:\r\n        time.sleep(delta) <span class=\"Comment\"># ...perchance to dream<\/span>\r\n        <span class=\"Statement\">if<\/span> delta &lt; <span class=\"Constant\">8<\/span>:\r\n            delta += <span class=\"Constant\">1<\/span>\r\n        result = getresult(url).split(<span class=\"Constant\">&quot;<\/span><span class=\"Special\">\\n<\/span><span class=\"Constant\">&quot;<\/span>)\r\n        <span class=\"Statement\">if<\/span> result[<span class=\"Constant\">0<\/span>] != <span class=\"Constant\">&quot;Your request is running&quot;<\/span>:\r\n            <span class=\"Statement\">break<\/span>\r\n    <span class=\"Statement\">return<\/span> result\r\n\r\n<span class=\"Statement\">if<\/span> __name__ == <span class=\"Constant\">&quot;__main__&quot;<\/span>:\r\n    <span class=\"Identifier\">print<\/span> NameToSmiles(<span class=\"Constant\">&quot;Gleevec&quot;<\/span>)\r\n    <span class=\"Identifier\">print<\/span> NameToCid(<span class=\"Constant\">&quot;Gleevec&quot;<\/span>)\r\n    <span class=\"Identifier\">print<\/span> CidToSmiles(<span 
class=\"Constant\">&quot;123596&quot;<\/span>)\r\n    <span class=\"Identifier\">print<\/span> SubstructureToCids(<span class=\"Constant\">&quot;CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)NC4=NC=CC(=N4)C5=CN=CC=C5&quot;<\/span>)\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Version 1.0 of PubChem&#8217;s REST interface (PUG REST) went live on Sept 13. The tutorial gives examples of use, while full details are in the spec. The following code illustrates basic usage from Python. If you want to simultaneously query large numbers of molecules, or do a substructure search, you need to perform a &hellip; <a href=\"https:\/\/nextmovesoftware.com\/blog\/2012\/10\/22\/using-pubchems-rest-interface-from-python\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Using PubChem&#8217;s REST interface from Python<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts\/74"}],"collection":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/comments?post=74"}],"version-history":[{"count":7,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts\/74\/revisions"}],"predecessor-version":[{"id":81,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts\/74\/revisions\/81"}],"wp:attachment":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/media?parent=74"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp
-json\/wp\/v2\/categories?post=74"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/tags?post=74"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}