{"id":1359,"date":"2015-06-01T16:04:51","date_gmt":"2015-06-01T15:04:51","guid":{"rendered":"http:\/\/nextmovesoftware.com\/blog\/?p=1359"},"modified":"2015-06-01T16:05:17","modified_gmt":"2015-06-01T15:05:17","slug":"substructure-search-face-off-are-the-slowest-queries-the-same-between-tools","status":"publish","type":"post","link":"https:\/\/nextmovesoftware.com\/blog\/2015\/06\/01\/substructure-search-face-off-are-the-slowest-queries-the-same-between-tools\/","title":{"rendered":"Substructure Search Face-off: Are the slowest queries the same between tools?"},"content":{"rendered":"<p>At the recent\u00a0<a href=\"http:\/\/c-inf.net\/\">Cambridge Cheminformatics Network Meeting (CCNM)<\/a>\u00a0we presented a\u00a0performance benchmark of substructure searching tools using the same queries, target dataset, and hardware. Whilst many tools publish figures for isolated benchmarks, the use of different query sets and variations in target database size makes it impossible to determine how tools compare to each other.<\/p>\n<p>The talk compared the performance of various tools and offered insight into their performance characteristics.<\/p>\n<p><center><br \/>\n<iframe loading=\"lazy\" src=\"\/\/www.slideshare.net\/slideshow\/embed_code\/key\/LtTDc4cfpvWkxz\" width=\"510\" height=\"420\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" style=\"border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;\" allowfullscreen> <\/iframe><br \/>\n<\/center><\/p>\n<p>A question was asked at the talk as to whether the slowest queries were always the same.\u00a0As expected, there is some correlation (benzene is always bad), but there are some rather dramatic differences within and between tools. For example, the time taken to query Anthracene or Zinc varies, with some tools finding Anthracene hits faster (marked as &lt;) and others finding Zinc faster (marked as &gt;). 
<\/p>\n<p>The rank (per tool, counting from the slowest query) is provided as a guide to how many queries took longer than the one listed here.<\/p>\n<table>\n<tr>\n<th><\/th>\n<th colspan=\"2\">Anthracene<\/th>\n<th><\/th>\n<th colspan=\"2\">Zinc<\/th>\n<\/tr>\n<tr>\n<th>Tool<\/th>\n<th>Query Time (s)<\/th>\n<th>Rank (slow)<\/th>\n<th><\/th>\n<th>Query Time (s)<\/th>\n<th>Rank (slow)<\/th>\n<\/tr>\n<tr>\n<td>arthor<\/td>\n<td>2.254<\/td>\n<td>3<\/td>\n<td>&gt;<\/td>\n<td>0.357<\/td>\n<td>2602<\/td>\n<\/tr>\n<tr>\n<td>arthor+fp<\/td>\n<td>0.022<\/td>\n<td>285<\/td>\n<td>&gt;<\/td>\n<td>0.001<\/td>\n<td>1667<\/td>\n<\/tr>\n<tr>\n<td>rdcart<\/td>\n<td>0.698<\/td>\n<td>794<\/td>\n<td>&lt;<\/td>\n<td>202<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>rdlucene<\/td>\n<td>27.126<\/td>\n<td>566<\/td>\n<td>&gt;<\/td>\n<td>23.87<\/td>\n<td>600<\/td>\n<\/tr>\n<tr>\n<td>pgchem<\/td>\n<td>28.231<\/td>\n<td>138<\/td>\n<td>&gt;<\/td>\n<td>18.181<\/td>\n<td>197<\/td>\n<\/tr>\n<tr>\n<td>mychem<\/td>\n<td>48.289<\/td>\n<td>108<\/td>\n<td>&gt;<\/td>\n<td>34.145<\/td>\n<td>159<\/td>\n<\/tr>\n<tr>\n<td>fastsearch<\/td>\n<td>396<\/td>\n<td>99<\/td>\n<td>&gt;<\/td>\n<td>285<\/td>\n<td>126<\/td>\n<\/tr>\n<tr>\n<td>bingo-nosql<\/td>\n<td>0.448<\/td>\n<td>451<\/td>\n<td>&lt;<\/td>\n<td>1.311<\/td>\n<td>260<\/td>\n<\/tr>\n<tr>\n<td>bingo-pgsql<\/td>\n<td>0.392<\/td>\n<td>638<\/td>\n<td>&gt;<\/td>\n<td>0.060<\/td>\n<td>1228<\/td>\n<\/tr>\n<tr>\n<td>tripod-ss<\/td>\n<td>21.797<\/td>\n<td>350<\/td>\n<td>&lt;<\/td>\n<td>1441<\/td>\n<td>18<\/td>\n<\/tr>\n<tr>\n<td>orchem<\/td>\n<td>27.075<\/td>\n<td>906<\/td>\n<td>&gt;<\/td>\n<td>0.721<\/td>\n<td>2390<\/td>\n<\/tr>\n<\/table>\n<p>As promised, the query and target IDs are available <a href=\"http:\/\/nextmovesoftware.com\/blog\/supplementary\/sss-faceoff\/\">here<\/a>.<\/p>\n<p>If this is an area of interest to you, feel free to get in touch.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>At the recent\u00a0Cambridge Cheminformatics Network Meeting (CCNM)\u00a0we 
presented a\u00a0performance benchmark of substructure searching tools using the same queries, target dataset, and hardware. Whilst many tools publish figures for isolated benchmarks, the use of different query sets and variations in target database size makes it impossible to determine how tools compare to each other. The talk &hellip; <a href=\"https:\/\/nextmovesoftware.com\/blog\/2015\/06\/01\/substructure-search-face-off-are-the-slowest-queries-the-same-between-tools\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Substructure Search Face-off: Are the slowest queries the same between tools?<\/span><\/a><\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts\/1359"}],"collection":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/comments?post=1359"}],"version-history":[{"count":48,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts\/1359\/revisions"}],"predecessor-version":[{"id":1409,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/posts\/1359\/revisions\/1409"}],"wp:attachment":[{"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/media?parent=1359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/categories?post=1359"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nextmovesoftware.com\/blog\/wp-json\/wp\/v2\/tags?post=1359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/
{rel}","templated":true}]}}