<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.nodalpoint.org" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>nodalpoint.org - Lincoln Stein - Comments</title>
 <link>http://www.nodalpoint.org/nodalpoint_tags/lincoln_stein</link>
 <description>Comments for &quot;Lincoln Stein&quot;</description>
 <language>en</language>
<item>
 <title>-10 Ooops</title>
 <link>http://www.nodalpoint.org/2007/08/06/scifoo_day_3_genome_voyeurism_with_lincoln_stein#comment-4136</link>
 <description>&lt;p&gt;Ooops, my &lt;a href=&quot;http://www.nodalpoint.org/2006/11/28/new_improved_semantic_web&quot;&gt;semantic web agent&lt;/a&gt; did the sums wrong (bad ontology) and made Jim aged 89, not 79... corrected it now!&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Fri, 10 Aug 2007 05:05:19 -0400</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 4136 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>+10 ?</title>
 <link>http://www.nodalpoint.org/2007/08/06/scifoo_day_3_genome_voyeurism_with_lincoln_stein#comment-4135</link>
 <description>&lt;p&gt;Man, that sequence must have been pretty scary ... seeing his own genome sequence instantly aged Lucky Jim an extra 10 years!   ;-)&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Thu, 09 Aug 2007 23:30:12 -0400</pubDate>
 <dc:creator>radmap</dc:creator>
 <guid isPermaLink="false">comment 4135 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Databases in Peril</title>
 <link>http://www.nodalpoint.org/2007/01/05/nar_database_issue_2007#comment-3277</link>
 <description>&lt;p&gt;Thanks for all your comments, here are some thoughts...&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Quantity &lt;i&gt;is&lt;/i&gt; a significant problem&lt;/b&gt;: we&#039;re not just talking about individual databases like GenBank getting bigger and bigger, we&#039;re talking about more and more different types of database. Potentially, we want to allow the combination of data from &lt;i&gt;any&lt;/i&gt; of these databases, and from others that will appear in the future. Obviously, any given researcher probably isn&#039;t going to want to search all 900+ databases, but it would benefit the wider scientific community if all these databases could easily interoperate. The more databases there are, the more challenging easy interoperation becomes, because there is more heterogeneity: more APIs, more schemas, etc.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Peer-reviewed publication can help assess quality&lt;/b&gt;: this is what peer review is for. The editors of this issue claim to look for good quality data as well as a good quality interface. As pointed out in the comments above, &amp;#x201C;anyone with a modicum of knowledge can put a database or web app online&amp;#x201D;. By itself, this is not enough for publication. It is no good having great data with an awkward non-standard interface, or vice versa. The NAR database issue may well be an &amp;#x201C;easy&amp;#x201D; publication, but that doesn&#039;t make it any less important. The &lt;a href=&quot;http://dx.doi.org/10.1038/4351010a&quot; rev=&quot;review&quot;&gt;Databases in Peril&lt;/a&gt; article wouldn&#039;t have been possible if NAR hadn&#039;t been faithfully recording all this information in the first place. I suspect publication in the NAR database issue is harder than some have suggested. It&#039;s not just a case of shoving a database on the web and then writing a paper about it; you have to convince the reviewers that the database is worthy: novel, useful and usable.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Churn is inevitable but the overall trend is still upward&lt;/b&gt;: databases (and tools) are not immortal, and some are bound to wither and die eventually. Since last year 11 databases have gone this way, and the &lt;a href=&quot;http://dx.doi.org/10.1093/nar/gkl1008&quot;&gt;article&lt;/a&gt; discusses why. The general trend is still upward and will probably keep going. In the long run, the longevity of a database can be an indicator of its quality, because it shows that somebody cares and is skilled enough to maintain and fund it for a long period of time. As for the databases that are &amp;#x201C;struggling financially&amp;#x201D; (according to Nature), how is this news? Struggling could mean anything. Haven&#039;t you always had to fight for sustained funding of any scientific project?&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Standards are boring (but important)&lt;/b&gt;: it can be difficult to get standards work funded, done and published; this is what John Quackenbush calls &lt;a href=&quot;http://dx.doi.org/10.1038/msb4100052&quot; title=&quot;Standardising the standards Molecular Systems Biology 2, 1 (2006-02-21)&quot; rev=&quot;review&quot;&gt;Blue-collar science&lt;/a&gt;. It is unglamorous but essential work, and nobody is going to win a Nobel Prize for creating a standard schema, ontology or whatever. What is the research contribution of creating a standard? Novelty? Discovery of new knowledge? This is partly why we have chaos: creating standards is, in itself, often not considered &amp;#x201C;science&amp;#x201D; or &amp;#x201C;research&amp;#x201D;. But without them, science is much harder.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Integrated search is hard&lt;/b&gt;: we would all like integrated search &amp;#x201C;from one box&amp;#x201D;, but the way to do this is still very much an open research question, not just in bioinformatics but in computer science too. What is more, this is not merely an &amp;#x201C;IT problem&amp;#x201D;: there are novel and &lt;a href=&quot;http://dx.doi.org/10.1038/nrd1608&quot; rev=&quot;review&quot; title=&quot;Nature Reviews Drug Discovery 4, 45-58 (2005)&quot;&gt;serious scientific challenges&lt;/a&gt; in achieving this. If it were easy and straightforward to provide integrated search across all these databases, don&#039;t you think somebody would have done it by now? Until that time, we have &lt;a href=&quot;http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi&quot;&gt;Entrez Global Query&lt;/a&gt;...&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Fri, 12 Jan 2007 10:48:39 -0500</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3277 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Quantity not the problem</title>
 <link>http://www.nodalpoint.org/2007/01/05/nar_database_issue_2007#comment-3267</link>
 <description>&lt;p&gt;Not drowning.  More data are good - the more, the better.  Only a few of those databases are relevant to an individual researcher.&lt;/p&gt;
&lt;p&gt;As others have already commented, the problems are (1) the quality of the databases, (2) their diverse, &quot;higgledy-piggledy&quot; nature (no standards, APIs, integration) and (3) their longevity, or lack thereof.  Frankly, anyone with a modicum of SQL and CGI knowledge can put a database or web app online.  So they do.  You can&#039;t legislate against bad web resources.&lt;/p&gt;
&lt;p&gt;I would question whether these annual issues still serve any useful purpose, other than to make the journal appear authoritative or provide an avenue for an easy publication.  If I&#039;m looking for an online resource I start with Google, not an outdated journal article.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Sun, 07 Jan 2007 08:10:00 -0500</pubDate>
 <dc:creator>Neil</dc:creator>
 <guid isPermaLink="false">comment 3267 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Need standards</title>
 <link>http://www.nodalpoint.org/2007/01/05/nar_database_issue_2007#comment-3265</link>
 <description>&lt;p&gt;Great article Duncan, thanks for bringing this on.&lt;br /&gt;
I have seen a lot of people accessing low-quality data from many well-known databases and high-quality data in not-so-well-known ones.&lt;br /&gt;
For example, GBK files mostly do not say anything about quality, while their ASN.1 counterparts might [ &lt;a href=&quot;http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry.pl?AF207953&quot; title=&quot;http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry.pl?AF207953&quot;&gt;http://getentry.ddbj.nig.ac.jp/cgi-bin/get_entry.pl?AF207953&lt;/a&gt; ].&lt;br /&gt;
Regarding algorithms to analyze these data, any comment would come across as trolling.&lt;br /&gt;
To minimize this, I feel something like a bioinformatics-oriented Digg would be great.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Sun, 07 Jan 2007 05:02:36 -0500</pubDate>
 <dc:creator>Animesh</dc:creator>
 <guid isPermaLink="false">comment 3265 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Vaporware</title>
 <link>http://www.nodalpoint.org/2007/01/05/nar_database_issue_2007#comment-3264</link>
 <description>&lt;p&gt;Each time a new annual issue of NAR is published I remember this paper from Nature.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.nature.com/nature/journal/v435/n7045/full/4351010a.html&quot;&gt;http://www.nature.com/nature/journal/v435/n7045/full/4351010a.html&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Databases in peril&lt;br /&gt;
Zeeya Merali and Jim Giles&lt;br /&gt;
Nature 435, 1010-1011 (23 June 2005)&lt;br /&gt;
doi: 10.1038/4351010a&lt;/p&gt;
&lt;p&gt;&lt;cite&gt;Nature contacted 89 databases listed in the Molecular Biology Database Collection (Nucl. Acids Res. 28, 1−7; 2000) to see how many still have funding five years on. Of these, 51 reported that they are struggling financially. Seven of these have closed; the rest are being updated sporadically in their owners&#039; spare time.&lt;/cite&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;http://www.nature.com/nature/journal/v435/n7045/images/4351010a-i3.0.jpg&quot;/&gt;&lt;/p&gt;
&lt;p&gt;Pierre&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Fri, 05 Jan 2007 16:23:41 -0500</pubDate>
 <dc:creator>lindenb</dc:creator>
 <guid isPermaLink="false">comment 3264 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Drowning!!!</title>
 <link>http://www.nodalpoint.org/2007/01/05/nar_database_issue_2007#comment-3263</link>
 <description>&lt;p&gt;I think that we&#039;re going to drown at this rate.  Not because there are too many databases.  Those can, and perhaps should, be spread far and wide.  My concerns:&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Quality&lt;/b&gt;.  How do we know whether the results in our hands are any good?  Can we glean meaningful knowledge from them?&lt;br /&gt;
&lt;b&gt;Integrated search&lt;/b&gt;.  I don&#039;t want to go to every database and search there.  I want to search from one box.&lt;br /&gt;
&lt;b&gt;Standards&lt;/b&gt;.  I want the data to follow certain minimum standards.&lt;/p&gt;
&lt;p&gt;What was that about airlines :-)?&lt;/p&gt;
&lt;p&gt;My Blog: &lt;a href=&quot;http://mndoci.com&quot; title=&quot;http://mndoci.com&quot;&gt;http://mndoci.com&lt;/a&gt;&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Fri, 05 Jan 2007 16:12:25 -0500</pubDate>
 <dc:creator>mndoci</dc:creator>
 <guid isPermaLink="false">comment 3263 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>agreed</title>
 <link>http://www.nodalpoint.org/2006/11/01/bioinformatics_impact_factors#comment-3199</link>
 <description>&lt;p&gt;&lt;i&gt;IMHO, screen-scraping this kind of data is a mug&#039;s game&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;I&#039;m with you there.  Scraping seems to be the basis of &lt;a href=&quot;http://www.zotero.org/&quot;&gt;Zotero&lt;/a&gt;, which everyone is talking about just now.  Broken scraping seems to be the reason for &lt;a href=&quot;//nsaunders.wordpress.com/2006/10/31/zotero-looks-great-does-it-work/&quot;&gt;its current inability to import&lt;/a&gt; from PubMed/HubMed.&lt;/p&gt;
&lt;p&gt;Having said that, I just wrote a scraper to pull hundreds of fasta files from an online genome database.  It was a one-off though, honest.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Mon, 06 Nov 2006 20:47:06 -0500</pubDate>
 <dc:creator>Neil</dc:creator>
 <guid isPermaLink="false">comment 3199 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Screen Scraping Hell</title>
 <link>http://www.nodalpoint.org/2006/11/01/bioinformatics_impact_factors#comment-3198</link>
 <description>&lt;p&gt;IMHO, &lt;a href=&quot;http://en.wikipedia.org/wiki/Screen_scraping&quot;&gt;screen-scraping&lt;/a&gt; this kind of data is a mug&#039;s game, or as &lt;a href=&quot;http://www.bioperl.org/wiki/Lincoln_Stein&quot;&gt;Lincoln Stein&lt;/a&gt; once put it: &lt;a href=&quot;http://dx.doi.org/10.1038/417119a&quot; title=&quot;Nature. 417 (6885), 119-20 (09 May 2002)&quot;&gt;mediaeval torture&lt;/a&gt;, best left as a task for the sado-masochists who actually &lt;i&gt;like&lt;/i&gt; having to repeatedly rewrite their code!&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Mon, 06 Nov 2006 11:57:45 -0500</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3198 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Rise of BMC</title>
 <link>http://www.nodalpoint.org/2006/11/01/bioinformatics_impact_factors#comment-3196</link>
 <description>&lt;p&gt;&lt;i&gt;...young upstart BioMed Central Bioinformatics...&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;I like all of the BMC Journals and on the whole, rate &lt;i&gt;BMC Bioinformatics&lt;/i&gt; over &lt;i&gt;Bioinformatics&lt;/i&gt;.  The latter journal has become almost solely a methods/algorithms journal, whereas the former has a lot more interest in terms of application to real biological problems.&lt;/p&gt;
&lt;p&gt;And yet - look at the problems that the BMC journals had initially in getting established - all because no one could figure out how impact factors applied to free, online articles!  The nonsense that is impact factors, once again.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Sat, 04 Nov 2006 04:09:29 -0500</pubDate>
 <dc:creator>Neil</dc:creator>
 <guid isPermaLink="false">comment 3196 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Digestability</title>
 <link>http://www.nodalpoint.org/2006/11/01/bioinformatics_impact_factors#comment-3195</link>
 <description>&lt;p&gt;Probably because a single, simple metric has intuitive appeal. Eugene Garfield, the guy behind IFs, has commented on their use and abuse in &lt;a href=&quot;http://jama.ama-assn.org/cgi/content/full/295/1/90&quot;&gt;JAMA&lt;/a&gt; and on the &lt;a href=&quot;http://scientific.thomson.com/news/newsletter/2005-11/8298245/&quot;&gt;Thomson website&lt;/a&gt; (pdf). There&#039;s quite a &lt;a href=&quot;http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?itool=pubmed_DocSum&amp;amp;db=pubmed&amp;amp;cmd=Display&amp;amp;dopt=pubmed_pubmed&amp;amp;from_uid=16272064&quot;&gt;large literature&lt;/a&gt; critical of the whole concept, too.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Fri, 03 Nov 2006 15:02:14 -0500</pubDate>
 <dc:creator>chris</dc:creator>
 <guid isPermaLink="false">comment 3195 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>What a mess</title>
 <link>http://www.nodalpoint.org/2006/11/01/bioinformatics_impact_factors#comment-3194</link>
 <description>&lt;p&gt;Yeah, impact factors are always horribly out of date, are difficult to get hold of, often misleading and in the hands of a for-profit organisation that is responsible primarily to its shareholders. How did research assessment become so overly dependent on such figures?&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Thu, 02 Nov 2006 05:58:49 -0500</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3194 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Good work</title>
 <link>http://www.nodalpoint.org/2006/11/01/bioinformatics_impact_factors#comment-3193</link>
 <description>&lt;p&gt;Everything about impact factors is ridiculous - the difficulty in obtaining what should be freely available, up-to-date data is perhaps the most ridiculous aspect.  Thanks for the list.&lt;/p&gt;
&lt;p&gt;Given that many journal webpages now display their impact factor, I wonder how many could be obtained using scrapers?&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Wed, 01 Nov 2006 21:58:16 -0500</pubDate>
 <dc:creator>Neil</dc:creator>
 <guid isPermaLink="false">comment 3193 at http://www.nodalpoint.org</guid>
</item>
</channel>
</rss>
