<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.nodalpoint.org" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>nodalpoint.org - MEDIE: MEDLINE++ - Comments</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline</link>
 <description>Comments for &quot;MEDIE: MEDLINE++&quot;</description>
 <language>en</language>
<item>
 <title>From MEDIE team</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline#comment-3197</link>
 <description>&lt;p&gt;I would like to make a few comments, since I was the speaker on MEDIE at MIB.&lt;/p&gt;
&lt;p&gt;1) Perhaps our MEDIE website and my presentation gave the wrong impression. MEDIE is intended to show the general functionality that parsing (NLP) technology can provide for intelligent text mining, information retrieval, etc. As it stands, it is not intended to be a system for performing a specific task such as extracting protein-protein interactions from text. &lt;/p&gt;
&lt;p&gt;2) We are fully aware that another task-specific layer of software (or a set of rules) is needed. You are right in saying that we need a huge set of rules in this layer. In short, to reduce the number of rules or to introduce different technologies in this layer such as statistical models (instead of “rules&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Sun, 05 Nov 2006 04:50:17 -0500</pubDate>
 <dc:creator>Tsujii</dc:creator>
 <guid isPermaLink="false">comment 3197 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>who or what is going to</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline#comment-3190</link>
 <description>&lt;p&gt;&lt;i&gt;who or what is going to accurately annotate 14 million medline abstracts with ontological terms?&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;Nobody.  I&#039;m suggesting that abstracts in their present form are not a worthwhile data source for mining biological data.  Sure, text mining can &quot;try to make sense&quot;, but when you then spend all your time figuring out if it made a good job of it or not, what have you really gained?  Better to build a working system from scratch, rather than make do with a kludge that doesn&#039;t really save you time or effort.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 31 Oct 2006 00:24:22 -0500</pubDate>
 <dc:creator>Neil</dc:creator>
 <guid isPermaLink="false">comment 3190 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Who&#039;s annotating millions of abstracts?</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline#comment-3189</link>
 <description>&lt;p&gt;The obvious smart-aleck answer is the National Library of Medicine.  They&#039;re putting MeSH (medical subject heading) terms on every MEDLINE citation, of which there are now more than 15M.  To quote from NLM&#039;s paper &lt;a href=&quot;http://www.nlm.nih.gov/mesh/mtms_medinfo_2004.html&quot;&gt;The MeSH Translation Maintenance System:&lt;br /&gt;
Structure, Interface Design, and Implementation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;
Each article is indexed with Medical Subject Headings by an individual who, after reading the article in its original language, assigns the descriptors to indicate what the article is about.
&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;NLM must have an army of folks keeping up with the roughly 2000 articles/day added to MEDLINE.  And I have no idea how accurately this is done in terms of inter-annotator agreement.&lt;/p&gt;
&lt;p&gt;An obvious alternative would be to allow user tags in the Web 2.0 sense.  Users reading articles could add tags.  Other users could then search by tag.  It helps with recall and precision for Flickr and YouTube.  It could work for research articles.  As soon as this became even moderately popular, people would begin tagging their own entries so that people can find them.  Each paper has at least one individual (the author) with a vested interest in making the paper easy to find.  &lt;/p&gt;
&lt;p&gt;(Disclaimer:  We have an NLM SBIR grant focused on biomedical text processing.  We&#039;re working with real biomedical researchers because it&#039;s a real problem when you have 500 candidate genes listed by Entrez ID and you want to investigate the literature surrounding them.)&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.colloquial.com/carp&quot;&gt;Bob Carpenter&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.alias-i.com/lingpipe&quot;&gt;Alias-i, Inc.&lt;/a&gt;&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Mon, 30 Oct 2006 16:39:08 -0500</pubDate>
 <dc:creator>Bob Carpenter</dc:creator>
 <guid isPermaLink="false">comment 3189 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>text mining vs. ontologies</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline#comment-3188</link>
 <description>&lt;p&gt;I see what you mean, there are obvious limitations to text mining. But then there are obvious limitations to controlled vocabularies and ontologies as well. Whereas text mining can at least try to make sense of millions of abstracts, who or what is going to accurately annotate 14 million medline abstracts with ontological terms? I think &lt;a href=&quot;http://www.gopubmed.org/&quot;&gt;GoPubMed&lt;/a&gt; is a nice demonstration that combines the text mining and ontological approaches, although it&#039;s far from perfect.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Mon, 30 Oct 2006 11:34:02 -0500</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3188 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>not overly impressed</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline#comment-3185</link>
 <description>&lt;p&gt;I didn&#039;t think much of Medie based on the sample queries, or the few queries that I tried.  But then, I have a low opinion of text mining in general.  I get the theory - sentences have structure (&lt;i&gt;e.g.&lt;/i&gt; &quot;A is something-ed by B&quot;, &quot;X somethings Y&quot;) and there are a finite number of variations in this structure, so we should be able to derive rules from it.  In my experience though, that&#039;s not what happens.  I don&#039;t know whether it&#039;s because we need more rules, or more complex rules, but my suspicion is that in a medium without constraints where people can write in any style that they choose (like an abstract), they will find a way of expressing themselves that confuses the rule machine.  Let&#039;s face it, the standard of written English in many abstracts is not high.  There&#039;s a lot of subtlety in expression too:  &quot;leading to enhanced phosphorylation&quot; is not the same as &quot;phosphorylates&quot;.&lt;br /&gt;
I think the only sure way to describe biological interactions in a way that allows accurate mining is controlled ontologies.  You could spend years refining text mining rules - or you could set up a standardised descriptive system at the outset and use that.  It&#039;s the same old story - you wonder if some people really want to address the problems of biologists or whether they&#039;re just cashing in because text mining is fundable at the moment.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Sun, 29 Oct 2006 22:52:14 -0500</pubDate>
 <dc:creator>Neil</dc:creator>
 <guid isPermaLink="false">comment 3185 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>MEDIE: MEDLINE++</title>
 <link>http://www.nodalpoint.org/2006/10/27/medie_medline</link>
 <description>&lt;p&gt;&lt;span id=&quot;picture-right&quot; style=&quot;border:none;float: right; margin-left:0.5em; font-size:10px; color:#666666;font-weight:normal;&quot;&gt;&lt;a href=&quot;http://www.flickr.com/photos/dullhunk/768929857/&quot; title=&quot;MEDIE&quot;&gt;&lt;img src=&quot;http://farm2.static.flickr.com/1022/768929857_ef0aa6216b_o.jpg&quot; width=&quot;156&quot; height=&quot;74&quot; alt=&quot;Medie&quot; /&gt;&lt;/a&gt;&lt;/span&gt;MEDIE is an &amp;#x201C;intelligent&amp;#x201D; semantic search engine that retrieves biomedical correlations from over 14 million articles in MEDLINE.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <comments>http://www.nodalpoint.org/2006/10/27/medie_medline#comments</comments>
 <category domain="http://www.nodalpoint.org/master_list/bioinformatics">Bioinformatics</category>
 <category domain="http://www.nodalpoint.org/nodalpoint_tags/medie">medie</category>
 <category domain="http://www.nodalpoint.org/nodalpoint_tags/medline">medline</category>
 <category domain="http://www.nodalpoint.org/nodalpoint_tags/mib">MIB</category>
 <category domain="http://www.nodalpoint.org/nodalpoint_tags/nactem">NaCTeM</category>
 <category domain="http://www.nodalpoint.org/nodalpoint_tags/text_mining">text mining</category>
 <pubDate>Fri, 27 Oct 2006 05:20:34 -0400</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">2101 at http://www.nodalpoint.org</guid>
</item>
</channel>
</rss>
