<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.nodalpoint.org" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>nodalpoint.org - best practices - Comments</title>
 <link>http://www.nodalpoint.org/nodalpoint_tags/best_practices</link>
 <description>Comments for &quot;best practices&quot;</description>
 <language>en</language>
<item>
 <title>Taverna</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3459</link>
 <description>&lt;p&gt;I don&#039;t get it. The diagram is too complex? Of course it is complex. But everything is pointing towards a workflow (SOA) world. Why does bioinformatics resist this shift? Oracle, SAP, IBM and so on are all rewriting their applications so that they can use BPEL as a graphical build environment for integration processes. So stay stuck in the 90s and keep using scripts, but within a few years we will all use workflows in a SOA world. I really believe this will happen; IBM believes in it, and so does the whole IT world. Let&#039;s see who&#039;s right.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Wed, 11 Apr 2007 16:35:06 -0400</pubDate>
 <dc:creator>mart1nus</dc:creator>
 <guid isPermaLink="false">comment 3459 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>A hack (not too kludgy)</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3439</link>
 <description>&lt;p&gt;Designate one of the multiple outputs as a representative file (say foo.psl), touch the other files at the end of the commands, and have the rule: &lt;code&gt;foo.log foo.whatever ...: foo.psl&lt;/code&gt;&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Wed, 28 Mar 2007 09:02:33 -0400</pubDate>
 <dc:creator>pjw</dc:creator>
 <guid isPermaLink="false">comment 3439 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Doesn&#039;t change a thing</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3436</link>
 <description>&lt;p&gt;Unfortunately, what you wrote is equivalent to what I wrote. It&#039;s shorthand notation for two separate rules; I should have explained that right away. It is going to execute the command twice. What we need is a way to express that a command has multiple outputs, and rules with multiple targets don&#039;t accomplish that.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Fri, 23 Mar 2007 00:22:26 -0400</pubDate>
 <dc:creator>Antonio Piccolboni</dc:creator>
 <guid isPermaLink="false">comment 3436 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>makeovers</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3435</link>
 <description>&lt;p&gt;We certainly would want the replacement to look almost (if not exactly) like make. It should definitely be as simple as make -- that&#039;s probably the primary design goal: that you can cut and paste from the command line to a pipeline description file, with minimal extra typing. I&#039;m fully aware of the dangers of excessive re-engineering...&lt;/p&gt;
&lt;p&gt;Thanks for the other links... we&#039;ll certainly look into them, and post discussion summaries on biowiki.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://biowiki.org/IanHolmes&quot; title=&quot;http://biowiki.org/IanHolmes&quot;&gt;http://biowiki.org/IanHolmes&lt;/a&gt;&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Wed, 21 Mar 2007 12:26:55 -0400</pubDate>
 <dc:creator>Ian Holmes</dc:creator>
 <guid isPermaLink="false">comment 3435 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Why not scripting?</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3434</link>
 <description>&lt;p&gt;Scripting is great, a powerful tool that lets you achieve world peace in three lines of Perl/Python/Ruby. But what if people don&#039;t want to hack scripts? According to Grady Booch, the history of software engineering is one of increasing levels of abstraction, which is where Taverna and workflows are trying to go. Admittedly, we&#039;re not quite there yet. Sometimes the &lt;a href=&quot;http://www.joelonsoftware.com/articles/LeakyAbstractions.html&quot; title=&quot;The law of leaky abstractions&quot;&gt;abstractions leak&lt;/a&gt;, and as Stew says, you end up &lt;a href=&quot;http://www.ghastlyfop.com/blog/2005/12/workflows-grid-services.html&quot;&gt;hacking BeanShell&lt;/a&gt;. That&#039;s not really a problem with Taverna or workflows; it&#039;s an inherent problem in bioinformatics data, a flat-file legacy nightmare, which means we&#039;ll be forced to hack scripting languages for a long time to come, whether we like it or not.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Wed, 21 Mar 2007 04:20:40 -0400</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3434 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>Let&#039;s agree to disagree</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3433</link>
 <description>&lt;p&gt;Many Taverna users I&#039;ve been in contact with find the visualisations very useful. Clearly you&#039;re not one of these people, so let&#039;s agree to disagree.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Wed, 21 Mar 2007 04:17:27 -0400</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3433 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>make alternatives</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3432</link>
 <description>&lt;p&gt;There&#039;s also a &lt;a href=&quot;http://freshmeat.net/articles/view/1715/&quot;&gt;page on freshmeat&lt;/a&gt; with a collection of make alternatives. For Python fans: Scons.org&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 19:34:27 -0400</pubDate>
 <dc:creator>maximilianh</dc:creator>
 <guid isPermaLink="false">comment 3432 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>make alternatives: makepp?</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3431</link>
 <description>&lt;p&gt;Great, these wiki pages. My web-searching reflexes still aren&#039;t good enough: I didn&#039;t search for &quot;bioinformatics pipelines make&quot; before submitting the post.&lt;/p&gt;
&lt;p&gt;Have any of you Perl coders tried &lt;a href=&quot;http://makepp.sourceforge.net/&quot;&gt;makepp&lt;/a&gt;? Though it doesn&#039;t address many of the issues raised, it might 1) be a step in the right direction while maintaining compatibility (a cherished concept for us) and 2) serve as a base one day, as I guess you would rather modify Perl code than C code.&lt;br /&gt;
I just hope you won&#039;t head off in the LISP/Scheme/Prolog direction. Although it might be tempting to write a completely new make system, something that keeps at least some superficial compatibility can lure many more people into trying it out than a new system that we would have to learn from scratch.&lt;/p&gt;
&lt;p&gt;Would you mind posting some results from your discussion to the front page of biowiki or here, for those of us who don&#039;t live in the Bay Area?&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 19:17:47 -0400</pubDate>
 <dc:creator>maximilianh</dc:creator>
 <guid isPermaLink="false">comment 3431 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>model multiple outputs in make</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3420</link>
 <description>&lt;p&gt;Hm... I might have missed something, but why don&#039;t you write:&lt;br /&gt;
f.psl f.log : f.fasta&lt;br /&gt;
         blat f.fasta &amp;gt; f.psl 2&amp;gt; f.log&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 18:38:32 -0400</pubDate>
 <dc:creator>maximilianh</dc:creator>
 <guid isPermaLink="false">comment 3420 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>made guys</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3430</link>
 <description>&lt;p&gt;Thanks for that Chris Lee paper... I&#039;ve been using &#039;make&#039; for pipelines since 1996 (I actually built my entire thesis using makefiles, from analysis to LaTeX to PostScript). However, it has serious limitations, as documented by Andrew Uzilov on our &lt;a href=&quot;http://biowiki.org/MakefileManifesto&quot;&gt;wiki&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As Jason pointed out, we are offering a Summer of Code proposal (applications this week!) to boost &#039;make&#039; past these limitations. This largely grew out of discussions with Chris Mungall on his biomake project (which also has a &lt;a href=&quot;http://biowiki.org/BioMake&quot;&gt;biowiki page&lt;/a&gt; detailing some of the high-level design goals).&lt;/p&gt;
&lt;p&gt;I think Chris has done as good a job as anyone of defining the high-level goals for such a tool: declarative structure, shell script hooks, flexible dependency tracking (MD5 etc rather than just timestamps), facility to build database tables rather than just files, advanced pattern-matching (not just one wildcard per rule, as in &#039;make&#039;), parallel execution on a cluster and (ideally) a Turing-complete functional programming syntax so that you can start to do low-intensity computation within the pipeline language itself.&lt;/p&gt;
&lt;p&gt;Sadly, with Chris now doing more ontology work than genome annotation, biomake has stalled somewhat. There are some practical alternatives, e.g. Perl modules such as Shengqiang Shu&#039;s SAPS modules (used by the Berkeley Drosophila Genome Project for their pipeline); and then there are some pie-in-the-sky (but theoretically appealing) functional language-based alternatives, like &lt;a href=&quot;http://biowiki.org/ErlangLanguage&quot;&gt;Erlang&lt;/a&gt; (based on Prolog) or &lt;a href=&quot;http://biowiki.org/TermiteScheme&quot;&gt;Termite&lt;/a&gt; (a dialect of Scheme/Lisp).&lt;/p&gt;
&lt;p&gt;In the meantime, there are several versions of make that can use GNU make&#039;s remote stubs feature to do parallel execution on a cluster. These include &lt;a href=&quot;http://biowiki.org/DistmakeProgram&quot;&gt;distmake&lt;/a&gt;, &lt;a href=&quot;http://biowiki.org/QmakeProgram&quot;&gt;qmake&lt;/a&gt; and &lt;a href=&quot;http://biowiki.org/OmakeProgram&quot;&gt;omake&lt;/a&gt; (which also has MD5-based dependency tracking). And of course there are things like Apache Ant, but then you&#039;re moving too far away from the command line for my liking, personally ;-)&lt;/p&gt;
&lt;p&gt;As you might gather from that last sentence I&#039;m &lt;a href=&quot;http://biowiki.org/BioinformaticsWorkflows&quot;&gt;not exactly a proponent&lt;/a&gt; of the Taverna-style approach. I like what they&#039;re doing but I completely agree with the previous commenter that graphical editors tend to become an end in themselves. I think that there is quite enough to do with developing a workable domain-specific declarative programming language for pipelines without trying to build Yahoo Pipes at the same time. But that&#039;s just me, I&#039;m a born-in-the-20th-century fogey; what can I say.&lt;/p&gt;
&lt;p&gt;If anyone reading this is in the Bay Area at noon on Wednesday April 4th 2007, btw, we&#039;re having a lab meeting to discuss exactly this issue (make and successors). Come to my lab, 425 Hearst Mining Building on the Berkeley campus, and meet some other &quot;made guys&quot;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://biowiki.org/IanHolmes&quot; title=&quot;http://biowiki.org/IanHolmes&quot;&gt;http://biowiki.org/IanHolmes&lt;/a&gt;&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 12:40:18 -0400</pubDate>
 <dc:creator>Ian Holmes</dc:creator>
 <guid isPermaLink="false">comment 3430 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>diagrams</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3429</link>
 <description>&lt;p&gt;The workflow diagram that you provide is, for me, not a real diagram. It&#039;s too complicated. I don&#039;t see a lot of value in it.&lt;/p&gt;
&lt;p&gt;If a diagram is getting complicated, it&#039;s taking up too much space in your paper, isn&#039;t it? I&#039;d rather write the _details_ as text.&lt;/p&gt;
&lt;p&gt;If a diagram is simple, I can also draw it myself and optimize the layout in a more human-friendly way than graphviz can.&lt;/p&gt;
&lt;p&gt;We know the problem from software engineering, right? I don&#039;t believe in the added value of UML-generating tools. You end up with thousands of diagrams that no one can read anymore. If diagrams are created automatically, they lose their value as a means to simplify and over-simplify the system.&lt;/p&gt;
&lt;p&gt;Biologists use textual protocols for a reason: they are much more compact, and as everything is linear for them, there&#039;s no real need for the expressive power of flowcharts, such as complex branches or repetitions. I don&#039;t have these in my pipelines either, as they are rather linear in structure.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 11:25:06 -0400</pubDate>
 <dc:creator>maximilianh</dc:creator>
 <guid isPermaLink="false">comment 3429 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>diagrams and pipelines.</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3427</link>
 <description>&lt;p&gt;Too bad you don&#039;t use diagrams to describe your work.&lt;br /&gt;
A diagram is very useful for showing how your pipeline works to other people, and to yourself. It helps you write the right unit tests for your programs and see which improvements or changes you can make to them.&lt;br /&gt;
Also, it&#039;s easier to compare two experiments when they are described with diagrams produced with the same syntax.&lt;br /&gt;
I wonder why wet-lab biologists don&#039;t use diagrams to describe their experiments when writing papers, too (bioinformaticians would be very happy then).&lt;br /&gt;
If you draw the diagram of the pipeline you&#039;ve described in your post, I will understand it easily.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 10:31:41 -0400</pubDate>
 <dc:creator>dalloliogm</dc:creator>
 <guid isPermaLink="false">comment 3427 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>I agree with your sentiment</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3426</link>
 <description>&lt;p&gt;I agree with your sentiment regarding generalized &#039;ubertools&#039;. So much effort is expended creating a generalized workflow editor that the editor becomes an end in itself.&lt;/p&gt;
&lt;p&gt;Another thing that gets me about graphical workflow editors is that for anything other than the most common tasks, you need to revert to scripting to plug outputs into inputs. So why not scripting in the first place? Maybe I&#039;m not the right audience for Taverna...&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 10:28:31 -0400</pubDate>
 <dc:creator>Greg</dc:creator>
 <guid isPermaLink="false">comment 3426 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>powerpoint vs graphviz diagrams</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3425</link>
 <description>&lt;p&gt;You could draw them in PowerPoint, but if they start getting complicated, it is time consuming. Having a standard way of drawing pipelines/workflows is a handy tool for quickly communicating what your analysis does graphically. Taverna uses &lt;a href=&quot;http://en.wikipedia.org/wiki/Graphviz&quot;&gt;GraphViz&lt;/a&gt; to do this, see for example &lt;a href=&quot;http://www.flickr.com/photos/dullhunk/428079229/&quot;&gt;this workflow diagram&lt;/a&gt;. I dunno about you, but I wouldn&#039;t fancy drawing that figure manually in PowerPoint, when GraphViz can do most of the hard work for me.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 09:50:50 -0400</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3425 at http://www.nodalpoint.org</guid>
</item>
<item>
 <title>taverna papers</title>
 <link>http://www.nodalpoint.org/2007/03/18/a_pipeline_is_a_makefile#comment-3424</link>
 <description>&lt;p&gt;Not sure what you mean by &quot;applying taverna for papers&quot;, but here is &lt;a href=&quot;http://dx.doi.org/10.1093/bioinformatics/bth944&quot; rev=&quot;review&quot;&gt;one example&lt;/a&gt; of what I think you might be interested in: the successful application of Taverna workflows to a difficult problem.&lt;/p&gt;
&lt;br class=&quot;clear&quot; /&gt;</description>
 <pubDate>Tue, 20 Mar 2007 09:43:23 -0400</pubDate>
 <dc:creator>Duncan</dc:creator>
 <guid isPermaLink="false">comment 3424 at http://www.nodalpoint.org</guid>
</item>
</channel>
</rss>
