What You’re Doing Is Rather Desperate

Notes from the life of a bioinformatics researcher

OpenID: don’t provide if you won’t accept

with one comment

Once again, an interesting FriendFeed discussion has morphed into a thread on a wider issue: OpenID.

OpenID is one of those brilliantly simple ideas that you’d imagine most people would applaud: a single “digital identity”, used for any website that requires a login, rather than multiple accounts, usernames and passwords for each site. Here’s the problem: many services allow you to use your account details as an OpenID at other sites, but they won’t accept credentials other than their own. For example, both my WordPress and GMail accounts are OpenIDs, but I can’t log in to Google using nsaunders.wordpress.com.

You might ask - why? Is it some sort of “brand loyalty” issue? When I sign up to Service X, is there a contract between us such that Service X provides me with tools so long as I declare myself to be “Neil @ Service X”, as opposed to “Neil @ Service Y”? I’m still logged into Service X, I must be “registered” in some sense as a user and I’m more likely to return if Service X makes my life easier. Where’s the problem?

There must be strategists who make these recommendations to companies. I’d really like to hear their reasoning.

Written by nsaunders

November 27, 2008 at 5:23 pm

Spreading the message, a few minds at a time

with 2 comments

Given my passion for online science networking, it’s surprising that I’ve never given a talk on the subject [1]. So a big thank you to William who invited me over to his institute for an informal chat about the topic with a small group of staff.

I learned that:

  • A good quote from an internet guru goes down well
  • Everyone loves an xkcd cartoon
  • Many biologists still don’t know what an RSS feed is

My slides are embedded below, or visit Slideshare - best viewed full screen.

1. Oh wait, I work in a university
See the slides

Written by nsaunders

November 27, 2008 at 2:50 pm

Poor reproducibility: understandable, if not desirable

Greg Wilson once told me a statistic concerning the mean lifetime of research software reproducibility. That is, the time that elapses on average after which you cannot reproduce your own results using your own code, never mind anyone else’s. I forget the exact number but it was not high - a few months at best.

Why does this happen, aside from obvious bad practices? Well, here’s a typical exchange in an academic research setting:
Read the rest…

Written by nsaunders

November 19, 2008 at 4:02 pm

Silly script for the day

So, you’d like to submit a URL to Open Laboratory 2008, you want to know if it’s already in Bora’s list and your machine runs Ruby? I thought so!

Save the following code as “bora.rb”, make it executable and run:

./bora.rb http://your.url.goes.here

You really want to read the rest?

Written by nsaunders

November 13, 2008 at 5:15 pm

Good software, data and your brain

with 2 comments

I recently asked the FriendFeed community about wiki usage and was struck by a comment from Allyson:

I think we’re on our third incarnation of various bits of wiki software, and we’ve finally hit on the right software for both our wet lab and bioinformaticians

By “the right software”, she means software that makes sense to the people who use it. When faced with several software alternatives, we often find that one of them, for some reason, “makes sense” - it meshes naturally with the way our brains work. When you find a program that you like, it’s not only a joy to use but can enable understanding of data and processes that previously eluded you. In other words, good software doesn’t shield you from the fundamentals - it illuminates them.

Here are three examples of software that made me say: “Oh right! Now I get it.” These are not recommendations and the opinions expressed are highly subjective: the point is, I like them because they work for me.
Read the rest…

Written by nsaunders

November 13, 2008 at 1:23 pm

Posted in computing, programming

DokuWiki, PubMed and Ruby

with 3 comments

I recently built a wiki for a research group using DokuWiki, one of my favourite wiki packages. As with many other wikis, developers have extended its functionality by writing plugins. Some of these are excellent, allowing users to generate lots of content with a minimum of syntax. For example, using the PubMed plugin, you type this:

{{pubmed>long:15595725}}

and the result is this:
pubmed

Which got me thinking. Assuming that you’ve searched PubMed and retrieved a bunch of references in XML format, how might you generate text in DokuWiki syntax, to paste into your wiki? Here’s the small parser that I wrote in ruby:


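#!/usr/bin/ruby
require 'rubygems'
require 'hpricot'

# h maps a DokuWiki year headline to an array of PubMed plugin tags
h = {}
d = Hpricot.XML(open('pubmed_result.xml'))

# one tag per article, filed under the year the record was created
(d/:PubmedArticle).each do |a|
  (h["=== #{a.at('DateCreated/Year').inner_html} ==="] ||= []) << "{{pubmed>long:#{a.at('PMID').inner_html}}}"
end

# reverse sort puts the newest year first; puts flattens the nested arrays
puts h.sort {|a,b| b<=>a}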

Nine lines of code - how cool is that? It uses Hpricot to parse the XML and creates a hash of arrays. Each hash key is the year, formatted as a level 4 headline in DokuWiki; each hash value is an array of PMIDs, formatted with PubMed plugin syntax. At the end we just print it all out, sorted by year from newest to oldest.
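
If you don’t have Hpricot installed, the same hash-of-arrays idea can be sketched with REXML, which ships with Ruby. This is only a rough equivalent, not the script above: the method name dokuwiki_lines is made up for illustration, and it assumes every PubmedArticle has PMID and DateCreated/Year elements, skipping any article that lacks them.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of the original script): takes PubMed XML
# as a string and returns DokuWiki lines, newest year first.
def dokuwiki_lines(xml)
  doc = REXML::Document.new(xml)

  # year => array of PubMed plugin tags
  by_year = Hash.new { |hash, year| hash[year] = [] }

  REXML::XPath.each(doc, '//PubmedArticle') do |article|
    pmid = REXML::XPath.first(article, './/PMID')
    year = REXML::XPath.first(article, './/DateCreated/Year')
    next unless pmid && year # assumption: skip incomplete records
    by_year[year.text] << "{{pubmed>long:#{pmid.text}}}"
  end

  # four-digit years sort correctly as strings; reverse for newest first
  by_year.sort.reverse.flat_map do |year, refs|
    ["=== #{year} ===", *refs]
  end
end
```

To get the same printed output as before, something like puts dokuwiki_lines(File.read('pubmed_result.xml')) should do it. Returning an array rather than printing directly just makes the grouping easier to check.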

As Pierre would say - that’s it.

Written by nsaunders

November 6, 2008 at 6:54 pm

Posted in computing

When information retrieval goes…weird

with 4 comments

Bar-tailed godwit

This is a little odd - the tale of the publication that isn’t.

Update: the “missing article” surfaced in my RSS reader on Nov 1; here’s the link

Read the rest…

Written by nsaunders

October 30, 2008 at 2:08 pm

Reasons to love the Web #999

Every day, I’m amazed by the information ecosystem that we call the WWW and how it has changed forever the way we educate ourselves.

Today’s illustration. I spent part of last weekend strolling through the beautiful rainforest of Brisbane Forest Park, a mere hour’s drive from the city. On the track at Maiala I heard a very bizarre noise, high in the misty canopy. The sound was a blend of fighting cats and crying children, yet strangely musical. It was a new sound to me but the cat-like aspect was a give-away, since I was aware of a species called the green catbird.

Back at home, I consulted the trusty Simpson and Day’s Birds of Australia. It described a sound similar to what I had heard but of course, bird sounds don’t translate to written English very well. So I headed off to the appropriate Wikipedia entry. It’s not one of the more comprehensive pages but its external links include:

Green Catbird audio recording at Freesound

I played the sound - it was exactly what I had heard. What’s more, the page is tagged, geotagged and part of a wonderful resource called the Freesound project - a collaborative database of Creative Commons licensed sounds.

So in the space of a few hours I lifted my spirits in the great outdoors, heard something new, tracked it down on the Web and discovered a bunch of new, interesting related information. That’s the Web at its best; integrating seamlessly with your daily life to enhance what you see around you. When it works, it’s an almost Zen-like experience.

Written by nsaunders

October 21, 2008 at 3:11 pm

Posted in australia, web resources

The three phases of MySQL usage

with 4 comments

As Mike keeps reminding me, getting your data into database tables is A Good Thing. Like many people, my database of choice is MySQL - largely because it was the first one that I tried and it works for me.

However, I’m far from being an expert MySQL user. In fact, I’ve identified 3 stages in my use of MySQL over the years; see if you recognise yourself in any of them.
Read the rest…

Written by nsaunders

October 17, 2008 at 12:06 pm

Giant panda genome: mapped or sequenced?

with 5 comments

I’m with Ogden Nash who said:

I love the baby giant panda,
I’d welcome one to my veranda

This week, I learned via Keith that Chinese scientists announced the completion of the giant panda genome. An impressive achievement, given that the project was announced in March this year, but what exactly has been completed? Has the genome been sequenced - that is, are there strings of A, C, G and T covering most chromosomes - or mapped - that is, has the approximate chromosomal location of most genes been determined? The media seem unsure.

And so on. Here’s a Google News search with more hits.

So what has been achieved - sequencing or mapping? If the former, is it really complete (I doubt this) or draft - and if draft, what kind of quality? And where are the data? Nothing in the genome project section of NCBI as yet.

Written by nsaunders

October 17, 2008 at 11:16 am