Literature

Publish or Perish software - now for Linux

Publish or Perish is an interesting (and free) piece of software that retrieves citations from Google Scholar and then analyses them in various ways. In particular it makes use of h-indices, which have been proposed as a "fairer" citation metric.
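For readers who haven't met the h-index: an author has index h if h of their papers have at least h citations each. Here is a minimal Python sketch of that definition. It is only an illustration, not how Publish or Perish computes it, and the citation counts are made up.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for five papers (not real data).
example_counts = [10, 8, 5, 4, 3]
example_h = h_index(example_counts)
```

With these made-up counts, four papers have at least four citations each, but there are not five papers with at least five, so the index is 4.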

I've been in correspondence with the developers over the past couple of months and they kindly let me know that a native Linux version, built using GTK+ 2.x, is now available. If citation analysis is your thing, give it a try and let the authors know what you think.


How to compile a database of citations?

The discussion on impact factors got me wondering - is there a public, free-access citation database for articles in Medline / PubMed? I know of Scopus and ISI WoS (but they're not free, and their content is proprietary) and Google Scholar (which only gives 'cited by', when I want 'this article cites x and y').

How would one build such a database, if it's not accessible? I know that ISI actually scans articles (not doable by myself) - I don't know how Scopus got their index, though.

Such a database would help tremendously on some bibliomics work I'm doing. Is it technically feasible to get references for all Medline articles (at least, those after 1996)? Where would you get the information - scrape/spider and index publishers' websites, if this information is even freely accessible without a subscription, and then match against a local Medline database (which I already have)? If anyone can help, it'd be appreciated :)
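To make the "match against a local Medline database" step concrete, here is a rough Python sketch. It assumes the references have already been extracted somehow and happen to be in PubMed-style citation strings (e.g. "Cell. 1997 Jan 24;88(2):243-51."), which real publisher pages will often not be. The function names, the in-memory index and the record identifiers are all hypothetical.

```python
import re

# Assumed input format: PubMed-style citation strings such as
# "Cell. 1997 Jan 24;88(2):243-51." Other reference styles will not match.
CITATION_RE = re.compile(
    r"^(?P<journal>.+?)\.\s+(?P<year>\d{4})[^;]*;"
    r"(?P<volume>\d+)(?:\([^)]*\))?:(?P<first_page>\w+)"
)


def citation_key(citation):
    """Reduce a citation string to (journal, year, volume, first page), or None."""
    m = CITATION_RE.match(citation.strip())
    if m is None:
        return None
    return (
        m.group("journal").lower(),
        m.group("year"),
        m.group("volume"),
        m.group("first_page"),
    )


def build_index(local_records):
    """Map citation keys to record ids for (record_id, citation) pairs."""
    index = {}
    for record_id, citation in local_records:
        key = citation_key(citation)
        if key is not None:
            index[key] = record_id
    return index


def match_references(citing_id, reference_strings, index):
    """Return (citing_id, cited_id) pairs for references found in the index."""
    edges = []
    for ref in reference_strings:
        key = citation_key(ref)
        if key is not None and key in index:
            edges.append((citing_id, index[key]))
    return edges
```

Matching on journal, year, volume and first page is just one plausible key; journal abbreviations vary between sources, so a real pipeline would need to normalise those too.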


(Velculescu VE, 1997) Characterization of the Yeast Transcriptome @Cell #20060730

HI HexiRPA000010
DN (Velculescu VE, 1997) Characterization of the Yeast Transcriptome @Cell #20060730
DA 2006.07.30
CP Cell. 1997 Jan 24;88(2):243-51.
TI Characterization of the yeast transcriptome.
AU Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA, Bassett DE Jr, Hieter P, Vogelstein B, Kinzler KW.
IN Program in Human Genetics and Molecular Biology, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA.
AB We have analyzed the set of genes expressed from the yeast genome, herein called the transcriptome, using serial analysis of gene expression. Analysis of 60,633 transcripts revealed 4,665 genes, with expression levels ranging from 0.3 to over 200 transcripts per cell. Of these genes, 1981 had known functions, while 2684 were previously uncharacterized. The integration of positional information with gene expression data allowed for the generation of chromosomal expression maps identifying physical regions of transcriptional activity and identified genes that had not been predicted by sequence information alone. These studies provide insight into global patterns of gene expression in yeast and demonstrate the feasibility of genome-wide expression studies in eukaryotes.


Readme for HexiRPA Document - my personal notes on reading papers

Hexi Reading Professional Articles (HexiRPA):

Readme for the document:

For each entry, there are 5 sections: Paper Tag (Lines 1-2), Read Date (Line 3), PubMed Citation (Lines 4-9), MyNotes (Lines 10-12) and End Tag (Line 13)
Line 1 : Paper No. (HI: HexiRPA Identifier)
Line 2 : Document name on my PC (DN: Document Name)
Line 3 : Date the paper was read (DA: DAte)
Line 4 : Citation in PubMed format (CP: Citation of PubMed)
Line 5 : Title (TI: TItle)
Line 6 : Authors (AU: AUthor)
Line 7 : Institutes [one or more lines] (IN: INstitutes)
Line 8 : Abstract (AB: ABstract)
Line 9 : PubMed ID (PM: PubMed ID)
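
Because every line starts with a two-letter tag, entries in this layout are easy to read back with a script. The Python sketch below is only one possible reader, under these assumptions: each field is "TAG value" on its own line, an HI line starts a new entry, and an untagged line continues the previous field (as allowed for IN). The tags for MyNotes and End Tag are not listed above, so they are not handled. The function name is hypothetical and the sample is abridged from the entry shown earlier, with the IN field wrapped by hand for illustration.

```python
KNOWN_TAGS = ("HI", "DN", "DA", "CP", "TI", "AU", "IN", "AB", "PM")


def parse_entries(text):
    """Parse tagged lines into a list of dicts, one dict per entry."""
    entries = []
    current = None   # entry being filled in
    last_tag = None  # most recent tag, used for continuation lines
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag, _, value = line.partition(" ")
        if tag in KNOWN_TAGS:
            if tag == "HI":
                # An HI line opens a new entry.
                current = {}
                entries.append(current)
            if current is None:
                continue  # tagged line before any HI line; ignore it
            current[tag] = value.strip()
            last_tag = tag
        elif current is not None and last_tag is not None:
            # Untagged line: treat it as a continuation of the previous field.
            current[last_tag] += " " + line
    return entries


sample = """HI HexiRPA000010
DA 2006.07.30
CP Cell. 1997 Jan 24;88(2):243-51.
TI Characterization of the yeast transcriptome.
IN Program in Human Genetics and Molecular Biology,
The Johns Hopkins University School of Medicine
"""
entries = parse_entries(sample)
```

Each entry comes back as a plain dict keyed by tag, so `entries[0]["TI"]` gives the title of the first paper.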


A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes, from PN

Based on a comparison of different sequencing strategies in six small marine microbial genomes, the paper evaluated the utility and cost-effectiveness of a hybrid approach that combines 3730xl Sanger sequencing with 454 runs to generate higher-quality, lower-cost assemblies than current Sanger-only strategies. For genomes larger than 3 Mb with many sequencing gaps and hard stops, 5.3X Sanger sequencing plus two 454 runs is the best choice.

Proc Natl Acad Sci U S A. 2006 Jul 13; [Epub ahead of print]

A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes.


A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies, Nat Biotech

The paper described a framework for comparing gene expression measurements across microarray platforms and laboratories, including: 1) Affymetrix; 2) Agilent; 3) Applied Biosystems (ABI); 4) Amersham (now GE Healthcare); 5) cDNA arrays provided by the Cepko laboratory (academic cDNA); 6) Compugen (now Sigma-Genosys); 7) Mergen; 8) long oligonucleotide arrays from the Microarray Core facility at Massachusetts General Hospital (MGH long oligo); 9) MWG BioTech (now Ocimum Biosolutions); 10) Operon. The commercial ABI platform had the best performance, while the academic cDNA arrays from Harvard performed worst.


Scientist rankings arranged by topic

I have been working on a citation ranking system to rank researchers in subject-specific fields. You can see my beta launch of this service at http://www.biolicious.com. We are adding new topics quite often, but the parsing is quite a lot of work and we only have 1 dedicated box right now.


Who can find a paper of the month?

I was skimming through my RSS feeds in search of a "paper of the month" and I came up short. It was rather disheartening actually - a lot of current publications in bioinformatics seem to consist of:

  • new algorithms without practical application
  • findings of low general interest by beginners using the most basic of tools, e.g. BLAST
  • badly designed database frontends with no functionality

I'm beginning to worry that bioinformatics is in danger of failing to live up to its promises. We have to convince the unenlightened that our tools, applied intelligently, can provide meaningful insight into real biological problems of fundamental interest. Yet I see little evidence of this at the moment.
Could someone please find a really good paper or suggest a fantastic collaborative project? Otherwise I'll get really depressed.


Impact factors

Spitshine over at A Bioinformatics Blog muses over the importance (or otherwise) of impact factor. I was going to comment over there but it involved some registration with which I wasn't too comfortable (sorry mate).

I agree with his post - impact factor has always struck me as one of the great lies of science and one that I don't want to live. I'd even say that how someone feels about impact factor is one of the criteria that I use to determine what kind of scientist they are - the stuffy, traditional, unimaginative kind or the enlightened and serene type that inhabits Nodal :).

Unfortunately, many people that I know are happy to live this lie. Whenever I'm writing up a paper, there is always discussion about where to send it and the impact factor. It's an entrenched attitude that won't change until more people are willing to just "put it out there", in an appropriate forum, as opposed to worrying about status.


And so another month is almost gone...

As much a reminder to myself as a spur for the rest of you...

Can it be the end of April already? Where does the time go? More importantly, get thinking about your next submission for "Bioinformatics paper of the month" in 2 days' time.

