The paper describes a framework for comparing gene expression microarray platforms and laboratories. The ten platforms compared are: 1) Affymetrix; 2) Agilent; 3) Applied Biosystems (ABI); 4) Amersham (now GE Healthcare); 5) cDNA arrays provided by the Cepko laboratory (academic cDNA); 6) Compugen (now Sigma-Genosys); 7) Mergen; 8) long oligonucleotide arrays from the Microarray Core facility at Massachusetts General Hospital (MGH long oligo); 9) MWG BioTech (now Ocimum Biosolutions); 10) Operon. In the results, the commercial ABI platform performed best and the academic cDNA arrays from Harvard performed worst.
Nat Biotechnol. 2006 Jul;24(7):832-840. Epub 2006 Jul 2.
A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies.
Kuo WP, Liu F, Trimarchi J, Punzo C, Lombardi M, Sarang J, Whipple ME, Maysuria M, Serikawa K, Lee SY, McCrann D, Kang J, Shearstone JR, Burke J, Park DJ, Wang X, Rector TL, Ricciardi-Castagnoli P, Perrin S, Choi S, Bumgarner R, Kim JH, Short GF 3rd, Freeman MW, Seed B, Jensen R, Church GM, Hovig E, Cepko CL, Park P, Ohno-Machado L, Jenssen TK.
[1] Department of Developmental Biology, Harvard School of Dental Medicine, 188 Longwood Ave., Boston, Massachusetts 02115, USA.
[2] Department of Genetics, Harvard Medical School, Howard Hughes Medical Institute, Boston, Massachusetts, USA.
[3] Decision Systems Group, Brigham and Women's Hospital, Boston, Massachusetts, USA.
[4] These authors contributed equally to this work.
Over the last decade, gene expression microarrays have had a profound impact on biomedical research. The diversity of platforms and analytical methods available to researchers have made the comparison of data from multiple platforms challenging. In this study, we describe a framework for comparisons across platforms and laboratories. We have attempted to include nearly all the available commercial and 'in-house' platforms. Using probe sequences matched at the exon level improved consistency of measurements across the different microarray platforms compared to annotation-based matches. Generally, consistency was good for highly expressed genes, and variable for genes with lower expression values as confirmed by quantitative real-time (QRT)-PCR. Concordance of measurements was higher between laboratories on the same platform than across platforms. We demonstrate that, after stringent preprocessing, commercial arrays were more consistent than in-house arrays, and by most measures, one-dye platforms were more consistent than two-dye platforms.
PMID: 16823376 [PubMed - as supplied by publisher]
Protocols for microarray:
1. the Minimum Information about a Microarray Experiment (MIAME)
http://www.mged.org/Workgroups/MIAME/miame.html
2. the External RNA Control Consortium (ERCC)
http://www.cstl.nist.gov/biotech/Cell&TissueMeasurements/GeneExpression/...
3. the MicroArray Quality Control (MAQC) Project:
http://www.fda.gov/nctr/science/centers/toxicoinformatics/maqc
Microarray Data:
1. Gene Expression Omnibus (GEO)
2. ArrayExpress
Bias-inducing Factors:
1. nonidentical samples on different platforms
2. samples not sufficiently distinct
3. samples processed using different protocols
4. lack of technical replicates
5. data preprocessing steps not standardized
6. only a few types of platforms directly compared
7. measurements matched using probe annotations
8. 'agreement' not unambiguously quantified
9. insufficient biological validation
Comparisons on ten different microarray platforms:
1. Affymetrix
2. Agilent
3. Applied Biosystems (ABI)
4. Amersham (now GE Healthcare)
5. cDNA arrays provided by the Cepko laboratory (academic cDNA)
6. Compugen (now Sigma-Genosys)
7. Mergen
8. long oligonucleotide arrays from the Microarray Core facility at Massachusetts General Hospital (MGH long oligo)
9. MWG BioTech (now Ocimum Biosolutions)
10. Operon
5 replicate assays for each sample
Intra-platform Comparisons:
1. see Table 1
Correlation (Pearson + Spearman)
Highest: ABI
Lowest: academic cDNA
2. Coefficients of Variation (CVs)
Best: ABI, Affymetrix, Amersham, Agilent
Poorest: academic cDNA
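The intra-platform measures listed above (Pearson and Spearman correlation between replicates, and the coefficient of variation across replicates) can be sketched in a few lines of Python. This is only an illustration, not the paper's code: the intensity values are made up, the helper names (`pearson`, `spearman`, `cv`) are hypothetical, and the sketch assumes the CV is computed on untransformed intensities with the sample standard deviation.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    """Pearson correlation between two equally long lists of measurements."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(xs):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def cv(xs):
    """Coefficient of variation: sample standard deviation / mean."""
    m = mean(xs)
    sd = math.sqrt(sum((a - m) ** 2 for a in xs) / (len(xs) - 1))
    return sd / m

# Made-up intensities for five probes on two replicate arrays.
rep1 = [120.0, 850.0, 40.0, 2300.0, 310.0]
rep2 = [135.0, 790.0, 55.0, 2500.0, 290.0]

# Made-up intensities for one probe across five replicate assays.
probe_replicates = [100.0, 110.0, 90.0, 105.0, 95.0]

print("Pearson: ", round(pearson(rep1, rep2), 4))
print("Spearman:", round(spearman(rep1, rep2), 4))
print("CV:      ", round(cv(probe_replicates), 4))
```

A platform with high replicate correlation and low per-probe CV would rank well by both measures, which matches how the notes above order the platforms.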
Inter-platform Comparisons:
1. Probe measurements were mapped to the following gene identifiers:
a. UniGene (UG)
b. LocusLink (LL)
c. RefSeq (RS)
d. RefSeq exon (RSEXON)
See Figure 1
e.g. NM_008086 (Gas1)
2. Assessment of measurement deviation from pseudo-nominal values
? What is an outlier (statistics)
Principal Component Analysis (PCA) Plot
See Figure 2 ?
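Once probes on every platform carry a common identifier (UG, LL, RS or RSEXON), comparing two platforms amounts to joining their measurements on that key. The following is a minimal sketch under stated assumptions: each platform is represented as a dict from identifier to a log2 ratio between two samples, all values are made up, the `NM_EXAMPLE_*` identifiers are placeholders, and `match_by_identifier` and `sign_agreement` are hypothetical helpers. The sign-agreement fraction is a simple stand-in for a concordance measure, not the statistic used in the paper.

```python
def match_by_identifier(platform_a, platform_b):
    """Return (identifier, value_a, value_b) for identifiers present on both platforms."""
    shared = sorted(set(platform_a) & set(platform_b))
    return [(key, platform_a[key], platform_b[key]) for key in shared]

def sign_agreement(pairs):
    """Fraction of matched pairs whose log ratios point in the same direction."""
    same = sum(1 for _, a, b in pairs if (a > 0) == (b > 0))
    return same / len(pairs)

# Made-up log2 ratios keyed by identifier; only NM_008086 (Gas1) comes from the notes.
platform_a = {
    "NM_008086": 1.8,
    "NM_EXAMPLE_1": -0.4,
    "NM_EXAMPLE_2": 2.1,
    "NM_EXAMPLE_3": 0.3,   # measured on platform A only
}
platform_b = {
    "NM_008086": 1.5,
    "NM_EXAMPLE_1": 0.2,
    "NM_EXAMPLE_2": 1.9,
    "NM_EXAMPLE_4": -1.0,  # measured on platform B only
}

pairs = match_by_identifier(platform_a, platform_b)
agreement = sign_agreement(pairs)

for key, a, b in pairs:
    print(key, a, b)
print("sign agreement:", round(agreement, 3))
```

The choice of key is the point of the comparison in the notes: the same join run with a coarse key (e.g. UG) pairs up probes that may target different exons, whereas an exon-level key (RSEXON) restricts matches to probes that target the same sequence region.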
Inter-laboratory Comparison:
Data from 3 platforms: a) Affymetrix, b) Amersham, c) Mergen
Difference between one-dye and two-dye platforms?
Discussion:
1. Why Cortex and Retina:
Cortex: brain tissues are generally considered to have broad expression profiles
Retina: has some well-known tissue specific transcripts
2. One-dye platforms: ABI, Affymetrix, Amersham, Mergen
Two-dye platforms: the other six.
Methods:
1. Preprocessing of Microarray data:
a. Normalization
b. Transformation
c. Filtering
2. Mapping of Genes across platforms:
a. annotation-based approaches: MatchMiner (UG, LL)
b. sequence-based approaches: UCSC, BLAT
3. Analysis tools/software:
a. R software environment (http://www.R-project.org)
b. Bioconductor packages
c. MATLAB
4. Biological validations:
a. genes should be present in at least six platforms (4+2): the one-dye platforms plus two other platforms
b. genes should span the dynamic range
c. genes should include pairs with measurements that were in disagreement.
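The first validation criterion can be expressed as a small filter. This sketch reads "4+2" as "present on all four one-dye platforms plus at least two of the remaining platforms", which is an interpretation of the notes rather than a statement from the paper. The gene names and platform lists are made up, and `eligible_for_validation` is a hypothetical helper.

```python
# The four one-dye platforms named in the Discussion notes.
ONE_DYE = {"ABI", "Affymetrix", "Amersham", "Mergen"}

def eligible_for_validation(platforms_with_gene, min_other=2):
    """True if the gene is on every one-dye platform and on at least
    `min_other` additional platforms (assumed reading of "4+2")."""
    present = set(platforms_with_gene)
    others = present - ONE_DYE
    return ONE_DYE <= present and len(others) >= min_other

# Made-up example: which platforms carry a matched probe for each gene.
candidates = {
    "gene_A": ["ABI", "Affymetrix", "Amersham", "Mergen", "Agilent", "Operon"],
    "gene_B": ["ABI", "Affymetrix", "Amersham", "Agilent", "Operon", "Compugen"],
    "gene_C": ["ABI", "Affymetrix", "Amersham", "Mergen", "Agilent"],
}

selected = sorted(g for g, p in candidates.items() if eligible_for_validation(p))
print(selected)
```

Here gene_B is rejected because it is missing from one of the one-dye platforms, and gene_C because it appears on only one additional platform. Criteria b and c (spanning the dynamic range, including discordant pairs) would need expression values and are not covered by this sketch.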

