Scanning -> dat file contains one intensity per pixel
49 pixels per cell are summarized by 75th percentile after removal of outer perimeter pixels. This is the cell intensity, each cell corresponding to a probe.
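The pixel-to-cell summary above can be sketched in a few lines. This is only an illustration under stated assumptions: the 49 pixels are taken to form a 7 x 7 block and the perimeter is trimmed from that block (the slide does not say whether 49 is counted before or after trimming). The function name and the example values are made up.

```python
import numpy as np

def cell_intensity(pixels, border=1, q=75):
    """Summarize one cell's pixel block by a percentile of its interior pixels."""
    pixels = np.asarray(pixels, dtype=float)
    # Drop the outer perimeter (assumed to be `border` pixels wide).
    inner = pixels[border:-border, border:-border]
    return float(np.percentile(inner, q))

# Hypothetical 7 x 7 cell: uniform interior with a dimmer edge.
cell = np.full((7, 7), 100.0)
cell[1:-1, 1:-1] = 200.0
print(cell_intensity(cell))  # 200.0
```

Trimming the edge first keeps pixels that straddle neighbouring cells out of the summary.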
On the HG-U133 chip, each target is represented by a set of 11 pairs of PM:MM probes.
The MM probe is obtained by complementing the middle base in the PM oligo and is meant to be an internal control assumed to hybridize to nonspecific sequences about as effectively as its PM counterpart.
Each PM probe is a 25 base long oligo selected with the objective of achieving linearity between log intensity and log concentration.
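The PM:MM construction described above is easy to make concrete. The sketch below derives an MM sequence from a PM sequence by complementing the middle base (position 13 of 25). The sequence is invented for illustration and is not a real probe.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mm_from_pm(pm):
    """Return the MM sequence: the PM sequence with its middle base complemented."""
    if len(pm) % 2 == 0:
        raise ValueError("probe length must be odd to have a middle base")
    mid = len(pm) // 2  # index 12, i.e. the 13th base of a 25-mer
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]

pm = "ACGTACGTACGTACGTACGTACGTA"  # made-up 25-mer
mm = mm_from_pm(pm)
```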
How to combine the PM:MM intensities into a measure of expression for the target?
Gel electrophoresis patterns obtained at different stages of preparation are also used to make qualitative assessments of the RNA samples.
Sample quality assessment by gel electrophoresis
For total RNA, look for 18S and 28S bands (not shown here).
For cDNA, a good sample will produce a smear extending from top to bottom of the gel.
Unfragmented cRNA will also produce a smear running down the gel.
Fragmented cRNA should appear as a blob at the bottom of the gel, indicating that the cRNA has been successfully fragmented into pieces about 50 bp in length.
Next slide from Vanderbilt MicroArray Shared Resource web site
Affymetrix standards for post hybridization and scanning quality assessment – examination of quality report.
Array quality metrics:
Raw Q (Noise): The degree of pixel-to-pixel variation among the probe cells used to calculate the background = average over background cells (lower 2 percentile) of cell pixel intensity standard error. Between 1.5 and 3.0 is ok. Use scaled noise to get consistency between arrays.
Scaling factor ~ 100/2% trimmed mean of intensities (not logged). Should be kept below 10. Key is consistency across arrays.
Background ~ average of cell intensities in the lowest 2 percentile, by region, with smoothing. No range. Key is consistency.
Percent present calls (i.e. is PM > MM?). Typical range is 20-50%.
Note – All these quantities, including noise, can be extracted from the cel file.
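A rough sketch of two of the metrics listed above, computed from a vector of cell intensities. The assumptions are loud ones: the 2% trimmed mean is taken to drop 2% at each end, the target of 100 comes from the slide, and the background is a whole-array simplification with no regions and no smoothing. The function names are made up and this is not the Affymetrix implementation.

```python
import numpy as np

def trimmed_mean(x, trim=0.02):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    x = np.sort(np.asarray(x, dtype=float))
    cut = int(round(trim * x.size))
    return float(x[cut:x.size - cut].mean())

def scaling_factor(intensities, target=100.0, trim=0.02):
    """Multiplicative factor that brings the trimmed mean to `target`."""
    return target / trimmed_mean(intensities, trim)

def crude_background(cell_intensities, fraction=0.02):
    """Average of the lowest `fraction` of cells (no regions, no smoothing)."""
    x = np.sort(np.asarray(cell_intensities, dtype=float))
    n_low = max(1, int(round(fraction * x.size)))
    return float(x[:n_low].mean())

# Toy intensities 1..100, for illustration only.
vals = np.arange(1, 101)
sf = scaling_factor(vals)    # 100 / mean(3..98) = 100 / 50.5
bg = crude_background(vals)  # mean(1, 2) = 1.5
```

As the slide stresses, what matters in practice is comparing these values across the arrays of one experiment, not their absolute level.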
Affymetrix standards - Examination of spikes and poly A controls
Hybridization controls: bioB, bioC, bioD and cre. “At 1.5 pM bioB should be called Present 70% of the time. … the others should be called present 100% of the time with increasing Signal value (bioC, bioD, and cre, resp.)” Check that bio C, representing the minimum specification of detection, is present.
Poly A controls: dap, lys, phe, thr, tryp. Used to monitor wet lab work. Sense strand cRNAs synthesized from the control genes can be added to samples prior to the reverse transcription step to monitor target synthesis and labeling efficiencies. Antisense cRNA transcripts can be added to the target cRNA sample to monitor the amplification and labeling steps.
Housekeeping/Control Genes: GAPDH, beta-Actin, ISGF-3 (STAT1): 3’ and 5’ signal intensity ratios of control probe sets (GAPDH, Beta Actin): “A 1:1 molar ratio of the 3’ and 5’ transcript regions will not necessarily give a signal ratio of 1”
All controls appear on the chip in both sense strand (_st) and antisense strand (_at) versions, and all have probe sets chosen from the 5’, M and 3’ ends of the target transcript.
Affymetrix standards - Examination of other spike ins or control probe sets:
Normalization Control Set: 100 probe sets replicated on both A and B arrays (new to HG-U133) – these are a set of genes found to be called present with low MAS4 signal variability in a large set of tissues.
Linearity and sensitivity of amplification as quantified using spike-in bacterial cRNA.
Chip dat file – checkered board – close up w/ grid
Examination of cel file
Chip cel file – checkered board
Chip cel file – checkered board – close up w/ grid
Chip cel file – PM - MM
Limitations of standard QC metrics and procedure
Link between these metrics and the numbers we care about is missing.
Quality of data gauged from spike-ins requiring special processing may not represent the quality of the rest of the data on the chip – risk of QCing the chip QC process itself, but not the gene expression data.
Good end-point data quality assessment is needed to assess the validity of these indirect data quality assessments.
Review of models for gene expression value estimation
MAS 5 (Microarray Suite 5 by Affymetrix)
Expression measures are derived as follows in Affymetrix’ Microarray Suite 5.0:
A background correction is applied to the probe intensities.
For each probe set, the log expression is estimated by means of a one-step Tukey biweight average of $\log(PM_j - MM_j^*)$, where $MM_j^*$ is an MM value modified to ensure that it does not exceed the PM value. This is equivalent to robustly estimating the parameter $\mu$ in the model $\log(PM_j - MM_j^*) = \mu + \epsilon_j$.
To compare expression measures across chips, expression values are normalized by a multiplicative scaling factor. This is equivalent to shifting the expression values on the log scale.
See Affy technical description [1].
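The probe set summary step above can be sketched as a one-step Tukey biweight. The tuning constant c = 5, the small constant eps = 0.0001 and the base-2 log are assumptions, recalled from Affymetrix's algorithm description and not taken from these slides. The adjusted mismatch values MM* are supplied by the caller, because the rule for computing them is not given here. The function names are made up.

```python
import numpy as np

def tukey_biweight(x, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate, started from the median."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)
    s = np.median(np.abs(x - m))  # median absolute deviation (unscaled)
    u = (x - m) / (c * s + eps)
    # Bisquare weights: points far from the median get weight zero.
    w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
    return float(np.sum(w * x) / np.sum(w))

def mas5_like_signal(pm, mm_star):
    """Log2 expression for one probe set; requires mm_star < pm elementwise."""
    pm = np.asarray(pm, dtype=float)
    mm_star = np.asarray(mm_star, dtype=float)
    return tukey_biweight(np.log2(pm - mm_star))

# Made-up probe set in which every PM - MM* difference equals 4.
signal = mas5_like_signal([10, 20, 36, 68], [6, 16, 32, 64])  # log2(4) = 2
```

Because the estimate takes a single reweighting step from the median, one wild probe pair changes the summary very little.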
RMA
The Robust Multichip Average is an expression measure obtained from analysing a set of chips in the following way:
A background correction is applied to probe intensities [3].
A probe intensity normalization vector is computed from the set of chips and the intensities of each chip normalized to this vector [4].
For one probe set, the logs of the background corrected and normalized probe intensities, $Y_{jk}$ say, are modelled as the sum of a chip effect and a probe effect:
$Y_{jk} = \beta_j + \alpha_k + \epsilon_{jk}$
where k indexes chips and j indexes probes.
To produce the RMA expression values, the model is fitted robustly and the estimated parameters $\alpha_k$ are used as estimates of log expression for each chip.
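One way to fit the additive probe-plus-chip model robustly is Tukey's median polish, which is the fit used in the published RMA procedure. The slides only say "fitted robustly", so treat the choice as an assumption. The sketch works on a J x K matrix (probes by chips) of log intensities for a single probe set; the names and toy numbers are made up.

```python
import numpy as np

def median_polish(Y, n_iter=10, tol=1e-6):
    """Fit Y[j, k] ~ overall + probe[j] + chip[k] by alternating median sweeps."""
    resid = np.array(Y, dtype=float)
    J, K = resid.shape
    overall, probe, chip = 0.0, np.zeros(J), np.zeros(K)
    for _ in range(n_iter):
        # Sweep row (probe) medians out of the residuals.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        probe += row_med
        delta = np.median(chip)
        chip -= delta
        overall += delta
        # Sweep column (chip) medians out of the residuals.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        chip += col_med
        delta = np.median(probe)
        probe -= delta
        overall += delta
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, probe, chip, resid

# Toy probe set: 5 probes on 3 chips, exactly additive.
probe_true = np.array([-1.0, 0.0, 0.5, 2.0, -0.3])
chip_true = np.array([7.0, 8.0, 9.5])
Y = probe_true[:, None] + chip_true[None, :]
overall, probe, chip, resid = median_polish(Y)
expression = overall + chip  # per-chip log expression estimates
```

The residual matrix returned by the fit is the raw material for the quality indicators discussed later.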
RMA vs MAS5
Background correction is different – Affymetrix removes a fixed amount with some local adjustment; RMA uses a model which results in an intensity dependent bg correction.
Normalization is at probe level and intensity dependent.
Multichip analysis enables the estimation of probe effects.
RMA expression values have been shown to be highly reproducible and to detect changes in target mRNA concentration with great sensitivity [5, 6].
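The probe-level normalization mentioned above, in which a normalization vector is computed from the set of chips and each chip is mapped onto it, can be illustrated with quantile normalization, the method RMA uses. This is a minimal sketch that breaks ties arbitrarily; the function name and toy matrix are made up.

```python
import numpy as np

def quantile_normalize(X):
    """Rows are probes, columns are chips; give every chip the same distribution."""
    X = np.asarray(X, dtype=float)
    order = np.argsort(X, axis=0)
    sorted_cols = np.take_along_axis(X, order, axis=0)
    # Normalization vector: mean of the k-th smallest values across chips.
    reference = sorted_cols.mean(axis=1)
    out = np.empty_like(X)
    # Put the reference quantiles back in each chip's original probe order.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out, reference

X = np.array([[5.0, 4.0, 3.0],
              [2.0, 1.0, 4.0],
              [3.0, 4.0, 6.0],
              [4.0, 2.0, 8.0]])
normalized, reference = quantile_normalize(X)
```

After this step the sorted intensities of every chip are identical, which is what makes the correction intensity dependent.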
Our main interest here is in the use of model fit results for quality assessment. The size of the residuals from a fit indicates the quality of the fit and the variability of the parameter estimates. These can be summarized and visualized in various ways to provide chip expression data quality indicators.
Affymetrix public dataset - Spike-in design below is repeated 3 times with chips from different lots. (One large sample prepared from pancreas polya+ mRNA)
The model fits – ex 1
The model fits – ex 2
LS fit
If we assume all measurements are equally precise we obtain the simplified error model
$Y_{jk} = \beta_j + \alpha_k + \epsilon_{jk}$, with $\epsilon_{jk} \sim$ iid $N(0, \sigma^2)$
This model is commonly fitted by LS with parameter estimates:
$b_j = \bar{Y}_{j\cdot}$ – the mean of the observations for probe j
$a_k = \bar{Y}_{\cdot k}$ – the mean of the observations for chip k
$s^2 = \sum_{j,k} r_{jk}^2 / (n - J - K + 1)$
Under this model, parameters have estimated standard errors:
$SE(a_k) = s/\sqrt{J}$, $SE(b_j) = s/\sqrt{K}$
i.e. Every chip expression has the same estimated variability.
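The LS quantities above can be computed directly for one probe set. One assumption is added here to make the two sets of effects identifiable: the probe effects are constrained to sum to zero, so the probe estimate is the probe mean minus the grand mean, while the chip estimate is the plain chip mean as on the slide. The names and toy data are made up.

```python
import numpy as np

def ls_fit(Y):
    """Least squares fit of Y[j, k] = probe[j] + chip[k] + error."""
    Y = np.asarray(Y, dtype=float)
    J, K = Y.shape
    a = Y.mean(axis=0)             # chip estimates: chip means
    b = Y.mean(axis=1) - Y.mean()  # probe estimates, centered (assumption)
    resid = Y - b[:, None] - a[None, :]
    n = J * K
    s = np.sqrt(np.sum(resid**2) / (n - J - K + 1))
    return {"a": a, "b": b, "resid": resid, "s": s,
            "se_a": s / np.sqrt(J), "se_b": s / np.sqrt(K)}

rng = np.random.default_rng(1)
Y = (np.linspace(-1, 1, 11)[:, None] + np.array([7.0, 8.0, 9.0])[None, :]
     + rng.normal(0, 0.1, size=(11, 3)))
fit = ls_fit(Y)
```

Note that `se_a` is a single number: under this model a chip with a scratch gets the same estimated precision as a clean one, which is why LS is of little use for quality assessment.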
Robust fit
The least squares fit provides optimal (unbiased, asym min var) estimates when the model is true, but the LS estimates produced under slight departures from the assumed model soon lose their good properties. Robust fitting procedures have been devised to produce estimates which are good under the assumed model and remain so under slight departures from these assumptions.
A commonly used robust fitting procedure is iteratively reweighted least squares, in which an initial guess at the fit is followed by a sequence of weighted LS fits, with the weights derived from the previous fit as follows:
Estimate the scale: S = mad(res)
Weights are given by: wjk = psi.huber(abs(rjk/S))
Weighted LS fit estimates and estimated standard errors are given by:
Image artefacts (scratches, bubbles, uneven hybridization, glare in scan) being a common occurrence, the gross error model is more realistic than the iid Normal model.
Because of cross-hybridisation, and other reasons, probes within probe sets do not all respond the same way – the robust fitting procedure will go with the majority of the probes.
The proof of the benefits of robustly fitting the model will be in the pudding (but that is not to be tasted today)
For QC purposes, it is essential to use a robust fitting procedure in order to let the outliers speak out.
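A minimal sketch of the iteratively reweighted least squares procedure described in this section, for one probe set. The assumptions: Huber weights min(1, k/|u|) with k = 1.345 (the default of `psi.huber` in R's MASS package), the MAD rescaled by 1/0.6745 as in R's `mad`, probe effects constrained to sum to zero, and a fixed number of iterations instead of a convergence test. The names and toy data are made up.

```python
import numpy as np

def huber_weight(u, k=1.345):
    """Huber weights: 1 for small standardized residuals, k/|u| beyond k."""
    return k / np.maximum(np.abs(u), k)

def design_matrix(J, K):
    """Row j*K + k has a chip indicator plus sum-to-zero probe coding."""
    X = np.zeros((J * K, K + J - 1))
    for j in range(J):
        for k in range(K):
            row = j * K + k
            X[row, k] = 1.0
            if j < J - 1:
                X[row, K + j] = 1.0
            else:
                X[row, K:] = -1.0
    return X

def irls_fit(Y, n_iter=20, k=1.345):
    Y = np.asarray(Y, dtype=float)
    J, K = Y.shape
    X, y = design_matrix(J, K), Y.ravel()
    w = np.ones_like(y)  # the first pass is an ordinary LS fit
    for _ in range(n_iter):
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        res = y - X @ coef
        S = np.median(np.abs(res - np.median(res))) / 0.6745  # scale: mad(res)
        if S == 0:
            break
        w = huber_weight(res / S, k)
    chip = coef[:K]
    probe = np.append(coef[K:], -coef[K:].sum())
    return chip, probe, res.reshape(J, K), w.reshape(J, K)

# Toy probe set: 11 probes, 6 chips, one gross error at probe 3 on chip 2.
rng = np.random.default_rng(0)
probe_true = np.linspace(-1, 1, 11)
chip_true = np.array([7.0, 7.5, 8.0, 8.5, 9.0, 9.5])
Y = probe_true[:, None] + chip_true[None, :] + rng.normal(0, 0.1, (11, 6))
Y[3, 2] += 10.0
chip, probe, resid, w = irls_fit(Y)
```

In this toy case the contaminated cell ends up with the smallest weight and a large residual, so the outlier "speaks out" instead of being absorbed into the chip estimate as it would be by a plain mean.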
Assessing chip expression data quality
Chip expression data quality assessment
Having fitted models at the probe set level across a set of chips, we want to derive some chip specific quantities to be used as indicators of overall chip expression data quality.
1. Look at the set of residuals for a chip over all probe sets, one residual per probe. Compare these batches of residuals across chips. Chips with a large number of bad probes will have larger residuals – look at the IQR.
2. First summarize the residuals into a probe set SE for the expression value for each chip, and compare batches of SEs between chips.
3. SEs in 2 are a heterogeneous mix – can use batches of unscaled SEs to compare chips.
4. Can normalize further by rescaling by the median chip unscaled SE.
All the above produce a batch of numbers for each chip. Need to have one, or a few numbers, per chip. Start with median of set in 3.
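Two of the per-chip summaries above, sketched on made-up arrays. The reading of "rescaling by the median chip unscaled SE" used here is an assumption: each probe set's SEs are divided by their median across chips, and the result is then reduced to one median per chip. How the SEs themselves are obtained from the fit is not shown.

```python
import numpy as np

def residual_iqr(resid):
    """IQR of probe-level residuals for each chip; resid is probes x chips."""
    q75, q25 = np.percentile(np.asarray(resid, dtype=float), [75, 25], axis=0)
    return q75 - q25

def median_normalized_se(se):
    """se is probe sets x chips; returns one number per chip."""
    se = np.asarray(se, dtype=float)
    # Rescale each probe set by its median SE across chips (assumed reading).
    normalized = se / np.median(se, axis=1, keepdims=True)
    return np.median(normalized, axis=0)

# Toy data: the last chip has SEs twice as large as the others.
base = np.array([0.05, 0.10, 0.20, 0.40])
se = base[:, None] * np.array([1.0, 1.0, 1.0, 2.0])[None, :]
per_chip = median_normalized_se(se)  # [1, 1, 1, 2]

resid = np.column_stack([np.arange(5.0), 2 * np.arange(5.0)])
spread = residual_iqr(resid)         # [2, 4]
```

Dividing by the per-probe-set median removes the heterogeneity between probe sets, so a chip whose summary sits well above 1 stands out regardless of which genes are expressed.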
Data Picture 1.
Data Picture 2.
Analyzing chips one at a time
Some will want to analyze chips one at a time – either because they have too few, or in some cases, too many, to analyze in batches.
We can then get a probe set summary by summarizing the probe effect corrected intensities Zj robustly – a=T(Z), where T is the median, a trimmed mean or another robust summary (note that we only have 11 points here).
Subtracting the probe set summary from the probe effect corrected intensities produces a set of residuals – rj = Zj-a.
The residuals can be turned into weights using the estimate of scale from the fitted model – wj=psi.huber(rj/S).
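The single-chip steps above can be sketched for one probe set. The probe effects and the scale S are assumed to come from a model fitted earlier on other chips, and Z is taken to be the log intensities minus those probe effects. The Huber weight uses k = 1.345, the default of `psi.huber` in R's MASS package. All names and numbers are made up.

```python
import numpy as np

def huber_weight(u, k=1.345):
    """Huber weights: 1 for small standardized residuals, k/|u| beyond k."""
    return k / np.maximum(np.abs(u), k)

def single_chip_summary(y, probe_effects, scale, summary=np.median):
    """Return expression summary a, residuals r and weights w for one probe set."""
    z = np.asarray(y, dtype=float) - np.asarray(probe_effects, dtype=float)
    a = float(summary(z))  # a = T(Z)
    r = z - a              # r_j = Z_j - a
    w = huber_weight(r / scale)
    return a, r, w

# Toy probe set of 11 probes with true log expression 8 and one bad probe.
probe_effects = np.linspace(-1, 1, 11)
y = probe_effects + 8.0
y[4] += 5.0
a, r, w = single_chip_summary(y, probe_effects, scale=0.2)
```

With only 11 points per probe set the summary is coarse, but the weights still flag the probe that disagrees with the rest.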
Chip expression data quality assessment example
Probe level data images:
Residual chip pseudo-images
Weight chip pseudo-images
Probe level data:
Boxplot residuals
Bar 10th percentile of weights distribution
Probe set level data:
Boxplot SEs
Boxplot unscaled SEs
Boxplot normalized unscaled SEs
data set used for illustration
Case study 1 – 24 chips with common pancreas RNA preparation (part of Affymetrix Latin square experiment)
We first look at the RMA derived diagnostics, case by case.
Then look at control and housekeeping genes and 3’-5’ bias, by analysis (to make comparisons between the sources of RNA easier).
[1] New Statistical Algorithms for Monitoring Gene Expression on GeneChip® Probe Arrays, Affymetrix technical report.
[2] Array Design for the GeneChip® Human Genome U133 Set, Affymetrix technical note.
[3] Discussion on Background, Ben Bolstad.
[4] Bolstad BM, et al. (2003), A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19(2):185-193.