Mammography data (X-ray radiography of the breast) are available for a handful of patients from the TOP trial. The images are stored in the DICOM format at a resolution of 70 μm and have no associated annotations (e.g. tumour contours).
The Affymetrix U133 Plus 2.0 array contains probes for more than 38,500 transcripts corresponding to well-characterized genes and UniGene genes, giving a genome-wide view of gene expression. The raw data are stored in “.CEL” files, and a number of pre-processing steps are required to extract them and produce gene expression estimates. These steps, namely background correction, normalization, and summarization, are often combined into a single all-in-one pre-processing algorithm that takes raw probe intensities as input and produces gene expression estimates as output.
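To make the three pre-processing steps concrete, the following Python sketch chains them into one function. It is an illustration under stated assumptions, not the pipeline used for these data: the function names are hypothetical, the input is assumed to be a probes × arrays matrix of raw intensities already read from the “.CEL” files, and the background correction is a deliberately simplified stand-in (subtraction of a low percentile per array). Quantile normalization and median-polish summarization are the choices made by RMA, one well-known all-in-one algorithm.

```python
import numpy as np

def background_correct(intensities, percentile=2.0, floor=1.0):
    # Simplified stand-in (assumption): subtract a low percentile per array
    # and clip at `floor` so the later log2 transform stays defined.
    background = np.percentile(intensities, percentile, axis=0, keepdims=True)
    return np.maximum(intensities - background, floor)

def quantile_normalize(intensities):
    # Give every array (column) the same empirical distribution:
    # the value of rank k becomes the mean of the rank-k values across arrays.
    ranks = np.argsort(np.argsort(intensities, axis=0), axis=0)
    mean_quantiles = np.sort(intensities, axis=0).mean(axis=1)
    return mean_quantiles[ranks]

def median_polish_summarize(log_probes, n_iter=10):
    # Tukey median polish on one probe set (probes x arrays, log2 scale).
    # Returns one expression estimate per array: overall + array effect.
    residuals = np.array(log_probes, dtype=float)
    overall = 0.0
    row_eff = np.zeros(residuals.shape[0])
    col_eff = np.zeros(residuals.shape[1])
    for _ in range(n_iter):
        row_med = np.median(residuals, axis=1)
        residuals -= row_med[:, None]
        row_eff += row_med
        delta = np.median(col_eff)
        col_eff -= delta
        overall += delta
        col_med = np.median(residuals, axis=0)
        residuals -= col_med[None, :]
        col_eff += col_med
        delta = np.median(row_eff)
        row_eff -= delta
        overall += delta
    return overall + col_eff

def preprocess(raw, probe_set_ids):
    # All-in-one wrapper: raw probe intensities in, expression estimates out.
    corrected = background_correct(np.asarray(raw, dtype=float))
    log_values = np.log2(quantile_normalize(corrected))
    ids = np.asarray(probe_set_ids)
    return {
        ps.item(): median_polish_summarize(log_values[ids == ps])
        for ps in np.unique(ids)
    }
```

Calling `preprocess(raw, probe_set_ids)` with a matrix of shape (number of probes, number of arrays) and one probe-set label per row returns a dictionary that maps each probe set to a vector of log2 expression estimates, one per array. In practice an established implementation, such as the `rma` function of the Bioconductor `affy` package, would normally be preferred over a hand-written version.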