Background: A major challenge of cancer research is to identify the key molecules responsible for the development of the malignant metastatic phenotype, the major cause of cancer death.

Methods: Four subtracted cDNA libraries were constructed, representing mRNAs differentially expressed between benign and malignant human breast tumour cells and between micro-dissected breast carcinoma in situ and invasive carcinoma. Hundreds of differentially expressed cDNAs from the libraries were micro-arrayed and screened with mRNAs from human breast tumour cell lines and clinical specimens. Gene products were further examined by RT-PCR and correlated with clinical data.

Results: The combination of subtractive hybridisation and microarray analysis identified a panel of 15 cDNAs that show strong correlations with estrogen receptor status, malignancy or relapse. This panel included S100P, which was associated with aneuploidy in cell lines and with relapse/death in patients, and AGR2, which was associated with estrogen receptor and with patient relapse. X-box binding protein-1 is also an estrogen-dependent gene and is associated with better survival for breast cancer patients.

Conclusions: The combination of subtracted cDNA libraries and microarray analysis has thus identified potential diagnostic/prognostic biomarkers and targets for cancer therapy that have not been identified from common prognostic gene signatures.