Bayesian variable selection and modelling for metastatic breast cancer data

Sarini Sarini, James McGree, Kerrie Mengersen


A Bayesian model selection procedure is applied to data on 90 women with metastatic breast cancer. Protein covariates are measured on the nucleus, cytoplasm, membrane, and stroma of primary breast carcinoma and lymph node metastasis tissue. Multiple imputation is performed to deal with missing data. Zellner's g-prior is used in the Bayesian variable selection procedure. The model space is reduced using posterior variable inclusion probabilities, and then posterior model probabilities are used to derive a candidate set of models. Bayesian model averaging is employed to robustly estimate survival time, and the goodness of fit of the derived model is assessed by the correlation between estimated and observed survival times. The results show evidence of proteins having different roles in different parts of the tissue cell with respect to patient survival. Therefore, a recommendation is given on which part of the cell to observe certain proteins for prognosis. The models obtained are robust to censoring, and the correlations between observed and predicted survival times lie between 0.7 and 0.84.
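The selection steps summarised above (g-prior, posterior model probabilities, posterior inclusion probabilities) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian linear model rather than the Weibull regression named in the keywords, enumerates all models instead of running a Gibbs sampler, fixes g = n as a default, places a uniform prior over models, and uses synthetic data. The function names `log_bf_gprior` and `enumerate_models` are hypothetical.

```python
import itertools
import numpy as np

def log_bf_gprior(y, X, g):
    """Log Bayes factor of a Gaussian linear model against the
    intercept-only model under Zellner's g-prior (closed form in R^2)."""
    n, p = X.shape
    if p == 0:
        return 0.0  # the null model compared with itself
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return 0.5 * (n - 1 - p) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))

def enumerate_models(y, X, g=None):
    """Posterior model and variable inclusion probabilities by
    exhaustive enumeration (feasible only for a handful of covariates)."""
    n, p = X.shape
    g = float(n) if g is None else g  # assumption: g = n
    models = [m for k in range(p + 1)
              for m in itertools.combinations(range(p), k)]
    log_bf = np.array([log_bf_gprior(y, X[:, list(m)], g) for m in models])
    w = np.exp(log_bf - log_bf.max())  # stabilise before normalising
    probs = w / w.sum()                # assumption: uniform prior over models
    incl = np.array([sum(pr for m, pr in zip(models, probs) if j in m)
                     for j in range(p)])
    return models, probs, incl

# Synthetic data: 90 observations, 4 covariates, only columns 0 and 2 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=90)

models, probs, incl = enumerate_models(y, X)
best = models[int(np.argmax(probs))]
print("inclusion probabilities:", np.round(incl, 3))
print("highest-probability model:", best)
```

In this toy setting, covariates with high inclusion probability would be retained, and the models with the largest posterior probabilities would form the candidate set whose predictions are averaged with weights `probs`.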

Bayesian model averaging; Bayesian variable selection; Gibbs sampler; metastatic breast cancer; Weibull regression; Zellner's g-prior

ANZIAM Journal, ISSN 1446-8735, copyright Australian Mathematical Society.