Fig. 8

A simple machine-learning model identifies patients with the mesenchymal subtype of TNBC. (a) Flowchart illustrating the construction of the VAX2 signature (VAX2.sig). First, downregulated genes (p.adj < 0.05 & logFC < -1) were identified from RNA-seq of VAX2-knockdown MDA468 vs. WT-MDA468. Second, genes near VAX2-binding peak regions were obtained from ChIP-seq analysis of VAX2-knockdown MDA468 vs. WT-MDA468. Third, genes specifically expressed in tumor cells were identified from six TNBC single-cell datasets (p.adj < 0.05 & logFC > 2; retained only if present in more than three datasets). VAX2.sig was defined as the intersection of these three gene sets. (b) VAX2 or VAX2.sig expression was used in a multi-task machine-learning model to predict patients with the TNBC mesenchymal subtype. METABRIC-TNBC was used for model training and validation, and TCGA-TNBC was used for independent validation. (c, d) Predictive performance in METABRIC-TNBC of different machine-learning models based on VAX2 expression (c) or VAX2.sig (d), shown as the ROC curve of each model and the confusion matrix of the best-performing model. (e) Predictive performance in TCGA-TNBC of different machine-learning models based on VAX2.sig, shown as the ROC curve of each model and the confusion matrix of the best-performing model. (f) The best-performing machine-learning model based on VAX2.sig predicted the prognosis of patients with the mesenchymal subtype in two additional TNBC cohorts.
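
The three-way filtering and intersection described in panel (a) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the input layout (a mapping from gene to a `(p.adj, logFC)` pair) and the toy genes `GENE_A` to `GENE_D` are all hypothetical, and only the thresholds are taken from the caption.

```python
from collections import Counter

def downregulated(de_table, padj_max=0.05, logfc_max=-1.0):
    """Genes with p.adj < 0.05 and logFC < -1 (knockdown vs. WT)."""
    return {gene for gene, (padj, logfc) in de_table.items()
            if padj < padj_max and logfc < logfc_max}

def tumor_specific(sc_tables, padj_max=0.05, logfc_min=2.0, min_datasets=3):
    """Genes with p.adj < 0.05 and logFC > 2 in MORE than `min_datasets` datasets."""
    hits = Counter()
    for table in sc_tables:
        for gene, (padj, logfc) in table.items():
            if padj < padj_max and logfc > logfc_min:
                hits[gene] += 1
    return {gene for gene, n in hits.items() if n > min_datasets}

def build_signature(rnaseq_de, chip_genes, sc_tables):
    """Intersect the three gene sets to obtain the signature."""
    return downregulated(rnaseq_de) & set(chip_genes) & tumor_specific(sc_tables)

# Toy inputs (hypothetical values, for illustration only).
rnaseq_de = {
    "GENE_A": (0.01, -2.0),   # passes both RNA-seq thresholds
    "GENE_B": (0.01, -0.5),   # logFC not below -1
    "GENE_C": (0.20, -3.0),   # p.adj not significant
    "GENE_D": (0.001, -1.5),  # passes both RNA-seq thresholds
}
chip_genes = {"GENE_A", "GENE_B", "GENE_D"}  # genes near binding peaks

hit, miss = (0.01, 2.5), (0.50, 0.1)
# Six single-cell datasets: GENE_A passes in four, GENE_D in only three.
sc_tables = [
    {"GENE_A": hit if i < 4 else miss, "GENE_D": hit if i < 3 else miss}
    for i in range(6)
]

signature = build_signature(rnaseq_de, chip_genes, sc_tables)
print(sorted(signature))
```

In this toy case `GENE_D` is excluded because it passes the single-cell thresholds in exactly three datasets, whereas the caption requires more than three.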