Pan-Cancer Detection and Typing by Mining Patterns in Large Genome-Wide Cell-Free DNA Sequencing Datasets

Publication at First Faculty of Medicine | 2022

Abstract

Background: Cell-free DNA (cfDNA) analysis holds great promise for non-invasive cancer screening, diagnosis, and monitoring. We hypothesized that mining the patterns of cfDNA shallow whole-genome sequencing datasets from patients with cancer could improve cancer detection.

Methods: By applying unsupervised clustering and supervised machine learning to large cfDNA shallow whole-genome sequencing datasets from healthy individuals (n = 367) and patients with different hematological (n = 238) and solid malignancies (n = 320), we identified cfDNA signatures that enabled cancer detection and typing.

Results: Unsupervised clustering revealed cancer type-specific sub-grouping.

Classification using a supervised machine learning model yielded accuracies of 96% and 65% in discriminating hematological and solid malignancies from healthy controls, respectively. The accuracy of disease type prediction was 85% and 70% for the hematological and solid cancers, respectively.

The potential utility for managing a specific cancer was demonstrated by distinguishing benign adnexal masses from invasive and from borderline ones, with areas under the curve of 0.87 and 0.74, respectively.

Conclusions: This approach provides a generic analytical strategy for non-invasive pan-cancer detection and cancer type prediction.
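For readers unfamiliar with the two metrics reported in the results, the following is a minimal sketch of how accuracy and area under the ROC curve are commonly computed for a binary classifier. It does not reproduce the authors' pipeline: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = cancer, 0 = healthy control).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

# Hypothetical decision threshold of 0.5 to turn scores into class labels.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, predicted))
print(roc_auc(labels, scores))
```

On this toy data the accuracy is 4/6 and the AUC is 8/9. Accuracy depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of figures in the abstract are not directly comparable.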