Abstract
Background
Two large low-dose computed tomography (LDCT) screening trials have provided evidence of a significant reduction in lung cancer mortality. Current implementation targets individuals at high risk, defined by age and smoking history and/or epidemiological risk models, rather than by biological risk. The plasma proteome offers insight into contributing biological factors; we investigated its potential for lung cancer prediction up to 5 years before diagnosis.
Methods
The Olink® Explore 3072 platform quantitated 2941 proteins in 496 Liverpool Lung Project plasma samples, including 131 cases taken 1-10 years prior to diagnosis, 237 controls, and 90 subjects at multiple times. 1112 proteins significantly associated with haemolysis (FDR…
Results
For samples 1-3 years prior to diagnosis, 240 proteins were differentially expressed between cases and matched controls; for 1-5 year samples, 117 of these and 98 different proteins were identified, mapping to significantly different pathways. Four machine learning algorithms (Elastic Net, Random Forest, Support Vector Machine, XGBoost) gave median AUCs of 0.76-0.90 for the 1-3 year proteins and 0.73-0.83 for the 1-5 year proteins. External validation in UK Biobank samples gave AUCs of 0.75 (1-3 year) and 0.69 (1-5 year), with AUC 0.7 up to 12 years prior to diagnosis. The models were independent of age, smoking duration, cancer histology and the presence of COPD.
Conclusion
The plasma proteome provides a rich source of protein biomarkers which may be used to identify patients at greatest risk of lung cancer, 5 or more years prior to diagnosis. The proteins and the pathways are different when lung cancer is more imminent, indicating that both biomarkers of inherent risk and biomarkers associated with presence of early lung cancer may be identified.
Original language | English |
---|---|
Article number | 104686 |
Journal | eBioMedicine |
Volume | 93 |
Early online date | 26 Jun 2023 |
Publication status | Published - 26 Jun 2023 |
Keywords
- Early-detection
- Lung cancer prediction
- Plasma
- Proteins
- Proteomics