2Department of Chest Diseases, Eskisehir Osmangazi University Faculty of Medicine, Eskisehir-Turkey
3Department of Mathematics and Computer Science, Eskisehir Osmangazi University Faculty of Arts and Sciences, Eskisehir-Turkey
DOI: 10.5505/tjo.2021.2788
Summary
OBJECTIVE
This study aimed to predict the overall survival (OS), survival time, and time to progression in cases diagnosed with Stage III lung cancer.
METHODS
The sample consisted of 585 patients that underwent radiotherapy and chemotherapy with the diagnosis
of Stage III lung cancer. OS prediction was undertaken in 324 cases, survival time prediction in
241 that died due to lung cancer, and prediction of time to progression in 223 that showed progression
during follow-up. Twenty-seven variables were evaluated, and logistic regression, multilayer perceptron
classifier (MLP), extreme gradient boosting, support vector classification, random forest classifier (RFC),
Gaussian Naive Bayes, and light gradient boosting machine algorithms were used to construct prediction
models.
RESULTS
In OS prediction, over a median 21-month follow-up, 255 of 324 cases died and the median OS was 20
(2-101) months. The best-performing algorithms were logistic regression for OS (accuracy rate:
70%, confidence interval [CI]: 0.60-0.82, area under curve [AUC]: 0.76), MLP classifier for 12- and
20-month survival times (67%, CI: 0.54-0.81, AUC: 0.64 and 71%, CI: 0.59-0.84, AUC: 0.61, respectively),
and RFC for time to progression (76%, CI: 0.66-0.86, AUC: 0.78).
CONCLUSION
Considering high treatment costs, potential serious toxicity, the harm of early progression, and low
survival in cases of ineffective treatment, machine learning-based predictive systems are promising.
Personalizing prognosis and treatment using these algorithms can improve oncological results.
Introduction
Lung cancer is the leading cause of cancer-related deaths worldwide.[1] Although multiple treatment modalities are applied, the median overall survival (OS) is 12-23.2 months for non-small-cell lung cancer (NSCLC) and 16-20 months for limited-stage small-cell lung cancer (SCLC).[2,3] A standard treatment based on the TNM staging system may not be suitable for every patient. Identifying patients at high risk of recurrence and high mortality due to the disease is also valuable in guiding treatment. Therefore, in this complex and heterogeneous disease group, it is important to evaluate prognosis in a personalized manner and plan treatment accordingly.
Artificial intelligence (AI) is a branch of computer science that aims to emulate human-like intelligence in machines, using computer software and algorithms to perform certain tasks without direct human stimuli.[4] Machine learning (ML) is a subfield of AI using data-driven algorithms that learn to imitate human behavior based on a previous example or experience.[5] ML uses mathematical algorithms applied with computer programs to identify patterns in large data sets and improve this identification with additional data.[6]
It is important to predict survival and progression in cases diagnosed with cancer to improve treatment and provide patients and clinicians with information. Considering the data set of lung cancer patients with specific demographic, tumor and treatment information, it is essential to determine if any parameter can be used to predict whether the patient will survive or the disease will recur.
The current study aimed to predict OS, survival time, and time to progression using ML in patients diagnosed with Stage III lung cancer and treated at the Radiation Oncology and Chest Diseases departments of Eskişehir Osmangazi University Faculty of Medicine.
Methods
Patient Characteristics
The study included 585 cases diagnosed with Stage III lung cancer from 2007 to 2018. For the ML analyses, the cases were selected separately for each prediction group.
The inclusion criteria were as follows: A histopathological diagnosis of lung cancer, no diagnosis of distant metastasis or multiple primary neoplasia, Karnofsky Performance Scale (KPS) score ≥60, age >18, having completed all planned radiotherapy (RT) and chemotherapy schemes, and regularly attending the follow-up sessions. Staging was performed according to the American Joint Committee on Cancer Staging System, eighth edition.[7] For staging purposes, the thorax-abdomen computed tomography (CT)/fluorodeoxyglucose positron emission tomography (FDG-PET)/CT and brain magnetic resonance (MR) images were reviewed in each case. After the diagnosis, the cases were evaluated at the lung/pleural cancer council of ESOGUMF, and the treatment decision was taken using a multidisciplinary approach. Our study was approved by Eskisehir Osmangazi University Clinical Research Ethics Committee. All patients provided written informed consent before enrollment in the study.
Treatment Characteristics
Radiotherapy and concurrent chemotherapy
The patients were immobilized in a supine position using a T-bar/Wingboard with their hands above their head, and planning CT was performed with the Somatom Definition AS® device with a 3-5-mm cross-section. The images were fused with the FDG-PET/thoracic CT images obtained at the time of diagnosis and, in cases that underwent chemotherapy before RT, with the current thorax CT images obtained after chemotherapy. The gross tumor volume (GTV) was determined after fusion. In cases receiving chemotherapy before RT, GTVtumor was defined as the post-chemotherapy volume, and GTVlymph node as the pre-chemotherapy volume. The clinical target volume (CTV) margin was set according to tumor histopathology: the CTVtumor margin was 0.8 cm for adenocarcinoma, 0.6 cm for squamous cell carcinoma, and 0.5 cm for other histologies. The CTVlymph node margin was 0.5 cm. No elective nodal irradiation was performed. For the planning target volume (PTV), CTVtumor and CTVlymph node were given a 0.5-cm margin, and the cases were treated with image-guided radiation therapy after 2014. Radiation therapy was delivered in daily fractions of 1.8-2 Gy to a total dose of 45-68 Gy, depending on criteria such as tumor localization and size, lung volume, and tumor volume, using 3DCRT/IMRT/VMAT on a Varian Trilogy®/TrueBeam® or Elekta Precise® device. In SCLC cases with good treatment response, 25 Gy (2.5 Gy/day×10 fractions) prophylactic cranial irradiation was applied.
Concurrent chemotherapy was applied to the appropriate cases. In the non-SCLC group, cisplatin (40 mg/m2) or paclitaxel (45-50 mg/m2)+carboplatin (area under curve [AUC]: 2) was administered weekly. In the patients with SCLC, cisplatin (40 mg/m2) was administered weekly or cisplatin (75 mg/m2)+etoposide (100 mg/m2) every 21 days. The patients attended the outpatient clinic every week.
Chemotherapy
In squamous cell lung cancer, gemcitabine, paclitaxel, or vinorelbine was used in first- and second-line chemotherapy, either alone or in combination with platinum. The first-line chemotherapy of adenocarcinoma was the same as given in the section above, but pemetrexed was applied as the second-line therapy. In patients with an epidermal growth factor receptor mutation, anaplastic lymphoma receptor tyrosine kinase gene translocation, or ROS proto-oncogene 1 receptor tyrosine kinase gene rearrangement, first-line chemotherapy was the same as in the previous section, and the second-line therapy was the targeted therapy specific to the genetic change. In patients with recurrent/progressive disease, a chemotherapy regimen that had not previously been used was applied, taking into account the performance status and comorbidities of the patient, and the decision to continue this therapy was taken according to the patient response. In the treatment of SCLC, etoposide combined with platinum was used as the first-line chemotherapy regimen, and irinotecan or the combination of vincristine+cyclophosphamide+adriablastina was used as the second-line regimen in cases that did not respond to treatment or recurred.
Post-treatment Follow-up
At the 1st month after the end of treatment, anamnesis,
a physical examination, thorax CT, and response to
treatment were evaluated. The follow-up evaluations of
anamnesis, physical examination, and thorax CT were
performed every 3 months for the following 3 years,
and every 6 months for the 4th and 5th years. After the
5th year, annual follow-up was undertaken. In suspected
cases of recurrence/metastasis, abdominal CT/
brain MR and/or PET CT was also conducted.
ML, Statistical Analysis, and Application
In the prediction of both OS and time to progression,
the following 27 variables were evaluated: Age, gender,
KPS score, body mass index, smoking history, presence
of chronic obstructive pulmonary disease, histopathology,
tumor localization, tumor size, lymph node site,
lymph node involvement (single level/multilevel), T
stage, N stage, TNM stage, surgical history, presence of
concurrent chemotherapy, concurrent chemotherapy
scheme, number of chemotherapy cycles before RT,
GTV, PTV, total RT dose, RT fraction dose, prognostic
nutritional index, pretreatment serum albumin and hemoglobin
values, neutrophil lymphocyte ratio (NLR),
and advanced lung cancer inflammation index. These
parameters were determined by considering previous
prognosis studies related to lung cancer.[8-13] For the
predictions, the ML algorithms of logistic regression, multilayer perceptron classifier (MLP), extreme gradient
boosting (XGB) classifier, support vector classification
(SVC), random forest classifier (RFC), Gaussian Naive
Bayes (GNB), and light gradient boosting machine
(LGBM) classifier were used.
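Three of the variables listed above (NLR, the prognostic nutritional index, and the advanced lung cancer inflammation index) are derived from routine measurements. The text does not state the formulas that were used, so the following minimal sketch assumes the definitions most commonly found in the literature. The function names and the sample values are illustrative and are not taken from the study data.

```python
def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio; both counts must use the same unit."""
    return neutrophils / lymphocytes


def pni(albumin_g_dl: float, lymphocytes_per_mm3: float) -> float:
    """Prognostic nutritional index (assumed form):
    10 x serum albumin (g/dL) + 0.005 x total lymphocyte count (/mm^3)."""
    return 10.0 * albumin_g_dl + 0.005 * lymphocytes_per_mm3


def ali(bmi: float, albumin_g_dl: float, nlr_value: float) -> float:
    """Advanced lung cancer inflammation index (assumed form):
    BMI (kg/m^2) x serum albumin (g/dL) / NLR."""
    return bmi * albumin_g_dl / nlr_value


# Hypothetical patient, not taken from the study data.
patient_nlr = nlr(neutrophils=4000.0, lymphocytes=2000.0)
patient_pni = pni(albumin_g_dl=4.0, lymphocytes_per_mm3=2000.0)
patient_ali = ali(bmi=25.0, albumin_g_dl=4.0, nlr_value=patient_nlr)
print(patient_nlr, patient_pni, patient_ali)
```

Keeping each index as a separate function makes it easy to swap in a different definition if the one used in a given data set differs from the assumed form.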
Statistical Analysis and Application
Extreme value analysis is a branch of statistics that deals with extreme deviations from the median of probability distributions. It aims to assess, from a given ordered sample of a random variable, the likelihood of events more extreme than any previously observed. Extreme values decrease predictive performance; there are different methods for detecting them, but in simple terms, values that deviate by more than a certain amount from the mean are considered extreme.[14] In this study, to increase predictive performance, data lying more than 1.96 × standard deviation from the mean (extreme values) according to the box plot method were excluded. The training-test split was 80-20% for the prediction of OS and OS time (12-month and 20-month) and 70-30% for the prediction of time to progression (12-month).
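The exclusion rule and the hold-out split described above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes the rule is applied to one variable at a time using the sample standard deviation, and that the test share is rounded up to a whole case. The tumor sizes are hypothetical.

```python
import math
import statistics


def drop_extreme(values, z=1.96):
    """Keep values lying within z sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= z * sd]


def holdout_sizes(n_cases, test_fraction):
    """Return (n_train, n_test), rounding the test share up (assumption)."""
    n_test = math.ceil(n_cases * test_fraction)
    return n_cases - n_test, n_test


# Hypothetical tumor sizes in cm; 15.0 lies far from the rest.
sizes_cm = [3.1, 3.4, 2.9, 3.6, 3.2, 3.0, 3.3, 3.5, 2.8, 15.0]
kept = drop_extreme(sizes_cm)
print(kept)

# Cohort sizes and split ratios as given in the text.
split_sizes = {
    "OS": holdout_sizes(324, 0.20),
    "OS time": holdout_sizes(241, 0.20),
    "Time to progression": holdout_sizes(223, 0.30),
}
for label, (n_train, n_test) in split_sizes.items():
    print(label, n_train, n_test)
```

The resulting test-set sizes (65, 49, and 67) agree with the case totals quoted alongside the confusion matrices in the Results, which suggests, but does not confirm, that the test share was rounded up.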
Synthetic minority over-sampling is used when developing predictive models on classification data sets with severe class imbalance. The difficulty with unbalanced data sets is that most ML techniques effectively ignore the minority class and perform poorly on it, although performance on the minority class is typically what matters most. One approach is to over-sample the minority class. The simplest way is to duplicate samples in the minority class; however, duplicates add no new information to the model. Instead, new samples can be synthesized from existing samples.[15]
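The idea of synthesizing new minority samples can be illustrated with a short sketch. It is a simplified version of the interpolation step of the synthetic minority over-sampling technique, written for illustration only; it is not the implementation used in the study, and the neighbour count `k`, the seed, and the two-feature sample points are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each placed on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority class with two scaled features.
minority = [(0.2, 0.5), (0.3, 0.4), (0.25, 0.6), (0.4, 0.55)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region already occupied by that class, which is what distinguishes this approach from plain duplication.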
Cross-validation is a model validation technique that estimates how the results of a statistical analysis will generalize to an independent data set. Its main use is to estimate the accuracy a prediction system will have in practice. In a prediction problem, the model is usually trained on a "known data set" (training set) and tested on an "unknown data set" (validation or test set). The purpose of this test is to measure the ability of the trained model to generalize to new data and to identify problems such as overfitting or selection bias.[16] In the current study, cross-validation was also undertaken. The structure of cross-validation is shown in Supplementary Figure 1.
Suppl. Fig 1: Cross-validation structure.
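A k-fold split of the kind described above can be sketched in a few lines. The number of folds used in the study is not stated in the text, so `k=5` here is an assumption, as are the function name and the seed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, val_idx in splits:
    print(sorted(val_idx), len(train_idx))
```

Each model is fitted on `train_idx` and scored on `val_idx`; averaging the k scores gives the cross-validated estimate.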
Results
Patient, Tumor, and Treatment Characteristics
In OS prediction, 324 Stage III lung cancer cases were evaluated. The median age was 61 (range, 44-79) years. The median RT dose was 60 (range, 50-68) Gy. Concurrent chemotherapy was administered to 239 cases. The median number of concurrent chemotherapy cycles was 4 (min: 0, max: 6). RT was started with the first cycle of chemotherapy in 25 patients. Patient and tumor characteristics are summarized in Table 1a, and treatment characteristics in Table 1b.
Table 1a: Patient and tumor characteristics for the prediction of survival
Table 1b: Treatment characteristics for the prediction of survival
In the prediction of OS time, 241 Stage III lung cancer cases that died were evaluated. The median age was 62 (range, 44-80) years. The median RT dose was 60 (range, 50-68) Gy. Concurrent chemotherapy was applied in 180 cases. The median number of concurrent chemotherapy cycles was 4 (min: 0, max: 6). RT was started with the first cycle of chemotherapy in 17 patients. The characteristics of the patients and tumors are summarized in Table 2a, and the treatment characteristics are given in Table 2b.
Table 2a: Patient and tumor characteristics for the prediction of survival time
Table 2b: Treatment characteristics for the prediction of survival time
For the prediction of time to progression, 223 cases that showed progression during the follow-up were evaluated. The median age was 61 (range, 44-80) years. The median RT dose was 60 (range, 50-68) Gy. Concurrent chemotherapy was applied to 172 cases. RT was started with the first cycle of chemotherapy in 11 patients. The median number of concurrent chemotherapy cycles was 4 (min: 0, max: 6). Patient and tumor characteristics are summarized in Table 3a, and the treatment characteristics are given in Table 3b.
Table 3a: Patient and tumor characteristics for the prediction of time to progression
Table 3b: Treatment characteristics for the prediction of time to progression
OS and Progression-free OS
The OS evaluation was conducted with 324 cases, and
over a median follow-up of 21 months, 255 patients
died. The prediction of OS time was performed with
241 of the patients that died, and the median survival
time of this group was 20 (2-101) months. The median
survival times for substages IIIA, IIIB, and IIIC were 25 (6-101), 19.5 (5-70), and 15 (2-65) months,
respectively. The prediction of time to progression was
undertaken with 223 cases that showed progression
during the follow-up. The median time from the end
of treatment to progression was 9 (0-96) months. The
median values for substages IIIA, IIIB, and IIIC were
10 (0-96), 9 (0-68), and 7 (1-28) months, respectively.
ML Prediction
OS prediction
Significant variables were determined as PTV, lymph node
site, and KPS score. Figure 1a gives the feature importance
plot and the correlation matrix of the variables. The best
predictive algorithm was identified as logistic regression
with 70% accuracy (AUC: 0.76, confidence interval [CI]:
0.597-0.818), 94.44% sensitivity, and 41.38% specificity.
The accuracy rates for the MLP, XGB, SVC, RFC, GNB,
and LGBM algorithms were calculated as 63%, 53%, 56%,
60%, 66%, and 64%, respectively. The AUC graphs of the
algorithms are given in Figure 2a, and the data belonging
to the best predictive algorithm are shown in Table 4. The
logistic regression algorithm accurately predicted 34 of 51
cases that died and 12 of 14 cases that survived, and the
confusion matrix is presented in Table 5a.
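Accuracy, sensitivity, and specificity all follow from the four cells of a confusion matrix. Table 5a is not reproduced in the text, so the sketch below assumes one particular layout, inferred from the quoted numbers rather than copied from the table: death is treated as the positive class, 51 cases were predicted to die (34 correctly), and 14 were predicted to survive (12 correctly).

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from the cells of a 2x2 table."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positives / all actual positives
        "specificity": tn / (tn + fp),  # true negatives / all actual negatives
    }


# Assumed cell layout (see the note above); "death" is the positive class.
os_metrics = binary_metrics(tp=34, fn=2, fp=17, tn=12)
for name, value in os_metrics.items():
    print(f"{name}: {value:.2%}")
```

Under this reading the function returns about 70.8% accuracy, 94.44% sensitivity, and 41.38% specificity, in line with the values reported above. A different cell layout would leave the accuracy unchanged but alter the other two rates.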
Table 4: Results of the best performing algorithm for each prediction
Table 5a: Confusion matrix for the prediction of survival
Table 5b: Confusion matrix for the prediction of 12-month survival
Table 5c: Confusion matrix for the prediction of 20-month survival
Table 5d: Confusion matrix for the prediction of time to progression
OS time prediction
Twelve-month survival prediction
Significant variables were identified as GTV, lymph
node site, surgical history, and histopathology. Figure 1b
presents the feature importance plot and the correlation
matrix of the variables. The best predictive algorithm
was found to be MLP with 67% accuracy (AUC: 0.64, CI:
0.542-0.805), 66.67% sensitivity, and 67.65% specificity.
The accuracy rates for the logistic regression, XGB, SVC,
RFC, GNB, and LGBM algorithms were determined as
46%, 57%, 51%, 55%, 59%, and 53%, respectively. The
AUC graph of the algorithms is given in Figure 2b. The
data on the algorithm with the best predictive results are
shown in Table 4. The MLP algorithm accurately predicted
10 of 21 cases that survived for ≤12 months and
23 of 28 cases that survived for >12 months, and the
confusion matrix is given in Table 5b.
Twenty-month survival prediction
Significant variables were identified as GTV, lymph
node site, and T stage. In Figure 1c, the feature importance
plot and the correlation matrix of the variables
are shown. The algorithm with the best predictive ability
was MLP, which had an accuracy of 71% (AUC: 0.61,
CI: 0.588-0.841), sensitivity of 73.17% and specificity
of 62.50%. The accuracy rates for the logistic regression, XGB, SVC, RFC, GNB, and LGBM algorithms
were determined as 59%, 59%, 71%, 51%, 67%, and
59%, respectively. Figure 2c presents the AUC graph of the algorithms, and Table 4 gives the detailed data of
the best predictive algorithm. The MLP algorithm accurately
predicted 30 of 33 cases that survived for ≤20
months and 5 of 16 cases that survived for >20 months.
The confusion matrix is presented in Table 5c.
Prediction of time to progression
Significant variables were determined as NLR, lymph
node site, age, and T stage. In Figure 1d, the feature
importance plot and the correlation matrix of the
variables are shown. RFC was identified as the best
predictive algorithm with 76% accuracy (AUC: 0.79,
CI: 0.659-0.863), 90.91% sensitivity, and 61.76% specificity.
The accuracy rates for the remaining algorithms
were calculated as 61% for logistic regression, 73% for
XGB, 56% for SVC, 53% for MLP, 70% for GNB, and
68% for LGBM. Figure 2d presents the AUC graph
of all algorithms, and Table 4 shows the detailed data
obtained from the best predictive algorithm. The RFC
algorithm accurately predicted 30 of 43 cases that
showed progression within 12 months and 21 of 24
cases that progressed after 12 months. Finally, the confusion
matrix is presented in Table 5d.
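The text does not say how the confidence intervals for the accuracy rates were obtained. As a plausibility check, the sketch below assumes a 95% normal-approximation (Wald) interval on test-set accuracy and uses the correct/total counts quoted in the text for each prediction; the function name is illustrative.

```python
import math


def wald_interval(correct, total, z=1.96):
    """Normal-approximation interval for a proportion (here, test accuracy)."""
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width


# (correct, total) built from the counts quoted in the text for each task.
tasks = {
    "OS": (34 + 12, 51 + 14),
    "12-month survival": (10 + 23, 21 + 28),
    "20-month survival": (30 + 5, 33 + 16),
    "Time to progression": (30 + 21, 43 + 24),
}
intervals = {name: wald_interval(c, n) for name, (c, n) in tasks.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: {low:.3f}-{high:.3f}")
```

With these counts the sketch gives intervals that match the reported ones (0.597-0.818, 0.542-0.805, 0.588-0.841, and 0.659-0.863) to three decimals, which is consistent with, though not proof of, this being the method used.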
Fig 2: Area under the curve graphs. (a) Prediction of survival. (b) Prediction of 12-month survival. (c) Prediction of
20-month survival. (d) Prediction of time to progression.
ROC: Receiver operating characteristic; SVC: Support vector classification; MLP: Multilayer perceptron classifier; LGBM: Light gradient
boosting machine.
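The area under the ROC curve can be read as the probability that a randomly chosen case with the event receives a higher predicted score than a randomly chosen case without it. The following sketch computes the AUC directly from that definition; the labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.7, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values in Figure 2 should be read.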
Discussion
In the past two decades, there has been an increase in the use of digital footprints to track and predict human behavior. Furthermore, the ML approach is increasingly being adopted in clinical settings. It is considered that using ML techniques will lead to a change in clinical medicine by solving basic problems related to large and complex data sets. ML offers the potential to derive adaptive systems from various data sets, discover hidden connections between data items, and predict results.[17]
Today, many hospitals store data in a digital environment. By evaluating these large data sets with ML techniques, it could become possible to predict the treatment results of patients, plan individualized patient treatment, improve institutional performance, and regulate health insurance. The accurate prediction of survival in cancer patients continues to be a problem due to the heterogeneity and complexity of cancer, various treatment options, and different patient characteristics (age, KPS score, comorbidities, etc.). If reliable estimates are obtained by ML, they can help achieve personalized care and treatment.
There is a growing interest in studies on prognosis prediction based on ML using patient, tumor, and treatment data.[18,19] In a study conducted with 8,066 patients diagnosed with breast cancer, Ganggayah et al.[20] evaluated 23 variables for the OS prediction. The authors used the algorithms of decision tree, RFC, neural networks, extreme boost, logistic regression, and SVM. Cancer stage, tumor size, total number of dissected axillary lymph nodes, number of metastatic lymph nodes, and primary treatment applied were determined as significant variables, and the algorithm that had the highest predictive ability was RFC with an accuracy rate of 82.7%. Li et al.[21] examined 515 tumor tissues and 59 adjacent normal tissues and analyzed the gene expression profiles of the cases. They used three different algorithms (sigFeature, RFC, and univariate Cox regression) to assess the prognostic value of survival-associated genes. A risk estimation model was established, and the expression of 16 genes was found to be highly correlated with recurrence-free survival, with the high-risk group having low OS. In the current study, OS prediction was made using ML in Stage III lung cancer, and significant parameters were determined as PTV, lymph node site, and KPS score, with the logistic regression algorithm providing the best predictive results.
Gupta et al.[17] predicted 6-month, 12-month and 24-month OS times in 869 cancer patients, and calculated the AUC values as 0.87 (95% CI: 0.848-0.890), 0.796 (95% CI: 0.774-0.823) and 0.764 (95% CI: 0.737-0.789), respectively. Parikh et al.[19] performed the prediction of 6-month survival in cancer patients. Of the 26,525 cancer cases evaluated, 1,065 died within 180 days. The data of 70% of the cases were used for training and 30% for testing. They reported the positive predictive values of the RFC, XGB and logistic regression algorithms as 51.3%, 49.4%, and 44.7%, respectively, and their AUC (95% CI) values as 0.88 (0.86-0.89), 0.87 (0.85-0.89), and 0.86 (0.84-0.88), respectively. In the current study, 12- and 20-month OS predictions were made, and significant variables affecting survival time were determined as T stage, lymph node site, GTV, surgical history, and histopathology. MLP was the algorithm with the highest accuracy rate in the OS time prediction.
The N stage, which is also used in TNM staging, affects the treatment decision and prognosis. In a previous study, the 5-year OS was examined according to the Nclinical and Npathological stages, and these rates were found to be 60% and 75%, respectively, for N0, 37% and 49%, respectively, for N1, 23% and 36%, respectively, for N2, and 9% and 20%, respectively, for N3.[22] Descriptors of the N stage (lymph node site) in the TNM system, which are routinely used when making the treatment decision, were also determined as a significant variable in the current study for the prediction of OS and OS time using ML. In another study with 157 cases diagnosed with locally advanced lung cancer, Pöttgen et al.[13] considered Nclinical stage, addition of pneumonectomy to treatment, gender, adenocarcinoma histology, age, and Pancoast tumor localization as significant prognostic factors. Firat et al.,[23] evaluating 163 patients with a lung cancer diagnosis, identified comorbidity and KPS score <70 to be prognostic factors for OS. In a review published by Hirsch et al.,[24] the effect of histology on prognosis in lung cancer was investigated by evaluating 408 studies, of which 11 had established a relationship between histology and clinical outcomes and seven had shown that histopathology affected oncological results in locally advanced lung cancer. In the current study, the KPS score was a significant variable for the OS prediction, and surgical history and histopathology were significant variables for the OS time prediction.
In a study conducted with 207 cases diagnosed with inoperable lung cancer, Bradley et al.[25] accepted receiving RT as a prognostic factor for not only OS but also disease-specific survival and local tumor control. Etiz et al.,[26] carrying out a study with a 150-patient sample with Stage I-IIIB lung cancer, reported that total tumor volume, age, KPS score, and gender were significant prognostic factors affecting OS. In the current study, significant variables for the OS and OS time prediction were identified as PTV and GTV, respectively.
In the current study, in the prediction of OS time, the cases that survived for ≤20 months were successfully predicted by the MLP algorithm at an accuracy rate of 91%, and this algorithm had an accuracy of 31% for those surviving for >20 months. The same algorithm had a 48% accuracy rate in predicting patients surviving for ≤12 months and 82% accuracy rate in predicting those surviving for more than 12 months. These results may be associated with the patient data set including a low number of cases surviving for <12 months or more than 20 months. There is a need for larger case studies on ML.
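The per-group rates quoted in this paragraph follow directly from the counts given in the Results. The short check below recomputes them; the group labels are written in plain ASCII for convenience.

```python
# (correctly predicted, group size) as quoted in the Results for the MLP models.
groups = {
    "<=20 months": (30, 33),
    ">20 months": (5, 16),
    "<=12 months": (10, 21),
    ">12 months": (23, 28),
}
rates = {name: correct / size for name, (correct, size) in groups.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

The smallest group (16 cases) also has the lowest rate, which fits the suggestion above that the imbalance between groups limits performance.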
Gupta et al.[27] performed TNM staging and 5-year disease-free survival prediction among 4,021 cases diagnosed with colon cancer. The authors reported that the RFC algorithm had the highest accuracy in both TNM staging (89%) and 5-year disease-free survival prediction (84%). In the current study, the prediction of time to progression was undertaken with the significant variables of age, NLR, T stage, and lymph node site, and as a result, the RFC algorithm had the highest accuracy rate. Inflammation is a known factor for the development and progression of cancer.[8] While the presence of CD8 T cells in the tumor microenvironment is related to better oncological results, neutrophils, M2 polarized macrophages, and FOXP3-positive regulatory T cells are associated with a poor prognosis.[28-30] In many cancer types, such as those of the breast, head and neck, kidney, and stomach, a relationship between high NLR and poor prognosis has been reported.[31-33] In their meta-analysis of 19 studies with a total of 7,283 cases diagnosed with lung cancer, Yang et al.[34] determined that higher NLR was associated with lower OS and progression-free survival. In the same study, tumor invasion depth, extension of lymph node metastasis, poor differentiation, and vascular invasion were associated with high NLR. NLR may show a pro-angiogenic/pro-inflammatory status in tumor tissue, which may reflect the immune system function of patients.[35] A high NLR value indicates high neutrophil and low lymphocyte levels, indirectly associated with low lymphocyte-mediated immune response, accelerated tumor process, and poor prognosis.[36]
ML is becoming part of people's lives day by day, and its use in the health area can both improve treatment outcome and reduce treatment costs. However, large data sets are required for ML, and data size and diversity are important to achieve an effective algorithm. There is still no standard ML algorithm to predict prognosis, treatment outcome, or toxicity rate in oncology, and multicenter large-scale data are required to create the most appropriate algorithm. Thus, in future work, it is planned to establish big data and re-evaluate the results by increasing the number of patients and collaborating with other centers.
Conclusion
Given high treatment costs, potential serious toxicity, harms of early progression, and low survival in cases of ineffective treatment, predictive systems with ML are promising. Multicenter studies with large data sets can provide algorithms with higher accuracy rates.
Peer-review: Externally peer-reviewed.
Conflict of Interest: All authors declared no conflict of interest.
Ethics Committee Approval: The study was approved by the Eskişehir Osmangazi University Non-Invasive Clinical Research Ethics Committee (No: 29, Date: 17/12/2019).
Financial Support: None declared.
Authorship contributions: Concept - D.E., M.Y., M.M., G.A., Ş.Y.; Design - D.E., M.Y., M.M.; Supervision - D.E., M.Y.; Funding - D.E., M.Y., Ş.Y.; Materials - M.Y., Ş.Y.; Data collection and/or processing - M.Y., Ş.Y.; Data analysis and/or interpretation - D.E., M.Y., Ş.Y., M.M., G.A., Ö.Ç.; Literature search - D.E., M.Y., Ş.Y., M.M., G.A.; Writing - D.E., M.Y., Ş.Y., M.M., G.A., Ö.Ç.; Critical review - D.E., M.Y., Ş.Y., M.M., G.A., Ö.Ç.
References
1) Siegel R, Naishadham D, Jemal A. Cancer statistics.
CA Cancer J Clin 2013;63(1):11-30.
2) Aupérin A, Le Péchoux C, Rolland E, Curran WJ,
Furuse K, Fournel P, et al. Meta-analysis of concomitant
versus sequential radiochemotherapy in locally
advanced non-small-cell lung cancer. J Clin Oncol
2010;28(13):2181-90.
3) Chen J, Jiang R, Garces YI, Jatoi A, Stoddard SM, Sun
Z, et al. Prognostic factors for limited-stage small cell
lung cancer: A study of 284 patients. Lung Cancer
2010;67(2):221-6.
4) Meyer P, Noblet V, Mazzara C, Lallement A. Survey
on deep learning for radiotherapy. Comput Biol Med
2018;98:126-46.
5) Jarrett D, Stride E, Vallis K, Gooding MJ. Applications
and limitations of machine learning in radiation oncology.
Br J Radiol 2019;92(1100):20190001.
6) Lynch CM, Abdollahi B, Fuqua JD, de Carlo AR,
Bartholomai JA, Balgemann RN, et al. Prediction of
lung cancer patient survival via supervised machine
learning classification techniques. Int J Med Inform
2017;108:1-8.
7) Brierley J, Gospodarowicz MK, Wittekind C. TNM
Classification of Malignant Tumours. 8th ed. Hoboken,
NJ: John Wiley and Sons, Inc.; 2017.
8) Diem S, Schmid S, Krapf M, Flatz L, Born D, Jochum
W, et al. Neutrophil-to-lymphocyte ratio (NLR) and
platelet-to-lymphocyte ratio (PLR) as prognostic
markers in patients with non-small cell lung cancer
(NSCLC) treated with nivolumab. Lung Cancer
2017;111:176-81.
9) Hong S, Zhou T, Fang W, Xue C, Hu Z, Qin T, et al.
The prognostic nutritional index (PNI) predicts overall
survival of small-cell lung cancer patients. Tumor Biol 2015;36(5):3389-97.
10) Zhu H, Zhou Z, Xue Q, Zhang X, He J, Wang L. Treatment
modality selection and prognosis of early stage
small cell lung cancer: Retrospective analysis from
a single cancer institute. Eur J Cancer Care (Engl)
2013;22(6):789-96.
11) Kasmann L, Bolm L, Janssen S, Rades D. Prognostic
factors and treatment of early-stage small-cell lung
cancer. Anticancer Res 2017;37(3):1535-8.
12) Wang L, Dong T, Xin B, Xu C, Guo M, Zhang H, et
al. Integrative nomogram of CT imaging, clinical, and
hematological features for survival prediction of patients
with locally advanced non-small cell lung cancer.
Eur Radiol 2019;29(6):2958-67.
13) Pöttgen C, Stuschke M, Graupner B, Theegarten D,
Gauler T, Jendrossek V, et al. Prognostic model for long-term
survival of locally advanced non-small-cell lung
cancer patients after neoadjuvant radiochemotherapy
and resection integrating clinical and histopathologic
factors. BMC Cancer 2015;15:363.
14) de Haan L, Ferreira A. Extreme Value Theory: An Introduction.
Berlin: Springer Science Business Media;
2007.
15) Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP.
SMOTE: Synthetic minority over-sampling technique.
J Artif Intell Res 2002;16:321-57.
16) Kohavi R. A Study of Cross-Validation and Bootstrap for
Accuracy Estimation and Model Selection. In: Proceedings
of the 14th International Joint Conference on
Artificial Intelligence. San Mateo, CA: Morgan Kaufmann;
1995. p. 1137-43.
17) Gupta S, Tran T, Luo W, Phung D, Kennedy RL, Broad
A, et al. Machine learning prediction of cancer survival:
A retrospective study using electronic administrative
records and a cancer registry. BMJ Open
2014;4(3):e004007.
18) Chen YC, Ke WC, Chiu HW. Risk classification of
cancer survival using ANN with gene expression
data from multiple laboratories. Comput Biol Med
2014;48:1-7.
19) Parikh RB, Manz C, Chivers C, Regli SH, Braun J,
Draugelis ME, et al. Machine learning approaches to
predict 6-month mortality among patients with cancer.
JAMA Netw Open 2019;2(10):e1915997.
20) Ganggayah MD, Taib NA, Har YC, Lio P, Dhillon SK.
Predicting factors for survival of breast cancer patients
using machine learning techniques. BMC Med Inform
Decis Mak 2019;19:48.
21) Li Y, Ge D, Gu J, Xu F, Zhu Q, Lu C, et al. A large cohort
study identifying a novel prognosis prediction model
for lung adenocarcinoma through machine learning
strategies. BMC Cancer 2019;19:886.
22) Asamura H, Chansky K, Crowley J, Goldstraw P, Rusch
VW, Vansteenkiste JF, et al. The international association for the study of lung cancer lung cancer staging
project: Proposals for the revision of the N descriptors
in the forthcoming 8th edition of the TNM classification
for lung cancer. J Thorac Oncol 2015;10(12):1675-84.
23) Firat S, Bousamra M, Gore E, Byhardt RW. Comorbidity
and KPS are independent prognostic factors in
stage I non-small-cell lung cancer. Int J Rad Oncol Biol
Phys 2002;52(4):1047-57.
24) Hirsch FR, Spreafico A, Novella S, Wood MD, Simms
L, Papotti M. The prognostic and predictive role of histology
in advanced non-small cell lung cancer: A literature
review. J Thorac Oncol 2008;3(12):1468-81.
25) Bradley JD, Ieumwananonthachai N, Purdy JA,
Wasserman TH, Lockett MA, Graham MV, et al. Gross
tumor volume, critical prognostic factor in patients
treated with three-dimensional conformal radiation
therapy for non-small-cell lung carcinoma. Int J Rad
Oncol Biol Phys 2002;52(1):49-57.
26) Etiz D, Marks LB, Zhou SM, Bentel GC, Clough R,
Hernando ML, et al. Influence of tumor volume on
survival in patients irradiated for non-small-cell lung
cancer. Int J Rad Oncol Biol Phys 2002;53(4):835-46.
27) Gupta P, Chiang SF, Sahoo PK, Mohapatra SK, You JF,
Onthoni DD, et al. Prediction of colon cancer stages
and survival period with machine learning approach.
Cancers 2019;11(12):2007.
28) Diakos CI, Charles KA, McMillan DC, Clarke SJ.
Cancer-related inflammation and treatment effectiveness.
Lancet Oncol 2014;15(11):e493-503.
29) Yuan A, Hsiao YJ, Chen HY, Chen HW, Ho CC, Chen
YY, et al. Opposite effects of M1 and M2 macrophage
subtypes on lung cancer progression. Sci Rep
2015;5:14273.
30) Tao H, Mimura Y, Aoe K, Kobayashi S, Yamamoto
H, Matsuda E, et al. Prognostic potential of FOXP3
expression in non-small cell lung cancer cells combined
with tumor-infiltrating regulatory T cells. Lung
Cancer 2012;75(1):95-101.
31) Pei D, Zhu F, Chen X, Qian J, He S, Qian Y, et al. Preadjuvant
chemotherapy leukocyte count may predict
the outcome for advanced gastric cancer after radical
resection. Biomed Pharmacother 2014;68(2):213-7.
32) Tsai YD, Wang CP, Chen CY, Lin LW, Hwang TZ, Lu
LF, et al. Pretreatment circulating monocyte count associated
with poor prognosis in patients with oral cavity
cancer. Head Neck 2014;36(7):947-53.
33) Forget P, Machiels JP, Coulie PG, Berliere M, Poncelet
AJ, Tombal B, et al. Neutrophil: Lymphocyte ratio and
intraoperative use of ketorolac or diclofenac are prognostic
factors in different cohorts of patients undergoing
breast, lung, and kidney cancer surgery. Ann Surg
Oncol 2013;20(Suppl 3):S650-60.
34) Yang HB, Xing M, Ma LN, Feng LX, Yu Z. Prognostic
significance of neutrophil-lymphocyte ratio/platelet-lymphocyte ratio in
lung cancers: A meta-analysis. Oncotarget
2016;7(47):76769-78.