Journal of Information

June 2015, Volume 1, Issue 1, pp. 1-11

Predict Survival of Patients with Lung Cancer Using an Ensemble Feature Selection Algorithm and Classification Methods in Data Mining

Mahdis Dezfuly 1
Hedieh Sajedi 2

  1. Department of Electrical and Computer Engineering, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran

  2. Department of Computer Science, School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran

Pages: 1-11

DOI: 10.18488/journal.104/2015.1.1/

This research proposes an efficient model for predicting the survival rate of patients affected by lung cancer. The researchers collected data from four feature categories (population, recognition, treatment, and result), selected for their relevance to the survival of patients with lung cancer. The models were built for five levels of death risk, defined by survival period: six months, nine months, one year, two years, and five years. Analysis of the predicted survival rates indicates that, among the classification algorithms compared, the C5.0 decision tree yields the highest accuracy. We also propose a mechanism for feature selection that combines the results of several feature selection algorithms. The results show that our mechanism outperforms the other feature selection algorithms. After applying the proposed mechanism, the accuracy of the C5.0 algorithm reached 97.93%.
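The five survival periods can be read as five separate binary classification targets. The sketch below is only an illustration of that reading, not the authors' preprocessing: the function name, the month values, and the assumption that survival time is available in months with no censoring are all hypothetical.

```python
# Hypothetical mapping of the five periods named in the abstract to months.
HORIZONS_MONTHS = {
    "6 months": 6,
    "9 months": 9,
    "1 year": 12,
    "2 years": 24,
    "5 years": 60,
}


def survival_labels(survival_months):
    """Return one binary label per period.

    A label is True when the patient survived at least that long.
    Assumes an exact, uncensored survival time in months.
    """
    return {
        name: survival_months >= months
        for name, months in HORIZONS_MONTHS.items()
    }


# A patient who survived 10 months passes the 6- and 9-month periods only.
labels = survival_labels(10)
```

Each of the five label columns could then be used to train its own classifier, which is one plausible way to obtain a model per period.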

Contribution/Originality
This study proposes an ensemble feature selection algorithm for predicting the survival of patients with lung cancer.
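The abstract does not state how the results of the individual feature selection algorithms are combined. One common option is simple voting, sketched below under that assumption; the function name, the vote threshold, and the feature names in the example are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ensemble_select(selections, min_votes):
    """Combine several feature subsets by voting.

    selections: one iterable of feature names per selector.
    min_votes:  how many selectors must pick a feature for it to be kept.
    Returns the kept feature names in sorted order.
    """
    votes = Counter()
    for selected in selections:
        # set() ensures each selector votes at most once per feature.
        votes.update(set(selected))
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Three hypothetical selectors, each returning its own feature subset.
selector_outputs = [
    ["age", "stage", "tumor_size"],
    ["stage", "tumor_size", "surgery"],
    ["age", "stage", "radiation"],
]
kept = ensemble_select(selector_outputs, min_votes=2)
```

Raising `min_votes` keeps only features that most selectors agree on, while lowering it approaches the union of all subsets.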









