Review of Computer Engineering Research

June 2019, Volume 6, Issue 2, pp 64-75

A Review of Machine Learning Models for Software Cost Estimation

Farrukh Arslan 1

  1. School of Electrical and Computer Engineering, Purdue University, West Lafayette, USA.

Pages: 64-75

DOI: 10.18488/journal.76.2019.62.64.75

Article History:

Received: 12 June, 2019
Revised: 15 July, 2019
Accepted: 20 August, 2019
Published: 27 September, 2019


Software cost estimation is a critical task in software project development. It helps project managers and software engineers plan and manage their resources. However, developing an accurate cost estimation model for a software project is challenging. The aim of such a model is to give better foresight into the project's progress and its phases. Another main objective is to provide clear project details and specifications that help stakeholders manage the project in terms of human resources, assets, software, and data, and even in the feasibility study. Accurate estimation results help the project manager better estimate the project cost, the time required for the various project phases, and the resources or assets needed. This paper builds a software cost estimation model using a machine learning approach. Different machine learning algorithms are applied to two public datasets to predict the software cost in the early stages. The results show that machine learning methods can predict software cost with a high accuracy rate.
Contribution/Originality
This study contributes to the existing literature by enhancing the results of thirteen machine learning algorithms on two datasets. The evaluation criteria used in this work are R², MAE, RMAE, RAE, and RRSE. The aim of the proposed model is to predict the effort from the dataset attributes and to compare the predictions with the actual effort in order to measure the error under the different criteria.
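The comparison of predicted and actual effort can be illustrated with a minimal sketch of the error measures. It is not the paper's implementation. The function name `evaluation_metrics` and the sample effort values are hypothetical. The sketch computes RMSE on the assumption that it is the measure meant by "RMAE", treats R² as the coefficient of determination, and uses the mean of the actual values as the baseline predictor for RAE and RRSE.

```python
import math


def evaluation_metrics(actual, predicted):
    """Compare predicted effort with actual effort (hypothetical helper).

    Assumptions: the baseline for RAE and RRSE is the mean of the actual
    values, and R^2 is the coefficient of determination.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    mean_actual = sum(actual) / n

    # Errors of the model under evaluation.
    sum_abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sum_sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Errors of the baseline that always predicts the mean actual effort.
    base_abs_err = sum(abs(a - mean_actual) for a in actual)
    base_sq_err = sum((a - mean_actual) ** 2 for a in actual)
    if base_sq_err == 0:
        raise ValueError("actual values must not all be identical")

    return {
        "r2": 1 - sum_sq_err / base_sq_err,           # coefficient of determination
        "mae": sum_abs_err / n,                        # mean absolute error
        "rmse": math.sqrt(sum_sq_err / n),             # root mean squared error
        "rae": sum_abs_err / base_abs_err,             # relative absolute error
        "rrse": math.sqrt(sum_sq_err / base_sq_err),   # root relative squared error
    }


# Illustrative effort values only; these are not taken from the paper's datasets.
actual_effort = [10, 20, 30, 40]
predicted_effort = [12, 18, 33, 39]
metrics = evaluation_metrics(actual_effort, predicted_effort)
```

RAE and RRSE are ratios against the mean baseline, so a value below 1 means the model has lower error than always predicting the mean effort.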


Machine learning, Cost estimation, Prediction, Weka, Algorithms, Classification, Prediction models.






This study received no specific financial support.

Competing Interests:

The author declares that there are no conflicts of interest regarding the publication of this paper.

