
Table 3  A comprehensive summary of machine learning methods studied by various researchers

From: A comprehensive review of machine learning techniques on diabetes detection

Algorithm

Method used/innovation

Application and future work

Results and limitations (if specified)

References

J48, AdaBoost, and bagging on base classifier

The model was trained on the Canadian Primary Care Sentinel Surveillance Network dataset, which provides several features to train on. The author used the ensemble method AdaBoost with the J48 DT as the base classifier.

The author claimed that these ensemble algorithms can be used on other disease datasets to increase accuracy.

AdaBoost with J48 as the base classifier showed the maximum accuracy, followed by bagging and then the plain J48 classifier. The area under the ROC curve (AROC) was used as the evaluation parameter.

[27]
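For context, the boosting scheme in [27] (a base learner refit each round on re-weighted data, with misclassified points gaining weight) can be sketched in plain Python. The decision-stump base learner and toy 1-D data below are illustrative stand-ins, not the J48 trees or CPCSSN data from the study:

```python
import math

def stump_predict(x, thresh, polarity):
    # A decision stump: predict `polarity` left of the threshold, its opposite to the right.
    return polarity if x < thresh else -polarity

def best_stump(X, y, w):
    # Exhaustive search over thresholds and polarities for the lowest weighted error.
    xs = sorted(X)
    thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    best, best_err = None, float("inf")
    for t in thresholds:
        for p in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if stump_predict(xi, t, p) != yi)
            if err < best_err:
                best, best_err = (t, p), err
    return best, best_err

def adaboost(X, y, rounds):
    w = [1 / len(X)] * len(X)
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        (t, p), err = best_stump(X, y, w)
        # Low-error stumps get a large vote; misclassified points gain weight.
        alpha = 0.5 * math.log((1 - err + 1e-10) / (err + 1e-10))
        ensemble.append((alpha, t, p))
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, t, p))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy pattern + + + - - - + + that no single stump can separate.
X = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 1, 1, -1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
```

On this toy data the three-stump ensemble classifies every point correctly, while the best single stump misclassifies two of eight, which is the effect ensembling is meant to deliver.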

NB with clustering

The dataset used was the PIMA Indians Diabetes Dataset with eight attributes. The model applies NB after a prior clustering step and is compared against plain NB. Five hundred and thirty-one instances were divided into 5 clusters; only the fourth cluster, consisting of 148 instances, was used for testing.

The authors suggest that collecting a large amount of training data could increase accuracy many-fold, helping people by providing a system that gives a correct prediction without their having to consult a doctor.

The evaluation parameters were accuracy, sensitivity, and specificity. The model with clustering showed a 10% increase in accuracy and a 53.11% rise in sensitivity, but at the cost of a 10.99% fall in specificity and a reduced amount of usable data.

[28]
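The NB half of the approach in [28] rests on combining class priors with per-feature likelihoods under an independence assumption. A minimal Gaussian NB sketch follows (the clustering step is omitted, and the toy numbers are illustrative, not PIMA values):

```python
import math

def fit_gaussian_nb(X, y):
    # Estimate the class prior and a per-feature (mean, variance) for each class.
    model = {}
    for c in set(y):
        rows = [x for x, yi in zip(X, y) if yi == c]
        prior = len(rows) / len(X)
        stats = []
        for j in range(len(rows[0])):
            col = [r[j] for r in rows]
            mu = sum(col) / len(col)
            var = sum((v - mu) ** 2 for v in col) / len(col)
            stats.append((mu, var))
        model[c] = (prior, stats)
    return model

def predict_nb(model, x):
    # Pick the class with the highest log posterior (features assumed independent).
    best_c, best_lp = None, -math.inf
    for c, (prior, stats) in model.items():
        lp = math.log(prior)
        for xj, (mu, var) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (xj - mu) ** 2 / (2 * var)
        if lp > best_lp:
            best_c, best_lp = c, lp
    return best_c

X = [[1, 2], [2, 1], [1, 1], [6, 7], [7, 6], [6, 6]]
y = [0, 0, 0, 1, 1, 1]
nb = fit_gaussian_nb(X, y)
```

Working in log space avoids underflow when many feature likelihoods are multiplied together.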

DTs, LR, and NB with bagging and boosting

The initial datasets were collected from primary care units and, after further processing, consisted of 11 features for 30,122 people. The three algorithms were used along with bagging and boosting, which serve to decrease overfitting and increase accuracy.

The final model obtained with highest accuracy was deployed on a commercial web application.

With and without ensembling, the accuracies were: DT 85.090, LR 82.308, NB 81.010; bagging with DT (BG+DT) 85.333, bagging with LR (BG+LR) 82.318, bagging with NB (BG+NB) 80.960; boosting with DT (BT+DT) 84.098, boosting with LR (BT+LR) 82.312, boosting with NB (BT+NB) 81.019. RF, at 85.558, showed the maximum accuracy. The ROC was used for final validation.

[29]
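The bagging used in [29] amounts to training each base model on a bootstrap replicate of the data and letting the trained models vote. A sketch of that mechanism follows; the nearest-centroid base learner and toy data are hypothetical stand-ins for the study's DT/LR/NB learners:

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    # A bootstrap replicate: n examples drawn with replacement.
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def fit_centroid(X, y):
    # Nearest-centroid base learner: one mean point per class.
    centroids = {}
    for c in set(y):
        rows = [x for x, yi in zip(X, y) if yi == c]
        centroids[c] = [sum(col) / len(col) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, x):
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def majority_vote(labels):
    return Counter(labels).most_common(1)[0][0]

def bagged_predict(models, x):
    # Each bootstrap-trained model votes; the majority label wins.
    return majority_vote([predict_centroid(m, x) for m in models])

rng = random.Random(0)
X = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
y = [0, 0, 0, 1, 1, 1]
models = [fit_centroid(*bootstrap_sample(X, y, rng)) for _ in range(11)]
```

An odd ensemble size (11 here) avoids tied votes in the binary case.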

LR, KNN, SVM, LDA, NB, DT, and RF

The author collected a raw dataset from Noakhali medical hospital containing 9843 samples with 14 attributes. Eighty percent of the data was taken for training and the rest for testing. All the chosen algorithms were used for model building, and the models were then validated using k-fold cross-validation.

The author proposes that accuracy can be enhanced to support earlier treatment and lessen patient suffering. Additionally, more classifiers can be implemented to identify the leading one and extend the work toward automated analysis.

The RF classifier performed best in classifying the data, and LR performed worst. Although machine learning classifiers are widely used, they still fall short of deep learning models in accuracy.

[30]
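The k-fold validation in [30] partitions the data into k folds, each serving once as the held-out test set while the rest train the model. A minimal index-splitting sketch (fold counts are illustrative):

```python
def kfold_indices(n, k):
    # Split indices 0..n-1 into k folds; each fold serves once as the test set.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for s in sizes:
        folds.append(list(range(start, start + s)))
        start += s
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, test))
    return splits

splits = kfold_indices(10, 5)
```

Averaging a metric over the k test folds gives a less split-dependent estimate than a single 80/20 holdout.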

LR and DTs

The dataset was prepared from a questionnaire administered to 1487 individuals, of whom 735 were diabetic and the remaining 752 non-diabetic. A Pearson chi-square test was carried out on all the characteristics. The models’ performance was evaluated on three parameters: accuracy, sensitivity, and specificity. A confusion matrix was also built to determine model performance.

Recently, many researchers have been implementing various algorithms and networks to compare them and find out the most feasible one. DTs and LR are among the ones that are most used.

LR achieved an accuracy of 76.54%, sensitivity of 79.4%, and specificity of 73.54% on the testing data, while the DT achieved an accuracy of 76.97%, sensitivity of 78.11%, and specificity of 75.78%. Overall, the DT model performed better than the LR model. The main limitation is the dataset: it was collected from only one region of China; had it been collected from different regions, the model would be more broadly applicable.

[31]
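The three parameters reported above all derive from the confusion matrix: accuracy from all four cells, sensitivity from the positive row, specificity from the negative row. A short sketch with illustrative labels (not the study's data):

```python
def confusion_counts(y_true, y_pred, positive=1):
    # Tally the four confusion-matrix cells for the chosen positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate: diabetics correctly flagged
    specificity = tn / (tn + fp)   # true-negative rate: healthy correctly cleared
    return accuracy, sensitivity, specificity

tp, tn, fp, fn = confusion_counts([1, 1, 1, 1, 0, 0, 0, 0],
                                  [1, 1, 1, 0, 0, 0, 1, 0])
```

The sensitivity/specificity trade-off reported in several rows of this table (e.g. [28]) is exactly a shift of mass between the fn and fp cells.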

SVM and LR

The Practice Fusion de-identified dataset, taken from Kaggle and containing data on approximately 10,000 patients, was used for the study. The features were divided into baseline, lab-test, diagnosis, and medication groups. For the classification task, LR and SVM were deployed: the LR model was implemented using the GLM function, and the SVM used a linear kernel. The area under the ROC curve was the evaluation parameter.

LR is a model that is widely used in public health and clinical practice for disease detection and to calculate risks.

On using a smaller subset of features, the LR model performed slightly better than the SVM model.

[32]
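The LR model described above scores a patient by a linear combination of features passed through the sigmoid, yielding a probability that can be thresholded or ranked for ROC analysis. A minimal sketch; the weights and feature values are hypothetical, not fitted coefficients from [32]:

```python
import math

def sigmoid(z):
    # Map a real-valued linear score to a probability in (0, 1).
    return 1 / (1 + math.exp(-z))

def lr_probability(weights, bias, x):
    # P(diabetic | x) under a logistic model: sigmoid of the linear score.
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)
```

A usage example: `lr_probability([1.0, -2.0], 0.5, [2.0, 1.0])` computes sigmoid(0.5), a little above one half.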

DTs and NB

The dataset considered was the PIMA Indian diabetes database. Applying feature selection, the author obtained five features. 10-fold cross-validation was used for data preparation, after which the J48 DT and NB algorithms were applied. Model performance was evaluated using mean absolute error (MAE), root-mean-square error (RMSE), relative absolute error, root relative squared error, and the kappa statistic.

The author proposed to gather information for the dataset from different people to make a more representative model. The work can be further enhanced to include automation.

Using a percentage split of 70:30, the J48 DT algorithm correctly classified 177 instances (76.95%), whereas NB reached an accuracy of 79.56%. Accuracy was higher under the percentage split than under cross-validation, suggesting the models do not maintain good accuracy on larger datasets.

[33]
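Three of the error measures named above are straightforward to state in code; a sketch with illustrative labels (relative absolute error compares the model's absolute error to that of always predicting the mean):

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root-mean-square error: penalizes large deviations more heavily than MAE.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def relative_absolute_error(y_true, y_pred):
    # Absolute error relative to the naive predict-the-mean baseline.
    mean = sum(y_true) / len(y_true)
    return (sum(abs(t - p) for t, p in zip(y_true, y_pred))
            / sum(abs(t - mean) for t in y_true))

y_true, y_pred = [0, 1, 1, 0], [0, 1, 0, 1]
```

A relative absolute error of 1.0 means the model is no better than the mean baseline; values below 1.0 indicate genuine predictive skill.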

RF and XGBoost

The author used the PIMA diabetes dataset. Working in Jupyter Notebook, the author trained the models on 8 of the 9 attributes provided in the dataset. The algorithms used were RF and XGBoost. After setting the hyperparameters, the models were trained.

The author suggests the use of more algorithms in this branch of machine learning like hybrid model for better accuracies.

The accuracy of the RF classifier came out to 71.9%, while the hybrid model proposed through XGBoost achieved 74.1%. These accuracies are lower than those already reported elsewhere; better hyperparameter tuning is needed to optimize the algorithms.

[34]
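The hyperparameter tuning that [34] flags as the weak point is commonly done by exhaustive grid search over candidate settings. A generic sketch; the parameter names and the scoring function standing in for cross-validated accuracy are hypothetical:

```python
from itertools import product

def grid_search(param_grid, score_fn):
    # Try every combination in the grid and return the best-scoring one.
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        s = score_fn(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

# Hypothetical scoring function standing in for cross-validated accuracy;
# it peaks at n_estimators=200, max_depth=6 by construction.
def fake_cv_score(params):
    return -abs(params["n_estimators"] - 200) / 1000 - abs(params["max_depth"] - 6) / 100

grid = {"n_estimators": [100, 200, 400], "max_depth": [4, 6, 8]}
best, score = grid_search(grid, fake_cv_score)
```

In practice the score function would run k-fold cross-validation of the RF or XGBoost model at each setting, so grid size multiplies training cost.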

DT (J48) and NB

The author used the PIMA Indian diabetes dataset with 8 attributes, reduced to 5 by feature selection. Pre-processing was performed in WEKA using 10-fold cross-validation. The model was built on 70% of the dataset, and the rest was used for testing.

In future work, the authors plan to gather data from different locales around the world to build a more precise and general predictive model for diabetes diagnosis. Future study will also focus on gathering data from a more recent period and discovering new potential prognostic factors to incorporate. The work can be extended and improved toward automated diabetes analysis.

The J48 algorithm was 76.95% accurate, with other parameters such as the kappa statistic, MAE, RMSE, relative absolute error, and root relative squared error also reported. The NB algorithm was accurate up to 79.56%. Since this model is not optimally configured, a mature model would require more training data for building and testing.

[33]

Genetic algorithms with fuzzy logic

This work implements genetic and fuzzy algorithms for effective disease prediction. The GA was implemented in MATLAB R2006b. Feature selection was implemented using fuzzy logic: first, a mapping was performed based on an appropriateness measure of variable values to each class, using membership functions suited to each type of feature; then, simple fuzzy reasoning mechanisms were proposed to handle classification in a unified way.

The proposed work helps minimize the cost and increase accuracy and can be used in future for better implementations.

Through this approach of GA, the accuracy went up to 87% with the training cost reducing by more than 50%.

[35]
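A GA of the kind used in [35] evolves a population of candidate solutions through selection, crossover, and mutation. The sketch below applies it to feature selection over binary masks; the fitness function rewarding a "true" informative subset is hypothetical, standing in for a wrapper evaluation such as cross-validated accuracy:

```python
import random

def ga_select(n_features, fitness, generations=30, pop_size=12, rng=None):
    # Binary-mask GA: tournament selection, one-point crossover, bit-flip mutation, elitism.
    rng = rng or random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best mask always survives
        while len(new_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament of 3
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]                 # one-point crossover
            if rng.random() < 0.2:                    # occasional bit-flip mutation
                i = rng.randrange(n_features)
                child[i] ^= 1
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)  # elitism guarantees no regression
    return best

# Hypothetical fitness: reward masks matching a "true" informative subset.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]
def fitness(mask):
    return sum(1 for m, t in zip(mask, TARGET) if m == t)

best_mask = ga_select(8, fitness, rng=random.Random(42))
```

Elitism (copying the current best into each new generation) is what makes the best fitness monotonically non-decreasing across generations.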

k-NN, NB, DT, RF, SVM, LR

These models were compared for the detection of type-2 diabetes. A total of 300 samples were taken, of which 161 were diabetic, 60 non-diabetic, and the rest unconfirmed. Using feature summarization with the WEKA tool, eight features were selected. These algorithms were compared against a proposed framework that automatically extracts patterns of type 2 DM.

The authors suggest applying genome-wide and phenome-wide association studies in the hope of uncovering associations with DM; these could prove important for future models.

The proposed model was evaluated on the basis of accuracy, precision, specificity, sensitivity, and AUC. The proposed algorithm achieved an AUC of 0.98, outperforming the state-of-the-art AUC of 0.71.

[36]

LR, DT, RF, SVM

The author used the PIMA Indian women dataset concerned with women’s health with 8 attributes. Different models were trained for this dataset under different hyperparameters.

The author proposed to create advanced models on RF because of its highest accuracy and ability to overcome overfitting.

The models were compared on the basis of accuracy. RF achieved the highest accuracy, 77.06%, followed by SVM.

[22]

DT – J48, RF

The authors obtained a dataset of hospital physical-examination records from Luzhou. An independent test set of 13,700 samples was taken; the data contained 14 attributes. The second dataset was the PIMA Indian diabetes dataset. The DT (J48) and RF algorithms were implemented in WEKA together with principal component analysis.

The authors hope to predict the type of diabetes using a dataset containing the required data, which would be an added advantage for improving accuracy.

The RF and the J48 algorithm achieved an accuracy of 73.95% and 73.88%, respectively, on the Luzhou dataset and 71.44% and 71.67%, respectively, on the PIMA dataset.

[37]

NB and SVM

A dataset of 500 patient records with symptoms of heart disease was collected from a diabetes healthcare institute. The dataset contained 9 attributes. Both algorithms were implemented in WEKA. For the SVM, a radial basis function (RBF) kernel was used.

Classifiers of this kind can help in the early detection of a diabetic patient's vulnerability to heart disease, so patients can be forewarned to change their lifestyle. This can prevent diabetic patients from developing heart disease, lowering mortality rates as well as the state's healthcare costs.

NB was able to classify 74% of instances correctly. For SVM, the accuracy gained was 95.6%.

[38]
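The RBF kernel used for the SVM above measures similarity as a Gaussian of squared distance: identical points score 1, and the score decays toward 0 as points move apart, at a rate set by gamma. A minimal sketch (the gamma value is illustrative):

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2): similarity decays with squared distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

Larger gamma makes the kernel more local, which lets the SVM fit more intricate boundaries but risks overfitting.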

DT, SVM, k-NN, RF

The author used the PIMA Indian dataset. Then, data normalization was done followed by feature selection.

The benefit of this optimized machine learning model is that it is suitable for patients with DM and can be applied across healthcare environments.

The models are compared on the basis of accuracy, sensitivity, and specificity; the DT achieved 78.25% accuracy.

[39]
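The normalization step mentioned in [39] typically rescales each feature to a common range so that no attribute dominates distance-based learners such as k-NN or the SVM. A min-max sketch (the sample values are illustrative):

```python
def minmax_normalize(column):
    # Rescale a feature column linearly onto [0, 1]; constant columns map to 0.
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]
```

For example, a glucose-like column [50, 100, 150] rescales to [0.0, 0.5, 1.0].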