Enhancing a Random Forest Model Based on Single Rule Reduction for Tax Evasion Depends on the Values of K in K-Fold Validation Technique
تعزيز نموذج الغابة العشوائية بالاعتماد على تقليل القواعد المفردة للتنبؤ بالتهرب الضريبي وفقًا لقيم K في تقنية التحقق المتقاطع K-Fold
2024 1st International Conference on Emerging Technologies for Dependable Internet of Things (ICETI) · 2024 · pp. 1–9
Abstract
This paper develops a new model, Single Rule Random Forest (SrRF), that enhances the performance of the Random Forest (RF) technique and reduces the number of its rules. The performance of SrRF is compared with a set of ML models (DT, RF, GPT) on a dataset provided by the Tax Authority of Yemen, consisting of 1083 records, using two values of k in K-Fold Cross Validation. With k=10, the GPT classifier gave the highest result, 100%, but this result indicates overfitting; among the remaining models, the SrRF classifier gave the best result, 99.89%, while RF gave the worst. With k=5, the GPT classifier again gave the highest result, 100%, which again indicates overfitting; the SrRF classifier gave the next-best result, while RF gave the worst. The researchers also note that training with K-fold cross-validation at k=10 gave better results than at k=5. Overall, the proposed SrRF model with k=10 gave the best result.
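The evaluation protocol described in the abstract, scoring one model under K-fold cross-validation for k=5 and k=10, can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (not the Tax Authority of Yemen dataset), the classifier is a placeholder one-threshold rule (not the authors' SrRF, RF, DT, or GPT models), and the names `k_fold_indices`, `fit_stump`, and `cross_val_accuracy` are hypothetical helpers introduced here.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def fit_stump(X, y):
    """Placeholder classifier (assumption, NOT the paper's SrRF):
    pick the single feature/threshold rule with the best training accuracy."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for hi in (0, 1):  # label predicted when row[f] > t
                preds = [hi if row[f] > t else 1 - hi for row in X]
                acc = sum(p == label for p, label in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, f, t, hi)
    _, f, t, hi = best
    return lambda row: hi if row[f] > t else 1 - hi


def cross_val_accuracy(X, y, k, fit=fit_stump, seed=0):
    """Mean held-out accuracy over k folds: train on k-1 folds, test on the rest."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(model(X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Synthetic two-feature data; the label depends only on the first feature.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

for k in (5, 10):
    print(f"k={k}: mean accuracy = {cross_val_accuracy(X, y, k):.4f}")
```

With k=10 each model is trained on 90% of the records per fold, against 80% with k=5, which is one plausible reason a larger k can yield slightly higher scores on a small dataset such as the 1083 records used here.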