Published in 3C Tecnología – Volume 11 Issue 2 (Ed. 42)
As datasets grow to contain hundreds or thousands of features, feature selection has drawn the interest of many researchers in recent years. Typically, not every column carries useful information, so machine learning models may perform poorly when noisy or irrelevant columns confound the algorithms. To address this issue, various feature selection methods have been developed to evaluate high-dimensional datasets and identify subsets of relevant features. However, the result of any single feature selection algorithm is frequently skewed by the data it is applied to. Ensemble approaches have therefore emerged as an alternative that combines the strengths of individual feature selection algorithms and compensates for their weaknesses. This research aims to clarify the key concepts and relationships involved in aggregating feature selection methods, with the goal of handling feature selection on high-dimensional datasets. The proposed approach is tested with a cross-validation implementation that combines several Python packages providing feature selection techniques. The performance of the implementation was demonstrated by identifying relevant features in human, chimpanzee, and dog DNA datasets.
Keywords: Cross-validation, Ensemble methods, Feature selection.
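The aggregation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two filter scorers (`variance_score`, `correlation_score`), the mean-rank aggregation rule, the function `ensemble_select`, and the synthetic data are all assumptions chosen for illustration, and only the Python standard library is used rather than the packages the paper combines.

```python
import random
from statistics import mean, pvariance


def variance_score(column, labels):
    """Hypothetical filter 1: unsupervised, higher variance scores higher."""
    return pvariance(column)


def correlation_score(column, labels):
    """Hypothetical filter 2: absolute Pearson correlation with the label."""
    mx, my = mean(column), mean(labels)
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    sx = sum((x - mx) ** 2 for x in column) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def rank_features(rows, labels, scorer):
    """Return the rank position (0 = best) of every feature under one scorer."""
    n_features = len(rows[0])
    scores = [scorer([r[j] for r in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    ranks = [0] * n_features
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks


def ensemble_select(rows, labels, scorers, n_folds=5, top_k=2, seed=0):
    """Average each feature's rank over all scorers and all training folds."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    all_ranks = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        x = [rows[i] for i in train]
        y = [labels[i] for i in train]
        for scorer in scorers:
            all_ranks.append(rank_features(x, y, scorer))
    n_features = len(rows[0])
    mean_rank = [mean(r[j] for r in all_ranks) for j in range(n_features)]
    # A lower mean rank means the selectors agree the feature is relevant.
    return sorted(range(n_features), key=lambda j: mean_rank[j])[:top_k]


# Synthetic data (an assumption, not the DNA datasets from the paper):
# feature 0 tracks the label, feature 1 is noise, feature 2 is constant.
rng = random.Random(42)
labels = [i % 2 for i in range(40)]
rows = [[10 * y + rng.gauss(0, 0.1), rng.gauss(0, 1), 3.0] for y in labels]

selected = ensemble_select(rows, labels, [variance_score, correlation_score])
print(selected)
```

Averaging ranks instead of raw scores keeps selectors with different score scales comparable, and repeating the ranking on each training fold reduces the influence of any single data split on the final subset.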