
A New K-Nearest Neighbors Classifier for Big Data Based on Efficient Data Pruning

  • The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it to big data comes with computational challenges. KNN determines the class of a new sample based on the classes of its nearest neighbors; identifying those neighbors in a large amount of data, however, imposes a computational cost so high that a single computing machine can no longer handle it. Pruning is one of the techniques proposed to make classification methods applicable to large datasets. LC-KNN is an improved KNN method that first clusters the data into smaller partitions using the K-means clustering method and then, for each new sample, applies KNN on the partition whose center is nearest. However, because the clusters have different shapes and densities, selecting the appropriate cluster is a challenge. In this paper, an approach is proposed that improves the pruning phase of the LC-KNN method by taking these factors into account. The proposed approach helps to choose a more appropriate cluster in which to search for the neighbors, thus increasing classification accuracy. The performance of the proposed approach is evaluated on several real datasets. The experimental results show the effectiveness of the proposed approach, with higher classification accuracy and lower time cost in comparison to other recent relevant methods.
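The two-phase LC-KNN scheme the abstract describes (K-means partitioning, then KNN restricted to the nearest partition) can be sketched roughly as follows. This is a minimal illustration of the general idea only, not the authors' implementation or their proposed cluster-selection improvement; all names, parameter values, and the toy data are assumptions.

```python
# Rough sketch of the LC-KNN pruning idea: partition the training data with
# K-means, then classify each query with KNN run only inside the partition
# whose center is nearest. Hyperparameters here are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def lc_knn_predict(X_train, y_train, X_query, n_clusters=3, k=3, seed=0):
    # Pruning phase: split the data into smaller partitions via K-means.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X_train)
    preds = []
    for x in X_query:
        # Pick the partition whose center is nearest to the query point.
        c = km.predict(x.reshape(1, -1))[0]
        mask = km.labels_ == c
        # Classification phase: ordinary KNN inside that partition only.
        knn = KNeighborsClassifier(n_neighbors=min(k, int(mask.sum())))
        knn.fit(X_train[mask], y_train[mask])
        preds.append(knn.predict(x.reshape(1, -1))[0])
    return np.array(preds)

# Toy data: two well-separated classes around (0, 0) and (5, 5).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(lc_knn_predict(X, y, np.array([[0.1, 0.2], [5.1, 4.9]])))
```

Because each KNN search scans only one partition rather than the full dataset, the neighbor lookup cost drops roughly in proportion to the number of clusters; the challenge the paper targets is that picking the partition by center distance alone ignores cluster shape and density.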

Download full-text files

  • Full text (English)
    (665 KB)

    Funded by the DFG's Open Access Publishing programme and the publication fund of the Bauhaus-Universität Weimar.

Export metadata

Metadata
Document type: Article (scholarly)
Authors: Hamid Saadatfar, Samiyeh Khosravi, Javad Hassannataj Joloudari, Amir Mosavi, Shahaboddin Shamshirband
DOI (citation link): https://doi.org/10.3390/math8020286
URN (citation link): https://nbn-resolving.org/urn:nbn:de:gbv:wim2-20200225-40996
URL: https://www.mdpi.com/2227-7390/8/2/286
Title of parent work: Mathematics
Publisher: MDPI
Language: English
Date of publication (online): 21.02.2020
Date of first publication: 20.02.2020
Release date: 25.02.2020
Publishing institution: Bauhaus-Universität Weimar
Institutes and partner institutions: Fakultät Bauingenieurwesen / Institut für Strukturmechanik (ISM)
Year: 2020
Issue: volume 8, issue 2, article 286
Page count: 12
Free keyword / tag: OA-Publikationsfonds2020
K-nearest neighbors; KNN; Machine learning; artificial intelligence; big data; classification; classifier; cluster density; cluster shape; clustering; computation; data science; reinforcement learning
GND keyword: Maschinelles Lernen (machine learning)
DDC classification: 500 Natural sciences and mathematics
BKL classification: 54 Computer science
Open Access publication fund: Open-Access-Publikationsfonds 2020
License: Creative Commons 4.0 - Attribution (CC BY 4.0)