TY  - JOUR
A1  - Saadatfar, Hamid
A1  - Khosravi, Samiyeh
A1  - Hassannataj Joloudari, Javad
A1  - Mosavi, Amir
A1  - Shamshirband, Shahaboddin
T1  - A New K-Nearest Neighbors Classifier for Big Data Based on Efficient Data Pruning
JF  - Mathematics
N2  - The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it to big data comes with computational challenges. KNN determines the class of a new sample based on the classes of its nearest neighbors; however, identifying the neighbors in a large amount of data imposes such a large computational cost that it is no longer feasible on a single computing machine. One of the techniques proposed to make classification methods applicable to large datasets is pruning. LC-KNN is an improved KNN method that first clusters the data into smaller partitions using the K-means clustering method and then, for each new sample, applies KNN on the partition whose center is nearest. However, because the clusters have different shapes and densities, selecting the appropriate cluster is a challenge. In this paper, an approach is proposed to improve the pruning phase of the LC-KNN method by taking these factors into account. The proposed approach helps to choose a more appropriate cluster of data in which to look for the neighbors, thus increasing the classification accuracy. The performance of the proposed approach is evaluated on different real datasets. The experimental results show the effectiveness of the proposed approach, with higher classification accuracy and lower time cost than other recent relevant methods.
KW  - Maschinelles Lernen
KW  - Machine learning
KW  - K-nearest neighbors
KW  - KNN
KW  - classifier
KW  - big data
KW  - clustering
KW  - cluster shape
KW  - cluster density
KW  - classification
KW  - reinforcement learning
KW  - data science
KW  - computation
KW  - artificial intelligence
KW  - OA-Publikationsfonds2020
Y1  - 2020
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20200225-40996
UR  - https://www.mdpi.com/2227-7390/8/2/286
VL  - 8
IS  - 2
SP  - 286
PB  - MDPI
ER  - 
TY  - INPR
A1  - Mosavi, Amir
A1  - Torabi, Mehrnoosh
A1  - Hashemi, Sattar
A1  - Saybani, Mahmoud Reza
A1  - Shamshirband, Shahaboddin
T1  - A Hybrid Clustering and Classification Technique for Forecasting Short-Term Energy Consumption
N2  - Electrical energy distribution companies in Iran must announce their energy demand at least three days ahead of the market opening, so accurate load estimation is crucial. This research followed a methodology based on CRISP data mining and used SVM, ANN, and CBA-ANN-SVM (a novel hybrid model that combines clustering with the widely used ANN and SVM) to predict the short-term electrical energy demand of Bandarabbas. In previous studies, researchers introduced only a few effective parameters, without reaching a reasonable error, for Bandarabbas power consumption. In this research we tried to identify all effective parameters, and with the CBA-ANN-SVM model the error rate was minimized. After consulting with experts in the field of power consumption and plotting daily power consumption for each week, this research showed that official holidays and weekends have an impact on power consumption. When the weather gets warmer, the consumption of electrical energy increases because electric air conditioners are turned on. Consumption patterns in warm and cold months also differ. Analyzing power consumption of the same month in different years showed high similarity in consumption patterns. Factors with high impact on power consumption were identified, and statistical methods were used to confirm their impact. The model was built using SVM, ANN, and CBA-ANN-SVM. Since the proposed method (CBA-ANN-SVM) has a low MAPE = 1.474 (4 clusters) and MAPE = 1.297 (3 clusters) in comparison with SVM (MAPE = 2.015) and ANN (MAPE = 1.790), this model was selected as the final model. The final model benefits from both models and from clustering. By discovering the structure of the data, the clustering algorithm divides the data into several clusters based on their similarities and differences. Because the data inside each cluster are more similar to one another than the data as a whole, modeling within each cluster gives better results. 
For future research, we suggest using fuzzy methods, genetic algorithms, or a hybrid of both to forecast each cluster. It is also possible to use fuzzy methods, genetic algorithms, or a hybrid of both without clustering. Such models are expected to produce better and more accurate results.
This paper presents a hybrid approach to predict the electric energy usage of weather-sensitive loads. The presented method utilizes the clustering paradigm along with ANN and SVM approaches for accurate short-term prediction of electric energy usage, using weather data. Since the methodology invoked in this research is based on CRISP data mining, data preparation has received a great deal of attention in this research. Once data pre-processing was done, the underlying pattern of electric energy consumption was extracted by means of machine learning methods to precisely forecast short-term energy consumption. The proposed approach (CBA-ANN-SVM) was applied to real load data and resulted in higher accuracy compared to the existing models.
2018 American Institute of Chemical Engineers Environ Prog, 2018 
https://doi.org/10.1002/ep.12934
KW  - Data Mining
KW  - support vector machine (SVM)
KW  - Machine Learning
KW  - forecasting
KW  - Prediction
KW  - Electric Energy Consumption
KW  - clustering
KW  - artificial neural networks (ANN)
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20180907-37550
N1  - This is the pre-peer reviewed version of the following article: https://onlinelibrary.wiley.com/doi/10.1002/ep.12934, which has been published in final form at 
https://doi.org/10.1002/ep.12934. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.
ER  - 
TY  - THES
A1  - Udrea, Mihai-Andrei
T1  - Assessment of Data from Dynamic Bridge Monitoring
N2  - The focus of the thesis is to process measurements acquired from a continuous
monitoring system at a railway bridge. Temperature, strain and ambient vibration
records are analysed and two main directions of investigation are pursued.

The first and the most demanding task is to develop processing routines able to extract modal parameters from ambient vibration measurements. For this purpose, reliable experimental models are achieved on the basis of a stochastic system identification (SSI) procedure. A fully automated algorithm based on a three-stage clustering is implemented to perform a modal parameter estimation for every single measurement. After selecting a baseline of modal parameters, the evolution of eigenfrequencies is
studied and correlated to environmental and operational factors.

The second aspect deals with the structural response to passing trains. Corresponding
triggered records of strain and temperature are processed and their assessment is
accomplished using the average strains induced by each train as the reference parameter.
Three influences due to speed, temperature and loads are distinguished and treated individually. An attempt to estimate the maximum response variation due to each factor is also carried out.
KW  - automatic modal analysis
KW  - stochastic subspace identification
KW  - modal tracking
KW  - modal parameter estimation
KW  - clustering
KW  - Measurement technology
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:wim2-20140429-21742
ER  -