Due to the importance of identifying crop cultivars, the advancement of accurate assessment of cultivars is considered essential. The existing methods for identifying rice cultivars are mainly time-consuming, costly, and destructive. Therefore, the development of novel methods is highly beneficial. The aim of the present research is to classify common rice cultivars in Iran based on color, morphologic, and texture properties using artificial intelligence (AI) methods. In doing so, digital images of 13 rice cultivars in Iran in three forms of paddy, brown, and white are analyzed through pre-processing and segmentation of using MATLAB. Ninety-two specificities, including 60 color, 14 morphologic, and 18 texture properties, were identified for each rice cultivar. In the next step, the normal distribution of data was evaluated, and the possibility of observing a significant difference between all specificities of cultivars was studied using variance analysis. In addition, the least significant difference (LSD) test was performed to obtain a more accurate comparison between cultivars. To reduce data dimensions and focus on the most effective components, principal component analysis (PCA) was employed. Accordingly, the accuracy of rice cultivar separations was calculated for paddy, brown rice, and white rice using discriminant analysis (DA), which was 89.2%, 87.7%, and 83.1%, respectively. To identify and classify the desired cultivars, a multilayered perceptron neural network was implemented based on the most effective components. The results showed 100% accuracy of the network in identifying and classifying all mentioned rice cultivars. Hence, it is concluded that the integrated method of image processing and pattern recognition methods, such as statistical classification and artificial neural networks, can be used for identifying and classification of rice cultivars.
Paper-based data acquisition and manual transfer between incompatible software or data formats during inspections of bridges, as done currently, are time-consuming, error-prone, cumbersome, and lead to information loss. A fully digitized workflow using open data formats would reduce data loss, efforts, and the costs of future inspections. On the one hand, existing studies proposed methods to automatize data acquisition and visualization for inspections. These studies lack an open standard to make the gathered data available for other processes. On the other hand, several studies discuss data structures for exchanging damage information among different stakeholders. However, those studies do not cover the process of automatic data acquisition and transfer. This study focuses on a framework that incorporates automatic damage data acquisition, transfer, and a damage information model for data exchange. This enables inspectors to use damage data for subsequent analyses and simulations. The proposed framework shows the potentials for a comprehensive damage information model and related (semi-)automatic data acquisition and processing.
This study aims to evaluate a new approach in modeling gully erosion susceptibility (GES) based on a deep learning neural network (DLNN) model and an ensemble particle swarm optimization (PSO) algorithm with DLNN (PSO-DLNN), comparing these approaches with common artificial neural network (ANN) and support vector machine (SVM) models in Shirahan watershed, Iran. For this purpose, 13 independent variables affecting GES in the study area, namely, altitude, slope, aspect, plan curvature, profile curvature, drainage density, distance from a river, land use, soil, lithology, rainfall, stream power index (SPI), and topographic wetness index (TWI), were prepared. A total of 132 gully erosion locations were identified during field visits. To implement the proposed model, the dataset was divided into the two categories of training (70%) and testing (30%). The results indicate that the area under the curve (AUC) value from receiver operating characteristic (ROC) considering the testing datasets of PSO-DLNN is 0.89, which indicates superb accuracy. The rest of the models are associated with optimal accuracy and have similar results to the PSO-DLNN model; the AUC values from ROC of DLNN, SVM, and ANN for the testing datasets are 0.87, 0.85, and 0.84, respectively. The efficiency of the proposed model in terms of prediction of GES was increased. Therefore, it can be concluded that the DLNN model and its ensemble with the PSO algorithm can be used as a novel and practical method to predict gully erosion susceptibility, which can help planners and managers to manage and reduce the risk of this phenomenon.
Piping erosion is one form of water erosion that leads to significant changes in the landscape and environmental degradation. In the present study, we evaluated piping erosion modeling in the Zarandieh watershed of Markazi province in Iran based on random forest (RF), support vector machine (SVM), and Bayesian generalized linear models (Bayesian GLM) machine learning algorithms. For this goal, due to the importance of various geo-environmental and soil properties in the evolution and creation of piping erosion, 18 variables were considered for modeling the piping erosion susceptibility in the Zarandieh watershed. A total of 152 points of piping erosion were recognized in the study area that were divided into training (70%) and validation (30%) for modeling. The area under curve (AUC) was used to assess the effeciency of the RF, SVM, and Bayesian GLM. Piping erosion susceptibility results indicated that all three RF, SVM, and Bayesian GLM models had high efficiency in the testing step, such as the AUC shown with values of 0.9 for RF, 0.88 for SVM, and 0.87 for Bayesian GLM. Altitude, pH, and bulk density were the variables that had the greatest influence on the piping erosion susceptibility in the Zarandieh watershed. This result indicates that geo-environmental and soil chemical variables are accountable for the expansion of piping erosion in the Zarandieh watershed.
In machine learning, if the training data is independently and identically distributed as the test data then a trained model can make an accurate predictions for new samples of data. Conventional machine learning has a strong dependence on massive amounts of training data which are domain specific to understand their latent patterns. In contrast, Domain adaptation and Transfer learning methods are sub-fields within machine learning that are concerned with solving the inescapable problem of insufficient training data by relaxing the domain dependence hypothesis. In this contribution, this issue has been addressed and by making a novel combination of both the methods we develop a computationally efficient and practical algorithm to solve boundary value problems based on nonlinear partial differential equations. We adopt a meshfree analysis framework to integrate the prevailing geometric modelling techniques based on NURBS and present an enhanced deep collocation approach that also plays an important role in the accuracy of solutions. We start with a brief introduction on how these methods expand upon this framework. We observe an excellent agreement between these methods and have shown that how fine-tuning a pre-trained network to a specialized domain may lead to an outstanding performance compare to the existing ones. As proof of concept, we illustrate the performance of our proposed model on several benchmark problems.
Temporary changes in precipitation may lead to sustained and severe drought or massive floods in different parts of the world. Knowing the variation in precipitation can effectively help the water resources decision-makers in water resources management. Large-scale circulation drivers have a considerable impact on precipitation in different parts of the world. In this research, the impact of El Niño-Southern Oscillation (ENSO), Pacific Decadal Oscillation (PDO), and North Atlantic Oscillation (NAO) on seasonal precipitation over Iran was investigated. For this purpose, 103 synoptic stations with at least 30 years of data were utilized. The Spearman correlation coefficient between the indices in the previous 12 months with seasonal precipitation was calculated, and the meaningful correlations were extracted. Then, the month in which each of these indices has the highest correlation with seasonal precipitation was determined. Finally, the overall amount of increase or decrease in seasonal precipitation due to each of these indices was calculated. Results indicate the Southern Oscillation Index (SOI), NAO, and PDO have the most impact on seasonal precipitation, respectively. Additionally, these indices have the highest impact on the precipitation in winter, autumn, spring, and summer, respectively. SOI has a diverse impact on winter precipitation compared to the PDO and NAO, while in the other seasons, each index has its special impact on seasonal precipitation. Generally, all indices in different phases may decrease the seasonal precipitation up to 100%. However, the seasonal precipitation may increase more than 100% in different seasons due to the impact of these indices. The results of this study can be used effectively in water resources management and especially in dam operation.
Cyber security has become a major concern for users and businesses alike. Cyberstalking and harassment have been identified as a growing anti-social problem. Besides detecting cyberstalking and harassment, there is the need to gather digital evidence, often by the victim. To this end, we provide an overview of and discuss relevant technological means, in particular coming from text analytics as well as machine learning, that are capable to address the above challenges. We present a framework for the detection of text-based cyberstalking and the role and challenges of some core techniques such as author identification, text classification and personalisation. We then discuss PAN, a network and evaluation initiative that focusses on digital text forensics, in particular author identification.
A vast number of existing buildings were constructed before the development and enforcement of seismic design codes, which run into the risk of being severely damaged under the action of seismic excitations. This poses not only a threat to the life of people but also affects the socio-economic stability in the affected area. Therefore, it is necessary to assess such buildings’ present vulnerability to make an educated decision regarding risk mitigation by seismic strengthening techniques such as retrofitting. However, it is economically and timely manner not feasible to inspect, repair, and augment every old building on an urban scale. As a result, a reliable rapid screening methods, namely Rapid Visual Screening (RVS), have garnered increasing interest among researchers and decision-makers alike. In this study, the effectiveness of five different Machine Learning (ML) techniques in vulnerability prediction applications have been investigated. The damage data of four different earthquakes from Ecuador, Haiti, Nepal, and South Korea, have been utilized to train and test the developed models. Eight performance modifiers have been implemented as variables with a supervised ML. The investigations on this paper illustrate that the assessed vulnerability classes by ML techniques were very close to the actual damage levels observed in the buildings.
Earthquake is among the most devastating natural disasters causing severe economical, environmental, and social destruction. Earthquake safety assessment and building hazard monitoring can highly contribute to urban sustainability through identification and insight into optimum materials and structures. While the vulnerability of structures mainly depends on the structural resistance, the safety assessment of buildings can be highly challenging. In this paper, we consider the Rapid Visual Screening (RVS) method, which is a qualitative procedure for estimating structural scores for buildings suitable for medium- to high-seismic cases. This paper presents an overview of the common RVS methods, i.e., FEMA P-154, IITK-GGSDMA, and EMPI. To examine the accuracy and validation, a practical comparison is performed between their assessment and observed damage of reinforced concrete buildings from a street survey in the Bingöl region, Turkey, after the 1 May 2003 earthquake. The results demonstrate that the application of RVS methods for preliminary damage estimation is a vital tool. Furthermore, the comparative analysis showed that FEMA P-154 creates an assessment that overestimates damage states and is not economically viable, while EMPI and IITK-GGSDMA provide more accurate and practical estimation, respectively.
The economic losses from earthquakes tend to hit the national economy considerably; therefore, models that are capable of estimating the vulnerability and losses of future earthquakes are highly consequential for emergency planners with the purpose of risk mitigation. This demands a mass prioritization filtering of structures to identify vulnerable buildings for retrofitting purposes. The application of advanced structural analysis on each building to study the earthquake response is impractical due to complex calculations, long computational time, and exorbitant cost. This exhibits the need for a fast, reliable, and rapid method, commonly known as Rapid Visual Screening (RVS). The method serves as a preliminary screening platform, using an optimum number of seismic parameters of the structure and predefined output damage states. In this study, the efficacy of the Machine Learning (ML) application in damage prediction through a Support Vector Machine (SVM) model as the damage classification technique has been investigated. The developed model was trained and examined based on damage data from the 1999 Düzce Earthquake in Turkey, where the building’s data consists of 22 performance modifiers that have been implemented with supervised machine learning.
The latest earthquakes have proven that several existing buildings, particularly in developing countries, are not secured from damages of earthquake. A variety of statistical and machine-learning approaches have been proposed to identify vulnerable buildings for the prioritization of retrofitting. The present work aims to investigate earthquake susceptibility through the combination of six building performance variables that can be used to obtain an optimal prediction of the damage state of reinforced concrete buildings using artificial neural network (ANN). In this regard, a multi-layer perceptron network is trained and optimized using a database of 484 damaged buildings from the Düzce earthquake in Turkey. The results demonstrate the feasibility and effectiveness of the selected ANN approach to classify concrete structural damage that can be used as a preliminary assessment technique to identify vulnerable buildings in disaster risk-management programs.
Coronary Artery Disease Diagnosis: Ranking the Significant Features Using a Random Trees Model
(2020)
Heart disease is one of the most common diseases in middle-aged citizens. Among the vast number of heart diseases, coronary artery disease (CAD) is considered as a common cardiovascular disease with a high death rate. The most popular tool for diagnosing CAD is the use of medical imaging, e.g., angiography. However, angiography is known for being costly and also associated with a number of side effects. Hence, the purpose of this study is to increase the accuracy of coronary heart disease diagnosis through selecting significant predictive features in order of their ranking. In this study, we propose an integrated method using machine learning. The machine learning methods of random trees (RTs), decision tree of C5.0, support vector machine (SVM), and decision tree of Chi-squared automatic interaction detection (CHAID) are used in this study. The proposed method shows promising results and the study confirms that the RTs model outperforms other models.
The longitudinal dispersion coefficient (LDC) plays an important role in modeling the transport of pollutants and sediment in natural rivers. As a result of transportation processes, the concentration of pollutants changes along the river. Various studies have been conducted to provide simple equations for estimating LDC. In this study, machine learning methods, namely support vector regression, Gaussian process regression, M5 model tree (M5P) and random forest, and multiple linear regression were examined in predicting the LDC in natural streams. Data sets from 60 rivers around the world with different hydraulic and geometric features were gathered to develop models for LDC estimation. Statistical criteria, including correlation coefficient (CC), root mean squared error (RMSE) and mean absolute error (MAE), were used to scrutinize the models. The LDC values estimated by these models were compared with the corresponding results of common empirical models. The Taylor chart was used to evaluate the models and the results showed that among the machine learning models, M5P had superior performance, with CC of 0.823, RMSE of 454.9 and MAE of 380.9. The model of Sahay and Dutta, with CC of 0.795, RMSE of 460.7 and MAE of 306.1, gave more precise results than the other empirical models. The main advantage of M5P models is their ability to provide practical formulae. In conclusion, the results proved that the developed M5P model with simple formulations was superior to other machine learning models and empirical models; therefore, it can be used as a proper tool for estimating the LDC in rivers.
The seismic vulnerability assessment of existing reinforced concrete (RC) buildings is a significant source of disaster mitigation plans and rescue services. Different countries evolved various Rapid Visual Screening (RVS) techniques and methodologies to deal with the devastating consequences of earthquakes on the structural characteristics of buildings and human casualties. Artificial intelligence (AI) methods, such as machine learning (ML) algorithm-based methods, are increasingly used in various scientific and technical applications. The investigation toward using these techniques in civil engineering applications has shown encouraging results and reduced human intervention, including uncertainties and biased judgment. In this study, several known non-parametric algorithms are investigated toward RVS using a dataset employing different earthquakes. Moreover, the methodology encourages the possibility of examining the buildings’ vulnerability based on the factors related to the buildings’ importance and exposure. In addition, a web-based application built on Django is introduced. The interface is designed with the idea to ease the seismic vulnerability investigation in real-time. The concept was validated using two case studies, and the achieved results showed the proposed approach’s potential efficiency
One of the most important subjects of hydraulic engineering is the reliable estimation of the transverse distribution in the rectangular channel of bed and wall shear stresses. This study makes use of the Tsallis entropy, genetic programming (GP) and adaptive neuro-fuzzy inference system (ANFIS) methods to assess the shear stress distribution (SSD) in the rectangular channel.
To evaluate the results of the Tsallis entropy, GP and ANFIS models, laboratory observations were used in which shear stress was measured using an optimized Preston tube. This is then used to measure the SSD in various aspect ratios in the rectangular channel. To investigate the shear stress percentage, 10 data series with a total of 112 different data for were used. The results of the sensitivity analysis show that the most influential parameter for the SSD in smooth rectangular channel is the dimensionless parameter B/H, Where the transverse coordinate is B, and the flow depth is H. With the parameters (b/B), (B/H) for the bed and (z/H), (B/H) for the wall as inputs, the modeling of the GP was better than the other one. Based on the analysis, it can be concluded that the use of GP and ANFIS algorithms is more effective in estimating shear stress in smooth rectangular channels than the Tsallis entropy-based equations.
The study presents a Machine Learning (ML)-based framework designed to forecast the stress-strain relationship of arc-direct energy deposited mild steel. Based on microstructural characteristics previously extracted using microscopy and X-ray diffraction, approximately 1000 new parameter sets are generated by applying the Latin Hypercube Sampling Method (LHSM). For each parameter set, a Representative Volume Element (RVE) is synthetically created via Voronoi Tessellation. Input raw data for ML-based algorithms comprises these parameter sets or RVE-images, while output raw data includes their corresponding stress-strain relationships calculated after a Finite Element (FE) procedure. Input data undergoes preprocessing involving standardization, feature selection, and image resizing. Similarly, the stress-strain curves, initially unsuitable for training traditional ML algorithms, are preprocessed using cubic splines and occasionally Principal Component Analysis (PCA). The later part of the study focuses on employing multiple ML algorithms, utilizing two main models. The first model predicts stress-strain curves based on microstructural parameters, while the second model does so solely from RVE images. The most accurate prediction yields a Root Mean Squared Error of around 5 MPa, approximately 1% of the yield stress. This outcome suggests that ML models offer precise and efficient methods for characterizing dual-phase steels, establishing a framework for accurate results in material analysis.
Polylactic acid (PLA) is a highly applicable material that is used in 3D printers due to some significant features such as its deformation property and affordable cost. For improvement of the end-use quality, it is of significant importance to enhance the quality of fused filament fabrication (FFF)-printed objects in PLA. The purpose of this investigation was to boost toughness and to reduce the production cost of the FFF-printed tensile test samples with the desired part thickness. To remove the need for numerous and idle printing samples, the response surface method (RSM) was used. Statistical analysis was performed to deal with this concern by considering extruder temperature (ET), infill percentage (IP), and layer thickness (LT) as controlled factors. The artificial intelligence method of artificial neural network (ANN) and ANN-genetic algorithm (ANN-GA) were further developed to estimate the toughness, part thickness, and production-cost-dependent variables. Results were evaluated by correlation coefficient and RMSE values. According to the modeling results, ANN-GA as a hybrid machine learning (ML) technique could enhance the accuracy of modeling by about 7.5, 11.5, and 4.5% for toughness, part thickness, and production cost, respectively, in comparison with those for the single ANN method. On the other hand, the optimization results confirm that the optimized specimen is cost-effective and able to comparatively undergo deformation, which enables the usability of printed PLA objects.
In this research, an attempt was made to reduce the dimension of wavelet-ANFIS/ANN (artificial neural network/adaptive neuro-fuzzy inference system) models toward reliable forecasts as well as to decrease computational cost. In this regard, the principal component analysis was performed on the input time series decomposed by a discrete wavelet transform to feed the ANN/ANFIS models. The models were applied for dissolved oxygen (DO) forecasting in rivers which is an important variable affecting aquatic life and water quality. The current values of DO, water surface temperature, salinity, and turbidity have been considered as the input variable to forecast DO in a three-time step further. The results of the study revealed that PCA can be employed as a powerful tool for dimension reduction of input variables and also to detect inter-correlation of input variables. Results of the PCA-wavelet-ANN models are compared with those obtained from wavelet-ANN models while the earlier one has the advantage of less computational time than the later models. Dealing with ANFIS models, PCA is more beneficial to avoid wavelet-ANFIS models creating too many rules which deteriorate the efficiency of the ANFIS models. Moreover, manipulating the wavelet-ANFIS models utilizing PCA leads to a significant decreasing in computational time. Finally, it was found that the PCA-wavelet-ANN/ANFIS models can provide reliable forecasts of dissolved oxygen as an important water quality indicator in rivers.
This research aims to model soil temperature (ST) using machine learning models of multilayer perceptron (MLP) algorithm and support vector machine (SVM) in hybrid form with the Firefly optimization algorithm, i.e. MLP-FFA and SVM-FFA. In the current study, measured ST and meteorological parameters of Tabriz and Ahar weather stations in a period of 2013–2015 are used for training and testing of the studied models with one and two days as a delay. To ascertain conclusive results for validation of the proposed hybrid models, the error metrics are benchmarked in an independent testing period. Moreover, Taylor diagrams utilized for that purpose. Obtained results showed that, in a case of one day delay, except in predicting ST at 5 cm below the soil surface (ST5cm) at Tabriz station, MLP-FFA produced superior results compared with MLP, SVM, and SVM-FFA models. However, for two days delay, MLP-FFA indicated increased accuracy in predicting ST5cm and ST 20cm of Tabriz station and ST10cm of Ahar station in comparison with SVM-FFA. Additionally, for all of the prescribed models, the performance of the MLP-FFA and SVM-FFA hybrid models in the testing phase was found to be meaningfully superior to the classical MLP and SVM models.
Pressure fluctuations beneath hydraulic jumps potentially endanger the stability of stilling basins. This paper deals with the mathematical modeling of the results of laboratory-scale experiments to estimate the extreme pressures. Experiments were carried out on a smooth stilling basin underneath free hydraulic jumps downstream of an Ogee spillway. From the probability distribution of measured instantaneous pressures, pressures with different probabilities could be determined. It was verified that maximum pressure fluctuations, and the negative pressures, are located at the positions near the spillway toe. Also, minimum pressure fluctuations are located at the downstream of hydraulic jumps. It was possible to assess the cumulative curves of pressure data related to the characteristic points along the basin, and different Froude numbers. To benchmark the results, the dimensionless forms of statistical parameters include mean pressures (P*m), the standard deviations of pressure fluctuations (σ*X), pressures with different non-exceedance probabilities (P*k%), and the statistical coefficient of the probability distribution (Nk%) were assessed. It was found that an existing method can be used to interpret the present data, and pressure distribution in similar conditions, by using a new second-order fractional relationships for σ*X, and Nk%. The values of the Nk% coefficient indicated a single mean value for each probability.