Evaluation of electrical efficiency of photovoltaic thermal solar collector

Solar energy is a widely utilized renewable energy resource and has the lowest emissions among renewable energies. In this study, the machine learning methods of artificial neural networks (ANNs), least squares support vector machines (LSSVM), and adaptive neuro-fuzzy inference systems (ANFIS) are used to develop prediction models for the thermal performance of a photovoltaic-thermal (PV/T) solar collector. In the proposed models, the inlet temperature, flow rate, heat, solar radiation, and the sun heat are considered as the input variables. The data set was obtained through experimental measurements on a novel solar collector system. Several analyses are performed to examine the credibility of the introduced approaches and to evaluate their performance. The proposed LSSVM model outperformed the ANFIS and ANN models. The LSSVM model is reported to be suitable when laboratory measurements are costly and time-consuming, or when obtaining such values requires sophisticated interpretation.


Introduction
Developing more efficient systems and utilizing other energy resources are gaining significance as the amount of available fossil fuel resources declines.
There are several renewable energy sources that can be exploited to satisfy the demands of the energy sector (Qin, 2015). However, solar energy is receiving more attention since it is available almost everywhere and is regarded as clean energy with no harmful effect on the environment (Al-Maamary, Kazem, & Chaichan, 2017; Bong et al., 2017; Kannan & Vakeesan, 2016; Twidell & Weir, 2015). Solar energy is useful for various applications, including heating, cooling, and electricity production. Solar energy utilization is classified into two approaches: passive and active. In the passive approach, no extra equipment is required and the sun's radiation is utilized directly. In the active approach, mechanical components are necessary and the conversion of solar energy to another form of energy is not direct. Solar collectors belong to the active approach of converting solar energy to a targeted type of energy (Kannan & Vakeesan, 2016; Lewis, 2016; Modi, Bühler, Andreasen, & Haglind, 2017; Sijm, 2017; Wagh & Walke, 2017). Several factors affect the performance of solar-related systems, including the absorption specifications of the applied materials, the solar radiation of the region, and the operating conditions such as temperature and daylight hours (Qin, 2016; Qin, Liang, Luo, Tan, & Zhu, 2016; Qin, Liang, Tan, & Li, 2016). These parameters must be considered when modeling and designing solar energy technologies.
A solar collector is defined as equipment that gathers the sun's rays, absorbs their thermal energy, and delivers it to a working fluid, mostly air or water. The thermal energy transferred to the working fluid can be stored in a storage tank to be used when solar energy is insufficient or unavailable (e.g., during the night). Photovoltaic panels use solar irradiation to produce electricity. During this electricity production process, a considerable amount of waste heat is also generated, which can be recovered by integrating a network of tubes containing a heat-transfer fluid (Ahmad, Saidur, Mahbubul, & Al-Sulaiman, 2017; Kumar, Prakash, & Kaviti, 2017).
Photovoltaic panels transform solar energy into electrical energy. Photovoltaic (PV) cells suffer from low efficiency due to high heat. The novel design of the electrical-thermal interaction in a hybrid photovoltaic/thermal (PV/T) collector is reported as an alternative that increases efficiency through heat dissipation (A. K. Pandey et al., 2016). Solar collectors are categorized into two classes based on tracking: fixed collectors, which have no tracking system, and tracking collectors, which are equipped with a tracker that follows the sunlight during the day. Fixed collectors do not move, while tracking collectors move so that the incoming sun rays remain perpendicular to the collector surface. Flat plate collectors and evacuated tube collectors are classified as fixed collectors. Tracking collectors have two subclasses: single-axis tracking and double-axis tracking. The former comprises three groups: parabolic trough collectors, cylindrical trough collectors, and linear Fresnel collectors. Examples of the latter are central tower collectors, parabolic dish collectors, and circular Fresnel lenses.
Each of the mentioned technologies has its specific applications, depending on the required and available amounts of energy and on climatic considerations (Fuqiang et al., 2017; Hussain et al., 2013; K. M. Pandey & Chaurasiya, 2017).
Predictive models are widely used for pattern recognition and for estimating the behavior of various systems and technologies (Qin, Liang, Tan, & Li, 2017; Ramezanizadeh et al., 2019). Several methods have been developed to predict the quantity of solar energy production. The primary methods fall into two approaches: cloud imagery integrated with physical models, and machine learning. The prediction horizon is the deciding factor in selecting between the methods.
Since calculating the thermal efficiency by conventional solution methods requires solving complicated differential equations, which is time-consuming, the use of machine learning methods is considered. These methods can provide accurate predictions of the studied process while saving time and cost compared to laboratory methods. In this research, soft-computing techniques were employed to forecast the efficiency of the PV/T collector. The selected approaches are MLP-ANN, ANFIS, and LSSVM. The sun heat, flow rate, inlet temperature, and solar radiation are considered as the input variables for training and testing the machine learning models, with the electrical efficiency as the output.

The adaptive neuro-fuzzy inference model
The main task of the adaptive neuro-fuzzy inference system (ANFIS) is to discover fuzzy decision rules in a feed-forward framework. The structure of a conventional ANFIS based on the first-order Takagi-Sugeno inference model is shown in Fig. 1.
Figure 1. Structure of a typical ANFIS.
The ANFIS model consists of five layers. As shown in Fig. 1, the inputs x and y are fed into the model and the output f results. In this model, two if-then fuzzy rules are defined as follows (Brown & Harris, 1994; Lin & Lee, 1996):
Rule 1: if x is $\alpha_1$ and y is $\beta_1$, then $f_1 = m_1 x + n_1 y + r_1$ (1)
Rule 2: if x is $\alpha_2$ and y is $\beta_2$, then $f_2 = m_2 x + n_2 y + r_2$ (2)
where $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ are the fuzzy sets for x and y. Furthermore, the variables $m_1$, $n_1$, $r_1$, $m_2$, $n_2$, and $r_2$ are the parameters determined by the training workflow.
The node functions are defined in every layer as follows.
Layer I is the fuzzification layer. Each node i is an adaptive node, and its output is:
$O_{1,i} = \mu_{\alpha_i}(x)$ or $O_{1,i} = \mu_{\beta_i}(y)$
where x and y are the inputs to node i, and $\mu_{\alpha_i}$ and $\mu_{\beta_i}$ are the fuzzy membership functions.
Layer II consists of fixed nodes labeled M. The incoming signals are multiplied and the product is passed on as the output:
$w_i = \mu_{\alpha_i}(x)\,\mu_{\beta_i}(y), \quad i = 1, 2$
Layer III is the normalization layer. The i-th node, labeled N, calculates the normalized firing strength as follows:
$\bar{w}_i = \dfrac{w_i}{w_1 + w_2}, \quad i = 1, 2$
Layer IV defuzzifies the data. Every node i is an adaptive node with the node function:
$\bar{w}_i f_i = \bar{w}_i (m_i x + n_i y + r_i)$
The parameter set of this node is $m_i$, $n_i$, and $r_i$.
Layer V is the final layer. A single fixed node, labeled E, sums all incoming signals to give the overall output:
$f = \sum_i \bar{w}_i f_i = \dfrac{\sum_i w_i f_i}{\sum_i w_i}$
As mentioned above, the tuning parameters in the ANFIS structure are its membership parameters. These parameters can be determined optimally using evolutionary and optimization algorithms, e.g. PSO, GA, ACO, ICA. In the current study, these parameters are optimized using the PSO algorithm.
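The five layers described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: Gaussian membership functions are assumed for Layer I, the Layer II nodes are assumed to multiply their inputs, and all parameter values and function names (`gaussian_mf`, `anfis_forward`) are hypothetical, not taken from the study.

```python
import math

def gaussian_mf(value, center, sigma):
    """Gaussian membership grade of `value` (assumed membership shape)."""
    return math.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x, y, premise, consequent):
    """Two-rule first-order Takagi-Sugeno ANFIS output.

    premise:    per rule, ((center_x, sigma_x), (center_y, sigma_y))
    consequent: per rule, (m, n, r) so that f_i = m*x + n*y + r
    """
    # Layer I: fuzzification of both inputs for every rule.
    grades = [(gaussian_mf(x, *px), gaussian_mf(y, *py)) for px, py in premise]
    # Layer II: firing strength = product of the incoming membership grades.
    w = [gx * gy for gx, gy in grades]
    # Layer III: normalise the firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer IV: weight each rule's linear consequent.
    weighted = [wb * (m * x + n * y + r) for wb, (m, n, r) in zip(w_bar, consequent)]
    # Layer V: sum all incoming signals.
    return sum(weighted)

# Hypothetical parameters, chosen only to exercise the function.
premise = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
output = anfis_forward(0.5, 0.5, premise, consequent)
```

With these symmetric toy parameters both rules fire equally, so the output is simply the average of the two rule consequents.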

The multi-layer perceptron artificial neural network model
The ANNs are composed of several input, output, and hidden neural layers (Mitchell, 1997; Schalkoff, 1997; Yegnanarayana, 2009). In the expression for the network output, $w_i$ denotes the weight values, n represents the number of neurons in the hidden layer, $w_{i,3}$ indicates the weight values, and $b_3$ is the bias. The output is named Z. Moreover, the ANN is trained and optimized using the Back Propagation (BP) algorithm. During the training stage, the optimum values of the weights and biases are calculated. When the biases and weights reach their optimum values, the discrepancy between the prediction of the ANN model and the measured data is minimized. In the expression for the prediction error, p denotes the number of training data, and the remaining symbols denote the i-th neuron belonging to the l-th output layer and the i-th real output corresponding to the p-th training datum, respectively.
Based on Eq. (14) and Eq. (15), the BP algorithm is used to update the bias and weight terms. Here, λ indicates the learning rate, and k denotes the iteration number.
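To make the back-propagation update concrete, the following minimal sketch trains a one-hidden-layer perceptron on a single sample by plain gradient descent with learning rate λ. It rests on assumptions not stated in the text: tanh hidden units, a linear output, a squared-error loss of the form 0.5·(Z − target)², and hypothetical starting weights. It is not the Levenberg-Marquardt routine of the MATLAB toolbox.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer perceptron with tanh hidden units and linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return hidden, z

def train_step(x, target, w_hidden, b_hidden, w_out, b_out, lam=0.1):
    """One gradient-descent update for the squared error 0.5*(z - target)**2."""
    hidden, z = forward(x, w_hidden, b_hidden, w_out, b_out)
    err = z - target
    # Back-propagate the error to the hidden layer (tanh' = 1 - tanh^2).
    delta = [err * w * (1.0 - h * h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lam * err * h for w, h in zip(w_out, hidden)]
    new_b_out = b_out - lam * err
    new_w_hidden = [[w - lam * d * xi for w, xi in zip(ws, x)]
                    for ws, d in zip(w_hidden, delta)]
    new_b_hidden = [b - lam * d for b, d in zip(b_hidden, delta)]
    return new_w_hidden, new_b_hidden, new_w_out, new_b_out

# Hypothetical toy problem: fit one sample and watch the error shrink.
x, target = [0.5, -0.2], 0.3
params = ([[0.1, -0.3], [0.2, 0.4]], [0.0, 0.0], [0.5, -0.5], 0.0)
error_before = 0.5 * (forward(x, *params)[1] - target) ** 2
for _ in range(200):
    params = train_step(x, target, *params)
error_after = 0.5 * (forward(x, *params)[1] - target) ** 2
```

Each call to `train_step` corresponds to one iteration k of the update; a real training run would loop over all training samples rather than a single one.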

The radial basis function artificial neural network model
The structure of the radial basis function artificial neural network (RBF-ANN) is shown in Fig. 3. The RBF-ANN contains many interconnected neurons and is composed of three layers: input, hidden, and output (Wasserman, 1993). The input layer passes the input parameters to the transfer function, and the number of nodes in the input layer equals the number of model input parameters. The hidden layer is the most notable part of the RBF-ANN, and radial symmetry is a prominent feature of its nodes. Finally, the output of the model is generated by applying the weight factors that link the hidden-layer nodes to the output-layer node. The MLP is structurally analogous to the RBF-ANN. However, the calculation process differs: the RBF-ANN uses a single hidden layer, whereas the MLP-ANN employs multiple interconnected hidden layers. Before applying the RBF-ANN, the activation function of the hidden layer is defined and the maximum number of neurons is specified. Here, each neuron is considered a processing unit of the network. Besides, the assessment of the optimum values is a crucial task, since the network is modified based on this assessment. Weight factors are used to train the RBF-ANN (Park & Sandberg, 1993).
The essential traits of the RBF-ANN are listed as follows:
▪ Triple-layer structure.
▪ Gaussian activation functions are used in the hidden layer.
▪ Weights are applied to the hidden-layer outputs and passed to the output layer.
▪ An acceptable degree of interpolation.
In the interpolation algorithm, each input vector $x^p$ is mapped to its corresponding target value $t^p$. Thus, each input vector requires an activation function of the form $\varphi(\lVert x - x^p \rVert)$. Here, $\varphi$ is the activation function and $\lVert x - x^p \rVert$ denotes the Euclidean distance between x and $x^p$. The output is calculated as follows:
$f(x) = \sum_q w_q\, \varphi(\lVert x - x^q \rVert)$
where $w_q$ is the weight factor and $x^q$ denotes the q-th input vector. In other words, the weight terms are regulated through the interpolation process so as to come close to Eq. (17):
$f(x^p) = t^p$
Among the available activation functions, the Gaussian activation function is the most widely used. This function is defined as follows:
$\varphi(r) = \exp\left(-\dfrac{r^2}{2\sigma^2}\right)$
where $\varphi$ and r denote the interpolating function and the distance between a center c and the position of the data point x, respectively, and $\sigma$ is the width of the Gaussian.
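As an illustration of the interpolation idea, the sketch below evaluates a Gaussian RBF network and checks that it reproduces its targets at the training points. The two-point data set, the width `sigma = 1`, and the closed-form 2×2 solve are assumptions made only to keep the example short; a realistic network would solve a larger linear system.

```python
import math

def gaussian(r, sigma=1.0):
    """Gaussian basis function of the distance r (sigma is an assumed width)."""
    return math.exp(-(r ** 2) / (2.0 * sigma ** 2))

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def rbf_output(x, centers, weights, sigma=1.0):
    """Weighted sum of Gaussian responses, one hidden node per center."""
    return sum(w * gaussian(distance(x, c), sigma) for w, c in zip(weights, centers))

# Hypothetical two-point data set: the inputs double as the centers.
centers = [[0.0, 0.0], [1.0, 1.0]]
targets = [2.0, -1.0]

# Exact interpolation needs Phi w = t; for two points Phi = [[1, a], [a, 1]].
a = gaussian(distance(centers[0], centers[1]))
det = 1.0 - a * a
weights = [(targets[0] - a * targets[1]) / det,
           (targets[1] - a * targets[0]) / det]

fitted = [rbf_output(c, centers, weights) for c in centers]
```

After solving for the weights, `fitted` matches `targets` up to rounding, which is the interpolation condition in miniature.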

The least squares support vector machine model
In the SVM regression function $f(x) = w^T \phi(x) + b$, $\phi(x)$ and $w^T$ denote the kernel function and the output layer vector, respectively.
Furthermore, b and x represent the bias and the inputs, arranged in an N×n matrix, where N denotes the number of training data and n the number of input parameters. Vapnik presented a rigorous procedure to obtain the weight and bias. In this procedure, the following function must be minimized (Vapnik, Golowich, & Smola, 1997):
$\dfrac{1}{2} w^T w + c \sum_{k=1}^{N} (\xi_k + \xi_k^*)$
subject to the following constraints:
$y_k - w^T \phi(x_k) - b \le \varepsilon + \xi_k$
$w^T \phi(x_k) + b - y_k \le \varepsilon + \xi_k^*$
$\xi_k, \xi_k^* \ge 0, \quad k = 1, \dots, N$
In the above equations, $x_k$ is the k-th input, $y_k$ indicates the k-th output, $\varepsilon$ indicates the accuracy of the function estimation, and $\xi_k$ and $\xi_k^*$ denote the slack factors. Overall, the slack terms are employed to specify the allowable deviations. An adjustable term c > 0 regulates the extent to which deviations larger than $\varepsilon$ are tolerated.
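The SVM method is modified to the least squares support vector machine (LSSVM) so that the solution is obtained from a set of linear equations rather than a quadratic program, which gives a faster and more accurate response than the conventional SVM approach. The LSSVM approach is as follows:
minimize $\dfrac{1}{2} w^T w + \dfrac{\gamma}{2} \sum_{k=1}^{N} e_k^2$
subject to $y_k = w^T \phi(x_k) + b + e_k, \quad k = 1, \dots, N$
In the above equations, the training parameter is denoted by $\gamma$ and the regression error of the training steps is represented by $e_k$.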
Moreover, in comparison with the SVM method, equality constraints are used instead of the inequality constraints. The Lagrangian approach is used to solve the above problem (Eq.
Therefore, the LSSVM method reduces to solving 2N+2 equations for the unknown variables $\alpha_k$, $e_k$, w, and b.
$\gamma$ is the regularization variable of the LSSVM approach. Since both the SVM and LSSVM methods use kernel functions, further tuning parameters are required. Here, the RBF kernel has been employed:
$K(x, x_k) = \exp\left(-\dfrac{\lVert x_k - x \rVert^2}{\sigma^2}\right)$
where $\sigma^2$ acts as a tuning parameter. Therefore, the target parameters of the LSSVM can be obtained more precisely by decreasing the error between the predicted results and the actual measurements. For the LSSVM approach, the mean square error (MSE) is given as follows:
$MSE = \dfrac{1}{N} \sum_{k=1}^{N} \left(y_k^{cal.} - y_k^{exp.}\right)^2$
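The practical consequence of the equality constraints is that LSSVM training reduces to one linear solve. The sketch below assumes the commonly used dual form, in which the bias b and the Lagrange multipliers α satisfy [[0, 1ᵀ], [1, K + I/γ]]·[b; α] = [0; y], together with an RBF kernel of the form exp(−‖a − b‖²/σ²). The helper names, the toy data, and the values of γ and σ² are hypothetical, and the GA tuning used in the study is not reproduced.

```python
import math

def rbf_kernel(a, b, sigma2):
    """K(a, b) = exp(-||a - b||^2 / sigma^2)."""
    return math.exp(-sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / sigma2)

def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def lssvm_fit(inputs, outputs, gamma, sigma2):
    """Return (b, alpha) from the LSSVM linear system."""
    n = len(inputs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(inputs[i], inputs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term from the squared-error penalty
        system.append(row)
    solution = solve(system, [0.0] + list(outputs))
    return solution[0], solution[1:]

def lssvm_predict(x, inputs, b, alpha, sigma2):
    """Evaluate b + sum_k alpha_k K(x, x_k)."""
    return b + sum(a * rbf_kernel(x, xi, sigma2) for a, xi in zip(alpha, inputs))

# Hypothetical one-dimensional samples of y = x^2.
inputs = [[0.0], [0.5], [1.0], [1.5], [2.0]]
outputs = [0.0, 0.25, 1.0, 2.25, 4.0]
b, alpha = lssvm_fit(inputs, outputs, gamma=1000.0, sigma2=1.0)
prediction = lssvm_predict([1.0], inputs, b, alpha, sigma2=1.0)
```

At a training point the fitted value differs from the measured one by exactly α_k/γ, so a larger γ pulls the model closer to the data at the cost of less regularization.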

Experimental Procedure and Data Preparation
Data were gathered from a laboratory-scale PV/T setup with a new design in the layering of the thermal section. As presented in Figure 5(a), a half pipe, bonded to the absorber plate using special adhesives, is used as the fluid channel instead of a full circular tube.
The water mass flow rate is an essential factor in the PV/T system. In this study, the water mass flow rate is 1 2 to 4 lit/min and the other system parameters are recorded. The influence of the water inlet temperature (20 ℃ < Tinlet < 45 ℃) on the PV/T system was also investigated experimentally.
Andrews plots of the parameters were drawn to give visual insight into the high-dimensional data. For plotting these diagrams, the Andrews tool in the MATLAB 2018 library is used. This diagram is a non-integer version of the Kent-Kiviat radar diagram, or a smoothed version of a parallel coordinate diagram (Andrews, 1972). Curves belonging to samples of a similar class usually lie closer together and behave similarly. As can be seen in this figure, since the Andrews diagrams of the inlet temperature, heat, solar radiation, the heat of the sun, and electrical efficiency are very close together, these parameters behave similarly, while flow rate behaves very differently.
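For readers unfamiliar with Andrews diagrams, each data vector is mapped to a finite Fourier series, f(t) = x₁/√2 + x₂ sin t + x₃ cos t + x₄ sin 2t + x₅ cos 2t + …, evaluated for t in [−π, π]. The sketch below computes such a curve in Python; the sample values are hypothetical, and the MATLAB tool used in the study may scale or order the data differently.

```python
import math

def andrews_curve(sample, t):
    """Andrews function: x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + ..."""
    value = sample[0] / math.sqrt(2.0)
    for index, x in enumerate(sample[1:], start=1):
        harmonic = (index + 1) // 2
        if index % 2 == 1:
            value += x * math.sin(harmonic * t)
        else:
            value += x * math.cos(harmonic * t)
    return value

def curve_points(sample, steps=100):
    """Evaluate the curve on an even grid over [-pi, pi]."""
    grid = [-math.pi + 2.0 * math.pi * k / (steps - 1) for k in range(steps)]
    return [(t, andrews_curve(sample, t)) for t in grid]

# Hypothetical normalised data vector with five components.
sample = [0.2, -0.5, 0.1, 0.7, -0.3]
points = curve_points(sample)
```

Plotting `points` for several data vectors on the same axes gives the diagram; vectors with similar components produce curves that stay close together.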

Preprocessing Procedure
Four machine learning prediction models, MLP-ANN, RBF-ANN, ANFIS, and LSSVM, were developed in MATLAB 2018 to model the efficiency of the PV/T system. To predict the efficiency of the PV/T system, several influencing parameters are assumed to be known and are supplied as inputs to the model. These variables are inlet temperature, flow rate, heat, solar radiation, and heat of the sun. A total of 98 data points were used in the above models to forecast the desired objective.
The data were divided into two subsets, train and test: 75% of the data were used for training and the remainder for testing. The former is used to determine the adjustable parameters of the developed models, while the latter checks the precision of the models' output. To obtain a homogenized data set, Eq. (28) is used to normalize the data points to the range [-1, 1]:
$D_n = 2\,\dfrac{D - D_{min}}{D_{max} - D_{min}} - 1$ (28)
where D is the variable, the subscript n stands for normalized, and the subscripts min and max denote the minimum and maximum values of the corresponding variable. In these models, inlet temperature, flow rate, heat, solar radiation, and heat of the sun are the inputs of the problem, while the electrical efficiency is the target.
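The preprocessing can be sketched as follows: a min-max scaling onto [-1, 1] and a 75/25 split. The random shuffling, the seed, and the example temperatures are assumptions for illustration; the text does not say how the training points were selected.

```python
import random

def normalize(values):
    """Map values linearly onto [-1, 1] using their own minimum and maximum."""
    low, high = min(values), max(values)
    return [2.0 * (v - low) / (high - low) - 1.0 for v in values]

def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle row indices reproducibly and split them into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(train_fraction * len(rows))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test

# Hypothetical inlet temperatures in degrees Celsius.
inlet_temperature = [20.0, 25.0, 32.5, 45.0]
scaled = normalize(inlet_temperature)

# Stand-in for the 98 measured data points.
rows = list(range(98))
train, test = train_test_split(rows)
```

In practice the minimum and maximum would typically be taken from the training subset and reused for the test subset, so that no test information leaks into the scaling.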

ANN
In this study, RBF and MLP networks are implemented to model the electrical efficiency of the PV/T collector. Seven hidden neurons, a number obtained by trial and error, were used in training to determine the target parameter by minimizing the distance between the forecasted and measured data. The ANN model was built with the ANN toolbox of MATLAB, and the Levenberg-Marquardt (LM) algorithm was chosen, owing to its applicability in optimization problems, to determine the optimal weight and bias values.
The mean squared error of the forecasted values obtained from the MLP model is depicted in Fig. 11. Moreover, Table 1 presents the optimum bias and weight values. Besides, the RBF-ANN model is trained with the Levenberg-Marquardt algorithm for 50 iterations. Training a radial basis network is generally less time-consuming than training a sigmoid or linear network. The performance of the RBF-ANN method over the iterations is shown in Fig. 12.

ANFIS method
To develop the ANFIS model, the particle swarm optimization (PSO) approach was used. The total number of ANFIS variables, $N_T$, depends on the number of clusters, $N_c$, the number of variables, $N_v$, and the number of membership function variables, $N_{MF}$, as follows:
$N_T = N_c\, N_v\, N_{MF}$
The membership function used in this study is the Gaussian membership function, whose two variables are Z and $\sigma^2$. The primary input parameters are sun heat, inlet temperature, flow rate, and solar radiation. Seven clusters are initially considered. Hence, the overall number of ANFIS parameters is 84. To obtain the optimum ANFIS parameters, the RMSE between the experimentally measured and the forecasted values is used as the cost function of the PSO algorithm. The RMSE at each iteration is shown in Fig. 13. The trained membership functions for the input data are illustrated in Fig. 14.
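As a quick check of the stated parameter count, the figure of 84 is reproduced if the total is taken as the product of the three quantities and the number of variables is assumed to be six (five inputs plus the one output). Both points are assumptions; the paragraph itself lists only four inputs.

```latex
N_T = N_c \times N_v \times N_{MF} = 7 \times 6 \times 2 = 84
```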

LSSVM
The LSSVM approach employs two tuning variables in its algorithm, $\gamma$ and $\sigma^2$. Here, $\gamma$ is the regularization variable and $\sigma^2$ is the variable of the RBF kernel.
Moreover, the LSSVM method is hybridized with a genetic algorithm (GA) to determine the optimum values for the introduced model.

Models' Evaluation
Different statistical criteria, such as R-squared and the root mean squared error (RMSE), can be used to evaluate the confidence, reliability, and accuracy of the models (Qin & Hiller, 2016; Qin, Hiller, & Bao, 2013). In this research, the proposed approaches are evaluated using several such statistical criteria. In their definitions, N denotes the number of data points, the superscripts exp. and cal. denote values obtained experimentally and by calculation, respectively, and an overbar indicates the mean efficiency obtained through experimental measurements.
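Two of the named criteria can be written out as a short sketch. The definitions below are the commonly used forms of RMSE and R-squared and are assumed, not confirmed, to match those used in the study; the efficiency values are hypothetical.

```python
import math

def rmse(experimental, calculated):
    """Root mean squared error between measured and calculated values."""
    n = len(experimental)
    return math.sqrt(sum((e - c) ** 2 for e, c in zip(experimental, calculated)) / n)

def r_squared(experimental, calculated):
    """Coefficient of determination relative to the experimental mean."""
    mean_exp = sum(experimental) / len(experimental)
    residual = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    total = sum((e - mean_exp) ** 2 for e in experimental)
    return 1.0 - residual / total

# Hypothetical efficiencies (percent), not taken from the study.
experimental = [10.0, 11.0, 12.0, 13.0]
calculated = [10.5, 10.5, 12.5, 12.5]
rmse_value = rmse(experimental, calculated)
r2_value = r_squared(experimental, calculated)
```

Reporting both values for the train, test, and overall subsets makes it easier to spot a model that fits the training data well but generalizes poorly.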

Results and Discussion
The results obtained from applying the four introduced intelligent techniques are described in detail in Table 2. The data set consists of 98 data points. Analyses are also performed for the train, test, and overall data. Table 3 presents the results, which indicate that the proposed methods give precise estimates.

Outlier Detection
The trustworthiness of the employed models depends strongly on the experimentally measured data points (Rousseeuw & Leroy, 2005). Outliers are those data (individual or group) whose trend does not follow that of the other data points.
R = 3 is the recommended cut-off value. The lines R = ±3 on the vertical axis bound the feasible region. On the horizontal axis, the feasible region lies between the lines H = 0 and H = H* = 0.09. Data outside the acceptable range are called outliers. Based on the Williams plot depicted in Fig. 18, most of the data lie in the acceptable range, except for one data point for all studied models.
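The acceptance region described above can be expressed as a simple check. The sketch assumes the hat (leverage) values H and the standardized residuals R have already been computed for each data point; the function name and the four example points are hypothetical.

```python
def flag_outliers(leverages, std_residuals, h_star=0.09, r_cut=3.0):
    """Return indices of points outside the region 0 <= H <= h_star, |R| <= r_cut."""
    flagged = []
    for index, (h, r) in enumerate(zip(leverages, std_residuals)):
        inside = 0.0 <= h <= h_star and abs(r) <= r_cut
        if not inside:
            flagged.append(index)
    return flagged

# Hypothetical hat values and standardized residuals for four points.
leverages = [0.02, 0.05, 0.12, 0.03]
std_residuals = [0.4, -3.5, 1.0, 2.9]
outliers = flag_outliers(leverages, std_residuals)
```

Here the second point fails the residual limit and the third fails the leverage limit, so both are flagged.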
To demonstrate the dependence of the study objective on the input parameters, a sensitivity analysis is carried out. A relevancy factor r, with -1 < r < +1, is used in the sensitivity analysis. The closer the absolute value of r is to unity, the more strongly the objective parameter is affected by the input variable. Positive values of r indicate that the objective increases with the input parameter, and negative values of r indicate that it decreases. The relevancy factor is obtained as follows:
$r = \dfrac{\sum_{i=1}^{N} (X_{k,i} - \bar{X}_k)(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_{k,i} - \bar{X}_k)^2 \sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$
where $X_{k,i}$ is the i-th value of the k-th input, $\bar{X}_k$ denotes the mean value of the k-th input, $Y_i$ indicates the i-th output, and $\bar{Y}$ represents the mean value of the output. N is the total number of data points. The relevancy factors for all of the input data are illustrated in Fig. 19. The inlet temperature is found to be the most influential variable on the efficiency of the PV/T system, with a computed relevancy factor of 0.36.
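A relevancy factor of this kind can be computed as the Pearson correlation between one input column and the output, which is the form assumed in the sketch below. The data are hypothetical and chosen so that the expected values are obvious.

```python
import math

def relevancy_factor(inputs, outputs):
    """Pearson correlation between one input variable and the output."""
    mean_x = sum(inputs) / len(inputs)
    mean_y = sum(outputs) / len(outputs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    var_x = sum((x - mean_x) ** 2 for x in inputs)
    var_y = sum((y - mean_y) ** 2 for y in outputs)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: the output rises, then falls, linearly with the input.
increasing = relevancy_factor([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
decreasing = relevancy_factor([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
```

Calling the function once per input variable against the measured efficiency gives the set of factors compared in the sensitivity analysis.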

Conclusion
The machine learning methods MLP-ANN, RBF-ANN, ANFIS, and LSSVM were utilized to establish a mathematical model between the efficiency of the PV/T collector and the input parameters of inlet temperature, flow rate, heat, solar radiation, and heat of the sun. To this end, experimental measurements were made on a purpose-designed solar collector system and 98 data points were extracted. The trustworthiness of the models in precisely estimating the efficiency was shown with graphical and statistical approaches. To demonstrate the comprehensiveness of the models, outlier detection was performed. It was shown that the results of the LSSVM model were more satisfactory than those of the other models. R-squared (R 2 ) and