Abstract

This paper aims to model the change in the market share of 30 domestic and foreign banks that operated in Turkey between 1990 and 2009, taking into consideration 20 financial ratios of those banks. Due to the fragile structure of the banking sector in Turkey, this study plays an important role in determining the changes in the market shares of banks and in taking the necessary measures promptly. For this reason, computational intelligence methods have been used in the study. According to the results, the models were not able to properly predict the data for the banking sector in the periods of financial crisis (2000-2001 and 2008-2009). However, Simple Linear Regression stands out as a good algorithm among the computational intelligence algorithms for all periods between 1990 and 2009.

1. Introduction

After the 1980s, as a natural result of financial liberalization in the Turkish economy and banking industry, competition in the banking industry increased significantly for reasons such as the entry of many new domestic and foreign players, the liberalization of fund transfers, especially from international markets, the authorization of banks to transact in foreign currencies, advances in technology, and the introduction of new services by banks. Therefore, a bank operating in the banking industry can differentiate itself from the other banks only if it develops new strategies.

In recent years, because of economic and financial crises, some public and private banks went bankrupt, some merged, and others were forced to change how they operate. Consequently, serious competition arose among the surviving banks to capture the market shares of the banks that left the industry. The banks that evaluated the present circumstances, used cutting-edge technology, and improved the scope of their products and services were able to advance significantly, and these advances created the environment such banks needed to improve their market shares. Therefore, evaluating their position in the market and developing new strategies accordingly became much more important.

The presence of tough competition between the banks, in addition to the fragile structure of the banking sector in Turkey, makes it important to determine the change in the market shares of banks and to take the necessary measures. For this reason, goal-oriented estimations made using computational intelligence methods are of great importance for the industry.

2. General Situation of Turkish Banking Sector

The banks in the period following the 1980s were mostly multibranch banks financed by the public sector, operating in the field of deposit banking, and having an oligopolistic structure. From the private banking point of view, the banks belonging to holdings had significance in the sector. In the banking sector between 1980 and 1990, with the help of the policies towards financial liberalization and the resulting environment, the banks could adapt themselves to meet international criteria [1]. During this financial liberalization process in Turkey, the Capital Markets, the Istanbul Stock Exchange (ISE), the Interbank Money Market (Interbank), and the Foreign Exchange Market were established, and the Istanbul Gold Exchange started to operate [2]. In addition, the Savings Deposit Insurance Fund (SDIF) was established; the SDIF is an institution that guarantees the rights of depositors and regulates the system in the banking sector. The functioning of the free economy mechanism and the rearrangements towards liberalization of the market created significant impacts on the banking system [3]. The competition in the market increased due to new foreign/domestic players in the sector and the liberalization of deposit/credit interest rates [4].

As a result of the new legal and institutional regulations, which helped new players to enter the market, compete, and expand in the sector, the Turkish banking sector experienced a rapid expansion in the number of banks, employment, service diversification, and technological infrastructure [5].

In 1999, a significant contraction in economic activity and rising real interest rates caused a decline in private sector credit demand, a deterioration of asset quality, and an increase in the liquidity needs of public banks, particularly banks with weak financial situations. In the late 1990s, macroeconomic instability, the financial risks encountered, high resource costs, unfair competition conditions, lack of equity capital, the oligopolistic structure, the rapid advancement of technology, holding ownership, and a shallow capital market were some of the known basic problems of the banking sector [6, 7].

At the beginning of 2000, a comprehensive economic program was implemented in order to reduce inflation and restore an environment of economic growth. From the second half of the year, delays in the structural adjustment arrangements, inflation that did not fall as quickly as expected, increases in the costs of public goods and services at the same level as inflation, and uncontrolled domestic demand caused a deterioration of the economic outlook; as a result of those developments, serious shocks occurred in the banking sector in November 2000 and February 2001.

Even though their origins are quite different from each other, both of the crises that occurred in November 2000 and February 2001 are financial in nature [8]. The starting circumstances of a crisis affect its duration and depth, and those circumstances usually reveal the fragility of the economic structure of the country experiencing the crisis [9]. The basis of these two crises experienced in Turkey lies in the fragile nature of the banking sector [10]. The November 2000 crisis was due to a liquidity problem in the banking sector: the increasing demand for liquidity during this period could not be met, which caused an accelerated increase in interest rates and exchange rates and deepened the crisis in the financial market [11]. The February 2001 crisis, by contrast, driven by the rising financial fragility after November, took the form of a direct attack on the Turkish lira. This crisis, which started in the economic and financial system, quickly spread to the real sector.

The November 2000 and February 2001 crises further intensified the problems in the banking sector and introduced new ones. The banking sector, which was already in serious difficulty due to liquidity and interest rate risk after the November crisis, faced significant losses arising from foreign exchange risk with the February crisis [12].

The significant increase in interest rates as a result of the November 2000 crisis further weakened the financial structure of the public banks, which needed excessive overnight borrowing, and of the banks under the coverage of the SDIF [13].

After the financial crises of November 2000 and February 2001, a program was established in order to solve the problems permanently and to increase the competitiveness of the banking sector; special emphasis was given to the financial sector, and the measures to be taken in this regard were determined.

The financial sector in Turkey was also affected negatively, to a large extent, by the global crisis that hit all the world economies in 2008. For deposit, investment, and development banks, particularly in the last quarter of 2008 and at the beginning of 2009, balance sheet risks increased rapidly, foreign funding opportunities became scarcer, and the need for liquidity increased [14].

In line with international developments, income in the Turkish economy decreased, private sector demand shrank, and the volume of foreign trade decreased, whereas the budget deficit and the unemployment rate increased and direct investments decreased. Therefore, in order to soften and limit the negative impacts of the global crisis, many expansionary monetary and fiscal measures were taken by the related institutions during that period [15].

The computational intelligence algorithms used in this paper are described in Section 3.

3. Computational Intelligence Algorithms

In computational intelligence, the aim is to predict output values based on input values. If the output values are labels, the problem is called a classification problem; if the output values are numeric, it is called a regression problem. Both classification and regression construct a prediction model from a training set, and the model's performance is evaluated on a test set. Since the outputs in our problem are numeric, we have used regression algorithms.

The regression approximation addresses the problem of estimating a function $f$ based on a given dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$, $x_i \in \mathbb{R}^{m}$, where $x_i$ is the input vector, $y_i$ is the real value, $\hat{y}_i = f(x_i)$ is the estimated value, $f$ is the estimation function, $n$ is the number of observations, and $m$ is the number of features.
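As a minimal illustration of this setting (not the authors' code; a synthetic dataset with 30 samples and 20 features stands in for the bank data), an estimation function can be fitted on training observations and evaluated on held-out test observations as follows:

```python
# Minimal sketch of the regression setting described above.
# Synthetic data stand in for one yearly bank dataset (n observations, m features).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n, m = 30, 20                      # n observations, m features
X = rng.random((n, m))             # input vectors x_i, already scaled to [0, 1]
y = X @ rng.normal(size=m) + 0.1 * rng.normal(size=n)   # real values y_i

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
f = LinearRegression().fit(X_train, y_train)   # estimation function f
y_hat = f.predict(X_test)                      # estimated values
rmse = mean_squared_error(y_test, y_hat) ** 0.5
print(f"RMSE on the test set: {rmse:.4f}")
```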

In the literature, several regression algorithms have been proposed for building the estimation function. In this section, the algorithms used in this study are briefly introduced.

3.1. Zero Rule (ZeroR)

Zero Rule (ZeroR, 0-R) is a trivial classifier, but it gives a lower bound on the performance of a given dataset which should be significantly improved by more complex classifiers. As such it is a reasonable test on how well the class can be predicted without considering the other attributes [16].

3.2. M5 Model Rules (M5R)

M5 Model Rules generates a decision list for regression problems using a separate-and-conquer strategy. In each iteration it builds a model tree using M5 and turns the "best" leaf into a rule [17].

3.3. Decision Table (DT)

A decision table consists of a hierarchical table in which each entry in a higher-level table gets broken down by the values of a pair of additional attributes to form another table. The structure is similar to dimensional stacking. Given a training sample containing labeled instances, an induction algorithm builds a hypothesis in some representation. The representation investigated here is a decision table with a default rule mapping to the majority class, abbreviated as DTM. A DTM has two components: (1) a schema, which is a set of features, and (2) a body, which is a multiset of labeled instances. Each instance consists of a value for each of the features in the schema and a value for the label. Given an unlabelled instance $x$, the label assigned to the instance by a DTM classifier is computed as follows.

Let $I$ be the set of labeled instances in the DTM exactly matching the given instance $x$, where only the features in the schema are required to match and all other features are ignored. If $I = \emptyset$, return the majority class in the DTM; otherwise, return the majority class in $I$. Unknown values are treated as distinct values in the matching process [18].
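A rough Python sketch of the DTM lookup just described may make the procedure concrete; the helper below is hypothetical and is not the implementation of [18]:

```python
from collections import Counter

def dtm_classify(body, schema, query):
    """Label assigned to `query` by a decision-table-majority (DTM) classifier.

    body   : list of (instance_dict, label) pairs
    schema : list of feature names used for matching
    query  : dict of feature values for the unlabelled instance
    """
    # Instances that exactly match the query on the schema features only.
    matches = [label for inst, label in body
               if all(inst.get(f) == query.get(f) for f in schema)]
    # Majority class of the matching set, or of the whole table if no match.
    pool = matches if matches else [label for _, label in body]
    return Counter(pool).most_common(1)[0][0]

# Tiny usage example with a two-feature schema.
body = [({"a": 1, "b": 0}, "low"), ({"a": 1, "b": 0}, "high"),
        ({"a": 1, "b": 0}, "low"), ({"a": 0, "b": 1}, "high")]
print(dtm_classify(body, schema=["a", "b"], query={"a": 1, "b": 0}))  # -> "low"
```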

3.4. Reduced Error Pruning Tree (REP Tree)

Reduced Error Pruning (REP) was introduced by Quinlan [19] in the context of decision tree learning. It has been subsequently adapted to rule set learning as well [20]. It produces an optimal pruning of a given tree—the smallest tree among those with minimal error with respect to a given set of pruning examples [20, 21]. The REP algorithm works in two phases: first the set of pruning examples is classified using the given tree to be pruned. Counters that keep track of the number of examples of each class passing through each node are updated simultaneously. In the second phase—a bottom-up pruning phase—those parts of the tree that can be removed without increasing the error of the remaining hypothesis are pruned away [22]. The pruning decisions are based on the node statistics calculated in the top-down classification phase.

3.5. Conjunctive Rule (CR)

This class implements a single conjunctive rule learner that can predict for numeric and nominal class labels.

A rule consists of antecedents “AND”ed together and the consequent (class value) for the classification/regression. In this case, the consequent is the distribution of the available classes (or mean for a numeric value) in the dataset. If the test instance is not covered by this rule, then it is predicted using the default class distributions/value of the data not covered by the rule in the training data. This learner selects an antecedent by computing the Information Gain of each antecedent and prunes the generated rule using Reduced Error Pruning or simple prepruning based on the number of antecedents. For classification, the information of one antecedent is the weighted average of the entropies of both the data covered and not covered by the rule. For regression, the information is the weighted average of the mean-squared errors of both the data covered and not covered by the rule. In pruning, weighted average of the accuracy rates on the pruning data is used for classification while the weighted average of the mean-squared errors on the pruning data is used for regression [23].

3.6. Gaussian Process Regression (GPR)

Probabilistic regression is usually formulated as follows.

Given a training set $D = \{(x_i, y_i)\}_{i=1}^{n}$ of pairs of (vectorial) inputs $x_i$ and noisy (real, scalar) outputs $y_i$, compute the predictive distribution of the function values $f_*$ (or noisy outputs $y_*$) at test locations $x_*$. In the simplest case we assume that the noise is additive, independent, and Gaussian, such that the relationship between the (latent) function $f(x)$ and the observed noisy targets $y$ is given by
$$y_i = f(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_n^2),$$
where $\sigma_n^2$ is the variance of the noise, and we use the notation $\mathcal{N}(\mu, \Sigma)$ for the Gaussian distribution with mean $\mu$ and covariance $\Sigma$ [24].

The GP is a popular nonparametric model for supervised learning problems [25]. A Gaussian Process (GP) is a collection of random variables, any finite number of which have consistent joint Gaussian distributions. Gaussian Process (GP) regression is a Bayesian approach which assumes a GP prior over functions; that is, a priori the function values behave according to
$$p(\mathbf{f} \mid x_1, \ldots, x_n) = \mathcal{N}(\mathbf{0}, K),$$
where $\mathbf{f} = [f_1, \ldots, f_n]^{\top}$ is a vector of latent function values and $K$ is a covariance matrix whose entries are given by the covariance function, $K_{ij} = k(x_i, x_j)$. Valid covariance functions give rise to positive semidefinite covariance matrices; in general, positive semidefinite kernels are valid covariance functions. The covariance function encodes our assumptions about the function we wish to learn by defining a notion of similarity between two function values as a function of the corresponding two inputs.

A very common covariance function is the Gaussian, or squared exponential,
$$k(x_i, x_j) = \sigma_f^2 \exp\!\left(-\frac{\|x_i - x_j\|^2}{2\ell^2}\right),$$
where $\sigma_f^2$ controls the prior variance and $\ell$ is an isotropic lengthscale parameter that controls the rate of decay of the covariance, that is, how far $x_i$ must be from $x_j$ for $f_i$ to become unrelated to $f_j$ [24].
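As an illustrative sketch (not the implementation used in the paper), GP regression with the squared-exponential covariance plus Gaussian noise can be written with scikit-learn as follows; the kernel hyperparameters and the synthetic data are assumptions:

```python
# Gaussian Process regression with a squared-exponential (RBF) kernel plus noise.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(1)
X = rng.random((30, 20))                       # synthetic stand-in for one yearly dataset
y = np.sin(3 * X[:, 0]) + 0.05 * rng.normal(size=30)

# sigma_f^2 * exp(-||x - x'||^2 / (2 l^2)) plus noise variance sigma_n^2
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.05)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

mean, std = gpr.predict(X[:3], return_std=True)   # predictive mean and uncertainty
print(mean, std)
```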

3.7. Isotonic Regression (IR)

The Isotonic Regression (IR) problem is considered in the following least-distance setting. Given a vector $a \in \mathbb{R}^n$, a strictly positive vector of weights $w \in \mathbb{R}^n$, and a directed acyclic graph $G(N, E)$ with the set of nodes $N = \{1, 2, \ldots, n\}$, find $x \in \mathbb{R}^n$ that solves the problem
$$\min_{x} \sum_{i=1}^{n} w_i (x_i - a_i)^2 \quad \text{subject to} \quad x_i \le x_j \ \text{for all} \ (i, j) \in E.$$
Since this is a strictly convex quadratic programming problem, its solution is unique. The optimality conditions and an analysis of the typical block (or cluster) structure of the solution can be found in [26-28]. The monotonicity constraints defined by the acyclic graph imply a partial order of the components of $x$.

A special case of the Isotonic Regression problem arises when there is a complete order of the components. This problem, referred to as the IRC problem, is defined by the directed graph with edges $E = \{(i, i+1) : i = 1, \ldots, n-1\}$ and is formulated as follows:
$$\min_{x} \sum_{i=1}^{n} w_i (x_i - a_i)^2 \quad \text{subject to} \quad x_1 \le x_2 \le \cdots \le x_n.$$
The IR problem has numerous important applications, for instance, in machine learning, data mining, statistics, operations research, and signal processing. These applications are characterized by a very large value of $n$. For such large-scale problems, it is of great practical importance to develop algorithms whose complexity does not rise too rapidly with $n$. The existing optimization-based algorithms [29] and statistical IR algorithms have either high computational complexity or low accuracy of the approximation to the optimal solution they generate.

The most widely used method for solving the IRC problem is the so-called Pool-Adjacent-Violators (PAV) algorithm [30-32], which has computational complexity linear in $n$. The PAV algorithm has been extended by Pardalos and Xue [28] to the special case of the IR problem in which the partial order of the components is represented by a directed tree [33].
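For the completely ordered (IRC) case, a minimal sketch using scikit-learn's isotonic regression, which implements the pool-adjacent-violators idea described above, is given below on a toy example:

```python
# Isotonic regression for a completely ordered sequence (the IRC case).
import numpy as np
from sklearn.isotonic import IsotonicRegression

a = np.array([1.0, 3.0, 2.0, 4.0, 3.5, 6.0])   # observed values a_i
w = np.ones_like(a)                            # strictly positive weights w_i

ir = IsotonicRegression(increasing=True)
# Fitted values satisfy x_1 <= x_2 <= ... <= x_n and minimize sum_i w_i (x_i - a_i)^2.
x = ir.fit_transform(np.arange(len(a)), a, sample_weight=w)
print(x)
```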

3.8. Linear Regression (LR)

The final model is a linear regression of a subsample of the attributes. The subsample is selected by iteratively removing the attribute with the smallest standardized coefficient until no improvement is observed in the estimate of the error given by the Akaike Information Criterion (AIC) [34]. The AIC value, under the assumption of normally distributed errors, is calculated as
$$\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2p,$$
where $p$ is the number of parameters in the model, $\mathrm{RSS}$ is the residual sum of squares, and $n$ is the number of observations [35].
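A hedged sketch of attribute elimination guided by the AIC is given below. Note that it is a simplified variant that drops the attribute whose removal lowers the AIC the most, rather than the attribute with the smallest standardized coefficient used by the WEKA implementation, and the data are synthetic:

```python
# Backward elimination of attributes guided by AIC = n*ln(RSS/n) + 2p.
import numpy as np
from sklearn.linear_model import LinearRegression

def aic(model, X, y):
    rss = float(np.sum((y - model.predict(X)) ** 2))
    n, p = X.shape[0], X.shape[1] + 1          # +1 for the intercept
    return n * np.log(rss / n) + 2 * p

def backward_select(X, y):
    features = list(range(X.shape[1]))
    best = aic(LinearRegression().fit(X, y), X, y)
    improved = True
    while improved and len(features) > 1:
        improved = False
        # Try dropping each remaining feature; keep the drop that lowers AIC most.
        scores = []
        for f in features:
            keep = [c for c in features if c != f]
            m = LinearRegression().fit(X[:, keep], y)
            scores.append((aic(m, X[:, keep], y), f))
        score, worst = min(scores)
        if score < best:
            best, improved = score, True
            features.remove(worst)
    return features, best

rng = np.random.default_rng(2)
X = rng.random((30, 20))
y = X[:, 0] - 2 * X[:, 3] + 0.05 * rng.normal(size=30)
print(backward_select(X, y))
```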

3.9. Multilayer Perceptron (MLP)

Multilayer Perceptron (MLP), also referred to as a multilayer feedforward neural network, is the most widely used and popular neural network method. It belongs to the class of supervised neural networks. The MLP topology consists of three sequential layers of processing nodes: an input layer, one or more hidden layers, and an output layer that produces the classification results. An MLP structure is shown in Figure 1 [36].

The principle of the network is that when data are presented at the input layer, the network nodes perform calculations in the successive layers until an output value is obtained at each of the output nodes. This output signal should be able to indicate the appropriate class for the input data. A node in an MLP can be modeled as one or more artificial neurons, which compute the weighted sum of the inputs in the presence of the bias and pass this sum through a nonlinear activation function. This process is defined as follows [36]:
$$v_j = \sum_{i=1}^{m} w_{ji} x_i + b_j, \qquad y_j = \varphi_j(v_j),$$
where $v_j$ is the linear combination of the inputs $x_1, x_2, \ldots, x_m$, $b_j$ is the bias (an adjustable parameter), $w_{ji}$ is the synaptic connection weight between input $x_i$ and neuron $j$, $\varphi_j$ is the activation function (usually a nonlinear function) of the $j$th neuron, and $y_j$ is the output. The hyperbolic tangent and the logistic sigmoid function can both be used as the nonlinear activation function, but in most applications the widely used logistic sigmoid function is applied:
$$\varphi(v) = \frac{1}{1 + e^{-\alpha v}},$$
where $\alpha$ represents the slope of the sigmoid [37].

The bias term $b_j$ contributes to the left or right shift of the sigmoid activation function, depending on whether $b_j$ takes a positive or a negative value.

Learning in an MLP is an unconstrained optimization problem, which is subject to the minimization of a global error function depending on the synaptic weights of the network [38].

The first backpropagation learning algorithm for use with MLP structures was presented by Rumelhart et al. [39]. The backpropagation algorithm is one of the simplest and most general methods for the supervised training of MLPs. It uses a gradient descent search to minimize the mean square error between the desired outputs and the actual outputs. The backpropagation algorithm is defined as follows [36-40]:
(I) initialize all the connection weights $\mathbf{w}$ with small random values from a pseudorandom sequence generator;
(II) repeat until convergence (either when the error $E$ is below a preset value or until the gradient $\partial E / \partial \mathbf{w}$ is smaller than a preset value):
(i) compute the update $\Delta \mathbf{w}(t) = -\eta \, \partial E(t) / \partial \mathbf{w}$,
(ii) update the weight vector as $\mathbf{w}(t+1) = \mathbf{w}(t) + \Delta \mathbf{w}(t)$,
(iii) compute the error $E(t+1)$,
where $t$ is the iteration number, $\mathbf{w}$ represents all the weights in the network, and $\eta$ is the learning rate, which merely indicates the relative size of the change in the weights. The error $E$ can be chosen as the mean square error function between the actual output and the desired output,
$$E = \frac{1}{2} \sum_{k=1}^{c} (d_k - y_k)^2,$$
where $\mathbf{d}$ and $\mathbf{y}$ are the desired and the network output vectors of length $c$.
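For illustration only, a comparable MLP regressor with a logistic (sigmoid) activation can be trained by stochastic gradient descent backpropagation using scikit-learn; the hidden-layer size and learning rate below are arbitrary assumptions, and the data are synthetic:

```python
# Multilayer perceptron with one hidden layer and logistic (sigmoid) activations,
# trained by stochastic gradient descent backpropagation.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.random((30, 20))
y = np.tanh(X @ rng.normal(size=20)) + 0.05 * rng.normal(size=30)

mlp = MLPRegressor(hidden_layer_sizes=(10,), activation="logistic",
                   solver="sgd", learning_rate_init=0.01, max_iter=5000,
                   random_state=0)
mlp.fit(X, y)
print(mlp.predict(X[:3]))
```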

3.10. Simple Linear Regression (SLR)

Simple Linear Regression is the process of fitting a straight line (model) between each attribute and the output,
$$y = b_0 + b_1 x,$$
where the values of $b_0$ and $b_1$ are estimated by the method of least squares. The model having the lowest squared error is selected as the final model among the single-attribute models [35].
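A minimal sketch of this selection procedure, fitting one least-squares line per attribute on synthetic data and keeping the line with the lowest squared error:

```python
# Simple Linear Regression: one straight line per attribute, keep the best one.
import numpy as np

def simple_linear_regression(X, y):
    best = None
    for j in range(X.shape[1]):
        x = X[:, j]
        b1, b0 = np.polyfit(x, y, 1)             # least-squares slope and intercept
        sse = float(np.sum((y - (b0 + b1 * x)) ** 2))
        if best is None or sse < best[0]:
            best = (sse, j, b0, b1)
    sse, j, b0, b1 = best
    return j, b0, b1                              # chosen attribute and its line

rng = np.random.default_rng(4)
X = rng.random((30, 20))
y = 2.0 * X[:, 7] + 0.05 * rng.normal(size=30)
print(simple_linear_regression(X, y))             # should pick attribute 7
```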

3.11. Support Vector Machines (SVMs)

The Support Vector Machines (SVMs) [41] are types of learning machines based on statistical learning theory. SVMs are supervised learning methods that have been widely and successfully used for pattern recognition in different areas [42]. Especially in recent years, SVMs with linear or nonlinear kernels have become one of the most promising learning algorithms for classification as well as regression [43]. The problem that SVMs try to solve is to find an optimal hyperplane that correctly classifies data points by separating the points of two classes as much as possible [44].

Let $x_i$ (for $i = 1, \ldots, n$) be the input vectors in input space, with corresponding binary labels $y_i \in \{-1, +1\}$. Let $\phi(x_i)$ be the corresponding vectors in feature space, where $\phi$ is the implicit kernel mapping, and let $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ be the kernel function, implying a dot product in the feature space [45]. $K$ represents the desired notion of similarity between data points $x_i$ and $x_j$, and it needs to satisfy Mercer's condition in order for $\phi$ to exist [44]. There are a number of kernel functions which have been found to provide good generalization capabilities [46]. The kernel function used in our SVM is a linear function, and its details are given below.

Linear kernel: $K(x_i, x_j) = x_i^{\top} x_j$.

The optimization problem for a soft-margin SVM is
$$\min_{\mathbf{w}, b, \xi} \ \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{n} \xi_i,$$
subject to the constraints $y_i(\mathbf{w} \cdot \phi(x_i) + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$, where $\mathbf{w}$ is the normal vector of the separating hyperplane in feature space and $C$ is a regularization parameter controlling the penalty for misclassification. This problem is referred to as the primal problem. From its Lagrangian form, we derive the dual problem:
$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j),$$
subject to $0 \le \alpha_i \le C$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$. This is a quadratic optimization problem that can be solved efficiently using algorithms such as Sequential Minimal Optimization (SMO) [47]. Typically, many $\alpha_i$ go to zero during optimization, and the remaining $x_i$ corresponding to the nonzero $\alpha_i$ are called support vectors. To simplify notation, from here on we assume that all nonsupport vectors have been removed, so that $n$ is now the number of support vectors and $\alpha_i > 0$ for all $i$. With this formulation, the normal vector of the separating plane is calculated as $\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \phi(x_i)$. Note that because $\phi$ is defined implicitly, $\mathbf{w}$ exists only in feature space and cannot be computed directly. Instead, the classification of a new query vector $x_q$ can only be determined by computing the kernel function of $x_q$ with every support vector,
$$f(x_q) = \operatorname{sign}\!\Big(\sum_{i=1}^{n} \alpha_i y_i K(x_i, x_q) + b\Big),$$
where the bias term $b$ is the offset of the hyperplane along its normal vector, determined during SVM training [45].
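Although the formulation above is written for classification, the study applies SVM regression; the following is a brief scikit-learn sketch (synthetic data, assumed hyperparameters) of support vector regression with the linear kernel named above:

```python
# Support vector regression with a linear kernel K(x_i, x_j) = x_i . x_j.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(5)
X = rng.random((30, 20))
y = X @ rng.normal(size=20) + 0.05 * rng.normal(size=30)

svr = SVR(kernel="linear", C=1.0, epsilon=0.01).fit(X, y)
print("number of support vectors:", svr.support_vectors_.shape[0])
print(svr.predict(X[:3]))
```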

3.12. IB1 (One Nearest Neighbor)

IB1 [33] is an implementation of the simplest similarity-based learner, known as the nearest neighbor. IB1 simply finds the stored instance closest (according to the Euclidean distance metric) to the instance to be classified, and the new instance is assigned to the retrieved instance's class. The distance metric employed by IB1 is
$$d(x, y) = \sqrt{\sum_{i=1}^{m} f(x_i, y_i)},$$
which gives the distance between two instances $x$ and $y$; $x_i$ and $y_i$ refer to the $i$th feature value of instances $x$ and $y$, respectively. For numeric-valued attributes, $f(x_i, y_i) = (x_i - y_i)^2$; for symbolic-valued attributes, $f(x_i, y_i) = 0$ if the feature values $x_i$ and $y_i$ are the same and $f(x_i, y_i) = 1$ if they differ [48].
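A tiny illustrative implementation of this distance and the nearest-neighbor lookup (hypothetical helper functions, not the code of [48]):

```python
import math

def ib1_distance(x, y):
    """Distance between two instances: numeric features contribute (x_i - y_i)^2,
    symbolic features contribute 0 if equal and 1 otherwise."""
    total = 0.0
    for xi, yi in zip(x, y):
        if isinstance(xi, (int, float)) and isinstance(yi, (int, float)):
            total += (xi - yi) ** 2
        else:
            total += 0.0 if xi == yi else 1.0
    return math.sqrt(total)

def ib1_predict(train, query):
    """Return the label of the stored instance closest to `query`."""
    _, label = min((ib1_distance(x, query), label) for x, label in train)
    return label

train = [((0.1, 0.2, "A"), "up"), ((0.8, 0.9, "B"), "down")]
print(ib1_predict(train, (0.15, 0.25, "A")))   # -> "up"
```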

3.13. K-Star

K-star [49] can be considered a variation of instance-based learning that uses an entropic distance measure. To compute the distance between two samples, the concept of the "complexity of transforming one sample into another" is introduced. The K-star distance is then defined by summing over all possible transformation paths between the two instances. This approach can be applied to real-valued attributes as well as symbolic attributes [50].

3.14. Additive Regression (AR)

Additive regression initially focuses on the regression problem in which the response $Y$ is quantitative, and the aim is to model the mean $\mu(x) = E(Y \mid X = x)$. The additive model has the form
$$Y = \alpha + \sum_{j=1}^{m} f_j(X_j) + \varepsilon.$$
Here there is a separate function $f_j$ for each of the input variables $X_j$. More generally, each component $f_j$ is a function of a small prespecified subset of the input variables. The backfitting algorithm is a convenient modular algorithm for fitting additive models. A backfitting update is
$$\hat{f}_j(x_j) \leftarrow E\!\Big[\,Y - \alpha - \sum_{k \ne j} \hat{f}_k(X_k) \;\Big|\; X_j = x_j\Big].$$
Any method or algorithm for estimating a function of $X_j$ can be used to obtain an estimate of this conditional expectation; in particular, this can include nonparametric smoothing algorithms, such as local regression or smoothing splines. On the right-hand side, all the latest versions of the other functions $\hat{f}_k$ are used in forming the partial residuals. The backfitting cycles are repeated until convergence [51]. Under fairly general conditions, backfitting can be shown to converge to the minimizer of $E\big(Y - \alpha - \sum_j f_j(X_j)\big)^2$ [52].
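A hedged sketch of backfitting on synthetic data is given below, where a univariate least-squares fit stands in for each smoother $f_j$ (a spline or local regression could equally be used):

```python
# Backfitting for an additive model Y = alpha + sum_j f_j(X_j) + eps,
# with a univariate linear fit standing in for each smoother f_j.
import numpy as np

def backfit(X, y, n_cycles=20):
    n, m = X.shape
    alpha = y.mean()
    coefs = np.zeros(m)                 # each f_j(x) = coefs[j] * (x - mean of X_j)
    Xc = X - X.mean(axis=0)
    for _ in range(n_cycles):
        for j in range(m):
            # Partial residuals: remove alpha and the fits of all other components.
            partial = y - alpha - Xc @ coefs + Xc[:, j] * coefs[j]
            coefs[j] = (Xc[:, j] @ partial) / (Xc[:, j] @ Xc[:, j])
    return alpha, coefs

rng = np.random.default_rng(6)
X = rng.random((30, 5))
y = 1.0 + 3 * X[:, 0] - 2 * X[:, 2] + 0.05 * rng.normal(size=30)
alpha, coefs = backfit(X, y)
print(alpha, np.round(coefs, 2))        # coefs approach roughly [3, 0, -2, 0, 0]
```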

3.15. Bagging

Bagging [53] is a method for generating multiple versions of a predictor and using these to obtain an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and takes a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated datasets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method: if perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy [53].
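An illustrative sketch of bagging 100 regression trees with scikit-learn (the ensemble size matches the setting reported in the experiments below; the data are synthetic):

```python
# Bagging: 100 regression trees fitted on bootstrap replicates, predictions averaged.
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)
X = rng.random((30, 20))
y = X @ rng.normal(size=20) + 0.05 * rng.normal(size=30)

bag = BaggingRegressor(DecisionTreeRegressor(), n_estimators=100, random_state=0)
bag.fit(X, y)
print(bag.predict(X[:3]))
```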

4. Purpose of Study

The study aims to determine which parameters (factors) affect the change in the market shares of the public and private commercial banks and the foreign banks that were established in Turkey or have branches operating in Turkey between 1990 and 2009, in parallel with the economic and social developments that occurred in the country during this period.

Three market shares, which can be considered as indicators of all market shares, are taken into account in the study. These figures (Y1, Y2, and Y3) are the market share in total assets (Y1), as an indicator of the change in the reliability and soundness of the bank; the market share in total credits (Y2), as an indicator of the contribution to total credits; and the market share in total deposits (Y3), as an indicator of the success of the bank in deposit collection.

Because of the long-term effects of the Russian and Asian crises of 1997 and 1998, the November 2000 and February 2001 crises were experienced successively and with greater financial depth in Turkey. These crises mostly affected the banks that did not have solid financial structures and could not adapt to the conditions experienced. Therefore, the 2001 crisis should not be considered only in economic terms with regard to the banking sector. After 2001, the banking sector was renewed with new regulations. Accordingly, another purpose of the study is to determine whether the crises affected the market shares of the banks.

5. Dataset and Parameters Used in Study

5.1. Scope of Study

The public and private commercial banks and the foreign banks established in Turkey between 1990 and 2009 are included in the study. Bankrupted banks, banks that were transferred to the SDIF, banks whose way of operation was changed, and development and investment banks are excluded from the study.

For the period 1990–2009, the data belonging to 30 banks listed in Table 1 are used in the study.

The market shares of the banks can be determined using three different criteria: market shares in total assets, total credits, and total deposits [54]. The total assets, total credits, and total deposits factors, which can be used as indicators of the change in market share, are explained using the following ratios: capital adequacy ratios, which are important for banks to continue their operations; profitability ratios, which are important for measuring the profitability of the banks; liquidity ratios, which show the status of short-term debt repayment; income-expenditure structure ratios, which give information about which items play an important role in the total income and expenditure of the banks and about which income and expenditure items drive changes in the bank's profit; and asset quality ratios and the ratios showing the group shares of the banks. All of these figures show the situation of the banks and give illustrative information about whether the banks will face problems in the future. Therefore, following these ratios helps in taking the necessary precautions in case of negative developments. Precautions taken promptly will not only prevent decreases in market share but also help the banks to catch the opportunities that increase their market shares. The ratios showing the market shares of the banks are the dependent variables, and capital adequacy, asset quality, liquidity, profitability, and the ratios showing the group shares are the explanatory variables. An average value, calculated from the total assets (Y1), total credits (Y2), and total deposits (Y3) ratios of all the banks in the sector, is used as the dependent variable indicating the market share of each bank. In determining the market shares of the banks during the period 1990-2009, the analysis is developed using computational intelligence algorithms and 20 financial ratios relating to capital adequacy, asset quality, liquidity, profitability, and the group shares.

The data used in the study consist of secondary data taken from the book "Our Banks," prepared annually by the Banks Association of Turkey. The 20 financial ratios used for the period 1990-2009 are shown in Table 2.

The liquidity ratios define the power of businesses to repay their short-term debt, and the profitability ratios are indicators of a business's ability to provide a satisfactory profit on the invested capital and of the effectiveness of management in well-functioning financial markets. Financial investors do not put their money into areas where profitability is limited, and entrepreneurs do not make investments there. In addition, credit institutions are reluctant to give credit to businesses with low profitability. The ratios relating to capital adequacy, asset quality, liquidity, and profitability, and the ratios showing the group shares, are important to businesses and are therefore crucial to banks. By calculating these ratios, the banks can determine both their situation and their position in the sector more efficiently. Therefore, all of the ratios above are included as explanatory variables.

6. Experimental Results

The bank dataset collection includes 20 features/inputs (see Table 2) and an output feature (changes in market shares) for each year (1990–2009). For each year, a dataset is formed including input features and an output feature. Each dataset consists of 30 samples. All features were normalized between 0 and 1.

We applied 15 machine learning algorithms from the WEKA library [16] to predict the output of the samples. The algorithms and their IDs are given in Table 3.

The default design parameters were selected for the Zero Rule, M5 rules, decision table, conjunctive rule, REP Tree, Gaussian Processes, Isotonic Regression, linear regression, Multilayer Perceptron, Simple Linear Regression, support vector machine regression, one nearest neighbor, and K-star algorithms. For the meta-algorithms (additive regression and bagging), the ensemble sizes were set to 100.

The performance of each algorithm was evaluated using 5-fold cross-validation: each dataset is randomly split into 5 equal-sized segments, and the results are averaged over the 5 trials. The performance evaluation is based on the root mean-squared error (RMSE), defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2},$$
where $\hat{y}_i$ is the estimated output for the $i$th test sample, $y_i$ is the real output value of the $i$th test sample, and $n$ is the number of test samples. Table 4 shows the root mean-squared error results.
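The evaluation protocol can be sketched as follows for a single algorithm on one synthetic yearly dataset (30 samples, 20 features); the paper repeats this for 15 algorithms on 20 datasets:

```python
# 5-fold cross-validation with RMSE as the performance measure.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(8)
X = rng.random((30, 20))                       # one yearly dataset: 30 banks, 20 ratios
y = X @ rng.normal(size=20) + 0.05 * rng.normal(size=30)

rmses = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    y_hat = model.predict(X[test_idx])
    rmses.append(mean_squared_error(y[test_idx], y_hat) ** 0.5)

print("fold RMSEs:", np.round(rmses, 3), "mean:", round(float(np.mean(rmses)), 3))
```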

According to Table 4, the best algorithm (the one achieving the minimum error on the most datasets) is Simple Linear Regression. This very simple algorithm generally performs better than all the other, more complicated algorithms.

According to Table 4, the best predicted years are 2006 and 2007. The worst predicted years are 2001, 2002, 2008, and 2009.

Due to the financial crisis experienced at the beginning of the 2000s in Turkey and the global crisis that affected Turkey as well as all the world economies in 2008-2009, the estimations for those periods (2001, 2002, 2008, and 2009) did not lead to good results. This is because economic and financial data fluctuate much more during crisis periods than during stagnation periods.

Two questions were raised when we wanted to gain a better understanding of the dynamics of the datasets: (1) the datasets include 20 features, and it is known that irrelevant features can adversely affect the results, so is it possible to get better results with fewer features? (2) Which features affected the results the most? To answer these related questions, we applied the Correlation-Based Feature Selection (CFS) [50] algorithm to each year's dataset separately. The CFS method is explained briefly below.

CFS evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them. Correlation coefficients are used to estimate the correlation between a subset of attributes and the class, as well as the intercorrelations between the features. The relevance of a group of features grows with the correlation between the features and the class and decreases with growing intercorrelation [50]. CFS is used to determine the best feature subset and is usually combined with search strategies such as forward selection, backward elimination, bidirectional search, best-first search, and genetic search. The following equation formalizes the heuristic:
$$\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}},$$
where $\mathrm{Merit}_S$ is the correlation between the summed feature subset $S$ and the class variable, $k$ is the number of subset features, $\overline{r_{cf}}$ is the average of the correlations between the subset features and the class variable, and $\overline{r_{ff}}$ is the average intercorrelation between the subset features [48, 55]. This merit measure is, in fact, Pearson's correlation, where all variables have been standardized. The selected features are given in Table 5.
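The merit heuristic can be computed directly from Pearson correlations; the sketch below uses a small exhaustive search over candidate subsets in place of the heuristic search strategies mentioned above, on synthetic data:

```python
# CFS merit of a feature subset: k * r_cf / sqrt(k + k*(k-1) * r_ff).
import numpy as np
from itertools import combinations

def cfs_merit(X, y, subset):
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        r_ff = 0.0
    else:
        r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                        for a, b in combinations(subset, 2)])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

rng = np.random.default_rng(9)
X = rng.random((30, 6))
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=30)

# Exhaustive search over all 2- and 3-feature subsets (small enough here).
subsets = [s for r in (2, 3) for s in combinations(range(6), r)]
best = max(subsets, key=lambda s: cfs_merit(X, y, s))
print("best subset:", best, "merit:", round(cfs_merit(X, y, best), 3))
```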

In Table 6, the selection frequencies of the features are given.

According to Table 6, the most frequently selected features are among the GS ratios, that is, the ratios showing the group shares. In our study, the domestic and foreign banks other than development and investment banks operating in Turkey are examined. According to our results, the share of each single bank within its own group influences its share in the whole banking sector to a certain extent. As can be derived from the analysis results, the variables representing the group shares mostly had priority and were effective in the forecasts made for the market shares of banks. Therefore, banks that come to the fore in terms of technological development, a wide product range, and competitive advantage, and that thereby increase their shares within their own groups, are also in a position to increase their shares throughout the whole industry.

The same experiments were repeated on the dimensionally reduced datasets with the same 15 algorithms. The results are given in Table 7.

According to Table 7, the best algorithms (those achieving the minimum error on the most datasets) are Simple Linear Regression and Support Vector Regression.

The best and the worst predicted years are the same as with the datasets containing all features.

To determine the effect of feature selection, Figure 2 was drawn. In Figure 2, the x-axis shows the years and the y-axis shows the minimum error of the 15 algorithms. The solid line shows the minimum errors of the datasets with all features; the dotted line shows the minimum errors of the dimensionally reduced datasets.

According to Figure 2, the feature selection process has positive effects in almost all years. Because the errors obtained with the dimensionally reduced datasets are lower than those obtained with all features, the rules generated by the best algorithms on the dimensionally reduced datasets are given in Table 8.

When Tables 8 and 5 are examined together, it can be seen that the features selected by the feature selection algorithm and the features used in the rules are very similar.

The performance differences among the algorithms were also investigated with a t-test, which indicates whether there is a significant difference between two distributions. In our experiments, each algorithm produces 5 RMSE values for each dataset, because we used 5-fold cross-validation; the distributions compared are these errors of the algorithms. Table 9 gives the number of datasets on which there is a significant difference between the algorithm in the row and the algorithm in the column. The 4 most successful algorithms (decision table, linear regression, Simple Linear Regression, and Support Vector Regression) were compared with all the other algorithms.
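A short sketch of one such pairwise comparison using a paired t-test on hypothetical fold-wise RMSE values (SciPy stand-in; the 0.05 significance level is an assumption):

```python
# Paired t-test on the 5 fold-wise RMSEs of two algorithms on one dataset.
import numpy as np
from scipy.stats import ttest_rel

rmse_slr = np.array([0.021, 0.019, 0.024, 0.020, 0.022])   # hypothetical fold RMSEs
rmse_svr = np.array([0.023, 0.020, 0.026, 0.022, 0.024])

stat, p_value = ttest_rel(rmse_slr, rmse_svr)
alpha = 0.05                                               # assumed significance level
if p_value < alpha:
    better = "SLR" if rmse_slr.mean() < rmse_svr.mean() else "SVR"
    print(f"significant difference (p={p_value:.3f}); lower mean error: {better}")
else:
    print(f"no significant difference (p={p_value:.3f})")
```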

In Table 9, the values in the cells have the format a/b/c, where a is the number of datasets on which the algorithm in the column has a statistically significantly lower error than the algorithm in the row, b is the number of datasets on which the algorithm in the column has a statistically significantly higher error than the algorithm in the row, and c is the number of datasets on which there is no statistically significant difference in error between the two algorithms.

According to Table 9, the decision table algorithm does not perform better than the other 3 compared algorithms (see the third row). There is no significant difference between Simple Linear Regression and linear regression. Support Vector Regression performs better than linear regression on 2 datasets and worse on 1.

In other words, there are very small differences between the 4 best-performing algorithms over the 20 datasets. Simple Linear Regression generates very simple and understandable rules, performs very well compared with the other 14 algorithms, and, because of its simplicity, has a very fast prediction time. For these reasons, we can say that Simple Linear Regression is the best solution for predicting the market share values in the bank datasets.

7. Conclusion

The study uses the data of 30 domestic and foreign banks, other than development and investment banks, which operated in Turkey between 1990 and 2009. Twenty financial ratios of those banks, gathered by the Banks Association of Turkey, were taken into consideration in order to model and predict the change in their market shares.

This study would serve to determine the changes in the market shares of banks and to take measures quickly when necessary, given the fragile structure of the banking sector in Turkey. For this reason, computational intelligence algorithms were used so that the algorithm leading to the best estimation could be determined. As a result of the analysis, Simple Linear Regression is seen to be the best algorithm for all periods. However, due to the financial crisis experienced in Turkey at the beginning of the 2000s and the global crisis that had a large impact on the country, as on all world economies, in 2008-2009, the estimations for those periods (2001, 2002, 2008, and 2009) did not lead to good results.

As a result of the analysis, it is seen that the share of each bank within its own group was the most effective variable in estimating the market shares of banks throughout the whole sector. For this reason, banks that come to the fore in terms of technological development, a wide product range, and competitive advantage, and that thereby increase their shares within their own groups, are also in a position to increase their shares throughout the whole industry.

It is seen that using some specific features rather than all features leads to better results in the analysis. The features identified during feature selection are also in line with the features used in the generated rules.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.