The main purpose of this paper is to investigate the effects of COVID-19 on the efficiency of industries based on data from the Tehran stock market. A hybrid model of Data Envelopment Analysis (DEA) and data mining techniques is used to analyze investment behavior in the Tehran stock market. During the COVID-19 pandemic in particular, many companies have faced financial crises, which is why companies with inferior performance should be benchmarked against efficient companies. First, the financial data of investments in selected companies are analyzed using data mining approaches to recognize the behavioral patterns of investors and securities. Second, customers are clustered into 3 buying and 4 selling groups using data mining techniques. Then, the efficiency of active companies in the stock exchange is evaluated using input-oriented DEA. The results indicate that, among the 23 industries listed on the stock market in Iran, only nine were efficient in 2019. Moreover, in 2020, the number of efficient industries decreased further to six. A comparison of the obtained results with those of a study conducted in 2018 by other researchers reveals that COVID-19 strongly affects the performance of an industry and that some industries that were efficient in the past, such as the banking industry, became inefficient in the following year.

1. Introduction

Creating a suitable environment for investors to invest in stock markets is essential for economic progress [1]. Stable economic conditions usually lead to a predictable environment, and all these will persuade investors for a more reliable investment [2]. The investment market plays an important role in the allocation of financial sources to companies in developing countries [3]. Hence, the identification of key criteria for a safe investment is critically important. Processing big data of past investments in a specified field can be very useful for making efficient investment decisions. Data mining is one of these processing methods [4]. Stock exchanges, which can provide large amounts of historical data related to previous investments, are a suitable resource for applying data mining approaches [5]. Today stock exchange markets are rapidly growing. Every day, a large number of shares are traded, and huge amounts of data result. At the same time, global stock exchange markets are being created, which can be considered as an opportunity for investors. In such a situation, the main problem is the selection of proper investment criteria and the creation of optimal portfolios. Economists and experts believe that competition among actors in the market leads to improved pricing efficiency of shares [6]. Nevertheless, investors and other market participants usually assume that a random selection of shares is not a useful strategy. It is very important for us to know which factors have an impact on the behavior of investors. Another issue that has created some problems in Iranian investment markets is the deficiencies in analyses. A lot of experts and investors believe that price divided by earnings per share (P/E) is the main index of change in prices. Lack of long-term models has pushed investors to employ short-term models for analyses. 
When a large number of investors invest their money based on a limited number of models, many people simultaneously focus their investments on a small number of companies or industries and disinvest from other industries at the same time. This intensifies problematic behaviors in the market. When there are a lot of fluctuations in economic variables, the profitability of companies can also be significantly affected by these changes. In such a situation, investors do not trust the published data. Besides, operational conditions and efficiency might have an influence on share prices. Traina [7] showed that rates in the financial statements have an impact on share prices. In other words, figures in the financial statements show the performance and efficiency of the companies. Investment in efficient companies produces higher returns. Here, the main problem is that the evaluation of the efficiency of companies with various criteria and indices is not an easy task. In addition, decision-making based on a limited number of indicators is insufficient. Although the evaluation of companies by financial statements is difficult, DEA models can help incorporate several criteria and indicators when assessing the companies [8, 9]. The data are gathered from the Tehran stock exchange. The efficiency of companies whose shares have been bought is determined and compared.

Since the emergence of COVID-19 in 2019, many aspects of our lives have been affected. Many countries were forced to lock down, and subsequently, many industries came to a standstill. The stock market values of many industries decreased dramatically, and many people lost their jobs. This research can help to find out which companies in the stock market of Iran are efficient and which are not. In addition, inefficient companies can benefit from benchmarking against efficient companies.

Data mining is a technique for extracting patterns and also finding correlations among big data sets for prediction purposes [10]. This technique can be applied for increasing revenue, customer satisfaction and relationship, reducing costs and risks, and so on. Data mining includes many methods. One of them is clustering, which is based on the idea of putting similar things into the same cluster.

There are many methods for evaluating the performance of companies or businesses such as the Balanced Scorecard or EFQM (European Foundation for Quality Management). One of the most popular methods for performance evaluation of companies is Data Envelopment Analysis (DEA), which is based on the linear programming method [11]. In this method, Decision-Making Units (DMUs) are evaluated by input and output factors. This method consists of the variants BCC by Banker et al. [12] and CCR by Charnes et al. [13].

COVID-19 has spread rapidly in many countries, including Iran, with a strongly increasing number of cases since the end of February 2020. From the beginning of this pandemic, all public places were closed in Iran, and social distancing and home quarantine were encouraged. Due to US sanctions against Iran and a resulting fragile economy, the government could not shut down the country. Hence, the number of patients and death rates increased dramatically. The government reduced the weekly work hours in government organizations. Most of the private companies and factories could not maintain production levels. In some cases, the government ordered a shutdown in the capital of Iran and some other cities for one week. These issues led to a decreased performance of companies.

Several studies conducted in Iran investigate the effects of the COVID-19 pandemic. Ahmadi and Ramezani [14] studied the COVID-19 effects from an emotion-focused therapy perspective. Samadi et al. [15] used Wavelet Coherence Analysis to examine the comovements between markets in a time period from September 2014 to June 2020 as a period of intense uncertainty in Iran. According to our search, no study was conducted to assess the effect of COVID-19 on the Iran stock market.

The stock market of Iran is the main source of pricing the most important goods such as cement and other commodities and for allocating capital for companies. Besides, many people invest their money, and if stocks decrease abnormally, many people lose their properties. Therefore, the stock market of Iran plays a key role for both companies and private persons.

There are a lot of papers published not only about stock markets worldwide but also about the Iran stock market using diverse methods such as MCDM or DEA. However, previous studies on the Iranian stock market were carried out in normal situations, so that the following question remains: Does the pandemic have a significant effect on the Iran stock market?

The novelty of this paper is using a hybrid technique for clustering and efficiency assessment. We examine the performance of Iran’s stock market during the COVID-19 pandemic and investigate the efficiency of Iranian companies in the stock market.

First, well-known data mining approaches such as clustering methods are used to investigate the transaction behavior of investors using 30 criteria. Afterward, the Data Envelopment Analysis (DEA) approach is employed to assess the efficiency of investments. The novelties of this study are divided into two parts. In the first part, data mining methods are used to study the behavior of customers in the stock markets in Iran based on their transactions. The second part includes the evaluation of the efficiency of active companies in the Tehran stock exchange using DEA. The proposed approach is applied to all active industries in Tehran stock exchange.

The remainder of this paper is organized as follows. In Section 2, a brief literature review of past research is provided. In Section 3, the proposed approaches are discussed. The case study and results are presented in Section 4. Finally, Section 5 concludes the paper.

2. Review of Past Research

2.1. Studies Related to Efficiency Assessment in Stock Markets

In this section, we review a number of studies employing neural networks, clustering and DEA, and other data mining approaches to stock market analysis.

Hiransha et al. [16] used novel deep learning architectures such as Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) Networks, and Convolutional Neural Networks (CNN) to predict stock prices of companies on the National Stock Exchange of India and the New York Stock Exchange. As a result, the Convolutional Neural Network (CNN) method performed better than other methods due to its ability to catch sudden changes in the network during the forecasting phase.

Montenegro and Molina [4] started from the fact that the neural networks appeared superior to other methods in modeling nonlinear data and worked on the prediction of the day-to-day stock value of the S&P 500 Index. By creating a data set from the daily market activity values of the stocks between June 7, 2013, and June 6, 2018, for each company in the S&P 500 Index, the Deep Learning Neural Network method was used in the training of the network and the Feature Selection Analysis method in determining the behavioral tendencies of the companies. Thus, this paper helped decision-makers to improve investment behavior.

Yazdi et al. [17] measured the performances of 25 insurance companies, four of which were public and the rest were private, in Iran during the 2014–2015 period. They developed and used a new approach based on Data Envelopment Analysis (DEA) together with fuzzy clustering. In the study, which considers employees, capital, and total assets as input variables and total costs, paid compensation, and profit as output variables, it is concluded that ten companies belong to an efficient cluster with the output-oriented CCR method. Similar to our study, the authors analyzed DMUs (banks and companies listed at the stock market) using DEA and clustering methods.

Khedr et al. [5] estimated the behavior in the stock market by using news sentiment analysis and data mining techniques, considering a data set containing the opening, high, low, and closing (OHLC) prices of the stocks of three companies traded in the NASDAQ market, and three news articles per day about the market and the company. According to the results obtained, a strong correlation was found between the news classified as positive or negative and the fluctuations in stock prices. In addition, it has been observed that using the opening, high, low, and closing prices of stocks increases the accuracy of the market behavior prediction up to 89.80%. The used method is similar to other studies, including our investigation, as a data mining approach is applied.

Karimi and Barati [18] evaluated the financial performances of 72 companies selected from the automotive, pharmaceutical, petrochemical, and cement sectors traded on the Tehran stock exchange using the negative data-limited adjusted measure in DEA [19]. In this study, financial ratios such as assets turnover ratio, quick ratio, and current ratio serve as input criteria, whereas earnings per share ratio, return on assets ratio, net profit margin ratio, and so on are output criteria. After all, fifty-eight companies were determined as efficient, and the rest were determined as inefficient companies; then, efficient companies were ranked using the Andersen and Petersen model. In contrast to our study, this paper did not cluster industries, and the study is conducted under normal conditions, that is, without the effects of COVID-19 on the efficiency of companies.

Anouze and Bou-Hamad [20] considered countries in the Middle East and North Africa (MENA) and investigated the effects of environmental factors on the performance of 151 banks in the years 2008–2010. They used data envelopment analysis and statistical data mining techniques such as classification and regression trees (CART), conditional inference trees (CIT), random forest based on CART and CIT, bagging, Artificial Neural Networks, and logistic regression. In the study, fixed assets, deposits, equity, and personnel expenses are input variables. Loans, net income, and liquid sources are determined as output variables. In summary, it was concluded that random forest and bagging methods are better statistical tools than other methods in measuring bank performance by using the importance rankings of the variables. The study used DEA and data mining methods, but the data mining techniques differ from those used in our analysis.

Chang et al. [21] analyzed the portfolio performance valuation of 44 textile companies in Taiwan from 2011 to 2018 using the nested dynamic network (NDN) Data Envelopment Analysis (DEA) method. They applied a two-stage approach to their work. First, the additive dynamic DEA method was applied to find efficient financial assets. Then, an evaluation was made with the NDN DEA model to evaluate the portfolios consisting of selected effective financial assets in periods of three, four, five, or even six years. Also, here, the DEA technique was employed, but with a different data mining method compared to our study.

Zhong and Enke [22] evaluated 60 financial and economic factors in the S&P 500 Index using clustering and classification as data mining methods for daily stock market return prediction. Within the scope of this evaluation, 60 factors in the data set consisting of 2518 trading days between June 1, 2003, and May 31, 2013, were classified using the Principal Component Analysis (PCA) method to find the most important and most effective main components. In the next stage, by obtaining 12 new data sets from all adjusted data, the daily direction of returns was estimated by an Artificial Neural Network (ANN) and logistic regression methods. It has been concluded that the ANN achieves higher accuracy than logistic regression and that classification and cluster mining are important in reducing the size of the data and increasing its efficiency.

Rezaee et al. [23] applied integrated methods to estimate the online financial performance of companies when online data were taken from the Tehran stock exchange in 2007–2012 at different time intervals. The used methods are Dynamic Fuzzy C-Means for updating cluster and membership numbers, Data Envelopment Analysis (DEA) for evaluating companies using financial ratios, and finally Artificial Neural Networks (ANN) for predicting the future performance of companies. The difference to our approach is that we used the K-means method, while Rezaee et al. [23] used fuzzy C-means.

Mehlawat et al. [24] evaluated 20 risky assets with their credibility by using variance and Conditional Value at Risk (CVaR) as risk measurement tools together with liquidity and entropy criteria. Within the scope of this evaluation, randomly selected portfolios with different sample sizes, including risk and entropy as input values and return and liquidity as output values, were examined by the DEA method. Moreover, these risky assets were combined with the fuzzy multipurpose portfolio model, and performance evaluation was made. In addition, the result of this paper indicated how a portfolio could be selected considering the risk of companies by using DEA and VaR.

Mashayekhi and Omrani [25] have proposed a new multipurpose model, which incorporated DEA cross-efficiency into the mean-variance Markowitz model to select an investment portfolio. The model examined 52 companies operating in Iran’s stock market. The results showed that the amounts of risk and return were considerably appropriate at the same time in comparison with the Markowitz model and DEA.

Table 1 provides an overview of methods used in this field.

As mentioned above, stock market evaluation is one of the most popular subjects in financial analysis. The reason is that the stock market is the engine of the economy in most countries. In some related studies such as ours, the DEA method is used alone or in combination with data mining or Artificial Intelligence (AI) approaches. In DEA, the DMUs are evaluated and categorized by data analysis. In other methods, regression shows the relationship between stock markets and the factors that affect them. Some other approaches use AI to predict the performance of companies in the stock market. In our study, the effects of the COVID-19 pandemic, one of the substantial issues that affected economies worldwide and especially stock markets, are evaluated. In particular, it is analyzed which industries are efficient. Inefficient industries should be benchmarked against efficient industries in order to improve.

2.2. Studies on Economic Effects of COVID-19

In the following, we provide a brief overview of studies related to the economic consequences of the COVID-19 pandemic in different countries.

Caraka et al. [27] studied the effects of COVID-19 on the environment and the economy of Indonesia. The conducted statistical analysis and results show that COVID-19 has a significant effect on the economy of Indonesia. Albu et al. [28] used a logistic model to predict the effects of the pandemic on the economy of Romania. Based on their simulation, pandemic evolution was classified into four distinct phases. Three scenarios were considered to estimate the economic impact of the epidemic at three levels. Results showed that, in the long term, an economic program based on large investment could contribute to restoring growth levels both worldwide and in the case of the EU countries.

Grima et al. [29] studied the previous mechanisms which were applied to provide an understanding of the challenges related to GDP. A simple statistical analysis was used by adopting data that were collected from government websites, online statistics, published reports, trends, and internal data. It is mentioned in the study that the research will help risk managers and leaders to understand the devastating social and economic impact of such disruptions and act proactively to avoid repetition and the negative effects of being unprepared.

Thorbecke [30] studied the impact of COVID-19 on the United States economy. Stock returns for 125 sectors are considered during the COVID-19 crisis in this study. The paper investigates how both the macroeconomic environment and sector-specific factors affect returns. Several macroeconomic variables were used in this study. With the help of a regression technique, estimation equations were used to predict the stock price index. Finally, the study discusses that stock prices are useful because they provide a measure of how investors expect shocks to impact future cash flows across sectors. Results show that sectors impaired by idiosyncratic factors include airlines, aerospace, real estate, tourism, oil, brewers, retail apparel, and funerals.

3. Research Methodology

The main steps of the proposed approach of this study are presented here. First, some essential financial indicators are determined and defined. Then, a clustering approach is conducted to cluster the considered companies of this study. The performance of each company and related clusters is determined through DEA.

3.1. Main Financial Indicators

In order to investigate the stock exchange, some financial indicators are required. Consultation with a number of experts with sufficient experience in the field led to the identification of the most essential financial indicators. They are defined as follows (see, e.g., [31, 32]):

Payments to shareholders include capital, share reduction, savings, profits, and losses.

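Return on equity shows the efficiency of the management of a company in employing resources to obtain profits. It measures the rate of return on the owners’ investments. Return on equity is calculated by dividing the net income of the company by shareholders’ equity:

$$\text{Return on equity} = \frac{\text{Net income}}{\text{Shareholders' equity}}$$

Return on assets is calculated by dividing the net income of the company (income after taxes) by total assets:

$$\text{Return on assets} = \frac{\text{Net income}}{\text{Total assets}}$$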
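As a small illustration of the two ratios, the following sketch computes them for one company. It assumes the common textbook definitions (net income divided by shareholders' equity and by total assets, respectively); the function names and all figures are hypothetical and are not taken from the Tehran stock exchange data.

```python
def return_on_equity(net_income: float, shareholders_equity: float) -> float:
    """ROE under the assumed definition: net income / shareholders' equity."""
    if shareholders_equity == 0:
        raise ValueError("shareholders' equity must be non-zero")
    return net_income / shareholders_equity


def return_on_assets(net_income: float, total_assets: float) -> float:
    """ROA: net income (income after taxes) / total assets."""
    if total_assets == 0:
        raise ValueError("total assets must be non-zero")
    return net_income / total_assets


# Hypothetical balance-sheet figures for a single company (illustration only).
net_income = 120.0
equity = 800.0
assets = 2000.0

roe = return_on_equity(net_income, equity)
roa = return_on_assets(net_income, assets)
print(f"ROE = {roe:.3f}, ROA = {roa:.3f}")
```

Both ratios can be negative when a company reports a loss, which is one reason why negative data need special handling before a DEA model is solved.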

3.2. Methods

3.2.1. Clustering Approach

The first phase of the proposed approach is the clustering of investment records in order to find subsets of data such that the variance within clusters is minimized and the variance between clusters is maximized [33]. One of the main problems in clustering is determining the number of clusters, for which several methods exist. In this study, Wilk’s lambda method [34, 35] as shown in (5) is used to determine the suitable number of clusters:

$$\Lambda = \frac{SS_{\text{within}}}{SS_{\text{total}}} \qquad (5)$$

Here, $SS_{\text{within}}$ is the variance within a cluster and $SS_{\text{total}}$ is the total variance. If Wilk’s lambda coefficient is plotted against the number of clusters k (cf. Figure 1), the first jump in the diagram indicates the optimal number of clusters.

Based on the data gathered from the Tehran stock exchange, the following variables are used to cluster the customers:
(i) Online buying and selling of shares
(ii) Time of buying and selling, which is divided into four periods (i.e., the first six months of 2019, the second six months of 2019, the first six months of 2020, and the second six months of 2020)
(iii) Company and industry whose shares have been sold
(iv) The amount of sold or bought shares
(v) Value of transactions, which is equal to the volume of transactions valued at the mean share price on that day [36–38]
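The choice of the number of clusters can be sketched as follows. This is a minimal illustration, not the MATLAB procedure used in the study: it assumes the ratio of within-cluster to total sum of squares as the coefficient (the full multivariate Wilks statistic uses determinants of the corresponding scatter matrices), a plain K-means with random initialization, and synthetic two-dimensional data. All function names and data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, iters=50, seed=0):
    """Plain K-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def lambda_ratio(points, k):
    """Within-cluster sum of squares divided by total sum of squares."""
    labels, centers = kmeans(points, k)
    grand_mean = mean(points)
    total = sum(sq_dist(p, grand_mean) for p in points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return within / total


# Synthetic data: three well-separated groups of 30 points each.
rng = random.Random(42)
points = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]

curve = {k: lambda_ratio(points, k) for k in range(1, 6)}
for k, value in curve.items():
    print(k, round(value, 4))
```

Plotting `curve` against k and looking for the first pronounced jump mirrors the reading of Figure 1. Because K-means can stop in a local optimum, the curve is not guaranteed to decrease monotonically for a single random start.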

3.2.2. Analyzing the Efficiency Scores

Farrell [39] used a method on the basis of estimating production functions to measure the technical efficiency in a manufacturing company incorporating one input and one output. Farrell [39] also used the approach to evaluate the efficiency of the farming industry in the USA compared to the other countries. In 1978, Charnes et al. [13] developed Farrell’s idea and presented a model that had the ability to measure efficiency in the presence of several inputs and outputs. The model proposed by Charnes et al. [13] was the first official Data Envelopment Analysis (DEA) model. In 1984, Banker et al. [12] extended the proposed model in the presence of variable return to scale assumptions, called the BCC model. The difference between the CCR and the BCC model is that the return to scale is constant in the CCR model, whereas in the BCC model, it is not.
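For the special case of one input and one output under constant returns to scale, the efficiency of a unit reduces to its output-to-input ratio divided by the best ratio observed in the sample. The sketch below illustrates this case only; the function name and the data are hypothetical.

```python
def ccr_efficiency_single(inputs, outputs):
    """Constant-returns efficiency for one input and one output.

    Each unit's productivity (output / input) is divided by the best
    productivity in the sample, so the best unit scores exactly 1.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical units: (input, output) pairs.
inputs = [2.0, 4.0, 5.0]
outputs = [2.0, 3.0, 5.0]

scores = ccr_efficiency_single(inputs, outputs)
print(scores)
```

With several inputs and outputs this shortcut is no longer available, which is why the CCR and BCC models are formulated as optimization problems.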

As the companies are assumed as DMUs, they should be analyzed using the DEA approach in the next phases of this research. Experts are asked to divide the financial indices into two main classes as inputs and outputs. Based on the views of experts of the stock exchange, the indicators are divided into inputs (total assets, total debts, and payments to shareholders) and outputs (operational profits and losses, return of equity, return on assets, and sales). A company is shown in Figure 2 as a DMU.

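The input-oriented DEA based on the variable returns to scale assumption is used to evaluate the financial efficiency of companies. This model was first proposed by Banker et al. [12] and is formalized in (6). Let $x_j = (x_{1j}, \ldots, x_{mj})$ and $y_j = (y_{1j}, \ldots, y_{sj})$ be the input and output vectors of $\text{DMU}_j$ for $j = 1, \ldots, n$.

$$
\begin{aligned}
\max\quad & \theta_p = \frac{\sum_{r=1}^{s} u_r y_{rp} + W}{\sum_{i=1}^{m} v_i x_{ip}} \\
\text{s.t.}\quad & \frac{\sum_{r=1}^{s} u_r y_{rj} + W}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \quad j = 1, \ldots, n, \\
& u_r \ge 0,\ r = 1, \ldots, s, \qquad v_i \ge 0,\ i = 1, \ldots, m, \qquad W \text{ free}.
\end{aligned}
\qquad (6)
$$

Here, p is the DMU being evaluated in the set of $n$ DMUs. $\theta_p$ is the measure of the efficiency of DMU p. $u_r$ and $v_i$ are the weights assigned to output $r$ and input $i$ for solving the DEA model. $W$ is the free variable. Model (6) is a fractional mathematical programming problem, and its global optimum is hard to find. So, Banker et al. [12] proposed the following linear form:

$$
\begin{aligned}
\max\quad & \theta_p = \sum_{r=1}^{s} u_r y_{rp} + W \\
\text{s.t.}\quad & \sum_{i=1}^{m} v_i x_{ip} = 1, \\
& \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} + W \le 0, \quad j = 1, \ldots, n, \\
& u_r \ge 0,\ r = 1, \ldots, s, \qquad v_i \ge 0,\ i = 1, \ldots, m, \qquad W \text{ free}.
\end{aligned}
\qquad (7)
$$

In the BCC model, the sign of the variable W indicates the returns to scale for each DMU:
A: if W < 0, returns to scale are decreasing
B: if W = 0, returns to scale are constant
C: if W > 0, returns to scale are increasing

The dual of the linear model (7), which is called the envelopment form, is specified as

$$
\begin{aligned}
\min\quad & \theta_p \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta_p x_{ip}, \quad i = 1, \ldots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{rp}, \quad r = 1, \ldots, s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0,\ j = 1, \ldots, n.
\end{aligned}
\qquad (8)
$$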

These models can be used to evaluate the financial performance of companies as DMUs, which were schematically depicted in Figure 2.
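To make the envelopment form concrete, the sketch below evaluates input-oriented efficiency under variable returns to scale for the special case of one input and one output. It is an illustration under stated assumptions, not the GAMS implementation used in the study. With only a convexity constraint and one output constraint, an optimal convex combination needs at most two reference units, so enumerating single units and pairs is sufficient here; the general multi-input, multi-output case requires a linear programming solver. The function name and the data are hypothetical.

```python
def bcc_input_efficiency(inputs, outputs, p):
    """Input-oriented VRS efficiency of unit p (one input, one output).

    Searches for the smallest input level of a convex combination of
    observed units that still produces at least the output of unit p,
    then divides it by the input of unit p.
    """
    x_p, y_p = inputs[p], outputs[p]
    n = len(inputs)
    best_input = float("inf")
    for j in range(n):
        # A single unit that produces at least y_p is a feasible reference.
        if outputs[j] >= y_p:
            best_input = min(best_input, inputs[j])
        for k in range(j + 1, n):
            dy = outputs[j] - outputs[k]
            if dy == 0:
                continue
            # Weight t on unit j at which the mixed output equals y_p.
            t = (y_p - outputs[k]) / dy
            if 0.0 <= t <= 1.0:
                mixed_input = t * inputs[j] + (1.0 - t) * inputs[k]
                best_input = min(best_input, mixed_input)
    return best_input / x_p


# Hypothetical units A, B, C, D given as (input, output).
inputs = [2.0, 4.0, 8.0, 5.0]
outputs = [1.0, 4.0, 6.0, 3.0]

scores = [bcc_input_efficiency(inputs, outputs, p) for p in range(len(inputs))]
print([round(s, 4) for s in scores])
```

In this example the first three units lie on the variable-returns frontier, while the fourth is dominated by a mixture of the first two. Under constant returns to scale only the unit with the best output-to-input ratio would be rated efficient, which illustrates the difference between the CCR and BCC assumptions.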

3.3. Research Procedure

In our study, the following steps are conducted:

Step 1. Extract inputs and outputs.

Step 2. Select companies from population.

Step 3. Collect data based on inputs and outputs.

Step 4. Run the DEA method.

Step 5. Use the K-means method for clustering the result of DEA.

Step 6. Benchmark inefficient clusters against the efficient cluster.

Figure 3 shows the research procedure.

3.4. Data Sample

The proposed approach of this study is applied to 214 companies in total, which are listed at the Tehran stock exchange and belong to 23 industries. The number of companies related to each industry is shown in Table 2. Data regarding the considered financial indicators are taken from published balance sheets of the companies. In addition, data of transactions of 56643 customers in 2019 and 2020 are gathered and used for our study.

4. Results and Discussion

After prescreening, the data of those customers who have at least one buying and one selling record during this period are selected. Financial data of customers, companies, and industries are collected from the Tehran stock exchange database. Excel and SPSS are used for preprocessing and analyzing the data. MATLAB is used for clustering. GAMS software is utilized to codify the DEA models.

4.1. Proper Number of Clusters for Buying and Selling Data

As mentioned before, Wilk’s lambda coefficient is used to determine the suitable number of clusters. As shown in Figure 1, the first jumps for buying and selling data occur for k = 3 and k = 4 clusters, respectively. So, the suitable number of clusters based on Wilk’s lambda coefficient is 3 and 4 for buying and selling data, respectively.

4.2. Dimension Reduction by Principal Component Analysis

Principal Component Analysis (PCA), which is a method for reducing the dimension of high-dimensional data on the basis of the direction of data dispersion [40, 41], is used to reduce the dimension of the buying and selling data so that the clustering results can be shown in a two-dimensional plot. PCA is applied to both buying and selling data. The associated two-dimensional clustering plots are shown in Figure 4.

4.3. Cluster Analysis Using the K-Means Method

Buying and selling records are clustered separately using the K-means method with the number of clusters suggested by Wilk’s lambda coefficient. Each record in the database is associated with a person who made a buy or sell transaction between 2019 and 2020. The results of clustering are shown in Table 3. The clustering is performed in a 25-dimensional space as shown in Table 3, and the buying and selling data comprise 3 and 4 clusters, respectively. The variables considered for clustering the data are presented in Table 4.

In buying, the first buying-related cluster has the lowest number of members among the others. It also has the highest average of trading value. Reconsidering the customers included in this cluster, it has been recognized that they were corporate entities. The second buying-related cluster has a middle number of members compared to the other clusters. It is notable that the average of trading values is the lowest in comparison with the other clusters. The third buying-related cluster has the highest number of members among the others. It is notable that the average of trading values is the lowest amount in comparison with the other clusters. More formally, the customers included in this cluster invest in low volume and high diversity.

Regarding selling, the first cluster has the lowest number of members and the highest average of trading value. The second cluster has the middle number of members and the lowest average of trading value. The third selling-related cluster has the highest number of members and the lowest average of trading value. An investigation of customers of this cluster reveals that they are normal persons with an average investment of 15617 USD.

The hit number and percentage of records in both buying and selling clusters are shown in Tables 2 and 5, respectively.

4.4. Results of Customer Clustering

Buying data and selling data included 31587 and 25056 records, respectively. The third cluster of buying with 31507 records and the fourth cluster of selling with 24503 records have the highest number of members among the clusters. This shows that the behaviors of customers in these clusters are similar. Figure 5 shows the mean of traded shares in each cluster. Figures 6–8 present transaction details of clusters 1–3. As can be seen in Figure 6, in the first cluster, investors have invested in two sectors (bank and multi industrial company). This cluster has the highest value of transactions and the lowest dispersion of investment in various industries. As can be seen in Figure 7, in the second cluster, investors have invested in 16 industries. Technical and engineering services and chemical products have the largest shares in buying transactions of this cluster.

As can be seen in Figure 8, in the third cluster, investors have invested in all 23 industries. Among these industries, bank and chemical products are most often traded in this cluster.

4.5. Results of Efficiency Measurement Using DEA in Each Cluster

In the prescreening phase, negative data are shifted using variable exchanges. Portela et al. [42] introduced a method for replacing negative data in DEA with non-negative values so that the model can be solved. These variable exchanges are done by the following equations for both output and input variables.

First, consider $R(p)$ as the SP range of DMU $p$, which measures the distance between a reference variable and the current variable. For output variables $y_r$, it is

$$R_{rp} = \max_{j} \{y_{rj}\} - y_{rp}, \quad r = 1, \ldots, s.$$

For input variables $x_i$, we have

$$R_{ip} = x_{ip} - \min_{j} \{x_{ij}\}, \quad i = 1, \ldots, m.$$
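The effect of such a range transformation can be sketched as follows. The example assumes that the reference value is the best observation in the sample (the maximum for an output, the minimum for an input), so that every range is non-negative even when the raw data contain negative values. The function names and all figures are hypothetical.

```python
def output_ranges(values):
    """Distance of each unit's output to the largest observed output."""
    best = max(values)
    return [best - v for v in values]


def input_ranges(values):
    """Distance of each unit's input to the smallest observed input."""
    best = min(values)
    return [v - best for v in values]


# Hypothetical data: operational profit or loss (may be negative) and debts.
profits = [-3.0, 0.0, 5.0]
debts = [4.0, 7.0, 10.0]

profit_ranges = output_ranges(profits)
debt_ranges = input_ranges(debts)
print(profit_ranges, debt_ranges)
```

A range of zero marks the unit that already attains the best observed value for that variable.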

Data are also normalized. The DEA models are codified in GAMS software. The input-oriented DEA model considering variable returns to scale is used to evaluate the efficiency of the companies in the Tehran stock exchange. Table 6 presents the efficiency scores for 2019. According to the data presented in Table 6, among 23 industries, 9 are efficient. Seventy-nine percent of inefficient industries have an efficiency lower than 0.6. The transportation industry, with an efficiency of 0.03, is the most inefficient industry.

In order to suggest a practical benchmark for increasing the efficiency of inefficient industries, a reference set (linear combination of efficient companies) is used.

Each reference set includes efficient DMUs, which can construct efficient projections of the associated inefficient DMU. According to Table 6, the tile industry and the multi-investment industry have the highest presence in the reference sets of all inefficient industries; that is, they serve most often as benchmarks for inefficient industries.

For instance, the inefficient computer industry can follow the practices of the tile industry, the multi-investment industry, and the car industry in its selection of inputs and outputs in order to be projected toward the efficient frontier. Based on the efficiency scores reported in Table 6 and the reference sets, one can project the inefficient DMUs toward the efficient frontier. This projection can serve as a practical benchmark for inefficient DMUs. Tables 7–9 present the efficiency scores and the fraction of total investment in each industry.
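The projection onto the efficient frontier can be sketched as follows, assuming the peer weights (lambdas) of an inefficient DMU are already available from the DEA model. The helper names `reference_set` and `projection`, the DMU labels, and all numbers are hypothetical and serve only to illustrate the idea: the target of an inefficient DMU is the weighted combination of its efficient peers.

```python
def reference_set(lambdas, names, tol=1e-9):
    """Return the peers of one DMU: the DMUs that receive a positive weight."""
    return {name: lam for name, lam in zip(names, lambdas) if lam > tol}


def projection(lambdas, values):
    """Weighted combination of the DMUs' input (or output) vectors."""
    n_vars = len(values[0])
    return [sum(lam * row[k] for lam, row in zip(lambdas, values))
            for k in range(n_vars)]


# Hypothetical data: 4 DMUs with 1 input and 1 output each.
names = ["A", "B", "C", "D"]
inputs = [[2.0], [4.0], [8.0], [6.0]]
outputs = [[1.0], [3.0], [4.0], [2.0]]
lambdas_d = [0.5, 0.5, 0.0, 0.0]   # assumed peer weights for inefficient DMU "D"

peers = reference_set(lambdas_d, names)
target_inputs = projection(lambdas_d, inputs)
target_outputs = projection(lambdas_d, outputs)
print(peers, target_inputs, target_outputs)
```

Here DMU "D" is benchmarked against "A" and "B": the target keeps its output of 2.0 while reducing its input from 6.0 to 3.0.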

Table 10 presents the efficiency scores of industries in 2020. An efficiency of 1 indicates that the DMU is efficient, whereas efficiency of less than 1 indicates inefficiency.

In Table 11, the industries are categorized according to the efficiency scores.

Figure 9 presents the efficiency scores of the DMUs in both 2019 and 2020 and thus allows the situation of each DMU to be compared over the period 2019–2020. Figure 9 shows that DMU23 and DMU21 have a major reduction in efficiency (92% and 46%, respectively, compared to their efficiency in 2019). DMU06 has a 35% growth in efficiency compared to its efficiency in 2019.
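The year-over-year comparison amounts to a relative change with respect to the 2019 score. A minimal sketch is given below; the DMU labels and scores are hypothetical placeholders, not values from the tables of this paper.

```python
def percent_change(score_before, score_after):
    """Relative change of an efficiency score, in percent of the earlier score."""
    if score_before == 0:
        raise ValueError("the earlier score must be nonzero")
    return 100.0 * (score_after - score_before) / score_before


# Hypothetical efficiency scores for two DMUs.
scores_2019 = {"DMU_a": 0.50, "DMU_b": 0.40}
scores_2020 = {"DMU_a": 0.04, "DMU_b": 0.54}

changes = {name: percent_change(scores_2019[name], scores_2020[name])
           for name in scores_2019}
print({name: round(value, 1) for name, value in changes.items()})
```

With these placeholder values, the first DMU loses about 92% of its 2019 efficiency and the second gains about 35%.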

Tables 12 and 13 present the efficiency scores for the second and the third cluster in 2020.

5. Conclusions and Future Research Directions

In this study, a hybrid procedure based on clustering analysis and DEA was proposed to investigate the selling and buying behavior of investors. The procedure was applied to financial records of the Tehran stock exchange. Its main steps are as follows. In the first stage, the data were prescreened. Then, a clustering approach was used to identify the main clusters of selling and buying records. Finally, DEA was used to measure the efficiency scores in each cluster. The results were analyzed based on the financial data of the customers in the Tehran stock exchange. The efficient and inefficient DMUs (companies) were determined based on the efficiency scores in each buying and selling cluster. A reference set was proposed for each inefficient DMU to obtain its projection toward the efficient frontier. The reference set can help the managers of inefficient industries move toward the best benchmarks in the market. In future studies, additional clustering approaches can be considered. Uncertainty in data and clusters, as well as fuzziness in inputs and outputs, can be modeled through fuzzy clustering approaches and fuzzy DEA modeling.

The COVID-19 pandemic has strongly affected our lives, and production and service companies are no exception. In Iran, the pandemic has had strong effects on a fragile economy. Death rates, lockdowns, and decreased productivity are some of the factors with economic impact. Hence, the stock market is undoubtedly affected by the pandemic. In this paper, the performance of companies in the stock market during two years of the pandemic is evaluated.

The results indicate that, among the 23 industries listed on the stock market in Iran, only nine were efficient in 2019. In 2020, the number of efficient industries fell further to only six, which shows that the COVID-19 pandemic had a strong effect on the stock market. Among the efficient categories identified through clustering, the banking and investment industries were the most efficient. The reason is that some of them are supported by the government, while others had sufficient financial reserves to cope with the crisis. However, in 2020, the bank industry was not efficient because the effect of the spread of COVID-19 outweighed government support and other factors. The least efficient industry in both 2019 and 2020 is transportation. The reason is that, under the rules introduced in response to the COVID-19 pandemic, many countries forbade or limited entry to their territories; hence, the performance of transportation decreased dramatically. Another industry near the bottom of the efficiency ranking in both 2019 and 2020 is the insurance industry. The reason is that many people in Iran became infected with COVID-19, and insurance companies therefore incurred high costs to cover the medical treatment of policyholders. The clustering results show that industries directly and strongly affected by COVID-19 had the lowest performance and were inefficient. To increase their performance and efficiency, they must pay more attention to financial issues and indicators. In other words, the outbreak of the COVID-19 pandemic diminished the efficiency of companies in various industries in Iran. Comparing our results with those of another study conducted during a normal economic situation [18], we find that three industries, namely automobile, pharmacy, and cement, usually had the highest efficiency.
However, in 2019, only the automobile and pharmacy industries remained efficient, whereas the cement industry became inefficient. The reason is that the demand for housing construction, and thus for cement, decreased dramatically. In 2020, during the outbreak of the pandemic, the automobile sector became inefficient as well. The reason is that people do not invest in buying cars in times of crisis. The pharmacy industry remained efficient in both years because many people were seeking treatment and therefore needed pharmaceuticals and other treatment facilities.

For future research, we suggest further development of the employed methodology. In particular, researchers can investigate an uncertain environment, for example, by using approaches based on fuzzy numbers, D numbers, or Z numbers.

Data Availability

The data used in this study can be made available upon request to the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.