#### Abstract

Composite indices are a valuable tool for researchers and policymakers alike, as they simplify complex phenomena and enable cross-country comparisons. A troublesome issue in constructing composite indices is the selection of the weighting system, as it can greatly influence the results of the index. One of the most reliable weighting systems is the expert weighting system, where experts on the topic being studied are delegated the weight selection process, and the average of their responses is then transformed into weights. The limitation of this method, however, is the high subjectivity, uncertainty, and inconsistency of the expert responses. This paper seeks to address this limitation by providing a guide to researchers on how to improve the expert weights by subjecting them to the fuzzy analytic hierarchy process (FAHP) method for multicriteria decision making (MCDM) to compute the fuzzy weights, which are more objective and reliable than the expert weights. That said, and despite the benefits of the FAHP method, it can produce weights that skew the composite index results. To address this limitation, the study introduces the interval weights, which are calculated by finding the midpoint between the expert weights and the fuzzy weights. The resulting interval weights exhibit the benefits of both principal component analysis (PCA) and the FAHP process, the difference being that PCA cannot be applied to noncompensatory indices.

#### 1. Introduction

Composite indices provide a simplification of reality for phenomena that are complex and multidimensional in nature through a mathematical process known as aggregation [1]. Aggregation is the most important step when constructing composite indices, and it involves: (1) the data normalization method, e.g., min-max, z-standardization, and ranking; (2) the data aggregation method, e.g., arithmetic mean, geomean, and quadratic mean; and (3) the weighting system of choice, e.g., equal weights, subjective weights, expert weights, and principal component analysis (PCA) [1]. The most critical part of the aggregation process is the weighting system, as the outcome of a composite index is highly dependent on and sensitive to the chosen system; this is referred to as the “index problem” [2]. To address this problem, authors tend to select equal weights, which is the most popular method for composite indices in the literature [2, 3], as well as the most appropriate method for compensatory indices, i.e., where a deficiency in one proxy is compensated by the good performance of another substitutable proxy. However, for noncompensatory and partially compensatory indices, the most reliable weighting system is the expert weighting system, where a panel of experts on the phenomenon or field being studied is delegated the weight selection process. This process, however, has its limitations, as expert responses are troubled by high subjectivity, uncertainty, and inconsistency [4, 5].

That said, how can one address the limitations of the expert weighting system? Are there any methods that can reduce the subjectivity, uncertainty, and inconsistency of the expert responses? To address these limitations, the expert responses are subjected to the fuzzy analytic hierarchy process (FAHP) method. The fuzzy analytic hierarchy process (FAHP) is based on Zadeh’s [6] fuzzy set theory and Saaty’s [7] analytic hierarchy process (AHP) method for multicriteria decision making (MCDM), a technique that helps in complex decision-making problems particularly in the presence of a large number of alternatives and criteria [8, 9]. According to Kabir and Hasin [10], there are several methods to conduct FAHP, including but not limited to: fuzzy logarithmic least-squares method [11], geomean method [12], synthetic extent analysis [13], fuzzy least square method [14], lambda-max method [15], nonlinear fuzzy preference programming [16], and two-stage logarithmic programming [17], among others.

Like PCA, the FAHP method highlights the most prominent proxies utilized in the construction of the indices and assigns them the highest weight. However, FAHP is advantageous over PCA as the latter can only be applied to substitutable indicators or fully compensatory indices [1], whilst the former can be applied to indices of various compensatory nature. By integrating expert responses and the FAHP method, the researcher can improve the objectivity of the weights and reduce the uncertainty of the expert responses.

Despite its merits, the FAHP method is not without its limitations. To elaborate, the original AHP method developed by Saaty [18] can be utilized to weight criteria according to paired comparisons [19] without the need to further fuzzify the process as the method already incorporates fuzziness. This opinion is shared by Kubler et al. [20], who posit that no additional benefit is reaped from including fuzzy logic in the AHP procedure. Mukherjee [21] agrees with both scholars stating that FAHP adds unnecessary complexity to the decision-making process and that it violates the fundamentals of AHP. Moreover, he addresses the issue of the lack of consistency measures in FAHP relative to traditional AHP, stating that limited research has been conducted on the matter. This opinion is shared by Zhü [22] who posits that no methods exist to check the validity or consistency of results under FAHP.

With that said, why FAHP over AHP? FAHP is preferred over AHP since the latter does not deal well with uncertainty. To elaborate, summarizing expert opinion on a particular proxy and transforming them into a single value has a high degree of uncertainty, which could be related to a poor understanding of the task, expert bias, low interest in the survey by the experts, or simply human error, all of which could lead to inaccurate outcomes. Moreover, AHP is more appropriate for more straightforward or crisp decisions [10], whereas FAHP is more appropriate where ambiguity and fuzzy outcomes are likely present. As such, and due to the forestated research gaps, the FAHP method is better suited than the AHP method when collecting expert responses for developing composite indices as this research is attempting.

Thus, the purpose of this paper is to serve as a guide for researchers on how to integrate expert responses with the FAHP method to reduce the subjectivity, uncertainty, and inconsistency of the expert weighting system, and to generate fuzzy weights to be applied to composite indices which are more objective, reliable, and valid in comparison. To achieve this objective, a step-by-step guide on how to generate expert weights will be presented, i.e., how to develop the survey, select the appropriate scale, select the experts, assign alpha values based on the respondent’s level of expertise, collect the expert responses, and interpret their responses, followed by a step-by-step guide on how to integrate the FAHP method with the expert weights, as well as how to normalize the fuzzy weights to obtain the interval weights, which are the midpoint between the fuzzy weights and the expert weights, to address the lopsidedness limitation of the fuzzy weights as they can skew the outcome of the composite index.

#### 2. Literature Review

Numerous studies have discussed the influence of weights on composite indices and the uncertainty and doubt they cast over the robustness and validity of the results, i.e., Permanyer [23]; B. Zheng and C. Zheng [24]; Becker et al. [25]; Seth and McGillivray [26]; among others. This has led many scholars to come up with innovative methods on how to address the weighting problem. One such method is by integrating expert opinion with the FAHP method as achieved by Gopal and Thakkar [27]; Kashani et al. [28]; Bisht et al. [29]; and others. However, many of the studies that adopt this approach fail to address the uncertainty and inconsistency of the expert responses, and the replicability of their methodologies for other composite indices is questionable. This is due to failing to develop a methodology that is highly transparent, tests the validity and reliability of the expert responses, and accommodates for indices with a large number of proxies. The upcoming text will elaborate upon some of these studies and their limitations.

Gopal and Thakkar [27] utilize expert opinion and the FAHP method to develop weights for a sustainability index. The authors group the index proxies into five dimensions, which are further broken down into several subindices. This was achieved to allow for the adoption of a 9-point scale to limit uncertainty. Three experts were delegated the weighting process by participating in a questionnaire. The authors apply Chang’s [13] extent analysis method to solve the FAHP problem. The authors note that the resulting index weights are troubled by the subjectivity brought upon by expert opinion. According to the authors, this problem cannot be solved, and they do not attempt to mitigate it.

Kashani et al. [28] also utilize extent analysis to develop index weights for a safety performance index by integrating it with the input of seven safety experts regarding the relative importance of pairs of indicators. The limitation of this method is that the resulting fuzzy sets were not based on a 1v1 comparison of the indicators, but rather on qualitative measures of relative importance. Moreover, similar to Gopal and Thakkar [27], the authors fail to address whether the expert responses were valid, reliable, and consistent.

Another study that integrates expert opinion with FAHP is that by Bisht et al. [29] who sought expert opinion using a linguistic scale and transformed the expert responses into fuzzy sets and subsequent fuzzy weights for a revised leachate pollution index (r-LPI). The authors adopt three FAHP methods: extent analysis [13], nonlinear fuzzy preference programming (FPP) [16], and logarithmic fuzzy preference programming (LFPP) [30]. According to the authors, the weights resulting from the LFPP method were most accurate and were subsequently utilized for the index. Similar to Gopal and Thakkar [27], the authors group the proxies into three indicators, which are further broken down into 11 subindicators to facilitate the adoption of a 9-point scale. Grouping the 11 indicators into three groups leads to a pairwise matrix and subsequent fuzzy values which could be highly misleading as the impact of some of the subindicators could be diminished. Moreover, there is a lack of transparency on the decision to utilize the weights resulting from the LFPP method over the other FAHP methods.

##### 2.1. Research Gaps and Contribution

Few studies have attempted to address the index weighting problem using FAHP and MCDM, and more specifically, fewer studies have attempted to integrate expert opinion based on scales of relative importance with FAHP to develop weights for composite indices. Regarding the shortcomings of the studies reviewed, they fail to address the skewness limitation of the fuzzy weights and their influence on the index results. Moreover, the methods used categorize the index proxies into smaller groups to facilitate the adoption of a 9-point scale, which could lead to misleading results as a subindicator that could be of high importance and influence on the index results could be marginalized due to being categorized under a group of diminutive importance. Lastly, the authors do not address whether the use of FAHP is justified and whether the index weights resulting from the expert opinions might suffice. As such, and due to the forestated research gaps, this paper seeks to provide a guide to researchers on how to integrate expert opinion with the FAHP method for developing weights for composite indices which involves assigning alpha values to the experts based on their level of expertise, utilizing a scale of relative importance to provide leeway for the experts to express their opinion on the importance of the various proxies, calculating the internal consistency of the expert survey to add high transparency and validity to the expert opinions, introducing fuzzy values by percentiles to facilitate the application of the FAHP to composite indices with a large number of variables instead of categorizing the proxies into groups and subindicators which could diminish their importance, and addressing the skewness limitation of fuzzy weights by calculating the midpoint between the expert weights and the fuzzy weights to produce the interval weights, which reduce the uncertainty, subjectivity, and inconsistency of the expert and fuzzy weights.

#### 3. Methodology

The purpose of this section is to discuss the process of generating expert and fuzzy weights, and how to integrate the two to arrive at the interval weights, which are more reliable and valid in comparison. The research methodology is summarized in the following flowchart (Figure 1), followed by a more detailed explanation in the upcoming text.

##### 3.1. The Expert Weights

The first step of the expert weighting process for the development of composite indices is to develop the survey through which the experts provide their responses. There are various mediums for the researcher to develop their survey, such as Google Forms; the choice of medium is not of significant importance, nor does it impact the outcome of the expert responses. What is important, however, is the design of the survey, particularly the weighting scale that allows the experts to assign levels of importance to the proxies that make up the composite index. To elaborate, the scale must be clear and concise: long enough to allow the experts to properly distinguish the relative importance of the proxies, but not so long that it reduces the significance of a higher score on the scale, and short enough to avoid confusion in weight selection, but not so short that selecting the weights becomes a meaningless exercise. According to the literature, the best practice when developing a scale of relative importance for survey responses is to use a scale of 1 through 7 [33]. Such a scale leads to higher validity, reliability, and internal consistency, and reduces information loss, since the scale length allows the respondent to properly distinguish between the options presented.

Once the survey is developed, the second step is to identify the experts whose opinions will be transformed into weights. The most important factor for selecting the experts is their familiarity with the topic or phenomenon being studied, followed by their level of expertise. Regarding the former, the researcher can identify the experts’ familiarity with the subject by performing their due diligence and examining the authors’ profiles on their affiliated institution or websites such as ResearchGate to identify their field of expertise, as well as the number of articles they have published on the topic being studied. Regarding the latter, the researcher can identify the experts’ level of expertise through author metrics such as the H-index or G-index, which are measures of an author’s research output and citation impact, or through the number of articles they have published in high impact journals, e.g., the number of publications in Q1 journals.

The third step involves reaching out to the experts. This involves sending the experts an email that contains a well-worded and professional cover letter that describes the purpose of the survey and the study it pertains to, as well as a link to the survey in question.

The fourth step involves collecting the survey responses, analyzing the results, and transforming the combined results from an output of a scale of relative importance to actual weights. This process can be further improved by assigning alpha values (*α*) based on the experts’ level of expertise. This can be determined by metrics such as the H-index, or, for example, by including a survey question asking the respondents how many articles they have published on the topic being studied in Q1 journals.

###### 3.1.1. A Case Study Based on the “Bad Behavior Index”

This section provides an example of a survey conducted for the purpose of developing weights based on expert opinions for the construction of the “Bad Behavior Index” (BBI), a composite measure of the development hindering behavior of individuals and institutions. The BBI is a noncompensatory index that seeks to quantify the behavior of individuals and institutions within the context of socio-economic development. The theoretical basis of the index is based on Al Ghazaly’s [31] concept of *Mafsada*, i.e., societal harm, in the Maqasid of Shariah, i.e., Islamic jurisprudence; and Adam Smith’s [32] concept of Worthless Fellow in the Theory of Moral Sentiments. The common theme between these theories is the concept of adherence, as both frameworks posit that there are certain rules and guidelines that individuals must abide by to achieve happiness and well-being. For clarity and simplicity, the number of variables of the index has been reduced to provide a simpler illustration of how to interpret and transform the survey responses. A sample survey response is presented in Figure 2.

A total of 20 experts were identified based on their familiarity with and level of expertise in the field of Islamic Economics, particularly the subject of “Maqasid of Shariah,” given that the proxies of the index being developed by the researcher are influenced by the corollaries of these Maqasid, or goals. Experts were contacted via email and provided with a cover letter that describes the purpose of the survey and the study it pertains to. Moreover, the respondents were notified that their anonymity was ensured to protect their privacy.

Of the 20 experts contacted, 10 experts were from the same country as the institution the researcher pertains to, and 10 were from countries all over the world. The survey received 5 responses with a total response rate of 25% (Table 1), and a response rate of 50% for same-country respondents. The fact that all responses were from the same country as the institution the researcher pertains to suggests that the researcher should target same-country experts and institutions to receive higher response rates. For nonresponses, the researcher sent out a one-time reminder via email to the experts, which did in fact lead to 2 of the 5 responses.

Once the expert responses are collected, the responses are then analyzed and transformed using the soft-max function into weights to be assigned to the various proxies. The expert weights to be applied to the index are presented in Table 1.
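The soft-max transformation mentioned above can be sketched as follows. This is a minimal illustration, not the study's exact procedure: it assumes the function is applied to the mean score of each proxy on the 1 through 7 scale, and the function name `softmax_weights` and the sample scores are hypothetical.

```python
import math

def softmax_weights(mean_scores):
    """Map mean expert scores to positive weights that sum to 1 via soft-max."""
    # Subtracting the maximum before exponentiating improves numerical
    # stability and does not change the resulting weights.
    peak = max(mean_scores)
    exps = [math.exp(score - peak) for score in mean_scores]
    total = sum(exps)
    return [value / total for value in exps]

# Hypothetical mean scores for three proxies (not taken from Table 1).
scores = [6.2, 5.0, 3.4]
weights = softmax_weights(scores)
print([round(w, 3) for w in weights])
```

Because soft-max is exponential in the scores, it widens the gap between highly and lowly rated proxies relative to a simple proportional normalization, which is worth keeping in mind when interpreting the resulting weights.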

The survey sent to the experts included a reminder at the beginning of the survey of the purpose of the research and included a field that required the expert to type in their name for the sake of tracking which respondent has completed the survey and who has not, i.e., for the purpose of following up with the experts to remind them to take the survey or to thank them for their participation. Following the “name” field, a multiple-choice question was presented to the experts which asks them to choose between 5 options related to the number of publications they have in Q1 or Q2 journals (Figure 3).

Following this question is a section that allows the experts to provide their responses on the importance of various proxies as measures of the development hindering behavior of individuals and institutions within the context of socio-economic development. The scale presented to the experts is based on a scale of relative importance from 1 through 7, which allows for the experts to properly make a distinction of their preferences when assigning the values to the various proxies.

For further validity, and based on the experts’ responses to the question of how many articles they have published in high impact journals, the expert responses are assigned *α*-values (Table 2), which further transform the results.

The advantage of assigning alpha values to the experts is that it attaches a weight, or value, to their responses. As such, an expert with more experience in the field, i.e., one who has more publications on the phenomenon being studied or more publications in high impact journals, is assigned a higher alpha value, meaning that their opinion is of greater importance relative to the other respondents. The disadvantage of such a method is that it could skew the survey results, as it adds further subjectivity to the weighting process. Another limitation of assigning alpha values is that it could undermine the opinion of younger researchers, who have fewer publications and yet whose opinions could be of equal or greater value than those of a more experienced respondent. Assigning alpha values has both benefits and drawbacks, and it is up to the researcher to decide whether they want to apply this method when conducting their survey. The finalized expert responses and weights are presented in Table 3.
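One way to picture the alpha adjustment is as an alpha-weighted average of the expert scores. The sketch below rests on that assumption; the text does not spell out the exact formula, and the function name, the alpha values, and the responses are all hypothetical rather than taken from Tables 2 and 3.

```python
def alpha_weighted_scores(responses, alphas):
    """Average each proxy's scores across experts, weighting experts by alpha.

    responses: one list of proxy scores per expert.
    alphas:    one alpha value per expert (assumed form of the adjustment).
    """
    if len(responses) != len(alphas):
        raise ValueError("need exactly one alpha value per expert")
    total_alpha = sum(alphas)
    n_proxies = len(responses[0])
    return [
        sum(alpha * row[j] for alpha, row in zip(alphas, responses)) / total_alpha
        for j in range(n_proxies)
    ]

# Hypothetical data: three experts scoring three proxies on the 1-7 scale.
responses = [[7, 5, 3], [6, 6, 2], [5, 4, 4]]
alphas = [1.0, 0.8, 0.6]  # hypothetical; higher means more publications
adjusted = alpha_weighted_scores(responses, alphas)
unadjusted = alpha_weighted_scores(responses, [1, 1, 1])
print(adjusted, unadjusted)
```

Setting all alpha values equal recovers the plain average, which makes it straightforward to compare the adjusted and unadjusted results side by side.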

Comparing the expert weights pre and post alpha value adjustment shows low variability between the two weighting methods (Table 4), and a high Pearson correlation of 0.977. The high similarity between the two methods indicates that the experts’ responses are highly consistent.
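For readers who wish to reproduce this kind of comparison, a minimal sketch of the Pearson correlation between two weight vectors is given below. The weight values are hypothetical placeholders rather than the figures in Table 4, so the output is not expected to match the reported score.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# Hypothetical weights before and after the alpha adjustment.
weights_before = [0.12, 0.10, 0.09, 0.11, 0.08]
weights_after = [0.13, 0.10, 0.08, 0.11, 0.08]
print(round(pearson(weights_before, weights_after), 3))
```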

To add emphasis to this position, the Cronbach alpha score, i.e., a measure of internal consistency, of the survey responses is at an acceptable value of 0.857 (Table 5). Both scores provide justification for applying the alpha values method when calculating the expert weights.
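The Cronbach alpha score can be computed with the standard formula *α* = [*k*/(*k* – 1)] × [1 – (∑ item variances)/(variance of total scores)], where *k* is the number of items. The sketch below assumes that each proxy is treated as an item and each expert as a respondent, which the text does not state explicitly; the response matrix is hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: one list of item scores per respondent.
    """
    k = len(rows[0])  # number of items (here assumed to be the proxies)
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: five experts scoring three proxies on the 1-7 scale.
survey = [[7, 6, 5], [6, 6, 4], [5, 4, 4], [7, 7, 6], [4, 4, 3]]
print(round(cronbach_alpha(survey), 3))
```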

###### 3.1.2. Survey Limitations: Nonresponse Bias and Small Sample Size

Before elaborating upon how to improve the expert weights by integrating them with FAHP, the major limitation of collecting expert opinions via surveys must be addressed, i.e., nonresponse bias due to low response rates, as well as how to tackle such limitations. Nonresponse bias occurs when subjects decline to participate in the survey or are simply uninterested, which leads to a smaller sample size that is not fully representative of the population. Nonresponse bias is a difficult issue to address, especially since the standard for survey responses has increased to a minimum threshold of 70%–80% response rate [34–36]. That said, some researchers posit that increasing response rates does not guarantee a reduction in nonresponse bias, as even at the higher response levels mentioned, nonresponse bias can still occur [34, 37]. Furthermore, despite low response rates being critical to the quality of the survey results, some researchers find that higher response rates offer minimal to no reduction in the levels of nonresponse bias [36]. Moreover, Cook et al. [38] find that the diversity or representativeness of the survey respondents is a more important measure than the survey response rate.

With that said, if the researcher is adamant about increasing the survey response rate to reduce nonresponse bias, there are several methods the researcher can apply. The first method is to develop a personalized and well-worded email and cover letter showcasing to the potential expert how valuable their participation in the survey and subsequent output is to the researcher, i.e., proof of value. The second method is to enclose a cover letter that defines the scope of the research being conducted and the purpose of the survey, including getting the support of the institution the researcher pertains to in the form of the dean’s signed approval, or that of any other senior faculty member who is well regarded by the academic community. This step is particularly important for novice researchers and PhD students, as getting the support of a senior faculty member can increase the response rate of the survey. The third method is to make sure that the survey is well-worded, clear, concise, not too long, and visually appealing yet simple, as failing to do so could lead to nonresponse which is “beyond statistical sampling error” [39]. The fourth method is to ensure the privacy and anonymity of the survey respondents; this can be achieved by clearly stating in the survey that the respondents’ names, responses, and any other personal information will remain private. The fifth method is to follow up with the survey respondents in a professional manner, which includes a soft reminder with an adequate time span between the first time the survey was communicated to the respondents and the potential email reminder. The sixth method is to develop a survey that can be easily accessed from different communication devices, e.g., phone and tablet, without sacrificing the user experience. This involves including a clickable and short URL which swiftly transfers the respondent to the survey. The seventh method, and this is contingent on what is considered ethically permissible, is to provide incentives for the respondents. This method is championed by Stanley et al. [40] who find that “larger incentives were associated with increased interview completion rates with minimal impact on data quality or bias.”

Besides the low response rate limitation, the small sample size of this study must be addressed. The justification for using a small sample size of 20 is based on the type of respondents who participated in the questionnaire. To elaborate, the respondents selected are experts in their field, and it is natural to assume that the sample size will be small. Moreover, a precedent exists in the literature, where several scholars have utilized both expert opinion and the FAHP method in the same study with a small sample size. For example, Anjomshoae et al. [41] sought the opinion of 15 experts of which 6 participated in the study (40% response rate); Beltrão and Carvalho [42] sought the opinion of 19 experts of which 7 participated in the study (36.8% response rate); and Majumdar et al. [43] sought the opinion of 110 experts of which 40 participated in the study (36.4% response rate). All the referenced studies, including this one, meet the 20% response threshold of Malhotra and Grover [44] for surveys to have a meaningful conclusion. In summary, high response rates are essential for the quality of the survey, but they are not the only important factor: the quality and the diversity of the respondents, and the design of the survey itself, including well-worded questions, are among the many factors that could affect the validity and reliability of the survey responses.
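The response rates quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the dictionary layout and the function name are illustrative choices.

```python
def response_rate(contacted, responded):
    """Response rate as a percentage of the experts contacted."""
    return 100.0 * responded / contacted

# (experts contacted, experts who responded), as reported in the text.
studies = {
    "This study": (20, 5),
    "Anjomshoae et al. [41]": (15, 6),
    "Beltrão and Carvalho [42]": (19, 7),
    "Majumdar et al. [43]": (110, 40),
}
THRESHOLD = 20.0  # the 20% threshold attributed to Malhotra and Grover [44]

rates = {name: round(response_rate(*counts), 1) for name, counts in studies.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}% (meets threshold: {rate >= THRESHOLD})")
```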

##### 3.2. The FAHP Methodology

Ahmed and Kilic [45] conduct a citation analysis of Google Scholar for the years 2000–2017 and find that the most popular FAHP methods are those of van Laarhoven and Pedrycz [11]; Buckley [12]; and Chang [13]. Radionovs and Uzhga-Rebrov [46] compare the three FAHP methods and find the methods of Buckley [12] and Chang [13] to be superior to those of van Laarhoven and Pedrycz [11] given that the latter has significant limitations. To elaborate, the triangular fuzzy numbers calculated using the van Laarhoven and Pedrycz [11] method are only approximations leading to higher uncertainties relative to other methods [46]. Moreover, a solution does not always exist for linear equations used for the calculation of the fuzzy weights [47]. Furthermore, the van Laarhoven and Pedrycz [11] method involves highly complex calculations even for the simplest of tasks [46]. Also, the method only allows for triangular fuzzy numbers [47]. The aforementioned limitations of the van Laarhoven and Pedrycz [11] method are why Buckley [12] and Chang’s [13] methods are more popular and superior methods for calculating fuzzy weights.

This research adopts Buckley’s [12] geomean method for calculating the fuzzy weights since it is easy to compute and guarantees a unique solution [47]. The main disadvantage of Buckley’s [12] method is that it requires defuzzification, i.e., transforming a fuzzy set into a single number. Chang’s [13] method, although highly popular, has the major limitation that it can lead to incorrect decisions or outputs, since it may assign zero weights to the items or variables involved in the fuzzy process, and it only allows for triangular fuzzy numbers. Its advantage over Buckley’s [12] method is that it is easier to compute and is quite similar to Saaty’s [7] AHP, meaning that, unlike Buckley’s [12] method, it facilitates calculating the consistency of the FAHP outputs.

###### 3.2.1. Integrating Expert Weights with the FAHP Method

The first step in calculating the fuzzy weights involves defining the problem and determining the goal of the FAHP method. In the case of this research, the FAHP method is utilized to transform survey responses of experts into fuzzy weights to be used for composite indices.

The second step involves developing the hierarchy structure. This involves a visual illustration of the various criteria and variables involved in the fuzzy process, their perceived importance, and possible alternatives, if any. An illustration of an FAHP structure is presented in Figure 4.

This research does not develop a hierarchical structure with alternatives like Buckley [12] but opts instead to categorize the proxy weights (Table 4) into percentiles so that they can be transformed into triangular fuzzy numbers (Table 6).

The percentile method is best suited when the index has a large number of proxies, and it allows the researcher to set a threshold between each category in the triangular fuzzy scale (Table 7).

Integrating the fuzzy values by percentile (Table 6) and the fuzzy scale of relative importance (Table 7) leads to a triangular fuzzy scale (Table 8), which facilitates the development of the pairwise comparison matrix.

Elaborating upon Table 8, the first column exhibits the various proxies that are to be used in the index being developed. The second column exhibits the weights computed based on the expert responses. The third column exhibits the rank of the corresponding proxy among the 10 proxies included in the index based on its weight. The fourth column transforms the rank into percentiles, and it is calculated as follows: [(*n* – *r*_{i})/*n*] × 100, where *n* is equal to 10, i.e., the number of proxies, and *r*_{i} represents the weight rank of proxy *i* relative to other proxy weights. The fifth column exhibits the value of the intermediate scale, which is a scale based on fuzzy numbers (1, 3, 5, 7, 9). The sixth column transforms the intermediate scale into a triangular fuzzy scale. The seventh column exhibits the distance the proxy is from the ideal scale, i.e., where the proxies are “absolutely important” (see Table 9). The STI scale, or steps to ideal scale, represented in column 7, helps the researcher when assigning the triangular fuzzy values in the pairwise comparison matrix by setting triangular values which make a distinction based on their distance to one another. Table 9 provides an example of how to assign triangular fuzzy values based on the STI scale.
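The rank-to-percentile step of Table 8 can be sketched as follows. The percentile formula is the one given in the text; the ranking helper assumes that rank 1 is assigned to the largest weight and does not handle ties, and the sample weights are hypothetical.

```python
def ranks_from_weights(weights):
    """Rank proxies by weight, with rank 1 for the largest weight (assumed)."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    ranks = [0] * len(weights)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def rank_percentile(rank, n):
    """Percentile of a proxy from its rank: [(n - r_i) / n] x 100."""
    return 100 * (n - rank) / n

# Hypothetical expert weights for four proxies (not taken from Table 8).
weights = [0.20, 0.35, 0.15, 0.30]
ranks = ranks_from_weights(weights)
percentiles = [rank_percentile(r, len(weights)) for r in ranks]
print(ranks, percentiles)
```

With *n* = 10, a proxy ranked third falls in the 70th percentile and a proxy ranked fourth falls in the 60th percentile.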

The third step of the FAHP method is to develop the fuzzy pairwise comparison matrix (FPCM), i.e., the fuzzification process. The previous methods discussed build up to this step as they facilitate the process of transforming the expert weights into fuzzy values. The FPCM is presented in Table 10.

To elaborate upon the FPCM, let us attempt to compute the triangular fuzzy values for the proxy “restricting economic freedoms” (EF) in relation to the proxy “rule of law” (RL). The proxy EF is 1 STI from RL, i.e., EF pertains to the 70th percentile with an F-Scale of (6,7,8) and RL pertains to the 60th percentile with an F-Scale of (4,5,6). This means that EF is of higher importance relative to RL in a 1v1 comparison. As such, the triangular fuzzy value of EF relative to RL is (2,3,4) as exhibited in column 7 row 1. On the other hand, RL relative to EF is assigned a triangular fuzzy value of (1/2, 1/3, 1/4) which is the inverse of the F-value of EF relative to RL, as exhibited in column 1 row 7.
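The inverse entry of the FPCM can be sketched as a component-wise reciprocal of the triangular fuzzy value. The first helper below keeps the components in the order used in the worked example; the second sorts them into ascending order (1/*u*, 1/*m*, 1/*l*), the form in which many FAHP formulations write a reciprocal. Both function names are hypothetical, and which ordering feeds the later computations is an assumption to verify against Table 10.

```python
from fractions import Fraction

def reciprocal_as_in_text(tfn):
    """Component-wise reciprocal, in the order used in the worked example."""
    return tuple(Fraction(1, value) for value in tfn)

def reciprocal_ascending(tfn):
    """The same reciprocals, sorted so that lower <= modal <= upper."""
    return tuple(sorted(reciprocal_as_in_text(tfn)))

ef_vs_rl = (2, 3, 4)  # EF relative to RL, from the worked example
rl_vs_ef = reciprocal_as_in_text(ef_vs_rl)
print(rl_vs_ef)                        # components 1/2, 1/3, 1/4
print(reciprocal_ascending(ef_vs_rl))  # components 1/4, 1/3, 1/2
```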

The fourth step is to apply Buckley’s [12] geomean method to calculate the fuzzy weights. This involves multiplying together the first values of the triangular fuzzy sets in a column and raising the product to the power 1/*n*, where *n* is the number of variables and *a*_{i1}, …, *a*_{in} are the first values of the triangular fuzzy sets for proxy *i*; i.e., *r*_{i} = [*a*_{i1} × *a*_{i2} × … × *a*_{in}]^{1/n}, which when applied to column 1 gives us: *r*_{1} = (1 × 6 × 4 × 2 × 1/2 × 1/2 × 2 × 9 × 4 × 9)^{1/10} = 2.45. When this computation is applied in the same way to columns 2 and 3, and the geomean is used to aggregate the values of these columns, the results are as follows (Table 11).
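The worked value of 2.45 can be checked with a short Python sketch. The function name `geometric_mean` is hypothetical; the ten input values are those quoted in the text for column 1.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


# First values of the triangular fuzzy sets in column 1, as quoted in the text.
first_values = [1, 6, 4, 2, 1 / 2, 1 / 2, 2, 9, 4, 9]

r1 = geometric_mean(first_values)
print(round(r1, 2))  # 2.45
```

The product of the ten values is 7776 = 6^5, so the tenth root equals the square root of 6, approximately 2.449.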

The fifth step is to calculate the fuzzy weights. To do so, the researcher calculates the sum of each column *r*_{i} and then multiplies the inverse of this sum by the corresponding value in column *r*_{i}; i.e., (∑ *r*_{1})^{−1} × *r*_{1i}. The computations for the proxy EF are presented in Table 12.
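A minimal sketch of this column-wise step is given below. The function name `fuzzy_weights_for_column` is hypothetical, and the input values are purely illustrative; they are not taken from Table 11.

```python
def fuzzy_weights_for_column(r_column):
    """Multiply each value in a column by the inverse of the column sum."""
    inverse_sum = 1 / sum(r_column)
    return [inverse_sum * r for r in r_column]


# Illustrative geometric means for one column (hypothetical, not from Table 11).
r_column = [2.0, 1.0, 0.5, 0.5]

weights = fuzzy_weights_for_column(r_column)
print(weights)  # [0.5, 0.25, 0.125, 0.125]
```

Because every value is divided by the same column sum, the weights within each column add up to 1.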

Once the researcher repeats this step for each individual proxy, they are left with three columns that represent the fuzzy weights of the triangular fuzzy sets of the index proxies. This is presented in Table 13.

The sixth step in the FAHP method is to defuzzify, or normalize, the weights. This involves calculating the average of the three fuzzy weights for each proxy; e.g., for EF, the defuzzified weight is (0.14 + 0.16 + 0.18)/3 = 0.16. The defuzzified weights are exhibited in column 5 of Table 13, and they are the final weights to be applied to the composite index after being subjected to the FAHP method.
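The defuzzification of the EF weights can be reproduced with the sketch below. The function name `defuzzify` is hypothetical; the three input values are the fuzzy weights of EF quoted in the text.

```python
def defuzzify(fuzzy_weight):
    """Average the components of a triangular fuzzy weight."""
    return sum(fuzzy_weight) / len(fuzzy_weight)


ef_fuzzy_weight = (0.14, 0.16, 0.18)  # fuzzy weights of EF, from the text

ef_weight = defuzzify(ef_fuzzy_weight)
print(round(ef_weight, 2))  # 0.16
```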

Comparing the weights computed using the FAHP method and the weights computed from the expert responses, with and without alpha adjustment, a large variability between the different weighting methods becomes clear (Table 14).

This is because, similar to PCA, the FAHP method helps the decision-maker by highlighting the most prominent items or variables being studied and assigning them higher values through a 1v1 comparison, unlike the expert opinion method, which is based purely on a scale of relative importance. This 1v1 comparison has both advantages and limitations: although it elevates the weights of the most important variables, it can exaggerate their value and lessen the impact of lower-weighted variables. Consequently, it can greatly skew the results of the index through highly unbalanced proxy weights.

##### 3.3. The Interval Weights

To address this limitation, the paper introduces interval weights, which are the midpoint between the fuzzy weights and the alpha-adjusted expert weights (Figure 5). The interval weights address the limitation of highly exaggerated or skewed weights by normalizing them and bringing them closer to the expert values, but not to the extent that the FAHP method and subsequent fuzzy weights become obsolete.
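The midpoint calculation can be sketched as follows. The function name `interval_weights`, the proxy labels, and all numeric values are hypothetical and serve only to illustrate the arithmetic; they are not the weights reported in the paper.

```python
def interval_weights(expert, fuzzy):
    """Midpoint between expert and fuzzy weights, proxy by proxy."""
    if expert.keys() != fuzzy.keys():
        raise ValueError("both weight sets must cover the same proxies")
    return {proxy: (expert[proxy] + fuzzy[proxy]) / 2 for proxy in expert}


# Hypothetical alpha-adjusted expert weights and fuzzy weights.
expert = {"A": 0.40, "B": 0.35, "C": 0.25}
fuzzy = {"A": 0.60, "B": 0.30, "C": 0.10}

midpoints = interval_weights(expert, fuzzy)
print({proxy: round(w, 3) for proxy, w in midpoints.items()})
# {'A': 0.5, 'B': 0.325, 'C': 0.175}
```

If both input weight sets sum to 1, their midpoints also sum to 1, so no further rescaling is needed in that case.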

The finalized weights to be applied to the composite index, after (1) transforming the survey responses of experts into weights, (2) subjecting these weights to alpha adjustments based on the experts' level of expertise, (3) integrating these weights with the fuzzy analytic hierarchy process method, and (4) normalizing the fuzzy weights by calculating the interval weights so that they do not skew the composite index results, are presented in Table 15.

#### 4. Discussion

##### 4.1. Fuzzy Values and Percentile Ranks

The primary benefit of the percentile method introduced in this paper is that it allows the researcher to apply the FAHP method to composite indices with a large number of proxies when attempting to generate index weights. This method is advantageous over the methods covered in the literature review as it does not diminish the importance of variables by grouping them into indicators and subindicators.

##### 4.2. Expert Weights vs. Fuzzy Weights

If one opts for the expert weighting system in the aggregation process of composite indices, and the expert opinions are collected using a scale of relative importance, the scale in question does not capture the bigger picture, even after the responses are transformed into weights. To elaborate, some of the proxies are significantly more important than others based on theory and supporting literature, but the responses based on the scale of relative importance might not reflect that, since the experts assign the values to the proxies independently, without taking into consideration the significance of each proxy for the phenomenon being studied. This is where it is advisable to apply the FAHP method to the expert weights. The FAHP method facilitates a 1v1 comparison of the proxies to identify which of the two compared proxies is of higher importance. After subjecting the proxies to this 1v1 comparison, the difference between the proxies in terms of relative importance becomes quite clear. This 1v1 comparison is reflected in the pairwise comparison matrix and facilitates the calculation of more accurate weights, i.e., the fuzzy weights. That said, the fuzzy weights can skew the results of the index developed, and it is up to the researcher to evaluate whether it is appropriate to apply the interval weights method to normalize the weights, or simply to take the fuzzy or expert weights as they are.

##### 4.3. Interval Weights

Comparing the weights computed from the expert responses with the weights computed through the FAHP method, it becomes clear that integrating the FAHP method with expert opinion and allowing for a 1v1 comparison of the proxies leads to higher weights being assigned to the proxies that better represent the development-hindering behavior of individuals and institutions. These fuzzy weights address the subjectivity and inconsistency of the expert opinion but are not without limitations themselves, as they can skew the results of the index. The interval weights address these limitations by serving as the midpoint between the expert weights and the fuzzy weights. The interval weights are the most reliable weights relative to the other options, as they address the subjectivity, uncertainty, and inconsistency of the expert weights, and the lopsidedness of the fuzzy weights.

#### 5. Conclusion and Future Research

This paper provides a guide to researchers on how to integrate expert opinions with the fuzzy analytic hierarchy process (FAHP) for the purpose of generating weights for composite indices. The expert weighting system is the most reliable system when assigning weights to the proxies of composite indices since it transforms the opinions of experts on the topic or phenomenon being studied into weights. This method is not without its limitations, as expert opinions are troubled by high subjectivity, uncertainty, and inconsistency. Moreover, expert opinions gathered through survey responses are subject to certain conditions, such as a high survey response rate, which is difficult to achieve and whose absence could lead to nonresponse bias. Furthermore, the index weights generated via the expert weighting system do not eliminate subjectivity in the weighting process, nor do they facilitate a 1v1 comparison of the proxies, a process that provides a more accurate picture of which variables better reflect the phenomenon being studied. To address this limitation, the expert weights are subjected to the FAHP method to produce the fuzzy weights, which are more valid and reliable in comparison. The advantage of the FAHP method over the AHP method is that it is better at handling uncertainty. Moreover, the FAHP method, similar to principal component analysis (PCA), shows the researcher the most important variables or proxies for the phenomenon they are attempting to measure, the difference being that, unlike FAHP, PCA cannot be utilized when constructing indices that are noncompensatory in nature, i.e., where insufficiencies in one proxy cannot be compensated by a better performing proxy.

With that said, the FAHP method is not without its limitations, which pertain to computational difficulty, unnecessary complexity and fuzziness, and issues of consistency. Another limitation of the FAHP method is that it can skew the index results by producing weights that have a high variability relative to the expert weights. It must be noted that it is up to the researcher to determine whether such variability is justified based on theory and literature. Where the variability is exaggerated, researchers can normalize the fuzzy weights by subjecting them to the interval approach method, which generates index weights by finding the midpoint between the fuzzy weights and the expert weights. Such a method reduces the variability between the two weighting outcomes and produces weights which exhibit the benefits of both FAHP and PCA.

Regarding future research, more work is needed on how to select the best weighting method when constructing composite indices, since the results of the index are highly sensitive to the weights. Furthermore, effort should be directed not toward producing more approaches to fuzzy AHP, but rather toward identifying which methods produce more reliable results, as well as how to reduce the complexity of such methods and how to improve the consistency of the FAHP output.

#### Data Availability

The data used in the manuscript comprise primary survey data. The data are presented in the article.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Acknowledgments

The article processing charges are paid by the National University of Malaysia (UKM) in Bangi, Selangor.