Abstract

The most widespread approach to transport appraisal is to combine cost-benefit analysis (CBA) with environmental assessments and public consultations. However, large-scale transport projects such as the HS2 high-speed rail system in the UK seem to have pushed this approach beyond its limits, leading to broad discontent with the appraisal process. There is a need both to develop new methods capable of integrating a wide range of perspectives in a systematic manner and to test these for large-scale projects. Multicriteria analysis (MCA) has proven useful in supporting transport decision-making by including a broader set of criteria in the appraisal process. Multiactor multicriteria analysis (MAMCA) has extended this approach to include multiple actors and stakeholders in the judgment and decision-making process. This paper builds on the MAMCA method and demonstrates its practicability and usability by applying it to the case of HS2. The purpose of this paper is not to reach a definitive conclusion on the desirability of various project options, but to complement existing transport appraisal methods by making different perspectives explicit. For example, the results for this case show contrasting views for different groups of transport professionals: a favorable assessment of HS2 among transport planners employed in government, but an unfavorable assessment among transport researchers with a background in sustainability. In terms of contribution to the development of data collection methods, this research demonstrates the usefulness of conducting semistructured interviews in conjunction with an online questionnaire for the assessment and weighting process within MCA. Because MCA results are expressed in terms of relative desirability of projects, the approach also effectively systematizes the inclusion and assessment of multiple options. Overall, the proposed method enhances the capacity to analyze conflicting views in large-scale transport project appraisal processes.

1. Introduction

Decisions to invest in large-scale infrastructure projects such as high-speed rail (HSR) typically extend beyond traditional economic evaluation using cost-benefit analysis (CBA) to include factors such as wider economic effects, regional development, long-term environmental impacts, strategic growth of certain industries, and even issues of national pride. In some cases, the strategic goals may even take precedence over the results of conventional economic analysis [1]. However, such factors are not generally compared or weighed in a systematic manner [2, 3]. Due to the size and complexity of HSR projects, expertise regarding project impacts can become contested, as various experts from different clusters of specialist knowledge disagree not only on outcomes but also on assessment methods [4].

CBA itself has been the subject of decades-long criticism, and although various attempts have been made, both in theory and in practice, to broaden the criteria considered, as well as the stakeholders involved, current transport appraisal methods still consist of little more than CBA supplemented by environmental assessments and public consultations. While smaller-scale projects may arguably be well served by CBA, larger transport projects tend to push this framework beyond its limits, leading to broad discontent with the process [4]. Current transport appraisal methods have been substantially discredited as a result. There is a need both to develop methods capable of integrating a wide range of perspectives in a systematic manner and to test these for large-scale projects.

Multicriteria analysis (MCA) has proven useful in supporting transport decision-making by including broader sets of criteria in the appraisal process [5]. The purpose of this paper is to advance the practicability and usability of MCA for large-scale transport project appraisal. Specifically, this paper builds on the multiactor multicriteria analysis (MAMCA) method proposed by Macharis et al. [6, 7] and applies it to the case of HS2 Phase I in order to demonstrate its usefulness in comparing projects across multiple criteria and from multiple perspectives. HS2 is a proposed high-speed railway network connecting major cities in Britain. Phase I will connect London and Birmingham in the West Midlands (221 km), and Phase II will extend the network to Manchester, Sheffield, and Leeds (for a total of about 530 km of high-speed rail lines). Further details about this case will be provided in Sections 3.3.1 and 3.3.2. For additional background on HS2 in the context of UK transport planning, see [8].

The MAMCA approach of incorporating multiple actors as well as multiple criteria is applied to the case of HS2 Phase I by conducting a series of structured interviews with key actors using a specially designed MCA questionnaire. Conceived as a decision-making tool for transport appraisal, a full MAMCA process involves in-depth consideration of project objectives and options upfront and a final project recommendation at the end. Here, this process is abbreviated by taking certain project objectives and options as given and stopping short of recommending a specific project in the final step. One advantage of this abbreviated process is that it does not require established transport appraisal procedures to be completely replaced. Indeed, it can complement existing methods—in which environmental assessments and public comments end up as voluminous appendices (see also [9])—by presenting multiple perspectives side-by-side, thus increasing the visibility of alternate viewpoints.

This paper is structured as follows. Section 2 provides background on transport appraisal methods and situates the MAMCA approach within the appraisal literature. Section 3 describes the research methods, and Section 4 presents the results of applying the MAMCA process to the HS2 Phase I case. Section 5 discusses the implications of these results, and Section 6 concludes the paper with recommendations for future research.

2. Transport Appraisal Methods

CBA is a widely applied approach for quantifying various types of project impacts based on national or supranational guidance (see, for example, [10]). Concerned with efficient allocation of economic resources, CBA aims to aggregate impacts across space and time by translating all impacts into discounted monetary terms. This common unit brings obvious advantages of comparability, both across a range of impacts and among project options [11].

As applied in practice, however, CBA assessments have long been criticized for their failure to adequately address the consequences of transport development and for being too narrow in terms of criteria considered [3, 5, 12–16]. CBA is said to favor the pursuit of easily measurable economic objectives at the expense of more complex and long-term social and environmental goals [17]. Finally, CBA methods pose particular challenges for large-scale transport projects: as size increases, so does uncertainty, and therefore the cost of trying to establish certainty too early in the appraisal process [18].

By contrast, MCA compares projects across multiple criteria, thereby making it possible to assess impacts that are impossible or impractical to monetize. At its core, MCA consists of three fundamental steps: (1) assessing project performance against the criteria; (2) weighting the criteria; and (3) combining the assessments and weights to derive an overall value for each project. The MCA literature proposes a variety of techniques for accomplishing these steps and encompasses a wide range of methods for identifying project options, objectives, criteria, and stakeholders [25–28].
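As a minimal sketch of the three steps, the following Python fragment combines performance scores and criteria weights using a simple weighted sum. The additive rule, the project and criterion names, and all numbers are assumptions made for illustration only; they are not taken from the HS2 appraisal, and other MCA techniques combine assessments and weights differently.

```python
# Hypothetical illustration: projects, criteria, scores and weights are invented.

def overall_values(performance, weights):
    """Combine per-criterion scores and weights into one value per project."""
    total_weight = sum(weights.values())
    values = {}
    for project, scores in performance.items():
        # Weighted sum with weights rescaled to sum to 1 (an assumed rule).
        values[project] = sum(
            weights[criterion] / total_weight * scores[criterion]
            for criterion in weights
        )
    return values


# Step 1: performance of each project on each criterion (0 = worst, 1 = best).
performance = {
    "Project A": {"capacity": 0.9, "cost": 0.2, "biodiversity": 0.1},
    "Project B": {"capacity": 0.5, "cost": 0.7, "biodiversity": 0.8},
}

# Step 2: relative importance of each criterion.
weights = {"capacity": 2.0, "cost": 1.0, "biodiversity": 1.0}

# Step 3: combine assessments and weights into an overall value per project.
values = overall_values(performance, weights)
print(values)
```

Changing only the weights in step 2 can reverse the ranking of the two projects, which is why the weighting step matters so much when stakeholders disagree.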

The UK has been at the forefront of MCA developments for transport project assessment [14], and the official UK Transport Analysis Guidance (WebTAG) combines CBA and MCA approaches in a wider decision-support framework [29, 30]. A key outcome of the impact assessment of transport appraisal is the completion of an Appraisal Summary Table, which summarizes all economic, environmental, and social impacts, qualitatively and quantitatively [31]. For HS2 Phase I, the environmental impact assessment (EIA) process, by itself, is said to have generated almost 50,000 pages of material [19, 32]. This input then feeds into the MCA of the strategic, economic, financial, delivery, and commercial case (see [31], Figure 3, p. 6).

Another challenge for project appraisal is to incorporate the concerns of various stakeholders. Although existing European legislation prescribes various mechanisms for public participation, for example, via directives on EIA and strategic environmental assessment, general experience suggests that different types of stakeholders are not effectively integrated in practice. For example, EIAs are often found to be subject to “unstructured stakeholder involvement and inefficient public participation” [33].

In transport appraisal, the MAMCA method has been proposed as a method to formalize the inclusion of various competing stakeholder interests [6, 7, 34]. Based on the strategic stakeholder management literature, a stakeholder is defined as any individual or group (organized or not) who is able to affect or is affected by (or both) the ultimate outcome of a particular issue [6, 35]. This paper follows and further develops the MAMCA approach by focusing on a subset of stakeholders and proposing specific appraisal steps for soliciting a broad range of perspectives within that subset. The stakeholder groups of interest in this paper include transport planners and professionals broadly defined to incorporate expertise in the diverse fields of transportation, energy, economics, and environmental issues as they relate to HS2. These actors thus represent a multitude of relevant perspectives, even as they cannot be seen to represent all stakeholders. We then take these actors through an actual appraisal process in order to demonstrate the feasibility of soliciting, aggregating, and presenting multiple perspectives.

3. Methods

3.1. Data Collection

Data collection aimed for a balance between the two extremes of gathering all respondents in a single workshop (maximum interaction) and conducting an online survey (no interaction). Specifically, we conducted in-person interviews, combining semistructured discussion with completion of a structured electronic questionnaire. This had the advantage of enabling the interviewer to provide clarification of the steps, criteria, scales, and other complexities (similar to a workshop setting), thereby enhancing data quality. In contrast to a workshop setting, however, there was no interaction among respondents. This may have disadvantages if the assessment goals are exploratory (e.g., defining objectives), but may have advantages if the goal is to ensure representation of a variety of perspectives (e.g., avoiding group-think; providing confidentiality which encourages respondents to share views more fully with interviewer). The semistructured interview format, combined with the structured online tool, provides a rich source of qualitative data that serve to improve the process and reach a fuller understanding of the case.

The target population for interviews consisted of transport planners and experts, both practitioners and researchers, employed in all sectors (public, private, non-profit, and academic). In order to test the approach, we were primarily interested in transport professionals in the UK, or in some cases from other parts of Europe if they were involved with HS2.

To identify potential respondents, we relied on three sources: (1) a long list of attendees from private and governmental institutions present at a large seminar on appraisal methods at University College London held in 2014; (2) the official parliamentary reports listing all petitioners with their evidence; and (3) our own network of transport planners and academics.

In all, we interviewed ca. 40 transport professionals, 33 of whom filled in the questionnaire. Even if this does not represent a perfect sample of all relevant transport professionals, it comprises a substantial number of professionals who represent a broad diversity of expertise. Moreover, this research incorporates their views in a transparent and systematic manner. Although the official appraisal of HS2 Phase I solicited the input of many professionals and experts, the process by which this input was incorporated was quite idiosyncratic and opaque.

3.2. Overview of Appraisal Steps and Survey Process

The twelve steps of the appraisal process we defined are comparable both to von Winterfeldt and Edwards’s [36] eight-step MCA process and to Macharis et al.’s [6] seven-step MAMCA process. The individual appraisal steps (see Table 1) are best described by grouping them in terms of three stages of the survey process:
(i) appraisal steps conducted as part of questionnaire design (defining objectives, project options, and criteria; developing questions to identify stakeholder groups);
(ii) appraisal steps conducted through response elicitation (selecting and weighting criteria; assessing project performance);
(iii) appraisal steps conducted during data analysis (identifying stakeholder groups; calculating project preferences for each stakeholder group).

3.3. Questionnaire Design

Several of the initial appraisal steps were conducted and defined during the process of designing the questionnaire.

3.3.1. Project Objectives

Objectives should be defined before projects are assessed. A statement of objectives clarifies what the decision is trying to achieve; it also frames the problem at hand, thereby limiting the options that may be considered. The defining of objectives therefore has considerable influence on subsequent appraisal steps. In a full MAMCA process, defining the problem and brainstorming alternatives (options generation) are an important part of the reflexive process.

In the real world of transport planning, however, objectives are typically set by governments, and not always in transparent ways. In the case of HS2, the objectives laid out by the UK government are as follows (see [19], section 3.1):
(i) provide sufficient capacity to meet long-term demand and to improve resilience and reliability across the network;
(ii) improve connectivity by delivering better journey times and making travel easier.

These objectives were reproduced “as is” in the first section of the questionnaire. Limiting the scope in this way required respondents to accept the validity of the stated objectives and arguably raises concerns about addressing wider sustainability issues (see, e.g., [37]). However, conducting a full MAMCA process was beyond the scope of this research.

3.3.2. Project Options

During the early stages of the HS2 Phase I appraisal process, a number of alternatives were proposed and assessed. In the questionnaire, we selected two rail proposals for further analysis and comparison, in addition to the officially adopted HS2 project. One is an alternative high-speed rail alignment following an existing transport corridor (the M1 motorway alignment, see HS2 Ltd, 2012). The other is an extended upgrade to the existing West Coast Main Line (WCML). This upgrade would tackle “bottlenecks” and provide additional capacity mainly through a program of train lengthening, increased frequency, modernization of junction designs, and the provision of additional tracks in some locations [21, 22]. Having decided to adopt the official HS2 goals for our own appraisal process, we selected these particular proposals because they, too, accept the objectives of HS2 as given and seek to meet those same objectives through alternative projects. Furthermore, both are rail projects, which aids the comparison with HS2. Finally, both alternative proposals were sufficiently well developed for information to be available on the potential impacts of each.

The questionnaire displayed a summary table with key features of each project (see Table 2), as well as a map showing the three alignments (see Figure 1). More detailed descriptions of the three project options were available by clicking a button in the online survey (see Table 3), or directly from the interviewer. Attention was given to writing the descriptions as impartially as possible to avoid inferring potential positive or negative impacts.

The questionnaire gave respondents the possibility of adding a fourth project option of their choice. A number of respondents chose to do so, in which case the questionnaire incorporated this additional option into subsequent performance assessment questions. Several respondents added a “Do minimum” option, some because they saw a need to establish a neutral baseline, others because they contested the stated goals of increased capacity and speed (arguing, for example, that accessibility, affordability, and quality of journey experience on the rest of the network were more important in the UK context). A few others contested the geographical scope and chose to add investment in urban mobility (centered around improving public transport and cycling facilities) as a more realistic and cost-effective alternative for improving mobility and for reducing carbon emissions. On the one hand, allowing the inclusion of such options raises comparability challenges, since these are not assessed by all respondents; on the other hand, doing so provides an opportunity to record the feedback and proceed with the survey.

3.3.3. Assessment Criteria

The criteria weighting process described below (Sections 3.4.1 and 3.4.2) is based on a list of assessment criteria developed by Barradale and Cornet [23] in a prior, preparatory stage of this research. Using the criteria listed in the Transport Analysis Guidance (WebTAG) by the UK government [29] and the impacts assessed in the HS2 appraisal documents [32, 38] as a starting point, Barradale and Cornet [23] consulted a wide range of experts, adopting an iterative, mixed deductive/inductive approach to produce a comprehensive and coherent list of criteria for comparing HS2 and its alternatives.

In addition to direct project impacts (those costs and benefits typically considered in transport appraisal, including the official appraisal of HS2 Phase I), this list includes broader impacts on society and the environment. The final list of 28 assessment criteria is presented in Table 4, with complete descriptions in Table 5. The purpose of this comprehensive list is to have a reference point for the participants to consider in their criteria selection. They might not choose to evaluate all criteria, but at least they are given the possibility of considering a wide range of issues, thus addressing the problem of omission bias.

3.3.4. Questions to Identify Stakeholder Groups

Stakeholder groups may be selected and defined in various ways, depending in part on the goal of analysis. As mentioned above, the overall subset of stakeholders targeted in this paper was transport planners and experts, where expertise is broadly defined to include engagement with a wide range of issues relating to HS2. This larger group was then subdivided into smaller groups to highlight the diversity of perspectives.

A key objective in this particular application of the MAMCA approach to the case of HS2 Phase I was the designation of a stakeholder group to represent a “sustainability viewpoint” (see [39] for details on the concept of sustainability viewpoint and various ways to define it). Here, this stakeholder group consists of transport professionals with “sustainability expertise” (defined in Section 4.3).

Regardless of focus, stakeholder identification must be based on clearly defined criteria that are independent of appraisal process and outcome. For the purpose of identifying stakeholder groups in this paper, respondents were asked questions about their professional background and experience:
(i) educational background, including transport and environmental studies,
(ii) sector of employment,
(iii) type of involvement with HS2/transport infrastructure,
(iv) areas of focus/analysis within transport planning and appraisal (e.g., social and environmental impacts).

3.4. Response Elicitation

The next three appraisal steps were conducted through the elicitation of responses in the online questionnaire.

3.4.1. Criteria Selection

The questionnaire displayed the list of 28 criteria shown in Table 4, with more detailed definitions available to respondents by clicking a button in the online survey (see Table 5) or asking the interviewer for clarification. Respondents were given the opportunity to add (up to 3) additional criteria, in case they felt some were missing. Respondents were also given the possibility of adding open-ended comments. This qualitative data was used in earlier stages of the research to refine the criteria list (see [23]).

Respondents were then asked to select criteria from the full list of 28 (plus any the respondent had added). The software was set up to require at least 3 but could accommodate any number up to 28 (plus any added criteria). From an analysis perspective, more is better; from a user perspective, fewer is better, as each additional criterion lengthens the assessment process. In balancing the benefits of more versus fewer criteria, we suggested that respondents select “at minimum 6.” In practice, respondents rarely picked more than 9.

Some respondents raised a question about the appropriate basis for criteria selection and whether it should be contextual relevance (i.e., which criteria are most relevant in a particular case for comparing specific projects?) or normative preference (i.e., which impacts should we care about most?). Some respondents considered this to be a critical distinction: for example, one respondent rated biodiversity and carbon footprint very highly in principle, yet did not deem it necessary to select them in this case, because he considered the marginal differences among the three projects to be too small to matter.

We chose to circumvent the issue through careful phrasing of the question to avoid mentioning either “relevance” or “importance”: the questionnaire asked respondents which criteria they thought “should be used for assessing the pros and cons of HS2 Phase I and its alternatives [emphasis added].” In other words, respondents selected criteria on whatever basis they deemed appropriate. Most respondents were content with this lack of specificity, perhaps intuitively conflating relevance and importance in their selection.

3.4.2. Criteria Weighting

In the next step, respondents were asked to weight the criteria by rating “the relative importance of each criterion” they had selected. Respondents were found to be comfortable with using sliders, which provided an easy-to-understand and consolidated visual representation of their preferences (see Figure 2).

3.4.3. Performance Assessment

Methods for assessing project performance should address issues of accuracy and objectivity. Accuracy involves both the assessor (whose judgment is solicited) and the assessment process (how the judgment is elicited). Determining who should conduct the assessment involves a value judgment about who is sufficiently qualified to be able to make an accurate assessment of project performance.

On this issue we made a key methodological choice to ask the same people who had selected and weighted criteria in the previous step to also conduct the assessments. Procedurally this involved each respondent assessing project performance for each criterion he/she had selected. This approach is unusual: more commonly, the weighting of criteria is decoupled from the assessment of project performance, with the latter performed by a single expert, or small group of them, using available knowledge and forecasts [34]. However, reliance on this conventional notion of expertise involving few individuals may introduce overconfidence bias with respect to those criteria that transport planners are accustomed to assessing, such as capacity and traffic impacts, at the expense of wider economic, social, and environmental impacts, with which they may be less familiar.

Since our interviews specifically targeted transport professionals who were familiar with HS2, we felt it was reasonable to assume a generally sufficient level of competence among respondents. We also assumed that those respondents who selected particular criteria were also best qualified to make an expert judgment about them. Even should this not always be the case, the phrasing of our question about project performance explicitly asked respondents to “evaluate to the best of your ability, however feel free to skip directly to the next question if you have no opinion.” One improvement could be to ask respondents to rate their level of confidence in their own assessments, but we perceived it to be more efficient to ask respondents to explain their ratings generally. This qualitative approach helped respondents clarify their answers and improve the quality of input and also provided us as researchers with a better understanding of the numerous complexities hidden behind the assessment of a single criterion.

The assessment process also influences the accuracy of assessment data. In particular, does the process accurately capture people’s judgments (whatever those judgments are)? We selected the Analytic Hierarchy Process (AHP), one of the most common MCA elicitation techniques used in the transport field [34], for conducting assessments. AHP’s pairwise-comparison approach is easily understood by respondents, and the inclusion of redundancy provides a consistency check. Concerns about scientific robustness (e.g., known rank reversal issues) can be addressed by using multiplicative AHP [40]. This variant is also easier to implement, as it does not require any specialized software, only a standard spreadsheet. A multiplicative structure is introduced to fit the ratio judgments made during the comparisons, the scale is adjusted to fit the multiplicative structure (-8, -6, …, 0, …, 6, 8), and the aggregation of scores is based on simple geometric means [24]. AHP requires that criteria be nonoverlapping, mutually exclusive, and limited in total number so as to avoid a rapidly growing number of pairwise comparisons. Alternative methods exist when criteria have strong mutual dependencies, e.g., the Analytic Network Process (ANP) [41].

Because of its cognitive simplicity (reducing complex decisions down to a series of pairwise comparisons), AHP captures people’s judgments accurately and has been shown in many settings to be reliable and robust [42]. In order to give flexibility in the number of project options considered (since respondents had the possibility in part 1 of the questionnaire to add an option) without losing accuracy in the calculations, we selected the multiplicative variant of AHP [40].
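To make the multiplicative structure concrete, the sketch below converts one respondent's pairwise grades on the -8 to 8 scale into normalized scores for a single criterion, using row geometric means. It is an illustration under stated assumptions rather than the exact calculation used in this study: the scale parameter `GAMMA = ln 2`, the option labels A, B, and C, and the grades themselves are hypothetical.

```python
import math
from itertools import combinations

GAMMA = math.log(2)  # assumed scale parameter; not specified in the text


def multiplicative_ahp_scores(options, grades, gamma=GAMMA):
    """Turn pairwise grades on the -8..8 scale into normalized scores.

    grades[(a, b)] > 0 means option a is judged to perform better than b.
    """
    n = len(options)
    log_score = {option: 0.0 for option in options}
    for a, b in combinations(options, 2):
        grade = grades[(a, b)]
        # Each grade is a log-ratio: r_ab = exp(gamma * grade), r_ba = 1 / r_ab.
        log_score[a] += gamma * grade / n
        log_score[b] -= gamma * grade / n
    # exp of the mean log-ratio equals the geometric mean of a row of ratios.
    raw = {option: math.exp(value) for option, value in log_score.items()}
    total = sum(raw.values())
    return {option: value / total for option, value in raw.items()}


# Hypothetical, fully consistent grades for three options on one criterion.
options = ["A", "B", "C"]
grades = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 2}
scores = multiplicative_ahp_scores(options, grades)
print(scores)
```

Because the grades enter as exponents, adding a fourth option only adds rows and columns to the comparison set; the score ratios among the original options are unchanged when the new judgments are consistent, which is the property that motivates the multiplicative variant here.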

The questionnaire iterated randomly through all selected criteria and asked respondents, “For each pair of project options below, which one do you believe would perform better in terms of <criterion>?” This formulation is important so that each alternative is assessed from the perspective of positive performance, independently of whether the criterion is a cost or a benefit. The multiplicative AHP scale was adapted to this case based on Lootsma [43, 44], as shown in Figure 3. A full description of the scale was available to respondents by clicking a button.

In addition to accuracy, objectivity is also considered to be an important aspect of assessment quality. Ideally assessments of project performance should be “objective” and value-free—in other words, separate from people’s preferences or desires. In reality, however, respondents may be subject to motivational bias (e.g., exaggerating the objective assessment in favor of their preferred option), whether consciously or unconsciously [45]. In an effort to reduce motivational bias due to organizational affiliation, respondents were asked to answer questions from their “individual perspective based on your cumulative knowledge and experience, not just as a representative of your current organization or job.” Furthermore, performance assessments were averaged across all respondents, thus reducing the impact of any particular assessment on the results.

Even if it is not possible to guarantee complete objectivity in outcome for each assessment, it is important to note that the assessment process was procedurally objective, i.e., the assessment of each criterion was conducted in exactly the same way, regardless of how important the criterion was considered to be.

Project performance assessments are averaged across all respondents. Calculated for each of the 28 criteria separately, the average performance assessment for each criterion is the geometric mean of all individual pairwise comparisons. Using standard formulas detailed in Olson et al. [24], Table 6 shows the calculation for the assessment example used in Figure 3.
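The averaging step can be sketched as follows for one pair of options on one criterion. On a multiplicative scale, the geometric mean of the respondents' ratio judgments is equivalent to the arithmetic mean of their grades. The grades and the scale parameter `gamma = ln 2` are assumptions for illustration, and the fragment does not reproduce the calculation in Table 6.

```python
import math


def mean_grade(grades):
    """Arithmetic mean of grades on the -8..8 scale."""
    return sum(grades) / len(grades)


def geometric_mean_ratio(grades, gamma=math.log(2)):
    """Geometric mean of the ratio judgments implied by the grades."""
    ratios = [math.exp(gamma * grade) for grade in grades]
    return math.prod(ratios) ** (1 / len(ratios))


# Hypothetical grades from three respondents comparing the same two options.
respondent_grades = [4, 2, 0]

via_grades = math.exp(math.log(2) * mean_grade(respondent_grades))
via_ratios = geometric_mean_ratio(respondent_grades)
print(via_grades, via_ratios)
```

Both routes give the same aggregate ratio (up to floating-point rounding), so the averaging can be done directly on the grades in a spreadsheet.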

4. Results

4.1. Performance Assessments for Each Criterion

Before project performance assessments for each criterion can be averaged across respondents, each individual’s assessment should be checked for internal consistency. For the AHP process, one rule of thumb that has been suggested is to exclude all inconsistent judgments over 10% [46]. This threshold was developed for comparing large numbers of items (such as 8x8 matrices) and has been applied primarily in workshop settings, where the AHP process often includes a review session/round where inconsistent answers are discussed and adjusted. With an online questionnaire, inconsistent answers can either be flagged during data collection (through an immediate consistency-calculation-and-answer-adjustment feature) or be discarded later during data analysis. The former lengthens the time required for respondents to complete the performance assessments. Given the many other important areas of response elicitation in the appraisal process, along with a desire to make sure the total time commitment did not go beyond 1-1.5 hours (which for a typical questionnaire would be unthinkably long, but conducted in an interview setting was acceptable to respondents), it was decided to forgo requiring fully consistent performance assessments before moving on. The questionnaire-based process therefore consisted of a single round, leaving only a binary choice in the analysis stage to either include or exclude answers. A strict consistency threshold was found to favor assessments that gave more equal performance to all three options, whereas higher thresholds allowed more differentiated judgments to be included. Furthermore, for most answers up to 50% inconsistency, the intention of respondents was still clear even if the use of the scale was not entirely accurate. 
For example, in terms of journey experience, one respondent assessed HS2 as outperforming both the M1 alignment and the WCML upgrade equally (value = 4 for both), yet also assessed the WCML upgrade as underperforming the M1 alignment (value = -4). This implies that a higher number (i.e., 6 or 8) should have been entered for HS2’s performance vis-à-vis WCML. The essence of the answer (i.e., HS2 > M1 > WCML) is nonetheless clear. The 10% threshold was therefore deemed needlessly strict, and 33% was selected as a more appropriate threshold for data validity. This resulted in dropping 36 out of 238 total assessments, leaving 202 valid assessments in the results.
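The journey experience example can be checked with a simple transitivity test. Since grades on the multiplicative scale behave like log-ratios, consistent judgments should add up along a chain of comparisons. The function below is a hypothetical helper that measures the gap in scale units; it is not the percentage inconsistency measure behind the 10% and 33% thresholds, which is not reproduced here.

```python
def transitivity_gap(grade_ab, grade_bc, grade_ac):
    """Gap between the stated a-vs-c grade and the one implied by a-vs-b and b-vs-c."""
    return abs(grade_ac - (grade_ab + grade_bc))


# Grades from the journey experience example in the text.
hs2_vs_m1 = 4
hs2_vs_wcml = 4
wcml_vs_m1 = -4

# Reversing a comparison flips the sign of its grade.
m1_vs_wcml = -wcml_vs_m1

implied_hs2_vs_wcml = hs2_vs_m1 + m1_vs_wcml
gap = transitivity_gap(hs2_vs_m1, m1_vs_wcml, hs2_vs_wcml)
print(implied_hs2_vs_wcml, gap)
```

The implied grade for HS2 against WCML is 8 rather than the stated 4, which matches the observation that a higher number should have been entered while the ordering of the three options remains clear.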

Figure 4 presents the project performance assessment results for this case. Each bar shows the relative performance of the three projects on one criterion. The number to the left of each bar is the number of valid assessments included in the average for that criterion (number of respondents who selected that criterion). The discrepancy in number of assessments for different criteria reflects the variation in their perceived importance and/or relevance to the case. Figure 4 also provides an overview of how well each project performs on each of the three impact categories: direct impacts, indirect societal impacts, and environmental impacts. Within each category, the criteria are ranked from best to worst performance on HS2 Phase I (for viewing purposes). The performance assessment results are presented numerically in Table 7.

On the limited set of criteria representing the official project objectives that guided the government’s HS2 appraisal process, in this appraisal HS2 was likewise assessed as outperforming the alternatives. This included regional economic development & regeneration, a goal which became increasingly important in the argumentation promoting HS2 (see [4]). Of all the official goals of HS2, only transport integration & connectivity was assessed as performing poorly compared to the WCML upgrade option. This was explained in interviews as being due to the nature of HSR (limited number of stations) as well as to the lack of direct connection with existing transport hubs (e.g., the New Street station in Birmingham, and to some extent airports).

Looking beyond the narrow set of assessment criteria that represent the official project goals, the picture is quite different. Most notably, both the M1 alignment and especially the WCML upgrade outperformed HS2 Phase I on almost all environmental criteria. This assessment confirmed the concern raised by the House of Commons environmental committee report regarding HS2 Phase I’s poor performance on biodiversity & nature [47]. HS2 Phase I performance was also found to be poor on a number of societal criteria, such as accessibility, land use, landscape, and equity & distributional effects.

Impact assessment is central to appraisal outcomes; however, as the project performance results in Figure 4 show, impact assessment on its own does not point to clear winners or losers. It depends on the weights that are assigned to the various criteria.

4.2. Assignment to Stakeholder Groups

Whereas the project performance assessments are averaged across all respondents, the criteria weights are calculated for each stakeholder group separately. This is because the criteria weights are the component of the appraisal process that represents opinions, and the goal of this research is to present multiple stakeholder perspectives. Before the weights can be calculated, it is necessary to specify the subgroups whose opinions are to be presented.

As described in Section 3.3.4, the overall group of stakeholders targeted in this paper was transport planners and experts. In order to identify subgroups within that larger group, the questionnaire included questions on professional background and experience. The responses to these questions were then used to determine the subgroups.

Respondents were assigned to subgroups in two steps (see Figure 5): (1) applying a “sustainability expertise” filter and (2) categorizing by sector of employment. To qualify as a “sustainability expert,” the respondent had to meet two of the following three criteria:
(i) Have formal education in environmental studies (university degree or university-level coursework)
(ii) Conduct environmental analysis of HS2/transport infrastructure “to a great extent”
(iii) Conduct analysis of HS2/transport infrastructure primarily at “society-level (wider economic impacts, social/environmental issues)” rather than “project-level (system design, user benefits, project costs, etc.)”

As it turned out, the sectors aligned closely with sustainability expertise (though this would not have to be the case), resulting in the following four groups of transport professionals:
(1) Government transport professionals: all employed at various levels of government, local, regional, or national (it is noteworthy that none of the respondents employed in government/public sector met the criteria for sustainability expertise).
(2) NGOs: all belonging to nongovernmental organizations, representing various local, regional, or national interests.
(3) Sustainable transport researchers: all academic transport professionals with sustainability expertise.
(4) Other transport professionals: all those not included in any of the above groups, including conventional transport planners [48] working in the private sector or in academia.
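
The two-step assignment can be summarized as a simple rule. The sketch below is one possible reading of that rule, not the procedure shown in Figure 5: it assumes that "two of three" means at least two, that sector takes precedence for government and NGO respondents, and that the sector labels and field names (all hypothetical) capture the questionnaire answers.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    sector: str                      # "government", "ngo", "academia", or "private" (assumed labels)
    env_education: bool              # criterion (i)
    env_analysis_great_extent: bool  # criterion (ii)
    society_level_focus: bool        # criterion (iii)


def is_sustainability_expert(r: Respondent) -> bool:
    """Step 1: the respondent meets at least two of the three criteria (assumed reading)."""
    met = sum([r.env_education, r.env_analysis_great_extent, r.society_level_focus])
    return met >= 2


def assign_group(r: Respondent) -> str:
    """Step 2: combine the expertise filter with the sector of employment."""
    if r.sector == "government":
        return "Government transport professionals"
    if r.sector == "ngo":
        return "NGOs"
    if r.sector == "academia" and is_sustainability_expert(r):
        return "Sustainable transport researchers"
    return "Other transport professionals"


# Hypothetical respondent: an academic meeting criteria (i) and (iii).
example = Respondent("academia", True, False, True)
print(assign_group(example))
```

Writing the rule down in this form makes the categorization reproducible and exposes the edge cases (for example, a government respondent with sustainability expertise) that did not arise in this sample.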

4.3. Criteria Weights for Each Stakeholder Group

In contrast to the performance assessments, which were averaged across all respondents, the criteria weights were calculated for each group separately. Weights taken from the slider scale (Figure 2) were recorded to one decimal place and normalized. Normalization allows the responses to be grouped by type of respondent; the group weights are then obtained by averaging the responses of all respondents in each group. This process highlights the different perspectives held by different groups with regard to which criteria matter most. If a criterion was not selected (and therefore not weighted) by any of the respondents in that group, then it received a weight of zero.
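
A minimal sketch of this calculation is given below. It assumes that each respondent's slider values are first rescaled to sum to 1 and that a criterion a respondent did not select counts as zero for that respondent; both are assumptions about details not spelled out here. The criteria subset and the slider values are made up for illustration.

```python
def normalize(raw):
    """Scale one respondent's slider weights so that they sum to 1."""
    total = sum(raw.values())
    if total == 0:
        raise ValueError("respondent assigned no weight to any criterion")
    return {criterion: w / total for criterion, w in raw.items()}


def group_weights(responses, criteria):
    """Average the normalized weights over all respondents in one group.

    A criterion that a respondent did not select contributes 0 for that
    respondent, so a criterion selected by nobody in the group gets weight 0.
    """
    normalized = [normalize(r) for r in responses]
    n = len(normalized)
    return {c: sum(r.get(c, 0.0) for r in normalized) / n for c in criteria}


# Illustrative input: two respondents, four of the criteria.
criteria = ["journey time", "project costs", "carbon footprint", "landscape"]
group = [
    {"journey time": 8.0, "project costs": 2.0},
    {"journey time": 5.0, "carbon footprint": 5.0},
]
weights = group_weights(group, criteria)
for criterion, w in weights.items():
    print(f"{criterion}: {w:.2f}")
```

Because every respondent's weights sum to 1 before averaging, the group weights also sum to 1, which makes the radial plots of different groups directly comparable.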

Resulting criteria weights for each of the four subgroups are shown graphically as radial plots in Figure 6. Each plot has 28 axes (one for each criterion) measured in percentage points. The more weight is assigned to a given criterion, the further its data point is plotted from the origin. More important criteria (as prioritized by each group) are the peaks in the radial plot, and less important criteria are the troughs. These results are presented numerically in Table 8.

Not only do the graphs in Figure 6 show the results for individual criteria, but because they are grouped by impact category, they also provide an overview of the relative importance of each category. For example, government transport professionals assigned far more weight to direct project impacts than to environmental impacts, whereas sustainable transport researchers were (relatively) more balanced in their prioritization of the three categories.

Although the sample sizes for each subgroup are not large, the results can still be taken seriously. For groups that are fairly homogeneous, research on the number of interviews required for reaching “saturation” (where answers converge and additional judgments do not add new information or influence the overall result) suggests a target of 12 and a cut-off of 6 responses to ensure validity [49]. This means that the results for government transport experts (8 respondents) and sustainable transport researchers (12 respondents), and probably also for “other” (consisting of conventional transport professionals in private sector and research), depending on the assumed level of homogeneity for that group, can be accepted as reasonably robust. The NGO perspective, with only 3 respondents, lacks sufficient data to be considered fully valid, but still constitutes anecdotal evidence.

4.4. Project Preferences for Each Stakeholder Group

To see how these perspectives translate into project preferences and decision-relevant outcomes, it is necessary to combine the project performance assessments from Figure 4 with the criteria weights in Figure 6. Figure 7 shows project preferences for each of the four expert groups, based on the group-average criteria prioritization and the all-respondent-average performance assessments.
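
The combination step can be sketched as follows. The sketch assumes a simple weighted sum of per-criterion performance shares (each criterion's shares summing to 1 across the three options); multiplicative AHP variants may aggregate differently, so this is an illustration of the principle rather than the exact calculation behind Figure 7. All numbers and the two group labels are made up.

```python
PROJECTS = ("HS2", "M1", "WCML")


def project_preferences(weights, performance):
    """Weighted sum of per-criterion performance shares for each project.

    weights:     {criterion: weight}, summing to 1 for the group
    performance: {criterion: {project: share}}, shares summing to 1 per criterion
    """
    return {
        p: sum(w * performance[c][p] for c, w in weights.items())
        for p in PROJECTS
    }


# Made-up numbers for illustration only; not results from this study.
performance = {
    "journey time": {"HS2": 0.5, "M1": 0.3, "WCML": 0.2},
    "carbon footprint": {"HS2": 0.2, "M1": 0.3, "WCML": 0.5},
}
planner_weights = {"journey time": 0.75, "carbon footprint": 0.25}
researcher_weights = {"journey time": 0.25, "carbon footprint": 0.75}

planner_pref = project_preferences(planner_weights, performance)
researcher_pref = project_preferences(researcher_weights, performance)
print(planner_pref)
print(researcher_pref)
```

The same performance assessments lead to opposite rankings once the weights change, which is the mechanism by which group perspectives translate into different project preferences.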

Transport experts in government are found to favor a high-speed rail solution, but they are ambivalent regarding alignment. This outcome for transport experts in government is in slight contrast with decision-makers in government who clearly supported the original HS2 alignment, which other studies have explained by an initial political decision to favor the higher speed option [9]. In the present study, the proposed HS2 alignment and the alignment along the M1 motorway corridor, both at 38%, are equally preferred, whereas the WCML upgrade, at 24%, is considered significantly less preferable. The main reason for favoring an HSR option is the high prioritization given to regional economic development & regeneration, on which both HSR options score highly. Other transport professionals tend to prefer the M1 alignment over HS2, mostly because of a relatively lower importance given to journey time. In their case, the WCML upgrade option is also deemed a viable alternative, due mostly to the high prioritization given to project costs (despite uncertainties with actual costs of an upgrade). Sustainable transport researchers, on the other hand, see the WCML upgrade as clearly preferable to either HSR option, with a slight preference for the M1 alignment as a second choice (see also [39] for more on sustainability viewpoints). This is due to the low priority given to journey time and also to the high prioritizations given to accessibility, transport integration & connectivity, and carbon footprint. NGOs were found to strongly support the conclusions of sustainability advocates with regard to preferring a WCML upgrade, but for different reasons: they were more concerned about impacts on transport integration & connectivity, land use & urban development, and landscape & cultural heritage.

This paper’s proposed method not only displays relative project preferences for each subgroup, but also, and importantly, can explain why different groups favor different options. Large transport infrastructure projects are often contentious and contested. In the end, the debate over HS2 was driven largely by political goals that had little connection to expert analyses [4]. This adversarial, politically driven debate could likely have been avoided or mitigated had the parties been better able to understand each other’s underlying interests. Because these priorities are hidden or assumed in prevailing methods such as CBA, the debate moves away from discussing interests and towards arguing positions—an unproductive style of conflict resolution that exacerbates contention and hostility [50].

By contrast, MCA approaches make the criteria weights that drive project preferences explicit. If these priorities are made explicit for multiple groups (i.e., the MAMCA approach), then the debate can shift from positions (i.e., “best project”) to interests (i.e., which criteria matter most). Figure 8 presents the top 10 criteria, in order of importance, for each perspective. Together, these account for roughly 80% of each group’s total criteria weight. By contrast, the other 18 criteria (see Table 8 for details) account for less than 20% of each group’s perspective. Moreover, it is helpful to notice that many of the same criteria are of concern to multiple groups, albeit ranked in different order. The fact that many of the same criteria end up in multiple groups’ top 10 can serve as a means for finding common ground and thus help resolve conflicting views. Indeed, among the 18 criteria featured in these four top-10 lists, eight of them are shared by at least three groups, and one—carbon footprint—is shared by all four groups.
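
The top-10 lists, their share of total weight, and the overlap between groups can be derived mechanically from the group weights. The sketch below shows one way to do so; the helper names, group labels, and weights are hypothetical, and a list length of 2 is used only to keep the example small.

```python
from collections import Counter


def top_criteria(weights, n=10):
    """Return the n highest-weighted criteria, most important first."""
    return sorted(weights, key=weights.get, reverse=True)[:n]


def top_share(weights, n=10):
    """Fraction of the total weight carried by the top n criteria."""
    total = sum(weights.values())
    return sum(weights[c] for c in top_criteria(weights, n)) / total


def shared_criteria(top_lists, min_groups):
    """Criteria that appear in the top lists of at least min_groups groups."""
    counts = Counter(c for top in top_lists.values() for c in set(top))
    return sorted(c for c, k in counts.items() if k >= min_groups)


# Made-up weights for two hypothetical groups.
weights_by_group = {
    "A": {"journey time": 0.5, "carbon footprint": 0.3, "project costs": 0.2},
    "B": {"carbon footprint": 0.6, "accessibility": 0.3, "journey time": 0.1},
}
tops = {g: top_criteria(w, n=2) for g, w in weights_by_group.items()}
common = shared_criteria(tops, min_groups=2)
print(tops, common)
```

Criteria returned by the overlap step are natural starting points for a discussion of common ground between groups.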

4.5. Sensitivity Analysis

For the criteria weights, it was suggested that input should be solicited from a minimum of 6 respondents per subgroup in order to adequately capture the group’s perspective. But what about the performance assessments? How many respondents should be required to assess each criterion in order to feel confident that the results are reliable? Future research is needed to answer this question definitively, but as a starting point, it is worth noting that standard current practice assumes that conducting assessments, unlike assigning weights, is an objective process and therefore does not need to address the concept of convergence of opinion. MCA methods do not require multiple experts to conduct the performance assessments and frequently rely on only one or a few experts in total for the assessment process. In this research, 33 experts conducted 202 assessments across 28 criteria in a process that was far more comprehensive than is the norm. Furthermore, every single criterion was assessed by at least 2-3 people. Compared to standard practice, this should be more than sufficient. However, it is still worth asking whether assessments might be improved if more experts are involved, and if so, how many?

In earlier research, Cornet (2016) [51] applied a strict standard of requiring a minimum of 4 independent assessments per criterion. This necessitated dropping some criteria from subsequent calculations. Since 10 criteria received only 2-3 impact assessments, applying a minimum of 4 valid assessments per criterion meant basing the project preference calculations on only 18 criteria (8 direct project impacts; 6 indirect societal impacts; 4 environmental impacts). As it turns out, compared with leaving all criteria in the analysis, this makes essentially no difference to the outcomes for any of the four groups (37%, 39%, 24% instead of 38%, 38%, 24% for government; NGOs unchanged; 28%, 31%, 41% instead of 27%, 31%, 42% for sustainable transport researchers; 32%, 28%, 31% instead of 30%, 37%, 33% for other transport professionals). This is explained by the fact that criteria receiving fewer assessments tended to be those deemed less important for decision-making in this context. Whether criteria are formally dropped or whether they are prioritized with very small weights, the effect on project preferences is minimal.
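
This sensitivity check can be sketched as follows. The sketch assumes that the weights of the remaining criteria are rescaled to sum to 1 after sparse criteria are dropped and that preferences are a weighted sum of performance shares; both are assumptions, and all numbers are made up to show the mechanism rather than to reproduce the percentages above.

```python
def drop_sparse_criteria(weights, n_assessments, min_assessments=4):
    """Remove criteria with too few assessments and rescale the rest to sum to 1."""
    kept = {
        c: w for c, w in weights.items()
        if n_assessments.get(c, 0) >= min_assessments
    }
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no weighted criteria left after filtering")
    return {c: w / total for c, w in kept.items()}


def preferences(weights, performance, projects):
    """Weighted sum of per-criterion performance shares for each project."""
    return {p: sum(w * performance[c][p] for c, w in weights.items()) for p in projects}


# Made-up inputs: one low-weight criterion has only two assessments.
projects = ("HS2", "M1", "WCML")
weights = {"journey time": 0.60, "carbon footprint": 0.35, "noise": 0.05}
n_assessments = {"journey time": 12, "carbon footprint": 9, "noise": 2}
performance = {
    "journey time": {"HS2": 0.5, "M1": 0.3, "WCML": 0.2},
    "carbon footprint": {"HS2": 0.2, "M1": 0.3, "WCML": 0.5},
    "noise": {"HS2": 0.2, "M1": 0.4, "WCML": 0.4},
}

full = preferences(weights, performance, projects)
reduced_weights = drop_sparse_criteria(weights, n_assessments)
reduced = preferences(reduced_weights, performance, projects)
max_shift = max(abs(full[p] - reduced[p]) for p in projects)
print(f"largest change in any project's preference share: {max_shift:.3f}")
```

When the dropped criteria carry little weight, as in this toy case, the preference shares move only marginally, which mirrors the explanation given above.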

5. Discussion: Reflections on the Appraisal Process

Although the MAMCA process is straightforward and well documented, this study shows how a number of details need to be carefully considered in applying it to large-scale transport projects such as HS2 Phase I. Specifically, the tool should balance the need to deliver presentable results with the goal of promoting learning and "negotiation". Although some of the procedural details of our appraisal process differ from the standard MAMCA process, the essential purpose—to solicit and make explicit the perspectives of multiple actors—remains the same. Indeed, some of these procedural details may be adapted to other MAMCA applications.

This study also provides some useful lessons on specific procedural aspects of transport appraisal processes. Our results support the findings of others (see Section 3.4.3) that multiplicative AHP is a robust method of capturing judgments. Validity for transport appraisal that addresses sustainability issues requires comprehensiveness in the list of relevant impacts [23]. While it is important that all impacts are included in the initial list, it is not necessary for all respondents to assess all impacts. A large enough number of respondents will eventually cover all aspects of the scheme, with performance assessments eventually converging. This reduces demands on individual respondents.

In terms of data collection methods, we found the structured interview format (individual semistructured interviews based on a structured online questionnaire) to be very helpful in ensuring the quality of respondents’ input. It allows the interviewer to clarify the steps along the way, to address concerns about the method, and to ensure understanding of the alternatives, criteria, and scales used. For example, understanding what respondents have in mind when selecting a particular criterion is necessary to ensure that weights and assessments are properly ascribed and can subsequently be aggregated across respondents. The interview format allows this to happen. The implication is that an assessment in the form of a “hands-off” online survey is much less likely to be answered properly, if at all. A further advantage of the interview technique, compared with a workshop setting, is that it helps obtain the (confidential) views of those respondents who are opposed to the project or the method, or who would normally stay more “quiet” in a bigger group. The interview format also utilizes respondents’ time more efficiently than a workshop setting, requiring only 1-1.5 hours, which can be scheduled at a time of their convenience.

The grouping of respondents into homogeneous subgroups can be challenging in practice. At one extreme, there are as many perspectives as there are respondents. Clear rules for categorizing respondents are essential to the transparency of this exercise.

6. Conclusion and Future Research

Motivated by a concern that standard transport appraisal methods do not adequately incorporate diverse perspectives on the impacts of large-scale transport projects, this research aimed to develop and test new ways of presenting multiple stakeholder perspectives, explicitly and systematically. We evaluated the feasibility of applying a modified MAMCA to assess the implications of large transport infrastructure projects.

Our modifications included focusing on a particular subset of stakeholders (transport professionals), yet defining that subset broadly to incorporate diverse types of expertise, including energy, environment, and sustainability, as they relate to transport issues. The diversity of perspectives within this subset of stakeholders, even if not representative of all stakeholders, enables the method to be successfully demonstrated and could be adapted or expanded to include other stakeholders.

A further modification in our process is the broadening of the pool of transport professionals and experts engaged in assessing and comparing the impacts of project alternatives, which also implies a broadening of those who choose the criteria for assessment. While our process engages the same group of transport experts in the assessment of project impacts and in the weighting of criteria, it would be possible to decouple these steps and ask a broader group of stakeholders to weight the criteria than are asked to assess the project impacts.

Finally, by not continuing the MAMCA process to the point of recommending a particular project option, our process puts a stronger emphasis on demonstrating the influence of different stakeholder perspectives on assessment outcomes. Although Macharis et al. [6] do propose a mechanism for synthesizing perspectives to come up with project recommendation(s), they stress the fundamental importance of including the descriptions of multiple perspectives as part of the final output. This paper’s presentation of multiple perspectives as final results is therefore very much in the spirit of MAMCA.

With regard to specific procedural aspects of appraisal processes, the most important learning from this research is the usefulness of conducting semistructured interviews in conjunction with an online questionnaire for the assessment and weighting process within MCA.

The proposed comparative stakeholder approach will provide planners and decision-makers with a means of quantifying indirect impacts, thereby making them more visible and comparable. In the context of transport appraisal, gaining such visibility is critical to avoid giving default priority to those impacts that are more easily quantifiable.

More fundamentally, the approach developed here contributes to the shift towards more participatory, discursive, and civic types of assessment. It can help develop more systematic “active stakeholder management” procedures which make it possible to “assess the extent to which stakeholder preferences are conflicting or converging” [35]. A possible extension to explore could be introducing penalties in the ranking for projects that generate widely diverging views, whereas those with greater consensus would be rewarded—the consistency of support thus becoming a criterion within the MCA.
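
One hypothetical form such a penalty could take is sketched below: each project's score is its mean preference across groups minus a multiple of the spread between groups. The functional form, the `penalty` parameter, and the equal treatment of groups are assumptions made for illustration only. The input uses the preference shares reported in Section 4.5 for two of the four groups, read in the order HS2, M1, WCML.

```python
from statistics import mean, pstdev


def consensus_adjusted_scores(prefs_by_group, penalty=1.0):
    """Mean preference across groups minus a penalty times the between-group spread."""
    projects = list(next(iter(prefs_by_group.values())))
    scores = {}
    for p in projects:
        values = [prefs[p] for prefs in prefs_by_group.values()]
        scores[p] = mean(values) - penalty * pstdev(values)
    return scores


# Preference shares for two of the four groups (partial illustration).
prefs_by_group = {
    "government": {"HS2": 0.38, "M1": 0.38, "WCML": 0.24},
    "sustainable transport researchers": {"HS2": 0.27, "M1": 0.31, "WCML": 0.42},
}
scores = consensus_adjusted_scores(prefs_by_group, penalty=1.0)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With these two groups, the option on which views diverge least moves to the top of the ranking; how strongly divergence should be penalized would itself be a matter for stakeholder discussion.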

This paper demonstrates a new approach for incorporating sustainability in transport appraisal. From a practical perspective, the approach allows for including a standard and comprehensive set of sustainable transport criteria that could be used for ex-ante assessment, monitoring, and ex-post evaluation [23, 52]. If the list of criteria is sufficiently comprehensive to include impacts of a wide range of transport projects, this allows for comparing projects with different types of goals. With this in mind, further research on distinguishing between normative preference and contextual relevance of impacts in MCA/MAMCA may be useful if absolute and global sustainability objectives were to be given weights independently of the goals of a specific scheme (see Section 3.4.1). To improve the communication of results, there is also potential for providing further sensitivity analysis to illustrate which criteria affect the differences in ranking between groups, which in turn could inform the decision-making processes.

There is evidence from the official HS2 appraisal material that the stakeholders involved criticized the government for not considering alternatives to HS2 [9]. This suggests there is great advantage in formalizing the use of an MCA-based approach early in the appraisal process, at a stage when wider options are still being considered. Because MCA requires options to compare against and because the results are expressed in terms of relative desirability of projects, it requires the explicit consideration of more than one project, and these projects must be considered on “equal” terms. This approach effectively systematizes the inclusion and assessment of options in sustainable transport appraisal processes. Doing so would enhance the capacity to analyze conflicting views, transparency of process, and accountability, both to current and future generations.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are grateful to the Strategic Research Council of Denmark (Innovationsfonden), which funded the SUSTAIN research project.