#### Abstract

A well-developed perspective in the study of urban systems is that cities are complex systems that manifest as networks of interdependent economic units. These units might be occupations, industries, labor skills, patent technologies, etc. Much research has focused on describing the nature of these networks, quantifying their links, and suggesting applications for policymakers. In this paper, we examine the US skill network, focusing on the relationship between network centrality and economic performance. Here, nodes are represented by individual labor skills, and edge weights are derived from the colocation pattern of skill pairs among 384 US metropolitan statistical areas. The centrality of skills, using three centrality measures, is then aggregated to the occupational and metropolitan level. We find that occupations with higher skill centrality are associated with greater annual salaries, and metropolitan areas with higher skill centrality have higher productivity rates. Overall, these results suggest that the application of traditional network metrics to this view of cities as complex networks can offer new insights into the dynamics of regional economies.

#### 1. Introduction

Cities are among the most discussed complex systems. Complex systems are composed of numerous independent parts [1] and exhibit “nontrivial emergent and self-organizing behaviors” [2], which are commonly characterized by fat-tailed distributions [3]. Thus, a characteristic of any complex system is that structure emerges from disaggregated, but interconnected, interacting parts. The myriad independent agents working and interacting in a city produce a macrostructure that is unlikely to be deduced by simply examining the component individuals, which is the hallmark of complex systems. This is especially true given that these agents are embedded in nested subsystems, such as infrastructure, governance, and ecosystems, each of which is a complex network of interacting parts [4]. The intractability of modeling the multilayered complexities of cities has given rise to the view of cities as complex systems [5–10] and the growing application of complexity economics [11]. Indeed, cities are commonly discussed in the complexity literature more broadly [3, 6, 12, 13]. Given this growing consensus, advancing our understanding of cities as complex systems is imperative for informing coherent regional and inter-regional economic policies.

Fortunately, the literature examining cities, used here synonymously with regional economies, from a complexity perspective is an active and growing area of research. This rapid expansion has brought a wide variety of modeling and analytical techniques into use. Fuentes [14] gives a complete description of methods used in complex systems science. Two commonly used techniques for empirically analyzing cities as complex systems are power-law scaling analysis and network analysis. Early scaling analysis found that petrol stations scale sublinearly with city size, that is, there are fewer petrol stations per capita as cities grow larger [15]. Subsequent work uncovered a broader pattern: while wealth creation and innovation scale superlinearly, infrastructure in general scales sublinearly [16, 17]. Scaling analysis of cities has proliferated and is central to the study of cities as complex systems, also referred to as urban science [6].

The second analytical technique commonly used to analyze regional economies as complex systems is network analysis. For example, international trade data has been analyzed as a bipartite network of countries and goods to assess the complexity of a country’s product space [18]. Technology spaces based on the co-occurrence of patents in the same city have also been analyzed with network analysis [19, 20]. Regional industrial data has been analyzed as networks of industries linked by the relatedness of production based on co-occurrence to examine regional diversification over time [21]. National input-output tables have been analyzed as networks of trade with standard network centrality measures applied to analyze the network and relationships regarding regional economic resilience [22]. Finally, the co-occurrence of skills in industries and occupations has been conceptualized as networks to examine the inter-relatedness of skills [23, 24].

The rapidly growing body of literature analyzing cities as complex systems provides numerous opportunities to research previously unknown areas as well as distill existing research into actionable items for policymakers. Novel directions in research include the reformulation of network metrics and more thorough examinations of economic networks. For example, prior work examining skill networks has not explored the impact of node centrality in the networks that underlie the analysis [23, 24].

This paper contributes to the literature on cities as complex systems by analyzing how the centrality of skills in a skill network relates to the occupations that require those skills and to the cities where those occupations are located. Prior work has shown that firms diversify into new activities that are “skill-related” to their current activities and that export diversification is contingent on capabilities [25, 26]. If occupations are viewed as a collection of required skills, similar to the activities of firms or capabilities of countries, then the relationships of skills may play a role in occupational choices, which would impact occupational wages and, in turn, regional output.

In this study, the relatedness of skills is quantified based on the colocation of skills at the regional level. The colocation of skills in a region is used as the basis for a skill network to derive the centrality of skills in the overall network. We anticipate that skills with many connections in the skill network, or skills that occupy unique positions, are likely to be influenced by their connections and positions. Occupations with skills that occupy unique roles in the skill network may allow occupations with highly central skills to command higher wages, resulting in greater output at the regional level. That is, a skill that is positively linked to many other skills may allow workers with the skill to command higher wages because the skill allows those with the skill to perform the highest paying tasks. Aggregating to the regional level, it is perhaps the case that regions with more central skills would record greater economic output per capita. Explicitly, we hypothesize that skill centrality will result in higher wages at the occupational level and higher output per capita at the regional level.

#### 2. Data and Methods

##### 2.1. Occupational Employment

Employment data are taken from the 2018 Occupational Employment and Wage Statistics (OEWS) from the US Bureau of Labor Statistics (BLS) [27]. The OEWS captures annual employment in nearly 800 occupations for each of nearly 400 metropolitan statistical areas.

##### 2.2. Skills Data

We use O∗NET data, maintained by the Occupational Information Network, as a proxy for skills. O∗NET data measure hundreds of elements or aspects of occupations defined in the US Standard Occupational Classification (SOC) system. Elements are grouped into various categories, such as skills, activities, knowledge, and abilities. Examples of elements include reading comprehension, negotiating, self-control, integrity, and knowledge of mathematics. Surveys are used to quantify the level and importance of each element for all occupations. Here, we use the level of each element for each SOC occupation. Where O∗NET disaggregates SOC codes beyond what is reported in the OEWS, we take the average level. We use O∗NET version 24.2 [28], which includes 161 elements that have a ‘level’ value. We refer to these elements as “skills” throughout this paper.
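The averaging step for disaggregated codes can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes that detailed O∗NET-SOC codes take the form of a SOC code plus a two-digit suffix (for example, `11-1011.03`), that a simple unweighted mean is intended, and the function name `average_to_soc` and all level values are hypothetical.

```python
from collections import defaultdict

def average_to_soc(levels):
    """Average level values over detailed codes that share a SOC code.

    `levels` maps (onet_soc_code, element_name) -> level value. The first
    seven characters of a code such as "11-1011.03" are taken as the SOC code.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (onet_code, element), level in levels.items():
        key = (onet_code[:7], element)
        sums[key] += level
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical level values, for illustration only.
example = {
    ("11-1011.00", "Reading Comprehension"): 5.0,
    ("11-1011.03", "Reading Comprehension"): 4.0,
    ("13-2011.00", "Reading Comprehension"): 4.5,
}
soc_levels = average_to_soc(example)
```

Here the two detailed codes under `11-1011` collapse to a single SOC-level entry, while `13-2011` passes through unchanged.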

For each occupation, each skill is assigned a value on the continuous interval [0,7], which measures the importance of each skill to the occupation’s routine performance of work. Thus, a skill with level = 4 is more critical to an occupation than a skill with level = 1. These values are updated regularly and obtained through a codified process of consultations with occupational analysts, surveys of workers, and input from occupational experts (see Supplementary Materials for the full list of skills analyzed).

##### 2.3. Urban Areas

Our geographic units of analysis are metropolitan statistical areas (MSAs). MSAs are meant to capture cohesive regional economies that may span several counties. An MSA comprises a core county with a population of 50,000 or more and the surrounding counties whose residents commute to the core county. A discrepancy that arises when using OEWS data is that the BLS uses an alternate definition of regional economies known as New England City and Town Areas (NECTAs) for the six states of New England. While we use NECTAs to calculate the interdependence measure, we subsequently drop NECTAs when comparing the centrality measures against measures of regional productivity, which are not published for NECTAs.

##### 2.4. Performance Data

For comparing the centrality measure developed for occupations, we use the annual average salary of occupations from the national OEWS dataset. For analyzing regional skill centrality measures against productivity, we use gross domestic product at the MSA level and regional population provided by the Bureau of Economic Analysis (BEA) [29].

##### 2.5. Method

To calculate the interdependence of skills, we begin by mapping O∗NET skill levels, *l*, to the number of workers, *w*, in an occupation, *o*, in a given MSA, *m*. This captures the total skill level, *s*, each MSA has for any individual skill, *i*. Formally,

$$s_{i,m} = \sum_{o} l_{i,o}\, w_{o,m}.$$
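As a minimal sketch of this aggregation, the total skill level of each MSA can be computed as a matrix product of skill levels and employment. The array layout, the function name `skill_totals`, and all numbers below are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy inputs: 3 occupations, 2 skills, 2 MSAs.
# levels[o, i]  = level l of skill i in occupation o
# workers[o, m] = number of workers w in occupation o in MSA m
levels = np.array([[4.0, 1.0],
                   [2.0, 5.0],
                   [0.0, 3.0]])
workers = np.array([[100.0, 10.0],
                    [50.0, 200.0],
                    [20.0, 30.0]])

def skill_totals(levels, workers):
    """Return s[i, m] = sum over occupations o of levels[o, i] * workers[o, m]."""
    return levels.T @ workers

s = skill_totals(levels, workers)  # shape: skills x MSAs
```

Each entry of `s` sums, over occupations, the skill level multiplied by the number of workers in that occupation and MSA.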

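Next, we determine which MSAs specialize in which skills by adapting the commonly used location quotient. Specifically,

$$LQ_{i,m} = \frac{s_{i,m} \big/ \sum_{i} s_{i,m}}{\sum_{m} s_{i,m} \big/ \sum_{i}\sum_{m} s_{i,m}}.$$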
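A location quotient in its standard form compares a skill's share of an MSA's total skill level with that skill's share across all MSAs. The sketch below assumes this standard form applies directly to the skill totals and that an MSA counts as specialized in a skill when the quotient exceeds one. The function name and the input values are hypothetical.

```python
import numpy as np

def location_quotient(s):
    """LQ[i, m]: share of skill i within MSA m relative to its share overall."""
    s = np.asarray(s, dtype=float)
    msa_share = s / s.sum(axis=0, keepdims=True)              # skill i's share of MSA m
    overall_share = s.sum(axis=1, keepdims=True) / s.sum()    # skill i's share of all MSAs
    return msa_share / overall_share

# Hypothetical skill totals (skills x MSAs).
s = np.array([[500.0, 440.0],
              [410.0, 1100.0]])
lq = location_quotient(s)
specialized = lq > 1.0  # True where an MSA is over-represented in a skill
```

In this toy case the first MSA is specialized in the first skill and the second MSA in the second skill.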

Then, we determine the interdependence of skills using conditional probabilities.

The interdependence of any two skills, *i* and *j*, is calculated by determining how frequently they co-occur as specialized skills. In equation 3, *m*, *m'*, and *m''* are randomly selected MSAs. Skills that co-occur in MSAs more often than would be anticipated at random have a value above one. Skills that co-occur in MSAs less frequently than would be anticipated at random have a value below one. The interdependence values of all skill pair combinations comprise a co-occurrence matrix. The matrix is symmetric, so the relationships it encodes are undirected.
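One way to read this description is as a ratio of observed to expected co-specialization: the share of MSAs specialized in both skills, divided by the product of the shares specialized in each skill separately. The sketch below implements that reading under the assumption that specialization means a location quotient above one. The optional `subtract_one` flag is also an assumption: it recenters the ratio so that the random expectation sits at zero rather than one, which is the convention a threshold at zero would imply. The function name and inputs are hypothetical.

```python
import numpy as np

def interdependence(lq, subtract_one=False):
    """Ratio of observed to expected co-specialization for each skill pair.

    lq[i, m] is the location quotient of skill i in MSA m. Diagonal entries
    are returned but carry no meaning for the network.
    """
    spec = (np.asarray(lq) > 1.0).astype(float)   # skills x MSAs indicator
    n_msa = spec.shape[1]
    joint = spec @ spec.T / n_msa                 # share of MSAs specialized in both
    marginal = spec.mean(axis=1)                  # share of MSAs specialized in each
    expected = np.outer(marginal, marginal)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(expected > 0, joint / expected, np.nan)
    return ratio - 1.0 if subtract_one else ratio

# Hypothetical location quotients: 3 skills x 4 MSAs.
lq = np.array([[1.5, 1.2, 0.5, 0.8],
               [1.3, 1.1, 0.9, 0.7],
               [0.6, 0.9, 1.4, 1.2]])
x = interdependence(lq)
x_centered = interdependence(lq, subtract_one=True)
```

The first two skills always specialize together, so their ratio is above one, while the third never co-occurs with them and its ratio with each is zero.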

We then use the skill co-occurrence matrix to create a network. Nodes in the network are simply the skills in the co-occurrence matrix. The edges in the network are the undirected interdependence values.

After recasting the matrix as a network, we drop interdependence values less than zero. This is done on the grounds that negative values represent the repulsion of two skills. The purpose of this analysis is to focus on skills that are attracted to one another, thus implying the *benefits* of colocation.
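The filtering step can be sketched as a pass over the upper triangle of the symmetric matrix that keeps only pairs above the threshold. The function name `positive_edges` and the interdependence values are hypothetical. The skill names are taken from the text purely as labels.

```python
import numpy as np

def positive_edges(x, names, threshold=0.0):
    """Return undirected edges (name_i, name_j, weight) with weight > threshold."""
    x = np.asarray(x, dtype=float)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):   # upper triangle: no self-loops or duplicates
            if x[i, j] > threshold:
                edges.append((names[i], names[j], float(x[i, j])))
    return edges

# Hypothetical interdependence values centered at zero.
names = ["stamina", "finger dexterity", "active learning"]
x = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, 0.4],
              [-1.0, 0.4, 0.0]])
edges = positive_edges(x, names)
```

The negative pair is dropped, leaving a network that contains only attracting skill pairs.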

For each node in this skill network, we calculate three centrality statistics: weighted degree centrality (*d*), closeness centrality (*c*), and betweenness centrality (*b*). The network software Pajek is used to calculate centralities. Weighted degree centrality is defined as

$$d_i = \sum_{j \neq i} x_{ij},$$

where $x_{ij}$ is the interdependence value of the edge between nodes *i* and *j*. Closeness centrality is defined as

$$c_i = \frac{n - 1}{\sum_{j \neq i} \delta_{ij}},$$

where $n$ is the number of vertices and $\delta_{ij}$ is the length of the shortest path, or geodesic, from node *i* to node *j*. Finally, betweenness centrality is defined as

$$b_i = \sum_{j < k;\; j, k \neq i} \frac{g_{jk}(i)}{g_{jk}},$$

where $g_{jk}$ is the number of geodesics between nodes *j* and *k*, and $g_{jk}(i)$ is the number of those geodesics that pass through node *i*.

The betweenness centrality of a node, $b_i$, is the portion of the geodesics of all node pairs *j* and *k* that go through node *i*. Betweenness centrality measures the importance of a node in connecting various portions of the network. Weighted degree centrality is the number of nodes (skills) that any given node connects with, weighted by the value of the connection. In the case of the skill network, the weighted degree used here is simply the sum of a skill's interdependence values that are greater than zero. The closeness centrality of a node is the number of other nodes, which is the total distance the node would have if every other node were one step away, divided by the sum of the actual shortest distances from the node to all other nodes. It thus measures how near the node is to all other nodes. Here, the “all” variants of weighted degree centrality and closeness centrality are used, which is the appropriate choice for an undirected network.
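For readers without access to Pajek, the three measures can be sketched in plain Python. This is an illustration under stated assumptions, not a reproduction of Pajek's output: shortest paths count hops and ignore edge weights, the graph is assumed to be connected, and betweenness is left unnormalized. The function name `centralities` and the toy graph are hypothetical.

```python
from collections import deque

def centralities(nodes, edges):
    """Weighted degree, closeness, and betweenness for an undirected graph.

    `edges` is a list of (u, v, weight). Path lengths count hops, which is
    an assumption of this sketch. The graph is assumed to be connected.
    """
    adj = {v: [] for v in nodes}
    degree = {v: 0.0 for v in nodes}
    for u, v, w in edges:
        adj[u].append(v)
        adj[v].append(u)
        degree[u] += w
        degree[v] += w

    closeness = {}
    betweenness = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, recording distances and path counts.
        dist = {s: 0}
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for nb in adj[v]:
                if nb not in dist:
                    dist[nb] = dist[v] + 1
                    queue.append(nb)
                if dist[nb] == dist[v] + 1:
                    sigma[nb] += sigma[v]
                    preds[nb].append(v)
        total = sum(dist.values())
        closeness[s] = (len(nodes) - 1) / total if total > 0 else 0.0
        # Brandes-style accumulation of pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in nodes}
        for v in reversed(order):
            for p in preds[v]:
                delta[p] += sigma[p] / sigma[v] * (1.0 + delta[v])
            if v != s:
                betweenness[v] += delta[v]
    # Each unordered pair was counted once from each endpoint.
    betweenness = {v: b / 2.0 for v, b in betweenness.items()}
    return degree, closeness, betweenness

# Hypothetical three-node path a - b - c with edge weights 2.0 and 0.5.
d, c, b = centralities(["a", "b", "c"], [("a", "b", 2.0), ("b", "c", 0.5)])
```

In the toy path, the middle node has the largest weighted degree, the highest closeness, and the only nonzero betweenness, since the single geodesic between the two end nodes passes through it.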

The three measures used are based on the dual-lobed structure of the skill network previously found [23, 24]. The skill network comprises two distinct communities, which have been termed “sensory-physical” and “sociocognitive” [23]. It is anticipated that degree centrality will capture the importance of each skill within its community, while betweenness centrality will capture the importance of a skill in connecting the two communities. It is anticipated that closeness centrality will capture aspects of both the importance within a community and in connecting the two communities.

Next, we use each of the three network measures, retermed *z* for conciseness, and weight occupation skills by the skill centrality measure, $z_i$. We refer to the resulting measure as occupational skill centrality, *O*. Formally, occupational skill centrality is defined as

$$O_o = \frac{1}{n_o} \sum_{i} l_{i,o}\, z_i,$$

where $n_o$ is the number of skills within each occupation. Occupational skill centrality is simply the average of the levels at which an occupation is expected to perform the given skills, weighted by each skill’s centrality. Restating our hypothesis, we anticipate that occupations whose skills are more central in the network, weighted by the level at which each skill must be performed for the occupation, will command a higher wage. Finally, we calculate the MSA skill centrality, $M_m$:

$$M_m = \frac{\sum_{o} w_{o,m}\, O_o}{\sum_{o} w_{o,m}}.$$

The MSA skill centrality is the sum of MSA employment weighted by occupational skill centrality, divided by total MSA employment. That is, MSA skill centrality is the average occupational skill centrality weighted by occupational employment in the MSA. The regional skill centrality thus aims to measure the aggregate centrality of a region’s skills. As we hypothesized that skill centrality will provide wage benefits to occupations, it is likely that these benefits will aggregate up to the MSA level. Therefore, we hypothesize that MSAs with higher MSA skill centrality are likely to have a higher GDP per capita.
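The two aggregation steps described above can be sketched together. The sketch assumes that every occupation has a level for every skill, so that the number of skills per occupation equals the number of columns. The function names and all numbers are hypothetical.

```python
import numpy as np

def occupational_skill_centrality(levels, centrality):
    """O[o]: mean over skills of levels[o, i] * centrality[i]."""
    levels = np.asarray(levels, dtype=float)            # occupations x skills
    centrality = np.asarray(centrality, dtype=float)    # one value per skill
    return (levels * centrality).mean(axis=1)

def msa_skill_centrality(workers, occ_centrality):
    """Employment-weighted mean of occupational skill centrality for each MSA."""
    workers = np.asarray(workers, dtype=float)          # occupations x MSAs
    occ = np.asarray(occ_centrality, dtype=float)
    return (workers * occ[:, None]).sum(axis=0) / workers.sum(axis=0)

# Hypothetical inputs: 2 occupations, 2 skills, 2 MSAs.
levels = [[4.0, 1.0],
          [2.0, 5.0]]
centrality = [0.5, 2.0]
workers = [[100.0, 10.0],
           [100.0, 30.0]]

occ = occupational_skill_centrality(levels, centrality)
msa = msa_skill_centrality(workers, occ)
```

The second MSA employs relatively more workers in the occupation with higher skill centrality, so its MSA skill centrality is higher.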

#### 3. Results and Discussion

The positive skill network is displayed in Figure 1. The graph is characterized by the dual lobe structure previously uncovered and termed “sensory-physical” and “sociocognitive” [23]. Skills in the sensory-physical lobe include “building and construction,” “finger dexterity,” “peripheral vision,” and “stamina.” Skills in the sociocognitive lobe include “active learning,” “complex problem solving,” “fluency of ideas,” and “social perceptiveness.” Of the 161 skills, 102 fall in the sociocognitive lobe and 59 in the sensory-physical lobe.

Figure 1: The skill network is created using the colocation pattern of skills in regions, with each node in the network representing a skill. Skills that are closer together colocate in metro areas more frequently than anticipated at random and have a higher interdependence value. Negative interdependence values are dropped. The resulting positive skills network displays a dual lobe structure. The graph displays two communities, blue and yellow, detected using the Louvain community detection algorithm, which optimizes modularity. Modularity is high when there are few connections between modules but many within modules. The two communities have previously been termed “sensory-physical” (blue) and “sociocognitive” (yellow).
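To make the modularity criterion concrete, the sketch below computes the standard Newman modularity of a given partition of a weighted undirected graph. It does not implement the Louvain algorithm itself, only the quantity that algorithm seeks to maximize. The function name and the toy graph of two triangles joined by one edge are hypothetical.

```python
import numpy as np

def modularity(adjacency, communities):
    """Newman modularity Q of a partition of a weighted undirected graph."""
    a = np.asarray(adjacency, dtype=float)
    labels = np.asarray(communities)
    strength = a.sum(axis=1)                     # weighted degree of each node
    two_m = a.sum()                              # twice the total edge weight
    same = labels[:, None] == labels[None, :]    # True where nodes share a community
    expected = np.outer(strength, strength) / two_m
    return float(((a - expected) * same).sum() / two_m)

# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
a = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    a[u, v] = a[v, u] = 1.0

q_two_lobes = modularity(a, [0, 0, 0, 1, 1, 1])
q_one_lobe = modularity(a, [0, 0, 0, 0, 0, 0])
```

Splitting the toy graph along its single bridging edge gives a clearly positive modularity, whereas placing all nodes in one community gives zero, mirroring the idea of two dense lobes with few links between them.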

##### 3.1. Descriptive Stats

The distributions of the three centrality measures examined for the skills interdependence network are reported in Figure 2. Weighted degree centrality and betweenness centrality are both highly skewed. Closeness centrality, in comparison, is closer to a uniform distribution, though it is subtly bimodal. Recalling that betweenness centrality is the share of shortest paths between all nodes that go through a particular node, it is not surprising that betweenness centrality is the most highly skewed centrality measure. There are only a few nodes that link the two lobes of the overall network. The three skills with the highest betweenness centrality, “installation,” “physics,” and “coordination,” are all located between the two lobes. In comparison, the two skills with the highest closeness centrality, “resolving conflicts and negotiating with others” and “time management,” are both located well within a lobe. While not explored here, it is likely that the bimodal distribution shown for closeness centrality in Figure 2 is the result of the dual lobe structure seen in the network graph.

Figure 2: Skill centrality is derived from the skill network (Figure 1), which relates skills based on how frequently they co-occur within MSAs as compared with what would be anticipated at random. The weighted degree centrality distribution and betweenness centrality distribution of skills in the network are highly skewed, with a few skills having high centrality while most do not. In contrast, the closeness centrality distribution of skills in the network is bimodal, likely reflecting the dual lobed nature of the skill network.

The skill centralities reported in Figure 1 are then mapped to occupations based on the level of skill required for each occupation to create the occupational skill centrality. Although the underlying skill centrality measures for weighted degree and betweenness are highly skewed, all three occupational skill centrality measures are closer to normally distributed (Figure 3). While an individual skill (node) may have a very high centrality measure, it accounts for only one of the numerous skills that an occupation may require. Thus, an occupation's skill centrality reflects both the varying centrality of its skills and the level at which a worker in that occupation is expected to perform each skill.

Figure 3: An occupation’s skill centrality is the average of the occupation’s skills, weighted by the level at which that occupation is required to perform the skill. The occupational skill centrality distributions are closer to normal than the underlying skills, resulting from the aggregation of the occupation’s skills.

Finally, the occupational skill centralities are aggregated to MSAs using (6) to create the MSA skill centrality. The MSA skill centrality distributions are even more normally distributed than the occupational skill centralities (Figure 4). The more normally distributed MSA skill centrality is the result of the second round of aggregation. Despite this, there remains variation among the regions analyzed here for all three centrality measures.

Figure 4: An MSA’s skill centrality is the average of occupational skill centrality weighted by occupational employment in the MSA. The MSA skill centrality distributions are even closer to normally distributed than those of the underlying skills or occupations, resulting from the second round of aggregation.

##### 3.2. Occupation Wages

To test the importance of occupational skill centrality, we compared the standardized occupational skill centrality with the national annual mean wage for each occupation. The relationship between occupational skill centrality and the national annual mean salary is positive and significant for all three measures (Figure 5). Among the three centrality measures, degree centrality has the greatest explanatory power (adj. *R*^{2} = 0.49), while betweenness has the lowest explanatory power (adj. *R*^{2} = 0.25).
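A comparison of this kind can be sketched as a simple ordinary least squares fit that reports adjusted *R*^{2}. The sketch assumes a single standardized predictor and a log-transformed wage as the response. The function name `fit_log_wage` is hypothetical, and the data are synthetic values constructed to lie exactly on a line. They are not results from this study.

```python
import numpy as np

def fit_log_wage(centrality, wage):
    """OLS of log(wage) on standardized centrality.

    Returns (slope, intercept, adjusted R^2).
    """
    x = np.asarray(centrality, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)             # standardize the predictor
    y = np.log(np.asarray(wage, dtype=float))
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    n, p = len(y), 1                               # observations, predictors
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coef[1], coef[0], adj_r2

# Synthetic data lying exactly on a log-linear relationship.
centrality = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
wage = np.exp(10.0 + 0.2 * centrality)
slope, intercept, adj_r2 = fit_log_wage(centrality, wage)
```

Because the predictor is standardized, the slope is the change in log wage associated with a one standard deviation increase in skill centrality.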

These results suggest that occupations with skills that are connected with many other skills command higher salaries than occupations with skills that are important to the total structure of the network. A plausible reason for this is that people with higher degree skills can use their skills in a variety of different settings and more easily find the highest paying use of the skill. In contrast, workers in occupations with a high betweenness centrality may not recognize the unique position their skills hold in connecting the overall network.

Figure 5: Occupations with greater skill centrality are correlated with higher annual salaries nationally. Occupational skill centrality is standardized, and national annual mean salaries are shown on a log axis. Weighted degree centrality has the highest adj. *R*^{2}, while betweenness centrality has the lowest adj. *R*^{2}. This suggests that skills with many connections command a higher wage than skills that occupy connecting roles in the network.

##### 3.3. MSA Productivity

We find that regional centrality measures are positively and significantly correlated with regional productivity, measured as MSA GDP per capita (Figure 6). Reflecting findings for occupations, degree centrality accounts for the greatest share of variance of GDP per capita (adj. *R*^{2} = 0.23), while betweenness accounts for the least variance (adj. *R*^{2} = 0.14). In results not shown, all three coefficients remain significant when controlling for population. Puerto Rico and NECTAs are dropped for this analysis.

If an MSA’s labor market is composed of occupations having a high skill degree centrality, then the aggregate local labor market is thought to be capable of identifying the best uses of workers’ skills. In the case of betweenness centrality, by contrast, the employees in an MSA with high betweenness centrality may not recognize their unique position in the network and so may not command higher wages.

Figure 6: MSAs with higher MSA skill centrality are correlated with higher output per capita. MSA skill centrality is standardized, and output per capita is the logarithm of GDP per capita. As with occupation skill centrality, the adj. *R*^{2} is highest for weighted degree centrality and lowest for betweenness centrality, suggesting that the number of links an MSA’s component skills have is more important than the location of those skills in the network.

#### 4. Conclusions, Future Directions, and Policy Suggestions

This paper examines occupations and regional economies in terms of the skills of which they are composed. Skill interdependence is defined by the co-occurrence of skill pairs at the MSA level. Skill pairs that co-occur more frequently than expected by chance are interpreted as having strong interdependencies. The skills and their interdependencies are then conceptualized as nodes and edges, comprising a network. As each skill is located in this overall network, the nodes have a unique centrality measure. These skills are embodied in worker occupations and in the cities in which workers live.

We find that the centrality of skills influences both the occupations that require those skills and the regions where those occupations exist. In particular, we find that greater skill centrality is associated with higher occupational salaries and with higher regional productivity.

The networks used in this study are created after dropping negative interdependence values. Future work could explicitly explore the negative values and their impact on the networks as well as on occupations and regions.

As a tentative policy suggestion, policymakers may be able to consider the occupational skill centrality when deciding between training programs. All else equal, if two occupations were being considered for funding at a local training center or community college, our results suggest that the occupation with higher skill centrality will have a larger impact on wages and regional productivity.

#### Data Availability

All data used in this study come from publicly available datasets as described in the text.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Supplementary Materials

The online supplemental material contains a list of O∗NET elements, or occupation attributes, for the dataset used in this study. O∗NET provides information on several hundred elements.