#### Abstract

Measurements of multiparticle correlations in the collisions of small systems such as p+p, p/d/^{3}He+A show striking similarity to the observations in heavy-ion collisions. A number of observables measured in the high-multiplicity events of these systems resemble features that are attributed to collectivity driven by hydrodynamics. However, alternative explanations based on initial-state dynamics are able to describe many characteristic features of these measurements. In this brief review, we highlight some of the recent developments and outstanding issues in this direction.

#### 1. Introduction

Relativistic collisions of hadrons and nuclei at the modern colliders provide unique testing ground for QCD at high energies. Over a decade of experimental measurements for a wide range of energies and collision systems have been dedicated to studying the properties of the matter formed in such collisions. A number of measurements have provided strong indications that the QCD matter formed in the collisions of two nuclei (A+A) behaves like a strongly interacting fluid that exhibits collectivity [1–4]. Consistent measurements of strong radial and anisotropic flow, jet-quenching, and so forth have convinced us to establish such properties of the medium. Small system collisions have initially been thought of providing benchmarking measurements for the observations in heavy-ion collisions.

Very recently, several striking observations have been made in p+p, p+A, d+A, and ^{3}He+A collisions [5–13]. Observations in high-multiplicity events for such small collision systems seem to resemble features that are common to A+A collisions and very often attributed to decisive signatures of collectivity due to hydrodynamic evolution. Such observations include appearance of azimuthal correlations that extend in long range rapidity known as ridge, which is also quantified in terms of Fourier harmonic coefficient of , strong multiplicity dependence of mean transverse momentum, HBT radii, and many others.

An outstanding question remains whether such systematics have a collective origin that can be attributed to hydrodynamic evolution like in A+A collisions or a natural consequence due to initial-state dynamics that appear in the final-state observables or a combination of both. Several alternative approaches based on final-state [14–23] and initial-state [24–40] dynamics have provided competing arguments in this context. Interestingly, such a debate on the relative importance of initial-state [41] and final-state effects [42] also took place in the early 90s while addressing the experimental data from p+ collisions at the Tevatron.

In this review, we provide a brief overview on the experimental findings and outline a general theory perspective on the collective phenomena observed in small systems. Subsequently, we focus specifically on theoretical explanations based on the initial-state dynamics. We discuss different theoretical approaches to the calculations and contrast the results with experimental findings. We conclude with a brief summary of the present status and perspectives for future studies.

##### 1.1. Experimental Overview

The current debate of collectivity in small systems was triggered by the discovery of ridge-like correlations that extend over a long range in rapidity in high-multiplicity p+p collisions by the CMS collaboration [5]. Such long range structure of azimuthal correlations was previously seen in heavy-ion collisions at RHIC [43–45] and LHC [46] and generally attributed to the nearly boost invariance structure of the azimuthal correlations driven by hydrodynamic flow. At the same time, causality arguments suggest [47] that such correlations must develop at the very early stages of the collisions, indicating the strong influence of initial-state dynamics on such observable. While the observation of similar ridge-like structures made in high-multiplicity p+Pb collisions at the LHC was expected [6–8], a surprise was that the signal strength at the same multiplicity was stronger in p+Pb compared to p+p. The latest inclusion to these measurements is the highest energy p+p collisions so far at 13 TeV [12, 13]. Including the 13 TeV and 7 TeV data, both ATLAS and CMS collaborations have demonstrated that the strength of ridge correlations (characterized by the near-side yield) at a given multiplicity is independent of collision energy. However, a more pronounced energy dependence appears to be present in higher-order cumulants such as which unlike the case at lower energies appears to exhibit a clear sign change as a function of multiplicity in 13 TeV p+p collisions [48]. Meanwhile, RHIC, being a versatile machine, has collided ^{3}He+Au [11] along with d+Au [9, 10] and p+Au [49] to perform similar measurements of azimuthal correlations. Such measurements indicate the presence of significant and in p+Au, d+Au, and ^{3}He+Au, the systematics of which is very similar to what is commonly seen in A+A collisions. Analyses common to A+A have been repeated in both p+Pb and p+p at LHC by triggering on high-multiplicity events. Measurements of higher-order harmonics of azimuthal anisotropy , mass dependence of , higher-order cumulants of , and many others are now available (for a comprehensive review, we refer the reader to [50, 51]).

By now, a number of intriguing results observed in p+p and p/d/^{3}He+A have accumulated, including a strong multiplicity dependence of average transverse momentum that is attributed to radial flow as well as the observation of sizable Fourier harmonic coefficients up to and its higher-order moments of the azimuthal correlation generally attributed to anisotropic flow. Most importantly, several characteristics, such as the mass dependence of both and , have been found to be similar to what is seen in A+A collisions.

However, it is worth mentioning that some striking contrasts also exist. Unlike in A+A collisions, where the observation of jet-quenching has been one of the pillars of the discovery of a strongly interacting Quark-Gluon plasma (QGP) [52, 53], so far no evidence of (mini) jet-quenching has been found in small systems [54–57]. Even though the standard jet-quenching analysis in small systems is complicated due to trigger bias effects, the absence of such phenomena may provide important insights with regard to the theoretical interpretation of the observed phenomena.

##### 1.2. General Theoretical Perspectives

It is useful to first address the question about the origin of long range azimuthal correlations (shown in Figure 1) from a more general point of view and formulate our theoretical expectations based on previous observations in small and large systems. While causality arguments imply that any long range rapidity correlation must originate from the very early stages of the collision [47], this leaves open the question how the observed momentum space correlations are created dynamically during the space-time evolution. Specifically, one can, at least from a theoretical point of view, distinguish two different mechanisms whereby momentum space correlations of hadrons produced in the final-state reflect(i)intrinsic momentum space correlations of the partons produced in initial (semi-) hard scatterings(ii)position space correlations between initial-state partons, for example, the initial-state geometry, which are transformed into momentum space correlations due to final-state interactions.

While, in any realistic scenario, both kinds of correlations (i) and (ii) contribute to the long range azimuthal correlations, their relative strength depends on the magnitude of final-state effects. In low-multiplicity p+p collisions, for example, the dominant source of long range azimuthal correlations is due to the production of back-to-back (mini-) jets. Since in this case the density of produced partons is low, the typical (semi-) hard partons produced in the initial scattering escape the interaction region without final-state effects significantly affecting their back-to-back correlation. Considering on the other hand soft particle production amidst large parton densities in nucleus-nucleus collisions, it is well established that the azimuthal anisotropy of, for example, 1 GeV particles is dominated by the final-state response to the initial-state geometry. In this case, the mean-free path of a typical (semi-) hard parton is small compared to the system size, such that the initial-state momentum correlations of ~GeV partons are destroyed during the equilibration process. Therefore, the subsequent dynamics of the equilibrated QGP can be accurately described by relativistic hydrodynamics.

Even though it is sometimes possible to choose the kinematics such that one mechanism dominates over the other, there are various examples in between where both initial-state and final-state effects are important. One prominent example includes the behavior of jets in heavy-ion collisions. While highly energetic jets can escape the interaction region without equilibrating, they can loose a significant part of their energy through interactions with the softer medium. Even though the dominant correlation of the leading high- particles is still due to the initial back-to-back correlation, the path length dependence of the energy loss in the medium also leads to an additional correlation with the initial-state geometry. Such correlations are reflected, for example, by the high-momentum measuring correlations between soft and hard particles.

Clearly, the aforementioned examples illustrate that it is important to consider both initial-state momentum space correlations and the response to the initial-state geometry due to final-state effects in order to describe azimuthal correlations in small systems over a wide kinematic range. Our qualitative expectation is illustrated in Figure 2, where the azimuthal correlation strength due to initial-state and final-state effects is shown versus the event multiplicity, for example, in p+p collisions for a fixed transverse momentum range, for example, 1–3 GeV. Based on our discussion, we expect that in low-multiplicity or minbias events the azimuthal correlations between 1–3 GeV particles are predominantly due to back-to-back minijets (peaked at ). With increasing event multiplicity, the contribution from multiparton processes, such as the “Glasma graphs” (Section 2.3.3), becomes increasingly important resulting in azimuthal correlations that have a symmetric structure in relative azimuthal angle around . When increasing the multiplicities even further, final-state interactions in this transverse momentum region can no longer be neglected at some point and lead to a depletion of initial-state correlations. Even though minijets do not fully equilibrate yet, the system starts to show a response to the initial-state geometry, which in this low opacity region is presumably dominated by the path length dependence of the parton energy loss, also referred to as parton escape mechanism [58]. Ultimately, in the limit of very high multiplicities, minijets are fully quenched, resulting in the formation of a thermalized medium and the complete loss of initial-state momentum space correlations. In this high opacity regime, azimuthal correlations are dominated by the response to initial geometry described by a hydrodynamic expansion of a thermalized Quark-Gluon plasma.

One can attempt to further estimate the multiplicities corresponding to the transitions from the initial-state to the final-state dominated regime, exploiting recent theoretical progress in the understanding of the equilibration process [65]. Since the equilibration time at weak coupling corresponds to the time scale when a semihard parton ~ looses all its energy to form a soft thermal bath, one naturally expects the crossover from the initial-state to final-state dominated regime to occur when the associated equilibration time becomes comparable to the system size . Conversely, as long as typical semihard partons escape without encountering significant final-state interactions, whereas for semihard partons are fully quenched, equilibrium is reached early on and the dynamics is dominated by the subsequent hydrodynamic expansion. Based on the estimate of the equilibration time for at realistic coupling [66, 67] and the multiplicity with [68], we obtain thatcorresponding to a crossover at around , which in fact is much larger than the minbias multiplicities reached in p+p or p+Pb collisions (the typical values of in minbias p+p and p+Pb collisions at LHC are about ~6 and ~17 [69] and [70], resp.). We caution however that the estimate in (1) is inferred from leading order weak-coupling calculations and should only serve as a ballpark figure.

Beyond simple analytic estimates, probably a promising alternative approach is to directly attempt an extraction of the boundaries between the different regimes through detailed comparisons of theory and experiment. While a first-principle theoretical description is complicated throughout most of the multiplicity regimes shown in Figure 2, significant theoretical progress has been made in understanding the features of initial-state correlations in the regime where final-state effects can be neglected. In the following, we will review the theoretical computation of initial-state correlations in the color glass condensate (CGC) effective field theory of high-energy QCD and critically assess to what extent these calculations are compatible with the experimental observations.

#### 2. Multiparticle Production in the CGC Framework

##### 2.1. High-Multiplicity Events

Experimental observations suggest that long range ridge-like correlations in small colliding systems appear in high-multiplicity events. Before we turn to a more detailed discussion of possible mechanisms to produce such correlations, a first necessary step is to understand the origin of high-multiplicity events that populate the long tail of experimental multiplicity distributions. Considering the most elementary case of p+p collisions, high-multiplicity events are a consequence of three major sources of fluctuations:(1)Geometry of collisions(2)Intrinsic saturation scale of the proton(3)Distribution of color charge density inside the protons.While one naturally expects a strong impact parameter dependence of the multiplicity in p+p collisions [71], the importance of additional sources of initial-state fluctuations have only been realized recently. Significant progress in including all of the above into a consistent phenomenological description has been made within the IP-Glasma model which is based on the framework of CGC [72, 73].

Nonperturbative large effects, which are not captured in the conventional CGC framework, are expected to give rise to fluctuations of the intrinsic saturation scale of the proton [74–79] as illustrated in Figure 3(b). In [61], intrinsic fluctuations of the proton saturation scale were introduced in the IP-Glasma model [61] according to a distribution: with the variance adjusted previously to the inclusive charged particle multiplicity and rapidity distributions in p+p collisions over a wide range of energies 0.2–7 TeV [61]. An interesting consequence of such fluctuations is that, even in symmetric collision systems such as p+p, event-by-event fluctuations of the saturation scale of each proton lead to an asymmetry of the rapidity distribution of the produced particles on an event-by-event basis [80]. Consequently, high-multiplicity p+p collisions in particular are always asymmetric and in a sense expected to look more like p+A collisions.

**(a)**

**(b)**

**(c)**

The saturation scale and the geometric profile of the proton, which are the two most important ingredients in the IP-Glasma model, are obtained from the IP-Sat parameterization of the HERA DIS data [81, 82]. While in the original IP-Sat model the proton shape was assumed to be round, recent modification to IP-Sat has been done by assuming the gluon density to be distributed around three valence quarks inside the proton (see Figure 3(a)) [60, 83] and several other models have started to include similar kinds of fluctuations [84, 85]. Interestingly, such subnucleon scale fluctuations can be constrained by incoherent diffractive vector meson production and improved agreement with existing HERA data can be achieved [60]. It has also been pointed out that geometric fluctuations of the proton can provide a dynamical explanation for the “hollowness effect” observed in elastic p+p scattering at LHC energies [86] and it would further be interesting to explore to what extent the modeling of the proton geometry can be aided by first-principle lattice QCD calculations.

With the initial-state including different sources of fluctuations constrained by the HERA data, the -particle production probability can be computed within the color glass condensate framework. For a given configuration of initial color charge, the -particle distribution is a negative binomial distribution (NBD) with mean and width related to the saturation scale [87]. Due to fluctuation of impact parameter, a convolution of many such NBDs gives rise to the final probability distribution of multiplicity. However, it has been demonstrated that such distribution is narrower compared to the data. Only after including the intrinsic fluctuations of the proton saturation scale, one can describe the tails of the experimental multiplicity distributions (Figure 3(c)). Since multiplicity is dominated by low momentum ( 1 GeV) gluons, it is very challenging to implement a scheme of fragmentation for the production of soft hadrons. Therefore, the number of produced charged particles is generally taken to be proportional to the number of gluons. The results shown in Figure 3 indicate that the high-multiplicity events that populate the tail of distributions are generated due to rare high color charge density configurations of the wave functions of the colliding systems. In the next section, we argue that the same underlying dynamics that leads to the origin of high-multiplicity events also drives the systematics of multiparticle correlations in small collision systems.

##### 2.2. Qualitative Discussion of Initial-State Correlations

Multiparticle production in Quantum Chromo Dynamics (QCD) naturally leads to correlations between particles produced in high-energy collisions. A complete theoretical understanding of these effects though is extremely challenging. Nevertheless, significant progress has been achieved in recent years based on the CGC effective field theory (EFT) of high-energy QCD, which provides the basis for phenomenological applications at RHIC and LHC energies.

Let us focus our discussion on the origin and systematics of the two-particle correlations seen at LHC. By far, the most well established source of long range two-particle azimuthal correlations is due to the production of back-to-back dijets. Such processes (also referred to as “Mueller-Navelet” jets [88]) are depicted in the right panel of Figure 4 and can be computed within standard perturbative QCD. Dijet production is kinematically constrained to produce only away-side (peaked at ) collimations and dominates in low-multiplicity or minbias events. However, in high-multiplicity events, one is probing rare configurations of the proton where in addition to the production of dijets from a single hard scattering, multiparton processes become increasingly important. A first calculation of these effects in the CGC framework was based on evaluating the associated Feynman diagrams referred to as “Glasma graphs,” depicted in the left panel of Figure 4. Such graphs give rise to nonfactorizable two-particle correlations that have a symmetric structure in relative azimuthal angle around (see Figure 4). When decomposed in terms of the Fourier coefficients of the particle distributions, they give rise to nonzero even harmonics . Beyond the lowest order processes depicted in the left panel of Figure 4, further contributions to the azimuthal collimations come from the multiple scattering of partons leading to both even and odd . Such processes can be included in a classical Yang-Mills description and will be discussed in more detail in a following section.

Since interference effects between Glasma graphs and jet graphs vanish to lowest order in the kinematic regime , the resulting two-particle correlations function as a direct sum of both contributions:The relative strength of the dijet production represented by the “jet graph” and the “Glasma graphs” determines the features of the observed dihadron correlations as shown in Figure 4. In high-multiplicity events, the Glasma graphs are enhanced by a relative factor of compared to the “jet graphs”; one therefore naturally expects to see a pronounced near-side collimation at that extends over a wide range of rapidity referred to as the “near-side ridge.” The fact that near-side collimation extends far in rapidity is a consequence of the nearly boost invariant nature of the Glasma gluon fields. While qualitatively these features are indeed present in the experimental data, of course it requires detailed theory calculations to establish the quality of agreement. In the remainder of this section, we will outline the essential steps in the computation of initial-state correlations in the CGC framework. A summary of comparisons with experimental results is presented in Section 3.

In the CGC framework, colliding protons and nuclei are effectively described as static sources of color charge on the light-cone that generate color currents: The color charge densities in each colliding hadron or nucleus fluctuate from event to event and their statistical properties are constrained by independent measurements. Computation of multiparticle production in the CGC framework is based on the calculation of the classical Yang-Mills fields created from such color currents by solving the Yang-Mills equations: Different theoretical descriptions within the CGC framework employ different levels of approximation which are discussed in more detail in the following.

##### 2.3. Perturbative Computation

In the perturbative framework, one tries to obtain an analytical solution of the gauge fields by performing order-by-order expansion of (5) in powers of the color sources . In the dilute-dense framework which assumes lowest order in and all orders in or in the dilute-dilute framework, one can derive analytical expressions for n-gluon production in the -factorized form [89]. The essential ingredients to such factorization relations are the correlator of the dilute-sources in or the Wilson lines corresponding to the dense source in momentum space. Such correlators are represented as unintegrated gluon distributions (UGDs) and can be expressed in terms of , the forward scattering amplitude of a quark-antiquark dipole of transverse size on a proton/nuclear target, through the expression

The -dependence of is determined by the rapidity () evolution of the dipole scattering amplitude implemented in the Balitsky-Kovchegov (BK) renormalization equation [90, 91] which is a simplified form of the JIMWLK renormalization equations [90, 92–95]. The leading order expression of the BK equation is given by where and is the BFKL kernel. The implementation of the kernel often used for phenomenology includes a running coupling next-to-leading-log (NLL) correction to BK and is referred to as the rcBK equation [96] given byEquation (8) requires an initial condition for , one choice of which is the McLerran-Venugopalan (MV) model [97, 98] with a finite anomalous dimension given bywhere is a nonperturbative scale. This MV-like parameterization is constrained by global fits to the DIS data [99] although it must be noted that MV along with BK evolution does not include spatial geometric structure of the proton. The scattering amplitude is known to have a strong dependence on the impact parameter [81] which is incorporated in other parameterizations such as IP-Sat [82, 100] or b-CGC [81] models of DIS. More recently, significant progress consistently including the full NLL corrections to BK evolution into DIS fits [101–103] has been made. However, so far, this has not been included in phenomenological studies of p+p and p+A collisions.

With the unintegrated gluon distribution obtained from (6), one can estimate the production of -gluons using the factorization approach. In the following section, we describe the approach for single-inclusive, double-inclusive, and jet production essential for the phenomenology in p+p and p+A collisions.

###### 2.3.1. Single-Inclusive Gluon Distribution

The single-inclusive gluon production corresponds to the simplest process that describes the emission of a single gluon of momentum which in the perturbative framework can be written in the -factorized form aswhere and is the transverse overlap area. It must be noted that this -factorized form of single gluon production has a logarithmic infrared divergence which is generally regulated by putting a lower cut. It must be noted that (10) assumes that the dependence on transverse geometry or the impact parameter of collisions has already been integrated out and absorbed into ; a more general expression in such a context can be found in [89].

###### 2.3.2. Dijets

In the perturbative framework, estimation of the two-particle correlations in p+p and p+A collisions is based on a direct computation of dijet and the Glasma graphs in the -factorization approximation [27–29, 31, 40]. Such approximations are valid for momenta above the saturation scale and do not include multiple-scattering effects.

The dijet contribution in this framework is estimated [88, 104, 105] to bewhere is BFKL Green’s function that generates gluon emissions between the gluons that fragment into triggered hadrons. The form of is given by [31].Here, is the BFKL eigenvalue with being the logarithmic derivative of the Gamma function with the effective coupling factor and . For a pair of hadrons with a rapidity separation of , does the necessary resummation of the rapidity ordered multigluon emissions. The effect of such gluon emission leads to angular decorrelation that affects the observed dihadron correlation [88]. The diagrams corresponding to such BFKL emission and its impact on broadening the away-side structure of the azimuthal dihadron correlation are shown in Figure 5. In the limit, one can obtain the well known form of the dijet cross section expression in the Multi-Regge kinematics (MRK) [105, 106]: As we discuss in the following section, the quantitative estimation of the dijet cross section is essential for the description of the dihadron correlation observed in high-multiplicity events of p+p and p+Pb collisions.

###### 2.3.3. Glasma Graph

In the context of two-particle correlations, there are a total of eight topologies of the Glasma graph (full expression can be found in [31]); the contribution for one such diagram to the two-particle correlations can be written asBeyond the straightforward, perturbative result, the nonperturbative factor has been introduced to account for the contributions to multiparticle production below the scale . Constraints from the independent analysis of -particle multiplicity distributions [73, 107–109] suggest that be in the range and phenomenological studies typically employ the value of obtained from [108, 109].

The collimation in the Glasma graph framework comes from the fact that each of the four UGDs in (14) has a bell-shaped structure with a maximum corresponding to the saturation momentum. Such nature of the UGDs kinematically constrains two gluons to be produced in similar (or back-to-back) directions giving rise to the double ridge structure in two-particle correlations [27, 110].

##### 2.4. Dilute-Dense Models

Besides the perturbative diagrammatic approach taken in the Glasma graph calculation, several studies have been launched to investigate the origin of structure of the long range azimuthal correlations in the limit where a dilute projectile of individual partons scatter off the color fields of a dense projectile. Neglecting correlations of the incoming partons in the projectile, the double-inclusive distribution of the scattered partons takes the form [37, 39]where denotes the Wigner function of quarks/gluons inside the projectile and is the usual dipole-correlator of light-like Wilson lines in the fundamental/adjoint representation. While the transverse momenta and of the two incoming quarks are uncorrelated, it is evident from (15) that the correlations in the momentum transfers and give rise to azimuthal correlations of the scattered partons and . We note that, in contrast to the perturbative calculation, present calculations in the dilute-dense description usually do not take into account rapidity evolution between the produced particles and are therefore limited to the kinematic range where where evolution effects can be neglected. While recent progress has been made in formulating evolution equations for multiparticle production in dilute-dense systems [111], these have not been employed for phenomenological studies so far.

By evaluating the dipole operator in short-distance expansion where denotes the light-cone electric field, one can establish an intuitive picture of the origin of the long range near-side correlations [24]. When a projectile parton scatters off the color field of the nucleus, it receives a transverse momentum kick in the direction of the color-electric field of the target. Color fields fluctuate from event to event and are locally organized in domains of size ~1 as illustrated in Figure 6. When two (or more) quarks scatter off the same domain, they will receive a similar kick whenever they are in the same color state. Naturally, this leads to a correlation which is suppressed by (in the limit of large ) and by the number of domains , where denotes the transverse area probed by the projectile.

Several studies have computed the azimuthal harmonics of the two-particle correlation function in (15), either based directly on numerical evaluations including JIMWLK evolution [37, 39] or within (semi)analytic models [33, 112, 113]. Generally, the double-inclusive spectrum in (15) features sizable azimuthal correlations which are on the order of and most pronounced when both parton momenta and are on the order of the saturation scale. While the double-inclusive spectrum for the two incoming quarks features both even () and odd () harmonics, it turns out that the odd harmonics for gluons vanish identically due to an exact symmetry of the the correlation function under . Even though the dilute-dense framework described above certainly does not provide the most realistic initial-state model for midrapidity particle production in high-multiplicity events, it turns out that the absence of odd harmonics for gluons poses a more general problem within the CGC framework. However, as discussed in Section 3, recent simulations including the early-time classical Yang-Mills dynamics have been able to resolve this puzzle to some extent.

##### 2.5. Classical Yang-Mills

Beyond the approaches outlined above, there have also been new theoretical developments in the study of initial-state correlations in event-by-event simulations in classical Yang-Mills theory [38, 72, 73]. In this approach, (5) is solved numerically for the individual colliding hadrons or nuclei in Lorentz gauge , whereand then transformed into the light-cone gauge , where one finds [114–119]The infrared regulator in (17) is of order and crudely incorporates color confinement at the nucleon level by damping the Coulomb tail. The gauge field in the forward light-cone after the collision at time is given by the solution of the CYM equations in Fock–Schwinger gauge in terms of the gauge fields of the colliding nuclei [120, 121]: Such gluon fields produced after the collision are evolved in time according to Classical Yang-Mills equations up to time to estimate the gluon spectrum by imposing Coulomb gauge and extracting the equal time correlation function [122, 123]:where denotes the Bjorken metric and labels the two transverse polarizations. In Coulomb gauge, the mode functions take the formwhere denote the Hankel functions of the second-type and order (see [120] for details).

The calculations in the framework of classical Yang-Mills naturally include the Glasma graphs and extend the reach of perturbative calculations towards lower by consistently including multiple-scattering effects as well as coherent rescattering in the final-state. Since event-by-event simulations in the classical Yang-Mills theory also allow for an improved treatment of the impact parameter dependence, they can be used in the future to systematically study initial-state effects across different collision geometries, for example, in p+A, d+A, and ^{3}He+A collisions at RHIC [10, 11, 122].

##### 2.6. Hadronization

So far, we have outlined the computation of initial-state correlations at the parton level. The mechanism of hadronization converts the partonic correlations driven by initial-state dynamics into correlated production of final-state particles. Implementation of a realistic scheme of hadronization is therefore essential for the phenomenology of small systems collisions. A first-principle QCD based approach to such a problem is very challenging. One therefore resorts to several available approximation schemes for fragmentations of partons. The most commonly used approach is the standard parton-hadron independent hadronization scheme [123] in which, for example, the single-inclusive hadron distributions are obtained by convoluting the gluon distributions with fragmentation functions aswhere denotes the probability that a gluon fragments into a hadron carrying fraction of its momentum at a scale . The lower limit of the integral is determined from the kinematic requirement that the momentum fraction of the gluons . Commonly used forms for such as KKP or DSS fragmentation functions are obtained from fits to the inclusive hadron production data in e^{+}+e^{-} and p+p collisions [124, 125]. In case of double-inclusive production relevant for the study of two-particle correlations, one assumes an ansatz of the form [27]A limitation of this standard approach of hadronization is that the applicability of the fragmentations functions is questionable at low virtuality . Therefore, such scheme of hadronization can not be used for bulk particle production which is dominated by soft processes with typical virtuality scale 1 GeV. In this regime, an alternative approach such as the Local Parton-Hadron Duality (LHPD) [126] can be used to describe bulk particle production driven by initial sate dynamics [127].

The state-of-the-art scheme of hadronization to describe bulk particle production used in several event generators like PYTHIA [128, 129] is based on the Monte-Carlo implementation of the Lund string fragmentation function: where and denote the transverse mass and the light-cone momentum fraction of the fragmenting hadron. The default parameters and are constrained by global analysis of p+p data. The hadronization scheme in PYTHIA also includes decays of hadron resonances and final-state hadronic interactions. A quantitative study of the effect of hadronization on initial-state parton level correlations using different schemes of fragmentation has been done in [130]. The study indicates that the structure of the final azimuthal correlations is very much sensitive to the choice of fragmentation. For example, the resonance decay and hadronic interactions can distort the initial-state correlations and introduce artificial final-state correlations in the produced hadrons. Such mechanism will complicate the interpretation of experimental data. In such a context, it has been demonstrated that hadronization effects in PYTHIA combined with the scheme of color reconnection produce effects that can mimic collective flow like pattern leading to a strong mass ordering of [131] in p+p collisions. In addition, the hadronic transport models that implement the Lund string fragmentation of PYTHIA have also been shown to qualitatively reproduce the systematics of measured in p+Pb collisions [132]. Clearly, more phenomenology in this direction will further improve our understanding. Meanwhile, it is essential to implement a state-of-the-art framework of hadronization to test different features of multiparticle correlations in the CGC at the level of hadrons; work in this direction is in progress [133].

#### 3. Comparison with the Latest Experimental Observations

##### 3.1. p+p Collisions

We begin with a comparison of model calculations with the latest data in p+p collisions at LHC energies [12], where the quantity of experimental interest is the rapidity and momentum integrated correlation function defined aswhich so far has only been computed in the perturbative CGC approach. The quantity of experimental interest is the near-side yield defined as the ZYAM subtracted integrated associated yield per trigger given byHere, corresponds to the minimum of the dihadron correlation function (ZYAM) and is the number of trigger particles. A quantitative comparison of the near-side yield in p+p collisions shown in Figure 7 yields a good agreement between initial-state calculations [40] and experimental data at both 7 and 13 TeV. The striking feature of the data model comparison shown in Figure 7 is that the near-side yield is approximately energy independent. Such energy independent scaling is naturally explained within the framework of CGC [40]. Origin of such scaling can be understood as follows. In the CGC framework, a single scale determines both the single and double-inclusive production in p+p collisions. The dependence on the center of mass energy in both multiplicity and the near-side yield enters only through ; that is, where . Fixing the multiplicity therefore naturally fixes the value of giving energy independent scaling of the ridge-like correlations. Such scaling provides strong indication of multiparticle production driven by a single semihard scale as captured in the CGC.

**(a)**

**(b)**

##### 3.2. p+A Collisions

A detailed quantitative estimation of the dihadron correlations in p+Pb collisions at the LHC using the perturbative dijet and Glasma graph framework has been performed in [31]. One of the interesting results from [31] shown in Figure 8(a) indicates that the CGC framework naturally explains the increase in the near-side yield in p+Pb collisions as compared to p+p collisions for events with the same multiplicity measured by the CMS collaboration. Results shown in Figure 8(b) indicate that such framework also very well reproduces the peripheral subtracted correlation function measured by the ALICE collaboration in p+Pb collisions. Aforementioned due to the limitations of the perturbative framework, such estimations can be performed only for trigger and associated momentum 1 GeV.

**(a)**

**(b)**

Recent simulations in the framework of classical Yang-Mills that go beyond such limitations of the perturbative regime have already opened the path for improved phenomenology. Even though at present these calculations do not yet include dijet graphs (note that also the interference contribution between Glasma graphs and jet graphs no longer vanishes beyond leading order) and hadronization effects and thus do not allow for a direct comparison with experimental data, simulations in this regard have led to new insight into the correlations at the parton level concerning in particular the dynamics of the correlations during the very early stages. The results from the recent classical Yang-Mills simulations performed in p+Pb collisions [38] are shown in Figure 9. While gluons are produced with a significant momentum space anisotropy at , initially the two-particle correlation function is symmetric around and features only even harmonics in accordance with the perturbative result [38]. However, including the effects of the classical Yang-Mills evolution up to leads to the buildup of a sizable on the parton level, while the initial-state remains intact [38]. These results indicate that at least at the gluon level the nonperturbative dynamics of the Glasma fields gives rise to large values of both and at time scales after the collision that are comparable to what is seen in the data. Interestingly, sizable and in this framework also extend to large transverse momentum probably beyond the applicability of hydrodynamics in p+Pb collisions [134].

**(a)**

**(b)**

While a systematic comparison of the initial-state calculations to experimental data in p+Pb collisions yields a similar level of quantitative agreement of two-particle correlations for momenta [29, 31], there has been enormous progress on the experimental side to identify additional signatures of collective motion which could be indicative of the onset of significant final-state effects. Even though some observables, such as mass ordering properties observed in correlations between identified hadrons [135], are not necessarily sensitive to the origin of the correlations, more promising directions including, for example, correlations between more than two particles have also been explored [136, 137]. One of the most striking observations in this regard is the sign change of the four-particle cumulant observed around by ALICE [137]. However, most of the recent measurements have focused on low observables where the perturbative calculations [27–29, 31, 40] outlined in Section 2.3 do not necessarily apply, hence complicating the comparison between theory and experiment.

Nevertheless, there have been first attempts to extend the theoretical framework to understand whether certain features of the low data such as the sign change of can also be explained from initial-state effects. Based on the dilute-dense approximation in (16) first attempts have been made to study initial-state correlations of more than two particles. Generalizing the dilute-dense formalism to -particle scattering, it was pointed out in [35, 39, 138] that the four-particle cumulant is sensitive to non-Gaussian correlations of the color-electric fields inside the target. While the lowest order perturbative contribution to was found to be positive, it was argued [35] that non-Gaussian correlations of the domains of color-electric fields may explain the experimentally observed sign change of as a function of multiplicity. Even though a dynamical explanation for the existence of sizable non-Gaussian correlations is still lacking, this topic is an active subject of further investigations. However, one should also caution that in contrast to the two-particle cumulant , where short-range correlations are suppressed by introducing a large rapidity gap, the four-particle cumulant also receives contributions from short-range correlations and interference diagrams. While a complete theoretical calculation of higher-order -particle correlations is desirable, further progress is needed to tackle this problem.

Similarly, preliminary results from event-by-event simulations in classical Yang-Mills theory a la [38] also suggest a negative sign of when all particles have low momenta, while at high the four-particle cumulant is always positive [139]. While the first results are interesting, further theoretical progress is needed to decide unambiguously whether the sign change in can be explained in terms of initial-state and early-time effects. It would also be interesting to extend the experimental measurements of higher cumulants towards higher momenta to achieve convergence on this issue.

#### 4. Summary

Experimental observations of collective correlations in small systems challenge our current understanding of the space-time evolution of high-energy collisions. While at low multiplicities one expects dominance of initial-state effects, it is also expected that at sufficiently high multiplicities initial-state correlations are destroyed due to final-state interactions and the hydrodynamic response to the initial-state geometry provides the dominant source of correlations. However, it is theoretically challenging to predict the transition from initial-state to final-state dominated dynamics pointing to the importance of developing a unified theoretical framework where both effects are consistently taken into account.

In this brief review, we specifically outlined recent developments in the approach based on initial-state dynamics. While calculations based purely on initial-state correlations are able to quantitatively describe various features of the experimental data in p+p and p+Pb collisions up to the highest multiplicity windows, a simultaneous description of the low and high observables over a wide range of multiplicity remains challenging within any single theoretical framework. Of course, several outstanding issues remain on the theory side and the development of a unified framework that combines (1) dynamics of multiple-soft interactions with (2) a first-principle calculation of jet production and (3) a state-of-the-art fragmentation scheme is essential for a wide range of phenomenological applications. So far, experimental efforts have focused on soft observables such as s; however, complimentary information on the dynamics can be obtained by considering higher-momentum probes. While in heavy-ion collisions the observation of strong jet-quenching phenomena provides an important indication for the formation of a strongly interacting Quark-Gluon plasma, no such features have been reported so far in small systems. Even though such an analysis is experimentally challenging, it would be important to establish to what extent minijets and actual jets are modified in high-multiplicity events to further constrain the relative importance of initial-state and final-state effects in small systems.

#### Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

The authors thank Kevin Dusling, Ulrich Heinz, Tuomas Lappi, Wei Li, Derek Teaney, and Raju Venugopalan for important discussions. The authors thank Bjoern Schenke for careful reading of the manuscript and helpful comments and suggestions. The authors are supported under U.S. Department of Energy Contract no. DE-SC0012704. Sören Schlichting acknowledges support under U.S. Department of Energy Grant no. DE-FG02-97ER41014.