#### Abstract

In the past several years kinematic data sets from Milky Way satellite galaxies have greatly improved, furthering the evidence that these systems are the most dark matter dominated objects known. This paper discusses a maximum likelihood formalism that extracts important quantities from these kinematic data sets, including the amplitude of a rotational signal, proper motions, and the mass distributions. Using a simple model for galaxy rotation it is shown that the expected error on the amplitude of a rotational signal is 0.5 km with stars from either classical or ultra-faint satellites. As an example Sculptor is analyzed for the presence of a rotational signal; no significant detection of rotation is found, with a 90% c.l. upper limit of 2 km . A criterion for model selection is presented that determines the parameters required to describe the dark matter halo density profiles and the stellar velocity anisotropy. Applied to four data sets with a wide range of velocities, models with variable velocity anisotropy are preferred relative to those with constant velocity anisotropy, and that central dark matter profiles both less cuspy and more cuspy than Lambda-Cold Dark Matter-based fits are equally acceptable.

#### 1. Introduction

Since their initial discovery [1], dwarf spheroidals (dSphs) have offered a unique insight into the formation of galaxies and structure on the smallest scales. Initially characterized as unusual and ghostly stellar systems, photometric studies tended to find that these systems contained old stellar populations with no recent signature of star formation activity [2]. Though photometrically well studied since their discovery over seventy years ago, as late as nearly 30 years ago minimal was known on the internal kinematic properties of their stellar populations or on the kinematic properties of these objects in the Milky Way (MW) halo.

Aaronson [3] provided the first measurement of the line of sight velocities of stars in Milky Way dSphs. From the spectra of merely three carbon stars, Aaronson suggested a mass-to-light ratio for the Draco dSph nearly an order of magnitude greater than that of Galactic globular clusters. Follow-up studies of several dSphs, including Sextans, Fornax, Ursa Minor, Sculptor, increased the velocity samples by an order of magnitude, and in the process established these systems to be dark matter dominated [4–6]. It was further suggested that all of these systems share a similar dark matter halo mass of [1–5] [6]. Even at the time of these early measurements, it was understood that the mass distributions of these systems provide strong constraints on the properties of the particle nature of dark matter, including its mass and primordial phase-space density [7–9].

With the advent of high resolution, multiobject spectroscopy, the velocity samples from the brightest dSphs initially studied in [4–6] have now increased by up to three orders of magnitude [10–12]. These new data sets have revealed that the velocity dispersions of the systems are all 10 km , and in all cases the dispersions remain constant even out to the projected radius of the outermost velocity measurements [11]. Though the data sets have increased by more than tenfold, the more modern analysis of these systems still confirms the global conclusion established from the initial observations that dSphs are strongly dark matter dominated [10, 11, 13, 14].

Not only has the past several years seen an increase in the kinematic data sets for the brightest dSphs, the number of known Milky Way satellites has more than doubled due to the Sloan Digital Sky Survey (SDSS). As of the writing of this paper, the SDSS has discovered 14 new Galactic satellites [15–17]. The new SDSS systems have lower luminosities and surface brightnesses than the 11 classical Milky Way satellites that were known prior to SDSS. The half-light radius for several of these new objects is less than 100 pc; this radius is smaller than the typical half-light radius of the classical satellites but still somewhat larger than the typical globular cluster half-light radius of 1–10 pc.

Several kinematics studies on the ultra-faint population of SDSS satellites have been undertaken in the past several years [18–21]. Using spectra from eight of the SDSS satellites, Simon and Geha [20] concluded that these objects are strongly dark matter dominated. Several of the ultra-faint satellites have velocity dispersions as low as 5 km , making them the most-promising systems to study the phase space limits of the dark matter. It has additionally been observed that the ultra-faint satellites are the most metal-poor systems known, and that they form a continuation of the luminosity metallicity trend set by the brightest dSphs [21, 22].

With the above data sets now available, it is becoming increasingly necessary to develop better theoretical tools to interpret them. An important aspect of the theoretical modeling will necessarily require an interpretation of the kinematic data sets for the population of MW satellites; a detailed understanding of these kinematic data sets will be important not only for determining the mass distributions of each individual system, but for a global comparison to theories of Cold Dark Matter (CDM) [13, 23]. Understanding the mass distributions will also be important for interpretation of limits on particle dark matter masses and annihilation cross sections in high-energy gamma-ray experiments [24–26]. Further, understanding the kinematics of these systems may eventually reveal whether they have dark matter cusps or cores, which would in itself provide a stringent test of the CDM paradigm [27].

The primary aim of this paper is to discuss a maximum likelihood formalism that is used for extracting important physical quantities from dSph kinematic data sets. Section 2 begins by reviewing the properties of the kinematic data sets and defining the likelihood. Section 3 then uses the likelihood to extract rotational and proper motion signals. Section 4 discusses mass modeling and a new calculation for model selection. Section 5 presents the conclusions.

#### 2. Likelihood Function and Error Modeling

Information on the kinematic properties of dSphs is extracted from the line of sight velocities of their individual stars. This section introduces the likelihood used in the data analysis and projections for the errors attainable on several parameters using the likelihood.

##### 2.1. Likelihood Function

The probability for a velocity data set, , is assumed to be of the form

In (1) the dispersion of the distribution is given by the sum of the measurement uncertainty on a star, , and the intrinsic dispersion of the system at the projected radius of the star. The latter quantity is symbolized by and is determined by the model; Section 4 provides more details on this quantity and specifically how it relates to the mass of the systems. The systemic line of sight velocity in the direction of the th star is given by . Written in the above form, (1) may be read as the probability for the data set, given the parameters and . Appealing to Bayes’ theorem and defining the likelihood function as

the parameters and may be determined directly from the data by the maximization of (2). Equation (2) assumes uniform priors on the model parameters.

The form of (1) results from the convolution of Gaussian distribution which represents the measurement error on the velocity of a given star with a separate sampling distribution that is assumed to be Gaussian. It is the sampling distribution of velocities that is connected to physical quantities such as the velocity anisotropy of the stars, and the potential of the stellar and dark matter components. For a given model of the galaxy, the true line of sight velocity distribution function may indeed be non-Gaussian; certain limiting cases of the velocity distribution for analytic potentials have been considered in [28]. This paper shows that when attempting to reconstruct the line of sight velocity distribution for a given model, degeneracies exist between the stellar velocity anisotropy and the stellar and dark matter potentials. Though more information may be gained on model parameters if the true velocity distribution was known, and thus utilized in the parameter estimation, the Gaussian approximation provides the most conservative sampling distribution in reconstructing model parameters in variance estimation problems (for a specific discussion of this point, see the discussion in Chapter 8 of [29]). Further, the mass estimations presented here using the likelihood in (2) agree with mass estimates that use a Gaussian likelihood in the binned velocity dispersion [30]; in this latter case the velocity dispersion does not necessarily correspond to the variance of a Gaussian line of sight velocity distribution, making it self-consistent to determine parameters such as the velocity anisotropy.

The distribution function in (1) provides the simplest description of a data set. Including higher-order effects naturally introduces a larger set of model parameters. The first modification to (1) from higher order corrections comes from noting that the mean velocity, , varies as a function of the position of the star in the galaxy. This variation in the mean velocity results from the fact that, for lines of sight with larger angles from the line of sight directly to the center of the galaxy, the proper motion of the object contributes an increasingly larger component to the line of sight velocity. To describe how the line of sight velocity varies as a function of position, consider a cartesian coordinate system in which the -axis points in the direction of the observer from the center of the galaxy, the -axis points in the direction of decreasing right ascension, and the -axis points in the direction of increasing declination. The angle is measured counter-clockwise from the positive -axis, and is the angular separation from the center of the galaxy. The mean line of sight velocity is then

In the small angle approximation, , where , and is the distance from the observer to the center of the dSph. Then , so that (3) can be written as . In the limit that the vector pointing from the observer to the center of the galaxy is exactly parallel to the lines-of-sight to each star, .

Equation (3) show that the line of sight velocity of a system increases roughly linearly with the increase of the projected distance from the center of the dSph. This effect is purely geometric and may be used to recover the proper motion of a dSph with similar accuracy to the proper motions attained in ground and space-based measurements [12, 31]; an application to a specific data set of Sculptor is given. The extraction of dSph proper motions in this manner is analogous to the determination of the proper motions for the Large Magellanic Clould [32] and for M31 [33] from their stellar and satellite distributions, respectively.

There may also be rotational motion, in addition to the dominant contribution from random motions, present in the galaxy. Though rotation is intrinsic to the dynamics of the system and is not purely geometric as that described by (3), a simple parameterization is possible if the rotation amplitude is described by a term , where defines the projected axis of rotation. Adding all of the terms together gives the following expression for the line of sight velocity of a star:

With the addition of each of the terms in (4), our likelihood function now reads

and the vector set of 6 parameters () may be directly determined from the data. In the sections below these parameters is determined from an example data set; before jumping into this data analysis the following subsection provides a discussion of the theoretical predictions for the errors attainable on these quantities.

##### 2.2. Error Modeling

From the likelihood function defined in (1) and (5), the Fisher matrix formalism may be used to derive projected errors on the model parameters. For model parameters that are varied, the Fisher matrix is defined as an by matrix so that the entry for the th and th parameters is given by

Here is a vector defining the set of parameters. In the simple case studied in this section, the parameters are given by . According to the Rao-Cramer inequality, the minimum possible variance attainable on a parameter using maximum likelihood statistics is given by the inverse of the Fisher information matrix, . The average in (6) is taken over the data, and the derivatives are evaluated at the true model of parameter space. The inverse of the Fisher matrix thus provides an approximation for the true covariance of the parameters, and using provides a good approximation to the errors on parameters that are well-constrained by the data.

The Fisher matrix is constructed by differentiating the log of the likelihood function in (5). It will be understood that the total dispersion is evaluated at the projected radius of the th star. Averaging over the likelihood function, and using the above definition of , the final expression for the Fisher matrix is

The sum is over the number of observed stars in the galaxy. The analysis in this section considers the simplified case that does not in itself depend on any model parameters. A more detailed model would consider this quantity as a function of the parameters that describe the mass modeling of the system; this is discussed in more detail in Section 4.

In the second term in (7), the derivatives are with respect to the theory dispersion alone, whereas both of the contributions to the variance sum in the denominator. For the well-studied satellites, with intrinsic velocity dispersions of 10 km , the dispersion from the distribution function dominates the dispersion from the measurement uncertainty, while for many of the newly-discovered satellites, both contributions to the dispersions are similar. Equation (7) shows that, to determine the error on any of the parameters, one must determine the distribution of stars within the dSph that have measured velocities, and the error on the velocity of each star. This implies that the projected errors are independent of the mean velocity of the stars. Additionally, under the approximation that and no rotation, the first term in (7) vanishes, and the errors are independent of the parameters describing the mean motion of the system.

The projected errors obtained using (7) provide an excellent estimate of the measured errors on both and [12, 31]. Though there has been no conclusive detection of a parameter similar to in published kinematic data samples, it is interesting to determine the expected error on this quantity given expected future data samples. Figure 1 shows example error projections for , for two different model galaxies. The upper solid curve assumes structural parameters similar to that of Segue 1, with a Plummer radius of kpc and a stellar limiting radius of kpc [34]. The lower dashed curve assumes structural parameters similar to that of Draco, with a King core radius of kpc and a King limiting radius of kpc [35]. Each curve assumes that the measurement uncertainty on each star is 2 km . In both cases, the stars have been uniformly distributed at projected positions in the galaxies; this provides a good representation of the present observational configurations.

In addition to their interesting applications for understanding the rotation and proper motion of the dSphs, the calculations presented in this section are crucial for uncovering properties of underlying dark matter distributions. For example a strong gradient may reflect ongoing tidal disruption, which would clearly affect dark matter mass modeling, as is discussed in more detail in Section 4.

#### 3. Proper Motions and Rotation

This section discusses an application of the maximum likelihood formalism introduced in Section 2, with a specific focus on the methodology for extracting an intrinsic rotational signal and proper motions using an example data set. Extracting rotation from a data set is important for reasons discussed above, and, in addition to its phenomenological interests, extracting the proper motions of MW satellites may have important implications for understanding the origin of the accretion history of MW [36–38]. Specifically determining the latter would present a unique observational test of MW halo formation within the CDM paradigm.

Several dSphs have kinematics data sets large enough that statistically significant constraints may be placed on the parameters , , and . For illustrative purposes this section considers just one example, the Sculptor dSph. Sculptor is located at a distance of 80 kpc and has a measured King limiting radius for its stellar distribution of 1.6 kpc [39]. Given these parameters it is one of the more spatially extended dSphs. The mass content of Sculptor has been estimated in several recent papers [13, 14, 40], and it has been shown that Sculptor may contain some degree of rotational support [40]. Further, the previous determinations of the proper motion of Sculptor from its line of sight velocities may indicate a discrepancy between the proper motion as determined from this method and from ground- and space-based measurements [12]. This latter fact may in itself be indicative of the presence of an intrinsic rotational component, provided that the systematics on the ground- and space-based determinations of the Sculptor proper motions are well-understood [41].

To extract the rotation and proper motion signal, a simplified model is considered by assuming that the likelihood function is characterized by the six parameters introduced in Section 2. It is assumed that the intrinsic dispersion is uniform throughout the galaxy and does not depend on any of the parameters of the mass modeling introduced in Section 4. Introducing the set of parameters discussed in Section 4 does not affect the reconstruction of the parameters discussed in this section since the intrinsic dispersion is uncorrelated with the parameters of the function [31].

In order to determine the probability distributions of the parameters , a standard metropolis-hastings algorithm [29] is used to sample the likelihood function as written in (5). For all runs described here accepted points were obtained in each chain, with the first excluded to account for a conservative burn-in phase. For simplicity, a uniform proposal distribution is assumed for each of the parameters over a wide range chosen to encompass physically-acceptable values for each of these parameters. The line of sight velocity data used is taken from the Walker et al. [42] sample, and only those stars with c.l. probably for membership are used in the analysis.

Figure 2 shows the posterior probability distributions for the proper motion components corresponding to and , and , respectively. The remaining parameters are marginalized over. Though the distributions in Figure 2 were obtained by allowing the parameter to float freely, the distributions are found to be relatively unaffected if is instead fixed so that . This reflects the fact that the radial gradient in the velocity of the stars is distinct from the intrinsic rotational component, which has a sinusoidal behavior as a functional of the position angle. The results presented in Figure 2 are in agreement with the measurements of Walker et al. [12], though here a larger set of parameters is marginalized over in this analysis.

**(a)**

**(b)**

Figure 3 shows the corresponding probability distribution for the rotational parameter . Again the remaining five parameters are marginalized over. The result is that, given the rotational parameterization and the using entire distribution of 1352 Sculptor stars, there is no statistically significant detection of rotation. From Figure 3 the c.l. upper limit on the rotation is found to be 2 km . The result presented in Figure 3 is somewhat degenerate with the parameters describing ; for example if and were (unphysically) set to zero, the implied upper limit on reduces by about 50%.

Figure 3 represents the averaged value of throughout the entire galaxy. It may be possible that the rotation amplitude in the outer region differs from the rotation rate in the inner region; if this were the case then it is plausible that this effect is washed out in the averaging process. To provide a simple test for a possible differential rotation rate, an additional likelihood analysis was considered with just the outer sample of Sculptor stars. Here the outer stars are defined as only those with projected radius beyond 0.5 kpc. Even in this case, there is no statistically significant detection of , though in this case the 90% c.l. upper limit increases to 10 km

#### 4. Mass Distributions and Model Selection Criteria

This section discusses the extension of the maximum likelihood analysis developed in Section 2, with a goal of using the kinematic data to determine dark matter mass distributions. A calculation of the gravitating mass of a stellar system is one of the more fundamental tasks in astronomy, and simple scaling arguments provide some guidance to anticipate the results. It is worthwhile to first review these arguments as applied to the dSphs before undertaking a more detailed and model dependent treatment.

##### 4.1. Spherical Mass Modeling

###### 4.1.1. Initial Estimates

Under the assumption that a star cluster is spherically symmetric, the orbital distribution of the tracer particles is isotropic, that mass follows light, and the cluster is isolated from any external gravitational potential, the virial theorem provides a mass estimate of , where is the observed extent of the cluster and is the velocity dispersion of the stars. Although this is probably the simplest estimate one can make for the mass of a star cluster, it does provide a useful extremum bound. For example Merritt [43] has show that the virial theorem may be used to derive a lower bound on the mass of a star cluster, which is obtained from the assumption that all of the mass is concentrated as a point in the center. This minimum mass is given by , where is the harmonic mean stellar radius in the cluster.

Of course for dSphs it is not consistent to assume that these systems are isolated, since they are orbiting within the extended dark matter halo of the MW. For dSphs orbiting with the MW halo, the minimum mass estimate above is particularly useful, as it in turn provides a conservative estimate of the radius at which particles would be stripped due to the MW potential. As an example consider the case of Segue 1, which is an MW satellite with a stellar luminosity 340 at a Galactocentric distance of 28 kpc. From the de-projected light distribution, the harmonic mean stellar radius is 10 pc, and given the velocity dispersion of km [21], the implied minimum mass of Segue 1 is . Assuming that Segue 1 is a point mass orbiting in the potential of the MW, the radius at which particles would be presently getting stripped is the Jacobi radius, , where kpc. Assuming the minimum mass of , pc. It is important to note that this provides only an estimate of the instantaneous tidal radius; if Segue 1 came significantly closer to the MW in the past then this estimate would differ. The above estimate provides a lower bound on the radius at which particles would be getting stripped, under the assumption of a circular orbit. A similar argument for the tidal radius of Segue 1 was considered in Geha et al. [21] using the Illingworth approximation for the mass as (for an alternative interpretation for the origin of Segue 1, see [44].)

###### 4.1.2. Jeans Equation

At the next level of detail from the dynamical perspective, an estimate for the mass of the dSphs may be obtained by appealing to the sphericallysymmetric jeans equation, assuming that the gravitating mass of the system consists of both stars and dark matter. The analysis here closely follows the treatment given in the appendix of Strigari et al. [13] and refers to this paper for further details. A standard discussion of the spherical jeans equations comes from [45].

The spherical jeans equation is

Here is the deprojected stellar density profile, the circular velocity is , and the parameter characterizes the difference between the radial and tangential velocity dispersions of the stars. Integrating along the line of sight gives the velocity dispersion as a function of projected radius, ,

Here, is the projected surface density of the stellar distribution, and is the three-dimensional stellar distribution. In (9), depends on the parameterization of the mass distribution of the dark matter component. The stellar density profile is taken to be fixed; for example the measurements of the projected density profiles for many of the classical satellites come from [46], and more updated profiles from, for example, [35, 47–49], while measurements of the density profiles for the ultra-faint satellite come from [34]. It is important to note that fixing the stellar density profile may introduce a degeneracy in determining the projected velocity dispersion profile, particularly in the central regions [50]. However the effect on the integrated mass distributions as considered here is less severe, motivating the assumption of fixing , rather than marginalizing over it, in the analysis.

Given the above assumption for the velocity anisotropy of the stars and for the shape of the dark matter profile for the galaxy, the likelihood function can now schematically be written as

For compactness, the vector has been defined, and and are vectors that describe the stellar velocity anisotropy and the gravitational potential of the system, respectively. The line of sight velocity dispersion is dependent on and through the spherical jeans equation. The mass of the system, as well as quantities related to the mass distribution, are determined via , and thus by integrating out the model parameters one may determine the probability density for the mass of the system contained within a fixed physical radius.

###### 4.1.3. Error Projections on Mass Distribution

Before performing an example calculation using (10), it is interesting to get an idea as to how the errors on the mass distribution depend on the physical radius within which the mass is determined. To perform these estimates, we again appeal to the Fisher matrix formalism outlined above. However the analysis here is different from above in that now the likelihood depends on the vector set of parameters and in addition to .

The example considered here uses the velocity data sample from Fornax of Walker et al. [42], specifically the stars with c.l. for membership. This gives a total of 2409 Fornax members. The three-dimensional surface density profile for Fornax is assumed to take the form

with the parameters . A profile of this form with these parameters is consistent with the recent measurements of Fornax star counts [49], though generally the results presented are independent of the normalization of the surface density profile. The stellar mass-to-light ratio is assumed to be unity, consistent with the results presented in [49]. The dark matter density profile is assumed to be the einasto profile

and following CDM simulations, [51]. The velocity anisotropy is assumed to be of the following form

Thus in the Fisher matrix calculation the base set of parameters are now given by (the rotational and geometric parameters, are ignored here: this is justified given that the Fornax data is consistent with and that the parameters do not correlate with the parameters that determine the mass).

Given the base set of parameters in used to calculate , the error on a derived parameter, , is given by

The derived parameter specifically considered here is the log of the mass within a given fixed physical radius (see [52] for another example where the derived parameter considered is the log slope of the dark matter density profile). Where desired Gaussian priors may be taken by simply adding to the component of the Fisher matrix.

Figure 4 shows the error on the log of mass as a function of the physical radius within which the mass is measured. Here the fiducial baseline parameters for the velocity anisotropy have been taken as , implying slightly tangential orbits in the central region of the halo and isotropic orbits at outer radii. Different combinations of have been taken as indicated to represent the degeneracy between these two parameters when fitting the data. Each of these parameters sets, combined with an anisotropy model, produces a velocity dispersion profile that roughly fits the profile of Fornax. While the goal here is to not undertake a direct fit to the data and to explore the exact degeneracy space of these parameters, examining these three sets of fiducial parameters gives a feel for how the constraints on the mass depend on the fiducial parameter set. Priors on each of and are taken as , while priors on and are taken as . Each of these priors are motivated by the range of these parameters scanned in the algorithm described in following the subsection below. As is seen, for the stellar profile considered above and the fiducial set of parameters taken, the best-constrained mass is at a radius 0.6–1.0 kpc. This best constrained radius is found to be relatively weakly dependent on the sets of fiducial parameters, particularly near the best constrained mass, provided that they give a good fit to both the star count and velocity dispersion data of Fornax. This notion that the mass is a strongly constrained at the half-light radius is found to be a general property of dispersion supported systems, as discussed in [53, 54].

###### 4.1.4. Fornax Mass Distribution

The posterior probability density for the mass distribution of Fornax is now determined directly from the kinematic data, and compared to the projected error on the mass distribution as determined from Figure 4. As above, a metropolis hastings algorithm is used to determine the respective posteriors, and the same data for both the star counts and the line of sight velocity distribution have been used. In the parameter scan, uniform priors have been taken on each of the parameters over the following ranges as follows: .

Figure 5 shows two example probability distributions for the Fornax mass, within 0.6 kpc (*left*) and within the approximate Fornax stellar tidal radius of 3 kpc (*right*). The probability distributions are seen to be slightly non-Gaussian, particularly the M(3 ) distribution. Comparing the approximate width of each of these distributions with the errors projected in Figure 4 provides generally good agreement, in spite of the intrinsic assumption in the Fisher matrix formalism that the errors on the parameters are Gaussian. Specifically for the left panel, a Gaussian fit gives (0.6 kpc)/. These results confirm the general trend seen in Figure 4 that the error on the integrated mass within a fixed physical radius increases at larger radii towards outer regions of the halo.

**(a)**

**(b)**

Results of the calculations for the mass distributions of the entire population of dSphs are presented in [13, 23, 53, 54]. These results, as well as more recent determinations, show that the central mass distributions for the dSphs are very similar, despite an over four order of magnitude variation in their luminosities. The average density within a spherical radius of 0.3 kpc is 0.1 ; for the brightest satellites baryons can contribute to the potential in this central region, while for the least luminous satellites the potential is dominated by dark matter within this region. Within the context of spherical models, these constant central density results are robust to the specific parameterization of the mass distribution, primarily due to the fact that the integrated mass is directly constrained via the jeans equation and the approximately similar scale for the velocity dispersion profiles [13].

###### 4.1.5. Model Selection

The likelihood formalism introduced above does not give any information regarding the optimal parameterization of the dark matter mass profile. For example referring to the calculation above, is the einasto profile with just two free parameters an acceptable description of the data? Given the parameterization of the dynamics via the spherical jeans equation, we can answer this question and determine how many parameters are required to describe the mass profile given the maximum likelihood formalism. Moreover, we can determine how the parameterization of the density profile depends on the given data sets. For example Segue 1, with only 24 measured line of sight velocities, may require a smaller set of parameters than does Fornax, which has 2400 measured line of sight velocities.

To specifically answer the question of how to determine the appropriate set of parameters in maximum likelihood theory one may appeal to the bayes evidence. For the purposes here the evidence, , is defined as the integral of the likelihood in (10) over all of the model parameters. When comparing models, the ratio of their respective evidences gives an idea of how much more probable one model is over another. For example if , the difference between the two models is substantial; for , the difference between the two models is strong, and for the difference between the two models is decisive [55].

As an illustration, four different dSphs that span a wide range in their respective number of velocities are considered: Segue 1, Sextans, Sculptor, and Fornax. These dSphs have 24, 424, 1352, and 2409 stars, respectively; for the latter three galaxies we consider only those stars that have a probability of 90 membership from the Walker et al. [42] sample. For each dSph we determine how many model parameters are necessary to describe the data, and we consider several different models.

For the “Baseline” 3 parameter model, the following range of parameter space is integrated over for Fornax, Sculptor, and Sextans: , . The velocity anisotropy is assumed to be a constant, , with a range given by . For Segue 1 the ranges are the same except for the scale radius, which is taken to vary over the range ; this range is motivated by the likely upper limit to the dark matter tidal radius for Segue 1 [21]. All of the ranges above are chosen as plausible values to describe the halos of dSphs. A flat prior is chosen over these regions; as a further detail one may chose a prior that weights each of these parameters differently, for by example considering the scatter in the relation as seen in CDM simulations [56]. The bayes evidence for the Baseline model will be denoted as .

Three different models are compared to this Baseline 3 parameter model: (i) a model in which the parameter space for is enlarged in the range (corresponding to [3–7]), so that central slopes that are both more flat and more steep than the CDM value are allowed (ii) a model with the Baseline 3-parameter profile, but with a three-parameter velocity anisotropy profile which depends on the three-dimensional physical radius as in (13), and (iii) a model in which and the profile in (13) is assumed.

Model (i) is thus described by four parameters, while model (ii) is described by five free parameters, and model (iii) is described by six free parameters. In Table 1, we define model (i) as the “Exp” model, model (ii) is denoted as the “” model, and model (iii) is denoted as “ EXP.” These models provide useful illustrations of the calculation of the evidence as applied to dSphs; alternative models may of course be defined and even larger parameter spaces may be explored. The utility of the above models as defined allows us to explore to what extent CDM-like inner slopes are more favored, and to what extent an alternative parameterization of the velocity anisotropy provides a better fit to the data as compared to simply changing the value of the inner slope.

The results for the ratio of the bayes evidence for the various models, relative to , are shown in Table 1. For each of the galaxies, we see roughly the same pattern; as more parameters are added, the better the model fits the data. This result implies that models with larger sets of parameters are favored even after penalization for the larger volume of parameter space that is integrated over. Allowing for a larger volume of parameter space for the dark matter density profile affects the evidence more than simply varying the shape of the anisotropy profile. In total, the best fitting models are those that allow both the velocity anisotropy and the central slope to vary freely, that is, the EXP models.

The results in Table 1 indicate that for all four galaxies variable velocity anisotropies are slightly preferred relative to those with constant velocity anisotropy, and that central dark matter profiles both less cuspy and more cuspy than CDM-based fits are equally acceptable. Future data sets, both line of sight velocities and potential proper motion measurements for stars in dSphs [52, 57, 58], will be important in narrowing the acceptable ranges for both the velocity anisotropy and the central slope.

#### 5. Conclusion

This paper has discussed the analysis of kinematic data from Milky Way dwarf spheroidals, with a primary motivation of (1) understanding physical quantities that are well-constrained by the data and (2) understanding the systematics that underly the determination of the dark matter masses of these systems, given the simplest assumption that the dSphs are purely pressure supported systems. Of the possible systematics perhaps the most significant and observationally-accessible is the determination of a velocity gradient in the data sample, which may be indicative of tidal disruption from the potential well of the Milky Way. The results in the literature indicate that, based on the kinematic data alone, velocity gradients due to tidal disruption or rotation are not conclusively present in any of the dSphs. This paper has provided an example, using a simple parameterization, of how to search for rotation in the kinematic data sets using a maximum likelihood analysis. The kinematic sample of Sculptor was analyzed, and it was found that the maximum likelihood rotational amplitude is zero, with an upper limit of 2 km at 90% c.l. The magnitude of these errors is consistent with the projected magnitude of the errors from theoretical modeling.

When modeling the mass distribution of the dark matter halos of the dSphs, degeneracies between model parameters affect the determination of the total mass profiles, even in the context of the simplest spherical models. To shed light on these degeneracies, this paper has discussed a new criteria for model selection applied to the dSph kinematic data sets, taking a step towards determining how many parameters are needed to describe the mass distribution of spherical halos. For the four dSphs studied here, chosen because they have a wide range of available line of sight velocities, it is shown that, assuming CDM-motivated Einasto profiles for the dark matter halos, models with variable velocity anisotropy are slightly preferred relative to those with constant velocity anisotropy. Further, central slopes for the dark matter profile that are found in CDM simulations are not a unique description of the data sets; both more cuspy and less cuspy models are allowed for the central slope. This is primarily due to the degeneracy between the central dark matter slope with the central stellar profile and the velocity anisotropy distribution [52].

Future photometric and kinematic data sets promise to further pin down the mass distributions of the dSph dark matter halos. Upcoming data for the ultra-faint satellites will be particularly important and may be able to show whether any tidal effects are present in these galaxies. Further more, development of nonspherical distributions for both the light and dark matters should be considered given these data sets (for initial results along these lines see [59]). Controlling systematics in these data sets will prove to be important step towards further testing the currently favored CDM theory of structure formation.

#### Acknowledgments

The author is grateful to Andrey Kravtsov, Matt Walker, and Beth Willman for discussion and comments on this paper. I additionally acknowledge the anonymous referee for constructive comments and critique that improved the content and presentation. The support for this work was provided by NASA through Hubble Fellowship Grant HF-01225.01 awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under Contract NAS 5-26555.