Research Article | Open Access
An Approximate Bayesian Method Applied to Estimating the Trajectories of Four British Grey Seal (Halichoerus grypus) Populations from Pup Counts
For British grey seals, as with many pinniped species, population monitoring is implemented by aerial surveys of pups at breeding colonies. Scaling pup counts up to population estimates requires assumptions about population structure; this is straightforward when populations are growing exponentially but not when growth slows, since it is unclear whether density dependence affects pup survival or fecundity. We present an approximate Bayesian method for fitting pup trajectories, estimating adult population size and investigating alternative biological models. The method is equivalent to fitting a density-dependent Leslie matrix model, within a Bayesian framework, but with the forms of the density-dependent effects as outputs rather than assumptions. It requires fewer assumptions than the state space models currently used and produces similar estimates. We discuss the potential and limitations of the method and suggest that this approach provides a useful tool for at least the preliminary analysis of similar datasets.
Complete censuses are not practical for most animal populations. Instead, abundances usually have to be estimated by scaling up from partial counts. This process is complicated when the components of a population differ in their detectability. Pinnipeds such as grey seals (Halichoerus grypus), where young pups remain ashore and the rest of the population spends the majority of its time at sea, are an extreme example of this problem. In these cases, population estimation can effectively come down to scaling up from observations of pups. This process needs to be done efficiently, and also evaluate the population estimates’ precision. This paper presents a way of simplifying the computations and reducing the assumptions underlying such models. It produces similar results to the, more complicated, methods currently used to produce the estimates on which decisions about the conservation and management of British grey seal populations are based. These simplifications can release resources and data for the examination of environmental and other effects on the populations. In this particular study, they have removed the need for maximum fecundity and survival to be considered equal everywhere.
Grey seals are colonial breeders. Females mature at around six years of age and give birth to a single pup in the autumn. The pups are born on land and remain ashore for several weeks. This behaviour, along with their neonatal white coats, makes them relatively easy to observe. Counting the other components of these populations is much less straightforward, since while they do haul out on land, the animals spend most of their time at sea and submerged.
The species is abundant around Britain and also on the eastern seaboard of North America. There are smaller numbers of animals in the Baltic Sea and around the northern European coastline. They were heavily hunted and more recently have been seen as a serious competitor to commercial fisheries. In 1914, a pessimistic estimate that the British population of grey seals was down to 500 individuals led to the Grey Seal (Protection) Act. This gave some legal protection to the species . From there, exponential growth at around 6-7% brought the population to around 70,000 in the 1970s . Similar exponential growth has been recorded in the grey seal population breeding on Sable Island off Nova Scotia [3, 4].
While the population was growing exponentially, scaling up from pup production estimates to total population size was relatively straightforward, requiring only estimates of the proportion of females breeding and the sex ratio. However, around 1995, the previously steady growth started to slow in some regions (Figure 1). More recent estimates of population size depend critically on the assumptions made about where in the species’ lifecycle density-dependent effects occur. A set of Bayesian state space models has been used to model the population and advise government agencies involved in its management [5, 6]. State space modelling involves specifying two linked submodels: the first, the “process model,” represents the evolution of the true, but unknown, state of the system (i.e., the number of seals of different ages in each region in each year), while the second, the “observation model,” represents the relationship between observations (pup production estimates) and states (true numbers of pups) . The process submodel can be viewed as a type of stochastic, age-structured matrix model of the population dynamics of the species. For grey seals, various sub-models have been investigated containing different assumptions about the component of the population subject to density-dependent regulation and about the movement of animals between regions . Each model was fitted to the pup production data, with prior distributions specified for population numbers in the initial year and for all model parameters (fecundity, the survival of pups and adults, carrying capacity, animal movement, and observation error). More details of the approach are given in Newman et al. [8, 9]. Their analysis of this dataset assumed that environmental carrying capacity was the only parameter to vary between regions. Fitting those models is computationally intensive and requires both statistical expertise and customised software .
The state space models contain observation and process components but fit them together. Instead, we use the pup production data to estimate maximum growth rates for the start of the time series, and we apply this with an assumption of the location of density dependence in the species’ lifecycle to derive the adult population sizes. Our approach separates the estimation process into three parts: first smoothing the pup production estimates to give a distribution of estimated pup production trajectories, then associating each trajectory with a set of demographic parameters, and finally applying the relevant demographic parameters to each pup production trajectory to estimate the numbers of individuals in each of the other age classes. The method presented here can be seen as a form of approximate Bayesian computation [10–12].
Since 1984, pup production at the main Scottish grey seal colonies has been monitored by series of aerial surveys carried out throughout the breeding season. Each year, between 3 and 6 flights are made over each colony, using a fixed-wing aircraft with a vertically fitted large format camera . The numbers of animals in each photograph are counted and used to estimate the total numbers of pups at each colony. Some pups that die soon after birth may be missed or confused with stillbirths and excluded from the counts. This study, along with previous ones, ignores this detail and defines fecundity as the ratio of estimated pup numbers to adult females. Pup survival is then considered as the proportion of the counted pups that survive another year.
Equivalent counts are made directly by observers on the ground at the colonies in England and Shetland. A consistent methodology has been used to estimate the total numbers of pups in each colony and, where sufficient surveys have been completed, calculate the estimates’ precision . Previous analyses have summed the data within each of four regions: the North Sea (effectively defined as the eastern coastline of the UK from the Thames to Rattray Head, north of Aberdeen), Orkney, the Inner Hebrides, and the Outer Hebrides. We follow this and use the total pup production estimates from each area (Figure 1) as inputs to our models.
We use the same prior distributions for grey seal demographic parameters as the previously published state space models . They gave separate prior distributions for maximum (low population density) fecundity and adult and pup survival. The values were based on previous studies of the species and are shown in Table 1.
The population estimation was carried out in three stages: first smoothing the pup production estimates to give a distribution of estimated pup production trajectories, then associating each trajectory with a set of demographic parameters, and finally applying the relevant demographic parameters to each pup production trajectory to estimate the numbers of individuals in each of the other age classes.
3.1. Smoothing of the Pup Production Trajectory
We fitted generalised additive models (GAMs), with log link functions and a quasi-Poisson error structure , separately to each of the four regional pup production time series. These were simple empirical models that used a cubic spline to smooth the observed data. The mgcv library  within the R statistical environment  was used for this. Generalised cross-validation was used to set the smoothing parameter within the models, with gamma (a parameter that reduces the tendency of these model to overfit data) set to 1.4 .
Pup production in each region showed a period of approximately exponential growth (Figure 1) though with different annual growth rates. The growth of the populations in each region was, therefore, investigated separately. A parametric bootstrap was used to derive the distribution of maximum growth rate from the fitted models. 10,000 replicate pup production trajectories were simulated for each region, using the Bayesian covariance matrices for the parameters of the gam models  to allow for the dependencies between the parameters of their smooth terms, and each trajectory’s maximum annual growth rate was then calculated.
3.2. Demographic Parameter Estimation
Scaling the replicate pup production trajectories up into population trajectories requires them to be combined with suitable sets of demographic parameter values. Any combination of demographic parameter values will produce a particular stable age-structure and exponential growth rate at low population densities. Given that the animals are assumed to breed first at age 6 and have fecundity and mortality rates are constant among adults, and using the notation in Table 1, this growth rate, , satisfies (the derivation of this equation is given in the Supplementary Material which is available online at http://dx.doi.org/10.1155/2011/597424).
The maximum growth rate within each estimated pup production trajectory was taken as an estimate of the low-density growth rate for that replicate population. The demographic parameter values then need to be drawn from their joint conditional probability distribution given the appropriate exponential growth rates. Explicitly calculating this distribution is not straightforward, but it can be approximated numerically by drawing from an unconditional joint probability distribution for the demographic parameters and discarding those results whose maximum growth rate falls outside a small neighbourhood of the required value. This approach is slow, because small neighbourhoods will produce high parameter rejection rates , so we simplified it further. We drew 10,000 sets of candidate parameter values from Newman’s priors and calculated the rates of stable exponential growth that each would produce for low-density populations. Each of the replicate pup trajectories was then associated with the set of demographic parameter values that produced the exponential growth rate most similar to the observed maximum, and the sets of deterministic matrix models were populated using these values.
3.3. Population Models
Two sets of incomplete deterministic age-structured matrix models of the females within each population were then constructed, with one assuming that all the density dependence was in fecundity and the other putting it all into pup survival. These differed from those in Newman et al.  by not defining the functional form of the density dependence. If there were no density-dependent effects, a deterministic age-structured matrix model with known survival rates could be used to project forwards from a single pup count to give the number of one-year olds the next year, then two-year olds the following year, and so on. Similarly, fecundity could be used to estimate adult female numbers from pup numbers, then survival rates used to project back to younger animals in earlier years. If there was actually stochastic variation in the population growth, such an approach could produce inconsistencies when based on a series of pup counts. It was assumed that density dependence only affected one transition, either fecundity or pup survival, in the model and that stochastic effects, such as fluctuations in environmental conditions, only acted at this point in the lifecycle. This formulation effectively meant that there was a single pup count that could be used to estimate each age class in each year, and the density-dependent relationship was implicitly determined by the relative sizes of the age classes either side of that transition.
Five annual age classes were used in the model, along with a sixth that contained all the older animals. Only animals within the oldest category were considered to breed . For the variable fecundity model, the numbers of one-year-old females, , in each replicate were calculated by multiplying half the previous year’s pup production , (so assuming an equal sex ratio at birth) by the pup survival parameter, A similar process was used to fill in the subsequent 2-, 3-, 4-, and 5-year-old classes but using an “adult” survival parameter, The numbers of individuals in the older age groups during the early years of the study were estimated by using the stable age structure for an exponentially growing population, described above, to scale the estimated numbers of pups in the first year of the dataset. The numbers of adult, six-plus, females were then projected forwards throughout the dataset Each year’s effective fecundity was then calculated by dividing the estimates for pups by those for female adults.
Equivalent calculations were made for the model with density-dependent pup survival though these used fecundity, , to calculate numbers of adults from the pup production estimates They then worked back down in age Within this second model, the recent younger age classes were filled in using pup survival estimates generated from the data from the years with most similar estimated adult numbers. Further details and code for these calculations are contained in the Supplementary Material.
Two different methods were used to combine the results of the two models of each region. The first assumed that one of the two models was correct and, in the absence of information to choose between them, simply superimposed the two posterior distributions of population estimates to effectively give a single, model-averaged, overall distribution of population estimates. The second approach assumed that the truth lay somewhere between the two explicit models and placed an additional uniform prior on where the result lay between them. Two uniform random variables were, therefore, drawn to produce each of a set of estimates: one variable to identify a pair of bootstrap replicates, and the other to determine the weighting of the results of the two population models for that replicate. The result was a distribution that effectively smeared across the two directly modelled extremes. Total, rather than female only, population estimate distributions were then calculated by multiplying each replicate by a draw from a normal distribution with mean 1.73, the value used in previous analyses of this data  and standard deviation of 0.1, to allow for the uncertainty in the sex ratio within these populations.
Figure 1 shows the smoothed pup production trajectories for each of the regions. Populations’ trajectories clearly differ between regions. In the North Sea, pup production still appears to be growing almost exponentially. In all other regions the growth rates have at least slowed substantially (see Figure S2 in Supplementary Materials). This occurred during the mid 1990s in the Inner and Outer Hebrides and in the early 2000s in Orkney. In the Outer Hebrides, the highest pup production estimate occurred before 2007 in all but 64 out of the 10,000 replicate trajectories, implying that a significant decline has occurred in that region. For 95% of these replicates, the highest values occurred within the period 1995–2002. Everywhere except the North Sea, the density-dependent effects cause the pairs of matrix models to diverge (Figure 2). Estimates of the 2007 population are given for each region in Table 2 and have a slightly higher precision than those produced by the state space models . Adult survival is the only demographic parameter substantially altered by the model fitting (Table 1; Supplementary Material Figure S3). Plots of the effective patterns of density dependence (Supplementary Material: Figures S4, S5) show much more clearly defined patterns for the models containing variable fecundity than those with variable pup survival.
|1All the CIs include uncertainty in the population sex-ratio. |
2The CIs are estimated conservatively by summing those of the individual models.
The two models agree that there are probably slightly more than 20,000 grey seals that breed on the eastern coasts of England and Scotland (our North Sea region) and that this population is continuing to grow in a near exponential fashion. In all the other areas, the predictions diverge rapidly with the models containing density-dependent fecundity producing estimates for 2007 that are 2-3 times as large as the equivalent figures for density-dependent survival. The confidence intervals of these pairs of models do not overlap. Outside the North Sea, the precision of the population estimates would be greatly improved if it were possible to distinguish where in the grey seal life-cycle density dependence impacts most strongly. Because they do not specify the functional form of the density dependence, the models presented here can give little information on this. It is also difficult to extract this information from the more complex state space models of this system even though these do explicitly assume the form of the density dependence [5, 9], probably because the connection between the data (pup counts) and the required information (location of density dependence in the lifecycle) is through the, initially unknown, population size. Additional information, independent of that used here, is, therefore, required.
The approach presented here is Bayesian in the sense that it uses prior distributions on the density-independent demographic parameters (adult survival, and maximum pup survival, maximum female fecundity). However, it is approximate in its use of the priors on the demographic parameters and because there are no formal priors on the density-dependent components of the model or the population sizes, there are no complete likelihoods for the results. The result is a semiparametric approach to model fitting, rather than a fully parametric model. The pup production data are used to derive a maximum growth rate for the early part of the time series, and this is used, with an assumption of the location of density-dependence in the species’ lifecycle, to derive the adult population sizes together with approximate graphical representations of the density dependent effects. This strategy has some similarities to that adopted in approximate Bayesian computation (e.g., ), where explicit calculation of the likelihood is avoided by the use of easier-to-compute summary statistics. The process of sampling parameter values from prior distributions is stochastic, and Monte Carlo error can, therefore, influence the outcome. Repeating the model fitting produced population estimates and confidence intervals within 1% of the values reported here suggesting that this effect is small. It could be further reduced by increasing the number of replicates used. The calculations reported in this paper took around 10 minutes to run on a laptop with a 2.33 Ghz Intel Core 2 Duo processor and 2 Gb of RAM.
This methodology effectively pushes most of the uncertainty in the system into the error terms of the GAMs. These models, therefore, have lower precision than the colony-based pup production estimates and estimate each year’s expected, rather than actual, pup production. The uncertainty then passes through into the population estimates and could be expected to reduce their precision.
The age-structured population models are largely deterministic. They effectively assume that varying environmental conditions will mainly affect transitions where density dependence occurs. That is likely to be a simplification of the actual situation but not an unreasonable one given that the other processes are assumed to be relatively insensitive to the size of the population relative to its carrying capacity. Because these models do not contain a predefined form for their density-dependent components, they also avoid prescribing a distribution or pattern for the effects of environmental variation on the populations’ dynamics. Instead, such variability can be expected to simply reduce the precision of their results.
The similarity of the credibility interval widths presented here to those from the more detailed state space models suggests that the additional effects, such as demographic stochasticity and movement between areas, which are explicitly represented in those models, may have limited impact on the precision of their results in this case. This is not entirely unexpected, since demographic stochasticity should be small for such large populations, and the estimated amount of movement between colonies is also small (inspection of posterior movement parameter estimates reported in ). Another way of interpreting this is that the extra parameters, along with the requirement to satisfying constraints imposed by the functional forms of the density dependence and movement, within the state space models may absorb a sufficiently large proportion of the information within this small dataset to negate the benefits of their more accurate representation of the system. It is also possible that the use of different demographic parameter values in each region, made possible by the other simplifications in model structure, is the key to the performance of these scaled GAM models. Additionally, the process of matching each replicate’s maximum growth rate to that of a set of demographic parameter values, rather than simply drawing directly from the priors, may actually extract most of the information available to the more complete Bayesian analysis.
The uniform prior on the relative impact of density dependence on fecundity and pup survival is clear and unambiguous. It is much easier to calculate than a set of intermediate models, and this reflects the current state of ignorance as to the true balance between these factors. While it is straightforward to apply here, it might be harder to justify its combination with formal likelihood-based model selection techniques, such as Akaike’s information criterion , which penalise models for including additional parameters. Such formal model selection techniques are not directly applicable to the approach described here, because it does not estimate an explicit likelihood though model selection techniques are starting to be proposed for similar situations within the framework of approximate Bayesian computation .
Our approach could be seen as a retrograde step, since it does not attempt as complete a description of the system or utilisation of the data, as the state space models. It could also be criticised for its limited predictive and explanatory power. However, any projection of models requires extrapolation, and needs to be done cautiously. For these populations, the most obvious danger would be in the projection of density-dependent effects beyond the range of existing data, which requires a belief that their functional forms have been adequately described. It is also possible that if the state space models were modified in the light of these results, for example, by modifying them to allow adult survival to vary between areas, the precision of their estimates would improve. However, as the most appropriate analysis of datasets will always depend on their size and the availability of resources, this sort of less demanding methodology may also be appropriate for other small datasets.
Collection of aerial survey data is funded by NERC and involves many people at the Sea Mammal Research Unit and elsewhere. Additional data were collected by Natural England, the National Trust, Lincolnshire Wildlife Trust and SNH. The analysis and its interpretation have been discussed with many colleagues at St. Andrews and the members of the NERC Special Committee on Seals. The authors are grateful to them all.
The Supplementary Material (doi:10.1155/2011/597424) contains additional figures and text describing the fitting of the models and their output. This includes the posterior distributions of the parameters and the patterns of density dependence implicit in the results. The R code used to fit the models is also included.
- R. A. Lambert, “The grey seal in Britain: a twentieth century history of a nature conservation success,” Environment and History, vol. 8, no. 4, pp. 449–474, 2002.
- C. F. Summers, “Trends in the size of British Grey Seal populations,” Journal of Applied Ecology, vol. 15, no. 2, pp. 395–400, 1978.
- W. D. Bowen, J. McMillan, and R. Mohn, “Sustained exponential population growth of grey seals at Sable Island, Nova Scotia,” ICES Journal of Marine Science, vol. 60, no. 6, pp. 1265–1274, 2003.
- W. D. Bowen, J. I. McMillan, and W. Blanchard, “Reduced population growth of gray seals at sable island: evidence from pup production and age of primiparity,” Marine Mammal Science, vol. 23, no. 1, pp. 48–64, 2007.
- L. Thomas, S. T. Buckland, K. B. Newman, and J. Harwood, “A unified framework for modelling wildlife population dynamics,” Australian and New Zealand Journal of Statistics, vol. 47, no. 1, pp. 19–34, 2005.
- SCOS 2007, “Scientific advice on matters related to the management of seal populations: 2007,” Reports of the UK Special Committee on Seals, http://smub.st-andrews.ac.uk.
- S. T. Buckland, K. B. Newman, C. Fernández, L. Thomas, and J. Harwood, “Embedding population dynamics models in inference,” Statistical Science, vol. 22, no. 1, pp. 44–58, 2007.
- K. B. Newman, S. T. Buckland, S. T. Lindley, L. Thomas, and C. Fernández, “Hidden process models for animal population dynamics,” Ecological Applications, vol. 16, no. 1, pp. 74–86, 2006.
- K. B. Newman, C. Fernández, L. Thomas, and S. T. Buckland, “Monte carlo inference for state-space models of wild animal populations,” Biometrics, vol. 65, no. 2, pp. 572–583, 2009.
- T. Toni, D. Welch, N. Strelkowa, A. Ipsen, and M. P.H. Stumpf, “Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems,” Journal of the Royal Society Interface, vol. 6, no. 1, pp. 187–202, 2009.
- M. G. B. Blum, “Approximate bayesian computation: a nonparametric perspective,” Journal of the American Statistical Association, vol. 105, no. 491, pp. 1178–1187, 2010.
- S. Koyama, L. Castellanos Pérez-Bolde, C. Rohilla Shalizi, and R. E. Kass, “Approximate methods for state-space models,” Journal of the American Statistical Association, vol. 105, no. 489, pp. 170–180, 2010.
- L. Hiby, D. Thompson, and A. Ward, “Improving census by aerial photography—an inexpensive system based on nonspecialist equipment,” Wildlife Society Bulletin, vol. 15, no. 3, pp. 438–443, 1987.
- S. N. Wood, Generalized Additive Models: An Introduction With R, Chapman and Hall/CRC, 2006.
- S. N. Wood, “Fast stable direct fitting and smoothness selection for generalized additive models,” Journal of the Royal Statistical Society. Series B, vol. 70, no. 3, pp. 495–518, 2008.
- R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2006.
- J. Harwood and J. H. Prime, “Some factors affecting the size of British Grey seal populations,” Journal of Applied Ecology, vol. 15, no. 2, pp. 401–411, 1978.
- SCOS 2008, “Scientific advice on matters related to the management of seal populations: 2008,” Reports of the UK Special Committee on Seals, http://smub.st-andrews.ac.uk.
- T. Toni and M. P. Stumpf, “Simulation-based model selection for dynamical systems in systems and population biology,” Bioinformatics (Oxford, England), vol. 26, no. 1, pp. 104–110, 2010.
- K. P. Burnham and D. R. Anderson, Model Selection and Inference, Springer, New York, NY, USA, 2nd edition, 2001.
Copyright © 2011 Mike Lonergan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.