#### Abstract

Trajectories of drug use are usually studied empirically by following, over time, persons sampled either from the general population (most often youth and young adults) or from heavy or problematic users (e.g., arrestees or those in treatment). The former, population-based samples, describe early career development but miss the years of use that generate the greatest social costs. The latter, selected populations, help to summarize the most problematic use but cannot easily explain how people become problem users, nor are they representative of the population as a whole. This paper shows how microsimulation can synthesize both sorts of data within a single analytical framework while retaining the heterogeneous influences that can affect drug use decisions over the life course. The RAND Marijuana Microsimulation Model is constructed for marijuana use, validated, and then used to demonstrate how such models can evaluate alternative policy options aimed at reducing use over the life course.

#### 1. Introduction

Marijuana is the most widely used illicit substance in the United States, with use rates highest among youth and young adults. Average thirty-day use rates among high school seniors in the United States have been about 20% since 1994, and in 2009, 5.4% of high school seniors reported daily use of marijuana during the past month [1]. Use rates among young adults are also high, with 59% of all individuals between the ages of 19 and 28 reporting having ever used marijuana, 17% reporting use in the past thirty days, and 5.4% reporting daily use of marijuana in the past 30 days [2].

Although a large number of studies have examined risk and vulnerability factors associated with the general use and onset of marijuana, only recently have scientists started examining factors associated with marijuana use careers, the duration and severity of marijuana use, and quit behavior. Current work suggests that age of onset, frequency of use at an early age, drug-using peers, and stressful life events are important factors influencing both duration and the probability of quitting [3–6]. Other work suggests that while these various factors are important, there are alternative trajectories of use that people seem to follow because of heterogeneous experiences [7–10]. The vast majority of these trajectory studies have focused only on the periods of late adolescence and young adulthood during which marijuana use peaks [11–13]. None of these analyses have examined the longer-term implications of marijuana use careers, such as the relationship between marijuana use trajectories and drug treatment or criminal engagement. This is due in large part to the lack of available longitudinal data that captures the full spectrum of drug use and its consequences over the entire life course. Thus, most of what we have learned about initiation, escalation, and general quit behavior is from general population surveys of nonheavy users [3–6], while information about those experiencing chronic use and relapse is from populations who have at some point sought treatment [14–16]. Without a way to connect the information into a singular framework, we are unable to consider and evaluate the relative effectiveness (or cost-effectiveness) of a diverse set of policy proposals.

Markov or population cohort models have been used in the past to combine different data set summaries in order to track entire drug use careers for a population [17–20]. Although Markov models can be useful for understanding questions related to total population consumption or average lifetime consumption, they provide limited insight into the variation in drug use within a population and how or whether individual-level factors or experiences, such as prior drug use or drug treatment history, influence transitions into and out of drug use careers [21]. For example, while one may know that relapse risk for illicit drug use increases with prior history, in a Markov model one may not be able to easily model the dependence on such history given the importance of multiple individual-level factors on the probability of relapse.

Microsimulation modeling offers a way of modeling individual-level drug use trajectories over the life course. More so than Markov modeling, simulation models are able to capture the uncertainty and heterogeneity in factors influencing drug use and its persistence over time by examining how a change in behavior today affects the natural course of outcomes over an individual’s lifetime. Microsimulation models can explicitly capture both the uncertainty of events and the heterogeneity in predictors and outcomes by allowing transition probabilities across drug use states to be functions of multiple individual-level characteristics as well as a stochastic component [22].

In this paper, we present the RAND Marijuana Microsimulation Model, which mimics marijuana use over the life course starting with a cohort of 12-year-olds representative of the United States population of 12-year-olds in 2004. We show that sufficient data sources exist to inform the construction of this model, demonstrate the ability of a microsimulation model to capture uncertainty and heterogeneity in lifetime trajectories of marijuana use, and present evidence for the external validity of the model we have constructed. Our model facilitates a better understanding of use trajectories by synthesizing multiple data sources from different populations. The model assumes a chronic disease perspective in that it recognizes that marijuana use might increase or decrease over time, with periods of no use and relapse [23]. It also retains aspects of individual heterogeneity by allowing transition probabilities between marijuana use states and, say, treatment to be influenced by relevant individual-level characteristics, including prior use history, treatment history, and demographic characteristics.

The rest of the paper is organized as follows. In Section 2, we present the RAND Marijuana Microsimulation Model of the life course of marijuana use for persons in the United States, which follows a cohort from age 12 to age 85 and simulates their individual drug use trajectories. We provide an overview of the data sources we use as model inputs in Section 3, highlighting what the data sources do and do not contain, and explain how that informs our model building. In Section 4, we examine the external validity of the RAND Marijuana Microsimulation Model by comparing key results from the model to findings from existing public use data. In Section 5, we examine some key epidemiological questions. Finally, in Section 6, we demonstrate the power of the model by examining alternative “what if” scenarios that are frequently considered by policy makers to assess the probable effect of each policy on marijuana consumption over the lifetime.

#### 2. Model Construction

The RAND Marijuana Microsimulation Model is used to simulate marijuana use histories for individuals who belong to a cohort of 12-year-olds representative of 12-year-olds living in the household population in the US in 2004. This cohort is then followed in the model throughout their lives. These drug use histories are determined by individual-level characteristics that are fixed, such as gender and race/ethnicity, as well as those that vary over time, such as age and level of marijuana use. The RAND Marijuana Microsimulation Model consists of states, transition probabilities, and a time cycle during which these transitions take place. *States* represent the physical location and drug use proclivity of the individual at a given point in time $t$. As shown in Figure 1, there are four physical locations in our model: (1) in the community, not in treatment; (2) in the community, outpatient treatment; (3) in residential treatment; (4) dead. In addition to these physical locations, an individual’s state is determined by one of four marijuana use proclivities: nonuser, occasional user, regular user, and heavy user.

Marijuana use proclivity represents the individual’s underlying desire to use the substance rather than their current level of use. When an individual is unconstrained and living in the community, his/her observed marijuana use behavior will perfectly reflect his/her current marijuana use proclivity (underlying desire). If, however, the individual is in a constrained environment, then his/her underlying marijuana use proclivity may exceed his/her actual observed drug use at time $t$. For example, a heavy user entering treatment in period $t$ may be observed not using drugs while in treatment, but that does not mean his/her natural proclivity is to be a nonuser. In actuality, s/he is still a heavy user who is being constrained externally not to use. Upon exit from treatment, s/he may remain abstinent for some time afterward. However, by having a natural proclivity as a heavy user, s/he will have a higher probability of relapsing into use until s/he has sustained abstinence for some specified length of time. Hence, one’s natural proclivity for drugs can and will change over the lifetime. Transitions to higher and lower proclivities will occur based on one’s marijuana using experience and length of time consuming marijuana at a given level.

*Transition probabilities* are used to stochastically move an individual from one state (physical location and proclivity) to another state between time $t$ and time $t+1$. These transition probabilities will vary according to the state in which an individual resided in period $t$ and may also depend on individual characteristics (e.g., age, gender, and race/ethnicity) and past history of the individual (e.g., age of first marijuana use). For example, changing from heavy use proclivity in the community to a regular use proclivity in the community may be a function of the length of time since the individual last used marijuana and his or her age of initiation. Transitions are generated as random draws from stochastic distributions, the form and parameters of which are determined from regression analyses or cross-tabulations of relevant data from the target populations. Conditional upon a transition in marijuana use taking place at a given age, the individual’s life history will be altered from that point onward, and subsequent probabilities for state transitions will be conditional upon current marijuana use proclivity.

Individuals are allowed to transition between combined physical location and proclivity states from time $t$ to time $t+1$; however, the data sources required to parameterize our model are available separately for the transitions among physical location states and for the transitions among proclivity states, conditional on location. Because of these data considerations, we implement the state transitions in two steps: first, we model the probability of entering each physical location as a function of individual prior states, demographics, and personal history; second, conditional on having transitioned into a physical location at time $t+1$, we implement the transition among marijuana use proclivity states. For instance, the probability of a heavy marijuana user who is in the community at time $t$ entering residential treatment and having a regular proclivity to use at time $t+1$ would involve calculating the following pair of probabilities:

(i) $\Pr(\text{enter residential treatment at time } t+1 \mid \text{in community at time } t) = f(\text{demographics, heavy marijuana use proclivity at time } t)$;

(ii) $\Pr(\text{regular marijuana use proclivity at time } t+1 \mid \text{in residential treatment at time } t+1, \text{heavy marijuana use proclivity at time } t) = g(\text{demographics, marijuana use proclivity at times } 1, \ldots, t)$.
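The two-step transition described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the lookup-table representation of the probabilities are assumptions, and in the actual model the probabilities would also depend on demographics and personal history.

```python
import random

def draw(probs, rng):
    """Draw one outcome from a dict of {outcome: probability}."""
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def step(location, proclivity, location_probs, proclivity_probs, rng):
    """Advance one person by one quarter using the two-step scheme.

    location_probs[(location, proclivity)] -> {next_location: prob}
    proclivity_probs[(next_location, proclivity)] -> {next_proclivity: prob}
    Both tables are hypothetical stand-ins for the fitted models.
    """
    # Step 1: draw the physical location at time t+1.
    new_location = draw(location_probs[(location, proclivity)], rng)
    if new_location == "dead":
        return new_location, None  # no proclivity after death
    # Step 2: draw the proclivity at t+1, conditional on the new location.
    new_proclivity = draw(proclivity_probs[(new_location, proclivity)], rng)
    return new_location, new_proclivity

if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative probabilities only; not estimates from the paper.
    location_probs = {
        ("community", "heavy"): {
            "community": 0.97, "outpatient": 0.02,
            "residential": 0.005, "dead": 0.005,
        }
    }
    proclivity_probs = {
        ("community", "heavy"): {"regular": 0.1, "heavy": 0.9},
        ("outpatient", "heavy"): {"heavy": 1.0},
        ("residential", "heavy"): {"heavy": 1.0},
    }
    print(step("community", "heavy", location_probs, proclivity_probs, rng))
```

Drawing the location first mirrors the way the data are available: location transitions and proclivity transitions are parameterized from different sources and combined only at simulation time.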

In our model, each *time cycle* (or time step) is represented by one quarter (3 months). One quarter has an important meaning in treatment settings, as longer lengths of stay (typically greater than 3 months) have been associated with improved posttreatment outcomes [27].

Figure 1 summarizes the model structure. The rows represent all 13 possible combined location-proclivity states at any given time $t$. The state transitions that are possible when moving to time $t+1$ are highlighted. For example, nonusers in the community at time $t$ have nonzero probabilities of transitioning into occasional use at time $t+1$, but the model restricts nonusers from entering treatment. In contrast, a regular user who is in the community at time $t$ can remain in the community, either as a regular user or with a different proclivity (decreasing to occasional use or increasing to heavy use), or can die or enter either outpatient or residential treatment. The “NAs” in Figure 1 indicate transitions that are not possible. Note that an individual’s proclivity to use drugs will not change substantially simply because of a change in physical location. For example, when a heavy user living in the community transitions into residential treatment, his/her proclivity will remain at the heavy user level for at least one time period even if the observed marijuana use behavior in the next period drops to zero.

#### 3. Data Sources and Model Parameters

Table 1 lists the parameters of the RAND Marijuana Microsimulation Model, summarizing how their values were determined as well as their data sources. Several key model parameters vary with respect to individual characteristics, as elaborated upon below.

*Marijuana Use Proclivity*

Since data sources on marijuana use proclivity states were only available at the annual level, we modeled marijuana use transitions at the annual level and converted annual transitions to quarterly transitions in the microsimulation model by randomly selecting a plausible set of quarterly transitions associated with each annual transition, under the assumption that marijuana use proclivity could change by at most one level per quarter.
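One way to picture the annual-to-quarterly conversion is to enumerate every four-quarter path that respects the one-level-per-quarter constraint and ends at the annual target, then pick one at random. The sketch below assumes a uniform choice among feasible paths and integer-coded levels; neither detail is stated in the text, and the function names are hypothetical.

```python
import itertools
import random

LEVELS = 4  # 0 = nonuser, 1 = occasional, 2 = regular, 3 = heavy

def feasible_paths(start, end, quarters=4):
    """All quarterly paths from `start` to `end` that move at most one
    level per quarter and stay within the valid range of levels."""
    paths = []
    for steps in itertools.product((-1, 0, 1), repeat=quarters):
        level, path = start, []
        for s in steps:
            level += s
            if not 0 <= level < LEVELS:
                break  # path leaves the valid range; discard it
            path.append(level)
        else:
            if level == end:
                paths.append(tuple(path))
    return paths

def quarterly_path(start, end, rng):
    """Pick one feasible path uniformly at random (an assumption)."""
    return rng.choice(feasible_paths(start, end))

if __name__ == "__main__":
    rng = random.Random(7)
    # Annual transition from nonuser (0) to regular user (2).
    print(quarterly_path(0, 2, rng))
```

Because the largest possible annual change is three levels and a year has four quarters, every annual transition has at least one feasible quarterly path.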

The National Longitudinal Survey of Youth (NLSY97) is a nationally representative data set of approximately 9,000 youths who were 12 to 16 years old as of December 31, 1996, and who are followed up annually. For each age, we fit an ordinal logistic regression model to predict individual-level marijuana use proclivity states in the current year: no use in past year; occasional use (≤2 times in past month); regular use (3–19 times in past month); heavy use (≥20 times in past month). Predictors included in each model are marijuana use in the previous year, gender, race/ethnicity, educational level (less than high school, high school diploma, some college, and college degree), and age of first use of marijuana. We then incorporated the resulting ordinal logistic regression prediction equations into the microsimulation model in order to simulate marijuana use proclivity transitions for 15–24-year-olds.
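To show how an ordinal logistic prediction equation yields probabilities for the four proclivity states, the sketch below uses the common proportional-odds form, in which the cumulative probability of being at or below category k is logistic(cutpoint_k minus the linear predictor). The parameterization, cutpoints, and coefficients are illustrative assumptions, not the fitted NLSY97 values, and sign conventions differ across statistical packages.

```python
import math

STATES = ["nonuser", "occasional", "regular", "heavy"]

def linear_predictor(coefficients, covariates):
    """Sum of coefficient * covariate over hypothetical named predictors."""
    return sum(coefficients[name] * value for name, value in covariates.items())

def ordinal_probs(cutpoints, eta):
    """Category probabilities under a proportional-odds model.

    P(Y <= k) = 1 / (1 + exp(-(cutpoint_k - eta))); category probabilities
    are differences of consecutive cumulative probabilities.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)  # the top category closes the distribution
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

if __name__ == "__main__":
    # Illustrative values only.
    coefficients = {"heavy_last_year": 2.0, "male": 0.3}
    person = {"heavy_last_year": 1, "male": 1}
    eta = linear_predictor(coefficients, person)
    for state, p in zip(STATES, ordinal_probs([0.5, 1.5, 2.5], eta)):
        print(f"{state}: {p:.3f}")
```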

Because very few observations are available for youth aged 12–14 in NLSY, we used data from the National Survey on Drug Use & Health (NSDUH) to estimate drug use proclivity transition probabilities for this age group. We modeled transitions for 12–14-year olds in two steps. First, we estimated the probability of initiating marijuana use at each age using synthetic cohorts from NSDUH from years 2002–2004. We used synthetic cohorts since individual-level data on whether marijuana initiation occurred during the past year is lacking for individual observations. The second step was to simulate drug use proclivity for those who have initiated marijuana ever. Given the very low numbers of respondents who reported ever having used marijuana in this age group in NSDUH, we combined estimates of the marijuana use proclivity levels from NSDUH 2002–2004 to stabilize our estimates of the cross-sectional probabilities of the marijuana use proclivities. We then applied these use probabilities equally to marijuana initiates regardless of individual level characteristics and past history to obtain drug use proclivity for ages 12–14.

The NLSY cohort data we use only cover ages 15–24, so we used a different strategy to obtain marijuana use proclivity transition probabilities for adults aged 25–85. We used data on marijuana initiation from the 1994–1998 National Household Survey on Drug Abuse (NHSDA) to fit an age-period-cohort (APC) model to predict marijuana initiation at each age, controlling for period and cohort effects [28–30]. (The 1994–1998 NHSDA surveys were used instead of more recent surveys because the more recent surveys report age in unevenly divided, precategorized bins, which makes them unsuitable for age-period-cohort statistical modeling.) We fit a Poisson regression model with a log link function to the marijuana initiation counts $y_{apc}$ for each age ($a$), period ($p$), and cohort ($c$) group, and included $\log(n_{apc})$ as an offset term to adjust for variation in the number of persons $n_{apc}$ eligible to initiate in each APC group:

$$\log E[y_{apc}] = \log(n_{apc}) + \alpha_a + \beta_p + \gamma_c,$$

where $\alpha_a$, $\beta_p$, and $\gamma_c$ represent the effects of belonging to age group $a$, period group $p$, and cohort group $c$. We included age in the model as a set of dummy variables covering 3-year age categories, starting with ages 12–14 and ending with ages 48–50 (given the rarity of initiates over age 50, we assumed zero marijuana initiation over age 50), with ages 21–23 as the reference (holdout) category. We transformed the age-specific Poisson regression coefficient estimates of marijuana initiation to risk ratios (RRs) for age group $a$ by computing $\mathrm{RR}_a = \exp(\alpha_a)$. This allowed us to compute the probability of initiation among those aged 25 and older in age group $a$ as $\mathrm{RR}_a$ times the probability of initiation among 21–23-year-olds, after controlling for period and cohort effects.
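The risk-ratio step amounts to exponentiating an age coefficient and scaling the reference-group probability. A minimal sketch, with made-up numbers standing in for the fitted coefficient and for the initiation probability among 21–23-year-olds:

```python
import math

def initiation_probability(age_coefficient, reference_probability):
    """Initiation probability for an age group, given its Poisson
    regression coefficient relative to the 21-23 reference category."""
    risk_ratio = math.exp(age_coefficient)
    return risk_ratio * reference_probability

if __name__ == "__main__":
    # Hypothetical inputs: a coefficient of -1.2 for an older age group
    # and a 2% annual initiation probability in the reference group.
    print(initiation_probability(-1.2, 0.02))
```

A coefficient of zero reproduces the reference-group probability, which is why the reference category serves as the anchor for all other age groups.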

Then, we obtained from NSDUH 2004 the cross-sectional probabilities of marijuana use proclivity states for persons aged 25 and older who ever initiated. Since we lacked individual-level data on transitions across these states, we assumed a correlation structure between individual-level marijuana use at year $t$ and year $t+1$. (According to information from the MTF and NSDUH data, marijuana initiation is rare after the age of 24. The mean age of first use for marijuana according to the 2007 NSDUH survey is 17.8 years (SAMHSA, 2008).) For persons aged 25–49, we assumed the correlation of annual marijuana use proclivity states from year to year would be similar to that found between ages 23 and 24 in the NLSY. We assumed that individual-level annual transitions would be more strongly correlated from year to year when individuals were 50 years of age and older, and thus we used the correlation observed from *quarterly* transitions among 24-year-olds for annual transitions among those aged 50 and older. We then used iterative proportional fitting [31] to estimate marijuana use proclivity transition probabilities when moving from age $a$ to age $a+1$ such that the cross-sectional probabilities found in NSDUH 2004 for ages $a$ and $a+1$ were preserved.
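Iterative proportional fitting can be illustrated with a small pure-Python sketch. It starts from a seed table that carries the assumed year-to-year association, rescales rows and columns in turn until the table matches the two cross-sectional distributions, and then normalizes each row into transition probabilities. The seed values and marginals below are invented for illustration, and the function names are hypothetical.

```python
def ipf(seed, row_targets, col_targets, iterations=200):
    """Rescale `seed` so its row sums match `row_targets` (distribution at
    the current age) and column sums match `col_targets` (next age)."""
    table = [row[:] for row in seed]
    n_rows, n_cols = len(table), len(table[0])
    for _ in range(iterations):
        for i in range(n_rows):  # match row marginals
            row_sum = sum(table[i])
            table[i] = [v * row_targets[i] / row_sum for v in table[i]]
        for j in range(n_cols):  # match column marginals
            col_sum = sum(table[i][j] for i in range(n_rows))
            for i in range(n_rows):
                table[i][j] *= col_targets[j] / col_sum
    return table

def to_transition_matrix(joint):
    """Convert a joint table into row-conditional transition probabilities."""
    return [[v / sum(row) for v in row] for row in joint]

if __name__ == "__main__":
    # Illustrative seed with strong persistence on the diagonal.
    seed = [
        [0.90, 0.06, 0.03, 0.01],
        [0.20, 0.60, 0.15, 0.05],
        [0.10, 0.20, 0.55, 0.15],
        [0.05, 0.10, 0.25, 0.60],
    ]
    current_age = [0.70, 0.15, 0.10, 0.05]  # hypothetical marginals
    next_age = [0.72, 0.14, 0.09, 0.05]
    for row in to_transition_matrix(ipf(seed, current_age, next_age)):
        print([round(p, 3) for p in row])
```

Applying the resulting transition matrix to the current-age distribution reproduces the next-age distribution, which is the preservation property the fitting step is meant to guarantee.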

*Residential and Outpatient Drug Treatment*

We estimated the probabilities of transitioning from the community into treatment by first estimating the number of individuals who enter treatment for marijuana use each year. For this calculation, we make use of data from the Substance Abuse and Mental Health Services Administration’s (SAMHSA’s) Treatment Episode Data Set (TEDS), a comprehensive treatment admissions data set collected from the states on admissions to treatment offices that receive public funding (through block grants, government support, and/or publicly insured patients). Unfortunately, the TEDS records are episode-based admissions, and it is not possible to identify information on prior treatment episodes (aside from whether the patient had any) in these data. Thus, we augment information from these national data with information from a single state (Oregon), which tracks information on patients entering treatment over time.

We multiply the annual number of marijuana treatment admissions in the 2005 TEDS data by our estimate from the Oregon treatment data of the proportion of these admissions that are for outpatient (92.8%) versus residential (7.2%). We then divide these numbers by the population of regular and heavy users estimated from the 2005 NSDUH data to get the fraction of regular and heavy users estimated to go into both outpatient and residential treatment. Finally, we convert these to quarterly figures by dividing by four. We also use the Oregon data to estimate the number of quarterly admissions that reflect double-counting of individuals—32.2% for residential and 6.1% for outpatient—and reduce the number of residential and outpatient quarterly admissions, respectively, by these amounts in order to estimate the number of individuals going to treatment each quarter.
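The arithmetic in this paragraph can be traced step by step. In the sketch below, the outpatient/residential shares and the double-counting fractions come from the text; the admission count and the number of regular and heavy users are placeholders, since the TEDS and NSDUH totals are not reported here.

```python
def quarterly_entry_probability(annual_admissions, modality_share,
                                regular_heavy_users, repeat_fraction):
    """Per-quarter probability that a regular or heavy user enters a
    given treatment modality, following the steps described above."""
    admissions = annual_admissions * modality_share   # split by modality
    annual_rate = admissions / regular_heavy_users    # per eligible user
    quarterly_rate = annual_rate / 4                  # annual to quarterly
    return quarterly_rate * (1 - repeat_fraction)     # remove double counts

if __name__ == "__main__":
    # Placeholder inputs, NOT the TEDS or NSDUH figures.
    admissions_2005 = 300_000
    users_2005 = 6_000_000
    outpatient = quarterly_entry_probability(
        admissions_2005, 0.928, users_2005, 0.061)
    residential = quarterly_entry_probability(
        admissions_2005, 0.072, users_2005, 0.322)
    print(outpatient, residential)
```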

To estimate the average length of time in treatment, we used administrative data from Oregon, years 1996–2008, on publicly funded admissions to residential and outpatient drug treatment programs for marijuana abuse and dependence. We fit Weibull regression models to these data to develop predictions of the lengths of stay for clients in residential and outpatient marijuana treatment settings that varied by age, gender, race/ethnicity, age of first marijuana use, number of previous episodes of treatment (if any), and cumulative length of time spent in outpatient and residential treatment. We similarly fit multinomial logistic regression models to derive probabilities of transitioning from the current treatment episode to another episode (whether outpatient or residential) or to the community, given current physical location, with these probabilities also varying by the aforementioned characteristics. We applied these probabilities to individuals whose predicted length of stay in treatment, as determined by the Weibull model, had been reached in the current time cycle.
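As a rough illustration of how a Weibull-based length of stay could feed a quarterly simulation, the sketch below draws a stay in days and rounds it up to whole quarters. The shape and scale values, the days-per-quarter conversion, and the rounding rule are all assumptions; in the fitted model the scale would vary with the client characteristics listed above.

```python
import math
import random

DAYS_PER_QUARTER = 91.25  # assumption: 365 days / 4

def length_of_stay_quarters(shape, scale_days, rng):
    """Draw a Weibull length of stay (days) and convert it to the number
    of quarterly time cycles spent in treatment (at least one)."""
    # random.weibullvariate takes the scale first, then the shape.
    days = rng.weibullvariate(scale_days, shape)
    return max(1, math.ceil(days / DAYS_PER_QUARTER))

if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical parameters for an outpatient episode.
    print([length_of_stay_quarters(1.2, 100.0, rng) for _ in range(10)])
```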

Treatment effectiveness is defined in terms of one-year recovery rates (abstinent for at least a year) and is presumed in the model to be 20%, based on findings from two large randomized clinical trials of adolescents in treatment specifically for marijuana. The Cannabis Youth Trial (CYT), the largest cannabis use disorder treatment trial to date, found that 12%–16% of youth were in recovery (abstinent for at least a year) [24]. A more recent randomized trial of adolescent outpatient treatment showed an overall recovery rate of about 20% [25]. This higher recovery rate is consistent with a study of treatment for cannabis-dependent adults, which yielded an abstinence rate of 23% at 15 months [26].

*Death*

We used population-based life table data for the United States [32] to model mortality. Data were available on mortality for white men, white women, black men, and black women. The white mortality life table data were used to model mortality for other nonwhite persons. Since mortality data are only available separately from data on transitions to other physical locations, we first estimate the probabilities of death for each person who is alive in the simulation at time $t$; next, we estimate transition probabilities for entering the other physical location states for those persons who are still alive at time $t+1$.

*Uncertainty in Microsimulation Parameters*

It is important to account for variation in our simulation output when validating the model and obtaining estimates based on the model. For example, to compare simulated marijuana use trajectories to those found in the literature (i.e., the “validation data”), one needs to know how “close” these two estimates are to one another. If we were to run the model just once, we could simulate a marijuana use trajectory that does not match the validation data. However, this could be a false negative in that a second run of the simulation model could yield a different outcome and match the validation data. The variation exists because of the uncertainty that we build into the model (stochastic events) as well as the uncertainty of the parameters predicting the likelihood of these events. Thus, to account for uncertainty in our simulation model output, we implement Monte Carlo simulation, which involves repeatedly (500 times) and randomly drawing parameter estimates used as inputs to the model from their estimated sampling distributions, running the model once for each set of sampled parameter estimates, and then summarizing uncertainty in microsimulation-based estimates with approximate 95% uncertainty intervals. This will allow us to compare our microsimulation results to external data for validation purposes and to report uncertainty in model estimates that is a function of the uncertainty in model inputs.
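The Monte Carlo procedure can be sketched with a deliberately tiny stand-in model. Here a single prevalence parameter is drawn from an assumed normal sampling distribution on the logit scale, a toy cohort is simulated once per draw, and the 2.5th and 97.5th percentiles of the outputs form the interval. The one-parameter model, the sampling distribution, and all names are illustrative assumptions; the actual model draws many parameters jointly.

```python
import math
import random

def run_model(p_use, cohort_size, rng):
    """Toy stand-in for one microsimulation run: share of a cohort using."""
    users = sum(rng.random() < p_use for _ in range(cohort_size))
    return users / cohort_size

def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list."""
    position = q * (len(sorted_values) - 1)
    lower, upper = math.floor(position), math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction

def uncertainty_interval(logit_mean, logit_se, n_draws=500,
                         cohort_size=1000, seed=1):
    """Approximate 95% uncertainty interval from repeated parameter draws."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_draws):
        logit = rng.gauss(logit_mean, logit_se)  # parameter uncertainty
        p_use = 1.0 / (1.0 + math.exp(-logit))
        outputs.append(run_model(p_use, cohort_size, rng))  # stochastic run
    outputs.sort()
    return percentile(outputs, 0.025), percentile(outputs, 0.975)

if __name__ == "__main__":
    print(uncertainty_interval(logit_mean=-0.7, logit_se=0.15))
```

The interval reflects both sources of variation named in the text: uncertainty in the drawn parameter and the stochastic events within each run.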

#### 4. External Validation of Microsimulation Model

No long-term longitudinal data following a nationally representative cohort of marijuana users exist. That is precisely why modeling techniques are useful. However, it also makes external validation challenging. Nevertheless, there are several ways we can partially externally validate our model. The first is to compare short-term outcomes predicted from the model (i.e., age-specific prevalence rates) with those from national data sources not used to parameterize the model. For example, marijuana use proclivity transition probabilities for those aged 15–24 are based on regression models fit to data from the NLSY97 survey, while NSDUH was used to estimate marijuana use proclivities for other ages. Using prediction models based on these data that are updated each year as the sample ages, our model can generate age-specific prevalence rates. These prevalence rates can then be compared to those from nationally representative data sets not used in the model parameterization, such as Monitoring the Future (for high school seniors) and the National Survey on Drug Use and Health (NSDUH) (for those aged 15–24). One challenge with making comparisons to such nationally representative data sets is that they report slightly different use rates. For example, both NSDUH and Monitoring the Future (MTF) collect data on marijuana use, but NSDUH marijuana use figures are consistently lower than those from the MTF. This could be due to underreporting in the household setting or to differences in sampling frames; for example, MTF does not survey school dropouts, a group that NSDUH has shown to have different rates of illicit drug use than those who remain in school [33]. We, therefore, compare our model results to both of these data sources for external validation.

Figure 2 shows how annual prevalence rates from our microsimulation model compare to prevalence estimates from the 2002 NSDUH for ages 12–30. The 2002 NSDUH is used because it coincides with the simulated population reaching age 18 based on NLSY97 data (in 1997 the population was 11-12 and by 2002 they would be 17-18). This is a key age, as evidence from the NSDUH (2007) suggests that 17.4 is the average age of initiation among those who use marijuana. Thus, we know we should capture roughly half of all initiation by age 18. Annual prevalence rates for ages 12–14 are indeed fairly consistent between the two, as is to be expected given that the model’s transitions at ages 12–14 were constructed from a very similar data source, the 2004 NSDUH. What is relevant, however, is how closely the microsimulation prevalence rates mirror the NSDUH 2002 age-specific prevalence rates at the other ages.

The microsimulation model also lets us consider whether prevalence rates among key demographic groups are consistent with those observed in other national data. Figure 3 compares the microsimulation annual prevalence results at age 18 by race and gender to data on high school seniors from the 2002 Monitoring the Future Survey (MTF), with the MTF figures reflecting self-reported past-year marijuana use. Here it can be seen that the microsimulation does well at capturing differences in use by race and gender. Estimates from the microsimulation model are indeed consistent with the MTF estimates across all demographic groups. Though there is no widely accepted level for establishing the equivalence of prevalence rates across demographic groups, it is helpful to note that the differences between the model results and MTF are smaller than differences that are considered by substance abuse treatment providers to be clinically meaningful [34]. Thus, the microsimulation model appears to be doing well, as indicated by its ability to match external data not used in the model, capturing both population prevalence rates and important variation among key demographic groups within the population.

#### 5. Results

Table 2 presents the average person-years of use overall and by race/ethnicity and gender for our simulated population. Overall, Whites use marijuana for the longest period of time relative to Blacks and Hispanics (8.8 years versus 8.2 and 8.2 years, respectively), with the average use career for males being 9.1 years and for females 8.5 years. Table 2 also shows that there is stability in the person-years of marijuana use across race/ethnicity and gender categories within a level of use. In general, females at any level of use consume for shorter periods than males, and Hispanics and Blacks consume for shorter periods than Whites. The ethnic/gender variation is most pronounced among heavy users, with Whites having on average 0.8 additional person-years of marijuana use than Blacks or Hispanics, and males having 0.6 more person-years of use than females.

Table 3 presents the percent of the population cohort of 12-year olds who ever become regular or heavy users overall and by race/ethnicity and gender categories. Again we see that racial and ethnic minorities are less likely to become regular and/or heavy users than Whites. For example, 26.9% of Whites are at some point in their life heavy marijuana users, versus 20.0% for Blacks and 19.0% for Hispanics. A larger percentage of males versus females also become heavy marijuana users (27.6% versus 21.3%).

Figure 4(a) shows that the probability of being a heavy user of marijuana at each age, conditional on heavy use in the previous year, steadily increases from near zero at age 14 to about 0.75 by age 25, then levels off at about 0.6 until age 34. After age 34, the probability of being a heavy user begins to decline again, but then becomes highly variable by age 40, as indicated by the increasing width of the 95% uncertainty intervals, shown as horizontal segments. The substantial uncertainty in the estimates beyond age 40 reflects the uncertainty in the input data given the decreasing numbers of persons in the target population who are heavy marijuana users, and it emphasizes how difficult it is to address this question for persons over 40 with the available input data. Similarly, Figure 4(b) shows greater continuity of marijuana use among younger people, with regular/occasional marijuana users aged 13–35 having probabilities around 0.65 of continuing to use at this proclivity level one year later. After age 35, the average probabilities of continuing to use at this proclivity level are close to 0.5, but again become extremely variable for individuals over the age of 50.

*Figure 4: panels (a) and (b).*

The top row of Table 4 provides some additional findings from the baseline model, which we will use as a basis of comparison for some later policy simulations. The first column reports the average number of use years of marijuana for the entire cohort (including never users) over the life course (5.3 years). Because a large share of the cohort never initiates marijuana use, these nonusers pull down the average number of use years. The average duration of use among those who initiate marijuana is 8.8 years (seen previously in Table 2 as well). The next two columns report the prevalence of any marijuana use and heavy marijuana use at three key ages (18, 19, and 20) for the cohort, showing that at these ages approximately one-third of the cohort reports using marijuana and one in 14 reports doing so heavily. Interestingly, the rate of heavy use appears to rise between ages 18 and 20 (from 6.1% to 8.1%) even though prevalence of any use does not.

In the last column of Table 4, we show the number of treatment episodes per 10,000 in the study population cohort (overall and then among heavy users). The results in the model suggest that the total number of episodes per 10,000 in the population is 524. However, it appears that very few people in our cohort have repeat treatment episodes: the model predicts only 419 individuals receiving treatment per 10,000 in the study population cohort, or an average of 1.25 treatment episodes per treated user (results not shown). This suggests that about 20% of those in treatment have more than one treatment episode for marijuana specifically. If we narrow the analysis to just those individuals who use marijuana heavily at some point during their lifetime, we see that, as one might expect, this group experiences a much larger number of treatment episodes over the life course (1640 episodes per 10,000 heavy users). Interestingly, however, heavy users of marijuana are no more likely to experience multiple episodes of treatment for marijuana, as the average number of treatment episodes for heavy users is 1.27 (versus 1.25 for all users).
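The arithmetic behind the repeat-treatment figures can be reproduced directly from the two counts quoted above. The sketch also computes an upper bound on the share of treated persons with a repeat episode, which is reached only under the added assumption that nobody has more than two episodes. The variable names are illustrative.

```python
# Counts quoted in the text, per 10,000 cohort members.
episodes_per_10k = 524
treated_persons_per_10k = 419

# Average number of episodes per person who is ever treated.
episodes_per_treated_person = episodes_per_10k / treated_persons_per_10k

# Episodes beyond each treated person's first one.
extra_episodes = episodes_per_10k - treated_persons_per_10k

# If every repeater had exactly two episodes, each extra episode would belong
# to a distinct person; repeaters with three or more episodes lower the share.
max_repeat_share = extra_episodes / treated_persons_per_10k

print(f"Episodes per treated person: {episodes_per_treated_person:.2f}")
print(f"Upper bound on share with a repeat episode: {max_repeat_share:.1%}")
```

The bound of roughly one quarter is consistent with the stated figure of about 20% if some repeaters have three or more episodes.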

#### 6. Policy Simulations

While microsimulation models can be valuable for building use trajectories that would not otherwise be fully observed if one had to rely on existing data sets, as has been done thus far, the real strength of these models is their ability to explore the implications of alternative “what if” scenarios. The “what if” scenarios may be hypothetical or may closely approximate real policy changes that have been proposed, and their results can be explored from both short-run (i.e., immediate effect) and long-run (effects over the life course) perspectives. “What if” scenarios are useful for understanding how sensitive important outcomes might be to policy changes. The population outcomes associated with these scenarios or policy changes can then be compared to the world as it actually is, to see what the plausible impacts of these changes are, even given the uncertainty seen in the real world.

We make use of this feature by considering three alternative “what if” scenarios that have frequently been debated in policy circles. Each of these is compared to our baseline scenario discussed thus far, where the baseline effectiveness of marijuana treatment is 20% and prevention is not explicitly considered.

The first scenario takes a much more favorable view of marijuana treatment, assuming an overall effectiveness (one-year recovery) rate of 35%. This is based on findings reported in two papers suggesting that enhancing treatment, through voucher incentives and behavioral skills training, resulted in a 35% abstinence rate at the end of treatment [35, 36]. The CYT also evaluated five short-term outpatient psychosocial interventions and reported rates of no drug use in the past 30 days and other criteria of up to 34%. If the CYT treatment arms could be enhanced, possibly through booster sessions to maintain effects over a longer period of time, then a general recovery rate of 35% may not be unrealistic. We, therefore, use this alternative assumption of a 35% effectiveness rate as our first “what if” scenario.

A second “what if” scenario maintains the baseline treatment effectiveness estimate of 20% but increases the availability of treatment by 10%. The availability of treatment is an important policy lever, since many studies have shown drug treatment to be cost-effective in reducing drug-related crime, such as in the case of cocaine [37]. There is also public support for increased treatment availability, as illustrated by the consistent funding by the Federal Government for State and Community Grants to expand treatment and develop drug courts.

Our final scenario considers prevention as an alternative to funding additional treatment. Treatment effectiveness remains as in the baseline model (20%), but now we presume that a universal, school-based primary drug prevention program is implemented and delivered to students aged 12–14. A synthesis of the literature on such model (evidence-based) programs found that they reduced the prevalence of marijuana use at one-year follow-up by 10.9% [38]. Although marijuana prevalence rates return to baseline levels by age 18 following implementation of these programs, the delays in initiation have shown universal school-based drug prevention to be cost-effective over the long term [38].
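The three scenarios and the baseline differ in only a handful of parameters, which can be summarized as a small configuration sketch. The class, field names, and helper below are hypothetical and are not taken from the RAND Marijuana Microsimulation Model. The values are those stated in the text, and how each one enters the model's transition probabilities is not specified here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for the policy levers described in the text."""
    name: str
    treatment_effectiveness: float  # one-year recovery rate after treatment
    treatment_availability: float   # multiplier relative to baseline availability
    prevention_reduction: float     # reported reduction in prevalence at one-year
                                    # follow-up for students aged 12-14 (0 = no program)


BASELINE = Scenario(name="baseline", treatment_effectiveness=0.20,
                    treatment_availability=1.00, prevention_reduction=0.0)

SCENARIOS = [
    Scenario(name="enhanced treatment", treatment_effectiveness=0.35,
             treatment_availability=1.00, prevention_reduction=0.0),
    Scenario(name="expanded treatment access", treatment_effectiveness=0.20,
             treatment_availability=1.10, prevention_reduction=0.0),
    Scenario(name="school-based prevention", treatment_effectiveness=0.20,
             treatment_availability=1.00, prevention_reduction=0.109),
]


def changed_parameters(scenario, baseline=BASELINE):
    """Return {field: (baseline value, scenario value)} for fields that differ."""
    return {
        f.name: (getattr(baseline, f.name), getattr(scenario, f.name))
        for f in fields(scenario)
        if f.name != "name" and getattr(scenario, f.name) != getattr(baseline, f.name)
    }


for s in SCENARIOS:
    print(s.name, changed_parameters(s))
```

Each scenario changes exactly one lever relative to the baseline, which is what allows differences in the simulated outcomes to be attributed to that lever.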

The impact of each of the three scenarios on a few baseline metrics from the RAND Marijuana Microsimulation Model is reported in the second through fourth rows of Table 4. The average duration of marijuana use, both for all persons in the population cohort and for just those who ever initiate marijuana, is essentially constant across all three alternative policy scenarios, with a very small decrease for the school-based drug prevention scenario. The prevalence of marijuana use among 18–20-year-olds in the population cohort is also rather similar, with the prevention scenario resulting in very small reductions in prevalence at all three ages, as would be expected if marijuana prevention had any long-term gains. The larger difference, however, is in the prevalence of heavy marijuana use across these three ages. The prevention scenario suggests lower rates of heavy use at all three ages. As treatment is presumed to be the result of heavy use, it is not surprising that the two treatment scenarios do not generate large reductions in the prevalence of heavy use for these young users, as many of them have just started using at this higher level (and hence have not yet been exposed to treatment). The most noticeable difference across the policy scenarios pertains to the number of treatment episodes. The number of treatment episodes per 10,000 in the study population cohort is 473 when school-based drug prevention is implemented, whereas changing the effectiveness of drug treatment from 20% to 35% results in a smaller reduction (to only 503). It is not terribly surprising that increasing treatment effectiveness does not produce a bigger reduction in the number of treatment episodes, given that only about 20% of those who go to treatment experience more than one treatment episode in our base model.
If we were evaluating a different substance with a much higher relapse rate (e.g., cocaine or heroin), increasing treatment effectiveness would be expected to have a much larger impact, possibly greater than that of prevention. Finally, expanding access to treatment does little in our model to reduce prevalence rates or rates of heavy use at ages 18–20, perhaps because only a small proportion of treatment episodes occur at these young ages in our model. However, expanding treatment does increase the number of treatment episodes, as one would expect. What is most surprising is that this policy has no effect on the average number of years marijuana is used in the cohort. If treatment were more effective or more people had access to it, one would expect the years of marijuana use to decline. This might not be the case, however, because it is not just heavy users who use marijuana for long durations. It appears in our model that a relatively large share of the cohort members who use marijuana do so at a relatively low level over a long period of time. Hence, expanding treatment does nothing to reduce the duration of marijuana use for the entire cohort.
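The mechanism described above, in which treatment reaches only heavy users and therefore barely moves cohort-wide years of use, can be illustrated with a toy annual-step simulation. This is a minimal sketch and not the RAND model: every probability is a made-up placeholder, quitting is treated as permanent (no relapse), and the use of a shared random seed to compare scenarios is a design choice made here for illustration.

```python
import random

# All probabilities are hypothetical placeholders, not estimates from the paper.
P_INITIATE = 0.08   # annual chance a never-user starts (ages 12-25 only)
P_ESCALATE = 0.10   # annual chance a regular user becomes a heavy user
P_QUIT = 0.12       # annual chance of quitting without treatment
P_TREATED = 0.05    # annual chance a heavy user enters treatment


def mean_use_years(treatment_effectiveness, n_people=2000, seed=0):
    """Mean years of use per cohort member in a toy annual-step simulation."""
    rng = random.Random(seed)  # reusing the seed gives every scenario the same draws
    total_years = 0
    for _ in range(n_people):
        state = "never"
        for age in range(12, 51):
            # Draw all three uniforms every year so the random streams stay
            # aligned across scenarios regardless of the person's state.
            u_move, u_treat, u_recover = rng.random(), rng.random(), rng.random()
            if state == "never":
                if age <= 25 and u_move < P_INITIATE:
                    state = "user"
            elif state == "user":
                if u_move < P_QUIT:
                    state = "quit"
                elif u_move < P_QUIT + P_ESCALATE:
                    state = "heavy"
            elif state == "heavy":
                # Only heavy users can enter treatment in this sketch.
                recovered = u_treat < P_TREATED and u_recover < treatment_effectiveness
                if recovered or u_move < P_QUIT:
                    state = "quit"
            # "quit" is absorbing here; relapse is deliberately left out.
            if state in ("user", "heavy"):
                total_years += 1
    return total_years / n_people


baseline = mean_use_years(0.20)
enhanced = mean_use_years(0.35)
print(f"baseline: {baseline:.2f} years, enhanced treatment: {enhanced:.2f} years")
```

Because both runs share the same random draws, any gap between the two means comes only from the extra recoveries under the higher effectiveness rate. With few heavy users entering treatment each year, that gap is small relative to the years accumulated by long-term regular users, who never pass through treatment in this sketch.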

#### 7. Discussion

Microsimulation modeling of lifetime drug use and outcomes is a useful complement to traditional methods for estimating drug use for a population cohort in that it captures some interesting, and potentially relevant, heterogeneity in drug use across individuals and over time. In this paper, we describe a model we built to examine marijuana use over the life course for a cohort of individuals with the same basic demographics as those observed for 12-year-olds in the 2004 NSDUH. We demonstrate that a model relying on predicted transitions similar to those observed in the NLSY97 data can generate youth and young adult use rates that are consistent with those observed in other data sources and that also capture important heterogeneity in terms of age, gender, and race. The ability of the model to replicate information from external data sources suggests that the model captures the important dynamics relevant for describing marijuana use over the life course. As such, it provides a useful tool for examining a variety of policy scenarios that cannot necessarily be fully evaluated using real data, as it takes time for the full impacts of a policy on trajectories to be realized and our current data systems do not provide linkages between youth cohorts and cohorts of heavy users.

While a microsimulation model provides a useful method of predicting heterogeneous patterns of marijuana use over time, such simulation models have numerous limitations, including the fact that they are based on parameters calculated from existing knowledge and existing data that may change in the future. One limitation is that the validity of the results depends on the correctness of the model specification. As the lack of data makes it impossible to know with certainty whether the model is correctly specified, the ability to validate outcomes from the model helps us better understand whether it is reasonably parameterized. Once the structure of the integrative simulation model is built, alternative plausible parameters and model constructions can easily be explored to determine their effects on the assumptions and calculations of the model and its final outcome. Another limitation is that our model is parameterized using data from the United States, and thus results may not be generalizable to other countries. However, input parameters for our model could be derived using data from other countries. A more significant limitation is that the model currently considers only a limited set of outcomes and consequences. The simplification was necessary to facilitate the initial construction of the microsimulation model. Future work will consider additional areas not currently considered within this model, such as criminal justice involvement, health events, and labor market events. If the patterns suggested by the microsimulation model are indeed correct, the model can provide important insight into the optimal timing of prevention or treatment interventions aimed at reducing overall or heavy use.

With those caveats regarding the model’s limitations in mind, policy simulations using this model suggest that efforts to improve prevention may be more effective at influencing marijuana use (as measured by young adult prevalence rates, young adult heavy user prevalence rates, and years of use at a regular or heavy level) than increasing treatment access or effectiveness. However, this is due in large part to the relatively low rates at which marijuana users get treatment (less than 5%) and the relatively low rates of readmission for this drug. If we were instead considering a model where relapse and readmission to treatment were significantly higher (say, for cocaine or heroin), it is entirely possible that treatment could be more effective than prevention in terms of reducing the total years of use in any given proclivity level as well as the likelihood of becoming a heavy user. So, we caution the reader not to generalize findings from this study, which are specific to marijuana and a function of the presumed underlying scenarios.

We also point out that while the model does very well at mimicking real-world observations for a relatively young population (up to age 40), its predictions and trajectories become less stable and more uncertain after the age of 40. This is due to the underlying uncertainty of the data used to inform the parameter estimates in the microsimulation model. Nonetheless, current epidemiological models of marijuana use show that most marijuana use drops precipitously after the age of 30 [11], suggesting that the model may be sufficient for capturing the trends of most interest to policy makers.

#### Acknowledgments

This paper describes work that was done with support from the National Institute on Drug Abuse (NIDA) (Grant no. 1R01 DA019993). This research was conducted with restricted access to Bureau of Labor Statistics (BLS) data. The views expressed here do not necessarily reflect the views of NIDA, RAND, or BLS. The authors gratefully acknowledge research assistance from Russell Lundberg, Erin de la Cruz, and Tanya Bentley and significant guidance from members of their expert panel, including Michael Dennis, Susan Ettner, Michael French, Teh-wei Hu, Martin Iguchi, Emmett Keeler, Andrew Morral, Peter Reuter, and Jeffrey Wasserman.