Journal of Biomedicine and Biotechnology

Volume 2008 (2008), Article ID 694297, 5 pages

http://dx.doi.org/10.1155/2008/694297

## Statistical Analysis of Twin Populations using Dissimilarity Measurements in Hippocampus Shape Space

^{1}Center for Imaging Science, Johns Hopkins University, Baltimore, MD 21218, USA

^{2}Department of Psychiatry, Washington University, St. Louis, MO 63110, USA

Received 19 July 2007; Accepted 19 November 2007

Academic Editor: Daniel Howard

Copyright © 2008 Youngser Park et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

By analyzing interpoint comparisons, we obtain significant results describing the relationship in “hippocampus shape space” of clinically depressed, high-risk, and control populations. In particular, our analysis demonstrates that the high-risk population is closer in shape space to the control population than to the clinically depressed population.

#### 1. Introduction

Major depressive disorder (MDD) is a mental disorder affecting about 16% of the US adult population, and is a major cause for concern not only in the United States but the world over. It is a disorder characterized by depressed mood, diminished interest or pleasure, significant weight loss, feelings of guilt or low self-worth, insomnia or hypersomnia, fatigue, poor concentration, or recurrent thoughts of death. The symptoms are widespread, and tend to be quite stable. In 2000, the World Health Organization (WHO) estimated depression to be the leading cause of disability as measured by years lived with disability (YLD) and the fourth leading contributor to the global burden of disease. See [8].

Over the years, a significant amount of research has been dedicated to finding physiological causes of MDD. One line of work produced the catecholamine hypothesis [1], which suggests that MDD is caused by decreased levels of the neurotransmitters norepinephrine and serotonin. This hypothesis led to most modern-day medications for MDD, which work by preventing the reuptake of these neurotransmitters. Neuroimaging research has also shown that enlarged ventricles and sulci and reduced volume of the frontal lobe and basal ganglia are associated with depressive episodes [1].

The aforementioned studies examined the brain once MDD had already set in, so the physiological changes they report are associated with the symptoms themselves. What about physiological *predictors* for MDD? Such predictors would facilitate the diagnosis of the disorder well before the onset of the symptoms, perhaps allowing measures to prevent the symptoms from ever appearing.

A vast amount of research is being conducted in order to find biological predispositions to MDD. There is evidence correlating shape differences of the hippocampus to depression [6] and schizophrenia [10]. In this manuscript we analyze interpoint comparisons [2] to investigate the relationship in “hippocampus shape space” of three populations among twins. The subjects fall into three categories: the affected subjects (clinically depressed, or MDD), the nonaffected cotwins of the MDD subjects (high-risk, or HR), and the nonaffected twin pairs (control, or CTRL). The dataset includes both monozygotic (MZ) and dizygotic (DZ) twin subjects.

According to established literature, the concordance rate for monozygotic (MZ) HR subjects is 40%, and for dizygotic (DZ) HR subjects 11% [15]. This demonstrates that the subjects labeled HR (because their twin is MDD) are in fact high risk: they develop MDD at a higher rate than the general population.

#### 2. Data

Our data set includes 114 subjects (57 twin pairs): 29 CTRL-CTRL pairs, 22 HR-MDD pairs, and 6 MDD-MDD pairs. The subjects are young female twins recruited through an epidemiological sample based on Missouri birth records. To ensure that hippocampus shape space is the only independent variable, other factors had to be controlled; all of the subjects were right-handed and were screened for factors that may cause structural changes of the brain, such as loss of consciousness lasting longer than 5 minutes, chronic medical or neurological illnesses, or pregnancy.

To obtain images of the hippocampus, very high resolution magnetic resonance imaging (MRI) scans were required. The Siemens Vision/Sonata 1.5T scanner was used to acquire three MPRAGE scans [19] (160 slices at FoV, 1.0 mm^{3} isotropic voxels). Using Analyze [12], the images were registered and averaged, converted to 8 bits while optimizing the intensity range, and interpolated to 0.5 mm isotropic voxels. The image protocol implemented above allows for optimal comparative analysis.

For each of left and right hemispheres separately, 22 three-dimensional landmarks were identified for each hippocampus and were used to generate and align hippocampal subcubes to a standardized orientation. It is these landmark data that we employ herein.

#### 3. Shape

Using the landmark data, for each pair of subjects and for each of left and right hippocampus, we produce an interpoint shape comparison, as described below.

For two subjects $i$ and $j$ (for the left hemisphere, say), let $x_{i,k}$ and $x_{j,k}$ be the corresponding landmarks, where $k = 1, \ldots, 22$.

##### 3.1. Landmark Matching

Finding the shape comparison involves a landmark matching (LM) transformation. The transformation is nonparametric, and this flexibility implies that overfitting must be guarded against via regularization. LM finds a diffeomorphism that minimizes an error criterion which includes both landmark mismatch and transformation complexity. That is, the criterion combines a landmark mismatch term with a transformation complexity term, where the complexity is measured by a geodesic distance in a group of diffeomorphisms [4] and a regularization parameter controls the relative contribution of transformation complexity versus landmark mismatch to the optimization objective. The algorithm solves the nonlinear Euler equation by a Newton method combined with a shooting procedure [18].

We use $\delta_{ij}$, the energy of the minimizing diffeomorphism, as the shape comparison between two subjects $i$ and $j$ (for the left hemisphere, say).

##### 3.2. Interpoint Comparison Matrices

Applying LM to the left or right hippocampus data for each pair of subjects yields an interpoint comparison matrix $\Delta = [\delta_{ij}]$. The matrix $\Delta$ is $114 \times 114$ and hollow (zeros on the diagonal), but it is asymmetric. Doing this for each hemisphere, we obtain matrices $\Delta_L$ and $\Delta_R$.

The nature of the hippocampus shape space is such that, under ideal conditions, it should yield a symmetric distance matrix. The asymmetry of $\Delta$ does not reflect the true nature of the hippocampus shape space; it is in fact a result of the limitations of the LM method. Hence, before further investigation, $\Delta$ must be symmetrized to $\tilde{\Delta}$ using an appropriate symmetrization technique [5], and we do so in this work.
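As a minimal sketch of the symmetrization step, the following assumes one common choice, averaging the matrix with its transpose. That rule, the function name `symmetrize`, and the toy values are illustrative assumptions and are not claimed to be the formula used in this work.

```python
def symmetrize(matrix):
    """Average a square matrix with its transpose (assumed rule, for illustration only)."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    return [[(matrix[i][j] + matrix[j][i]) / 2.0 for j in range(n)] for i in range(n)]


# Toy asymmetric, hollow comparison matrix for three hypothetical subjects.
toy = [
    [0.0, 1.0, 4.0],
    [3.0, 0.0, 2.0],
    [6.0, 2.0, 0.0],
]
sym = symmetrize(toy)
```

Averaging keeps the diagonal at zero, so the result stays hollow while each off-diagonal pair is replaced by its mean.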

Figure 1 depicts the structure of the interpoint comparison matrices for the 114 subjects. Figure 2 depicts the actual interpoint comparison matrix (after symmetrization) for the 114 subjects.

#### 4. Statistical Analysis

Our task is to begin describing the relationship of the three populations (MDD, HR, CTRL) amongst one another in the hippocampus shape space elicited by the LM interpoint comparisons. First, we present a multidimensional scaling (MDS) [13] scatter plot; unfortunately, we see in Figure 3 that no significant relationship can be discerned from this plot. Employing linear discriminant analysis (LDA) after MDS for all possible MDS target dimensionalities (analysis via LDA $\circ$ MDS $\circ$ LM, *à la* Miller et al. [9]) yields no classification capability statistically significantly superior to chance. Nevertheless, we will see in Figures 4 and 5 a suggestion that perhaps progress can be made on our task, given a sufficiently clever methodology.

Figure 4 depicts kernel probability density estimates [7] for the LM-Left comparisons. It shows that the entries of the LM-Left interpoint comparison matrix that correspond to comparisons between HR and CTRL (the solid line in Figure 4) are, overall, *smaller* than the entries which correspond to comparisons between HR and MDD. That is, Figure 4 suggests a *stochastic ordering* relationship [3]: $\delta(\mathrm{HR}, \mathrm{CTRL}) <_{st} \delta(\mathrm{HR}, \mathrm{MDD})$, writing $\delta(\cdot, \cdot)$ for a comparison between members of the two named populations. Such a result is precisely what we seek. However, dependencies amongst the entries of the matrix make it difficult to assess the statistical significance of the result depicted in Figure 4.

Each row of the interpoint comparison matrix corresponding to a single HR subject $i$ gives rise to two samples: $\{\delta_{ij} : j \in \mathrm{CTRL}\}$ and $\{\delta_{ij} : j \in \mathrm{MDD}\}$. That is, we have the vector of comparisons from that HR subject to every CTRL subject, and we have the vector of comparisons from that HR subject to every MDD subject. (We do not include in these vectors the twin of the particular HR subject under consideration; ignoring twinnedness in the analysis proves beneficial in that we eliminate bias in similarity between a subject and her twin that is not due to condition (MDD, HR, CTRL).) For this individual HR subject's two-sample data, a Wilcoxon-Mann-Whitney test [17] of the null hypothesis that the distribution of HR-to-CTRL comparisons is the same as the distribution of HR-to-MDD comparisons, against the alternative of *stochastic ordering*, yields a *P*-value. Figure 5 provides a quantile-quantile plot of these *P*-values, one per HR subject. Under the null hypothesis, these *P*-values would be expected to be distributed approximately uniform(0,1). The plot demonstrates a clear deviation from a uniform distribution, again suggesting a *stochastic ordering* relationship, $\delta(\mathrm{HR}, \mathrm{CTRL}) <_{st} \delta(\mathrm{HR}, \mathrm{MDD})$: comparisons from HR to CTRL tend to be smaller than comparisons from HR to MDD. Again, dependencies amongst the entries of the matrix make it difficult to assess the statistical significance of the result depicted in Figure 5.
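The per-subject test described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `wmw_one_sided` and the toy samples are hypothetical, and the *P*-value uses the large-sample normal approximation without tie or continuity corrections.

```python
import math


def wmw_one_sided(x, y):
    """Return (U, p) for the alternative that values in x tend to be smaller than in y.

    U counts the pairs (xi, yj) with xi < yj, with ties counted as 1/2.
    The P-value is the upper-tail normal approximation (assumption: no
    tie correction and no continuity correction).
    """
    n1, n2 = len(x), len(y)
    u = sum((1.0 if xi < yj else (0.5 if xi == yj else 0.0)) for xi in x for yj in y)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z) for a standard normal Z
    return u, p


# Hypothetical comparisons from one HR subject to CTRL and to MDD subjects.
to_ctrl = [1.0, 1.2, 0.9, 1.1, 1.3]
to_mdd = [1.8, 2.0, 1.7, 2.2]
u_stat, p_value = wmw_one_sided(to_ctrl, to_mdd)
```

Applying the function to each HR subject's pair of samples would give the collection of *P*-values whose quantiles are compared with those of a uniform(0,1) distribution.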

The quantile-quantile plot independently reiterates the suggestion of a stochastic ordering relationship that was first seen using the kernel probability density estimates. Thus, while Figures 4 and 5 give an inkling of the type of information that can be gleaned regarding the relationship in hippocampus shape space of the three populations (MDD, HR, CTRL) amongst one another, it remains to assess the figures' suggestion rigorously.

#### 5. Classification

To further uncover the characteristics of hippocampus
shape space, we consider the task of *classifying* each HR subject as either MDD
or CTRL.

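As before, we consider the two samples associated with each individual HR subject $i$: her comparisons to the CTRL subjects, $\{\delta_{ij} : j \in \mathrm{CTRL}\}$, and her comparisons to the MDD subjects, $\{\delta_{ij} : j \in \mathrm{MDD}\}$. We classify the HR subject as belonging to MDD or CTRL based on the Wilcoxon-Mann-Whitney test statistic *P*-value, as described in [16] (see also [3, page 183]).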

Once we have classified each of the HR subjects in this way, we assess the relative similarity of HR to CTRL versus MDD based on the classifier's performance, that is, based on the collection of HR subjects' classifier-assigned class labels, taken as a whole.

This procedure can be employed with LM interpoint comparisons obtained on Left, Right, or both Left and Right hippocampuses, and with any of the three populations (HR, CTRL, MDD) as the population of interest (the role of HR in the description above).
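A rough sketch of the classification procedure, under assumptions: each subject of interest is assigned to whichever of the two other classes has comparisons that tend to be smaller, judged by the Mann-Whitney U statistic. This decision rule stands in for the rule of [16], which is not spelled out here; the function name `classify_by_rank`, the `twin_of` index list, and the toy data are all hypothetical.

```python
def classify_by_rank(dist, labels, twin_of, target, class_a, class_b):
    """Assign each `target` subject to `class_a` or `class_b`.

    Assumed rule: choose the class whose comparisons to the subject tend to
    be smaller (U above half the number of pairs favours class_a; exact
    ties fall to class_b).
    """
    assigned = {}
    for i, label in enumerate(labels):
        if label != target:
            continue
        # Comparisons from subject i to each class, leaving out her twin.
        to_a = [dist[i][j] for j, lab in enumerate(labels)
                if lab == class_a and j != twin_of[i]]
        to_b = [dist[i][j] for j, lab in enumerate(labels)
                if lab == class_b and j != twin_of[i]]
        # U counts pairs in which the class_a comparison is the smaller one.
        u = sum((1.0 if a < b else (0.5 if a == b else 0.0)) for a in to_a for b in to_b)
        assigned[i] = class_a if u > len(to_a) * len(to_b) / 2.0 else class_b
    return assigned


# Toy symmetric comparison matrix for six hypothetical subjects (three twin pairs).
labels = ["HR", "MDD", "CTRL", "CTRL", "MDD", "MDD"]
twin_of = [1, 0, 3, 2, 5, 4]
dist = [
    [0.0, 0.1, 1.0, 1.2, 2.0, 2.5],
    [0.1, 0.0, 2.1, 2.3, 0.8, 0.9],
    [1.0, 2.1, 0.0, 0.2, 2.4, 2.6],
    [1.2, 2.3, 0.2, 0.0, 2.2, 2.8],
    [2.0, 0.8, 2.4, 2.2, 0.0, 0.3],
    [2.5, 0.9, 2.6, 2.8, 0.3, 0.0],
]
result = classify_by_rank(dist, labels, twin_of, "HR", "CTRL", "MDD")
n_ctrl = sum(1 for c in result.values() if c == "CTRL")
```

Counting the assigned labels, as in `n_ctrl`, gives the tally on which the population-level comparison is based; passing a different `target` changes the population of interest.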

#### 6. Results

Classifying the 22 HR subjects as either MDD or CTRL using LM-Left results in 19 classified as CTRL versus 3 classified as MDD. The probability of obtaining a result this extreme or more extreme (the *P*-value) is evaluated under the least favorable null hypothesis *HR are equally likely to be classified as MDD as CTRL*, against each one-sided alternative. LM-Right yields 16 classified as CTRL versus 6 classified as MDD, a classification performance not statistically significantly distinguishable from chance. Combining left and right, the shape comparisons LM (LM-Left and Right) yield 20 classified as CTRL versus 2 classified as MDD, which for each one-sided alternative provides strong statistical evidence that HR is more like CTRL than MDD in hippocampus shape space.
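To illustrate how such a tail probability might be computed, the sketch below assumes that each HR subject is classified as CTRL with probability 1/2 under the null hypothesis, so that the count classified as CTRL is Binomial(22, 1/2). This reading of the null hypothesis and the function name `binomial_upper_tail` are assumptions; the sketch is not claimed to reproduce the *P*-values reported in this article.

```python
from math import comb


def binomial_upper_tail(k, n):
    """Exact P(X >= k) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


# Counts taken from the text: HR subjects classified as CTRL, out of 22.
p_left = binomial_upper_tail(19, 22)  # LM-Left: 19 of 22
p_both = binomial_upper_tail(20, 22)  # LM-Left and Right: 20 of 22
print(f"LM-Left tail: {p_left:.6f}; combined tail: {p_both:.6f}")
```

An exact tail is used rather than a normal approximation because, with only 22 subjects, the full sum is cheap to compute.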

An analogous analysis, classifying the 33 MDD subjects as either HR or CTRL using LM-Left, shows that MDD is more like CTRL than it is like HR (for each one-sided alternative), and that the left carries more information than does the right: the *P*-values are smaller, indicating that the signal is stronger.

The results obtained from classifying the 59 CTRL
subjects as either HR or MDD are more nuanced: in this case, using LM-Left
indicates that CTRL is more like HR than it is like MDD (*P* < .0005) while using
LM-Right indicates that CTRL is more like MDD than it is like HR (*P* < .0005). This hemispherical ambiguity provides further
insight into hippocampus shape space.

Finally, we note that in the last two columns of Table 1 we consider classifying the 22 HR subjects (via leave-one-out crossvalidation) as HR or MDD and as HR or CTRL. These results are consistent with our other findings: HR is more difficult to distinguish from CTRL than from MDD, and the information extracted via LM-Left is more powerful for this task than is LM-Right.

#### 7. Conclusions/Discussion

Our analysis indicates that HR is more like CTRL than it is like MDD, MDD is more like CTRL than it is like HR, and CTRL is not obviously more like one or the other. Also, we discern that the left hippocampus carries more information than does the right.

If hippocampus shape space were one-dimensional (if the population shapes could be accurately represented in $\mathbb{R}$), then the joint relationship described by these three results could be depicted as in Figure 6, with the CTRL population *between* the HR and MDD populations in terms of shape. However, it must be noted that this depiction (Figure 6) offers only a simplified view of the true infinite-dimensional nature of the shape space configuration, as suggested by the fact that the 2-dimensional MDS embedding depicted in Figure 3 presents little or no class separation.

##### 7.1. On Populations

Our stated task is in terms of *populations*: to begin describing the relationship in hippocampus shape space of the three populations (MDD, HR, CTRL) amongst one another. However, our results are *conditional*: using LM-Left we classify, for example, the 22 HR subjects representing the HR population as belonging to either the MDD or the CTRL class, conditionally on “training” data from MDD and CTRL. This, in fact, is the standard approach in probabilistic pattern recognition; see, for example, [11]. The difference between a focus on populations versus conditionals is indicative of a difference between “policy science” and “laboratory science” [14]. A justification for the conditional approach in “laboratory science” is given in [11], where it is claimed that the unconditional approach “would be unnatural, because in a given application, one has to live with the [training] data at hand.” In “policy science”, however, knowledge about the populations themselves may be the focus.

By performing our analysis thrice, for each of the three populations in turn conditionally on the “training” data from the other two, we obtain three conditionals. Letting $n_1, n_2, n_3$ denote the class-conditional sample sizes for each of the three classes, we see that the joint distribution for our sample is $(n_1 + n_2 + n_3)m$-dimensional (where $m$ is the presumed “shape space” dimensionality of each observation). Each conditional considered is $n_j m$-dimensional, with one population $j$ remaining. The overall joint distribution of interest (the three populations in “shape space”) is of course not simply the product of our three conditionals. However, *some* population inferences regarding stochastic ordering can be performed via the (multiple) conditionals, and in particular the conditional approach justifies the simplistic view of our three populations in “shape space” given by Figure 6.

#### Acknowledgment

Sincere appreciation to Michael Bowers (JHU), Timothy Brown (JHU), Anthony Kolasny (JHU), Tomoyuki Nishino (WashU), and the others for their valuable assistance.

#### References

- “Depression. National Institute of Mental Health,” http://www.nimh.nih.gov/publicat/depression.cfm.
- L. B. Alloy, J. H. Riskind, and M. J. Manos, *Abnormal Psychology: Current Perspectives*, McGraw-Hill, New York, NY, USA, 9th edition, 2005.
- J. Posener, L. Wang, J. Price et al., “High dimensional mapping of the hippocampus in depression,” *American Journal of Psychiatry*, vol. 160, no. 1, pp. 83–89, 2003.
- J. G. Csernansky, S. Joshi, L. Wang et al., “Hippocampal morphometry in schizophrenia by high dimensional brain mapping,” *Proceedings of the National Academy of Sciences of the United States of America*, vol. 95, no. 19, pp. 11406–11411, 1998.
- J.-F. Maa, D. K. Pearl, and R. Bartoszyński, “Reducing multidimensional two-sample data to one-dimensional interpoint comparisons,” *Annals of Statistics*, vol. 24, no. 3, pp. 1069–1074, 1996.
- “Genetics and major psychiatric disorders: a program for genetic counselors. National Coalition for Health,” http://www.nchpeg.org/cdrom/empiric.html.
- J. P. Mugler III and J. R. Brookeman, “Three-dimensional magnetization-prepared rapid gradient-echo imaging (3D MP RAGE),” *Magnetic Resonance in Medicine*, vol. 15, no. 1, pp. 152–157, 1990.
- Analyze Software, “Mayo Clinic,” http://www.mayo.edu/bir/Software/Analyze/.
- M. I. Miller, A. Trouvé, and L. Younes, “Geodesic shooting for computational anatomy,” *Journal of Mathematical Imaging and Vision*, vol. 24, no. 2, pp. 209–228, 2006.
- S. Allassonnière, A. Trouvé, and L. Younes, “Geodesic shooting and diffeomorphic matching via textured meshes,” in *Proceedings of the 5th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR '05)*, pp. 365–381, St. Augustine, Fla, USA, November 2005.
- T. Saito and H. Yadohisa, *Data Analysis of Asymmetric Structures*, Marcel Dekker, New York, NY, USA, 2005.
- T. Cox and M. Cox, *Multidimensional Scaling*, Chapman & Hall, New York, NY, USA, 2nd edition, 2001.
- M. Miller, C. Priebe, and Y. Park, “Collaborative computational anatomy: the perfect storm for MRI morphometry study of the human brain via diffeomorphic metric mapping, multidimensional scaling and linear discriminant analysis,” to appear in *Proceedings of the National Academy of Science*.
- B. Silverman, *Density Estimation for Statistics and Data Analysis*, Chapman & Hall, New York, NY, USA, 1986.
- P. J. Bickel and K. A. Doksum, *Mathematical Statistics: Basic Ideas and Selected Topics*, vol. 1, Prentice Hall, Upper Saddle River, NJ, USA, 2nd edition, 2005.
- J. Rice, *Mathematical Statistics and Data Analysis*, Addison-Wesley, Reading, Mass, USA, 2nd edition, 1995.
- C. E. Priebe, “Olfactory classification via interpoint distance analysis,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 23, no. 4, pp. 404–413, 2001.
- L. Devroye, L. Györfi, and G. Lugosi, *A Probabilistic Theory of Pattern Recognition*, 1996, Number 31 in Applications of Mathematics.
- B. Caffo, 2006, Personal communication.