Journal of Biomedicine and Biotechnology
Volume 2008 (2008), Article ID 694297, 5 pages
Research Article

Statistical Analysis of Twin Populations using Dissimilarity Measurements in Hippocampus Shape Space

1Center for Imaging Science, Johns Hopkins University, Baltimore, MD 21218, USA
2Department of Psychiatry, Washington University, St. Louis, MO 63110, USA

Received 19 July 2007; Accepted 19 November 2007

Academic Editor: Daniel Howard

Copyright © 2008 Youngser Park et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


By analyzing interpoint comparisons, we obtain significant results describing the relationship in “hippocampus shape space” of clinically depressed, high-risk, and control populations. In particular, our analysis demonstrates that the high-risk population is closer in shape space to the control population than to the clinically depressed population.

1. Introduction

Major depressive disorder (MDD) is a mental disorder affecting about 16% of the US adult population, and is a major cause for concern not only in the United States but the world over. It is a disorder characterized by depressed mood, diminished interest or pleasure, significant weight loss, feelings of guilt or low self-worth, insomnia or hypersomnia, fatigue, poor concentration, or recurrent thoughts of death. The symptoms are widespread, and tend to be quite stable. In 2000, the World Health Organization (WHO) estimated depression to be the leading cause of disability as measured by years lived with disability (YLD) and the fourth leading contributor to the global burden of disease. See [8].

Over the years, a significant amount of research has been dedicated to finding physiological causes of MDD. One such line of work produced the catecholamine hypothesis [1], which suggests that MDD is caused by decreased levels of the neurotransmitters norepinephrine and serotonin. This hypothesis led to most modern-day medication for MDD, which works by preventing the reuptake of these neurotransmitters. Neuroimaging research has also shown that enlarged ventricles and sulci, and reduced volume of the frontal lobe and basal ganglia, are associated with depressive episodes [1].

The aforementioned studies examined the brain once MDD had already set in, so the physiological changes they report are associated with the symptoms themselves. What about physiological predictors for MDD? Such predictors would facilitate the diagnosis of the disorder well before the onset of the symptoms, perhaps allowing measures to prevent the symptoms from ever appearing.

A vast amount of research is being conducted in order to find biological predispositions to MDD. There is evidence correlating shape differences of the hippocampus to depression [6] and schizophrenia [10]. In this manuscript we analyze interpoint comparisons [2] to investigate the relationship in “hippocampus shape space” of three populations among twins. The subjects fall into three categories: the affected subjects (clinically depressed, or MDD), the nonaffected cotwins of the MDD subjects (high-risk, or HR), and the members of nonaffected twin pairs (control, or CTRL). The dataset includes both monozygotic (MZ) and dizygotic (DZ) twin subjects.

According to established literature, the concordance rate for monozygotic (MZ) HR subjects is 40%, and for dizygotic (DZ) HR subjects 11% [15]. This demonstrates that the subjects labeled HR (due to the fact that their twin is MDD) are in fact high risk: they develop MDD at a higher rate than the general population.

2. Data

Our data set includes 114 subjects (57 twin pairs): 29 CTRL-CTRL pairs, 22 HR-MDD pairs, and 6 MDD-MDD pairs. The subjects are young female twins recruited through an epidemiological sample based on Missouri birth records. To ensure that hippocampus shape space is the only independent variable, other factors had to be controlled; all of the subjects were right handed and were screened for factors that may cause structural changes of the brain such as loss of consciousness greater than 5 minutes, chronic medical or neurological illnesses, or pregnancy.

To obtain images of the hippocampus, very high resolution magnetic resonance imaging (MRI) scans were required. The Siemens Vision/Sonata 1.5T scanner was used to acquire three MPRAGE scans [19] (160 slices at FoV, 1.0 mm³ isotropic voxels). Using Analyze [12], the images were registered and averaged, converted to 8 bits while optimizing the intensity range, and interpolated to 0.5 mm isotropic voxels. The image protocol implemented above allows for optimal comparative analysis.

For each of left and right hemispheres separately, 22 three-dimensional landmarks were identified for each hippocampus and were used to generate and align hippocampal subcubes to a standardized orientation. It is these landmark data that we employ herein.

3. Shape

Using the landmark data, for each pair of subjects and for each of left and right hippocampus, we produce an interpoint shape comparison, as described below.

For two subjects $i$ and $j$ (for the left hemisphere, say), let $x_{i,k}$ and $x_{j,k}$ be the corresponding landmarks, where $k = 1, \ldots, 22$.

3.1. Landmark Matching

Finding the shape comparison involves a landmark matching (LM) transformation. The transformation is nonparametric, and this flexibility implies that overfitting must be guarded against via regularization. LM finds a diffeomorphism that minimizes an error criterion which includes both landmark mismatch and transformation complexity. In this criterion, $d$ is a geodesic distance in a group of diffeomorphisms [4] and $\lambda$ is a regularization parameter which controls the relative contribution of transformation complexity versus landmark mismatch to the optimization objective. The algorithm solves the nonlinear Euler equation by a Newton method combined with a shooting procedure [18].
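A criterion of the kind described above can be written in one generic form as follows. The notation is assumed for illustration rather than taken from the source: $\varphi$ ranges over diffeomorphisms, $\mathrm{id}$ is the identity map, $x_{i,k}$ and $x_{j,k}$ are the $k$-th landmarks of the two subjects, and placing $\lambda$ on the complexity term (rather than its reciprocal on the mismatch term) is a convention chosen here.

```latex
\hat{\varphi}_{ij}
  = \operatorname*{arg\,min}_{\varphi}
    \left[ \lambda \, d(\mathrm{id}, \varphi)^{2}
         + \sum_{k=1}^{22} \left\lVert \varphi(x_{i,k}) - x_{j,k} \right\rVert^{2} \right]
```

Under this convention, a larger $\lambda$ penalizes complex transformations more heavily and so tolerates more landmark mismatch.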

We use the energy of the minimizing diffeomorphism as the shape comparison between two subjects $i$ and $j$ (for the left hemisphere, say).

3.2. Interpoint Comparison Matrices

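Applying LM to the left or right hippocampus data for each pair of subjects yields an interpoint comparison matrix $\Delta$. The matrix $\Delta$ is $114 \times 114$ and hollow (zeros on the diagonal), but it is asymmetric. That is, we obtain asymmetric matrices $\Delta_L$ and $\Delta_R$ for the left and right hippocampus, respectively.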

The nature of the hippocampus shape space is such that, under ideal conditions, it should yield a symmetric distance matrix. The asymmetry of the comparison matrix does not reflect the true nature of the hippocampus shape space; it is a result of limitations in the LM method. Hence, before further investigation, the comparison matrix must be symmetrized using an appropriate symmetrization technique [5]. In this work we symmetrize via .
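As a minimal sketch of what symmetrization can look like, the snippet below averages a square matrix with its transpose. This rule is an assumption chosen for illustration, since several techniques are possible and the one actually used may differ; the function name `symmetrize` and the toy values are hypothetical.

```python
def symmetrize(delta):
    """Return the average of a square matrix and its transpose.

    Assumption: averaging is only one of several possible
    symmetrization rules; it is used here for illustration.
    """
    n = len(delta)
    if any(len(row) != n for row in delta):
        raise ValueError("matrix must be square")
    return [[(delta[i][j] + delta[j][i]) / 2.0 for j in range(n)]
            for i in range(n)]


# Toy 3 x 3 hollow, asymmetric comparison matrix (made-up values).
toy = [[0.0, 1.0, 4.0],
       [3.0, 0.0, 2.0],
       [6.0, 8.0, 0.0]]
sym = symmetrize(toy)
```

Averaging keeps the diagonal at zero, so the result is still hollow.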

Figure 1 depicts the structure of the interpoint comparison matrices for the 114 subjects. Figure 2 depicts the actual interpoint comparison matrix (after symmetrization) for the 114 subjects.

Figure 1: Structure of the interpoint comparison matrices for the 114 subjects.
Figure 2: The interpoint comparison matrix, after symmetrization, for the 114 subjects. The comparison values are color-coded, with red representing zero (e.g., the diagonal entries) to green representing large values.

4. Statistical Analysis

Our task is to begin describing the relationship of the three populations (MDD, HR, CTRL) amongst one another in the hippocampus shape space elicited by the LM interpoint comparisons. First, we present a multidimensional scaling (MDS) [13] scatter plot; unfortunately, we see in Figure 3 that no significant relationship can be discerned from this plot. Employing linear discriminant analysis (LDA) after MDS for all possible MDS target dimensionalities (analysis via LDA $\circ$ MDS $\circ$ LM, à la Miller et al. [9]) yields no classification capabilities statistically significantly superior to chance. Nevertheless, we will see in Figures 4 and 5 a suggestion that perhaps progress can be made on our task, given a sufficiently clever methodology.
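For readers unfamiliar with MDS, the following is a minimal sketch of classical (Torgerson) scaling using only the standard library. It is an illustration under stated assumptions, not the implementation used for Figure 3: the MDS variant applied there is not specified, the input is assumed symmetric, and the function names and toy points are hypothetical.

```python
import math


def _matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def classical_mds(dist, dims=2, iters=500):
    """Embed n points in `dims` dimensions from a symmetric n x n matrix.

    Double-centers the squared dissimilarities, then extracts leading
    eigenpairs by power iteration with deflation. Dimensions whose
    eigenvalue is not positive are left at zero.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    tot = sum(row) / n
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + tot) for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [math.sin(i + d + 1.0) for i in range(n)]  # arbitrary start
        for _ in range(iters):
            w = _matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        vv = sum(x * x for x in v)
        lam = sum(x * y for x, y in zip(v, _matvec(b, v))) / vv
        if lam <= 1e-12:
            continue
        v = [x / math.sqrt(vv) for x in v]
        for i in range(n):
            coords[i][d] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Toy check: three planar points (made-up), recovered up to rotation.
pts = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
toy_dist = [[math.dist(p, q) for q in pts] for p in pts]
coords = classical_mds(toy_dist, dims=2)
```

For Euclidean input such as the toy example, the embedded points reproduce the original pairwise distances; for non-Euclidean dissimilarities the embedding is only an approximation.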

Figure 3: A multidimensional scaling scatter plot of the interpoint comparison matrix mapped into $\mathbb{R}^2$. Little can be discerned from this plot regarding the relationship in hippocampus shape space of the three populations (MDD, HR, CTRL): no class-conditional differentiation is apparent.
Figure 4: This figure shows kernel probability density estimates for LM-Left. The solid line depicts the HR-to-CTRL comparisons and the dashed line depicts the HR-to-MDD comparisons.
Figure 5: This figure shows the quantile-quantile plot of the individual P-values for a Wilcoxon-Mann-Whitney test of each HR subject, in turn, based on two samples: that subject's comparisons to CTRL and her comparisons to MDD.

Figure 4 depicts kernel probability density estimates [7] for the LM-Left comparisons to show that the entries of the interpoint comparison matrix that correspond to comparisons between HR and CTRL (the solid line in Figure 4) are, overall, smaller than the entries which correspond to comparisons between HR and MDD. That is, Figure 4 suggests a stochastic ordering relationship [3]: the HR-to-CTRL comparisons are stochastically smaller than the HR-to-MDD comparisons. Such a result is precisely what we seek. However, dependencies amongst the entries of the comparison matrix make it difficult to assess the statistical significance of the result depicted in Figure 4.
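A kernel density estimate of the kind shown in Figure 4 can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel and the normal-reference bandwidth $1.06\,\hat{\sigma}\,n^{-1/5}$; the kernel and bandwidth used for the figure are not stated, and the sample values below are made up.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth=None):
    """Return a function estimating the density of `sample`.

    Assumption: Gaussian kernel with the normal-reference bandwidth
    1.06 * sd * n**(-1/5) unless a bandwidth is supplied.
    """
    n = len(sample)
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-0.2)

    def density(x):
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                    for s in sample)
        return total / (n * bandwidth * math.sqrt(2.0 * math.pi))

    return density


# Made-up comparison values standing in for one of the two curves.
toy_sample = [1.0, 2.0, 2.5, 3.0, 4.2]
f_hat = gaussian_kde(toy_sample)
```

Evaluating `f_hat` on a grid for the HR-to-CTRL values and again for the HR-to-MDD values would give the two curves to overlay.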

Each row of the interpoint comparison matrix that corresponds to a single HR subject gives rise to two samples. That is, we have the vector of comparisons from that HR subject to every CTRL subject, and we have the vector of comparisons from that HR subject to every MDD subject. (We do not include in these vectors the twin of the particular HR subject under consideration; ignoring twinnedness in the analysis proves beneficial in that we eliminate bias in similarity status between a subject and her twin that is not due to condition (MDD, HR, CTRL).) For this individual HR subject's two-sample data, a Wilcoxon-Mann-Whitney test [17] of the null hypothesis that the distribution of comparisons to CTRL is the same as the distribution of comparisons to MDD, against the alternative of stochastic ordering, yields a P-value. Figure 5 provides a quantile-quantile plot of these P-values. Under the null hypothesis, these P-values would be expected to be distributed approximately uniform(0,1). The plot demonstrates a clear deviation from a uniform distribution, again suggesting a stochastic ordering relationship. Again, dependencies amongst the entries of the comparison matrix make it difficult to assess the statistical significance of the result depicted in Figure 5.
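The per-subject computation described above can be sketched as follows. This is a minimal illustration, assuming a normal approximation to the Mann-Whitney U statistic without tie or continuity correction; how the P-values were actually computed is not stated. The names `wmw_p_value`, `row_samples`, `labels`, and `twin_of`, and all numbers, are hypothetical.

```python
import math


def wmw_p_value(x, y):
    """One-sided P-value for H1: values in x tend to be smaller than in y.

    Assumption: normal approximation to the Mann-Whitney U statistic,
    with no tie or continuity correction.
    """
    m, n = len(x), len(y)
    # u counts pairs with x < y; ties count one half.
    u = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean = m * n / 2.0
    sd = math.sqrt(m * n * (m + n + 1) / 12.0)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of N(0, 1)


def row_samples(delta, labels, twin_of, i):
    """Split row i into comparisons to CTRL and to MDD, skipping the twin."""
    ctrl = [delta[i][j] for j, lab in enumerate(labels)
            if lab == "CTRL" and j != twin_of[i]]
    mdd = [delta[i][j] for j, lab in enumerate(labels)
           if lab == "MDD" and j != twin_of[i]]
    return ctrl, mdd


# Toy data with made-up values: subject 0 is HR, subject 1 is her MDD twin.
labels = ["HR", "MDD", "CTRL", "CTRL", "CTRL", "MDD", "MDD", "MDD"]
twin_of = {0: 1}
delta_row0 = [0.0, 9.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ctrl, mdd = row_samples([delta_row0], labels, twin_of, 0)
p = wmw_p_value(ctrl, mdd)
```

Repeating this for every HR row gives the collection of P-values whose quantiles would be compared against those of a uniform(0,1) distribution.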

The quantile-quantile plot independently reiterates the suggestion of a stochastic ordering relationship that was first seen using the kernel probability density estimates. Thus, while Figures 4 and 5 give an inkling of the type of information that can be gleaned regarding the relationship in hippocampus shape space of the three populations (MDD, HR, CTRL) amongst one another, it remains to assess the figures' suggestion rigorously.

5. Classification

To further uncover the characteristics of hippocampus shape space, we consider the task of classifying each HR subject as either MDD or CTRL.

As before, we consider the two samples associated with each individual HR subject: her comparisons to CTRL and her comparisons to MDD. We classify the HR subject as belonging to MDD or CTRL based on the Wilcoxon-Mann-Whitney test statistic P-value, as described in [16] (see also [3, page 183]).

Once we have classified each of the HR subjects in this way, we assess the relative similarity of HR to CTRL versus MDD based on the classifier's performance, that is, based on the collection of HR subjects' classifier-assigned class labels, taken as a whole.

This procedure can be employed with LM interpoint comparisons obtained on Left, Right, or both Left and Right hippocampuses, and with any of the three populations (HR, CTRL, MDD) as the population of interest, taking the role of HR in the description above.
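The classification step can be sketched as below. The decision rule is an assumption made for illustration: it assigns the class whose comparisons tend to be smaller, according to which side of its null mean the Mann-Whitney U statistic falls on, and returns “no decision” when the statistic sits exactly at the null mean. The rule in [16] may differ in detail, and all names and numbers here are hypothetical.

```python
def wmw_u(x, y):
    """Count pairs (a, b) with a < b, ties counting one half."""
    return sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in x for b in y)


def classify(to_ctrl, to_mdd):
    """Assign 'CTRL', 'MDD', or 'no decision' from the two samples.

    Assumed rule: pick the class whose comparisons tend to be smaller.
    """
    u = wmw_u(to_ctrl, to_mdd)
    null_mean = len(to_ctrl) * len(to_mdd) / 2.0
    if u > null_mean:
        return "CTRL"
    if u < null_mean:
        return "MDD"
    return "no decision"


def tally(samples):
    """Count class labels over a list of (to_ctrl, to_mdd) sample pairs."""
    counts = {"CTRL": 0, "MDD": 0, "no decision": 0}
    for to_ctrl, to_mdd in samples:
        counts[classify(to_ctrl, to_mdd)] += 1
    return counts


# Three made-up subjects, one for each possible outcome.
toy_subjects = [([1, 2, 3], [4, 5, 6]),
                ([4, 5], [1, 2]),
                ([1, 2], [1, 2])]
toy_counts = tally(toy_subjects)
```

Swapping which population plays the role of the subject of interest only changes which rows and columns of the comparison matrix feed the two samples.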

6. Results

Classifying the 22 HR subjects as either MDD or CTRL using LM-Left results in 19 classified as CTRL versus 3 classified as MDD. The P-value is the probability of obtaining a result this extreme or more extreme under the least favorable null hypothesis, namely that HR subjects are equally likely to be classified as MDD as CTRL, and it is computed against each one-sided alternative. LM-Right yields 16 classified as CTRL versus 6 classified as MDD, a classification performance not statistically significantly distinguishable from chance. Combining left and right, the shape comparisons LM (LM-Left and Right) yield 20 classified as CTRL versus 2 classified as MDD, which is strong statistical evidence that HR is more like CTRL than MDD in hippocampus shape space.
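To make the phrase “a result this extreme or more extreme” concrete, the sketch below computes a plain one-sided binomial tail under the null hypothesis that each subject is equally likely to receive either label. Whether this is exactly the computation behind the reported P-values is an assumption; in particular, it ignores “no decision” outcomes, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


# Counts quoted in the text: 19 of 22, and 20 of 22, labeled CTRL.
tail_19 = binomial_upper_tail(19, 22)
tail_20 = binomial_upper_tail(20, 22)
```

With $p = 1/2$ each outcome sequence has probability $2^{-22}$, so the tail is simply the number of sequences with at least $k$ CTRL labels divided by $2^{22}$.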

An analogous analysis, classifying the 33 MDD subjects as either HR or CTRL using LM-Left, shows that MDD is more like CTRL than it is like HR, and that the left carries more information than does the right: the P-values are smaller, indicating that the signal is stronger.

The results obtained from classifying the 59 CTRL subjects as either HR or MDD are more nuanced: in this case, using LM-Left indicates that CTRL is more like HR than it is like MDD (P < .0005) while using LM-Right indicates that CTRL is more like MDD than it is like HR (P < .0005). This hemispherical ambiguity provides further insight into hippocampus shape space.

Finally, we note that in the last two columns of Table 1 we consider classifying the 22 HR subjects (via leave-one-out crossvalidation) as HR or MDD and as HR or CTRL. These results are consistent with our other findings: HR is more difficult to distinguish from CTRL than from MDD, and the information extracted via LM-Left is more powerful for this task than is LM-Right.

Table 1: Output of classifier based on the Wilcoxon-Mann-Whitney test statistic. For example, the first numerical column, “H : CvM,” gives the number of HR classified as CTRL versus MDD and the second numerical column, “H : MvC,” gives the number of HR classified as MDD versus CTRL. Thus, we find that combining left and right, the shape comparisons LM (LM-Left and Right) yield 20 HR classified as CTRL versus 2 HR classified as MDD, which is strong statistical evidence that HR is more like CTRL than MDD in hippocampus shape space. (This analysis is based on 22 HR subjects, 33 MDD subjects, and 59 CTRL subjects. Thus, e.g., the two HR numbers, H : CvM and H : MvC, should sum to 22. Discrepancies are due to situations in which the classifier makes “no decision” as described in [16]; see also [3, page 183].)

7. Conclusions/Discussion

Our analysis indicates that HR is more like CTRL than it is like MDD, MDD is more like CTRL than it is like HR, and CTRL is not obviously more like one or the other. Also, we discern that the left hippocampus carries more information than does the right.

If hippocampus shape space were one-dimensional, that is, if the population shapes could be accurately represented in $\mathbb{R}$, then the joint relationship described by these three results could be depicted as in Figure 6, with the CTRL population between the HR and MDD populations in terms of shape. However, it must be noted that this depiction (Figure 6) offers only a simplified view of the true infinite-dimensional nature of the shape space configuration, as suggested by the fact that the 2-dimensional MDS embedding depicted in Figure 3 presents little or no class separation.

Figure 6: Artist's rendition of what hippocampus shape space might look like were it one-dimensional, that is, if the population shapes could be accurately represented in $\mathbb{R}$. Our results suggest that the joint relationship of the three populations, in terms of shape, puts the CTRL population between the HR and MDD populations. The relationship depicted here holds for both LM-Left and LM-Right, although our results suggest that for the left hippocampus CTRL is shifted closer to HR while for the right hippocampus CTRL is shifted closer to MDD.

7.1. On Populations

Our stated task is in terms of populations: to begin describing the relationship in hippocampus shape space of the three populations (MDD, HR, CTRL) amongst one another. However, our results are conditional: using LM-Left we classify, for example, the 22 HR subjects representing the HR population as belonging to either the MDD or the CTRL class, conditionally on “training” data from MDD and CTRL. This, in fact, is the standard approach in probabilistic pattern recognition; see, for example, [11]. The difference between a focus on populations versus conditionals is indicative of a difference between “policy science” and “laboratory science” [14]. A justification for the conditional approach in “laboratory science” is given in [11], where it is claimed that the unconditional approach “would be unnatural, because in a given application, one has to live with the [training] data at hand.” In “policy science”, however, knowledge about the populations themselves may be the focus.

By performing our analysis thrice, for each of the three populations in turn conditionally on the “training” data from the other two, we obtain three conditionals. Letting $n_1, n_2, n_3$ denote the class-conditional sample sizes for each of the three classes, we see that the joint distribution for our sample is $(n_1 + n_2 + n_3)p$-dimensional (where $p$ is the presumed “shape space” dimensionality of each observation). Each conditional considered is $n_i p$-dimensional, with one population remaining. The overall joint distribution of interest (the three populations in “shape space”) is of course not simply the product of our three conditionals. However, some population inferences regarding stochastic ordering can be performed via the (multiple) conditionals, and in particular the conditional approach justifies the simplistic view of our three populations in “shape space” given by Figure 6.


Acknowledgments

Sincere appreciation to Michael Bowers (JHU), Timothy Brown (JHU), Anthony Kolasny (JHU), Tomoyuki Nishino (WashU), and the others for their valuable assistance.


References

  1. L. B. Alloy, J. H. Riskind, and M. J. Manos, Abnormal Psychology: Current Perspectives, McGraw-Hill, New York, NY, USA, 9th edition, 2005.
  2. J.-F. Maa, D. K. Pearl, and R. Bartoszyński, “Reducing multidimensional two-sample data to one-dimensional interpoint comparisons,” Annals of Statistics, vol. 24, no. 3, pp. 1069–1074, 1996.
  3. P. J. Bickel and K. A. Doksum, Mathematical Statistics: Basic Ideas and Selected Topics, vol. 1, Prentice Hall, Upper Saddle River, NJ, USA, 2nd edition, 2005.
  4. M. I. Miller, A. Trouvé, and L. Younes, “Geodesic shooting for computational anatomy,” Journal of Mathematical Imaging and Vision, vol. 24, no. 2, pp. 209–228, 2006.
  5. T. Saito and H. Yadohisa, Data Analysis of Asymmetric Structures, Marcel Dekker, New York, NY, USA, 2005.
  6. J. Posener, L. Wang, J. Price, et al., “High dimensional mapping of the hippocampus in depression,” American Journal of Psychiatry, vol. 160, no. 1, pp. 83–89, 2003.
  7. B. Silverman, Density Estimation for Statistics and Data Analysis, Chapman & Hall, New York, NY, USA, 1986.
  8. “Depression. National Institute of Mental Health,” http://www.nimh.nih.gov/publicat/depression.cfm.
  9. M. Miller, C. Priebe, and Y. Park, “Collaborative computational anatomy: the perfect storm for MRI morphometry study of the human brain via diffeomorphic metric mapping, multidimensional scaling and linear discriminant analysis,” to appear in Proceedings of the National Academy of Science.
  10. J. G. Csernansky, S. Joshi, L. Wang, J. W. Haller, M. Gado, J. P. Miller, U. Grenander, and M. I. Miller, “Hippocampal morphometry in schizophrenia by high dimensional brain mapping,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 19, pp. 11406–11411, 1998.
  11. L. Devroye, L. Gyorfi, and G. Lugosi, “A Probabilistic Theory of Pattern Recognition,” 1996, Number 31 in Applications of mathematics.
  12. Analyze Software, “Mayo Clinic,” http://www.mayo.edu/bir /Software/Analyze/.
  13. T. Cox and M. Cox, Multidimensional Scaling, Chapman & Hall, New York, NY, USA, 2nd edition, 2001.
  14. B. Caffo, 2006, Personal communication.
  15. “Genetics and major psychiatric disorders: a program for genetic counselors. National Coalition for Health,” http://www.nchpeg.org/cdrom/empiric.html.
  16. C. E. Priebe, “Olfactory classification via interpoint distance analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 4, pp. 404–413, 2001.
  17. J. Rice, Mathematical Statistics and Data Analysis, Addison-Wesley, Reading, Mass, USA, 2nd edition, 1995.
  18. S. Allassonnière, A. Trouvé, and L. Younes, “Geodesic shooting and diffeomorphic matching via textures meshes,” in Proceedings of the 5th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR '05), pp. 365–381, Augustine, Fla, USA, November 2005.
  19. J. P. Mugler III and J. R. Brookeman, “Three-dimensional magnetization-prepared rapid gradient-echo imaging (3D MP RAGE),” Magnetic Resonance in Medicine, vol. 15, no. 1, pp. 152–157, 1990.