#### Abstract

We introduce a method of analysing longitudinal data in variables and a population of observations. The longitudinal data of each observation is exactly coded to an orbit in a two-dimensional state space . At each time, the information of each observation is coded to a point , where is the physical condition of the observation and is an ordering of variables. The orbit of each observation in is described by a map that dynamically rearranges the order of variables at each time step, eventually placing the most stable, least frequently changing variable to the left and the most frequently changing variable to the right. By this operation, we are able to extract dynamics from data and visualise the orbit of each observation. In addition, clustering of data in the stable variables is revealed. All possible paths that any observation can take in are given by a subshift of finite type (SFT). We discuss mathematical properties of the transition matrix associated with this SFT. The dynamics of the population is a nonautonomous multivalued map equivalent to a nonstationary SFT. We illustrate the method using longitudinal data on a population of households from Agincourt, South Africa.

#### 1. Introduction

Analysis of multivariable longitudinal data involves either statistical or nonstatistical methods. Statistical methods include multivariate Markov chain models [1], regression models [2], and mixed models [3], while some nonstatistical methods involve extraction of a dynamical system using the state space reconstruction technique [4] or visual methods such as motion charts [5, 6] and parallel coordinate plots (PCP) [7, 8]. A motion chart shows additional dimensions of the data at different time points, where the size and color of the bubble (among others) are used as variables. PCP represents variables as parallel axes, where a sequence of line segments intersects each axis at a point corresponding to the observation’s value at the associated variable. Both methods aim to identify correlations among variables and to identify clusters and patterns among observations in the data. We present a novel nonstatistical method of analysis that is particularly useful for collecting information of change. The state space reconstruction method mainly requires the use of delay coordinates from data to come up with models for prediction. Our method does not rely on delay coordinates, and our aim is not prediction. In contrast to motion charts, there is, in principle, no limitation to the number of variables studied in our method. In contrast to PCP, for large and large number of observations, orbits over our two-dimensional state space, or over time, are easily visualized.

Here we consider dynamics of multivariable longitudinal data of a population of observations by applying a swap operation on the data of each observation. We suppose that values of variables are discretised to bins so that each variable takes value from . For a fixed order of variables (e.g., ) denote by
the space of all -ary sequences of length and element by such that is the value of . The -ary multivariable time series in variables of an observation is defined by
where each . Our method can be used for -ary valued data, but for illustration, we use . We denote our *binary* multivariable longitudinal data of a population of observations by the set of binary multivariable time series of observations
In the general analysis of longitudinal data we see no recognition that data is taken with a purpose. Here we suppose that the value of a binary variable is either favourable or unfavourable to a given purpose. Suppose we would like to investigate the effect of changes in household variables, namely, biological mother (BM), household head (HH), and adult death (AD), on a child’s educational progress. Consider binary questions
and suppose that answers are coded either favourable = 1 or unfavourable = 0 to our purpose. It is a reasonable hypothesis that the answer ‘‘yes’’ to is favourable to child education, while ‘‘yes’’ to and is unfavourable. Table 1 shows coded data of 4 subjects (households) over 7 time steps. Using parallel coordinate plots (PCP), Figure 1(a) shows the answer of each subject at . If PCP were used to show the answers at all times, the line segments evolving in time would form surfaces that obscure each other. Figure 1(b) shows the Bratteli diagram [9, 10] of transition of answers from time to , for the variable order BM, HH, and AD. Transitions per time interval can be associated with a state transition matrix, where states are analysed on the space
This can be used to generate probability matrices for Markov models [11, 12].

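**Figure 1.** (a) Parallel coordinate plot of the answers of each subject. (b) Bratteli diagram of the transitions of answers for variable order BM, HH, and AD.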

Observe that, for question regarding (HH), household has constant favourable answer, while has constant unfavourable answer. On the other hand, has constant favourable answer to (AD). Our aim is to extract clusters associated with stable variables. Underlying our method is our belief that the set of physical variables in which an observation spends the most time is important (e.g., HH = 1 for , HH = 0 for , and AD = 1 for ) and that, among the physical variables themselves, the variables that are most often experienced (most probable) by the observations are important. We expose both the most probable variable and its value by a simple process of dynamically reordering variables.

There is no a priori indication of any absolute dynamics in data and here it is deterministically imposed. Because longitudinal data is fundamentally defined by change (if nothing changes, cross-sectional data is sufficient), the frequency of answer change of variables then becomes a property of interest. A deterministic operation is applied to the multivariable data of each observation at each time step, dynamically reordering the position of variables (and their corresponding values) by their stability; that is, the most stable is eventually positioned to the left, and the most frequently changing one is positioned to the right. All possible orders of variables are considered. From this, we introduce the significance state (order of variables) and fitness state (values associated with the variables) of an observation. It is in a chosen ordering that the notion of fitness takes an objective and consistent meaning. To our knowledge, the idea of fitness and significance is new in the literature. The -dimensional longitudinal data of an observation is represented as a 2-dimensional orbit in fitness-significance axes composed of points. Orbits in sufficiently encode the longitudinal data of each observation. Analysis of orbits at the individual and population levels in this space can then follow.

This paper is organized as follows. In Section 2, we present the method of constructing orbits from multivariable longitudinal data. A detailed theory of the reordering operation applied to data is presented. In Section 3, a deterministic equation of motion that generates all observed orbits, as well as other possible orbits, is presented. In Section 4 we discuss transitions that occur in data. This is captured by a nonstationary SFT. We also discuss dynamics at the population level. An illustration of the method is presented in Section 5. We give concluding remarks in Section 6.

#### 2. Background and Preliminaries: Method of Orbits

We suppose that longitudinal data is gathered by first specifying a purpose and then choosing questions that are of interest to that purpose. The questionnaire may be designed to a purpose posed in the form “to investigate the effect of the variables on .” Here we will only consider binary-valued questions with responses hypothesized as either favourable or unfavourable to the purpose. A favourable answer is coded 1 and an unfavourable answer is coded 0. Our longitudinal data is the response to a set of questions (associated with variables) surveyed from a population composed of observations over periods.

Denote the* questionnaire* by
the set containing questions. Let be an index set of elements, and let time . Let , and let . For each observation , denote a reordering of questions in at time by
the concatenation of answers to by
and the concatenation of question indices of by

*Example 1. *Coded data of observation for and is shown in column 2 of Table 2. Suppose we (arbitrarily) reorder the questions at to . Then and . Similarly, if , we have and . As we are merely rewriting entries from the original data, all information is preserved.

##### 2.1. Fitness and Significance States

Consider questions in (4) and assign index to (). Suppose we give more weight (significance) to and do this by positioning 0 in the left-end of question order, say, , , , denoted by 012 (or ). As in numbers or decimals, our weighting places the most significant number at the left-end position. All possible (concatenated) answers to 012 are given by
Since 000 has all unfavourable answers we say that it is the least* fit* answer, while 111 is the* fittest*. Note that there are states with the same number of favourable values, for example, 001, 100, and 010. By a suitable weighting of questions, we show below that the lexicographic ordering of answers in (10) is an appropriate ordering of fitness.
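The lexicographic ordering of the eight answers to question order 012 can be checked with a short sketch. The variable name `answers` is introduced here for illustration only; the point shown is that lexicographic order coincides with reading each answer string as a binary number, so an answer with a favourable value in the leftmost (most significant) position ranks above any answer with an unfavourable value there.

```python
from itertools import product

# All concatenated answers to a fixed order of three binary questions,
# generated in lexicographic order (illustrative names, not from the paper).
answers = ["".join(str(bit) for bit in bits) for bits in product((0, 1), repeat=3)]

# Lexicographic order agrees with the numerical order of the binary numbers.
as_numbers = [int(a, 2) for a in answers]

# "100" has one favourable answer and "011" has two, yet "100" ranks higher
# because the leftmost question carries the most weight.
rank_100 = answers.index("100")
rank_011 = answers.index("011")
```

Here `answers` starts at 000 (least fit) and ends at 111 (fittest), and `rank_100 > rank_011` although 011 has more favourable values.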

Table 3 illustrates concatenated coded answers to question order 012 of three observations from a population. To , observation has constant answer 0 while has constant answer 1.

Suppose we arrange answers in (10) lexicographically along an -axis. Then for question order 012, a one-dimensional dynamics on the -axis composed of the eight states arises. Answers of and to seldom change (i.e., they are both constant in ) so the two households and stay in the regions and of the -axis, respectively. Recall that is the question associated with the significant (left) position of the question order so fitness is biased towards the left position. We can then write because the significant variable is unfavourable in and favourable in . This holds true even if has the same, or more, favourable values as . This argument can be extended to any two elements with the same first entries.

In general, not all observations may be stable in the same variable; for example, in Table 3 is constant in , not in . Moreover, stability of an observation may change in time; it may be stable in variable over one time interval and then stable in variable over another time interval. We will not study orbits in a fixed question order alone. We construct a -axis with states corresponding to question orders. The order of questions per observation becomes a new variable.

*Definition 2. *Given and , the* fitness state *and* significance state* of observation at time are the sequences
respectively. The set of fitness states of length is called the* fitness space of ** variables* defined by
and the set of significance states of length is called the* significance space of ** variables* defined by
Elements of both and are arranged according to the lexicographic ordering () of sequences of length .

*Definition 3. *The space
is called the* change space for ** variables.*

Given , we have the cardinalities , , and . A way of labeling state is via the map For convenience, we label states in from left to right, and from top to bottom. If and , we will refer to state as state and write .

*Remark 4. *In general, for multivariate data in variables, with* all* variables -ary valued, the space is composed of states.
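As a minimal sketch of these cardinalities, the change space can be enumerated directly. The function name `change_space` and the representation of a state as a pair (significance tuple, fitness tuple) are assumptions made for illustration; no particular labelling of states is implied.

```python
from itertools import permutations, product
from math import factorial

def change_space(n, k=2):
    """Enumerate all (significance, fitness) pairs for n variables with k-ary values.

    Significance states are the orderings of the n question indices and
    fitness states are the k-ary sequences of length n, both generated in
    lexicographic order.
    """
    significance = list(permutations(range(n)))
    fitness = list(product(range(k), repeat=n))
    return [(s, f) for s in significance for f in fitness]

# Binary case: 2**n fitness states times n! significance states.
sizes = {n: len(change_space(n)) for n in (1, 2, 3)}

# A ternary example in the spirit of Remark 4: 3**2 * 2! states.
ternary_size = len(change_space(2, k=3))
```

For three binary variables this gives 8 fitness states, 6 significance states, and 48 states in total.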

##### 2.2. The Method of Orbits

We define the dynamics of observations taken from a survey of questions. Let be the set of nonnegative integers, let be the power set of , and let be such that

*Definition 5. *Let and let . (a) The map is defined by
(b) The map is defined by
Let . Then and both change values under . If , then is first applied to ; that is,

*Definition 6. *Let . Consider questionnaire with . For each observation , let be the frequency of change in answer value of over the observation period. Suppose
Inequality (20) is called the* observation frequency relation* and question order is the* initial significance state of observation *. If and at the population level, then choose question order . If , then choose question order as in the questionnaire (6). The* initial fitness state* is such that is the value of in . The* initial state of * is the ordered pair .

By choosing the initial significance given in Definition 6, we start the orbit in its most-likely significance state. This facilitates convergence to clusters (where they exist) and is useful for short data sets. Other strategies (e.g., using order of (6) or random choice) will nonetheless converge to question order according to (20).
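The choice of initial state in Definition 6 can be sketched as follows. The function names, the layout `series[t][q]` (coded answer to question `q` at time `t`), and the optional `population_freq` argument are hypothetical. The tie-breaking rule (population-level frequency first, then questionnaire order) is our reading of the definition and should be treated as an assumption.

```python
def change_frequencies(series):
    """Count, for each question, how often its answer changes between consecutive times."""
    n = len(series[0])
    return [sum(a[q] != b[q] for a, b in zip(series, series[1:])) for q in range(n)]

def initial_state(series, population_freq=None):
    """Return (significance, fitness) at the first time step.

    Questions are ordered from least to most frequently changing. Ties are
    broken by the population-level frequency when it is supplied, and by
    questionnaire order otherwise (an assumed reading of Definition 6).
    """
    freq = change_frequencies(series)
    pop = population_freq or [0] * len(freq)
    order = tuple(sorted(range(len(freq)), key=lambda q: (freq[q], pop[q], q)))
    fitness = tuple(series[0][q] for q in order)
    return order, fitness

# Illustrative data for one observation: three questions, four time steps.
series = [(1, 0, 1), (1, 1, 1), (1, 0, 0), (1, 0, 0)]
state0 = initial_state(series)
```

In this example question 0 never changes, question 2 changes once, and question 1 changes twice, so the initial question order is (0, 2, 1) with fitness state (1, 1, 0).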

*Remark 7. *Longitudinal data is only of interest where change occurs; else cross-sectional surveys are adequate. We are interested in longitudinal data that give nontrivial information of change about the population; that is, . Otherwise, question may be deleted as any such property becomes an identifier of subpopulations of possible interest for analysis in its own right.

*Definition 8. *For each observation , define the* change set at time* by
Let be the initial state of . Denote by the state of at time . The* change map* is such that
The set is given by the longitudinal data for each and is a useful ordered listing of questions that change answer values from time to . For each , the nonautonomous map defines an evolutionary process that displaces the most frequently (resp., slowly) changing answers and corresponding questions to the right (resp., left).

*Definition 9. *Given initial state of , define the state of at time by
The* forward orbit of ** under * is defined by

##### 2.3. Algorithm for Building the Orbit of an Observation in

We give a simple algorithm to determine the states that comprise the orbit of an observation from longitudinal data.

*Step 1. *Determine initial question order and initial state of and plot in .

*Step 2. *Identify from the data of the question that changes answer at time , say . Swap both (in ) and corresponding answer (in ) to the right of and , respectively, and change to , where if and 0 otherwise. This new question order and answer order give the next state of . Suppose both and change answers at . If , then sequentially swap to the right and (resp., and ) of the question order (resp., answer order), starting with (resp., ). Change to and to . Plot the point in and directed edge from to .

*Step 3. *Repeat Step 2, updating state and for time .
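Steps 1 to 3 can be sketched in a few lines. The names `change_step` and `build_orbit` are hypothetical, and one convention is assumed: when several answers change at the same time, the changed questions are moved to the right end one at a time, starting with the one currently furthest to the right. This choice is consistent with the statement that a change in the last two variables produces a vertical transition, but it is our reading rather than a quotation of the method.

```python
def change_step(sig, fit, changed):
    """Move each changed question, with its flipped answer, to the right end.

    Assumption: changed questions are processed from the rightmost current
    position to the leftmost.
    """
    sig, fit = list(sig), list(fit)
    for q in sorted(changed, key=sig.index, reverse=True):
        pos = sig.index(q)
        sig.pop(pos)
        sig.append(q)
        fit.append(1 - fit.pop(pos))  # flip the binary answer
    return tuple(sig), tuple(fit)

def build_orbit(series, initial_order):
    """Return the list of (significance, fitness) states of one observation.

    series[t][q] is the coded answer to question q at time t.
    """
    sig = tuple(initial_order)
    fit = tuple(series[0][q] for q in sig)
    orbit = [(sig, fit)]
    for prev, nxt in zip(series, series[1:]):
        changed = {q for q in range(len(prev)) if prev[q] != nxt[q]}
        sig, fit = change_step(sig, fit, changed)
        orbit.append((sig, fit))
    return orbit

# Illustrative data (not from Table 1): three questions, four time steps.
series = [(1, 0, 1), (1, 1, 1), (1, 0, 0), (1, 0, 0)]
orbit = build_orbit(series, initial_order=(0, 2, 1))
```

At every time the fitness state lists the current answers in the current question order, so the original data can be read back from the orbit. The never-changing question 0 stays in the leftmost position throughout.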

Visualization and analysis of orbits of observations in allow capturing information of change in longitudinal data. We illustrate in Figure 2 an orbit in . The useful distance on is given by the discrete metric; that is, if and zero otherwise. The visualized distance between points in has no interpretation so we may represent by a regularly spaced point.

*Remark 10. *The set of all orderings of variables is captured in . For fixed question order and , means that fitness state is fitter than . Each level in the significance axis is a question order under frequency ordering. The significance axis informs us which variables are weighted most strongly at each time, where significance is ordered from left to right. Clearly, our reordering of variables is one among many families of operations; for example, the changing variable could instead be swapped to the left end. This operation, however, does not reveal clusters.

##### 2.4. Transitions in

We now analyze possible state transitions which an observation can take in .

*Definition 11. *Let and be in , and let . (a) Suppose is such that . Then there is a *transition* from to under , defined by . (b) Suppose . If and , then there is a *horizontal transition* from to . If and , then there is a *vertical transition* from to . (c) Suppose . If, in addition, is such that , then the transition from to is *reversible*, and one writes .

We have* self-transitions* if (no change), the empty set. In general, horizontal transitions denote change in the right-most variable, while vertical transitions denote change in the last two variables.

Let . Given an observation , if and , then there is such that . We use the symbol “” to denote that there may be other observations in or at times and , respectively. State transitions of observation in are visualised as a sequence of directed edges.

*Example 12. *Consider the data of observation in column 2 of Table 4. The asterisks denote changing answers in the next time step. The frequencies of change are , , and , so the initial question order of is . The initial fitness state is (Definition 6). At , we apply to initial state . Questions and in change answers in the next time step so (not ). Applying to changes and to and , respectively. Next, is first applied to both and by moving each to the right (they are already rightmost), followed by moving and to the right end. Hence, we have . For and , verify that and , respectively. At each time the bold numbers in the significance column are the question indices in that change answers at . The orbit of in is shown in Figure 2. The vertical transition from state 11 to 3 denotes two changing answers, while a transition from state 9 to 32 denotes three changing answers.

*Example 13. *Figures 3(a) and 3(b) illustrate the associated orbits in of the four households given in Table 1. Observe that orbit of stays strictly on the left half of , particularly in subset where , while and stay in the right half subset, where and , respectively.

**Figure 3.** (a), (b) Orbits of the four households given in Table 1.

*Definition 14. *Let and let .* The return map* is such that
That is, first inserts and to the position, where and are located, followed by changing to the new value . If and then is first applied to and .

Trivially, for any , , and ,

Theorem 15. *Let . Define the image set and preimage set of over by
**
respectively. For each , *(a)*there exists a unique such that . Moreover, ;*(b)*there exists a unique such that . Moreover, .*

*Proof. *(a) Let be the fitness state of . The image of under both maps and is unique so is unique. Note that and the set of images of under , , is the set of distinct binary numbers of length , which can be associated with distinct states in . This is a bijection between elements of and the binary numbers of length , so .

(b) For , choose , where is the return map. The proofs of the uniqueness of and the cardinality of follow a similar argument as in (a).
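The counting in Theorem 15 can be checked by brute force for a small number of questions. The sketch below assumes the move-to-the-right convention in which changed questions are processed from the rightmost position to the leftmost; the helper names are hypothetical. It enumerates every state and every subset of questions, then counts distinct images and preimages.

```python
from itertools import combinations, permutations, product

def change_step(sig, fit, changed):
    """Move each changed question, with flipped answer, to the right end
    (assumed order: rightmost current position first)."""
    sig, fit = list(sig), list(fit)
    for q in sorted(changed, key=sig.index, reverse=True):
        pos = sig.index(q)
        sig.pop(pos)
        sig.append(q)
        fit.append(1 - fit.pop(pos))
    return tuple(sig), tuple(fit)

def all_states(n):
    return [(s, f) for s in permutations(range(n)) for f in product((0, 1), repeat=n)]

def image_set(state, n):
    """All states reachable from `state` in one step, over every subset of questions."""
    return {
        change_step(state[0], state[1], subset)
        for size in range(n + 1)
        for subset in combinations(range(n), size)
    }

n = 3
states = all_states(n)
images = {state: image_set(state, n) for state in states}

# Count how many states map onto each state in one step.
preimage_count = {state: 0 for state in states}
for state, targets in images.items():
    for target in targets:
        preimage_count[target] += 1
```

For three questions every state has exactly 8 distinct images and 8 distinct preimages, matching the number of subsets of the question set.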

Theorem 16. *Let and . If , then for .*

*Proof. *Fix and , where . Let and be sequences of length so that
Let . Observe that, for any integer , and are fixed under . Now
For , . Hence, .

Corollary 17. *For any , , and , the transition from to is reversible.*
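One reading of Theorem 16 and Corollary 17 is that a change involving only the questions in the rightmost positions of the current order is undone by applying the same change again. The sketch below tests that reading for three questions, under the assumed convention that changed questions are moved to the right end starting with the rightmost one. The names are hypothetical, and the final lines show a change set that is not a block of rightmost questions, for contrast.

```python
from itertools import permutations, product

def change_step(sig, fit, changed):
    """Move each changed question, with flipped answer, to the right end
    (assumed order: rightmost current position first)."""
    sig, fit = list(sig), list(fit)
    for q in sorted(changed, key=sig.index, reverse=True):
        pos = sig.index(q)
        sig.pop(pos)
        sig.append(q)
        fit.append(1 - fit.pop(pos))
    return tuple(sig), tuple(fit)

n = 3
states = [(s, f) for s in permutations(range(n)) for f in product((0, 1), repeat=n)]

# Apply the same "last k questions" change twice and record whether we return.
returns_after_two_steps = []
for sig, fit in states:
    for k in range(n + 1):
        last_k = set(sig[n - k:])  # questions in the k rightmost positions
        once = change_step(sig, fit, last_k)
        twice = change_step(once[0], once[1], last_k)
        returns_after_two_steps.append(twice == (sig, fit))

# Contrast: changing only the leftmost question is not undone by repeating it.
start = ((0, 1, 2), (0, 0, 0))
step1 = change_step(start[0], start[1], {0})
step2 = change_step(step1[0], step1[1], {0})
```

Under these assumptions every entry of `returns_after_two_steps` is true, while `step2` differs from `start` because question 0 has been displaced to the right end.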

The next theorem states some nonallowable transitions in .

Theorem 18. *Let
**
Define
*(a)*Let . The only transitions in , aside from self-transitions, are horizontal transitions. Let .*(b)*There is no transition from to .*(c)*There is no transition between and .*

*Proof. *(a) It is easy to show that any transition between distinct points with the same significance state is under . In particular, horizontal transitions are between pairs and , where and .

(b) Let . The sets and are the only elements of such that is moved to the leftmost position of the question order of , in which case question order must be , different from .

(c) As in (b), and are the only elements of such that remains in its position under . The rest of the proof follows a similar argument as in (b).

##### 2.5. Local Dynamics in

Consider the subset where answers are constant. Since states and states, then is composed of subsets of the form . For constant answers (i.e., ), denote by the subset of , where question has constant answer . Then any is given by , , . Define by Using (33), we can express as We now define the set (as in ) associated with transitions between points in , where . Recall from Theorem 15 that transitions in are under elements of the set . For , there are constant answer values so transitions in are under elements of the set Similarly, for constant answer values (i.e., ), transitions in are under the set In general, the dynamics in is described by transitions under the set For , we see from (37) that , as expected. Moreover, the transitions in are given by the transitions in , together with the additional transitions under the set difference .

Since and for and that there is a one-to-one correspondence between transitions in and (from (37)), we can write .

*Example 19. *Figures 4(a) and 4(b) illustrate all possible transitions in and , respectively. Consider horizontal transitions alternating between states 1 and 2. In Figure 4(a), this denotes alternating 0-1 answer to question 0. On the other hand, transitions alternating between states 1 and 2 in Figure 4(b) denote alternating 0-1 answer to question with constant favourable answer to question . All transitions between states in the dashed boxes (containing ) are under the map , where “" refers to the index on the right end of question order . In each , one answer value is constant and is positioned at the left end of the fitness state. The digraph associated with transitions in is exactly the same as the digraph associated with transitions in each . Transitions in these subsets are under . The set is composed of states 1 and 2, of states 3 and 4, and so on. Transitions under are shown as well.

**Figure 4.** (a), (b) All possible transitions in the two spaces considered in Example 19.

*Example 20. *Figure 5 illustrates orbits in that cluster into two. A red edge signifies a movement from right to left, a green edge a movement from left to right, and a blue edge a transition to the same state (self-transition). Instead of analysing in , we can remove and consider only the three questions , , and and analyse orbits in characterised by and orbits characterised by . Note that the frequency change of here is .

#### 3. General Equation of Motion of Dynamics in

This section concerns properties of all possible paths in .

*Definition 21 (see [13]). *Let and be arbitrary sets. A multivalued map from to , denoted by , is such that is assigned a set for all .

Consider our state space . Denote by the map in (22), where is constant. Define and the multivalued map by so is a 1 to map. Under , the orbit is such that . The multivalued map can be interpreted as a digraph whose vertices are the points in and edge if .

Equivalently, can be defined as a square matrix of size that encodes all possible paths in .

*Definition 22. *Let and let . For , let and . Let be the multivalued map in (39). The* theoretical transition matrix for ** under ** questions* is denoted by , where
captures all theoretically admissible transitions between states in . Each entry indicates that there is a transition from state to in one step. Hence, gives all possible paths in that any observation can take.

*Example 23. *From Figure 4, we have for and , as given below
Because the digraph in is the same as the digraphs in (Example 19), the matrix encoding transitions in is the same as the matrix encoding transitions in . The submatrices of entries all equal to 1 (in bold numbers) in correspond to transitions in . The zero submatrices in (in bold) denote the nonallowable transitions between states with the same significance (Theorem 18(a)).

##### 3.1. Properties of

We give some elementary results for .

Theorem 24. *(a) The digraph is -regular; that is, the in- and out-degree of all its vertices is .**(b) The largest eigenvalue of is , with associated eigenvector .**(c) Consider trace .**(d) Consider .*

*Proof. *(a) This is a consequence of Theorem 15.

(b) This is a consequence of (a) and the Perron-Frobenius theorem [14].

(c) Let . From the definition of self-transition, for any . Hence, for .

(d) Let and be points in such that
*Claim*. , where and are given in (27).

The case where is given in Example 23. We prove our claim for . By definition of self-transition, and . It is clear that and . Let
Observe that
Since and have the same image for all , , as claimed.

Our claim implies that rows of associated with and are the same. Because of these repeating rows, the determinant of the matrix is zero. The proof can be readily extended to the general case and is treated in exactly the same manner.
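The properties in Theorem 24 can be checked numerically for two questions. The sketch below builds a 0/1 transition matrix from the change map, using the assumed convention that changed questions are moved to the right end starting with the rightmost one. The names `transition_matrix` and `T` are hypothetical, and the row and column ordering is an arbitrary enumeration rather than the labelling used in the figures.

```python
from itertools import combinations, permutations, product

def change_step(sig, fit, changed):
    """Move each changed question, with flipped answer, to the right end
    (assumed order: rightmost current position first)."""
    sig, fit = list(sig), list(fit)
    for q in sorted(changed, key=sig.index, reverse=True):
        pos = sig.index(q)
        sig.pop(pos)
        sig.append(q)
        fit.append(1 - fit.pop(pos))
    return tuple(sig), tuple(fit)

def transition_matrix(n):
    """0/1 matrix with entry 1 where a one-step transition exists."""
    states = [(s, f) for s in permutations(range(n)) for f in product((0, 1), repeat=n)]
    index = {state: i for i, state in enumerate(states)}
    matrix = [[0] * len(states) for _ in states]
    for state in states:
        for size in range(n + 1):
            for subset in combinations(range(n), size):
                target = change_step(state[0], state[1], subset)
                matrix[index[state]][index[target]] = 1
    return matrix

T = transition_matrix(2)
size = len(T)
row_sums = [sum(row) for row in T]                                      # out-degrees
col_sums = [sum(T[i][j] for i in range(size)) for j in range(size)]    # in-degrees
trace = sum(T[i][i] for i in range(size))                               # self-transitions
has_repeated_rows = len({tuple(row) for row in T}) < size               # forces det = 0
```

Constant row sums mean the all-ones vector is an eigenvector whose eigenvalue is the common row sum, and a repeated row is enough to force a zero determinant, so no linear-algebra library is needed for these checks.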

##### 3.2. Subshift of Finite Type

Let , . If observation is in state after time steps, we will write to indicate that more than one observation may be in . We associate with the orbit a sequence of symbols , where if . Denote the corresponding* symbol space of one-sided sequences of * by
The* equation of motion in * is given by the shift map defined by

*Definition 25 (see [15]). *The pair is called the* subshift of finite type (SFT) determined by *.

The SFT determined by captures the exact detail of all possible itineraries of observations in . Many dynamical properties of an SFT depend on the structure of its associated transition matrix or digraph.

*Definition 26 (see [16]). *(i) A transition matrix is* irreducible* if for each pair , there exists so that . Otherwise, is* reducible*. (ii) The digraph associated with is* strongly connected* if it is possible to get from any vertex to any other one by traversing a sequence of edges as directed by . (iii) If there exists such that for all , then is* primitive*. (iv) Let be any set. A continuous map is* (topologically) transitive* if, for all nonempty open sets , there exists such that . (v) If there is 0 such that for all , then is* (topologically) mixing*.

Consider the SFT given by . The following results [17, 18] concern dynamical properties of a transitive SFT and algebraic properties of its transition matrix. (i) The shift map is transitive (resp., mixing) if and only if is irreducible (resp., primitive). (ii) is irreducible if and only if the digraph is strongly connected. In that case we say that is irreducible. (iii) A nonnegative, irreducible matrix with a positive element on the main diagonal is primitive.

Theorem 27. * is irreducible.*

*Proof. *We prove by induction on . From Example 23, for all so is irreducible for . Assume that is irreducible for . Then the digraph is strongly connected, and is irreducible. We prove that is strongly connected to .

Recall that transitions in are under the set . From (37), this set can be expressed as , where is the set associated with transitions in the irreducible set , and . With the additional transitions under , we show that (i)the set (given in (33)) is irreducible;(ii)there is a path to and from any pair , where and .We prove (i) by showing that there is such that, for all , there is a transition from to . By Theorem 16, for any . Let
Observe that all ’s have the same fitness states but distinct significance states. In particular, each is contained in a distinct subset . This vertical transition allows all to visit a distinct in at most steps. Hence, is irreducible. To show (ii), take
Under , there is a reversible transition .

From (i), (ii), and the irreducibility of , there is a path between any pair . Hence, is strongly connected and is irreducible, as desired.

Corollary 28. * is primitive.*
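Irreducibility can be checked computationally through strong connectivity of the digraph. The following sketch runs a breadth-first search from every state for three questions, again assuming that changed questions are moved to the right end starting with the rightmost one; the helper names are hypothetical. Because every state has a self-transition, the largest shortest-path length is also the smallest power at which every entry of the transition matrix becomes positive.

```python
from collections import deque
from itertools import combinations, permutations, product

def change_step(sig, fit, changed):
    """Move each changed question, with flipped answer, to the right end
    (assumed order: rightmost current position first)."""
    sig, fit = list(sig), list(fit)
    for q in sorted(changed, key=sig.index, reverse=True):
        pos = sig.index(q)
        sig.pop(pos)
        sig.append(q)
        fit.append(1 - fit.pop(pos))
    return tuple(sig), tuple(fit)

def successors(state, n):
    return {
        change_step(state[0], state[1], subset)
        for size in range(n + 1)
        for subset in combinations(range(n), size)
    }

def distances_from(start, n):
    """Breadth-first search: shortest number of steps from `start` to each reachable state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in successors(state, n):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist

n = 3
states = [(s, f) for s in permutations(range(n)) for f in product((0, 1), repeat=n)]
all_distances = {state: distances_from(state, n) for state in states}

strongly_connected = all(len(d) == len(states) for d in all_distances.values())
has_self_loops = all(state in successors(state, n) for state in states)
diameter = max(max(d.values()) for d in all_distances.values())
```

With `strongly_connected` and `has_self_loops` both true, primitivity follows from result (iii) above, and `diameter` gives the number of steps after which any state can be reached from any other.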

*Definition 29 (see [15]). *The topological entropy of the shift map is defined by
where is the set of allowable sequences of length .

A continuous map on a compact metric space is chaotic if its topological entropy is positive [17]. The topological entropy for SFTs is given by the following theorem [15].

Theorem 30. *Let be a transition matrix and let be the associated SFT. Then
**
where is the maximum eigenvalue of .*

Since is the associated subshift of the multivalued map in (39), its topological entropy equals the logarithm of the largest eigenvalue given in Theorem 24(b).
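The entropy can be illustrated numerically by counting admissible sequences. The sketch below uses two questions and the same assumed change-map convention as before (changed questions moved to the right end, rightmost first); all names are hypothetical. It counts allowable sequences of a given length by repeated vector-matrix multiplication and compares the growth rate with the logarithm of the common out-degree.

```python
from itertools import combinations, permutations, product
from math import log

def change_step(sig, fit, changed):
    """Move each changed question, with flipped answer, to the right end
    (assumed order: rightmost current position first)."""
    sig, fit = list(sig), list(fit)
    for q in sorted(changed, key=sig.index, reverse=True):
        pos = sig.index(q)
        sig.pop(pos)
        sig.append(q)
        fit.append(1 - fit.pop(pos))
    return tuple(sig), tuple(fit)

def transition_matrix(n):
    states = [(s, f) for s in permutations(range(n)) for f in product((0, 1), repeat=n)]
    index = {state: i for i, state in enumerate(states)}
    matrix = [[0] * len(states) for _ in states]
    for state in states:
        for size in range(n + 1):
            for subset in combinations(range(n), size):
                target = change_step(state[0], state[1], subset)
                matrix[index[state]][index[target]] = 1
    return matrix

def count_words(matrix, length):
    """Number of admissible state sequences with `length` symbols."""
    size = len(matrix)
    counts = [1] * size  # sequences of one symbol ending at each state
    for _ in range(length - 1):
        counts = [sum(counts[i] * matrix[i][j] for i in range(size)) for j in range(size)]
    return sum(counts)

n = 2
T = transition_matrix(n)
word_counts = [count_words(T, length) for length in range(1, 6)]
growth_rate = log(count_words(T, 200)) / 200  # finite-length entropy estimate
expected = n * log(2)                         # log of the common out-degree 2**n
```

The counts grow by a factor of 4 per added symbol, so the estimate approaches 2 log 2 as the sequence length increases.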

#### 4. Dynamics from Data

Given multivariable longitudinal binary data of dimension , every observed orbit is an orbit of an SFT determined by . The dynamics of real-world longitudinal data, however, is often not defined by an SFT. Data usually selects certain paths given by and may sometimes stay in a particular subspace of .

##### 4.1. Nonstationary SFT

Let , , and . For each , denote by the matrix that records observed transitions that occur in the longitudinal data from time to , where We note that is defined over an interval of time and that may vary with time. Some allowable transitions between states given by might not occur in the observed data so . For the case where is constant we write and we can define the SFT given by the pair .

If is not constant, then we have a sequence of matrices
Given , define
We call the* nonstationary symbolic space restricted by the sequence of matrices *. The shift takes place as usual in and is denoted by . We refer the reader to [19–21] for a discussion on nonstationary SFT.
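A minimal sketch of the time-dependent restriction: for each time interval, record which transitions are observed in the population, then test whether a sequence of states respects those restrictions step by step. The function names and the integer state labels are hypothetical, and each observed matrix is stored as a set of (from, to) pairs rather than as a full 0/1 array.

```python
def observed_transitions(orbits):
    """Return a list whose t-th entry is the set of (i, j) transitions
    observed in the population from time t to time t + 1.

    orbits: one sequence of state labels per observation, all of equal length.
    """
    horizon = len(orbits[0])
    return [{(orbit[t], orbit[t + 1]) for orbit in orbits} for t in range(horizon - 1)]

def is_admissible(sequence, restrictions):
    """True if every step of `sequence` is an observed transition at that time."""
    steps = zip(sequence, sequence[1:])
    return all((a, b) in restrictions[t] for t, (a, b) in enumerate(steps))

# Illustrative population of three observations over four time points.
orbits = [
    [1, 2, 2, 5],
    [1, 1, 3, 5],
    [4, 2, 3, 3],
]
restrictions = observed_transitions(orbits)

mixed_path = [1, 2, 3, 5]      # not an observed orbit, but every step is observed
forbidden_path = [1, 3, 3, 5]  # the step 1 -> 3 is not observed at the first interval
is_stationary = all(r == restrictions[0] for r in restrictions)
```

The sequence `mixed_path` is admissible even though no single observation follows it, which shows that the restricted symbolic space can be larger than the set of observed orbits.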

*Definition 31. *The pair is the nonstationary SFT (NSFT) determined by the sequence of matrices .

Visualization of dynamics of longitudinal data defined by an NSFT is illustrated by a sequence of directed graphs called a Bratteli diagram [9, 10]. Equivalently we may plot orbits of observations in over time. Figure 6(a) illustrates the orbits of a population in a subset of over time.

**(a)**

**(b)**

##### 4.2. Population Dynamics

We discuss the longitudinal data of a population of observations. Aside from all possible paths that an observation can take in , we are also interested in the number of observations on paths. Because it is possible for more than one observation to occupy a state in at a given time, we can consider the number of observations that follow the same transition in .

In general, given a transition matrix (e.g., those in (40) or (52)), the standard analysis is to accumulate the number of transitions between states and, from this, to construct the associated stochastic transition matrix. Construction of associated transition and stochastic matrices on states of can then follow as usual. In what follows, let denote a finite index set with and .

*Definition 32. *Let be the number of observations in state at time . The density matrix at time is defined by
where is the number of observations in state at time that go to state at time . The (net)* flux* of state at time is defined by
Let , . The flux in (56) can also be expressed as
From (56) and (57), we have
Let be a row vector whose th entry is . We will refer to as the* observed capacity vector at* time . Given the initial observed capacity vector , there are two methods that we can use to determine .

*(i) Nonhomogeneous Case*. From data, it is natural to construct a probability matrix based directly on the density matrices . For each , we construct a time-dependent probability matrix , where
The capacity vector is given by the product
We show that the capacity in (56) agrees with the th entry of (60). It is trivial if for all . Otherwise we have
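The nonhomogeneous construction can be sketched with exact rational arithmetic. The function names and the zero-based state labels are hypothetical. Two assumptions are made: the net flux of a state is taken to be inflow minus outflow, and rows of the probability matrix for unoccupied states are set to zero. Under these assumptions the capacity vector at the next time equals the current capacity vector multiplied by the time-dependent probability matrix.

```python
from fractions import Fraction

def capacity(orbits, t, n_states):
    """Number of observations in each state at time t."""
    counts = [0] * n_states
    for orbit in orbits:
        counts[orbit[t]] += 1
    return counts

def density_matrix(orbits, t, n_states):
    """Entry (i, j): observations in state i at time t that are in state j at time t + 1."""
    matrix = [[0] * n_states for _ in range(n_states)]
    for orbit in orbits:
        matrix[orbit[t]][orbit[t + 1]] += 1
    return matrix

def probability_matrix(density):
    """Row-normalise the density matrix; rows of empty states stay zero (assumption)."""
    rows = []
    for row in density:
        total = sum(row)
        rows.append([Fraction(x, total) if total else Fraction(0) for x in row])
    return rows

def propagate(cap, prob):
    """Row vector times matrix."""
    size = len(cap)
    return [sum(cap[i] * prob[i][j] for i in range(size)) for j in range(size)]

def net_flux(density):
    """Inflow minus outflow for each state (assumed definition)."""
    size = len(density)
    inflow = [sum(density[i][j] for i in range(size)) for j in range(size)]
    outflow = [sum(row) for row in density]
    return [a - b for a, b in zip(inflow, outflow)]

# Illustrative population: four observations, three states, three time points.
orbits = [[0, 1, 1], [0, 0, 2], [2, 1, 1], [2, 2, 0]]
c0 = capacity(orbits, 0, 3)
D0 = density_matrix(orbits, 0, 3)
c1_predicted = propagate(c0, probability_matrix(D0))
c1_observed = capacity(orbits, 1, 3)
flux0 = net_flux(D0)
```

In this example the predicted and observed capacity vectors agree, and the net flux equals the change in capacity between the two times.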

*(ii) Homogeneous Case*. Let be the density matrix at time and let
be the accumulated density matrix of the data over the observation period. Define the mean density matrix by
Suppose is irreducible. Define the mean probability matrix from by , where