Chinese Journal of Mathematics


Research Article | Open Access

Volume 2014 |Article ID 756917 |

Manfred Trümper, "The Collatz Problem in the Light of an Infinite Free Semigroup", Chinese Journal of Mathematics, vol. 2014, Article ID 756917, 21 pages, 2014.

The Collatz Problem in the Light of an Infinite Free Semigroup

Academic Editor: E. Bannai
Received 24 Aug 2013
Accepted 10 Nov 2013
Published 30 Apr 2014


The Collatz (or $3m+1$) problem is examined in terms of a free semigroup on which suitable diophantine and rational functions are defined. The elements of the semigroup, called T-words, comprise the information about the Collatz operations which relate an odd start number to an odd end number, the semigroup operation being the concatenation of T-words. This view puts the concept of encoding vectors, first introduced in 1976 by Terras, in the proper mathematical context. A method is described which allows one to determine a one-parameter family of start numbers compatible with any given T-word. The result brings to light an intimate relationship between the Collatz problem and the $3m-1$ problem. Also, criteria for the rise or fall of a Collatz sequence are derived and the important notion of anomalous T-words is established. Furthermore, the concept of T-words is used to elucidate the question of what kinds of cycles (trivial, nontrivial, rational) can be found in the Collatz problem and also in the $3m-1$ problem. In addition, the notion of the length of a Collatz sequence is discussed and applied to average sequences. Finally, a number of conjectures are proposed.

1. Introduction

The ($3m+1$) problem, first posed by Lothar Collatz in 1937, concerns the behavior of natural numbers under the recurrence function $m \to 3m+1$ for odd $m$ and $m \to m/2$ for even $m$. The Collatz conjecture says that, for any natural number, the iteration leads to the trivial cycle 1, 4, 2, 1. Historical accounts of the Collatz problem have been given, for example, by Lagarias [1], Wirsching [2], and others.

The application of the Collatz rules to a given start number will always result in a sequence of uniquely determined integers. It will be referred to as the Collatz sequence attached to that start number. Only start numbers which are odd will be considered. This simplifies the arguments while not restricting in any way the generality of the results.

We also recall the simple fact that, for all integers $m$ inside a Collatz sequence, it holds that $m \not\equiv 0 \pmod 3$, i.e. $m$ cannot be divided by 3. This is so because $3m+1$ is in the residue class $1 \pmod 3$ and divisions by 2 cannot result in a number which has the divisor 3. Integers which have the divisor 3 can occur only either as start numbers or as even ones preceding a start number.
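The two rules and the divisibility remark above can be tried out numerically. The following is a minimal sketch; the function name `collatz_sequence` is an illustrative choice, not notation from the paper, and the loop only terminates for start numbers that actually reach 1.

```python
def collatz_sequence(start):
    """Collatz sequence from an odd start number down to 1 (both included)."""
    if start < 1 or start % 2 == 0:
        raise ValueError("start must be a positive odd integer")
    sequence = [start]
    m = start
    while m != 1:  # terminates only if the iteration reaches 1 for this start
        m = 3 * m + 1 if m % 2 else m // 2
        sequence.append(m)
    return sequence


# The start number 9 is divisible by 3, but no later term should be.
sequence = collatz_sequence(9)
print(all(m % 3 != 0 for m in sequence[1:]))
```

The start number itself is excluded from the check because, as stated above, multiples of 3 can still occur as start numbers.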

Since the Collatz operations have unique inverses, we can also consider a sequence going backwards. However, Collatz backward sequences are not unique because there are branch points. The branch points are the integers $m$ satisfying at the same time $m \equiv 0 \pmod 2$ and $m \equiv 1 \pmod 3$. To such $m$, one can either apply the inverse operation $m \to (m-1)/3$ or one can multiply it by an even power of 2 (the reason being that $m \equiv 1 \pmod 3$ implies $4m \equiv 1 \pmod 3$). Thus, in order to uniquely fix the backward sequence, one has to specify how to proceed at branch points.

Consider, for example, the start number with the sequence , , , , , , , , , , , , . Branch points are (from right to left) , , , . The backward sequence could branch from down to or up to and from up to or and so on.

This situation calls for some encoding of a Collatz sequence. Terras [3] wrote the symbols and for the operations and , respectively, and listed them in the order in which they appear (from left to right), such as for the example given above. He called this object encoding vector. A “component” of this vector is also the parity of the integer which triggers the respective operation. Lagarias [1] modified the terminology and called the same object parity vector. However, though one could go along with the terms “encoding” or “parity,” the object in question is not a vector but rather the element of a free semigroup.

The symbols used in the present paper will be $u$ (for “up”) to designate the operation $m \to 3m+1$ and $d$ (for “down”) to designate the operation $m \to m/2$. The sequence of operation symbols will be referred to as a T-word. This is an acceptable terminology since the elements of a free semigroup are usually called “words.” The prefix T is chosen in commemoration of Riho Terras, who seems to have been the first to encode a Collatz sequence. For the start number , mentioned in the preceding paragraph, the T-word would be written as or, shorter, .

Remark on Notation. Since any operation $u$ results in an even integer, it is always followed by an operation $d$. Therefore, we will frequently use the abbreviation $s = ud$. The letter “$s$” is taken from the term spike, which is what it resembles in graphics. Use of this notation makes a T-word shorter. For example, the T-word resulting from the start number would be written as or . In the following, both notations will be used. The reason is that key formulas and relations will be simpler and easier to memorize in the “” notation.
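As an illustration of the encoding, the sketch below reads off the T-word of an odd start number as a string of the letters `u` and `d`, converts it to the shorter spike notation, and extracts the number of `d`-operations following each `u`. The helper names, and the choice to stop at the end number 1, are assumptions made for this example only.

```python
def t_word(start):
    """T-word of an odd start number, written with 'u' and 'd', stopping at 1."""
    if start < 1 or start % 2 == 0:
        raise ValueError("start must be a positive odd integer")
    letters = []
    m = start
    while m != 1:
        if m % 2:
            letters.append("u")
            m = 3 * m + 1
        else:
            letters.append("d")
            m //= 2
    return "".join(letters)


def spike_notation(word):
    """Abbreviate every 'ud' by the spike symbol 's'."""
    return word.replace("ud", "s")


def exponents(word):
    """Number of 'd' letters following each 'u', listed from left to right."""
    return [len(chunk) for chunk in word.split("u")[1:]]


word = t_word(3)             # 3, 10, 5, 16, 8, 4, 2, 1
print(word)                  # ududddd
print(spike_notation(word))  # ssddd
print(exponents(word))       # [1, 4]
```

Because every `u` is immediately followed by a `d`, the spike form never contains a bare `u`.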

In fact, the T-words (with the concatenation of words as composition rule) form a semigroup .

The typical element of is a T-word of the general form where is the number of -operations in . The exponent is, for example, the number of -operations following the third -operation.

The total number of -operations in a T-word is then The corresponding number sequence, called a realization of the T-word, begins with an odd start number   and terminates with an odd end number  . That sequence is composed of even integers and of odd integers. To exclude triviality, we require that start numbers are odd. To allow seamless concatenation, we then must require also that end numbers are odd.

Note that each -operation yields an even integer and is thus followed by at least one -operation. Therefore we have for and thus .

The generators of the semigroup are the T-words with and , i.e. , , .

Under the semigroup operation, the concatenation of any two T-words and , a new T-word is uniquely defined. The operation is not commutative, but it is associative; i.e. The unit element is the empty string of characters, which will be denoted by  . We have .

From the above, it follows that both the order of the semigroup and the number of generators of are .

It is easy to enumerate infinite sets of subsemigroups of . Consider all T-words ending in the same generator , say. For any , they form a countably infinite sub-semigroup. Of particular interest later in this paper will be the case , i.e. the set of all T-words ending in . They will be referred to as reduced  T-words.

More generally, all T-words of the form with fixed form a sub-semigroup. Likewise, for all T-words of the form with fixed .

In unique fashion, the semigroup can be extended to a full group. This is done by adding to the inverses of all elements to the semigroup. The inverse of each element is unique since the inverses of the Collatz operations are uniquely defined. However, when extending the semigroup to the full group, some of the elements can no longer be represented by integers but only by rational numbers. This item will be considered in Section 4.

The present paper does not solve the Collatz problem but puts it into a new mathematical context which has already brought new insights. As Wirsching [4] noted, “the strategy for finding interesting things about an “intractable” problem is threefold: translate the conjecture into as many different contexts as you can, formulate weaker statements implied by the conjecture in question and try to prove some of them, and wait for flashes of genius giving new and interesting insights.”

2. Realizations of a T-Word. The Rescale Procedure

To find the T-word which is attached to a given start number is an easy task. To establish a relation between the start and end numbers of a given T-word is a matter of straightforward calculation which was first done by Böhm and Sontacchi [5].

However, to find all start numbers to which a given T-word is attached is not quite obvious. This problem has been solved by Trümper [6]. The method and the results are described in the following. It will be seen that its solution will permit new and deep insights into the Collatz problem. Moreover, the method lends itself easily to numerical applications.

A sequence of integers which follow the operations spelled out by a T-word will be called a realization of . The task of this section will be to describe a systematic procedure that determines all realizations for any given T-word.

Though, in general, it is impossible to give closed form expressions for the realizations of a T-word, it can be done for some simple T-words, with methods adapted to the particularity of the case. Two examples will be given below.

Example 1. To find the set of start numbers and end numbers for the simplest type of T-word, i.e. a generator . The task is simple, but it is of basic importance since the generators are the building blocks of all T-words.
We begin by writing the defining relation for a generator. It is . To every odd start number this relation uniquely assigns a natural number and an odd end number . Since the Collatz operations have unique inverses, the equation can also be written as . But now we cannot say that a given end number determines and in a unique way. The reason is that has a divisor only if . To which residue class belongs depends on both and . If , then . If , then . Note that cannot happen.
Thus, if we want to write the start number as a diophantine function of , we must distinguish the two cases where is even or odd.
Case  . It requires and, since must be odd, it takes the form with . Then we get
Case  . It requires and, since must be odd, it takes the form with . Then we get The start numbers have now got an index which labels the different realizations of the generator.

Remark. For , we have , the T-word for the trivial cycle. Indeed, with , we get, from (4) and (5), and .

These results on the numerical realizations of the generators of are summarized as follows.

Lemma 1. A generator of has realizations with start numbers and end numbers , , which are determined by with and with . The quantities and are given by
Case  . , ,
Case  . , .

Note that and are natural numbers satisfying The integers and are, respectively, referred to as the start period and the end period of the generator .

We will refer to as the start phase and to as the end phase of the generator .

That the expressions (4), (5), (6), and (7) for the realizations of T-word generators can be obtained in closed form is due to the structural simplicity of generators. But even for this simplest of all T-words, the distinction between the parity of the parameter comes into play, giving rise to two different expressions. In the next simplest case, one of two concatenated generators, , we would have to deal with four cases, namely, those where is (even, even), (even, odd), (odd, even), or (odd, odd).
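The closed forms of Lemma 1 are not reproduced here, but the realizations of a generator can be found by direct search. The sketch below assumes a generator consisting of one `u` followed by `p` `d`-operations (the name `p` is chosen for this example) and lists the odd start numbers `a` for which `(3a + 1) / 2**p` is an odd integer.

```python
def generator_realizations(p, count=5):
    """First `count` pairs (start, end) realizing the generator u d^p."""
    found = []
    a = 1
    while len(found) < count:
        value = 3 * a + 1
        if value % 2 ** p == 0 and (value // 2 ** p) % 2 == 1:
            found.append((a, value // 2 ** p))
        a += 2  # only odd start numbers are considered
    return found


print(generator_realizations(1, 3))  # [(3, 5), (7, 11), (11, 17)]
print(generator_realizations(2, 3))  # [(1, 1), (9, 7), (17, 13)]
```

In this search, consecutive start numbers are spaced `2**(p + 1)` apart and consecutive end numbers 6 apart. The pair `(1, 1)` found for `p = 2` is the trivial cycle mentioned in the Remark.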

Example 2 (the T-word ). It represents the steepest possible increase of a Collatz sequence. With , this T-word has the minimal number of -operations. By one operation , the start number is changed to which will be rewritten as . Now the second operation will result in which gives . Therefore, after operations , the result will be . Up to here, could be any real number. The condition that is an odd integer means , . With this expression, one gets . In order that this end number is an odd integer for all natural numbers , the re-scale substitution needs to be done. It yields , the end number of the Collatz sequences belonging to the T-word.

From here, the repeated application of the inverse operation yields the start number the end number being with and . In this example, it is the structural simplicity of the T-word which permits a realization in closed form.
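The argument of Example 2 can be checked numerically. Adding 1 to the current value turns each spike into a multiplication by 3/2, which suggests start numbers of the form `2**(n + 1) * k - 1` and end numbers `2 * 3**n * k - 1` for `k = 1, 2, ...`. These expressions are derived here from that argument and are an assumption about the displayed formulas, which did not survive in this copy.

```python
def apply_spikes(a, n):
    """Apply s = ud exactly n times to the odd number a.

    Returns the odd end number, or None if some intermediate value is even,
    i.e. if a is not a realization of the T-word consisting of n spikes.
    """
    m = a
    for _ in range(n):
        if m % 2 == 0:
            return None
        m = (3 * m + 1) // 2
    return m if m % 2 else None


n = 3
for k in range(1, 4):
    start = 2 ** (n + 1) * k - 1
    print(start, apply_spikes(start, n), 2 * 3 ** n * k - 1)
```

For each `k` the last two printed numbers agree, and a search over all odd numbers confirms that no other start numbers fit this T-word.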

To treat the general case, one needs a systematic procedure which is applicable to any T-word. It is best demonstrated for a relatively simple T-word, for example, .

The start number , required to be an odd integer, is written as . The operation will take it to the even integer which, by the operation , will give . This expression should be odd since, according to the T-word, the next operation will be . It will be made odd for all values of by rescaling . The result will be the integer .

Now the operation yields the even integer and gives . This expression should be even since the next operation to be performed is . It will be made even for all values of by rescaling . The result will be the integer . The operation now yields . Since this integer must be odd, the rescaling will be applied to give . We now got the end number for the T-word .

The start number will be obtained by applying iteratively to all rescale substitutions which were done above. The result is .

Checking the result with the T-word gives the sequence , , , , , .

The role of the rescale substitutions is summed up by saying that they are done to assure the correct parity of each integer which is obtained by the Collatz operation prescribed by the T-word.
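The re-scale idea can be sketched in a few lines. This is not the author's program: it is a minimal reimplementation under the assumptions that a T-word is given as the list of its `d`-exponents `[p1, p2, ...]` and that realizations are indexed by `k = 0, 1, 2, ...`, which may differ from the paper's indexing. Both the start number and the current value are kept as linear expressions in `k`, and each parity requirement is enforced by substituting `k -> 2k` or `k -> 2k + 1`.

```python
def rescale(exponents):
    """Return ((A, alpha), (B, beta)): start numbers A*k + alpha and
    end numbers B*k + beta, k = 0, 1, 2, ..., for the given T-word."""
    sa, sb = 1, 0  # start number  = sa*k + sb
    ca, cb = 1, 0  # current value = ca*k + cb

    def force_parity(parity):
        nonlocal sa, sb, ca, cb
        # ca is odd whenever this is called, so k -> 2k + r fixes the parity.
        r = (cb + parity) % 2
        sa, sb = 2 * sa, sa * r + sb
        ca, cb = 2 * ca, ca * r + cb

    force_parity(1)                      # the start number must be odd
    for p in exponents:
        ca, cb = 3 * ca, 3 * cb + 1      # operation u
        for j in range(p):
            ca, cb = ca // 2, cb // 2    # operation d
            # even if another d follows, odd at the end of the generator
            force_parity(1 if j == p - 1 else 0)
    return (sa, sb), (ca, cb)


def follow(a, exponents):
    """Apply the T-word to a; return the end number or None if it does not fit."""
    m = a
    for p in exponents:
        if m % 2 == 0:
            return None
        m = 3 * m + 1
        for _ in range(p):
            if m % 2:
                return None
            m //= 2
    return m if m % 2 else None


print(rescale([1, 2]))  # ((16, 11), (18, 13))
```

For the T-word with exponents `[1, 2]`, the first realization is 11, 34, 17, 52, 26, 13, in agreement with the printed result.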

The re-scale procedure provides the means to numerically calculate the start and end numbers for any given T-word. The author of this paper has developed a computer program able to deal with T-words having a length of up to about letters.

For instance, the unusually long Collatz series with start number 27 and end number 1 has the T-word (we are abbreviating ) It is made up of 41 generators with a total of 70 $d$-operations. The number of steps is 111. The start period is $2^{71}$ and the end period is $2 \cdot 3^{41}$.

The re-scale procedure yields the realizations with . For , this gives indeed and .

Using the computer program mentioned above, the calculation of these numbers takes just a fraction of a second.

Since this sequence has attracted a certain amount of public attention (it figures in Hofstadter’s cult book “Gödel, Escher, Bach” [7]), we will carry it as an example throughout this paper and note its relevant properties and characteristics as they are conceptually developed.
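The summary data of this example are easy to recompute. The sketch below assumes that the sequence in question is the one starting at 27, the example associated with Hofstadter's book, and counts the `u`- and `d`-operations needed to reach 1; the function name is illustrative.

```python
def t_word_counts(start):
    """Number of u-operations and d-operations from an odd start down to 1."""
    ups = downs = 0
    m = start
    while m != 1:
        if m % 2:
            m = 3 * m + 1
            ups += 1
        else:
            m //= 2
            downs += 1
    return ups, downs


ups, downs = t_word_counts(27)
print(ups, downs, ups + downs)  # 41 70 111
```

The number of `u`-operations equals the number of generators in the T-word, since each generator contains exactly one `u`.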

We will now describe an alternative method to calculate the start and end numbers for all realizations of a given T-word. It starts from a given T-word with already known start and end numbers, which will be extended in either of the following ways: either replace the last (i.e., rightmost) generator by one with an additional $d$ in it, or append a generator to the T-word. All possible extensions of a T-word can be derived by combining these two operations.

Theorem 1. Let be a T-word with   up operations and   down operations whose realizations have start and end numbers described by (8) and (9). Let be the T-word which is obtained from by replacing the last (rightmost) generator by where .
Let the realizations of have start and end numbers Then the start periods and end periods of and are related by The start phase and end phase of are as follows.

Case  ( is odd). Consider

Case  ( is even). Consider

Proof. The operation “” cannot be performed on since this quantity is an odd integer. Therefore one has to go back to the integer from which had been obtained by the rescaling substitution . That this is always the final re-scale step is shown in Table 1. This table describes the logic of the re-scale procedure which has the aim to assure that appropriate expressions of the form are odd for any natural number .
Now, by the re-scale substitution , the expression will become . This expression evidently is even since is even and and both terms in the round bracket are odd. Performing the operation “” now yields . The final re-scale substitution to be performed depends on the parity of the last term. Again, two cases have to be considered.
is odd. Rescaling yields . Applying the same rescaling substitutions to , one finds . From these relations for and , one gets (17).
is even. Rescaling by yields the expression . Here it should be noted that, for sufficiently large , the last term, , may come out to be ≥. Therefore that term is rewritten as . Thus, in the present case, it holds where we have made use of the relation . Now, similar reasoning as in the previous case yields . From here, one gets (18).

Table 1: Logic of the re-scale procedure.

Case | | | Action | Result
1 | Even | Even | Divide by 2 | any case
3 | Odd | Even | | Case 4
4 | Odd | Odd | | Case 2

Note that the 3 equations (16), (17), and (18) can be used to verify Lemma 1.

The second extension of a T-word to be investigated is the concatenation of a generator, which is a true semigroup operation. In contrast, the extension considered in Theorem 1, i.e. appending an operation “$d$”, does not qualify as a semigroup operation since “$d$” is not an element of the semigroup.

Theorem 2. Let be a T-word with up operations and down operations with start and end numbers described by (8) and (9). Then the possible start and end numbers of , expressed by those of , are given by with andif   is odd: if is even:

Proof. All that is to be done is to subject the end number of to the group operation . Thus will be taken to . Here, the first term is even or odd depending on the parity of but can always be made even by a re-scale substitution. The second term may also be even or odd and needs to be made odd. Thus two cases have to be distinguished.
is odd. Rescaling gives and and thus yields (20) and (21).
is even. Rescaling gives and and thus yields (20) and (22).

Theorems 1 and 2 show that a T-word (1) has infinitely many realizations whose start and end numbers are given by with By iteration, (16) through (22) allow one to calculate the start and end numbers for the realizations of any given T-word under the Collatz rules. The iteration may be started with the simplest of all T-words, i.e. the generator whose realizations have start numbers and end numbers . For example, the formulas of Lemma 1 may be verified by use of Theorem 1.

Knowing the start numbers of a given T-word gives an unprecedented control over the integers in the corresponding Collatz sequence.

As an example, the T-word will now be considered. Here we have and .

The start number of the th realization, as obtained by the re-scale procedure, is given by where The end number is with

Figure 1 shows the Collatz sequences for , with all numbers (even and odd) pictured. The sequences end at step number 288 and from thereon the usual erratic behavior of the values appears.

Figure 2 depicts the entire Collatz series. But, here, only the odd numbers are shown. This is done to avoid unnecessary clutter in the picture.

Figure 3 shows, always for the same T-word, the odd values for the entire and sequences. The start number for the sequence is higher than the start number for the sequence.

In the next section, it will be shown that the start and end phases are nothing other than the start and end numbers of the first realization of the same T-word under the $3m-1$ rule. Thus it will become clear to what extent the $3m+1$ problem and the $3m-1$ problem are interlaced and how some special features of the latter influence the former.

The basic equations (23) and (24) give the start and end numbers of the realizations of a T-word described by (1). The start number must yield the Collatz sequence described by the T-word. If the first generator of the T-word, , is applied, the start number will be taken to the next odd integer by This relation shows that, along the T-word, the phase will evolve according to the $3m-1$ rule. It will follow the prescription of the T-word but with an operation “$u$” which now means the multiplication of an odd integer by the factor 3 and subsequent subtraction of 1.

Thus the Collatz sequence belonging to a realization of a T-word is mimicked by the $3m-1$ sequence whose start number is the start phase of the Collatz sequence.

From now on, integers which are obtained by the rule will be marked by an upper index “—” which is converted to a bar over the symbol. Equations (23) may now be rewritten in the more symmetric form And, in particular for and with and , we have The last two equations tell us that for any T-word, it holds The reason is that would imply , contradicting our assumption that is odd. Similarly, would imply , contradicting the fact that cannot divide .
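The interplay of the two rules can be observed numerically. The sketch below assumes a T-word given by its list of `d`-exponents, with `n` generators and `p` `d`-operations in total, and finds the smallest odd start number following it under `m -> 3m + 1` and under `m -> 3m - 1`. In every case tried, the two start numbers add up to `2**(p + 1)` and the two end numbers to `2 * 3**n`. That this is the content of the symmetric equations referred to above is an assumption, since the displayed formulas are missing in this copy.

```python
def end_number(a, exponents, sign):
    """Follow the T-word from the odd number a under m -> 3m + sign.

    Returns the odd end number, or None if a parity requirement fails.
    """
    m = a
    for p in exponents:
        m = 3 * m + sign          # operation u under the chosen rule
        for _ in range(p):
            if m % 2:
                return None
            m //= 2               # operation d
        if m % 2 == 0:
            return None
    return m


def first_realization(exponents, sign):
    """Smallest odd start number (with its end number) realizing the T-word."""
    a = 1
    while True:
        b = end_number(a, exponents, sign)
        if b is not None:
            return a, b
        a += 2


exps = [1, 2]
print(first_realization(exps, +1))  # (11, 13)
print(first_realization(exps, -1))  # (5, 5)
```

Here 11 + 5 = 16 and 13 + 5 = 18. The second result is the cycle 5, 14, 7, 20, 10, 5 of the $3m-1$ rule.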

Now, by reasoning perfectly analogous to that of Theorems 1 and 2, one may determine the realizations of a T-word under the $3m-1$ rule. This will result in the following.

Theorem 3. Under the rule, a T-word having down operations and up operations uniquely determines the odd integer such that, for any , is the start number of the th realization of where the integer is the start number of the first realization of .
Likewise, the end numbers of this realization are where is the end number of the first realization of .
Realizations of Simple T-Words under the $3m-1$ Rule

Example 3. Realizations of a generator under the ($3m-1$) rule.
Case where is even. From (4) and (34) one gets which simplifies to Equations (5) and (35) yield
Case where is odd. From (6), one gets which simplifies to while (7) and (35) yield

Example 4. Realizations of the T-word under the $3m-1$ rule.
From (11) and (34), one gets while (12) and (35) imply with and .
Note that, for , one gets , i.e. the trivial cycle.

Now, as shown by (30) to (35), there is perfect symmetry, on one hand, between the integers , which belong to the implementation of a T-word and, on the other hand, the integers , which belong to the implementation of the same T-word. If one eliminates from these equations and uses some elementary algebra, one finds the relations According to (40), the integer is independent of and so is the integer , as shown by (41). And all the integers occurring in these equations can be expressed in terms of . Now, the evaluation of gives the expression which can be rewritten as where with More explicitly, in terms of the parameters to and , the function is written as

The three equations (40), (41), and (42) can now be rewritten asNote the difference in the sign of between the two formulas. Also, since the two equations hold for all values of , this index may just be dropped.

This function will be called the link function of the T-word because, for any T-word, (i) it links the start number to the end number for any realization of under either rule, $3m+1$ or $3m-1$, and (ii) it links the integers derived from the $3m+1$ rule to their analogues derived from the $3m-1$ rule.
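A sum of products of powers of 2 and 3 with the stated properties can be written down explicitly. The sketch below uses the standard expression for a T-word with `d`-exponents `p1, ..., pn`: the i-th term is `3**(n - i)` times 2 raised to `p1 + ... + p(i-1)`. That this coincides with the paper's equation (47) is an assumption, as are the names `f`, `a`, `b`. Under this assumption `2**p * b == 3**n * a + f` holds under the $3m+1$ rule and `2**p * b == 3**n * a - f` under the $3m-1$ rule.

```python
def link_function(exponents):
    """Assumed explicit form of the link function of a T-word."""
    n = len(exponents)
    total, shift = 0, 0
    for i, p in enumerate(exponents):
        total += 3 ** (n - 1 - i) * 2 ** shift
        shift += p  # exponents of 2 accumulate; the last one is never used
    return total


def end_number(a, exponents, sign):
    """End number of the T-word started at a under m -> 3m + sign, or None."""
    m = a
    for p in exponents:
        m = 3 * m + sign
        for _ in range(p):
            if m % 2:
                return None
            m //= 2
        if m % 2 == 0:
            return None
    return m


exps = [1, 2]
n, p, f = len(exps), sum(exps), link_function(exps)
print(f)                                                # 5
print(2 ** p * end_number(11, exps, +1), 3 ** n * 11 + f)  # 104 104
print(2 ** p * end_number(5, exps, -1), 3 ** n * 5 - f)    # 40 40
```

The last exponent does not enter the sum, so the value does not change when further `d`-operations are appended at the end of the T-word.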

Equation (47) shows that the link function is a sum of terms each of which is the product of a power of and a power of . Terms of this kind have already attracted the interest of medieval mathematicians like Philippe de Vitry (cf. Carlebach [8]) who, with musical harmonies in mind, called them harmonic numbers.

Equation (44) provides criteria for the existence of a cycle. The condition for a cycle, namely that the start number equals the end number , immediately yields This condition shows that, for a cycle to exist under the ($3m+1$) rule, the integer must be positive and must divide .

On the other hand, for a cycle to exist under the ($3m-1$) rule, has to be satisfied. Since and are positive, must be negative and the positive quantity must divide .
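The cycle conditions can be turned into a small calculation. Setting the start number equal to the end number in the relation between them gives a start value of `f / (2**p - 3**n)` under the $3m+1$ rule and `f / (3**n - 2**p)` under the $3m-1$ rule. This is a sketch under the assumption that `f` has the explicit form used below; all names are illustrative.

```python
from fractions import Fraction


def link_function(exponents):
    """Assumed explicit form of the link function (sum of 3**i * 2**j terms)."""
    n = len(exponents)
    total, shift = 0, 0
    for i, p in enumerate(exponents):
        total += 3 ** (n - 1 - i) * 2 ** shift
        shift += p
    return total


def cycle_start(exponents, sign=+1):
    """Rational number that the T-word maps to itself under m -> 3m + sign."""
    n, p = len(exponents), sum(exponents)
    denominator = 2 ** p - 3 ** n if sign == +1 else 3 ** n - 2 ** p
    return Fraction(link_function(exponents), denominator)


print(cycle_start([2]))                          # 1    (trivial cycle 1, 4, 2, 1)
print(cycle_start([1, 2], sign=-1))              # 5    (cycle 5, 14, 7, 20, 10, 5)
print(cycle_start([1, 1, 1, 2, 1, 1, 4], -1))    # 17
print(cycle_start([3]))                          # 1/5  (a rational cycle)
```

A T-word yields an integer cycle exactly when the returned fraction is a positive integer; otherwise the cycle is rational or involves negative numbers.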

The question of cycles will be discussed in greater detail in Section 5.

Equations which are equivalent to (44) and to the cycle conditions (49) and (51) were first published by Böhm and Sontacchi [5] in 1978. These authors used negative integers under the $3m+1$ rule to describe what here is treated as positive integers under the $3m-1$ rule.

The traditional view considers the $3m-1$ problem as the application of the Collatz rule to negative numbers.

Both treatments are equivalent, as far as the calculation of the corresponding number sequences is concerned.

The traditional view seems to have the advantage that there is only the $3m+1$ rule which needs to be considered. However, it diverts attention from the fact that the $3m+1$ problem and the $3m-1$ problem, though related, are quite different from each other.

The equation given by Böhm and Sontacchi applies indeed to both cases, i.e. positive and negative integers as start numbers. But it does not bring to light the important role played by the link function . In fact, Böhm and Sontacchi did not pay attention to the properties of because the importance of this function was not visible in the context of the traditional view. To my knowledge, the properties of the link function have never been carefully investigated, nor the intriguing differences between the functions and .

In Section 1, the diophantine functions and have been introduced. They are defined on the semigroup and they express summary information about the T-word. They are now joined by the link function which expresses detailed information about the T-word. As these functions are playing a key role in the discussion of the and the problem, their properties will be investigated more closely.

Range of Values of , ,   : Number of generators in a T-word, .  : Number of down operations in a T-word, .  : Range is , is odd, and is not divided by .  : Link function, that yields a natural number which is odd and is not divided by . For the lower and upper value of the link function for given and , see (39) below.

Here we want to give the value of the link function for two T-words treated as examples in Sections 1 and 2 (remember that ).

Example of Section 1 (), T-word : One has , , , and .

Example of Section 2 (), T-word


,  ,



Composition Laws. For a T-word , obtained from two T-words and by concatenation, it holds While (53) and (54) are obvious, (55) is easily deduced from (50). Equation (56) is proved by writing the defining equation (47) and then dividing the sequence of terms into the two parts belonging to and .
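The composition laws can be checked on examples. The displayed equations are missing in this copy, so the forms tested below are assumptions: `n` and `p` are additive, the quantity `h = 2**p - 3**n` satisfies `h(t1 t2) = 2**p2 * h(t1) + 3**n1 * h(t2)`, and the link function satisfies `f(t1 t2) = 3**n2 * f(t1) + 2**p1 * f(t2)`. T-words are represented by their lists of `d`-exponents, so concatenation is list addition.

```python
def summary(exponents):
    """Return (n, p, h, f) for a T-word given by its list of d-exponents."""
    n, p = len(exponents), sum(exponents)
    f, shift = 0, 0
    for i, q in enumerate(exponents):
        f += 3 ** (n - 1 - i) * 2 ** shift
        shift += q
    return n, p, 2 ** p - 3 ** n, f


t1, t2 = [1, 3], [2, 1, 1]
n1, p1, h1, f1 = summary(t1)
n2, p2, h2, f2 = summary(t2)
n, p, h, f = summary(t1 + t2)   # concatenation of the two T-words

print(n == n1 + n2, p == p1 + p2)
print(h == 2 ** p2 * h1 + 3 ** n1 * h2)
print(f == 3 ** n2 * f1 + 2 ** p1 * f2)
```

Both nontrivial laws combine the two parts with one power of 2 and one power of 3 as weights, which is the similarity remarked upon in the text.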

Attention is drawn to the truly remarkable similarity of the composition laws for the two key functions relating to T-word.

Worth mentioning are these further properties of the functions defined on the semigroup. Consider The last equation is obtained directly from (47).

For another remark relating to the concatenation of T-words, see the end of this section.

Further Remarks on and . All of the quantities considered here, , , , and , are unique diophantine functions of a T-word .

Both functions and are of similar structure, both being finite sums of powers of the integers 2 and 3. Both are nonzero, and neither nor have 2 or 3 as a divisor. However, while can be either positive or negative, is always positive.

Further results about the function will be presented in Section 7.

Equation (56) is of basic importance to the T-word approach. Together with (58), it may be considered as a definition of . Indeed, for a given T-word (1), repeated application of (56) and (58) will result in (47).

Similarly, (55) together with the relation may be taken as a definition of .

Clearly, those quantities which only express summary information about the T-word, such as , , and , are maps of the semigroup into the set of natural numbers. For example, the T-words having the same number of generators all have the same function , and all T-words which are made up of the permutation of the generators of one given T-word have the same values of their functions , , and .

Multiple Link Function Values. As to the link function, (56) and (58) (or the definition (47)) show that does not depend on the number of trailing “” in the T-word.

Moreover, in some rare cases, the same link function value may belong to T-words which differ in the number of their generators, i.e. in their value of .

Table 2 shows the lowest multiple link function values for the three lowest types of multiplicity, double, triple, or quadruple.


19 5223

259 92503

4459 1348111

It appears that the pair (, ), i.e. the value of the link function and the number of generators, uniquely determines the T-word (compare Section 9, Conjecture 5).
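Multiple values of the link function can be searched for by enumeration. The sketch below assumes the explicit sum form of the link function and, since the value does not depend on the last exponent, represents each T-word by the exponents of all generators except the last. The function name and the search limit are illustrative.

```python
from collections import defaultdict


def link_values(limit):
    """Map each link function value <= limit to the exponent prefixes
    (p1, ..., p_{n-1}) that produce it, for all numbers n of generators."""
    table = defaultdict(list)

    def extend(n, i, shift, partial, prefix):
        if i == n:
            table[partial].append(tuple(prefix))
            return
        s = shift + 1  # exponents of 2 must strictly increase
        while partial + 3 ** (n - 1 - i) * 2 ** s <= limit:
            term = 3 ** (n - 1 - i) * 2 ** s
            extend(n, i + 1, s, partial + term, prefix + [s - shift])
            s += 1

    n = 1
    while 3 ** n - 2 ** n <= limit:  # smallest value for n generators
        extend(n, 1, 0, 3 ** (n - 1), [])
        n += 1
    return table


table = link_values(300)
multiple = sorted(value for value, words in table.items() if len(words) >= 2)
print(multiple[0], table[multiple[0]])  # 19 [(4,), (1, 1)]
print(table[259])                       # [(8,), (3, 2, 1), (1, 1, 1, 3)]
```

The value 19 arises for T-words with two and with three generators, and 259 for two, four, and five generators. Within this search, each value occurs at most once for a given number of generators.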

Upper and Lower Value of the Link Function for Given and . Consider the set of T-words with fixed and . The number of T-words in this set is given by We want to determine the lower and the upper bound of link function values for the T-words in .

The lowest value of will be obtained for a T-word in which the “free” -operations appear at the end (where they do not count). It has the form Using (11), (12), (44), (55), and (58) one finds The set of T-words with has already been considered (with regard to its realizations) in example 3 of Section 2.

The highest value of will be obtained for a T-word in which all “free” -operations are included in the first (leftmost) generator. This T-word has the form Note that it is obtained from by reversing the order in which the two main constituent T-words are written.

With (11), (12), (44), (55), and (58), one finds Thus, for the T-words , the values of the link function are found in the interval The basic results obtained so far are pictured in Figure 4. For a given T-word, it illustrates the symmetry in the realizations under the $3m+1$ and the $3m-1$ rule and shows the twofold role played by the link function .
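The count and the two bounds can be verified by brute force for small cases. The closed forms used below are derived here from the assumed sum form of the link function and are not quoted from the paper: there are `comb(p - 1, n - 1)` T-words with `n` generators and `p` `d`-operations, the smallest value is `3**n - 2**n`, and the largest is `3**(n-1) + 2**(p-n+1) * (3**(n-1) - 2**(n-1))`.

```python
from itertools import combinations
from math import comb


def link_function(exponents):
    """Assumed explicit form of the link function."""
    n = len(exponents)
    total, shift = 0, 0
    for i, q in enumerate(exponents):
        total += 3 ** (n - 1 - i) * 2 ** shift
        shift += q
    return total


def compositions(p, n):
    """All ways to write p as an ordered sum of n positive integers."""
    for cuts in combinations(range(1, p), n - 1):
        bounds = (0,) + cuts + (p,)
        yield [bounds[i + 1] - bounds[i] for i in range(n)]


n, p = 3, 5
values = [link_function(c) for c in compositions(p, n)]
print(len(values), comb(p - 1, n - 1))                                   # 6 6
print(min(values), 3 ** n - 2 ** n)                                      # 19 19
print(max(values),
      3 ** (n - 1) + 2 ** (p - n + 1) * (3 ** (n - 1) - 2 ** (n - 1)))  # 49 49
```

The minimum is attained when all free `d`-operations sit in the last generator, and the maximum when they all sit in the first one, as described in the text.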

Can be equal to or can be equal to ?

In view of (30) and (31) which can be rewritten as and , one may ask if the special case or exists. The answer is negative as seen from (31), (49), and (51).

For , they imply which contradicts the assumption that is an odd number.

Similarly, in the case , the above mentioned equations imply which contradicts the fact that cannot divide the end number of any Collatz sequence.

Concatenation of T-Words. Here we will briefly resume the issue of concatenating two T-words. How does one find the -realizations of the T-word in terms of those for and ?

Consider two T-words and which are concatenated to form the T-word . The T-words , , and come with their respective start numbers , , , and with end numbers , , and  .

The most direct (and perhaps simplest) way to determine and would be to apply the re-scale procedure to . This would be the most convenient way to get a numerical result. However, for the purpose of general investigations, it would be necessary to have general formulae.

Clearly, the Collatz sequences belonging to the two T-words can only be joined together if the end number of equals the start number of . This can be achieved by choosing suitable realizations for and for .

We denote the start numbers of , , and by , , and , and similarly we denote the end numbers by , , and .

The condition for joining the two T-words and is then Using (23) and (24), (65) can be rewritten as Rearranging terms, we get the concatenation condition This is one inhomogeneous linear diophantine equation for the two unknowns and . Since the coefficients and are coprime, it has infinitely many solutions.

A particular solution , can be obtained by one of the standard methods used for solving this type of diophantine equations. As is well known, it cannot be given in closed form.

The general solution, given by the sum of a particular solution and the solution to the homogeneous equation, is The scale of has to be fixed in such a way that and take their smallest allowed values for . Once and are known, the start and end number of can be written as a function of those of and , as shown below.

In fact, from and , one gets The last two equations show how, under the rule, the start number and the end number of the T-word are obtained from those of the T-words and .

For a T-word under the $3m-1$ rule, analogous results are obtained without further calculation. This simplicity is owed to the symmetry of the key equation (30) under an exchange of the start numbers and and (31) under an exchange of the end numbers and , if both equations are taken for .

Under the $3m-1$ rule, the concatenation condition will take the form Here, and mean the representation in which and enter the T-word .

In analogy to (68), the general solution to (70) is written as Again, the scale of has to be fixed in such a way that and take their smallest allowed values for .

From (69), we obtain, by placing a bar above the concerned quantities, The 4 representation parameters , , , and have to be integers greater than zero. Moreover, they are not independent of each other. In fact, one shows that for , they are related by the conditions We finally show another way to prove the important composition formula (56).

Write relation (48a) for the T-word and for the two T-words and , keeping in mind that , , and .

This yields the three equations Here we are using a simplified notation which hopefully renders some formulas less cumbersome.

Using the last two equations to eliminate the terms and in the first equation, one finds (56).

Hopefully the relations of the preceding subsection on concatenation will be useful in a future treatment of the Collatz problem; see Section 9, Conjecture 1.

Before continuing, we summarize the essentials of the preceding sections, both the concepts and the generic notation. The notion of T-words was introduced to spell out what happens to a sequence of integers either under the Collatz () rule or under the () rule. The sequence has start and end numbers which are both odd. The T-words form a semigroup with countably infinitely many elements and countably infinitely many generators. On are defined the three functions , , containing summary information, and the link function which contains almost complete information about the T-word.

Under the rule, a T-word determines, for , a set of odd start numbers and odd end numbers . Under the rule, a T-word determines a set of odd start numbers and odd end numbers . Start numbers are linked to end numbers by . Also, is linked to and is linked to by .

From the basic equation (56) for the link function, a number of useful and by no means obvious relations can be deduced. The results underline the importance of the functions and .

First we mention the useful commutator relation where and so forth.

Next we want to know the link function of the -fold concatenation of a T-word.

Lemma 2. Let be a T-word and let denote the -fold concatenation of . Then where .

Proof. With , (56) yields This proves the lemma for . Supposing now that it holds true for , then application of (56) to shows that it holds for .

An immediate consequence of (76) and thus a corollary to Lemma 2 is as follows.

Lemma 3. For any , the positive rational functions, defined by satisfy

If   () is an integer,    belongs to a cycle. Therefore,    will be called the cycle function.

For , the cycle function will follow the T-word under the rule. For , follows the T-word under the rule. If    is rational, both numerator and denominator are odd integers and    will be the rational start number of a rational cycle.

The cycle function also plays an important role in criteria for the direction (up or down) a sequence takes; see Section 6.

Cycles are treated in more detail in Section 5.

There is another way of looking at Lemmas 2 and 3.

Lemma 4. Let be a T-word and let denote the T-word obtained from by moving the first generator to the end of . Then it holds or, equivalently,

Proof. The described cyclic one-step permutation changes to , where (with , , ) and (with , , ). Evaluating and by (56) yields and . Elimination of from these two equations gives (81).

Remark. Despite the denominator in (81), the quantity is an integer. Indeed, inserting from (47) shows that the right-hand side of (81) contains a factor .
Lemma 4 means that for , a one-step cyclic permutation of the generators of causes the cycle function (or ) to change according to the Collatz rule, and for , it changes according to the rule.
In order to apply the Collatz rules to rational numbers with odd denominator, one has to adopt the following terminology. The fraction is called even (odd) if the numerator is even (odd). Since the denominator (or ) is odd, the (or ) rule can be applied to the rational number (or ).
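This terminology can be made concrete with a short Python sketch using exact rational arithmetic. The function names and the parameter `c` (the additive constant of the odd step, with `c = 1` giving the Collatz rule) are assumptions introduced here for illustration only.

```python
from fractions import Fraction


def rational_step(x, c=1):
    """One step of the rule on a rational with odd denominator.

    The fraction counts as even (odd) if its numerator, in lowest
    terms, is even (odd). `c` is a hypothetical parameter for the
    additive constant of the odd step.
    """
    x = Fraction(x)
    if x.denominator % 2 == 0:
        raise ValueError("the denominator must be odd")
    if x.numerator % 2 == 0:
        return x / 2
    return 3 * x + c


def orbit(x, steps, c=1):
    """Return the list [x, f(x), f(f(x)), ...] with `steps` applications."""
    values = [Fraction(x)]
    for _ in range(steps):
        values.append(rational_step(values[-1], c))
    return values


# 1/5 -> 8/5 -> 4/5 -> 2/5 -> 1/5: a rational cycle under the Collatz rule.
print(orbit(Fraction(1, 5), 4))
```

Halving never changes the denominator and the odd step keeps it odd, so the rule remains applicable along the whole orbit.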
Rational cycles under the rule have been investigated by Lagarias [9].
Summing up the preceding remarks, we note the following.
(i) Every T-word with gives rise to a rational cycle whose (rational) start number is (=). This rational cycle is generated by the rule.
(ii) Every T-word with , i.e. , gives rise to a rational cycle whose (rational) start number is (=). This rational cycle is generated by the rule.
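How a word determines a rational start number can be sketched in Python under a simplifying assumption: a word is encoded here as a list of halving exponents, one per odd step, which is an illustrative encoding and not necessarily the one used for T-words. The start number is then the fixed point of the composite affine map. The parameter `c` and all function names are hypothetical, and the sign of the result follows from this generic computation rather than from the conventions of the text.

```python
from fractions import Fraction


def odd_step(x, k, c=1):
    """Apply one odd step 3x + c followed by k halvings."""
    return (3 * Fraction(x) + c) / 2**k


def start_number(exponents, c=1):
    """Rational fixed point of the composite map encoded by `exponents`."""
    a, b = Fraction(1), Fraction(0)
    for k in exponents:
        # Compose x -> (3x + c) / 2**k after the map a*x + b built so far.
        a, b = 3 * a / 2**k, (3 * b + c) / 2**k
    # Solve x = a*x + b; a = 3**n / 2**K is never 1 for n >= 1.
    return b / (1 - a)


def rotate(exponents):
    """Move the first entry of the word to the end."""
    return exponents[1:] + exponents[:1]


print(start_number([2]))       # 1, the trivial cycle 1 -> 4 -> 2 -> 1
print(start_number([3]))       # 1/5
word = [1, 2]
x = start_number(word)         # -5
# A one-step cyclic permutation of the word gives the next odd cycle element.
print(start_number(rotate(word)) == odd_step(x, word[0]))  # True
```

The last line illustrates, for this one example, the statement that a one-step cyclic permutation changes the start number according to the rule itself.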
Inverse T-Words. In the Introduction it has already been mentioned that the semigroup can be made a full group by adding to it the inverses of all elements. Note that the inverse of any T-word is uniquely determined. We denote the inverse of a T-word by . It is defined by . For instance, the inverse of a semigroup generator , corresponding to the operation , is , corresponding to the operation .
The relevant functions of are all obtained from and the composition laws (55), (56), and (57). One gets Now, applying (56) to , one shows that and, similarly,

Remark 4. As an example of the usefulness of these results, an alternative proof of Lemma 4 is given. In the notation of Lemma 4, write which is equivalent to . Evaluating both sides with (56) yields and thus . This gives (81).
Another interesting result is obtained from (85) and (86).

Lemma 5. For any T-word and , one has

Proof. By straightforward calculation using (85) and (86), one finds

5. Cycles

First a remark on the terminology to be used: the cycle conditions have already been given by (49) to (52). Though slightly different for the two cases and , they suggest a simple classification which is applicable to both cases.

A trivial cycle (TC) is characterized by and or , a simple cycle (SC) by and or , and an amazing cycle (AC) by and or .

Usually, the types SC and AC are called nontrivial cycles. However, a distinction between SC and AC is necessary since the situation is simple as long as the denominator (or