Journal of Probability and Statistics

Research Article | Open Access

Volume 2019 | Article ID 6814378 | 11 pages

On the Probabilistic Proof of the Convergence of the Collatz Conjecture

Academic Editor: Alessandro De Gregorio
Received 20 Jan 2019
Accepted 27 May 2019
Published 01 Aug 2019

Abstract

A new approach towards a probabilistic proof of the convergence of the Collatz conjecture is described via identifying a sequential correlation of even natural numbers by divisions by 2 that follows a recurrent pattern of the form $x, 1, x, 1, \ldots$, where $x$ represents division by 2 more than once. The sequence presents a probability of 50:50 of division by 2 more than once as opposed to division by 2 once over the even natural numbers. The sequence also gives the same 50:50 probability for consecutive Collatz even elements, with a ratio of about 3:1 when the divisions by 2 more than once are counted against the divisions by 2 once. Considering the Collatz function as producing random numbers, over a sufficient number of iterations this probability distribution produces numbers in descending order that lead to the convergence of the Collatz function to 1, assuming that the only cycle of the function is 1-4-2-1.

1. Introduction

The Collatz conjecture concerns natural numbers treated as $3n+1$ ($n \in \mathbb{N}$) of positive even integers. It is defined by the function

$$f(n) = \begin{cases} n/2, & n \equiv 0 \pmod 2, \\ 3n+1, & n \equiv 1 \pmod 2. \end{cases}$$

It simply asks you to keep dividing any positive even integer repeatedly by 2 until it becomes an odd integer, then convert it to an even integer by tripling it and adding 1, and then repeat the process. The conjecture has been widely studied [1, 2]. It predicts that the recurring process will always form a sequence that descends on the natural numbers to the trivial cycle 1-4-2-1. The question is why, under any complete process and over a statistically sufficient number of iterations, the decrease made by the divisions by 2 always exceeds the increase made by the conversions from odd to even. We note here that, from an odd positive start integer, one iteration either increases the number, when the resulting even number is divided by 2 only once to obtain an odd number, or decreases it, when the resulting even number is divided by 2 more than once. Therefore, we seek to quantify probabilistically the decrease and the increase of the start number after every iteration and to generalize that over a sufficient number of iterations to check the convergence of the function. It is claimed here that the function decreases the start number until it reaches a cycle, because statistically the sequence of all consecutive even elements of the Collatz function over the natural numbers (validated by deducting 1 from every even positive integer and then checking divisibility by 3) has a recurrent pattern $x, 1, x, 1, \ldots$ of division by 2 more than once compared to division by 2 once, with probability 50:50 and a ratio of about 3:1, where $x$ is division by 2 more than once.
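To make the process described above concrete, the following is a minimal sketch of the standard Collatz map and its trajectory. The names `collatz_step` and `collatz_trajectory`, and the `max_steps` safety cap, are illustrative choices and do not come from the paper.

```python
def collatz_step(n: int) -> int:
    """Apply the Collatz function once to a positive integer."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Even numbers are halved; odd numbers are tripled and incremented.
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_trajectory(n: int, max_steps: int = 10_000) -> list[int]:
    """Iterate from n until 1 is reached or max_steps is exhausted.

    max_steps is only a safety cap for this sketch (an assumption).
    """
    path = [n]
    while path[-1] != 1 and len(path) <= max_steps:
        path.append(collatz_step(path[-1]))
    return path


if __name__ == "__main__":
    print(collatz_trajectory(7))
```

The trajectory of 7 rises as high as 52 before it reaches 16 and halves down to 1, which illustrates the locally irregular but eventually descending behavior discussed in the text.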

The Collatz function seems to produce random numbers and to generate a random walk locally, while globally it converges to 1. Therefore, to prove the convergence of the conjecture probabilistically, it is sufficient to show that globally the recurrence of divisions of Collatz even elements by 2 more than once to reach an odd number has the same probability as that of their divisions by 2 once, denoted here as recurrent frequency (RF), and that the two average to a ratio of about 3:1. Summing over the respective divisions then always leads by a margin that offsets the increase made by the recursive conversion of the odd Collatz number to an even number by tripling it and adding 1. This is easily noticeable if we recognize that, when the positive even integers are sequenced by increases of 2, e.g., $2, 4, 6, 8, \ldots$, division by 2 over the positive even integers follows a sequential order: if any even element of the sequence produces an odd number when divided by 2 once, the following element in the sequence must produce an odd number by division by 2 more than once. This hidden regularity produces a 50:50 probabilistic RF of division by 2 over the positive even integers and turns what seems a random distribution of division by 2 into a global process in which the events of division by 2 progress over the whole of the positive even integers according to the sequence $x, 1, x, 1, \ldots$, where $x$ is the number of divisions of the even number by 2, more than once, needed to produce an odd number. Here we prove that Collatz even numbers also follow the same 50:50 probability distribution, which leads to the descending convergence of the sequence made by the function to a cycle.

The proposed proof of the Collatz conjecture is complete if its process only cycles about 1, 4, and 2, since the decrease of the sequence of the global Collatz process is assembled from perfectly correlated probabilistic events defined by the sequence $x, 1, x, 1, \ldots$ over the function’s even elements. This probabilistic correlation is not heuristically derived, as opposed to the well-known heuristic argument found in many references [3–5], which states that the function averages division by 2 once $1/2$ of the time, division by 2 twice $1/4$ of the time, division by 2 three times $1/8$ of the time, etc., which takes each odd number to about $3/4$ of the preceding one per iteration on average. In this paper it is claimed that the function produces an increase of the odd start number of 50% half of the time, as opposed to a decrease of the odd start number of 62% the other half of the time, averaged over a sample of a sufficiently large number of Collatz even integers, if we assume that the function’s even integers are truly picked at random in the process.
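For reference, the well-known heuristic mentioned above can be checked numerically. The sketch below assumes the standard model in which an odd-to-odd step divides by $2^k$ with probability $2^{-k}$ and multiplies the number by roughly $3/2^k$. It computes the geometric mean of that factor, which is the usual source of the value $3/4$. The function name and the truncation at `max_k` are illustrative assumptions.

```python
import math


def heuristic_geometric_mean(max_k: int = 60) -> float:
    """Geometric mean of the per-step factor 3 / 2**k under P(k) = 2**-k.

    The infinite sum is truncated at max_k (an assumption of this sketch);
    the neglected tail is far below floating-point precision.
    """
    log_mean = sum(
        2.0 ** -k * math.log(3.0 / 2.0 ** k) for k in range(1, max_k + 1)
    )
    return math.exp(log_mean)


if __name__ == "__main__":
    print(heuristic_geometric_mean())  # close to 0.75
```

This only illustrates the classical heuristic that the paper contrasts itself with; it is not part of the paper's own argument.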

2. Division by 2 Sequence of Positive Even Integers

For comparison and to easily identify the RF sequence of division by 2 for Collatz function elements, we first generate the RF sequence of positive even integers.

Lemma 1. Let $n$ be any positive even integer that can be divided by 2 only once to yield an odd positive integer; then the next even integer $n+2$ must be divided by 2 more than once to yield an odd positive integer.

Proof. By initial definition, $n/2 \equiv 1 \pmod 2$. Adding $n/2$ and $(n+2)/2$ yields $n+1$, an odd number. Since $n/2$ is odd, this necessitates that $(n+2)/2 \equiv 0 \pmod 2$, and the term $n+2$ is divisible by 2 more than once.

Lemma 2. Let $n$ be any positive even integer that can be divided by 2 only once to yield an odd positive integer; then the second-next even integer $n+4$ must be divided by 2 only once to yield an odd positive integer.

Proof. By initial definition, $n/2 \equiv 1 \pmod 2$. Adding $n/2$ and $(n+4)/2$ yields $n+2$, an even number. Since $n/2$ is odd, this necessitates that $(n+4)/2 \equiv 1 \pmod 2$, and the term $n+4$ is divisible by 2 only once to obtain an odd number.
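Lemmas 1 and 2 can be spot-checked numerically. The sketch below counts how many times an even integer can be halved before it becomes odd and verifies the alternation over a finite range. The helper name `divisions_by_two` and the range bound are illustrative assumptions, and a finite check does not replace the proofs.

```python
def divisions_by_two(n: int) -> int:
    """Number of times a positive integer can be divided by 2 before it is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def check_lemmas(limit: int = 10_000) -> bool:
    """Check Lemmas 1 and 2 for every even n <= limit that halves exactly once."""
    for n in range(2, limit + 1, 2):
        if divisions_by_two(n) == 1:
            if divisions_by_two(n + 2) <= 1:   # Lemma 1 would fail
                return False
            if divisions_by_two(n + 4) != 1:   # Lemma 2 would fail
                return False
    return True


if __name__ == "__main__":
    print([divisions_by_two(n) for n in range(2, 22, 2)])
    print(check_lemmas())
```

The printed frequencies for 2, 4, ..., 20 show the pattern of a single division alternating with more than one division.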

From Lemmas 1 and 2, we generate a table of positive even integers and their corresponding frequencies of division by 2 until reaching odd parity. Starting with the first row as the even integers made by the term $2^s$, with elements as the frequencies of division by 2, and spanning the natural numbers, we can construct an “RF table” over all positive integers that identifies Collatz elements, with the backbone being the line of integers that collapse to 1 by repeated divisions by 2, made by the even numbers $2^s$, as the Collatz function requires. This row makes a symmetrical line that contains all even numbers made by the Collatz function that collapse to the trivial cycle 1-4-2-1, e.g., 4, 8, 16, 32. We then construct columns in ascending order, increasing by 2, to produce all even positive integers, with each column ending at an even number that is two less than the next integer on the collapsing symmetrical line. We observe that the symmetrical line in the table has symmetrical sequential frequencies for all of the columns to infinity along the rows, and that the rows have equal frequencies because of the ordered repeated frequencies in each column. This allows us to estimate relative RFs, a key probability distribution from which we conclude that the Collatz conjecture converges probabilistically to a cycle. The table is constructed in this order mainly to be able to count frequencies of divisions by 2 and approximate the relative RFs of even positive integers to yield an odd number. It follows that consecutive even elements of the Collatz function (in italic) also follow the same pattern as the full sequence of even integers in the table. We also construct the table with the variable $s$ spanning all positive integers in $2^s$ on the symmetrical line, not just the even Collatz function elements that contribute to the collapse process, to produce a line of all powers of two.

Starting with any natural number, the Collatz function produces numbers in a seemingly random way locally, but globally the numbers decrease and the process proceeds toward the collapsing symmetrical line and to the left on the table. It eventually hits a number on the symmetrical line and then collapses to 1 and cycles around 1-4-2-1 in a deterministic process.

The symmetrical distribution of frequencies of divisions by 2 of even natural numbers, as in the table, exhibits a classical probability distribution about the collapsing symmetrical line over the natural numbers. Only those numbers on the symmetrical line that satisfy the Collatz function (those numbers $2^s$ with $s$ an even integer) can branch out and contribute to the collapse process to 1. The branches are connected via the function to odd start numbers that make new subbranches on the Collatz tree (see Figure 1), so that if a branch is reached, the process collapses to its start odd number. For example, on the trunk of the tree, the number $2^8$ (256) contributes to the collapse process because you can deduct 1 from it and divide by 3 to get a whole number, but the number $2^9$ (512) does not; the number 341 leads to $2^{10}$ (1024) on the symmetrical line, which collapses to 1, while the odd number 357913941 ends with $2^{30}$ (1073741824) on the symmetrical line as well. Those numbers on the symmetrical line that can be traced backward by the function act as points for branching out to trace the Collatz tree, where the symmetrical line is the tree’s trunk.
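The branching condition on the trunk can be verified directly. This short sketch lists the exponents $s$ for which $2^s - 1$ is divisible by 3 and confirms the two numerical examples given above. The function name `branching_exponents` is a hypothetical helper introduced only for illustration.

```python
def branching_exponents(max_s: int) -> list[int]:
    """Exponents s <= max_s for which (2**s - 1) is divisible by 3.

    These are the powers of two on the trunk that can be reached from an
    integer n via 3n + 1.
    """
    return [s for s in range(1, max_s + 1) if (2 ** s - 1) % 3 == 0]


if __name__ == "__main__":
    print(branching_exponents(12))          # only even exponents appear
    print(3 * 341 + 1 == 2 ** 10)           # 341 leads to 1024
    print(3 * 357913941 + 1 == 2 ** 30)     # 357913941 leads to 2**30
```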

3. Perfect Symmetry of the Table of Positive Even Integers with Pivotal Frequencies

Looking for hidden symmetry in the background of even integers in terms of RFs is of prime importance for assigning symmetry to the RFs of the function and determining their probability ratios. That is because the Collatz function’s elements occur sequentially at every third element in the table of positive even integers, as represented by Table 2. Quick observation of the table of positive even integers reveals that each column is exactly symmetrical about a new pivotal frequency that is one greater than the pivotal frequency of the preceding column and equals $s-1$ for its own column $2^s$. For example, the column $2^5$ (32) has double the frequencies of the preceding column $2^4$, arranged about the pivotal RF of 4, which is greater by 1 than the pivotal RF of 3 of the preceding column and equal to $s-1$, where $s$ is 5. Those pivotal RFs increase to infinity as the numbers of the columns increase along the symmetrical line of $2^s$.
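The column symmetry described above can be illustrated with a short sketch. Since Table 1 is not reproduced here, the layout is an assumption: the column under $2^s$ is taken to run over the even integers from $2^s + 2$ to $2^{s+1} - 2$. The helper names are illustrative.

```python
def divisions_by_two(n: int) -> int:
    """Number of times a positive integer can be divided by 2 before it is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def column_frequencies(s: int) -> list[int]:
    """RFs of the even integers strictly between 2**s and 2**(s+1).

    Assumed column layout: 2**s + 2, 2**s + 4, ..., 2**(s+1) - 2.
    """
    return [divisions_by_two(n) for n in range(2 ** s + 2, 2 ** (s + 1), 2)]


if __name__ == "__main__":
    for s in (3, 4, 5):
        freqs = column_frequencies(s)
        pivot = freqs[len(freqs) // 2]
        print(s, freqs, "pivot:", pivot, "symmetric:", freqs == freqs[::-1])
```

Under this layout, each column reads the same forward and backward, and its middle entry is the pivotal frequency $s-1$.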

4. Perfect Symmetry of the Table of Even Elements

The perfectly symmetrical RFs of division by 2 of the even terms of the function allow us to determine their probability ratios. We construct Table 3 the same way we construct Table 1, by building columns in ascending order of consecutive elements about the collapsing symmetrical line that is made of $2^s$ elements. Similar to the columns of the even integers in Table 1, the columns of even elements of $3n+1$ in Table 3 are symmetrically distributed, since each row in the table carries the same frequency for all of the columns and each column begins with the same RF sequence as the preceding column. This symmetry is important since it spans the positive integers to infinity and allows taking the average of the RFs of the elements of the function that divide by 2 more than once as a measure of the behavior of the function’s complete process, which leads to a descending trajectory. The ratio of about 3:1 of divisions by 2 more than once compared to divisions by 2 once was computed from the table with counts varying by length and location, between 100 and 3000 consecutive elements and up to the $2^{36}$ element. While the RF ratio of about 3:1 seems to be consistent to infinity, as shown by Table 3, this work still needs a more elaborate approach to identify a proper distribution that yields this ratio, in order to prove the consistency to infinity of the RF symmetry of Table 3 and reach a full proof of the convergence of the Collatz conjecture. Disregarding the exact shape of the proper distribution, here we use the symmetry of the distribution of the even elements about the symmetrical line to estimate the RF ratio of 3:1.

5. Probability Distribution of Even Natural Numbers in Terms of Division by 2

The ordered distribution of even natural numbers in terms of division by 2 about the symmetrical line represents a classical probability distribution.

Lemma 3. The probabilities of division by 2 more than once and of division by 2 once for a randomly chosen positive even number are each $1/2$.

Proof. It follows from the ordered distribution of division by 2 once followed by division by 2 more than once, by Lemma 1 and Lemma 2. The probability is easy to check in Table 1.

6. Probability Distribution of Collatz Function’s Even Elements

Table 3 represents the 50:50 probability of the Collatz function’s even elements in terms of their RFs of division by 2 more than once as opposed to division by 2 once.

Lemma 4. Collatz function even elements occur at every third position on the sequence of the even nonnegative integers, and the probability of division by 2 more than once, as opposed to division by 2 once, to obtain an odd number for a randomly chosen Collatz even element is $1/2$.

Proof. Let $n$ be any Collatz even element, i.e., one that can be divided by 3 after subtracting 1; then the next even integer $n+2$ is not a Collatz even element since it does not satisfy that restriction, and neither is $n+4$, but $n+6$ is, since a Collatz element $n+k$ is restricted by
$$(n+k) - 1 \equiv 0 \pmod 3,$$
which, for even $k$, holds only if the variable $k$ is a multiple of 6. This leads to the fourth consecutive integer, counting from $n$, on the table of nonnegative even integers.
Further, let $n$ be any Collatz even element that is divisible by 2 only once; the following set of logical equations describes the RFs of all Collatz even elements by obtaining their parity.
First, by initial definition, $n/2 \equiv 1 \pmod 2$. Adding $n/2$ and $(n+6)/2$ yields $n+3$, an odd number. This necessitates that $(n+6)/2 \equiv 0 \pmod 2$, so the term $(n+6)/2$ is an even number and $n+6$ is therefore divisible by 2 more than once to obtain an odd number.
Second, by initial definition, $n/2 \equiv 1 \pmod 2$. Adding $n/2$ and $(n+12)/2$ yields $n+6$, an even number. This necessitates that $(n+12)/2 \equiv 1 \pmod 2$ and that the term $n+12$ is divisible by 2 only once to obtain an odd number.

This is shown by quick inspection of the sequence of the even natural numbers, subtracting 1 and then dividing by 3 (italic face in Table 1).
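The inspection described above can also be done programmatically. The sketch below generates the Collatz even elements as $3m+1$ for odd $m$ (an assumed but natural enumeration: 4, 10, 16, ...), checks that consecutive elements are 6 apart, and checks that single and multiple divisions by 2 alternate. All function names are illustrative.

```python
def divisions_by_two(n: int) -> int:
    """Number of times a positive integer can be divided by 2 before it is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def collatz_even_elements(count: int) -> list[int]:
    """First `count` even numbers of the form 3m + 1 with m odd: 4, 10, 16, ..."""
    return [3 * m + 1 for m in range(1, 2 * count, 2)]


def alternates(freqs: list[int]) -> bool:
    """True if entries equal to 1 and entries greater than 1 strictly alternate."""
    return all((a == 1) != (b == 1) for a, b in zip(freqs, freqs[1:]))


if __name__ == "__main__":
    elements = collatz_even_elements(8)
    freqs = [divisions_by_two(e) for e in elements]
    print(elements)
    print(freqs)
    print(alternates(freqs))
```

Every third even integer is a Collatz even element, and the frequencies alternate between one division and more than one division, in line with Lemma 4.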

Note. Lemma 4 can be generalized to any generalized Collatz function of the form $an+1$, where $a$ is an odd number, and the corresponding sequence on the even nonnegative integers can be derived accordingly.

7. Probability Ratios of RFs of Collatz Even Elements

Since RFs are ordered perfectly among all Collatz even elements, as Table 3 indicates, any sufficiently large sample is a true representation from which to compute the RF ratio of the Collatz function’s elements.

Lemma 5. The sum of divisions by 2 more than once is on average 2.97 times (about 3 times) the sum of divisions by 2 once over the Collatz even elements over the first 1500 counts.

Proof. Inspection of Table 3 verifies the prediction.
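Because Table 3 is not reproduced here, the following sketch recomputes the RF ratio under an assumed counting convention: the Collatz even elements are taken in order as 4, 10, 16, ... (that is, $6j+4$), and the ratio is the sum of the division counts greater than 1 divided by the sum of the division counts equal to 1. The exact figure depends on where the count starts, so it may differ slightly from the 2.97 quoted in Lemma 5.

```python
def divisions_by_two(n: int) -> int:
    """Number of times a positive integer can be divided by 2 before it is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def rf_ratio(count: int) -> float:
    """Summed divisions (> 1) over summed divisions (== 1) for the first
    `count` Collatz even elements 4, 10, 16, ... (assumed ordering)."""
    freqs = [divisions_by_two(6 * j + 4) for j in range(count)]
    more_than_once = sum(f for f in freqs if f > 1)
    once = sum(f for f in freqs if f == 1)
    return more_than_once / once


if __name__ == "__main__":
    for count in (100, 500, 1500, 3000):
        print(count, round(rf_ratio(count), 3))
```

For the sample sizes shown, the ratio stays close to 3, which is consistent with the ratio of about 3:1 used in the text.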

Theory 1. The Collatz function process must produce a descending order of numbers over an adequate number of iterations with a 3:1 RF ratio of division by 2 of even elements.

Proof. Considering the apparently random distribution of even integers produced by Collatz function processes, the probabilities of division by 2 more than once and of division by 2 once of Collatz elements are each $1/2$. Therefore we can average Collatz processes as two distinct operations on an odd start number that each end in an odd number. The first operation is to increment the start number by tripling it and adding 1 and then dividing by 2 once, which increases the start number. The other operation is to increment the start number in the same way but divide by 2 three times, which decreases the start number by a larger magnitude than the increase, in line with the Collatz conjecture. To see this, let $n$ be the odd start number. Then applying the function and dividing by 2 once gives the end number
$$\frac{3n+1}{2} = 1.5n + 0.5.$$
For large enough numbers, this equation gives an increase of about 50% of the start number. If we divide by $2^3$ instead, the process gives
$$\frac{3n+1}{2^3} = 0.375n + 0.125.$$
This equation gives a decrease of about 62% of the start number, which is larger than the increase and leads to a successive average decrease of the start number.
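The two operations in the proof can be evaluated exactly for any odd start number. The sketch below uses exact rational arithmetic to compute the relative change after tripling, adding 1, and dividing by 2 a given number of times, as the averaged model above does. The function name `relative_change` is a hypothetical helper, and 9999 is used as a sample start number.

```python
from fractions import Fraction


def relative_change(n: int, divisions: int) -> Fraction:
    """Relative change of n after computing (3n + 1) / 2**divisions.

    The result may be fractional, matching the averaged model in the text
    rather than an actual Collatz step.
    """
    end = Fraction(3 * n + 1, 2 ** divisions)
    return (end - n) / n


if __name__ == "__main__":
    n = 9999
    print(float(relative_change(n, 1)))  # about +0.50
    print(float(relative_change(n, 3)))  # about -0.625
```

For large $n$ the first value approaches $+50\%$ and the second approaches $-62.5\%$, matching the coefficients 1.5 and 0.375.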

The critical ratio at which the reduction of the start number equals its increase is about 2.57:1.

The Collatz function then produces steps that end with even numbers that zigzag up and down, but in a descending manner, until the process eventually reaches an odd number whose ascending step lands on the symmetrical line and collapses to the ultimate odd number 1, assuming the cycle 1-4-2-1 is the only cycle in the process.

For low counts of even elements, Table 4 shows selected ratios of the RFs of division by 2 more than once compared to division by 2 once. The ratios reveal that, for counts as low as 100 elements that may span a Collatz process, the process exhibits decreasing trajectories. Also, the ratio of the first 129 RFs of the column under $2^{24}$ on the symmetrical line in Table 3 is found to be 2.92, which yields a decreasing trajectory since all of the ratios are above the critical ratio.

Table 4: Sample space | Ratio

For very low counts of a complete process, the process may quickly end and collapse to 1 by hitting the symmetrical line on an increasing trajectory; examples are the start odd numbers 3 and 5.

Example 6 (start odd number is 3). For a 50:50 probability of division by 2 more than once as opposed to division by 2 once, with ratio 3:1 and a repeating pattern, applying the function gives 10. Division by 2 once gives 5. If we divide by 2 three times, the iteration gives 1.25. The actual process divides only once, which is an increase of the start number. Fortunately, upon repeating the process with the number 5, the process hits the symmetrical line (at 16) and collapses to 1. Starting with the number 7 and up on the number line, in general, the collective decrease of the start number is larger than the collective increase, except for those start numbers that hit the symmetrical line before the decrease occurs, e.g., start numbers that make the main branches on the Collatz tree (see Figure 1) such as 5, 21, 85, 5461, etc., in line with Theory 1.

Example 7 (start odd number is 9999). Incrementing it by the Collatz function and dividing by 2 once yields 14999. This is an increase of the number of 50%. Repeating the process but dividing by 2 three times, you get 3749.75. That is a decrease of the start number of about 62%. The percentage decrease is larger than the percentage increase of the start number, in line with the Collatz conjecture.

8. Comparison with Generalized Collatz Functions

Many generalized Collatz functions are discussed in the literature [4–8]. Generalized Collatz functions such as $5n+1$ and $7n+1$ do have probability distributions of their even elements in terms of division by 2; e.g., inspection of Table 1 reveals that the function $5n+1$ has even elements at every fifth position over the sequence of even integers, with a 50:50 ratio of division by 2 more than once as opposed to once. The same goes for the function $7n+1$, with a spacing of seven consecutive even integers. Therefore, to check for the divergence of those functions, we must compute the relative frequencies of the divisions by 2 that contribute to the rise of the start number as opposed to those that contribute to its descent. We note here that, unlike for the function $3n+1$, division by 2 contributes differently to the increase or decrease of the start number for other generalized functions. Besides the fact that you multiply the start number by 5 instead of 3, division by 2 once as well as twice with the function $5n+1$ contributes to the rise of the start number, and the process then produces an equation whose coefficient of $n$ is much larger than that of the function $3n+1$, leading to the divergence of the function.
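The comparison above can be sketched numerically. The code assumes the generalized function under discussion is $5n+1$ (consistent with the multiplication by 5 mentioned in the text), lists its even elements for odd $n$, and compares the leading growth factors $a/2^k$ for $a = 3$ and $a = 5$. The helper names are illustrative.

```python
from fractions import Fraction


def divisions_by_two(n: int) -> int:
    """Number of times a positive integer can be divided by 2 before it is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count


def even_elements(multiplier: int, count: int) -> list[int]:
    """First `count` values of multiplier * n + 1 over odd n = 1, 3, 5, ..."""
    return [multiplier * n + 1 for n in range(1, 2 * count, 2)]


def growth_factor(multiplier: int, divisions: int) -> Fraction:
    """Leading coefficient of n after one step followed by `divisions` halvings."""
    return Fraction(multiplier, 2 ** divisions)


if __name__ == "__main__":
    elems = even_elements(5, 6)
    print(elems, [divisions_by_two(e) for e in elems])
    for a in (3, 5):
        print(a, [str(growth_factor(a, k)) for k in (1, 2, 3)])
```

With multiplier 5, both one and two halvings leave a factor greater than 1 (5/2 and 5/4), whereas with multiplier 3 only a single halving does (3/2), which is the asymmetry the paragraph describes.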

Question. Why does the function $5n+1$ not eventually reach the symmetrical line on its ascending trajectory and collapse to 1?

Answer. None of the even numbers on the symmetrical line belongs to the function’s elements, since deducting 1 from its even integers does not produce odd integers that evaluate to 0 ($\bmod\ 5$).

9. The Only Trivial Cycle of the Collatz Function Is 1-4-2-1

It is easy to prove that the cycle 1-4-2-1 is the only trivial cycle of the Collatz function.

Lemma 8. Let $n$ be any positive odd integer. Then the only trivial cycle of the Collatz function is 1-4-2-1.

Proof. The equation
$$\frac{3n+1}{2^x} = n$$
describes a trivial cycle, where $x$ is an integer that equals the number of divisions by 2. Solving for $n$ yields
$$n = \frac{1}{2^x - 3}.$$
For a positive integer solution, $x$ must be 2, $n$ must be 1, and $3n+1$ must be 4. That is because if $x > 2$, the expression yields $n$ as a fraction, and if $x < 2$, the expression yields a negative value. This leaves the only solution to the equation with the start number 1.
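The case analysis in the proof can be checked by scanning the exponent. The sketch below evaluates $n = 1/(2^x - 3)$ exactly for a range of $x$ and keeps only positive integer solutions. The function name and the upper bound `max_x` are illustrative assumptions.

```python
from fractions import Fraction


def trivial_cycle_solutions(max_x: int = 64) -> list[tuple[int, int]]:
    """Pairs (x, n) with n = 1 / (2**x - 3) a positive integer, for 1 <= x <= max_x."""
    solutions = []
    for x in range(1, max_x + 1):
        n = Fraction(1, 2 ** x - 3)  # 2**x - 3 is never zero for integer x
        if n > 0 and n.denominator == 1:
            solutions.append((x, int(n)))
    return solutions


if __name__ == "__main__":
    print(trivial_cycle_solutions())
```

Within the scanned range, only $x = 2$ with $n = 1$ survives, corresponding to the step $1 \to 4 \to 2 \to 1$.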

10. Nontrivial Nested Cycles

The Collatz conjecture forbids looping anywhere on the Collatz tree except at the bottom of the trunk, as is clear in Figure 1. Starting at any point on the tree, the Collatz function allows the process to head in only one direction, from one point to another on a subbranch, then to another subbranch leading to a main branch, then to the trunk, and finally collapsing to the loop 1-4-2-1. A global nested trajectory of the Collatz conjecture is represented by the sequence that defines the trajectory of a start odd number $n_0$,
$$n_{i+1} = \frac{3n_i + 1}{2^{x_i}}, \quad i = 0, 1, 2, \ldots,$$
where $x_i$ is the number of divisions by 2 at step $i$. For a nested cycle to exist, the sequence of the function must become periodic, with an end number that equals the start number for some $n_0$ and $x_i$.

The Collatz conjecture suggests the nonexistence of a nested cycle whose start number equals its end number. This may not be generalized locally for any relatively small degree of nesting of the function, for which, according to the conjecture, a return to the same start number other than 1 is prohibited. Since we assume that the function heads to stochastic behavior very fast, we may assume with a small degree of certainty that the function does not trace back to the same start number and hit a cycle somewhere on the sequence of integers, by the same reasoning that, in a stochastic distribution of a large number of elements in a sample, the same number is hit twice only with very low probability.

In comparison with the function $5n+1$, which has no elements on the trunk of the Collatz tree and whose cycles must therefore end with an odd number other than 1 (see Figure 1), the function $3n+1$ has elements on the trunk and collapses to 1 if its trajectory reaches the trunk, cycling around the trivial cycle 1-4-2-1. Two known cycles of $5n+1$ are 17-43-27-17 and 13-33-83-13. The higher degree of zigzagging up and down of the function $5n+1$ (compared with the function $3n+1$) starts with the column with a start odd number 13 of the Collatz tree of the specified function, then alternates between other columns, here 33 and 83 for the second cycle, and returns to the column of start number 13 (see Figure 1). Any of the odd numbers in a cycle can be the starting and ending number as well. The columns involved must contain elements of the function that are spaced widely enough for the trajectory to return to the same column (tree branch) it starts from. The generalized function $5n+1$ statistically has a larger chance of hitting the start branch after its launch than $3n+1$, because the degree of zigzagging about its launch branch is higher: it has a wider spacing (multiplying by 5), and division by 2 once as well as twice contributes to the increase of the function, as opposed to only division by 2 once contributing to the increase of the start number for the function $3n+1$. Overall, probabilistically, there does not seem to be anything that prohibits a nontrivial cycle for the function.
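The two cycles quoted above can be reproduced with a short sketch, assuming the generalized function in question is $5n+1$ followed by repeated halving. The helper names `odd_step` and `odd_cycle`, and the step cap, are illustrative assumptions.

```python
from typing import Optional


def odd_step(n: int, multiplier: int = 5) -> int:
    """Map an odd n to the next odd number: multiplier * n + 1 with all factors of 2 removed."""
    m = multiplier * n + 1
    while m % 2 == 0:
        m //= 2
    return m


def odd_cycle(start: int, multiplier: int = 5, max_steps: int = 100) -> Optional[list[int]]:
    """Return the odd numbers visited before returning to start, or None if
    start is not revisited within max_steps (a cap chosen for this sketch)."""
    path = [start]
    for _ in range(max_steps):
        nxt = odd_step(path[-1], multiplier)
        if nxt == start:
            return path
        path.append(nxt)
    return None


if __name__ == "__main__":
    print(odd_cycle(13))
    print(odd_cycle(17))
```

Starting from 13 the odd numbers visited are 13, 33, 83 before returning to 13, and starting from 17 they are 17, 43, 27 before returning to 17.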

11. Conclusion

The convergence process of the function $3n+1$ was proven over its elements up to $2^{36}$ by identifying a sequence of the function’s positive even integers that produces a probability of 50:50 of the division of the integers by 2 more than once as opposed to their division by 2 once, with a ratio of about 3:1. For any positive odd integer, the collective divisions by 2 more than once, which produce a total decrease of the start number in the function’s trajectory, were found to exceed the total increase of the start number produced by division by 2 once. The process indicates a systematic global decrease until one event matches an even number on the symmetrical line, collapses to 1, and loops in the cycle 1-4-2-1, presuming that the function yields no other cycles.

Data Availability

The data used to support the findings of this study are mostly available in the text. Any further data might be requested from the corresponding author.

Conflicts of Interest

The author declares that he has no conflicts of interest.

Acknowledgments

The author is grateful for the help and encouragement he received from Prince Mohammad Bin Fahd University.

References

  1. J. C. Lagarias, “The 3x + 1 problem: an annotated bibliography (1963–1999).”
  2. J. C. Lagarias, “The 3x + 1 problem: an annotated bibliography, II (2000–2009),” 2012.
  3. T. Tao, “The Collatz conjecture, Littlewood-Offord theory, and powers of 2 and 3,” 2011.
  4. J. C. Lagarias, “The 3x + 1 problem and its generalizations,” The American Mathematical Monthly, vol. 92, no. 1, pp. 3–23, 1985.
  5. R. E. Crandall, “On the ‘3x + 1’ problem,” Mathematics of Computation, vol. 32, no. 144, pp. 1281–1292, 1978.
  6. M. Garcia and F. Tal, “A note on the generalized 3n + 1 problem,” Acta Arithmetica, vol. 90, no. 3, pp. 245–250, 1999.
  7. F. Mignosi, “On a generalization of the 3x + 1 problem,” Journal of Number Theory, vol. 55, no. 1, pp. 28–45, 1995.
  8. K. R. Matthews, “Generalized 3x + 1 mappings: Markov chains and ergodic theory,” in The Ultimate Challenge: The 3x + 1 Problem, J. C. Lagarias, Ed., pp. 79–103, AMS, 2010.

Copyright © 2019 Kamal Barghout. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
