ISRN Biomathematics

Volume 2013, Article ID 538631, 8 pages

http://dx.doi.org/10.1155/2013/538631

## Dinucleotide Circular Codes

^{1}Equipe de Bioinformatique Théorique, BFO, LSIIT, UMR 7005, Université de Strasbourg, Pôle API, 300 Boulevard Sébastien Brant, 67400 Illkirch, France

^{2}Istituto di Analisi dei Sistemi ed Informatica “Antonio Ruberti”, Consiglio Nazionale delle Ricerche and Dipartimento di Matematica “Ulisse Dini”, Università di Firenze, Viale Morgagni 67/A, 50134 Firenze, Italy

^{3}Université de Marne-la-Vallée, 5 boulevard Descartes, 77454 Marne-la-Vallée Cedex 2, France

Received 12 October 2012; Accepted 12 December 2012

Academic Editors: J. Chow, M. Jose, M. R. Roussel, and J. H. Wu

Copyright © 2013 Christian J. Michel and Giuseppe Pirillo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We begin here a combinatorial study of dinucleotide circular codes. A word written on a circle is called circular. A set of dinucleotides is a circular code if all circular words constructed with this set have a unique decomposition. Propositions based on a letter necklace make it possible to determine the 24 maximum dinucleotide circular codes (of 6 elements). A partition property is also identified, with eight self-complementary maximum dinucleotide circular codes and two classes of eight maximum dinucleotide circular codes in bijective correspondence by the complementarity map.

#### 1. Introduction

We continue our study of the combinatorial properties of circular codes in genes, that is, on the nucleotide alphabet $A_4 = \{A, C, G, T\}$. A dinucleotide is a word of two letters (diletter) on $A_4$. A trinucleotide is a word of three letters (triletter) on $A_4$. The two sets of 16 dinucleotides and 64 trinucleotides are codes in the sense of language theory but not circular codes [1, 2]. Intuitively, codes are read on a straight line, while circular codes are read on a circle; in both cases, unique decipherability is required.

Trinucleotide comma-free codes, a very particular case of trinucleotide circular codes, have been studied for a long time; see, for example, [3–5]. After the discovery of a trinucleotide circular code in genes with strong mathematical properties [6], circular codes have been studied as mathematical objects in combinatorics, theoretical computer science, and theoretical biology. This theory has developed rapidly; see, for example, [7–27].

Trinucleotides are the fundamental words for genes, that is, the DNA sequences coding the amino acids constituting the protein sequences. However, dinucleotides are also words with important biological functions in genomes. Dinucleotides are involved in some genome sites, for example, the splice sites of introns in eukaryotic genomes are based on the dinucleotides $GT$ and $AG$ [28, 29]. Dinucleotides are also involved in some genome regions, for example, the dinucleotide $CG$ in animal and plant genomes allows a positive or negative control over gene expression [30], and the dinucleotides [31, 32], [33], and [34] in eukaryotic genomes occur as concatenated words (called tandem repeats in biology).

We begin here a new combinatorial study concerning the dinucleotide circular codes. Their number, their list, and a partition according to the complementarity map are determined with propositions based on a letter necklace.

#### 2. Preliminaries

The following definitions and propositions are classical for any finite set of words on any finite alphabet [1]. We recall them for dinucleotides, that is, words of length 2 on a 4-letter alphabet. Let $A_4 = \{A, C, G, T\}$ denote the genetic alphabet, lexicographically ordered by $A < C < G < T$. The set of nonempty words (resp., words) on $A_4$ is denoted by $A_4^{+}$ (resp., $A_4^{*}$). The set of the 16 words of length 2 (dinucleotides or diletters) over $A_4$ is denoted by $A_4^{2}$. The set of the 64 words of length 3 (trinucleotides or triletters) over $A_4$ is denoted by $A_4^{3}$.

*Definition 1.* A set $X$ of words in $A_4^{2}$ is a dinucleotide code if, for each $x_1, \ldots, x_n, y_1, \ldots, y_m \in X$, $n, m \geq 1$, the condition $x_1 \cdots x_n = y_1 \cdots y_m$ implies $n = m$ and $x_i = y_i$ for $i = 1, \ldots, n$.

Dinucleotide codes are read on a straight line.

*Definition 2.* A dinucleotide code $X$ in $A_4^{2}$ is circular if, for each $x_1, \ldots, x_n, y_1, \ldots, y_m \in X$, $n, m \geq 1$, $r \in A_4^{*}$, $s \in A_4^{+}$, the conditions $s x_2 \cdots x_n r = y_1 \cdots y_m$ and $x_1 = rs$ imply $n = m$, $r = \varepsilon$ (empty word), and $x_i = y_i$ for $i = 1, \ldots, n$.

Dinucleotide circular codes are read on a circle.

*Remark 3.* The set $A_4^{2}$ is a code but not a circular code.

*Definition 4.* Two dinucleotides $u$ and $u'$ are conjugate if there exist two letters $l_1$ and $l_2$, $l_1, l_2 \in A_4$, such that $u = l_1 l_2$ and $u' = l_2 l_1$.

Proposition 5 (see [1]). *A dinucleotide circular code cannot contain a word of the form $ll$ with $l \in A_4$.*

The periodic dinucleotides $AA$, $CC$, $GG$, and $TT$ cannot be in a dinucleotide circular code.

Proposition 6 (see [1]). *A dinucleotide circular code cannot contain conjugate dinucleotides. *

*Example 7.* Two conjugate dinucleotides, such as $AC$ and $CA$, cannot be in the same circular code.

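The set operations complementarity $\mathcal{C}$, permutation $\mathcal{P}$, and mirror image $\tilde{\ }$ defined later are involutions.

*Definition 8.* The nucleotide complementarity map $\mathcal{C}: A_4 \rightarrow A_4$ is defined by $\mathcal{C}(A) = T$, $\mathcal{C}(C) = G$, $\mathcal{C}(G) = C$, and $\mathcal{C}(T) = A$.

*Definition 9.* The dinucleotide complementarity map $\mathcal{C}: A_4^{2} \rightarrow A_4^{2}$ is defined by $\mathcal{C}(l_1 l_2) = \mathcal{C}(l_2)\,\mathcal{C}(l_1)$ for all $l_1, l_2 \in A_4$.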

*Example 10.* For instance, $\mathcal{C}(AC) = \mathcal{C}(C)\,\mathcal{C}(A) = GT$.

*Definition 11.* The complementary dinucleotide set $\mathcal{C}(X)$ of a dinucleotide set $X$ is the set obtained by applying the dinucleotide complementarity map $\mathcal{C}$ to all the dinucleotides of $X$.

*Remark 12. *.

*Definition 13.* A dinucleotide circular code $X$ is self-complementary if, for each $u \in X$, $\mathcal{C}(u) \in X$.

*Definition 14.* The (left) dinucleotide circular permutation map $\mathcal{P}: A_4^{2} \rightarrow A_4^{2}$ permutes circularly each dinucleotide $l_1 l_2$, that is, $\mathcal{P}(l_1 l_2) = l_2 l_1$.

*Definition 15.* The permuted dinucleotide set $\mathcal{P}(X)$ of a dinucleotide set $X$ is the set obtained by applying the circular permutation map $\mathcal{P}$ to all the dinucleotides of $X$.

*Remark 16. *.

*Definition 17.* The mirror image of a dinucleotide $u = l_1 l_2$ is $\tilde{u} = l_2 l_1$, $l_1, l_2 \in A_4$.

*Definition 18.* The mirror image $\tilde{X}$ of a dinucleotide set $X$ is the set of the mirror images of all the dinucleotides of $X$.

*Remark 19. *.

*Remark 20.* For a dinucleotide $u$ and for a dinucleotide set $X$, we have $\mathcal{P}(u) = \tilde{u}$ and $\mathcal{P}(X) = \tilde{X}$.
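The three maps recalled above can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the complementarity map is assumed to reverse the letter order (antiparallel strands), which is the reading consistent with the self-complementary codes counted later in the paper.

```python
# Watson-Crick pairing of the four nucleotides (nucleotide complementarity map).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(dinucleotide: str) -> str:
    """Complement both letters and reverse their order (assumed convention)."""
    first, second = dinucleotide
    return COMPLEMENT[second] + COMPLEMENT[first]


def permute(dinucleotide: str) -> str:
    """Left circular permutation: move the first letter to the end."""
    return dinucleotide[1:] + dinucleotide[:1]


def mirror(dinucleotide: str) -> str:
    """Mirror image: read the word from right to left."""
    return dinucleotide[::-1]


DINUCLEOTIDES = [a + b for a in "ACGT" for b in "ACGT"]

# Each map is an involution, and on words of length 2 the circular
# permutation coincides with the mirror image.
assert all(complement(complement(d)) == d for d in DINUCLEOTIDES)
assert all(permute(permute(d)) == d for d in DINUCLEOTIDES)
assert all(permute(d) == mirror(d) for d in DINUCLEOTIDES)

print(complement("AC"), permute("AC"))  # GT CA
```

The coincidence of `permute` and `mirror` is specific to length 2; it fails for trinucleotides, where a circular shift and a reversal are different operations.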

Proposition 21 (see [27]). *A dinucleotide code $X$ is circular if and only if the dinucleotide code $\tilde{X}$ is circular.*

Proposition 22. *A dinucleotide code $X$ is circular if and only if the permuted dinucleotide code $\mathcal{P}(X)$ is circular.*

*Proof. *By Proposition 21 and Remark 20.

*Remark 23. *Proposition 22 is not true with trinucleotides [6].

#### 3. Results

In this paper, we identify the subsets of $A_4^{2}$ which are circular codes. Based on a letter necklace, we prove a necessary and sufficient condition for a set of dinucleotides to be a circular code.

*Definition 24.* Let $l_1, l_2, \ldots, l_{n-1}, l_n$ be letters in $A_4$. One says that the ordered sequence $l_1, l_2, \ldots, l_{n-1}, l_n$ is an $n$-necklace for a subset $X \subseteq A_4^{2}$ if each dinucleotide $l_1 l_2, l_2 l_3, \ldots, l_{n-1} l_n$ belongs to $X$.

Proposition 25. *Let $X$ be a subset of $A_4^{2}$. The following conditions are equivalent:*

(1) *$X$ is a circular code,*

(2) *$X$ has no 5-necklace.*

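*Proof.* (1) $\Rightarrow$ (2). Let $X$ be a circular code. We have to prove that $X$ has no 5-necklace. Suppose, by way of contradiction, that $l_1, l_2, l_3, l_4, l_5$ is a 5-necklace for $X$. As $A_4$ contains four letters, for some $i, j$, $1 \leq i < j \leq 5$, we have $l_i = l_j$. Remark that the maximum value of $j - i$ is 4.

(i) If $j - i = 1$, then $X$ has a periodic dinucleotide $l_i l_i$, in contradiction with Proposition 5.

(ii) If $j - i = 2$, then $X$ has two conjugate dinucleotides $l_i l_{i+1}$ and $l_{i+1} l_i$, in contradiction with Proposition 6.

(iii) If $j - i = 3$, then either $i = 1$ or $i = 2$.

(iiia) If $i = 1$, the 5-necklace is $l_1, l_2, l_3, l_1, l_5$. So, $l_1 l_2, l_2 l_3, l_3 l_1 \in X$. Consider the sequence $l_1 l_2 l_3 l_1 l_2 l_3$. Put $x_1 = l_1 l_2$, $x_2 = l_3 l_1$, $x_3 = l_2 l_3$, $y_1 = l_2 l_3$, $y_2 = l_1 l_2$, $y_3 = l_3 l_1$. Note that $x_1, x_2, x_3, y_1, y_2, y_3$ belong to $X$. Now, the following relations hold: $l_2 x_2 x_3 l_1 = y_1 y_2 y_3$ and $x_1 = l_1 l_2$. Taking $r = l_1$ and $s = l_2$ in Definition 2, the word $r$ is not empty, in contradiction with the assumption that $X$ is a circular code.

(iiib) The case $i = 2$ is analogous.

(iv) If $j - i = 4$, then $i = 1$, $j = 5$, and the 5-necklace is $l_1, l_2, l_3, l_4, l_1$. So, $l_1 l_2, l_2 l_3, l_3 l_4, l_4 l_1 \in X$. Consider the sequence $l_1 l_2 l_3 l_4$. Put $x_1 = l_1 l_2$, $x_2 = l_3 l_4$, $y_1 = l_2 l_3$, $y_2 = l_4 l_1$. Note that $x_1, x_2, y_1, y_2$ belong to $X$. Now, the following relations hold: $l_2 x_2 l_1 = y_1 y_2$ and $x_1 = l_1 l_2$, again in contradiction with the assumption that $X$ is a circular code.

(2) $\Rightarrow$ (1). Let $X$ be without 5-necklace and suppose, by way of contradiction, that $X$ is not a circular code. As $X$ is a uniform code, there exist $x_1, \ldots, x_n, y_1, \ldots, y_n \in X$, $r \in A_4^{+}$, $s \in A_4^{+}$, such that $s x_2 \cdots x_n r = y_1 \cdots y_n$ and $x_1 = rs$. Moreover, as all the elements of $X$ have a length of 2, there exist $l_1, \ldots, l_{2n} \in A_4$ such that $x_1 = l_1 l_2$, $x_2 = l_3 l_4$, $\ldots$, $x_n = l_{2n-1} l_{2n}$ and $y_1 = l_2 l_3$, $y_2 = l_4 l_5$, $\ldots$, $y_n = l_{2n} l_1$. Now if $n = 1$, then $l_1, l_2, l_1, l_2, l_1$ is a 5-necklace; if $n = 2$, then $l_1, l_2, l_3, l_4, l_1$ is a 5-necklace; if $n \geq 3$, then $l_1, l_2, l_3, l_4, l_5$ is a 5-necklace. In any case, there is a 5-necklace. Contradiction.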
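The necklace criterion of Proposition 25 translates directly into a test that can be run on any set of dinucleotides. The following is a minimal brute-force sketch over all 5-letter sequences; the function names are ours and the alphabet is assumed to be A, C, G, T.

```python
from itertools import product


def has_5_necklace(code, alphabet="ACGT"):
    """Return True if some letters l1..l5 have l1l2, l2l3, l3l4, l4l5 all in code."""
    code = set(code)
    for letters in product(alphabet, repeat=5):
        if all(letters[i] + letters[i + 1] in code for i in range(4)):
            return True
    return False


def is_circular(code):
    """Circularity test for a set of dinucleotides, using the necklace criterion."""
    return not has_5_necklace(code)


print(is_circular({"AC", "AG", "AT", "CG", "CT", "GT"}))  # True
print(is_circular({"AC", "CA"}))        # False: necklace A, C, A, C, A
print(is_circular({"AC", "CG", "GA"}))  # False: necklace A, C, G, A, C
```

Only 4^5 = 1024 sequences are examined, so the exhaustive loop is cheap; a graph-based test (no directed cycle on the four letters) would give the same answers.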

If a dinucleotide set $X$ is a circular code, then no word obtained by concatenating dinucleotides of $X$ has two different decompositions when it is written on a circle.

*Example 26. *Consider the set containing only the dinucleotides and *. *Let be any sequence with or *. *As does not contain *, **, * and *, *the sequence cannot have a double decomposition on a circle. The set has no 5-necklace as, if is (resp., ) then must be (resp., ), but (resp., ) is never a suffix in *. *There are sets with six dinucleotides which are circular codes. For example, any sequence with the set of dinucleotides , , *, *, *, * has no double decomposition on a circle.

More generally, write the sequence of on a circle. If , , , , , belong to a set , then cannot be a circular code because the sequence can be read in two ways: (with as the first letter) and (with as the first letter). There is a double reading of the sequence (corresponding to a double reading of the sequence ). In this case, is a -necklace for .

*Example 27. *If *, **, **, **, **, * are dinucleotides of *, *we have the following relations: *, * and *. *So, is not a dinucleotide circular code (also a consequence of the fact that contains two conjugate dinucleotides and )*. *

Proposition 28. *A dinucleotide circular code has at most 6 elements. *

*Proof.* There are 16 dinucleotides. Four dinucleotides are periodic: $AA$, $CC$, $GG$, and $TT$. The remaining 12 dinucleotides are partitioned in six conjugation classes: $\{AC, CA\}$, $\{AG, GA\}$, $\{AT, TA\}$, $\{CG, GC\}$, $\{CT, TC\}$, and $\{GT, TG\}$. By Proposition 6, a dinucleotide circular code has at most one dinucleotide in each of these conjugation classes. So, a dinucleotide circular code has at most 6 elements.
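The counting argument in this proof can be reproduced mechanically. The sketch below (variable names are ours) lists the periodic dinucleotides and groups the others into conjugation classes.

```python
from itertools import product

ALPHABET = "ACGT"
dinucleotides = ["".join(p) for p in product(ALPHABET, repeat=2)]

# Periodic dinucleotides: the same letter twice.
periodic = [d for d in dinucleotides if d[0] == d[1]]

# A conjugation class pairs a non-periodic dinucleotide with its swap.
classes = sorted(
    {tuple(sorted((d, d[::-1]))) for d in dinucleotides if d[0] != d[1]}
)

print(periodic)      # ['AA', 'CC', 'GG', 'TT']
print(len(classes))  # 6
for conjugation_class in classes:
    print(conjugation_class)
```

Since a circular code takes at most one word per class and none of the periodic words, the bound of 6 elements follows.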

Proposition 29. *Let $l_1, l_2, l_3, l_4$ be a permutation of $A, C, G, T$. If*

$$X = \{l_1 l_2,\ l_1 l_3,\ l_1 l_4,\ l_2 l_3,\ l_2 l_4,\ l_3 l_4\},$$

*then $X$ is a dinucleotide circular code.*

*Proof.* Suppose, by way of contradiction, that $X$ is not a dinucleotide circular code, and let $k_1, k_2, k_3, k_4, k_5$ be a 5-necklace of $X$ (Proposition 25). Note that, with the exception of $k_1$, the other letters composing the necklace must be a suffix of a dinucleotide of $X$.

*Claim 1.* For $i = 2, 3, 4, 5$, $k_i \neq l_1$.

*Proof of Claim 1.* By inspection, $l_1$ is never a suffix of a dinucleotide of $X$.

*Claim 2.* For $i = 3, 4, 5$, $k_i \neq l_2$.

*Proof of Claim 2.* By inspection, $l_2$ is a suffix only of $l_1 l_2$. For $i = 3, 4, 5$, if $k_i = l_2$, then $k_{i-1} = l_1$, which is impossible by Claim 1.

*Claim 3.* For each $i = 4, 5$, $k_i \neq l_3$.

*Proof of Claim 3.* By inspection, $l_3$ is a suffix only of $l_1 l_3$ and $l_2 l_3$. Suppose, by way of contradiction, that $k_4 = l_3$. Then, $k_3 = l_1$ or $k_3 = l_2$. If $k_3 = l_1$, we are in contradiction with Claim 1, and if $k_3 = l_2$, we are in contradiction with Claim 2. Suppose, by way of contradiction, that $k_5 = l_3$. Then, $k_4 = l_1$ or $k_4 = l_2$. If $k_4 = l_1$, we are in contradiction with Claim 1, and if $k_4 = l_2$, we are in contradiction with Claim 2.

*Claim 4.* $k_5 \neq l_4$.

*Proof of Claim 4.* By inspection, $l_4$ is a suffix only of $l_1 l_4$, $l_2 l_4$, and $l_3 l_4$. Suppose, by way of contradiction, that $k_5 = l_4$. Then, $k_4 = l_1$, $k_4 = l_2$, or $k_4 = l_3$. In the first case, we are in contradiction with Claim 1; in the second case, we are in contradiction with Claim 2; and in the third case, we are in contradiction with Claim 3.

By Claims 1, 2, 3, and 4, we have $k_5 \notin \{l_1, l_2, l_3, l_4\}$, which is impossible, and so $X$ has no 5-necklace. Consequently, $X$ is a dinucleotide circular code.
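The construction of Proposition 29 is easy to instantiate: a permutation of the four letters yields the six dinucleotides whose first letter comes earlier in the permutation than the second. A minimal sketch follows, with hypothetical helper names and a compact necklace test that extends walks one dinucleotide at a time.

```python
from itertools import combinations, permutations


def code_from_permutation(perm):
    """Dinucleotides l_i l_j with i < j for the ordered letters of perm."""
    return {a + b for a, b in combinations(perm, 2)}


def has_5_necklace(code):
    """True if four dinucleotides of code can be chained letter by letter."""
    ends = {d[0] for d in code}  # letters that can start a chain
    for _ in range(4):           # extend every chain by one dinucleotide
        ends = {d[1] for d in code if d[0] in ends}
    return bool(ends)


codes = {frozenset(code_from_permutation(p)) for p in permutations("ACGT")}

print(sorted(code_from_permutation("ACGT")))
# ['AC', 'AG', 'AT', 'CG', 'CT', 'GT']
print(len(codes))                                 # 24
print(any(has_5_necklace(c) for c in codes))      # False
```

The 24 permutations give 24 distinct sets, none of which has a 5-necklace, in agreement with the proposition.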

*Definition 30. *A maximum dinucleotide circular code is a dinucleotide circular code having 6 elements.

*Remark 31.* In Proposition 29, we have considered an arbitrary permutation $l_1, l_2, l_3, l_4$ of $A_4$, and we have proved that a maximum dinucleotide circular code corresponds to it. As the number of possible permutations is 24, the number of maximum dinucleotide circular codes is at least 24, and we will prove hereafter that it is exactly 24.

In the maximum dinucleotide circular code $X$ (Proposition 29), the letter $l_1$ has three occurrences in prefix of dinucleotides of $X$ (shortly, in prefix of $X$), the letter $l_2$ has two occurrences in prefix of $X$, and $l_3$ has one occurrence in prefix of $X$. The letter $l_4$ never occurs in prefix of $X$. This is a general fact, in the sense that in each maximum dinucleotide circular code $X$ there is a letter, say $l_1$, with three occurrences in prefix of $X$, a letter, say $l_2$, with two occurrences in prefix of $X$, and a letter, say $l_3$, with one occurrence in prefix of $X$, while the remaining letter, say $l_4$, never occurs in prefix of $X$.

We will prove this general fact formally. In the sequel, a set of $k$ nonnegative integers having a sum equal to $n$ is called a $k$-partition of $n$. By “set” we rather mean a “multiset”, as some numbers can be equal. Define $p_A$ (resp., $p_C$, $p_G$, $p_T$) as the number of occurrences of $A$ (resp., $C$, $G$, $T$) in prefix of a maximum dinucleotide circular code $X$.

Lemma 32. *If $X$ is a maximum dinucleotide circular code, then $\{p_A, p_C, p_G, p_T\}$ is a 4-partition of 6.*

*Proof.* By Proposition 28.

Lemma 33. *In any dinucleotide circular code, one has $p_A, p_C, p_G, p_T \leq 3$.*

*Proof.* The alphabet contains four letters, and a dinucleotide circular code cannot contain periodic dinucleotides.

*Example 34. *For *,* the 4-partition of 6 is *. *

The following lemma will prove that the unique possible 4-partition for a maximum dinucleotide circular code is $\{3, 2, 1, 0\}$.

Lemma 35. *For each maximum dinucleotide circular code $X$, there exists a permutation $l_1, l_2, l_3, l_4$ of $A_4$ such that $l_1$ has three occurrences in prefix of $X$, $l_2$ has two occurrences in prefix of $X$, $l_3$ has one occurrence in prefix of $X$, and $l_4$ has no occurrence in prefix of $X$.*

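*Proof.* Putting the values in nonincreasing order, by Lemma 33, we have to consider only the following cases: $(3, 3, 0, 0)$, $(3, 2, 1, 0)$, $(3, 1, 1, 1)$, $(2, 2, 2, 0)$, and $(2, 2, 1, 1)$.

*Case $(3, 3, 0, 0)$.* Let $l_1$ be a letter with three occurrences in prefix of $X$. Let $l_2, l_3, l_4$ be the three other letters of $A_4$. We have $l_1 l_2, l_1 l_3, l_1 l_4 \in X$. Without loss of generality, suppose that $l_2$ has three occurrences in prefix of $X$. Necessarily one of the two dinucleotides $l_2 l_1$ and $l_2 l_2$ must be in $X$. But, in the first case, we are in contradiction with Proposition 6, and in the second case, we are in contradiction with Proposition 5. So, the case $(3, 3, 0, 0)$ is impossible.

*Case $(3, 1, 1, 1)$.* Let $l_1$ be the letter with three occurrences in prefix of $X$. Let $l_2, l_3, l_4$ be the three other letters of $A_4$. Then, we have $l_1 l_2, l_1 l_3, l_1 l_4 \in X$ and $l_2 l_1, l_3 l_1, l_4 l_1 \notin X$; otherwise, we are in contradiction with Propositions 5 and 6.

Now, suppose that in $X \setminus \{l_1 l_2, l_1 l_3, l_1 l_4\}$, the same letter, say $l_2$ without loss of generality, has two occurrences in suffix, that is, $l_3 l_2, l_4 l_2 \in X$. The letter $l_2$ cannot be a prefix of $X$. Indeed, $l_2 l_3$ and $l_2 l_4$ are in contradiction with Proposition 6, and $l_2 l_2$ is in contradiction with Proposition 5.

So, in $X \setminus \{l_1 l_2, l_1 l_3, l_1 l_4\}$, the letters $l_2, l_3, l_4$ must have only one occurrence in suffix. Without loss of generality, we have $l_2 l_3, l_3 l_4, l_4 l_2 \in X$. But, $l_2, l_3, l_4, l_2, l_3$ is a 5-necklace for $X$. By Proposition 25, we are in contradiction. So, the case $(3, 1, 1, 1)$ is impossible.

*Case $(2, 2, 2, 0)$.* Let $l_1$ be one of the three letters with two occurrences in prefix of $X$. Let $l_2, l_3, l_4$ be the three other letters of $A_4$. Without loss of generality, we have $l_1 l_2, l_1 l_3 \in X$. For the two other letters having two occurrences in prefix of $X$, we have three possibilities: $\{l_2, l_3\}$, $\{l_2, l_4\}$, and $\{l_3, l_4\}$.

*Case $\{l_2, l_3\}$.* By Propositions 6 and 5, $l_2 l_1, l_2 l_2 \notin X$, so $l_2 l_3, l_2 l_4 \in X$; likewise, $l_3 l_1, l_3 l_3 \notin X$, so $l_3 l_2, l_3 l_4 \in X$. As $l_2 l_3$ and $l_3 l_2$ are conjugate, we are in contradiction with Proposition 6.

*Case $\{l_2, l_4\}$.* By Propositions 6 and 5, $l_2 l_1, l_2 l_2 \notin X$, so $l_2 l_3, l_2 l_4 \in X$. As $l_4 l_2 \notin X$ (otherwise, we are in contradiction with Proposition 6), the two dinucleotides with $l_4$ in prefix of $X$ must be $l_4 l_1$ and $l_4 l_3$ and, consequently, $X = \{l_1 l_2, l_1 l_3, l_2 l_3, l_2 l_4, l_4 l_1, l_4 l_3\}$. But, $l_1, l_2, l_4, l_1, l_2$ is a 5-necklace for $X$. By Proposition 25, we are in contradiction.

*Case $\{l_3, l_4\}$.* By Propositions 6 and 5, $l_3 l_1, l_3 l_3 \notin X$, so $l_3 l_2, l_3 l_4 \in X$. As $l_4 l_3 \notin X$ (otherwise, we are in contradiction with Proposition 6), the two dinucleotides with $l_4$ in prefix of $X$ must be $l_4 l_1$ and $l_4 l_2$ and, consequently, $X = \{l_1 l_2, l_1 l_3, l_3 l_2, l_3 l_4, l_4 l_1, l_4 l_2\}$. But, $l_1, l_3, l_4, l_1, l_3$ is a 5-necklace for $X$. By Proposition 25, we are in contradiction.

So, the case $(2, 2, 2, 0)$ is impossible.

*Case $(2, 2, 1, 1)$.* Let $l_1$ be one of the two letters with two occurrences in prefix of $X$. Let $l_2, l_3, l_4$ be the three other letters of $A_4$. Without loss of generality, we have $l_1 l_2, l_1 l_3 \in X$. Consider the following cases:

(i) $l_2$ has two occurrences in prefix of $X$, and $l_3$ and $l_4$ have one occurrence in prefix of $X$. By Propositions 6 and 5, $l_2 l_3, l_2 l_4 \in X$, and $l_3 l_4$ is the unique possible dinucleotide of $X$ with $l_3$ in prefix. By Propositions 6 and 5, $l_4 l_2, l_4 l_3, l_4 l_4 \notin X$. If $l_4 l_1 \in X$, then $l_1, l_2, l_4, l_1, l_2$ is a 5-necklace of $X$, and by Proposition 25, we are in contradiction. So, $l_4$ cannot be a prefix of $X$. Contradiction.

(ii) $l_3$ has two occurrences in prefix of $X$, and $l_2$ and $l_4$ have one occurrence in prefix of $X$. By Propositions 6 and 5, $l_3 l_2, l_3 l_4 \in X$, and $l_2 l_4$ is the unique possible dinucleotide of $X$ with $l_2$ in prefix. By Propositions 6 and 5, $l_4 l_2, l_4 l_3, l_4 l_4 \notin X$ and $l_4 l_1 \notin X$ (otherwise, $l_1, l_3, l_4, l_1, l_3$ is a 5-necklace for $X$, and by Proposition 25, we are in contradiction). So, $l_4$ cannot be a prefix of $X$. Contradiction.

(iii) $l_4$ has two occurrences in prefix of $X$, and $l_2$ and $l_3$ have one occurrence in prefix of $X$. By Propositions 6 and 5, we have three possible cases: $l_4 l_1, l_4 l_2 \in X$; $l_4 l_1, l_4 l_3 \in X$; and $l_4 l_2, l_4 l_3 \in X$.

(iiia) $l_4 l_1, l_4 l_2 \in X$. So, $l_1 l_2, l_1 l_3, l_4 l_1, l_4 l_2 \in X$. By Propositions 6 and 5, $l_2 l_1, l_2 l_2, l_2 l_4 \notin X$, so $l_2 l_3 \in X$. By Propositions 6 and 5, $l_3 l_2 \in X$ or $l_3 l_4 \in X$. In the first case, $l_2 l_3, l_3 l_2 \in X$, and by Proposition 6, we are in contradiction. In the second case, $X = \{l_1 l_2, l_1 l_3, l_2 l_3, l_3 l_4, l_4 l_1, l_4 l_2\}$. But, $l_1, l_3, l_4, l_1, l_3$ is a 5-necklace for $X$. By Proposition 25, we are in contradiction.

(iiib) $l_4 l_1, l_4 l_3 \in X$. So, $l_1 l_2, l_1 l_3, l_4 l_1, l_4 l_3 \in X$. By Propositions 6 and 5, $l_3 l_1, l_3 l_3, l_3 l_4 \notin X$, so $l_3 l_2 \in X$. By Propositions 6 and 5, $l_2 l_4$ is the unique possible dinucleotide of $X$ with prefix $l_2$. So, $X = \{l_1 l_2, l_1 l_3, l_2 l_4, l_3 l_2, l_4 l_1, l_4 l_3\}$. But, $l_1, l_2, l_4, l_1, l_2$ is a 5-necklace for $X$. By Proposition 25, we are in contradiction.

(iiic) $l_4 l_2, l_4 l_3 \in X$. So, $l_1 l_2, l_1 l_3, l_4 l_2, l_4 l_3 \in X$. By Propositions 6 and 5, $l_2 l_1, l_2 l_2, l_2 l_4 \notin X$, so $l_2 l_3 \in X$. By Propositions 6 and 5, $l_3 l_1, l_3 l_3, l_3 l_4 \notin X$, so $l_3 l_2 \in X$. As $l_2 l_3$ and $l_3 l_2$ are conjugate, we are in contradiction with Proposition 6.

So, the case $(2, 2, 1, 1)$ is also impossible.

Only the 4-partition $(3, 2, 1, 0)$ is realized by $X$. It corresponds to the permutation $l_1, l_2, l_3, l_4$ of $A_4$, where $l_1$, $l_2$, $l_3$, and $l_4$ are the letters with three, two, one, and no occurrences in prefix of $X$, respectively.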
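The case list at the start of this proof, and the prefix counts of a given code, can be checked with a short sketch. The function names and default parameters below are ours; the example code is the one built from the permutation A, C, G, T.

```python
from collections import Counter
from itertools import combinations_with_replacement


def candidate_partitions(total=6, parts=4, largest=3):
    """Nonincreasing tuples of `parts` integers in 0..largest summing to total."""
    values = range(largest, -1, -1)
    return [
        p for p in combinations_with_replacement(values, parts)
        if sum(p) == total
    ]


def prefix_counts(code, alphabet="ACGT"):
    """Occurrences of each letter as first letter, in nonincreasing order."""
    counts = Counter(d[0] for d in code)
    return sorted((counts[letter] for letter in alphabet), reverse=True)


print(candidate_partitions())
# [(3, 3, 0, 0), (3, 2, 1, 0), (3, 1, 1, 1), (2, 2, 2, 0), (2, 2, 1, 1)]
print(prefix_counts({"AC", "AG", "AT", "CG", "CT", "GT"}))
# [3, 2, 1, 0]
```

The bound `largest=3` encodes Lemma 33; without it, tuples such as (4, 1, 1, 0) would also appear.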

Proposition 36. *There are 24 maximum dinucleotide circular codes. *

*Proof.* By Proposition 29, each permutation $l_1, l_2, l_3, l_4$ of $A_4$ is associated with a maximum dinucleotide circular code $X$. As there are $4! = 24$ permutations, the number of maximum dinucleotide circular codes is at least 24.

Now, let $X$ be a maximum dinucleotide circular code. By Lemma 35, its 4-partition must be $\{3, 2, 1, 0\}$. Let $l_1$ (resp., $l_2$, $l_3$, $l_4$) be the letter of $A_4$ having 3 (resp., 2, 1, 0) occurrences in prefix of $X$. As $l_1$ has three occurrences in prefix of $X$, we must have $l_1 l_2, l_1 l_3, l_1 l_4 \in X$. As $l_2$ has two occurrences in prefix of $X$, and as $l_1 l_2$ is already in $X$, we must also have $l_2 l_3, l_2 l_4 \in X$. Finally, as $l_3$ has only one occurrence in prefix of $X$, and as $l_1 l_3$ and $l_2 l_3$ are already in $X$, we must have $l_3 l_4 \in X$. Consequently, $X = \{l_1 l_2, l_1 l_3, l_1 l_4, l_2 l_3, l_2 l_4, l_3 l_4\}$, and $X$ is one of the 24 maximum circular codes already considered. Thus, the number of maximum dinucleotide circular codes is exactly 24.

A computer calculation confirms that there are exactly 24 maximum dinucleotide circular codes (Table 1).
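The authors do not describe their program; one possible way to redo such a check is an exhaustive search over all subsets of the 16 dinucleotides, keeping those without a 5-necklace. The sketch below is an assumption about how this could be done, not the authors' method, and all names are ours.

```python
from itertools import combinations, permutations

DINUCLEOTIDES = [a + b for a in "ACGT" for b in "ACGT"]


def has_5_necklace(code):
    """True if four dinucleotides of code can be chained letter by letter."""
    ends = {d[0] for d in code}
    for _ in range(4):
        ends = {d[1] for d in code if d[0] in ends}
    return bool(ends)


def circular_codes_of_size(size):
    """All circular codes with exactly `size` dinucleotides."""
    return [
        frozenset(candidate)
        for candidate in combinations(DINUCLEOTIDES, size)
        if not has_5_necklace(candidate)
    ]


maximum_codes = circular_codes_of_size(6)
from_permutations = {
    frozenset(a + b for a, b in combinations(p, 2))
    for p in permutations("ACGT")
}

print(len(maximum_codes))                        # 24
print(set(maximum_codes) == from_permutations)   # True
print(len(circular_codes_of_size(7)))            # 0
```

The search visits 8008 subsets of size 6 and 11440 of size 7, so it completes quickly; the last line re-derives the bound of Proposition 28.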

There are eight self-complementary maximum dinucleotide circular codes: , , , , , , , and (Table 1). The 16 remaining ones are partitioned in two classes of eight maximum dinucleotide circular codes in bijective correspondence by the complementarity map (Table 1).

Proposition 37. *If $X$ is a maximum dinucleotide circular code, then $\mathcal{C}(X)$ is also a maximum dinucleotide circular code.*

*Proof . *By inspection (Table 1).
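Since Table 1 is checked by inspection, the same verification can be sketched in code. The example below assumes the 24 maximum codes are those built from permutations (Propositions 29 and 36) and that complementarity reverses the letter order; the helper names are ours.

```python
from itertools import combinations, permutations

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_set(code):
    """Apply the dinucleotide complementarity map to every word of code."""
    return frozenset(COMPLEMENT[d[1]] + COMPLEMENT[d[0]] for d in code)


def permuted_set(code):
    """Apply the circular permutation map to every word of code."""
    return frozenset(d[1] + d[0] for d in code)


maximum_codes = {
    frozenset(a + b for a, b in combinations(p, 2))
    for p in permutations("ACGT")
}

# Closure of the 24 codes under complementarity and under permutation.
print(all(complement_set(x) in maximum_codes for x in maximum_codes))  # True
print(all(permuted_set(x) in maximum_codes for x in maximum_codes))    # True

self_complementary = [x for x in maximum_codes if complement_set(x) == x]
complementary_pairs = {
    frozenset({x, complement_set(x)})
    for x in maximum_codes
    if complement_set(x) != x
}
print(len(self_complementary), len(complementary_pairs))  # 8 8
```

The eight unordered pairs account for the 16 codes that are not self-complementary, matching the partition into 8, 8, and 8 codes stated in the text.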

Proposition 38. *If is a maximum dinucleotide circular code, then
*

*Proof. *By inspection (Table 1).

This proposition is not true with maximum trinucleotide circular codes, see for example, [6].

#### 4. Conclusion

This new combinatorial study of circular codes in genes has proved that there are exactly 24 maximum dinucleotide circular codes on the 4-letter genetic alphabet $\{A, C, G, T\}$. They are listed in Table 1. Propositions 22, 37, and 38 lead to interesting properties of dinucleotide circular codes in DNA: they ensure that several maximum dinucleotide circular codes can exist simultaneously in the two strands of the DNA double helix. Indeed, a maximum dinucleotide circular code $X$ in a given strand of DNA implies that its complementary set $\mathcal{C}(X)$ in the complementary strand of DNA is also a maximum dinucleotide circular code (Proposition 37), according to two possibilities: either $X$ is self-complementary, or $\mathcal{C}(X)$ is a different maximum dinucleotide circular code (Table 1). Furthermore, its permuted set $\mathcal{P}(X)$, obtained by a frameshift of one letter, is also a maximum dinucleotide circular code (Proposition 22). Finally, its complementary permuted set is also a maximum dinucleotide circular code (Proposition 38).

Chemical modification of nucleotides is ubiquitous in RNA and DNA. So far, a total of 107 modified nucleotides, for which chemical structures have been assigned, have been reported in RNA (see the RNA Modification Database at http://rna-mdb.cas.albany.edu/RNAmods/ [35]). The largest number, that is, 81, with the greatest structural diversity, is found in tRNA, with 30 in rRNA, 12 in mRNA, and 13 in other RNA species, most notably snRNA. The four nucleotides can be chemically modified, for example, methyladenosine, dimethyladenosine, trimethyladenosine, methylcytidine, dimethylcytidine, thiocytidine, methylguanosine, dimethylguanosine, trimethylguanosine, methyluridine, dimethyluridine, thiouridine, pseudouridine, dihydrouridine, but also inosine, lysidine, wybutosine, wyosine, queuosine, and archaeosine. In DNA, the cytosine in the $CG$ dinucleotide, involved in gene regulation, can have two chemical forms (methylcytosine, hydroxymethylcytosine). This chemical change makes it possible to store additional information, thus expanding the alphabet by two letters. The generalization of the dinucleotide circular code propositions to larger alphabets is therefore of interest and should be investigated.

Dinucleotide circular codes may be involved in retrieval of the modulo frame in genomes, for example, in the dinucleotide repeats.

Dinucleotide circular codes may also have a biological function in the coding process of amino acids. In the standard genetic code, eight amino acids *Ala* (*A*), *Arg* (*R*), *Gly* (*G*), *Leu* (*L*), *Pro* (*P*), *Thr* (*T*), *Ser* (*S*), and *Val* (*V*) are coded by sets of trinucleotides involving dinucleotides. Indeed, for each of these eight amino acids, there exists a dinucleotide $d$ such that all the trinucleotides of the form $dl$ (where $l$ is any letter of $A_4$) code the same amino acid (Table 2).

Now, *Gly* (*G*) and *Pro* (*P*) cannot be coded by a dinucleotide circular code as their dinucleotides $GG$ and $CC$ are periodic. Moreover, *Ala* (*A*) and *Arg* (*R*) cannot be coded simultaneously by a dinucleotide circular code as their dinucleotides $GC$ and $CG$ are conjugate, and similarly for *Leu* (*L*) and *Ser* (*S*) with the conjugate dinucleotides $CT$ and $TC$. On the other hand, as any subset of a maximum dinucleotide circular code is also a dinucleotide circular code, the following properties exist.

(i) The four amino acids *Arg* (*R*), *Leu* (*L*), *Thr* (*T*), and *Val* (*V*) can be coded by the dinucleotide circular code $\{CG, CT, AC, GT\}$, which is a proper subset of a maximum dinucleotide circular code (Table 1).

(ii) The four amino acids *Ala* (*A*), *Leu* (*L*), *Thr* (*T*), and *Val* (*V*) can be coded by the dinucleotide circular code $\{GC, CT, AC, GT\}$, which is a proper subset of a maximum dinucleotide circular code (Table 1).

(iii) The four amino acids *Ala* (*A*), *Ser* (*S*), *Thr* (*T*), and *Val* (*V*) can be coded by the dinucleotide circular code $\{GC, TC, AC, GT\}$, which is a proper subset of a maximum dinucleotide circular code (Table 1).
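The three groups of amino acids listed above can be recovered by a small search. The sketch below uses the dinucleotide prefixes of the four-codon families in the standard genetic code (for example, GCN for Ala and CGN for Arg); the dictionary and function names are ours.

```python
from itertools import combinations

# First two letters of the four-codon family of each amino acid.
DINUCLEOTIDE_OF = {
    "Ala": "GC", "Arg": "CG", "Gly": "GG", "Leu": "CT",
    "Pro": "CC", "Ser": "TC", "Thr": "AC", "Val": "GT",
}


def is_circular(code):
    """Circular if no four dinucleotides can be chained (no 5-necklace)."""
    ends = {d[0] for d in code}
    for _ in range(4):
        ends = {d[1] for d in code if d[0] in ends}
    return not ends


def largest_circular_groups():
    """Largest groups of amino acids whose dinucleotides form a circular code."""
    names = sorted(DINUCLEOTIDE_OF)
    for size in range(len(names), 0, -1):
        groups = [
            group for group in combinations(names, size)
            if is_circular({DINUCLEOTIDE_OF[name] for name in group})
        ]
        if groups:
            return groups
    return []


for group in largest_circular_groups():
    print(group, sorted(DINUCLEOTIDE_OF[name] for name in group))
```

The search returns exactly three groups of four amino acids. The fourth candidate, Arg, Ser, Thr, and Val, is rejected because CG, GT, and TC chain into the necklace C, G, T, C, G.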

These results contribute to the research field analysing the mathematical properties of genetic codes.

#### Acknowledgments

The authors thank the reviewers and Jacques Justin for their advice. The second author thanks the Dipartimento di Matematica “U. Dini” for its friendly hospitality.

#### References

1. J. Berstel and D. Perrin, *Theory of Codes*, Academic Press, London, UK, 1985.
2. J. L. Lassez, “Circular codes and synchronization,” *International Journal of Computer and Information Sciences*, vol. 5, no. 2, pp. 201–208, 1976.
3. F. H. C. Crick, J. S. Griffith, and L. E. Orgel, “Codes without commas,” *Proceedings of the National Academy of Sciences*, vol. 43, pp. 416–421, 1957.
4. S. W. Golomb, B. Gordon, and L. R. Welch, “Comma-free codes,” *Canadian Journal of Mathematics*, vol. 10, pp. 202–209, 1958.
5. S. W. Golomb, L. R. Welch, and M. Delbrück, “Construction and properties of comma-free codes,” *Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab*, vol. 23, no. 9, 1958.
6. D. G. Arquès and C. J. Michel, “A complementary circular code in the protein coding genes,” *Journal of Theoretical Biology*, vol. 182, no. 1, pp. 45–58, 1996.
7. A. J. Koch and J. Lehmann, “About a symmetry of the genetic code,” *Journal of Theoretical Biology*, vol. 189, no. 2, pp. 171–174, 1997.
8. M. P. Béal and J. Senellart, “On the bound of the synchronization delay of a local automaton,” *Theoretical Computer Science*, vol. 205, no. 1-2, pp. 297–306, 1998.
9. F. Bassino, “Generating functions of circular codes,” *Advances in Applied Mathematics*, vol. 22, no. 1, pp. 1–24, 1999.
10. R. Jolivet and F. Rothen, “Peculiar symmetry of DNA sequences and evidence suggesting its evolutionary origin in a primeval genetic code,” in *Proceedings of the 1st European Workshop in Exo-/Astro-Biology*, P. Ehrenfreund, O. Angerer, and B. Battrick, Eds., ESA SP-496, pp. 173–176, Noordwijk, The Netherlands.
11. G. Frey and C. J. Michel, “Circular codes in archaeal genomes,” *Journal of Theoretical Biology*, vol. 223, no. 4, pp. 413–431, 2003.
12. C. Nikolaou and Y. Almirantis, “Mutually symmetric and complementary triplets: differences in their use distinguish systematically between coding and non-coding genomic sequences,” *Journal of Theoretical Biology*, vol. 223, no. 4, pp. 477–487, 2003.
13. G. Pirillo, “A characterization for a set of trinucleotides to be a circular code,” in *Determinism, Holism, and Complexity*, C. Pellegrini, P. Cerrai, P. Freguglia, V. Benci, and G. Israel, Eds., Kluwer Academic Publisher, New York, NY, USA, 2003.
14. G. Pirillo and M. A. Pirillo, “Growth function of self-complementary circular codes,” *Biology Forum*, vol. 98, no. 1, pp. 97–110, 2005.
15. G. Frey and C. J. Michel, “Identification of circular codes in bacterial genomes and their use in a factorization method for retrieving the reading frames of genes,” *Computational Biology and Chemistry*, vol. 30, no. 2, pp. 87–101, 2006.
16. J. L. Lassez, R. A. Rossi, and A. E. Bernal, “Crick's hypothesis revisited: the existence of a universal coding frame,” in *Proceedings of the 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW '07)*, pp. 745–751, Niagara Falls, Canada, May 2007.
17. C. J. Michel, G. Pirillo, and M. A. Pirillo, “Varieties of comma-free codes,” *Computers and Mathematics with Applications*, vol. 55, no. 5, pp. 989–996, 2008.
18. C. J. Michel, G. Pirillo, and M. A. Pirillo, “A relation between trinucleotide comma-free codes and trinucleotide circular codes,” *Theoretical Computer Science*, vol. 401, no. 1–3, pp. 17–26, 2008.
19. G. Pirillo, “A hierarchy for circular codes,” *RAIRO-Theoretical Informatics and Applications*, vol. 42, no. 4, pp. 717–728, 2008.
20. G. Pirillo, “Some remarks on prefix and suffix codes,” *Pure Mathematics and Applications*, vol. 19, pp. 53–60, 2008.
21. C. J. Michel and G. Pirillo, “Identification of all trinucleotide circular codes,” *Computational Biology and Chemistry*, vol. 34, no. 2, pp. 122–125, 2010.
22. G. Pirillo, “Non sharing border codes,” *The Advances in Applied Mathematics and Mechanics*, vol. 3, pp. 215–223, 2010.
23. C. J. Michel and G. Pirillo, “Strong trinucleotide circular codes,” *International Journal of Combinatorics*, vol. 2011, Article ID 659567, 14 pages, 2011.
24. L. Bussoli, C. J. Michel, and G. Pirillo, “On some forbidden configurations for self-complementary trinucleotide circular codes,” *Journal for Algebra Number Theory Academia*, vol. 2, pp. 223–232, 2011.
25. D. L. Gonzalez, S. Giannerini, and R. Rosa, “Circular codes revisited: a statistical approach,” *Journal of Theoretical Biology*, vol. 275, no. 1, pp. 21–28, 2011.
26. L. Bussoli, C. J. Michel, and G. Pirillo, “On conjugation partitions of sets of trinucleotides,” *Applied Mathematics*, vol. 3, pp. 107–112, 2012.
27. C. J. Michel, G. Pirillo, and M. A. Pirillo, “A classification of 20-trinucleotide circular codes,” *Information and Computation*, vol. 212, pp. 55–63, 2012.
28. M. Burset, I. A. Seledtsov, and V. V. Solovyev, “Analysis of canonical and non-canonical splice sites in mammalian genomes,” *Nucleic Acids Research*, vol. 28, no. 21, pp. 4364–4375, 2000.
29. S. M. Mount, “A catalogue of splice junction sequences,” *Nucleic Acids Research*, vol. 10, no. 2, pp. 459–472, 1982.
30. A. Bird, “The dinucleotide CG as a genomic signalling module,” *Journal of Molecular Biology*, vol. 409, no. 1, pp. 47–53, 2011.
31. F. Gebhardt, K. S. Zänker, and B. Brandt, “Modulation of epidermal growth factor receptor gene transcription by a polymorphic dinucleotide repeat in intron 1,” *Journal of Biological Chemistry*, vol. 274, no. 19, pp. 13176–13180, 1999.
32. H. Buerger, J. Packeisen, A. Boecker et al., “Allelic length of a CA dinucleotide repeat in the egfr gene correlates with the frequency of amplifications of this sequence—first results of an inter-ethnic breast cancer study,” *Journal of Pathology*, vol. 203, no. 1, pp. 545–550, 2004.
33. A. L. Schmidt and V. Mitter, “Microsatellite mutation directed by an external stimulus,” *Mutation Research*, vol. 568, no. 2, pp. 233–243, 2004.
34. H. Cuppens, W. Lin, M. Jaspers et al., “Polyvariant mutant cystic fibrosis transmembrane conductance regulator genes: the polymorphic (TG)m locus explains the partial penetrance of the T5 polymorphism as a disease mutation,” *Journal of Clinical Investigation*, vol. 101, no. 2, pp. 487–496, 1998.
35. J. Rozenski, P. F. Crain, and J. A. McCloskey, “The RNA modification database: 1999 update,” *Nucleic Acids Research*, vol. 27, no. 1, pp. 196–197, 1999.