Advances in Fuzzy Systems

Volume 2018, Article ID 4028417, 10 pages

https://doi.org/10.1155/2018/4028417

## Search of Fuzzy Periods in the Works of Poetry of Different Authors

^{1}National Research Nuclear University “MEPhI”, Kashirskoe Highway, 31, 115409, Moscow, Russia

^{2}Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Leninsky Ave. 33, bld. 2, 119071, Moscow, Russia

Correspondence should be addressed to Eugene Korotkov; genekorotkov@gmail.com

Received 4 April 2018; Revised 2 July 2018; Accepted 16 July 2018; Published 16 August 2018

Academic Editor: Ferdinando DiMartino

Copyright © 2018 Artur Nor and Eugene Korotkov. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We applied a new method for identifying fuzzy periods, one that takes insertions and deletions of characters into account, to the study of works of poetry. The technique employs a genetic algorithm, dynamic programming, and the Monte Carlo method. In the present work, it was applied to poems written by famous Russian and foreign classic authors. A total of 95 poems were studied, and fuzzy periods of high statistical significance were identified in more than half of them. A correlation between the positions of the stressed vowel letters in a poem and the positions of the fuzzy periods was shown. The present study shows that a work of poetry contains both a semantic component and fuzzy periods of letters; hence a poem could have a psychological impact on the audience.

#### 1. Introduction

A work of poetry can be considered a superposition of semantic content and an acoustic wave, where the acoustic wave is understood as a certain periodicity in the alternation of sounds. A poet is thus capable of combining semantic content with a certain acoustic wave. While the meaning of a poetic text is readily understood by anyone, the acoustic wave embedded in it is perceived rather intuitively, as a kind of musicality that often fascinates listeners and exerts a certain psychological impact on them [1]. To understand the mechanism of this impact, it would be very interesting to identify and study quantitatively the acoustic wave embedded in a work of poetry, in the form of a certain periodicity of the poetic text [2, 3]. To solve this problem, it seems important to develop and apply new mathematical methods that could demonstrate quantitatively the existence of an acoustic wave in a work of poetry in the form of fuzzy periods and provide quantitative characteristics of the periodicity found. This task is important because a quantitative description of acoustic waves would make it possible to classify the acoustic waves found in works of poetry and then to correlate a given type of acoustic wave with its impact on a listener.

The concept of fuzzy periods [4, 5] is central here. By fuzzy periods we mean periods for which the similarity between individual periods is insignificant or absent altogether, and the periodicity becomes statistically significant only over a certain set of periods (more than 2) [6]. Fuzzy periods can be illustrated with an example. Let us consider a sequence in the following form:

The given sequence is characterized by a perfect periodicity with a period of 5 letters. In this study, each period is enclosed in parentheses for clarity. The separate periods are identical, and such periodicity is easily identified using previously described techniques. Now consider the case where each position of the period can hold only a letter from a definite and limited set of alphabet letters; for example, the following sets of letters for the five period positions: q,i,u,s,t; u,c,i,a,s,r; o,p,f,g,l,k,w; a,b,n,m,v; p,f,g,h,t,j,r.

Now, let us create a character sequence by randomly taking, for each period position, a letter from the corresponding set; the sequence can then be obtained in the following form:

(iroap)(tufng)(sslmt)(uawaj)(qcgbf)(siknh)

The resulting character sequence lacks perfect periodicity. However, given a sufficiently long sequence, it can be seen that each position of the period holds only certain alphabet letters. Such a sequence is characterized by fuzzy periods, which cannot be identified by pairwise comparison of any two periods but can be detected using a certain set of periods (more than 2).
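The example can be reproduced with a short sketch. The position sets and the six periods are copied from the text; the function names are hypothetical and introduced only for illustration. The sketch checks that every column of the example stays within its letter set, while two individual periods share no letters position by position.

```python
import random

# Letter sets for the five period positions, copied from the text.
POSITION_SETS = ["qiust", "uciasr", "opfglkw", "abnmv", "pfghtjr"]


def make_fuzzy_sequence(n_periods, rng):
    """Draw one letter per position set, period after period."""
    return "".join(rng.choice(s) for _ in range(n_periods) for s in POSITION_SETS)


def column_alphabets(seq, period):
    """Collect the letters observed at each position modulo `period`."""
    columns = [set() for _ in range(period)]
    for i, ch in enumerate(seq):
        columns[i % period].add(ch)
    return columns


def pairwise_identity(a, b):
    """Fraction of positions at which two equal-length periods agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# The six periods from the text, written without parentheses.
example = "iroaptufngsslmtuawajqcgbfsiknh"
columns = column_alphabets(example, 5)

# Every column uses only letters from its own set ...
fits = all(col <= set(s) for col, s in zip(columns, POSITION_SETS))
# ... although the first two periods have no position in common.
identity = pairwise_identity(example[0:5], example[5:10])
print(fits, identity)
```

A longer sequence of the same kind can be drawn with `make_fuzzy_sequence(100, random.Random(0))`; the column check holds for it by construction.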

Several mathematical techniques are currently employed for the detection of fuzzy periods in character and numerical sequences. These include the wavelet transform [7] and the Fourier transform [8]. Previously, the information decomposition (ID) technique was developed [4]. The ID technique differs from the Fourier transform in that it can analyze a character sequence without recoding it into a numerical series. This makes it possible to obtain results that are unattainable with the Fourier transform, and it allowed fuzzy periods to be revealed in DNA sequences [5], amino acid sequences [6], and several works of poetry [4]. However, the ID technique, like the other methods mentioned above, cannot find a statistically significant fuzzy period in the presence of insertions and deletions of characters, which in literary works may arise from peculiarities of pronunciation. For example, certain sounds may not be pronounced at all or may be pronounced with a certain accent. Consequently, most of the fuzzy periods contained in a sequence could not be determined using the previously developed methods.
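The idea of analyzing a character sequence directly, and its sensitivity to a deletion, can be illustrated as follows. This is a simplified sketch and not the published ID algorithm of [4]: it assumes that periodicity is scored as the mutual information between a letter and its position modulo a trial period, and the function name is hypothetical.

```python
import math
from collections import Counter


def period_mutual_information(seq, period):
    """Mutual information (in nats) between letter and position modulo `period`."""
    total = len(seq)
    joint = Counter((ch, i % period) for i, ch in enumerate(seq))
    letters = Counter(seq)
    positions = Counter(i % period for i in range(total))
    mi = 0.0
    for (ch, pos), count in joint.items():
        mi += (count / total) * math.log(count * total / (letters[ch] * positions[pos]))
    return mi


perfect = "abcde" * 20                    # perfect period of 5 letters
with_deletion = perfect[:50] + perfect[51:]  # one character removed mid-sequence

mi_true_period = period_mutual_information(perfect, 5)
mi_wrong_period = period_mutual_information(perfect, 4)
mi_after_deletion = period_mutual_information(with_deletion, 5)
print(mi_true_period, mi_wrong_period, mi_after_deletion)
```

The score is largest at the true period and vanishes at an unrelated one. A single deletion shifts the phase of everything after it, so the score at the true period drops, which is the limitation that motivates methods able to handle insertions and deletions.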

Mathematical approaches based on dynamic programming now exist that allow accurate identification of fuzzy periods in a time series or character sequence in the presence of character insertions or deletions [9, 10]. All these techniques construct a multiple alignment of periods, and they rely either on pairwise alignment of periods followed by the creation of a guide tree, or on a search for seeds or common words in the periods. An initial multiple alignment of periods is then produced and optimized in one way or another, including the use of hidden Markov models, iterative procedures, and some other techniques [10–12]. However, none of these approaches can construct the multiple alignment if the analyzed sequences have no statistically significant pairwise alignment. In that case, either a statistically significant guide tree for progressive alignment cannot be created, or the sequences are so different that no statistically significant seeds or common words can be found. It turns out that, at present, it is impossible to construct a multiple alignment for significantly different sequences (periods). In this sense, all the developed approaches are “blind” and will not identify a statistically significant multiple alignment in significantly different sequences (periods). Such an alignment could be found if a multiple alignment could be constructed by applying dynamic programming directly to all the analyzed sequences. However, this is an NP-complete problem [13, 14]; such an approach requires enormous computing resources that are not available at present and are unlikely to become available in the near future.
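The pairwise step on which progressive approaches depend can be sketched with a standard global alignment by dynamic programming (Needleman-Wunsch scoring). The match, mismatch, and gap values below are illustrative assumptions and are not taken from the cited methods.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of strings a and b (score only, no traceback)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


identical = global_alignment_score("abcde", "abcde")     # perfect periods
one_deletion = global_alignment_score("abcde", "abde")    # one deleted character
fuzzy_pair = global_alignment_score("iroap", "tufng")     # two fuzzy periods from the Introduction
print(identical, one_deletion, fuzzy_pair)
```

Identical periods, or periods that differ by a single deletion, still score well pairwise. Two periods of the fuzzy example score no better than unrelated strings, so a guide tree built from such pairwise scores carries no signal.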

Previously, a new technique was developed for identifying fuzzy periods in character sequences that takes insertions and deletions of characters into consideration [15, 16]. This technique is based on a new solution to the NP-complete problem of the multiple alignment of sequences (periods). The method employs a genetic algorithm, techniques for optimizing weight matrices, dynamic programming, and the Monte Carlo method. It enables identification of the fuzzy periods of a character sequence with insertions and deletions at previously unknown positions. It is important to note that this analysis requires only the symbolic sequence itself (the text of the poetic work); other information about the poetic work, including the placement of stresses and features of pronunciation, is not required. In the present work, this approach was applied to search for fuzzy periods in the poems of famous Russian and English-speaking poets. We showed that fuzzy periods can be found in more than half of the works of poetry. This study shows that a work of poetry contains both a semantic component and fuzzy periods, which could be responsible for the psychological impact of a poem on the audience. Fuzzy periods may be a reflection of the sound “wave” that exists in a poetic work.

#### 2. Fuzzy Periods Search Technique Algorithm Used with Consideration of Characters’ Insertions and Deletions

At the beginning of the work, the poetic text is transformed in such a way that all the spaces are deleted, uppercase letters are changed to lowercase, and punctuation marks are changed to spaces (Figure 1, Paragraph 1). The character sequence used for further evaluation is thus created from the transformed work of poetry. In Figure 1, Paragraph 2, a set of random matrices of dimension *k×n* is generated, where one dimension is the period length and the other is the size of the alphabet of the original sequence. In Figure 1, Paragraph 3, the random matrices are modified and optimized, which is required for constructing the sequence alignment.
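The first two steps can be sketched as follows. The text transformation follows the description above; treating line breaks like spaces, recognizing punctuation by Unicode category, drawing matrix entries uniformly from [-1, 1], and indexing rows by alphabet symbol and columns by period position are all assumptions made for this illustration, and the function names are hypothetical.

```python
import random
import unicodedata


def preprocess(text):
    """Delete whitespace, lowercase letters, and turn punctuation into spaces."""
    out = []
    for ch in text:
        if ch.isspace():
            continue  # original spaces (and, by assumption, line breaks) are dropped
        if unicodedata.category(ch).startswith("P"):
            out.append(" ")  # each punctuation mark becomes a single space
        else:
            out.append(ch.lower())
    return "".join(out)


def random_matrices(count, n_rows, n_cols, rng):
    """Generate `count` matrices of size n_rows x n_cols with entries in [-1, 1]."""
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
        for _ in range(count)
    ]


sequence = preprocess("To be, or not!")
alphabet = sorted(set(sequence))  # the space is part of the alphabet after the transformation
matrices = random_matrices(3, len(alphabet), 5, random.Random(1))
print(repr(sequence), len(alphabet), len(matrices))
```

Because the original spaces are removed first, every space in the resulting sequence marks the position of a punctuation mark. Using the Unicode category rather than an ASCII list keeps the sketch applicable to Russian text as well.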