Mathematical Problems in Engineering

Volume 2017, Article ID 3096917, 11 pages

https://doi.org/10.1155/2017/3096917

## K-Line Patterns’ Predictive Power Analysis Using the Methods of Similarity Match and Clustering

^{1}College of Electronics and Information Engineering, Tongji University, Shanghai 200092, China

^{2}Rabun Gap-Nacoochee School, Rabun Gap, GA 30568, USA

^{3}Shanghai Baosight Software Co., Ltd., Shanghai 200092, China

Correspondence should be addressed to Lv Tao; superlvtao@163.com

Received 23 December 2016; Revised 31 March 2017; Accepted 5 April 2017; Published 22 May 2017

Academic Editor: Anna M. Gil-Lafuente

Copyright © 2017 Lv Tao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Stock price prediction based on K-line patterns is the essence of candlestick technical analysis. However, academics dispute whether K-line patterns have predictive power. To help resolve the debate, this paper uses the data mining methods of pattern recognition, pattern clustering, and pattern knowledge mining to study the predictive power of K-line patterns. A similarity match model and a nearest neighbor-clustering algorithm are proposed to solve the problems of similarity matching and clustering of K-line series, respectively. The experiment tests the predictive power of the Three Inside Up and Three Inside Down patterns on a testing dataset of K-line series for the Shanghai 180 index component stocks over the latest 10 years. The experimental results show that the predictive power of a pattern varies a great deal across shapes and that each existing K-line pattern requires further classification based on shape features to improve prediction performance.

#### 1. Introduction

A time series is a series of observations listed in time order. It is the most commonly encountered data type, touching almost every aspect of human life [1]. Examples include meteorological time series, time series of stock prices (stock time series for short), which are composed of stock price observations, and personal health time series, which consist of observations of blood pressure, temperature, white blood cell count, and so forth.

Research shows that time series have two important features. (a) Historical information affects the future trend [2]. That is, the historical values of the observations influence the future values in the time series. This influence can be described by the time series’ periodicity, nonstationarity, varying volatility, and so on. (b) History repeats itself [3]. That is to say, certain subseries recur throughout the entire time series. Because of these two features, time series forecasting of all kinds has become an active research topic, one branch of which is the prediction of stock time series (stock prediction for short). Stock time series not only have the typical features of time series, but the trend of stock prices is also directly related to people’s vital interests. Therefore, stock prediction has attracted the interest of a wide variety of researchers.

There are many technical analysis methods for stock prediction, the best known of which is candlestick technical analysis, also called K-line technology analysis in Asia. To study the fluctuation of stock prices in a more intuitive way, stock market participants invented the candlestick chart (also called the K-line) to represent stock time series graphically. Taking a daily K-line as an example, a K-line represents the fluctuation of stock prices in one day. It not only shows the close price, open price, high price, and low price for the day but also reflects the difference between any two of these prices and its size (all K-lines in this paper are daily K-lines, unless otherwise indicated). If the K-lines of a stock are listed in time order, they form a series that reflects the fluctuation of the stock price over a period of time, which is called a K-line series. As each K-line consists of four prices, a K-line series is essentially a stock time series with four observations at each time point.
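The four-price structure of a K-line described above can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the class name `KLine`, the derived quantities `body`, `upper_shadow`, and `lower_shadow`, and the sample prices are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KLine:
    """One daily K-line: the four prices named in the text (hypothetical type)."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body(self) -> float:
        # Signed body size: positive when the close is above the open.
        return self.close - self.open

    @property
    def upper_shadow(self) -> float:
        # Distance from the top of the body to the day's high.
        return self.high - max(self.open, self.close)

    @property
    def lower_shadow(self) -> float:
        # Distance from the day's low to the bottom of the body.
        return min(self.open, self.close) - self.low


# A K-line series is a list of K-lines in time order (made-up prices).
series: List[KLine] = [
    KLine(open=10.0, high=10.75, low=9.75, close=10.5),
    KLine(open=10.5, high=10.75, low=10.0, close=10.25),
]

for day, k in enumerate(series, start=1):
    print(day, k.body, k.upper_shadow, k.lower_shadow)
```

Storing only the four prices and deriving the body and shadows on demand keeps each K-line a single observation with four values, which matches the view of a K-line series as a time series with four observations per time point.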

In a K-line series, if a K-line subseries contains knowledge that can be used to predict the stock, then this subseries is called a K-line pattern series, or a K-line pattern for short. For instance, if the stock price often rises or falls after a particular subseries appears, then that subseries is a typical pattern series. Stock prediction based on K-line patterns is the essence of K-line technology analysis. How to mine K-line patterns and how to use them for prediction are the main research topics of K-line technology analysis.

By manually observing the K-line series of the stock market (or the Japanese rice market), people have found many K-line patterns; the leading figure is the founder of the K-line, Munehisa Honma, a Japanese rice trader in the 18th century. The literature [4, 5] introduces the existing patterns and their features in detail, such as Three Inside Up (TIU), Three Inside Down (TID), and Doji. Some papers [6–10] conclude from experiments that the existing K-line patterns forecast stock trends well. Other papers [11–15] have studied stock prediction based on these patterns and have reported some results. However, a number of papers [5, 16–18] challenge these patterns’ predictive power. They argue that K-line technology analysis violates the efficient market hypothesis, so stock investment based on K-line patterns is not feasible. They also report experiments showing that the existing K-line patterns have no predictive power.

The above analysis shows that academics dispute whether K-line patterns have predictive power. However, few papers analyze why there are two opposing positions on the patterns’ predictive power. Paper [19] also addresses the debate, but it does not analyze the K-line patterns themselves. Instead, it attempts to answer the following question: are trend reversals accompanied more often by some types of candlesticks than by others? Paper [19] finds that some types of candlesticks frequently appear close to trend-reversal regions, while others cannot be found in such regions. Although that research shows that K-line patterns exist, it does not explain why there is a debate on the K-line patterns’ predictive power.

After reviewing the relevant literature, this paper considers the main reason to be that the existing K-line patterns lack a rigorous mathematical definition. For example, the shadow length and body size are not clearly specified in the definitions of K-line patterns, which means that a single K-line pattern can take many different shapes, and the predictive power of a pattern may vary considerably across those shapes. If we ignore the shape differences and study the predictive power of a pattern by treating all of its shapes as a whole, instead of classifying the pattern further based on its shape features, then the results on the pattern’s predictive power may be biased. For instance, suppose a TIU pattern has three shapes, shape A, shape B, and shape C, as shown in Figure 1, where shape A is the generic form of the TIU pattern and shapes B and C are infrequent forms. Suppose further that shape A has predictive power and shapes B and C do not. When studying the predictive power of the TIU pattern, if we ignore the differences between the three shapes and study them as a whole, we may come to the wrong conclusion that the TIU pattern has no predictive power. However, if the three shapes are classified further based on shape features and studied separately, we can reach the correct conclusion that the TIU pattern has predictive power only in shape A.
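The dilution argument can be made concrete with a small numerical sketch. The counts below are hypothetical, invented purely for illustration; they are not data from this paper. The sketch assumes that "predictive power" is measured as the fraction of pattern occurrences followed by a price rise, which is an assumption of this example and not necessarily the measure used later in the paper.

```python
# Hypothetical counts, invented for illustration only (not from the paper):
# shape -> (occurrences, occurrences followed by a price rise)
counts = {
    "A": (120, 72),  # generic shape, assumed to carry a signal
    "B": (40, 20),   # infrequent shape, assumed to carry no signal
    "C": (40, 18),   # infrequent shape, assumed to carry no signal
}


def hit_rate(occurrences: int, rises: int) -> float:
    """Fraction of occurrences that were followed by a price rise."""
    return rises / occurrences


# Predictive power per shape, studied separately.
per_shape = {shape: hit_rate(n, k) for shape, (n, k) in counts.items()}

# Predictive power when all shapes are pooled into one "TIU" pattern.
total_n = sum(n for n, _ in counts.values())
total_k = sum(k for _, k in counts.values())
pooled = hit_rate(total_n, total_k)

for shape, rate in per_shape.items():
    print(f"shape {shape}: {rate:.1%}")
print(f"pooled: {pooled:.1%}")
```

With these made-up numbers, the pooled rate lies between the rate of shape A and the rates of shapes B and C, so the signal carried by shape A is pulled toward chance level. Whether a pooled study would then report no predictive power depends on the sample size and the test used.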