Scientific Programming

Volume 2018, Article ID 8721246, 12 pages

https://doi.org/10.1155/2018/8721246

## Automatic High-Frequency Trading: An Application to Emerging Chilean Stock Market

^{1}Pontificia Universidad Católica de Valparaíso Chile, Avenida Brasil 2241, Valparaíso 2362807, Chile

^{2}Pontificia Universidad Católica de Valparaíso Chile, Avenida Brasil 2830, Valparaíso 2340031, Chile

^{3}Universidad Técnica Federico Santa María Chile, Avenida España 1680, Valparaíso 2390123, Chile

^{4}Universidad Diego Portales Chile, Av. Ejército 441, Santiago 8370109, Chile

Correspondence should be addressed to Hanns de la Fuente-Mella; hanns.delafuente@pucv.cl

Received 8 March 2018; Revised 11 August 2018; Accepted 5 September 2018; Published 30 September 2018

Academic Editor: José E. Labra

Copyright © 2018 Broderick Crawford et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

This research seeks to design, implement, and test a fully automatic high-frequency trading system that operates on the Chilean stock market, so that it is able to generate positive net returns over time. A system that implements high-frequency trading (HFT) is presented through advanced computer tools as an NP-Complete type problem in which it is necessary to optimize the profitability of stock purchase and sale operations. The research performs individual tests of the algorithms implemented, reviewing the theoretical net return (profitability) that can be applied on the last day, month, and semester of real market data. Finally, the research determines which of the variants of the implemented system performs best, using the net returns as a basis for comparison. The use of particle swarm optimization as an optimization algorithm is shown to be an effective solution since it is able to optimize a set of disparate variables but is bounded to a specific domain, resulting in substantial improvement in the final solution.

#### 1. Introduction

This research seeks to design, implement, and test a fully automatic trading system that operates on the national Chilean stock market and is capable of generating positive net returns over time. In particular, the goal is a system that implements high-frequency trading (HFT). Thus, the research applies advanced computer tools to an NP-Complete problem in which the aim is to optimize the profitability of share purchase and sale operations. The objective is an implementation of an automatic trading system that generates positive returns on a set of real data from the national stock market in a completely automatic mode, with no intervention of a human operator in decision-making or in the execution of operations.

Section 2 describes automatic and semiautomatic stock-trading systems and the context of algorithmic high-frequency trading. Section 3 provides a review of current trading algorithm techniques that can work under the automatic mode of HFT, indicating which computer techniques can be applied. Section 4 presents the design of an automatic trading system, in HFT mode, indicating the restrictions on the data and financial instruments included in the study. Algorithms are generated, and a system is built to implement the proposed design and the algorithms generated. Individual tests of the implemented algorithms are carried out, reviewing the theoretical net return (profitability) that can be generated on the last day, month, and semester of real market data. Finally, in Section 5, it is determined which of the variants of the implemented system behaves better, using the net returns as a basis for comparison and applying other criteria as deemed necessary.

#### 2. Background

Stock trading is an activity that has been conducted for hundreds of years and is currently performed on stock exchanges around the world. In these exchanges, a huge variety of financial assets and debt instruments are traded daily. Stock trading is a complex decision-making problem that involves multiple variables and does not always have an optimal solution, since the conditions vary over time and are affected by internal and external factors.

In recent years, the implementation of automatic and semiautomatic stock-trading systems that can analyze market conditions and make the decisions needed to conduct the required business transactions has begun. Since 2008, such systems have reported significant and constant gains on foreign stock exchanges such as the New York Stock Exchange (NYSE) [1]. These systems can be traced back to Morgan Stanley’s “Black Box” system, which was created in 1985 [2] for the generation of buy/sell signals based on statistical arbitrage.

The benefit of using automated systems for the stock trading task lies in the fact that people tend to pay too much attention to signals (price) and ignore the transition probabilities that generate the signals. An automated system, in contrast, can calculate the probabilities of price transition and act accordingly [3], avoiding problems of late reaction or overreaction to changes.

At present, the entry of automatic traders into the Chilean market has been relatively low, and there is no information available on such entry since those who employ automatic trading mechanisms are unwilling to divulge the details of their systems for fear of competition, as occurred in the beginning with the “Black Box” system. The domestic market has been able to operate with automatic low- and high-frequency traders since 2010, when the Santiago Stock Exchange launched the Telepregon HT system, which allows the trading of equities at a theoretical maximum rate of 3000 transactions per second [4, 5].

Trading is the exchange of ownership of a good, product, or service from a person or entity under conditions in which something is obtained in return from the buyer. Thus, trading can be understood as the practice conducted by stockbrokers or their clients whereby financial instruments are exchanged in securities markets. This trading is based on the principle of supply and demand of the traded instruments, which causes the prices of the instruments to vary and generates a profit (or loss) that is determined by the difference between the original purchase price and the final sale price.

High-frequency trading (HFT) is understood as a way of operating in stock markets to which a number of special conditions [1] apply:

(i) There is a rapid exchange of capital

(ii) A large number of transactions are performed

(iii) Generally, a low gain per transaction is obtained

(iv) Positions in financial instruments are not accumulated or carried over from one trading day to the next

(v) Trading is conducted through a computer system

The definition of HFT itself does not indicate whether the system performing it is automatic, semiautomatic or user-operated.

In contrast, automatic trading ranges from systems that support the entry of buy/sell orders to the market to systems capable of entering orders automatically without the need for a human operator; unlike HFT systems, these can maintain positions in financial instruments from one day to the next.

There is no single formula for defining an HFT or an automatic trading system [1, 6, 7]. For example, it is stated that an algorithmic trading (AT) system corresponds to “elements in decision-making and financial investments being executed by an algorithm through computers and electronic communication networks” [7]. Investment strategies can be predefined or adaptive. These investment strategies can be supported by knowledge of economics, statistics, artificial intelligence, metaheuristics, etc.

Similarly, a sequential process for developing an HFT system is proposed that is based on four steps: (i) data analysis; (ii) trading model; (iii) decision-making; and (iv) execution of business [7].

Thus, there is no single formula for producing an HFT system. However, it is worth noting that to achieve an effective HFT system, it is necessary to take into account a series of processes common to any system, namely, analysis, identification, collation, routing, and execution [8].

In any of the automated systems described above, the components that present the greatest complexity are the analysis of real-time opportunities and the search for market inefficiencies. The available literature mentions methods of the following types:

(i) Rule-based methods such as statistical arbitrage [2]. These methods apply a series of rules that are based on the recent behavior of a financial instrument and act based on the result of applying those rules.

(ii) Methods based on statistical and mathematical models, such as volume-weighted average price, time-weighted average price, and moving averages [1, 7]. For these cases, a mathematical or statistical model is used that requires a series of parameters that control its behavior. The configuration parameters are selected manually by the operator who is in charge of trading on the market.

(iii) Methods combining statistical and mathematical models with optimization techniques based on metaheuristics [9]. These methods use metaheuristics to automatically fine-tune the parameters of known algorithms to obtain optimum values for current market conditions.

(iv) Methods based on machine learning, data mining, and processing of complex events [1, 9, 10]. Because rapid assimilation by a human operator of the large amount of information flowing to and from any stock market is becoming an increasingly difficult task, it is desirable to develop systems that are able to detect hidden patterns in price variations and the relationships between financial instruments or other economic indicators and that can also incorporate a component of interpretation of “feeling” or “sensing” the market through natural language news processing (e.g., SuperX Plus of Deutsche Bank [11]).

#### 3. Trading Algorithms

##### 3.1. Statistical Methods Used in AT and HFT

Some of the most popular trading algorithms based on statistical or mathematical methods [7, 12] are as follows:

Volume-weighted average price (VWAP) is defined as the ratio of the value traded (price multiplied by volume) to the total volume of the instrument traded over the trading horizon. It is common to evaluate the performance of traders by their ability to execute buy/sell orders at prices that are better than the VWAP price on the trading horizon. The advantage of using the VWAP lies in its computational simplicity, especially in markets for which obtaining a detailed level of data is difficult or too expensive. The VWAP for an instrument $j$ on a day is calculated as follows:

$$\mathrm{VWAP}_j = \frac{\sum_{t} v_{j,t}\, p_{j,t}}{\sum_{t} v_{j,t}},$$

where $v_{j,t}$ is the volume of instrument $j$ traded at time $t$, and $p_{j,t}$ is the market price of instrument $j$ at time $t$. VWAP can be used to minimize the costs of transactions and market impacts. It can also be used as a benchmark to verify the effectiveness of other algorithms and trading strategies.
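As a minimal sketch of the VWAP calculation (total traded value divided by total traded volume), the function below works on two parallel lists of trade prices and volumes. The function name and the sample trades are illustrative assumptions, not data from the study.

```python
from typing import Sequence


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price over one trading horizon."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total traded volume must be positive")
    # Traded value is the sum of price * volume over all trades.
    traded_value = sum(p * v for p, v in zip(prices, volumes))
    return traded_value / total_volume


# Hypothetical intraday trades, invented for illustration only.
example_prices = [100.0, 101.0, 99.0]
example_volumes = [200.0, 100.0, 100.0]
example_vwap = vwap(example_prices, example_volumes)
```

A buy order executed below `example_vwap` (or a sell order executed above it) would count as beating the benchmark in the sense described above.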

A modification to the VWAP model called DVWAP (dynamic VWAP) is proposed by [13]. This modification allows intraday transactions (transactions realized during the same day of execution) to be incorporated. This allows the model to be applied to a more realistic scenario of the market in which the news that arrives affects the price of the instruments.

Time-weighted average price (TWAP) is the average price of a financial instrument over a specific period of time during which the order is executed at the TWAP price or better. It is used to execute orders at a specific time to keep the price close to what the market reflects at that time. The TWAP of an instrument $j$ in a period $T$ is calculated as follows:

$$\mathrm{TWAP}_j = \frac{1}{T}\sum_{t=1}^{T} p_{j,t},$$

where $p_{j,t}$ is the market price of the instrument $j$ at time $t$. Like VWAP, TWAP can be used as a benchmark to verify the effectiveness of other algorithms and models.
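The following sketch computes TWAP as the plain average of prices sampled at equally spaced times, and shows how an execution price could be compared against such a benchmark. Equal spacing of the samples and the helper name `beats_benchmark` are assumptions made for illustration.

```python
from typing import Sequence


def twap(prices: Sequence[float]) -> float:
    """Time-weighted average price, assuming equally spaced price samples."""
    if not prices:
        raise ValueError("at least one price is required")
    return sum(prices) / len(prices)


def beats_benchmark(execution_price: float, benchmark: float, side: str) -> bool:
    """True if the execution is at the benchmark price or better."""
    if side == "buy":
        return execution_price <= benchmark  # buying cheaper is better
    if side == "sell":
        return execution_price >= benchmark  # selling higher is better
    raise ValueError("side must be 'buy' or 'sell'")
```

The same comparison helper can be used with a VWAP benchmark, since both measures serve as reference prices for judging executions.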

Other types of algorithms include variants of the linear econometric models presented by [1]. These models attempt to predict the behavior of random variables as a combination of other random variables, both contemporaneous and retrospective, with well-defined distributions. Such linear models can be expressed as

$$y_t = \alpha + \beta x_t + \gamma z_t + \varepsilon_t,$$

where $y_t$ is the time series of a random variable on which a forecast is to be made; $x_t$ and $z_t$ are significant factors for predicting the value of $y_t$; $\alpha$, $\beta$, and $\gamma$ are the factors to be determined; and $\varepsilon_t$ is the remaining error.

Moving averages (MA), a model for predicting future movements in the price of a financial instrument, focuses on how future data will react to changes in past data. To generate the MA model (MA($q$)) with $q$ delays, we use

$$r_t = \beta_0 + \varepsilon_t + \sum_{i=1}^{q} \beta_i\, \varepsilon_{t-i},$$

where $r_t$ is the return at time $t$, $\beta_0$ is the intercept, $\beta_i$ is the coefficient belonging to delay $i$, and $\varepsilon_{t-i}$ is the unexpected component of the return at delay $i$. There are several ways to estimate MA; they include the following:

Simple MA (SMA) is the unweighted average of the previous $n$ prices. The window can cover previous days or another measure of time. It can also be calculated based on the SMA of the previous period, simplifying its calculation at the computational level.
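The remark that the average can be derived from the previous period's value can be sketched as a rolling computation: rather than re-summing the whole window, the newest price is added and the oldest is dropped. The generator name and the window handling below are illustrative choices, not the authors' implementation.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def rolling_sma(prices: Iterable[float], n: int) -> Iterator[float]:
    """Yield the simple moving average of the last n prices, updated incrementally."""
    if n <= 0:
        raise ValueError("window length n must be positive")
    window: Deque[float] = deque()
    running_sum = 0.0
    for price in prices:
        window.append(price)
        running_sum += price
        if len(window) > n:
            # Drop the oldest price so the sum always covers exactly n prices.
            running_sum -= window.popleft()
        if len(window) == n:
            yield running_sum / n
```

Each update costs constant time regardless of the window length, which matters when many instruments are evaluated on every market tick.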

Cumulative MA (CMA) is a moving average in which all prices are considered until the current instant. Its formula is similar to that of SMA but begins from the first recorded market price for an instrument. It has no known application in trading strategies.

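Weighted MA (WMA) is an average that uses multiplication factors to give different weights to different prices within the same MA window (convolution of data points with a fixed weight function). In trading, a decreasing weight from $n$ to 1 is assigned to each price in the evaluation window, from the most recent price $p_t$ to the oldest price $p_{t-n+1}$, as follows:

$$\mathrm{WMA}_t = \frac{n\,p_t + (n-1)\,p_{t-1} + \cdots + 2\,p_{t-n+2} + p_{t-n+1}}{n + (n-1) + \cdots + 2 + 1}.$$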

Like MA, WMA provides a smoothing function of the prediction curve. In some cases, it is used together with MA; it can also be used when the prices of previous days do not greatly affect the value of the current price of an instrument.
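A linearly weighted moving average of this kind can be sketched as follows, with the most recent price receiving weight $n$ and the oldest price in the window receiving weight 1. The function name and signature are assumptions for illustration.

```python
from typing import Sequence


def wma(prices: Sequence[float], n: int) -> float:
    """Linearly weighted moving average of the last n prices."""
    if n <= 0 or len(prices) < n:
        raise ValueError("need a positive window and at least n prices")
    window = list(prices[-n:])
    # enumerate(..., start=1) gives weight 1 to the oldest price and n to the newest.
    weighted_sum = sum(weight * price for weight, price in enumerate(window, start=1))
    return weighted_sum / (n * (n + 1) / 2)  # denominator is 1 + 2 + ... + n
```

On a rising series the result lies above the unweighted average of the same window, because recent (higher) prices count more.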

Exponential MA (EMA), also referred to as EWMA (exponential weighted MA), is a version similar to WMA in which the weight is an exponential rather than a linear function. The weight assigned to each market price decreases exponentially and never reaches zero. Thus, for a series $p_t$, the EMA is calculated recursively as

$$\mathrm{EMA}_1 = p_1, \qquad \mathrm{EMA}_t = \alpha\, p_t + (1-\alpha)\, \mathrm{EMA}_{t-1} \quad (t > 1),$$

where $\alpha$ is the coefficient of decreasing weight (a constant value between 0 and 1). A high coefficient value causes the weight of old prices to decrease more quickly. Alternatively, $\alpha$ can be expressed in terms of $N$ periods of time:

$$\alpha = \frac{2}{N+1}.$$
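The recursive EMA can be sketched in a few lines. Seeding the recursion with the first price and converting a number of periods $N$ into the smoothing coefficient as $2/(N+1)$ are common conventions that are assumed here; the source text does not confirm which seed the authors used.

```python
from typing import List, Sequence


def alpha_from_periods(n_periods: int) -> float:
    """Common convention for the smoothing coefficient: alpha = 2 / (N + 1)."""
    if n_periods <= 0:
        raise ValueError("n_periods must be positive")
    return 2.0 / (n_periods + 1)


def ema(prices: Sequence[float], alpha: float) -> List[float]:
    """Exponential moving average, seeded with the first price (an assumption)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    result: List[float] = []
    for price in prices:
        if not result:
            result.append(price)  # seed value
        else:
            # New value blends the current price with the previous EMA.
            result.append(alpha * price + (1.0 - alpha) * result[-1])
    return result
```

With a larger `alpha`, each new price dominates the average sooner, which matches the statement that high coefficient values discount old prices more quickly.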

##### 3.2. Metaheuristic Models

Several known trading models and algorithms have been described in the literature. Such algorithms are generally applied manually by a human operator to determine when to buy, sell, or maintain the current position. Robert Pardo states that for a given combination of strategies, it is possible to apply optimization to determine a set of parameters that generates greater gains [9].

Such a postulate does not come without associated problems. The major known problem is that such optimizations can cause overfitting of the algorithm to the data used. In the best-case scenario, the resulting algorithm will not generate the expected gains, and in the worst case, the algorithm will produce constant losses. One way to understand the concept of overfitting is to think of a statistical model that describes random error or noise instead of describing relationships between variables.

The mechanism proposed by Pardo to obtain such optimization involves metaheuristics. In this respect, both AT and HFT can be understood as complex optimization problems. This class of problems is referred to as class NP (nondeterministic polynomial time). The NP class is the class of problems in which a candidate solution can be verified by a polynomial-time algorithm; for the hardest problems in the class, no algorithm is known that can generate solutions in polynomial time. This implies that the application of conventional exact algorithms to these problems results in execution times that increase exponentially as the size of the problem increases.

Until 1971, no problem had been proven to be NP-Complete. That year, Stephen Cook proved that the Boolean satisfiability problem is NP-Complete, the first such result [14]. In 1972, Richard Karp expanded on Cook’s idea, demonstrating a series of 21 NP-Complete class problems [15].

Because AT and HFT are both problems of trading financial instruments in markets with varying conditions over time, they can both be categorized as NP-class problems [16]. Because both are based on the maximization of net returns, according to Chang and Johnson [17], they can be classified as NP-Complete, even in versions that perform offline market simulations.

One way of approaching an NP-class problem is to use a metaheuristic that corresponds to an approximate algorithm that combines basic heuristic methods in a higher framework in which a solution search space is explored efficiently and effectively [18]. Thus, through an objective function that guides the search process, an efficient exploration of possible solutions is made in search of one or more near-optimal solutions.

A great variety of metaheuristic algorithms are available. Some of these algorithms have more affinity for certain types of problems than others, such as problems with binary, discrete, or continuous variables. Some algorithms can be applied to only one variable type, or adjustments must be made such as applying conversion functions. In particular, an approach to one of the existing algorithms called particle swarm optimization (PSO) will be presented.

###### 3.2.1. Particle Swarm Optimization

The PSO algorithm was introduced by Kennedy and Eberhart in 1995 [19] in an attempt to describe the social behavior of flocks of birds or schools of fish and to model their communication mechanisms as a basis for solving optimization problems. Instead of relying on the “survival of the fittest,” PSO is based on the cooperation of individuals.

PSO is defined as a metaheuristic algorithm that optimizes a problem by iteratively improving a population of candidate solutions called particles by moving them through the solution space using a formula based on each particle’s position and velocity. The movement of each particle is influenced by its best-known local solution and is also guided to the best-known global solution. With this, the swarm is expected to move collectively toward the best solution in the search space.

In the basic version of PSO, the velocity and position of the particles are calculated as follows:

$$v_i^{t+1} = w\, v_i^{t} + c_1 r_1 \left(p_i - x_i^{t}\right) + c_2 r_2 \left(p_g - x_i^{t}\right),$$

$$x_i^{t+1} = x_i^{t} + v_i^{t+1},$$

where $x_i^{t}$ is the position of the $i$-th particle at iteration $t$, $v_i^{t}$ is the velocity of the $i$-th particle at iteration $t$, $w$ is the inertia factor (a value between 0 and 1), $c_1$ is the local acceleration factor (cognitive component of the individual), $c_2$ is the global acceleration factor (social component of the swarm), $r_1$ and $r_2$ are random numbers with uniform distributions between 0 and 1, $p_i$ is the best previous position of the $i$-th particle, and $p_g$ is the best previous position of the neighborhood of the $i$-th particle.
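A minimal sketch of the basic PSO update is given below for a minimization problem over a bounded domain. Several choices are assumptions made only for illustration: the neighborhood is the whole swarm (global best), positions are clamped to the bounds, the random factors are drawn per dimension, and the default values of `w`, `c1`, and `c2` are typical textbook settings rather than the values used by the authors.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def pso_minimize(objective: Callable[[Sequence[float]], float], bounds: Bounds,
                 n_particles: int = 20, n_iterations: int = 100,
                 w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Basic global-best PSO; returns the best position found and its value."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]          # personal bests
    best_values = [objective(p) for p in positions]
    g = min(range(n_particles), key=best_values.__getitem__)
    global_best, global_value = best_positions[g][:], best_values[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + cognitive pull (own best) + social pull (swarm best).
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (best_positions[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the allowed domain.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = objective(positions[i])
            if value < best_values[i]:
                best_values[i], best_positions[i] = value, positions[i][:]
                if value < global_value:
                    global_value, global_best = value, positions[i][:]
    return global_best, global_value


def sphere(x: Sequence[float]) -> float:
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)
```

To maximize a quantity such as net return with this sketch, the objective passed in would simply return the negated return.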

Various formulations exist for the selection of the inertia factor $w$ and the acceleration factors $c_1$ and $c_2$. One of the options is to use the values suggested in [20].

Other ways of determining the parameters include functions that modify the parameters during the execution of the algorithm. An example of this is given by Fikret in [21]; the example is based on the fact that the value of the inertia parameter influences both diversification (exploration of the search space) and intensification (exploitation of the search space). High values of the inertia parameter favor diversification, whereas low values favor the intensification of local solutions. In this way, an exponential function of the inertia parameter is defined in terms of the initial inertia, the final inertia, the current iteration, the maximum number of iterations to be performed, and a gradient constant. Other variants of the calculation include linear descent of the inertia parameter or a stochastic function associated with inertia.
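One plausible shape for such a schedule is an exponential decay from the initial to the final inertia. The exact formula of [21] is not reproduced here; the functional form, the parameter names, and the default values below are assumptions chosen only to illustrate the idea of high inertia early (exploration) and low inertia late (exploitation).

```python
import math


def exponential_inertia(t: int, t_max: int, w_start: float = 0.9,
                        w_end: float = 0.4, k: float = 4.0) -> float:
    """Assumed exponential decay of the inertia weight from w_start toward w_end.

    t is the current iteration, t_max the maximum number of iterations,
    and k a gradient constant controlling how fast the decay happens.
    """
    if t_max <= 0:
        raise ValueError("t_max must be positive")
    return w_end + (w_start - w_end) * math.exp(-k * t / t_max)


def linear_inertia(t: int, t_max: int, w_start: float = 0.9, w_end: float = 0.4) -> float:
    """Linear descent variant mentioned as an alternative."""
    if t_max <= 0:
        raise ValueError("t_max must be positive")
    return w_start - (w_start - w_end) * t / t_max
```

In a PSO loop, the returned value would replace the fixed inertia factor at each iteration.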

For the specific problem of HFT and AT, the PSO algorithm is applied to the optimization of the parameters of a trading strategy based on the MA of two or more bands. In this case, the temporal parameters of the MA involved in the strategy are optimized against an objective function; possible objectives include the following:

(i) Obtain the highest net return (earnings)

(ii) Obtain the most benefit per transaction

(iii) Obtain the highest percentage of winning transactions or ensure that the strategy has a higher specific financial ratio

In this way, the objective function that is applied to the PSO algorithm measures and classifies the quality of the trading strategy that is applied in the AT or HFT system. Thus, the success of applying PSO to an HFT and AT problem depends primarily on how the objective function is proposed for the same trading model.
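An objective of the first kind (highest net return) could be sketched as a small backtest of a two-MA crossover rule. The long-only rule, the proportional cost model, the closing of any open position at the last price, and all names below are simplifying assumptions for illustration; they are not the authors' objective function.

```python
from typing import Optional, Sequence


def sma_at(prices: Sequence[float], end: int, n: int) -> float:
    """Simple moving average of the n prices ending at index end (inclusive)."""
    return sum(prices[end - n + 1:end + 1]) / n


def crossover_net_return(prices: Sequence[float], short_n: int, long_n: int,
                         cost_rate: float = 0.001) -> float:
    """Net return of a long-only two-MA crossover strategy on one price series."""
    if not 0 < short_n < long_n:
        raise ValueError("require 0 < short_n < long_n")
    capital = 1.0                       # growth of one unit of capital
    entry_price: Optional[float] = None
    for t in range(long_n - 1, len(prices)):
        short_ma = sma_at(prices, t, short_n)
        long_ma = sma_at(prices, t, long_n)
        if entry_price is None and short_ma > long_ma:
            entry_price = prices[t] * (1.0 + cost_rate)        # buy, paying costs
        elif entry_price is not None and short_ma < long_ma:
            capital *= prices[t] * (1.0 - cost_rate) / entry_price  # sell
            entry_price = None
    if entry_price is not None:
        # Close any open position at the last price (no position is carried over).
        capital *= prices[-1] * (1.0 - cost_rate) / entry_price
    return capital - 1.0
```

An optimizer that minimizes would be given the negated value, with each candidate's continuous coordinates rounded to integer window lengths before evaluation.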

#### 4. Methodology

The main objective of the research is to create a system that can conduct trading autonomously. Thus, a preliminary design of a system that can be applied during a full trading day for a given stock market is defined. As an initial step, this requires defining and delimiting the target market since there are multiple stock exchanges in the world, each offering a range of different markets and possessing specific regulations and restrictions.

The data used correspond to the transactions carried out over 2 years for one of the equity instruments listed in the old IPSA index (now replaced by S&P/CLX IPSA). This is a highly liquid stock instrument in the national market. In particular, two sets of data were used. One is public: the register of daily operations, which is reported to the CMF (Commission of Financial Markets) and published daily on the institutional site of the Santiago Stock Exchange. The other is a commercial data product: market replay, which contains the offers entered into the system with timestamps at the level of nanoseconds, anonymized by stockbroker (“Corredor de Bolsa”); that is, it does not expose sensitive information about the orders such as the client to which an order belongs, the percentage of the amount that is visible, or the internal operator.

Once the target market, the selected data, and the instruments involved have been defined, a system can be designed that is capable of operating on the defined market and adapting to the regulations and restrictions that govern it. The same definition of the target market and the instruments will serve to determine what external data will be required and how these data should be collected and treated by the system.

Strategies, especially classic trading strategies based on MA, should be validated in conjunction with parameter optimization using PSO. Regardless of the strategy adopted, the system that is designed must support any type of strategy, so it must be a generic and easily extensible system.

##### 4.1. Selection of Markets and Financial Instruments

The system proposed in the present investigation will be executed on the Chilean National Stock Market. This corresponds to the entire market of equity instruments in national currency (National Shares). This market has the following characteristics, which the system must accommodate:

(i) It has a defined operating schedule. The National Stock Market operates from 09:30 to 17:00 in the summer and from 09:30 to 16:00 in the winter. During this time, it is possible to negotiate (enter offers and modify or cancel them). There is a time slot between 09:00 and 09:23 (plus a random interval of between 0 and 5 minutes) called the PreOpen session, during which it is possible to enter or cancel offers before they are executed (matched with other offers).

(ii) In the case of a brokerage firm, the costs are known (stock exchange rights), and the existing regulation is exhaustive, mainly regarding the guarantees that each stockbroker must maintain to continue operating. In the case of an individual investor, the costs vary according to each stock brokerage, but they are also known (fixed costs and variable commissions).

(iii) It is possible to identify the most liquid stocks by reviewing the composition of the IPSA (Selective Stock Price Index).

(iv) It provides both electronic trading mechanisms with support for high-frequency trading and electronic communication mechanisms for the entry of orders. For the latter, mechanisms are provided for institutional negotiators (brokers and financial institutions) through DMA (Direct Market Access) mechanisms on the FIX 4.4 protocol, as well as retail market mechanisms (for noninstitutional users) provided by brokers, using routing command clients on FIX 4.4 with command polling.
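The operating schedule in item (i) could be encoded as a small configuration that the system consults before entering orders. The phase names, the treatment of the 09:23 to 09:30 window as a separate phase (because the PreOpen session ends at a random moment), and the `season` argument are assumptions for illustration; how the exchange determines summer and winter hours is not specified here.

```python
from datetime import time

PREOPEN_START = time(9, 0)
PREOPEN_EARLIEST_END = time(9, 23)   # PreOpen ends 0 to 5 minutes after this
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = {"summer": time(17, 0), "winter": time(16, 0)}


def session_phase(now: time, season: str) -> str:
    """Classify a time of day into a trading-session phase (assumed phase names)."""
    if season not in MARKET_CLOSE:
        raise ValueError("season must be 'summer' or 'winter'")
    if PREOPEN_START <= now < PREOPEN_EARLIEST_END:
        return "preopen"              # offers can be entered or cancelled
    if PREOPEN_EARLIEST_END <= now < MARKET_OPEN:
        return "preopen_random_end"   # PreOpen may already have ended
    if MARKET_OPEN <= now < MARKET_CLOSE[season]:
        return "continuous"           # offers can be entered, modified, cancelled
    return "closed"
```

A trading thread could call `session_phase` at the start of each cycle and skip order entry unless the phase is `"continuous"`.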

Thus, shares of the national equity market with high presence and/or belonging to the IPSA indicators that are not suspended will be used. The purpose of this is not to discard shares that do not meet this criterion for analysis and storage purposes but simply to avoid using them during normal execution of the AT/HFT system unless they change their condition to high presence. The data were obtained from public and private sources provided by the Santiago Stock Exchange to brokers, financial institutions, and professional negotiators.

##### 4.2. AT/HFT System Design

The system is based on five annex modules and a central module for model execution. The central module is responsible for maintaining one or more trading models through a daily review of market behavior.

The basic form of operation of the model execution module consists of a parallel copy of the chosen trading model for each valid instrument in the target market. Each copy accesses the annexed modules independently to request information, access communication interfaces, etc., but each annexed module exists as a single instance (singleton).

The system allows parallel executions. There is one thread per instrument with the possibility of trading; in each thread, the chosen model is adjusted to the needs and characteristics of the instrument.

The thread requests its configuration parameters (which the human operator can change between executions) at the start of its cycle. It then requests updated market information and uses this information to load the model. The Storage process evaluates whether it is necessary to update its information; if the information is out of date, it looks for new information both in the market and in other sources of data. In either case, the process returns updated or recent market information to the model executor. The model executor evaluates the model and verifies whether there is a favorable condition for the purchase. If such a condition exists, the thread requests a risk assessment from the Risk module. If the Risk module determines that the market condition and the risk parameters are correct, the Risk module authorizes the transaction. Then, part of the capital available to make the purchase is reserved; this part of the capital is requested from the module that handles capital and custody. With the available capital, the parameters of the order are calculated; the Communications module then sends the purchase order to the market. When the order has been entered into the market, the available capital is updated. The process is repeated cyclically throughout the trading hours. At the end of each cycle, it is possible to apply a complete revision of the model to adapt it to the new market conditions. The process for sales is similar, but it manipulates the custody of the instruments rather than the available capital.
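The purchase cycle described above can be sketched as a single function that chains the model evaluation, the risk check, the capital reservation, and the order dispatch. All class and function names, the fraction of capital reserved per order, and the use of plain callables in place of the Risk and Communications modules are hypothetical simplifications, not the authors' design.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Capital:
    """Stand-in for the module that handles capital and custody."""
    available: float

    def reserve(self, amount: float) -> float:
        granted = min(amount, self.available)
        self.available -= granted
        return granted


@dataclass
class Order:
    instrument: str
    side: str
    quantity: int
    price: float


def buy_cycle(instrument: str, prices: Sequence[float],
              model_says_buy: Callable[[Sequence[float]], bool],
              risk_approves: Callable[[str, Sequence[float]], bool],
              capital: Capital, send_order: Callable[[Order], None],
              fraction: float = 0.1) -> Optional[Order]:
    """One purchase cycle for one instrument; returns the order sent, if any."""
    if not model_says_buy(prices):             # model executor: favorable condition?
        return None
    if not risk_approves(instrument, prices):  # Risk module authorizes the transaction
        return None
    reserved = capital.reserve(capital.available * fraction)  # reserve part of capital
    price = prices[-1]
    quantity = int(reserved // price)          # order parameters from reserved capital
    if quantity == 0:
        capital.available += reserved          # nothing affordable: release reservation
        return None
    capital.available += reserved - quantity * price  # release the unused remainder
    order = Order(instrument, "buy", quantity, price)
    send_order(order)                          # Communications module sends the order
    return order
```

A sale cycle would follow the same sequence but reserve and release custody of the instrument instead of capital.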

##### 4.3. Adaptive Model with PSO

An adaptive AT/HFT model is proposed based on some known MA strategies. For this case, we present a classic model of two MAs, one long and one short, in conjunction with two risk management bands, stop-loss and stop-win. In this model, there are four parameters to optimize, as shown in Table 1.
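The way four such parameters might be represented and used can be sketched as follows. The field names, the interpretation of the stop bands as fractions of the entry price, and the decoding of a continuous optimizer position into integer window lengths are assumptions for illustration; the table in the paper remains the authoritative definition of the parameters and their ranges.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StrategyParams:
    short_period: int   # window of the short MA
    long_period: int    # window of the long MA
    stop_loss: float    # assumed: fraction below the entry price, e.g. 0.02
    stop_win: float     # assumed: fraction above the entry price, e.g. 0.03


def decode_particle(position: Sequence[float]) -> StrategyParams:
    """Map a continuous 4-dimensional position to valid strategy parameters."""
    short = max(1, round(position[0]))
    long_ = max(short + 1, round(position[1]))   # keep the long window longer
    return StrategyParams(short, long_, abs(position[2]), abs(position[3]))


def exit_reason(entry_price: float, current_price: float, short_ma: float,
                long_ma: float, params: StrategyParams) -> Optional[str]:
    """Decide whether an open long position should be closed, and why."""
    if current_price <= entry_price * (1.0 - params.stop_loss):
        return "stop_loss"    # lower risk band reached
    if current_price >= entry_price * (1.0 + params.stop_win):
        return "stop_win"     # upper band reached, take the gain
    if short_ma < long_ma:
        return "ma_cross"     # short MA fell below the long MA
    return None
```

The stop bands are checked before the MA signal so that risk limits take priority over the trend rule, which is one reasonable ordering among several.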