Data Science and AI-based Optimization in Scientific Programming
Automatic High-Frequency Trading: An Application to Emerging Chilean Stock Market
This research seeks to design, implement, and test a fully automatic high-frequency trading system that operates on the Chilean stock market, so that it is able to generate positive net returns over time. A system that implements high-frequency trading (HFT) is presented through advanced computer tools as an NP-Complete type problem in which it is necessary to optimize the profitability of stock purchase and sale operations. The research performs individual tests of the algorithms implemented, reviewing the theoretical net return (profitability) that can be applied on the last day, month, and semester of real market data. Finally, the research determines which of the variants of the implemented system performs best, using the net returns as a basis for comparison. The use of particle swarm optimization as an optimization algorithm is shown to be an effective solution since it is able to optimize a set of disparate variables but is bounded to a specific domain, resulting in substantial improvement in the final solution.
This research seeks to design, implement, and test a fully automatic trading system that operates on the national Chilean stock market, so that it is capable of generating positive net returns over time. In particular, it is desired to create a system that implements high-frequency trading (HFT). Thus, the research corresponds to the application of advanced computer tools to a problem of type NP-Complete, where the aim is to optimize the profitability of operations of purchase and sale of shares. In this way, the objective is to create an implementation of an automatic trading system that is capable of generating positive returns for a set of real data of the national stock market, under a completely automatic modality, where there is no intervention of a human operator in the decision-making and execution of operations. Section 2 describes automatic and semiautomatic stock-trading systems and algorithmic high-frequency trading context. Section 3 provides a review of current trading algorithm techniques that can work under the automatic mode of HFT, indicating which computer techniques can be applied. Section 4 presents the design of an automatic trading system, in HFT mode, indicating the restrictions on the data and financial instruments included in the study. Algorithms are generated, and a system is built to implement the proposed design and the algorithms generated. Individual tests of the implemented algorithms are carried out, reviewing the theoretical net return (profitability) that can be generated applied on the last day, month, and semester of real market data. Finally, in Section 5, it is determined which of the variants of the implemented system behaves better, using the net returns as a basis for comparison and applying other criteria as deemed necessary.
Stock trading is an activity that has been conducted for hundreds of years and is currently performed on stock exchanges around the world. In these exchanges, a huge variety of financial assets and debt instruments are traded daily. Stock trading is a complex decision-making problem that involves multiple variables and does not always have an optimal solution, since the conditions vary over time and are affected by internal and external factors.
In recent years, automatic and semiautomatic stock-trading systems that can analyze market conditions and make the decisions needed to conduct the required transactions have begun to be implemented. Since 2008, such systems have reported significant and constant gains on foreign stock exchanges such as the New York Stock Exchange (NYSE). These systems can be traced back to Morgan Stanley’s “Black Box” system, which was created in 1985 for the generation of buy/sell signals based on statistical arbitrage.
The benefit of using automated systems for the stock trading task lies in the fact that people tend to pay too much attention to signals (price) and ignore the transition probabilities that generate the signals. An automated system, in contrast, can calculate the probabilities of price transition and act accordingly, avoiding problems of late reaction or overreaction to changes.
At present, the entry of automatic traders into the Chilean market has been relatively low, and there is no information available on such entry since those who employ automatic trading mechanisms are unwilling to divulge the details of their systems for fear of competition, as occurred in the beginning with the “Black Box” system. The domestic market has been able to operate with automatic low- and high-frequency traders since 2010, when the Santiago Stock Exchange launched the Telepregon HT system, which allows the trading of equities at a theoretical maximum rate of 3000 transactions per second [4, 5].
Trading is the exchange of ownership of a good, product, or service from a person or entity under conditions in which something is obtained in return from the buyer. Thus, trading can be understood as the practice conducted by stockbrokers or their clients whereby financial instruments are exchanged in securities markets. This trading is based on the principle of supply and demand of the traded instruments, which causes the prices of the instruments to vary and generates a profit (or loss) that is determined by the difference between the original purchase price and the final sale price.
High-frequency trading (HFT) is understood as a way of operating in stock markets to which a number of special conditions apply:
(i) There is a rapid exchange of capital
(ii) A large number of transactions are performed
(iii) Generally, a low gain per transaction is obtained
(iv) Financial instrument positions are neither accumulated from one trading day to another nor avoided
(v) Trading is conducted through a computer system
The definition of HFT itself does not indicate whether the system performing it is automatic, semiautomatic or user-operated.
In contrast, automatic trading ranges from systems that support the entry of buy/sell orders to the market to systems capable of entering orders automatically without the need for a human operator. Unlike HFT systems, these systems can maintain the positions of financial instruments from one day to the next.
There is no single formula for defining an HFT or an automatic trading system [1, 6, 7]. For example, it is stated that an algorithmic trading (AT) system corresponds to “elements in decision-making and financial investments being executed by an algorithm through computers and electronic communication networks” . Investment strategies can be predefined or adaptive. These investment strategies can be supported by knowledge of economics, statistics, artificial intelligence, metaheuristics, etc.
Similarly, a sequential process for developing an HFT system has been proposed that is based on four steps: (i) data analysis; (ii) trading model; (iii) decision-making; and (iv) execution of business.
Thus, there is no single formula for producing an HFT system. However, it is worth noting that to achieve an effective HFT system, it is necessary to take into account a series of processes common to any system, namely, analysis, identification, collation, routing, and execution.
In any of the automated systems described above, the components that present the greatest complexity are the analysis of real-time opportunities and the search for market inefficiencies. The available literature mentions methods of the following types:
(i) Rule-based methods such as statistical arbitrage. These methods apply a series of rules that are based on the recent behavior of a financial instrument and act based on the result of applying those rules.
(ii) Methods based on statistical and mathematical models, such as volume-weighted average price, time-weighted average price, and moving averages [1, 7]. For these cases, a mathematical or statistical model is used that requires a series of parameters that control its behavior. The selection of the configuration parameters is performed by a human operator who is in charge of trading on the market.
(iii) Methods combining statistical and mathematical models with optimization techniques based on metaheuristics. These methods use metaheuristics to automatically fine-tune the parameters of known algorithms to obtain optimum values for current market conditions.
(iv) Methods based on machine learning, data mining, and processing of complex events [1, 9, 10]. Because rapid assimilation by a human operator of the large amount of information flowing to and from any stock market is becoming an increasingly difficult task, it is desirable to develop systems that are able to detect hidden patterns in price variations and the relationships between financial instruments or other economic indicators and that can also incorporate a component of interpretation of “feeling” or “sensing” the market through natural language news processing (e.g., SuperX Plus of Deutsche Bank).
3. Trading Algorithms
3.1. Statistical Methods Used in AT and HFT
Volume-weighted average price (VWAP) is defined as the ratio of the value traded (price multiplied by volume) to the total volume traded for the instrument over the trading horizon. It is common to evaluate the performance of traders by their ability to execute buy/sell orders at prices that are better than the VWAP price on the trading horizon. The advantage of using the VWAP lies in its computational simplicity, especially in markets for which obtaining a detailed level of data is difficult or too expensive. The VWAP for an instrument $j$ on a given day is calculated as follows:

$$\mathrm{VWAP}_j = \frac{\sum_{t} v_{j,t}\, p_{j,t}}{\sum_{t} v_{j,t}},$$

where $v_{j,t}$ is the volume of instrument $j$ traded at time $t$, and $p_{j,t}$ is the market price of instrument $j$ at time $t$. VWAP can be used to minimize the costs of transactions and market impacts. It can also be used as a benchmark to verify the effectiveness of other algorithms and trading strategies.
A modification to the VWAP model, called DVWAP (dynamic VWAP), has also been proposed. This modification allows intraday transactions (transactions realized during the same day of execution) to be incorporated. This allows the model to be applied to a more realistic scenario of the market in which the news that arrives affects the price of the instruments.
Time-weighted average price (TWAP) is the average price of a financial instrument over a specific period of time during which the order is executed at the TWAP price or better. It is used to execute orders at a specific time to keep the price close to what the market reflects at that time. The TWAP of an instrument $j$ in a period of $T$ observations is calculated as follows:

$$\mathrm{TWAP}_j = \frac{1}{T}\sum_{t=1}^{T} p_{j,t},$$

where $p_{j,t}$ is the market price of the instrument $j$ at time $t$. Like VWAP, TWAP can be used as a benchmark to verify the effectiveness of other algorithms and models.
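The two benchmarks described above can be illustrated with a minimal Python sketch. The function names `vwap` and `twap` and the sample trades are hypothetical and chosen only for illustration; TWAP is computed here under the assumption of equally spaced price observations.

```python
from typing import Sequence


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price: traded value divided by traded volume."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume == 0:
        raise ValueError("total traded volume is zero")
    traded_value = sum(p * v for p, v in zip(prices, volumes))
    return traded_value / total_volume


def twap(prices: Sequence[float]) -> float:
    """Time-weighted average price over equally spaced observations."""
    if not prices:
        raise ValueError("at least one price is required")
    return sum(prices) / len(prices)


# Hypothetical trades of one instrument during one day.
trade_prices = [100.0, 101.0, 99.0]
trade_volumes = [10, 30, 60]
day_vwap = vwap(trade_prices, trade_volumes)
day_twap = twap(trade_prices)
```

In this example most of the volume traded at the lowest price, so the VWAP falls below the TWAP, which is the kind of difference a benchmark comparison would reveal.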
Other types of algorithms include variants of linear econometric models. These models attempt to predict the behavior of random variables as a combination of other random variables, both contemporaneous and retrospective, with well-defined distributions. Such linear models can be expressed as

$$y_t = \alpha + \beta\, x_t + \gamma\, z_t + \varepsilon_t,$$

where $y_t$ is the time series of a random variable on which a forecast is to be made; $x_t$ and $z_t$ are significant factors for predicting the value of $y_t$; $\alpha$, $\beta$, and $\gamma$ are the factors to be determined; and $\varepsilon_t$ is the remaining error.
Moving averages (MA), a model for predicting future movements in the price of a financial instrument, focuses on how future data will react to changes in past data. To generate the MA model ($\mathrm{MA}(q)$) with $q$ delays, we use

$$r_t = \beta_0 + \varepsilon_t + \sum_{i=1}^{q} \beta_i\, \varepsilon_{t-i},$$

where $r_t$ is the return at time $t$, $\beta_0$ is the intercept, $\beta_i$ is the coefficient belonging to delay $i$, and $\varepsilon_{t-i}$ is the unexpected component of the return at delay $i$. There are several ways to estimate MA; they include the following:
Simple MA (SMA) is the unweighted average of the previous $n$ prices. The window can span previous days or another measure of time. It can also be calculated incrementally from the SMA of the previous period, which simplifies its computation.
Cumulative MA (CMA) is a moving average in which all prices up to the current instant are considered. Its formula is similar to that of SMA but begins from the first recorded market price for an instrument. It has no known application in trading strategies.
Weighted MA (WMA) is an average that uses multiplication factors to give different weights to different prices within the same MA window (convolution of data points with a fixed weight function). In trading, a decreasing weight from $n$ to 1 is assigned to each price in the evaluation window, as follows:

$$\mathrm{WMA}_t = \frac{n\, p_t + (n-1)\, p_{t-1} + \cdots + 2\, p_{t-n+2} + p_{t-n+1}}{n + (n-1) + \cdots + 2 + 1}.$$

Like MA, WMA provides a smoothing function of the prediction curve. In some cases, it is used together with MA; it can also be used when the prices of previous days do not greatly affect the value of the current price of an instrument.
Exponential MA (EMA), also referred to as EWMA (exponential weighted MA), is a version similar to WMA in which the weight is an exponential rather than a linear function. The weight assigned to each market price decreases exponentially and never reaches zero. Thus, for a price series $p_1, p_2, \ldots$, the EMA is calculated recursively as

$$\mathrm{EMA}_1 = p_1, \qquad \mathrm{EMA}_t = \alpha\, p_t + (1 - \alpha)\, \mathrm{EMA}_{t-1} \quad (t > 1),$$

where $\alpha$ is the coefficient of decreasing weight (a constant value between 0 and 1). A high coefficient value discounts the old prices more quickly. Alternatively, $\alpha$ can be expressed in terms of $N$ periods of time:

$$\alpha = \frac{2}{N + 1}.$$
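The moving-average variants described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the seeding of the EMA with the first price, and the conversion `alpha = 2 / (periods + 1)` are common conventions assumed here rather than details confirmed by the text.

```python
from typing import List, Sequence


def sma(prices: Sequence[float], n: int) -> float:
    """Unweighted mean of the last n prices."""
    if not 0 < n <= len(prices):
        raise ValueError("window must fit inside the price series")
    return sum(prices[-n:]) / n


def wma(prices: Sequence[float], n: int) -> float:
    """Linearly weighted mean: the newest price gets weight n, the oldest 1."""
    if not 0 < n <= len(prices):
        raise ValueError("window must fit inside the price series")
    window = prices[-n:]                      # ordered oldest -> newest
    weights = range(1, n + 1)                 # 1 for oldest, n for newest
    return sum(w * p for w, p in zip(weights, window)) / (n * (n + 1) / 2)


def alpha_from_periods(periods: int) -> float:
    """Assumed convention for expressing the EMA coefficient in periods."""
    return 2.0 / (periods + 1)


def ema(prices: Sequence[float], alpha: float) -> List[float]:
    """Recursive EMA, seeded with the first price (an assumption)."""
    if not prices:
        raise ValueError("at least one price is required")
    out = [float(prices[0])]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out


sample = [10.0, 11.0, 12.0, 13.0]
sma3 = sma(sample, 3)
wma3 = wma(sample, 3)
ema_series = ema(sample, alpha_from_periods(3))
```

On a rising series such as `sample`, the WMA sits above the SMA because recent prices carry more weight, while the EMA lags further because it still carries the early prices.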
3.2. Metaheuristic Models
Several known trading models and algorithms have been described in the literature. Such algorithms are generally applied manually by a human operator to determine when to buy, sell, or maintain the current position. Robert Pardo states that for a given combination of strategies, it is possible to apply optimization to determine a set of parameters that generates greater gains .
Such a postulate does not come without associated problems. The major known problem is that such optimizations can cause overfitting of the algorithm with respect to the data used. In the best-case scenario, the resulting algorithm will not generate the expected gains, and in the worst case, the algorithm will produce constant losses. One way to understand the concept of overfitting is to think of a statistical model that describes random error or noise instead of describing relationships between variables.
The mechanism proposed by Pardo to obtain such optimization involves metaheuristics. In this respect, both AT and HFT can be understood as complex optimization problems. This class of problems is referred to as class NP (nondeterministic polynomial time). NP is the class of problems for which a proposed solution can be verified by a polynomial-time algorithm. For the hardest problems in this class, the NP-Complete problems, no algorithm is known that can generate solutions in polynomial time. This implies that the application of conventional exact algorithms to these problems results in execution times that increase exponentially as the size of the problem increases.
Until 1971, no problem had been proved to be NP-Complete. That year, Stephen Cook proved that the Boolean satisfiability problem is NP-Complete, the first such result. In 1972, Richard Karp expanded on Cook’s idea, demonstrating that a series of 21 problems are NP-Complete.
Because AT and HFT are both problems of trading financial instruments in markets with varying conditions over time, they can both be categorized as NP-class problems . Because both are based on the maximization of net returns, according to Chang and Johnson , they can be classified as NP-Complete, even in versions that perform offline market simulations.
One way of approaching an NP-class problem is to use a metaheuristic that corresponds to an approximate algorithm that combines basic heuristic methods in a higher framework in which a solution search space is explored efficiently and effectively . Thus, through an objective function that guides the search process, an efficient exploration of possible solutions is made in search of one or more near-optimal solutions.
A great variety of metaheuristic algorithms are available. Some of these algorithms have more affinity for certain types of problems than others, such as problems with binary, discrete, or continuous variables. Some algorithms can be applied to only one variable type, or adjustments must be made such as applying conversion functions. In particular, an approach to one of the existing algorithms called particle swarm optimization (PSO) will be presented.
3.2.1. Particle Swarm Optimization
The PSO algorithm was introduced by Kennedy and Eberhart in 1995  in an attempt to describe the social behavior of flocks of birds or schools of fish and to model their communication mechanisms as a basis for solving optimization problems. Instead of relying on the “survival of the fittest,” PSO is based on the cooperation of individuals.
PSO is defined as a metaheuristic algorithm that optimizes a problem by iteratively improving a population of candidate solutions called particles by moving them through the solution space using a formula based on each particle’s position and velocity. The movement of each particle is influenced by its best-known local solution and is also guided to the best-known global solution. With this, the swarm is expected to move collectively toward the best solution in the search space.
In the basic version of PSO, the velocity and position of the particles are calculated as follows:

$$v_i^{k+1} = w\, v_i^{k} + c_1 r_1 \left(p_i - x_i^{k}\right) + c_2 r_2 \left(g_i - x_i^{k}\right),$$

$$x_i^{k+1} = x_i^{k} + v_i^{k+1},$$

where $x_i^{k}$ is the position of the $i$-th particle at iteration $k$, $v_i^{k}$ is the velocity of the $i$-th particle at iteration $k$, $w$ is the inertia factor (a value between 0 and 1), $c_1$ is the local acceleration factor (cognitive component of the individual), $c_2$ is the global acceleration factor (social component of the swarm), $r_1$ and $r_2$ are random numbers with uniform distributions between 0 and 1, $p_i$ is the best previous position of the $i$-th particle, and $g_i$ is the best previous position of the neighborhood of the $i$-th particle.
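A minimal Python sketch of the basic update may help make the roles of the inertia and acceleration factors concrete. The names `w`, `c1`, `c2`, `r1`, and `r2` are labels chosen for this sketch, as are the default parameter values, the clamping of positions to the domain, the use of one random scalar per particle rather than per dimension, and the toy objective; none of these is taken from the system described in this paper.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def pso_step(x: Vector, v: Vector, p_best: Vector, g_best: Vector,
             w: float, c1: float, c2: float,
             r1: float, r2: float) -> Tuple[Vector, Vector]:
    """One velocity and position update for a single particle."""
    new_v = [w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
             for xi, vi, pi, gi in zip(x, v, p_best, g_best)]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v


def maximize(f: Callable[[Vector], float],
             bounds: Sequence[Tuple[float, float]],
             n_particles: int = 20, n_iter: int = 200,
             w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
             seed: int = 0) -> Tuple[Vector, float]:
    """Global-best PSO that maximizes f inside a bounded domain."""
    rng = random.Random(seed)
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[0.0] * len(bounds) for _ in range(n_particles)]
    p_best = [x[:] for x in xs]
    p_val = [f(x) for x in xs]
    for _ in range(n_iter):
        g = max(range(n_particles), key=lambda i: p_val[i])
        g_best = p_best[g][:]
        for i in range(n_particles):
            xs[i], vs[i] = pso_step(xs[i], vs[i], p_best[i], g_best,
                                    w, c1, c2, rng.random(), rng.random())
            # Keep the particle inside the allowed domain.
            xs[i] = [min(max(xi, lo), hi) for xi, (lo, hi) in zip(xs[i], bounds)]
            value = f(xs[i])
            if value > p_val[i]:
                p_best[i], p_val[i] = xs[i][:], value
    g = max(range(n_particles), key=lambda i: p_val[i])
    return p_best[g], p_val[g]


# Toy objective with its maximum at (1, -2); purely illustrative.
best, best_value = maximize(lambda x: -(x[0] - 1) ** 2 - (x[1] + 2) ** 2,
                            [(-5.0, 5.0), (-5.0, 5.0)])
```

Keeping the single-particle update in its own function makes it easy to check against the formulas by hand, while the loop shows how personal and global bests are tracked between iterations.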
Various formulations exist for the selection of the parameters $w$, $c_1$, and $c_2$. One of the options is to use values suggested in the literature.
Other ways of determining the parameters include functions that modify the parameters during the execution of the algorithm. An example of this is given by Fikret; the example is based on the fact that the value of the parameter of inertia, $w$, influences the diversification (exploration of the search space) and intensification (exploitation of the search space). High values of the parameter of inertia favor diversification, whereas low values favor the intensification of local solutions. In this way, the inertia parameter is defined as an exponential function of the iteration count, expressed in terms of $w_{\mathrm{start}}$, the initial inertia; $w_{\mathrm{end}}$, the final inertia; $k$, the current iteration; $k_{\max}$, the maximum number of iterations to be performed; and $\alpha$, a gradient constant. Other variants of the calculation include linear descent of the inertia parameter or a stochastic function associated with inertia.
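As an illustration of a time-varying inertia parameter, the sketch below shows one possible exponential decay together with the linear descent variant. The exact formula of the cited work is not reproduced in this text, so the functional form, the function names, and the default values used here are assumptions made only to show the idea of moving from exploration to exploitation.

```python
import math


def exponential_inertia(k: int, k_max: int, w_start: float = 0.9,
                        w_end: float = 0.4, alpha: float = 5.0) -> float:
    """Assumed form: decays from w_start toward w_end as k grows.

    alpha plays the role of the gradient constant; larger values make
    the inertia drop faster in the early iterations.
    """
    return w_end + (w_start - w_end) * math.exp(-alpha * k / k_max)


def linear_inertia(k: int, k_max: int, w_start: float = 0.9,
                   w_end: float = 0.4) -> float:
    """Linear descent from w_start at k = 0 to w_end at k = k_max."""
    return w_start - (w_start - w_end) * k / k_max


# Inertia values over a hypothetical run of 100 iterations.
schedule = [exponential_inertia(k, 100) for k in range(101)]
```

Early iterations keep a high inertia, which favors diversification, and later iterations approach the final inertia, which favors intensification around the best solutions found.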
For the specific problem of HFT and AT, the PSO algorithm is applied to the optimization of the parameters of a trading strategy based on the MA of two or more bands. In this case, the temporal parameters of the MA involved in the strategy are optimized against an objective function. Possible objectives include the following:
(i) Obtain the highest net return (earnings)
(ii) Obtain the most benefit per transaction
(iii) Obtain the highest percentage of winning transactions or assure that the strategy has a higher specific financial ratio
In this way, the objective function that is applied to the PSO algorithm measures and classifies the quality of the trading strategy that is applied in the AT or HFT system. Thus, the success of applying PSO to an HFT and AT problem depends primarily on how the objective function is proposed for the same trading model.
The main objective of the research is to create a system that can conduct trading autonomously. Thus, a preliminary design of a system that can be applied during a full trading day for a given stock market is defined. As an initial step, this requires defining and delimiting the target market since there are multiple stock exchanges in the world, each offering a range of different markets and possessing specific regulations and restrictions.
The data used correspond to the transactions carried out during 2 years for one of the equity instruments listed in the old IPSA index (now replaced by S&P/CLX IPSA). This is a highly liquid stock instrument in the national market. In particular, 2 sets of data were used. One is public: the register of daily operations, which is reported to the CMF (Commission of Financial Markets) and published daily on the institutional site of the Santiago Stock Exchange. The other is a commercial data product: market replay, which contains the data of offers entered into the system at the level of nanoseconds, anonymized by stockbroker (“Corredor de Bolsa”), so that it does not expose sensitive information about the orders such as the client to which each belongs, the % amount visible, the internal operator, etc.
Once the target market, the selected data, and the instruments involved have been defined, a system can be designed that is capable of operating on the defined market and adapting to the regulations and restrictions that govern it. The same definition of the target market and the instruments will serve to determine what external data will be required and how these data should be collected and treated by the system.
Strategies, especially classic trading strategies based on MA, should be validated in conjunction with parameter optimization using PSO. Regardless of the strategy adopted, the system that is designed must support any type of strategy, so it must be a generic and easily extensible system.
4.1. Selection of Markets and Financial Instruments
The system proposed in the present investigation will be executed on the Chilean National Stock Market. This corresponds to the entire market of equity instruments in national currency (National Shares). This market has the following characteristics:
(i) It has a defined operating schedule. The National Stock Market operates from 09:30 to 17:00 in the summer and from 09:30 to 16:00 in the winter. During this time, it is possible to negotiate (enter offers and modify or cancel them). There is a time slot between 09:00 and 09:23 (plus an interval of random time between 0 and 5 minutes) called the PreOpen session during which it is possible to enter or cancel offers before they are executed (with other offers).
(ii) In the case of a brokerage firm, the costs are known (stock exchange rights), and the existing regulation is exhaustive, mainly in the guarantees that each stockbroker must maintain to continue operating. In the case of a particular investor, the costs vary according to each stock brokerage, but they are also known (fixed costs and variable commissions).
(iii) It is possible to identify the stocks that are the most liquid by reviewing the composition of the IPSA (Selective Stock Price Index).
(iv) It provides both electronic trading mechanisms with support for high frequency and electronic communication mechanisms for the entry of orders. For the latter, mechanisms are provided for institutional negotiators (brokers and financial institutions) through DMA (Direct Market Access) mechanisms on the FIX 4.4 protocol, as well as retail market mechanisms (for noninstitutional users) provided by brokers, using routing command clients on FIX 4.4 with command polling.
Thus, shares of the national equity market that have high presence and/or belong to the IPSA index and that are not suspended will be used. The purpose of this is not to discard shares that do not meet this criterion for analysis and storage purposes but simply to avoid using them during normal execution of the AT/HFT system unless they change their condition to high presence. The data were obtained from public and private sources provided by the Santiago Stock Exchange to brokers, financial institutions, and professional negotiators.
4.2. AT/HFT System Design
The system is based on five annex modules and a central module for model execution. The central module is responsible for maintaining one or more trading models through a daily review of market behavior.
The basic form of operation of the execution model module consists of running a parallel copy of the chosen trading model for each valid instrument in the target market. Each copy accesses the annexed modules independently to request information and to access communication interfaces, etc., but each annexed module exists as a single instance (singleton).
The system allows parallel executions. There is one thread per instrument with the possibility of trading; each thread in the chosen model is adjusted to the needs and characteristics of the instrument.
The thread requests its configuration parameters (which the human operator can change between executions) at the start of its cycle. It then requests updated market information and uses this information to load the model. The Storage process evaluates whether it is necessary to update its information; if the information is out of date, it looks for new information both in the market and in other sources of data. In either case, the process returns updated or recent market information to the model executor. The model executor evaluates the model and verifies whether there is a favorable condition for the purchase. If such a condition exists, the thread requests a risk assessment from the Risk module. If the Risk module determines that the market condition and the risk parameters are correct, the Risk module authorizes the transaction. Then, part of the capital available to make the purchase is reserved, and this part of the capital is requested by the module that handles capital and custody. With the available capital, the parameters of the order are calculated; the Communications module then sends the purchase order to the market. When the order has been entered into the market, the available capital is updated. The process is repeated cyclically throughout the trading hours. At the end of each cycle, it is possible to apply a complete revision of the model to adapt it to the new market conditions. The process for sales is similar, but it manipulates the custody of the instruments rather than the available capital.
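The purchase branch of the cycle described above can be sketched in Python as follows. Everything in this sketch is hypothetical: the class names `Order`, `CapitalModule`, and `CommunicationsModule`, the callbacks `buy_signal` and `risk_ok`, the reserved capital fraction, and the instrument code "XYZ" are placeholders for illustration and do not describe the actual modules of the system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Order:
    instrument: str
    quantity: int
    price: float


@dataclass
class CapitalModule:
    available: float

    def reserve(self, fraction: float) -> float:
        """Set aside part of the available capital for one purchase."""
        amount = self.available * fraction
        self.available -= amount
        return amount

    def release(self, amount: float) -> None:
        """Return unused capital to the available pool."""
        self.available += amount


@dataclass
class CommunicationsModule:
    sent: List[Order] = field(default_factory=list)

    def send(self, order: Order) -> None:
        # Stand-in for order entry to the market.
        self.sent.append(order)


def buy_cycle(instrument: str, prices: Sequence[float],
              buy_signal: Callable[[Sequence[float]], bool],
              risk_ok: Callable[[str, Sequence[float]], bool],
              capital: CapitalModule, comms: CommunicationsModule,
              fraction: float = 0.25) -> Optional[Order]:
    """One pass of the purchase branch for the thread of one instrument."""
    if not buy_signal(prices):            # model evaluation
        return None
    if not risk_ok(instrument, prices):   # risk authorization
        return None
    reserved = capital.reserve(fraction)  # capital reservation
    price = prices[-1]
    quantity = int(reserved // price)     # order parameters
    if quantity == 0:
        capital.release(reserved)
        return None
    capital.release(reserved - quantity * price)  # give back the remainder
    order = Order(instrument, quantity, price)
    comms.send(order)                     # order entry
    return order


capital = CapitalModule(available=10000.0)
comms = CommunicationsModule()
order = buy_cycle("XYZ", [100.0, 101.0],
                  buy_signal=lambda p: p[-1] > p[0],
                  risk_ok=lambda name, p: True,
                  capital=capital, comms=comms)
```

Passing the model and risk checks as callables mirrors the requirement that the execution engine stay generic with respect to the trading strategy; the sales branch would follow the same shape but operate on custody instead of capital.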
4.3. Adaptive Model with PSO
An adaptive AT/HFT model is proposed based on some known MA strategies. For this case, we present a classic model of two MA, one long and one short, in conjunction with two bands of risk management by stop-loss and stop-win. In this model, there are four parameters to optimize, as shown in Table 1.
Table 1 shows the variables involved in the model. These variables are subject to the following restrictions:
The principle of a 2-MA strategy is to identify when there is a crossover, that is, when the short MA curve intersects the long MA curve.
The curves can cross from below, when the short MA curve intersects the long MA curve from a lower to a higher value, or from above, when the short MA curve intersects the long MA curve from a higher to a lower value. When a crossover of the first type (increasing) occurs, a favorable condition for the purchase occurs, since the price tends to rise. When the crossover is of the second type (decreasing), a condition is generated that discards purchases and forces custody to be liquidated through sales.
The control bands are applied to the set of two MA to generate stop-loss and stop-win mechanisms integrated in the model. These bands fulfill the objective of optimizing the model since the upper band prioritizes capital over gains (thus making the capital available to the other concurrent execution threads of the trading model), and the lower band reduces the losses.
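The crossover rule and the control bands can be illustrated with a short Python sketch. It assumes simple moving averages for both curves and expresses the stop-loss and stop-win bands as fractional changes relative to the entry price; both choices, along with the function names and the sample prices, are assumptions made for illustration, since the exact parameterization is given in Table 1.

```python
from typing import List, Sequence


def sma_at(prices: Sequence[float], end: int, n: int) -> float:
    """Simple moving average of the n prices ending at index `end`."""
    return sum(prices[end - n + 1:end + 1]) / n


def crossover_signals(prices: Sequence[float], short_n: int,
                      long_n: int) -> List[str]:
    """Return 'buy', 'sell', or 'hold' for each index from long_n onward."""
    if not 0 < short_n < long_n:
        raise ValueError("short window must be smaller than long window")
    signals = []
    for t in range(long_n, len(prices)):
        prev_short = sma_at(prices, t - 1, short_n)
        prev_long = sma_at(prices, t - 1, long_n)
        short = sma_at(prices, t, short_n)
        long_ = sma_at(prices, t, long_n)
        if prev_short <= prev_long and short > long_:
            signals.append("buy")    # short MA crosses the long MA from below
        elif prev_short >= prev_long and short < long_:
            signals.append("sell")   # short MA crosses the long MA from above
        else:
            signals.append("hold")
    return signals


def band_exit(entry_price: float, price: float,
              stop_loss: float, stop_win: float) -> bool:
    """True when the price leaves the band around the entry price."""
    change = (price - entry_price) / entry_price
    return change <= -stop_loss or change >= stop_win


prices = [10.0, 10.0, 10.0, 10.0, 12.0, 14.0, 9.0, 8.0]
signals = crossover_signals(prices, short_n=2, long_n=4)
```

The band check is independent of the crossover: a position opened on a buy signal can be closed by either band before the opposite crossover appears, which releases capital for the other execution threads.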
The most important feature of the PSO model is the objective function that is used. In the first instance, the objective function is based on maximizing the net return of the system. In this way, the objective function will be

$$f = \sum_{t=1}^{n} \left( q^{s}_{t}\, p^{s}_{t} - q^{b}_{t}\, p^{b}_{t} - c^{v}_{t} - c^{f}_{t} \right),$$

where $q^{s}_{t}$ is the quantity sold in the $t$-th period within the simulation horizon, $p^{s}_{t}$ is the sale price of the $t$-th period for the only instrument traded in the simulation, $q^{b}_{t}$ is the quantity purchased in the $t$-th period within the simulation horizon, $p^{b}_{t}$ is the purchase price of the $t$-th period for the only instrument traded in the simulation, $c^{v}_{t}$ are the variable costs of the $t$-th period required for transacting, and $c^{f}_{t}$ are the fixed costs of the $t$-th period required for transacting.
This objective function is the calculation of the net returns for a time span of $n$ equal and consecutive periods. These periods can be configured according to the granularity of the available market data.
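As a worked illustration of the net-return objective, the following Python sketch sums sales revenue minus purchase cost and transaction costs over consecutive periods. The `Period` record, its field names, and the sample figures are hypothetical and serve only to show how the terms combine.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Period:
    qty_sold: float
    sell_price: float
    qty_bought: float
    buy_price: float
    variable_cost: float
    fixed_cost: float


def net_return(periods: Iterable[Period]) -> float:
    """Sales revenue minus purchases, variable costs, and fixed costs."""
    return sum(p.qty_sold * p.sell_price
               - p.qty_bought * p.buy_price
               - p.variable_cost
               - p.fixed_cost
               for p in periods)


# Hypothetical simulation: buy 100 shares at 10.0, then sell them at 11.0.
simulation = [
    Period(qty_sold=0, sell_price=0.0, qty_bought=100, buy_price=10.0,
           variable_cost=5.0, fixed_cost=1.0),
    Period(qty_sold=100, sell_price=11.0, qty_bought=0, buy_price=0.0,
           variable_cost=5.5, fixed_cost=1.0),
]
objective_value = net_return(simulation)
```

The gross gain of 100 is reduced to 87.5 once the costs of both periods are subtracted, which shows why costs must be part of the objective when gains per transaction are small.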
In the case in which the two simulations obtain the same value of the objective function, the system passes to the next exclusion criterion, in which the benefit per operation is maximized. This can be interpreted as maximizing the profit obtained between a purchase and its subsequent sale. The application of this criterion is in many cases difficult to calculate since the simulation must replace orders that participated in real order executions, which does not always make the quantities tally. In this case, it may be difficult to recreate the series of sales operations that correspond to a previous purchase. Thus, when it is a tie, it is better to move to the third criterion of discrimination, which corresponds to computing and comparing Sharpe’s ratio or another valid financial ratio.
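The Sharpe ratio is defined as follows [1, 9]:

$$SR = \frac{E[R] - R_f}{\sigma[R]},$$

where

$$E[R] = \frac{1}{n}\sum_{t=1}^{n} r_t, \qquad \sigma[R] = \sqrt{\frac{1}{n-1}\sum_{t=1}^{n}\left(r_t - E[R]\right)^2}.$$

Finally, Sharpe’s ratio for AT/HFT processes is

$$SR = \frac{\frac{1}{n}\sum_{t=1}^{n} r_t - R_f}{\sqrt{\frac{1}{n-1}\sum_{t=1}^{n}\left(r_t - \frac{1}{n}\sum_{s=1}^{n} r_s\right)^2}},$$

where $r_t$ represents the net returns of the $t$-th period, $R_f$ is the risk-free benchmark rate (a constant representing the opportunity cost), and $n$ is the number of periods.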
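A small Python sketch of the Sharpe ratio computation follows. It uses the sample standard deviation from the standard library, which is an assumption about the normalization; the function name and the sample returns are likewise illustrative.

```python
import statistics
from typing import Sequence


def sharpe_ratio(returns: Sequence[float], risk_free: float = 0.0) -> float:
    """Mean excess return divided by the sample standard deviation."""
    if len(returns) < 2:
        raise ValueError("at least two periods are required")
    spread = statistics.stdev(returns)   # sample standard deviation (n - 1)
    if spread == 0:
        raise ValueError("returns have zero dispersion")
    return (statistics.mean(returns) - risk_free) / spread


# Hypothetical per-period net returns of two simulations with equal totals.
steady = [0.01, 0.03, 0.02]
volatile = [-0.04, 0.09, 0.01]
steady_ratio = sharpe_ratio(steady)
volatile_ratio = sharpe_ratio(volatile)
```

Both series add up to the same total return, so a comparison based only on net return would tie; the ratio favors the steadier series, which is the role this criterion plays when solutions are equally profitable.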
Having calculated the Sharpe ratio, the objective function appears as
With this objective function, the adaptive model can be generated by applying a PSO algorithm that exploits the best combination of the variables defined for the problem. This makes it possible to have a rapid and effective model that is adapted to the changing market state. In the case of a tie in the profitability of the solutions, the ratio chosen for the case can be applied.
4.4. Implementation of the System
A modular system that allows extensibility (PSO automatic trading method) was built. The PSO implementation modules and the automatic trading engine have been separated. Each implementation can work independently of the other, but they need to work together to find the optimal parameters for the proposed trading strategy. The PSO module consists of the central implementation of metaheuristics but does not include the elements of a particular problem (Figure 1).
In this way, the module consists of an optimizer that requires three interfaces for its operation (Figure 1). The 3 interfaces are responsible for the following tasks:
SwarmConfigurator: Implementations of this interface must deliver a swarm composed of particles that extend the abstract Particle class. Thus, the swarm configurator must create the initial particle configuration for a particular problem.
ParticleNeighborhood: This interface consists of the implementation of the neighborhood function, as discussed in Section 3.2.1. It has the function of determining which is the best particle within the neighbors of a given particle so that the velocity and position calculations of the particles of the swarm can be executed.
StopCriteriaEvaluator: The optimizer requires that the stop mechanism of the algorithm be indicated. Implementations of this interface must determine at the end of each optimization cycle whether execution can continue. As a basis for determining this, it is given a series of relevant data such as the number of iterations performed and the complete state of the swarm.
To adapt the optimizer to the particular problem that is to be optimized, the abstract implementation of the particle must be extended. The SwarmConfigurator class is responsible for instantiating the required implementation and the implementations of the annexed interfaces. These interfaces consist of the following:
Particle, the abstract class of a particle, contains the particle’s position and velocity function as interfaces. It stores the best local solution found for velocity calculation purposes. The subclasses that extend it must implement a method that generates the value of the objective function together with the implementation of a method that can be compared against another particle by the value of its objective function to determine which has a better value. For practical purposes, the criterion is used that one particle is better than another if it has a higher value for the objective function.
Position is the interface representing the position of a particle that corresponds to one of the solutions to the problem. The classes that implement it must be able to calculate the distance to another position to create different neighborhood topologies. They must also accept an implementation of the Velocity interface and apply it to their current values, generating a new position.
Velocity is the interface that represents the velocity function of a particle. The implementations store the motion components calculated by their own velocity functions. These motion components are then applied to a Position implementation by a particle.
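The relationship among the three abstractions can be illustrated with a minimal sketch. It is written in Python for brevity and is not the system's actual code: the method names (`distance_to`, `apply`, `objective`, `is_better_than`) and the one-dimensional toy classes are assumptions introduced only for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


class Velocity(ABC):
    """Marker base class: the motion components applied to a Position."""


class Position(ABC):
    """One candidate solution to the problem."""

    @abstractmethod
    def distance_to(self, other: Position) -> float:
        """Distance to another position (used by neighborhood topologies)."""

    @abstractmethod
    def apply(self, velocity: Velocity) -> Position:
        """Return the new position obtained by applying a velocity."""


class Particle(ABC):
    def __init__(self, position: Position, velocity: Velocity) -> None:
        self.position = position
        self.velocity = velocity
        self.best_position = position  # best local solution found so far

    @abstractmethod
    def objective(self) -> float:
        """Value of the objective function at the current position."""

    def is_better_than(self, other: Particle) -> bool:
        # Criterion from the text: a higher objective value is better.
        return self.objective() > other.objective()


# --- Toy one-dimensional implementation (hypothetical) ---
@dataclass(frozen=True)
class ScalarVelocity(Velocity):
    dx: float


@dataclass(frozen=True)
class ScalarPosition(Position):
    x: float

    def distance_to(self, other: ScalarPosition) -> float:
        return abs(self.x - other.x)

    def apply(self, velocity: ScalarVelocity) -> ScalarPosition:
        return ScalarPosition(self.x + velocity.dx)


class ParabolaParticle(Particle):
    def objective(self) -> float:
        # Toy objective with its maximum at x = 3.
        return -((self.position.x - 3.0) ** 2)
```

In this sketch the optimizer only ever sees `Particle`, `Position`, and `Velocity`, which is what keeps the metaheuristic independent of the trading problem.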
In the first instance, at least one implementation of the interfaces and abstract classes presented was performed to solve the automatic trading problem.
Figure 2 shows the implementations of the neighborhood interfaces and the stop criterion. The implementations of the swarm configurator, particle, velocity function, and position representation occur within the automatic trader; this is discussed in the next section. The details of each implementation are as follows:
GBestParticleNeighborhood is a global neighborhood function in which the best particle among the entire swarm set is sought. It is the simplest and fastest since it requires only finding the particle that maximizes the objective function for an iteration.
LBestParticleNeighborhood is a local neighborhood function in which the two particles closest to a given particle are searched and the best of the three particles is chosen. This is slower because it requires performing distance calculations between all the particles to find the particles that are closest to each other.
BasicStopCriteriaEvaluator is a stop criterion that is based on the number of iterations performed. When the designated number of iterations has been reached, the PSO algorithm stops, and the best solution found, an approximation to the global optimum, is obtained.
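The three implementations above can be sketched as follows. This is an illustrative Python approximation under stated assumptions: particles are reduced to a one-dimensional position and an objective value, and the names `P`, `gbest`, `lbest`, and `BasicStopCriteria` are hypothetical stand-ins for the classes described in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class P:
    x: float      # position (one-dimensional for brevity)
    value: float  # objective value; higher is better


def gbest(swarm: List[P], particle: P) -> P:
    """Global neighborhood: the best particle of the entire swarm."""
    return max(swarm, key=lambda p: p.value)


def lbest(swarm: List[P], particle: P) -> P:
    """Local neighborhood: best of the particle and its two nearest neighbors."""
    others = [p for p in swarm if p is not particle]
    nearest = sorted(others, key=lambda p: abs(p.x - particle.x))[:2]
    return max([particle, *nearest], key=lambda p: p.value)


class BasicStopCriteria:
    """Stop once a fixed number of iterations has been performed."""

    def __init__(self, max_iterations: int) -> None:
        self.max_iterations = max_iterations

    def should_stop(self, iterations_done: int, swarm: List[P]) -> bool:
        # The swarm state is available but unused by this basic criterion.
        return iterations_done >= self.max_iterations
```

The sorting step in `lbest` is what makes the local topology slower: it needs the distance from the given particle to every other particle, whereas `gbest` is a single pass over the swarm.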
4.5. Implementation of the Automatic Trader
For the implementation of the automatic trading engine, there is a central module that performs the necessary coordination to process the information related to a financial instrument through annexed modules that are specialized to perform specific tasks. The central module corresponds to an abstract class of automatic trading logic that can be generalized to any type of stock market (equities, fixed income, etc.); it uses a series of interfaces to access specialized modules in the target market. The central module operates at regular intervals (ticks), and it evaluates its internal trading algorithm during each run of the interval. If a buy/sell signal is generated, it proceeds to use the corresponding modules to perform the operation. The tick value determines whether the system behaves as an HFT system or as an AT system.
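A minimal sketch of such a tick-driven central module is given below. It is only an approximation of the design described: the class and method names are hypothetical, and the threshold rule in `evaluate` is a placeholder chosen for brevity, not the trading algorithm used by the system.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple


class AutomaticTrader(ABC):
    """Market-independent trading loop; subclasses supply market access."""

    def __init__(self, buy_below: float, sell_above: float) -> None:
        self.buy_below = buy_below
        self.sell_above = sell_above
        self.holding = False

    @abstractmethod
    def read_price(self, tick: int) -> Optional[float]:
        """Latest price for this tick, or None if no data is available."""

    @abstractmethod
    def send_order(self, side: str, price: float) -> None:
        """Execute (or simulate) a buy/sell order."""

    def evaluate(self, price: float) -> Optional[str]:
        # Placeholder rule (assumption): simple price thresholds.
        if not self.holding and price < self.buy_below:
            return "BUY"
        if self.holding and price > self.sell_above:
            return "SELL"
        return None

    def run(self, ticks: int) -> None:
        # One evaluation per tick; the tick length decides HFT vs. AT behavior.
        for tick in range(ticks):
            price = self.read_price(tick)
            if price is None:
                continue
            signal = self.evaluate(price)
            if signal is not None:
                self.send_order(signal, price)
                self.holding = signal == "BUY"


class SimulatedTrader(AutomaticTrader):
    """Market simulation trader that replays a list of historical prices."""

    def __init__(self, prices: List[float], buy_below: float, sell_above: float) -> None:
        super().__init__(buy_below, sell_above)
        self.prices = prices
        self.orders: List[Tuple[str, float]] = []

    def read_price(self, tick: int) -> Optional[float]:
        return self.prices[tick] if tick < len(self.prices) else None

    def send_order(self, side: str, price: float) -> None:
        self.orders.append((side, price))
```

A trader that communicates with the real market would extend the same base class and replace only `read_price` and `send_order`.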
For its correct functioning, the central class of the system must be extended to generate the necessary functionality for a specific type of trader. At least two types of traders are required: one market simulation trader and one trader that communicates with the real market.
For the initial version of the AT system, the implementations of the interfaces for the simulation engine required by the PSO algorithm were created. This implementation simulates an extended period of the market using data loaded into an MS SQL Server database. In this way, the maximum profitability that a set of parameters of the proposed trading model generates for the selected period can be calculated.
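To make the role of the simulation concrete, the following sketch computes the net return that one parameter set produces over a historical price series; this is the value a particle would report as its objective. The trading rules here are assumptions made for illustration (enter when the short MA is above the long MA, exit when the price moves by the Stop-Win or Stop-Loss fraction relative to the entry price), and an in-memory list stands in for the database.

```python
from typing import Sequence


def sma(prices: Sequence[float], end: int, length: int) -> float:
    """Simple moving average of the `length` prices ending at index `end`."""
    window = prices[end - length + 1 : end + 1]
    return sum(window) / length


def simulate(prices: Sequence[float], ma_short: int, ma_long: int,
             stop_loss: float, stop_win: float, cash: float = 1000.0) -> float:
    """Net return of one parameter set over the price series (assumed rules)."""
    if not 1 <= ma_short <= ma_long:
        raise ValueError("expected 1 <= ma_short <= ma_long")
    start = cash
    shares = 0.0
    entry = 0.0
    for t in range(ma_long - 1, len(prices)):
        price = prices[t]
        if shares == 0.0:
            # Assumed entry rule: short MA above long MA signals an uptrend.
            if sma(prices, t, ma_short) > sma(prices, t, ma_long):
                shares, entry, cash = cash / price, price, 0.0
        else:
            # Assumed exit rule: leave the position at either band.
            change = (price - entry) / entry
            if change >= stop_win or change <= -stop_loss:
                cash, shares = shares * price, 0.0
    # Value any open position at the last known price.
    return cash + shares * prices[-1] - start
```

For example, `simulate([10, 10, 10, 11, 12, 13], 2, 3, 0.05, 0.1)` buys at 11 and sells at 13, whereas a flat series never triggers a purchase and returns zero.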
The generic PSO system and the AT engine are combined by a simple boot system that makes it possible to perform the laboratory tests with the data collected in Market Maker. The boot system is configured through a text file and a parameter that indicates which mnemonic (ticker symbol) is to be entered into the optimizer.
Based on the laboratory tests performed, a number of improvements were made to the implementation of the system, producing a performance-optimized version. The purpose of this is to ensure that the optimization process of the solution using PSO converges rapidly enough to be executed multiple times during a day of trading.
The improvements applied address the following problems:
MA Calculation: The initial version of the AT system invokes the MA calculation routine for each instant of system operation, independently for each particle. The point calculation is replaced by an incremental calculation based on the values of the previous and the new time period, covering all the instants required by the execution of a particle.
MA Reading: Continuing from the previous problem, if an MA with the same length had already been calculated for a given instant, it was nevertheless recalculated. This adds overhead to the Storage process, which must recompute the same value. To solve this, an in-memory cache is used that allows each value to be calculated only once but queried efficiently multiple times.
Market Execution Reading: Similar to the previous problem, this one stems from how another of the AT system modules is implemented. In particular, the problem is found in the market simulation routine present in OfflineCommunicationThread. Because this routine is based on the historical information of order executions, the relevant information must be loaded from a storage system (database). In the first implementation, each particle loads the same data from the database again for each iteration of PSO. This problem is solved using a shared cache of order executions that is used by all the particles in all their iterations.
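The three improvements can be sketched as follows. This is a hedged illustration rather than the system's code: `PRICES`, `rolling_sma`, `sma_series`, and `ExecutionCache` are hypothetical names, and a plain callable stands in for the database query.

```python
from functools import lru_cache
from typing import Callable, Dict, List, Sequence, Tuple


def rolling_sma(prices: Sequence[float], length: int) -> List[float]:
    """SMA for every full window, updating a running sum in O(1) per step."""
    if length <= 0 or len(prices) < length:
        return []
    window_sum = sum(prices[:length])
    out = [window_sum / length]
    for t in range(length, len(prices)):
        # Incremental update: add the new price, drop the oldest one.
        window_sum += prices[t] - prices[t - length]
        out.append(window_sum / length)
    return out


# Hypothetical price history shared by all particles.
PRICES: Tuple[float, ...] = (10.0, 11.0, 12.0, 13.0)


@lru_cache(maxsize=None)
def sma_series(length: int) -> Tuple[float, ...]:
    """Each MA length is computed once and then served from memory."""
    return tuple(rolling_sma(PRICES, length))


class ExecutionCache:
    """Shared cache of order executions, keyed by instrument and period."""

    def __init__(self, loader: Callable[[str, str], list]) -> None:
        self._loader = loader  # stand-in for the database query
        self._data: Dict[Tuple[str, str], list] = {}

    def get(self, instrument: str, period: str) -> list:
        key = (instrument, period)
        if key not in self._data:
            self._data[key] = self._loader(instrument, period)
        return self._data[key]
```

With a swarm of many particles evaluated over many iterations, both caches turn repeated computations and database reads into dictionary lookups, which is consistent with the reduction in execution time reported in the tests below.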
4.6. System Testing
4.6.1. Initial Testing Version
The first experiment with the initial version is used to determine whether the system performs properly and is capable of generating positive returns. The specific values used in the experiment are as follows:
Period: 4 months (January–April 2012)
Instrument: LAN
Tick: 5 minutes
The period January–April 2012 is chosen because in that period, LAN has both increases and decreases in the share price. If only a period with increases is chosen, the risk management mechanism offered by the stop-loss band cannot be tested. However, if only a period with decreases is chosen, the system will not perform the positioning (initial purchase), so it will remain inactive until a period of increase appears. The results of experiment 1 are as follows:
Duration: 3.44 hours (12,388,202 ms)
Net returns: >0
MA Short: 30
MA Long: 58
StopLoss: 4.0543
StopWin: 11.3177
The experiment indicates that the process consumes a large amount of time due to the number of iterations performed and the size of the swarm. The positive aspect of these results is that there are gains at the end of the process, showing that the chosen parameters can be used to configure a trader that operates within a period reasonably close to the period of optimization.
Reviewing the values of the Stop-Loss and Stop-Win bands reveals a problem. The values exceed the maxima defined in the model, since the standard PSO velocity formula is applied. This indicates that an adjustment to the implementation of the formula must be made before proceeding with the final experiments.
The purpose of the second experiment is to review behavior and execution time for a shorter period. The input values are as follows:
Period: 2 months (January–February 2012)
Instrument: LAN
Tick: 5 minutes
The experiment aborts in the middle of the process because negative values are generated for the positions of the particles. This emphasizes the fact that the standard implementation of the velocity function cannot be applied to the AT model, so it must be adapted.
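One common way to adapt the standard update to a bounded domain is to clamp both the velocity and the resulting position, as in the sketch below. The text does not state which adaptation was finally used, so this is only one possible option; the function name and the coefficient values `w`, `c1`, and `c2` are illustrative assumptions.

```python
import random
from typing import List, Optional, Sequence, Tuple


def bounded_step(position: Sequence[float], velocity: Sequence[float],
                 personal_best: Sequence[float], neighborhood_best: Sequence[float],
                 lower: Sequence[float], upper: Sequence[float],
                 w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                 rng: Optional[random.Random] = None) -> Tuple[List[float], List[float]]:
    """One PSO step with velocity and position clamped to [lower, upper]."""
    rng = rng or random.Random()
    new_position: List[float] = []
    new_velocity: List[float] = []
    for x, v, pb, nb, lo, hi in zip(position, velocity, personal_best,
                                    neighborhood_best, lower, upper):
        # Standard PSO velocity formula.
        v_new = w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (nb - x)
        # Clamp the velocity to the width of the variable's domain.
        v_max = hi - lo
        v_new = max(-v_max, min(v_max, v_new))
        # Clamp the position so that it never leaves the domain
        # (for example, it can no longer become negative).
        x_new = max(lo, min(hi, x + v_new))
        new_velocity.append(v_new)
        new_position.append(x_new)
    return new_position, new_velocity
```

With this kind of restriction, a band parameter with domain [0, 1] stays inside that interval regardless of how large the unclamped velocity would have been.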
4.6.2. Optimized Version Testing
As mentioned, an optimized version of the AT system was generated. In this version, corrections to errors detected in the initial version of the system were implemented. The above experiment is repeated to determine the level of improvement introduced into the system. Thus, the specific values used in experiment 3 are as follows:
Period: 4 months (January–April 2012)
Instrument: LAN
Tick: 5 minutes
The results of experiment 3 are as follows:
Duration: 1 minute 26 seconds (85,950 ms)
Net returns: $294,186 (Profitability 69.96%)
MA Short: 45
MA Long: 52
StopLoss: 0
StopWin: 0.0040370020874758985
The experiment is successful, demonstrating that the AT system works properly within a reasonable time and that it is also able to find the parameters that allow positive profitability to be obtained for the analyzed period. Given this condition of increase, the PSO algorithm determined that under the conditions of the determined MA, it is more convenient to perform a large number of buy/sell operations, reflected by a Stop-Win band very close to zero. In addition, the algorithm determined that it is more advisable to use a zero risk to reduce losses. (See Table 2)
Finally, the experiment is executed 20 times to determine the best and worst times, together with the best net theoretical return. The best execution time obtained is 75,598 ms, and the worst is 102,842 ms; both values are well below the times obtained using the initial version of the system. The average execution time is 84,259.4 ms. The profitability of the best particles at the end of each PSO run fluctuates around an average of 438,870.6 CLP, with one particular execution, using a specific combination of parameters, achieving a return of 138% on the initial investment.
The experiment is repeated by varying the tick size. This allows the behavior of the AT system to be viewed in a manner that better approximates HFT. The specific values used in experiment 4 are as follows:
Period: 4 months (January–April 2012)
Instrument: LAN
Tick: 1 minute
In reviewing the results of the experiment shown in Table 3, the increase in execution time (to an average of 377,079 ms, equivalent to 6 minutes and 17 seconds) and the reduction of theoretical net return are notable. These changes are mainly caused by an increase in the Stop-Win band, which is the parameter that allows gains to be generated during a period of price increase. For many of the best results, a Stop-Loss band greater than zero is also obtained, indicating that the AT system will accept some level of risk to generate profits. This behavior may seem unfavorable in a period of sustained price growth, but it may be advantageous when there is price variation over very short periods.
5. Conclusions and Future Work
In the present research, we studied trading technologies that make it possible to operate under an HFT and/or an AT modality. We chose the statistical technique of MA for its simplicity, its ability to predict price trends based on the history of an instrument, and its suitability for use with optimization techniques.
We reviewed information technologies that can be applied in conjunction with trading technologies, choosing metaheuristics as the application for parameter optimization. Metaheuristics was chosen because a problem of profitability optimization in an equity market is an NP-class problem for which the application of search methods based on metaheuristics presents many advantages.
In addition, a design is presented for building an AT system based on the combination of trading and information technologies chosen. The design is sufficiently flexible to allow the system to be extended to other trading technologies and/or search solutions.
In the investigation, an initial version of the AT system was constructed under the proposed design. This allowed the first laboratory tests to be performed. These tests detected problems both with respect to the implementation of the AT system and with respect to special conditions that the PSO algorithm was not prepared to support. In particular, this version served to determine that in continuous but restricted domains the computation of PSO velocities must be bounded or modified in some way.
Finally, a second AT system is built based on the initial version but correcting the errors detected in the implementation of the AT model and applying the necessary limitations to the PSO algorithm. On the basis of the tests performed, it can be concluded that the defined AT system is capable of generating positive returns.
In this way, the chosen system corresponds to the improved version. Although the improved version is far from optimal, it provides a theoretical and practical basis for future research in a field in which the greatest amount of research comes from the private sector and not from the academic sector.
Regarding the application of PSO as an optimization algorithm, it is an effective solution for this problem type since it is able to optimize a set of disparate variables that are bounded to a specific domain, thereby achieving a substantial improvement of the final solution. Regarding the application of PSO in optimizing the profitability of an AT system, it can be concluded that the velocity function must be altered or restricted depending on the trading model used.
How to obtain the optimal term of information prior to a given moment considered useful is a subject that remains to be studied in possible future work. One possible improvement would be to determine how changing the MA from simple to exponential would affect the optimal term. This would favor the recent trend of an instrument, ensuring that fluctuations that are too distant in time do not have undue importance in the model. Another line of future work would be the application of more complex AT models to the self-adjusting AT system, so that it includes decision mechanisms with better risk management or that operate on smaller profit margins.
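The difference between the two averages mentioned above can be illustrated with a short sketch. It assumes the conventional smoothing factor 2/(N+1) for the exponential MA; the function names are hypothetical and the code is not part of the system described.

```python
from typing import Sequence


def simple_ma(prices: Sequence[float], length: int) -> float:
    """Simple MA: every price in the window has the same weight."""
    return sum(prices[-length:]) / length


def exponential_ma(prices: Sequence[float], length: int) -> float:
    """Exponential MA: weights decay geometrically with the age of a price."""
    alpha = 2.0 / (length + 1)  # conventional smoothing factor (assumption)
    value = prices[0]
    for price in prices[1:]:
        value = alpha * price + (1.0 - alpha) * value
    return value
```

For the series 10, 10, 10, 10, 20 with a length of 5, the simple MA is 12.0 while the exponential MA is about 13.33, showing that the exponential variant reacts more strongly to the most recent price.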
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Broderick Crawford was supported by Grant CONICYT/FONDECYT/REGULAR 1171243, and Ricardo Soto was supported by Grant CONICYT/FONDECYT/REGULAR 1160455.