Advances in Artificial Neural Systems

Volume 2015 (2015), Article ID 157983, 8 pages

http://dx.doi.org/10.1155/2015/157983

## Generalisation over Details: The Unsuitability of Supervised Backpropagation Networks for Tetris

School of Engineering and ICT, University of Tasmania, Private Bag 87, Sandy Bay, TAS 7001, Australia

Received 19 January 2015; Accepted 1 April 2015

Academic Editor: Matt Aitkenhead

Copyright © 2015 Ian J. Lewis and Sebastian L. Beswick. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We demonstrate the unsuitability of Artificial Neural Networks (ANNs) for the game of Tetris and show that their great strength, namely, their ability to generalise, is the ultimate cause. This work describes a variety of attempts at applying the Supervised Learning approach to Tetris and demonstrates that these approaches resoundingly fail to reach the level of performance of hand-crafted Tetris-solving algorithms. We examine the reasons behind this failure and also demonstrate some interesting auxiliary results. We show that training a separate network for each Tetris piece tends to outperform training a single network for all pieces, that training with randomly generated rows tends to increase the performance of the networks, and that networks trained on smaller board widths and then extended to play on bigger boards fail to show any evidence of learning. Finally, we demonstrate that ANNs trained via Supervised Learning are ultimately ill-suited to Tetris.

#### 1. Introduction

Tetris, created by Pajitnov for the *Elektronika-60* machine in 1984 [1], is one of the most continuously popular video games of all time [2]. While many versions of the game have incorporated hand-crafted AI opponents, research has also been performed into applying biologically inspired methods such as Artificial Neural Networks (ANNs) to the problem.

ANNs, as first developed by McCulloch and Pitts [3], are networks of neurons [4], called nodes, each of which fires when the sum of its inputs exceeds a certain threshold value. The simplest type of ANN is the single-layer feedforward network (perceptron network), in which every input is directly connected to every output.

It was realised that perceptron networks cannot be used to solve complex problems [3, 4]; however, this limitation can be overcome by adding extra layers of neurons to the network and connecting every neuron in each layer to every neuron in the next layer, creating a multilayer feedforward ANN.
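As a concrete illustration of this limitation, the minimal Python sketch below uses XOR, a standard example of a function that a single threshold unit cannot compute. The function names (`fires`, `xor_two_layer`, `single_unit_can_compute_xor`), the hand-picked weights, and the coarse search grid are all assumptions made for illustration and do not come from the networks studied in this paper. The grid search is a demonstration, not a proof.

```python
from itertools import product

def fires(inputs, weights, threshold):
    """Threshold unit: return 1 when the weighted input sum exceeds the threshold."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) > threshold else 0

def xor_two_layer(a, b):
    """XOR built from a hidden layer of two threshold units (hand-picked weights)."""
    h_or = fires((a, b), (1, 1), 0.5)    # fires if at least one input is on
    h_and = fires((a, b), (1, 1), 1.5)   # fires only if both inputs are on
    return fires((h_or, h_and), (1, -1), 0.5)  # "OR but not AND"

def single_unit_can_compute_xor(grid):
    """Search a grid of weights and thresholds for one unit that computes XOR."""
    cases = list(product((0, 1), repeat=2))
    for w1, w2, t in product(grid, repeat=3):
        if all(fires((a, b), (w1, w2), t) == (a ^ b) for a, b in cases):
            return True
    return False

# Illustrative grid of candidate values from -3.0 to 3.0 in steps of 0.5.
grid = [i / 2 for i in range(-6, 7)]
xor_table = [xor_two_layer(a, b) for a, b in product((0, 1), repeat=2)]
found_single_unit = single_unit_can_compute_xor(grid)
```

Here `xor_table` holds the two-layer outputs for the inputs (0,0), (0,1), (1,0), (1,1), while `found_single_unit` reports whether any single unit on the grid matched XOR.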

Backpropagation, first developed by Bryson and Ho [5], is the most common learning method for multilayer feedforward ANNs [4]. Backpropagation networks differ from perceptron networks in that, after the output is assessed, the net error is calculated and individual neurons are rewarded or punished depending on how much they contributed to the error. This procedure is performed after every epoch and is a training technique known as Supervised Learning (SL). ANNs have been used to successfully solve problems that are trivial to humans but typically difficult to approach algorithmically, such as classification [6].
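The update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the network size (two inputs, three hidden nodes, one output), the sigmoid activation, the learning rate, the XOR training set, and all function names are hypothetical choices. Weight changes are accumulated over the whole training set and applied once per epoch.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Return (hidden activations, output); the last weight in each row is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)

def train_epoch(data, w_hidden, w_out, rate):
    """One epoch: accumulate each weight's share of the error, then update in place."""
    grad_hidden = [[0.0] * len(row) for row in w_hidden]
    grad_out = [0.0] * len(w_out)
    for x, target in data:
        hidden, y = forward(x, w_hidden, w_out)
        delta_out = (y - target) * y * (1.0 - y)          # error signal at the output
        for j, h in enumerate(hidden + [1.0]):
            grad_out[j] += delta_out * h
        for j, h in enumerate(hidden):
            # Each hidden node is blamed in proportion to its outgoing weight.
            delta_h = delta_out * w_out[j] * h * (1.0 - h)
            for i, v in enumerate(x + [1.0]):
                grad_hidden[j][i] += delta_h * v
    for j in range(len(w_out)):
        w_out[j] -= rate * grad_out[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= rate * grad_hidden[j][i]

random.seed(0)
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(4)]

initial_error = mean_squared_error(XOR, w_hidden, w_out)
for _ in range(2000):
    train_epoch(XOR, w_hidden, w_out, rate=0.5)
final_error = mean_squared_error(XOR, w_hidden, w_out)
```

Comparing `initial_error` with `final_error` shows whether the epoch-wise updates reduced the error on the training set; the sketch makes no claim about how well such a network converges in general.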

Every Tetris game must terminate at some point; it is statistically impossible to continue playing for an infinite amount of time, as a nontessellatable alternating sequence of S and Z pieces is inevitable (however, as the piece sequence must in reality be generated pseudorandomly, it is extremely unlikely that this sequence will be generated in practice) [7]. Farias and Roy [8] state that it would be possible to use dynamic programming to find an optimal strategy for Tetris but that it would be computationally infeasible due to the large state space of the game.

Breukelaar et al. [9] proved that even if the entire piece sequence is known in advance, Tetris is NP-complete. Importantly, they suggest that it is computationally infeasible to compute the entire state space algorithmically, which justifies the use of machine learning approaches to approximate the optimal policy. To date, such research has been primarily focused on the Reinforcement Learning (RL) approach or on restricted versions of the game (see Section 2.2).

In this paper, we attempt to train an ANN using Supervised Learning to play Tetris successfully. A number of strategies are considered, including training separate networks for each piece, adding random rows to the training set, and training the networks on a board of reduced width.

#### 2. Background

The gameplay of Tetris consists of a random sequence of pieces that gradually fall down a gridded board at a rate of one square per game cycle. The pieces are the seven possible *Tetrominoes*: polyominoes that contain exactly four squares [10]. Standard Tetris uses the standard naming scheme for these pieces, as shown in Figure 1.
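The count of seven can be checked by enumeration. The Python sketch below grows connected shapes one square at a time and treats two shapes as the same piece when one is a rotation of the other (mirror images, such as S and Z, remain distinct because a Tetris piece cannot be flipped). The representation of a piece as a set of `(row, column)` cells and all function names are assumptions chosen for illustration.

```python
def normalise(cells):
    """Shift a shape so that its smallest row and column are both zero."""
    min_r = min(r for r, _ in cells)
    min_c = min(c for _, c in cells)
    return frozenset((r - min_r, c - min_c) for r, c in cells)

def rotations(cells):
    """Return the four quarter-turn rotations of a shape (duplicates possible)."""
    shapes = []
    current = cells
    for _ in range(4):
        current = normalise(frozenset((c, -r) for r, c in current))
        shapes.append(current)
    return shapes

def canonical(cells):
    """Pick one fixed representative from a shape's rotations."""
    return min(rotations(cells), key=sorted)

def grow(shapes):
    """Add one edge-adjacent square to every shape in every possible way."""
    bigger = set()
    for shape in shapes:
        for r, c in shape:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                cell = (r + dr, c + dc)
                if cell not in shape:
                    bigger.add(canonical(shape | {cell}))
    return bigger

# Start from a single square and grow three times to reach four squares.
shapes = {canonical(frozenset({(0, 0)}))}
for _ in range(3):
    shapes = grow(shapes)

tetrominoes = shapes
# Every distinct placement orientation across all pieces.
orientations = {rot for piece in tetrominoes for rot in rotations(piece)}
```

With this representation, `len(tetrominoes)` gives the number of distinct pieces and `len(orientations)` the total number of distinct piece orientations that a player (or network) must choose between.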