Robustness and generality of Alpha Zero

## Hypothesis

The authors of the recently published Alpha Zero research state that the technique can be easily generalised to other problems without significant human effort, and that it performs better than other state-of-the-art alternatives *[1]*. They tested the system on three board games: **Go**, **Chess** and **Shogi**. We would like to evaluate it on one or more different games with similar state-space and game-tree complexities.

## Questions

- How well does the Alpha Zero algorithm perform in games other than Go, Chess and Shogi?
- Can Alpha Zero perform better than the alternatives?
- Can an AI agent discover remarkable knowledge during its self-play training process with Alpha Zero?
- After comparing the performance of multiple algorithms, can we design a better one (in the future)?

## Methods

- Deep Reinforcement Learning + Monte Carlo Tree Search (Alpha Zero)
- Alpha-beta Pruning
- Adaptive Dynamic Programming
- Hand-coded rules

This list may be revised in later stages of the research.
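As a point of reference for the alpha-beta baseline listed above, here is a minimal sketch of the pruning idea on a toy game tree. The nested-list tree representation and the function name `alphabeta` are assumptions made for illustration only; a real Gomoku or Abalone agent would generate moves from a board state and use a heuristic evaluation at a depth cut-off.

```python
import math
from typing import List, Union

# Hypothetical toy representation: a leaf is a number (score from the
# maximising player's point of view), an inner node is a list of children.
Tree = Union[float, List["Tree"]]


def alphabeta(node: Tree, alpha: float = -math.inf, beta: float = math.inf,
              maximising: bool = True) -> float:
    """Return the minimax value of `node`, skipping branches that cannot matter."""
    if not isinstance(node, list):  # leaf: use its static score
        return float(node)
    if maximising:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimiser already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:  # the maximiser already has a better option elsewhere
            break
    return best


toy_tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(toy_tree))  # 3.0
```

In this toy tree the leaf `9` is never visited: once the second subtree yields `2`, it cannot beat the `3` already guaranteed by the first subtree.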

## Candidate Games

We would like to find answers to our research questions using at least one of these board games:

#### Gomoku

> **Gomoku**, also called *Gobang* or *Five in a Row*, is an [abstract strategy](https://en.wikipedia.org/wiki/Abstract_strategy) [board game](https://en.wikipedia.org/wiki/Board_game). It is traditionally played with [Go](https://en.wikipedia.org/wiki/Go_(game)) pieces (black and white stones) on a Go board, using 15×15 of the 19×19 grid intersections. Because pieces are not moved or removed from the board, Gomoku may also be played as a [paper and pencil game](https://en.wikipedia.org/wiki/Paper_and_pencil_game). The game is known in several countries under different names.

-- [*Wikipedia*](https://en.wikipedia.org/wiki/Gomoku)

On a 15×15 board, Gomoku's **state-space complexity** is about 3<sup>225</sup>, while its **game-tree complexity** is about 10<sup>70</sup> *[3]*.
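The state-space figure can be read as a simple upper bound: each of the 225 intersections of a 15×15 board is empty, black or white, which gives 3 raised to the power 225 board configurations. The sketch below does that arithmetic; the function name is hypothetical, and the bound also counts positions that can never occur in a real game, so the number of reachable positions is lower.

```python
import math


def state_space_upper_bound_log10(board_size: int, states_per_cell: int = 3) -> float:
    """Return log10 of states_per_cell ** (board_size ** 2)."""
    cells = board_size * board_size
    return cells * math.log10(states_per_cell)


# 15x15 Gomoku board: 225 cells, 3 states each
print(round(state_space_upper_bound_log10(15), 1))  # 107.4
```

In other words, 3 to the power 225 is roughly 10 to the power 107.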

#### Abalone

> Abalone is an [award-winning](https://en.wikipedia.org/wiki/List_of_Mensa_Select_recipients#1990_New_York,_NY) two-player [abstract strategy](https://en.wikipedia.org/wiki/Abstract_strategy_game) [board game](https://en.wikipedia.org/wiki/Board_game) designed by Michel Lalet and Laurent Lévi in 1987. Players are represented by opposing black and white marbles on a hexagonal board with the objective of pushing six of the opponent's marbles off the edge of the board.

-- [*Wikipedia*](https://en.wikipedia.org/wiki/Abalone_(board_game))

Abalone's **state-space complexity** is about 6.5 × 10<sup>23</sup>, with an average **branching factor** of 60 and a **game-tree complexity** of about 5 × 10<sup>154</sup> *[4]*.
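Game-tree complexity is commonly estimated as the average branching factor raised to the power of the average game length in plies. Assuming that estimate is what lies behind the figures above (a branching factor of 60 and a game-tree complexity of 5 × 10 to the power 154), the implied average game length can be recovered as follows. The helper name is hypothetical.

```python
import math


def implied_game_length(log10_tree_complexity: float, branching_factor: float) -> float:
    """Solve branching_factor ** d = 10 ** log10_tree_complexity for d."""
    return log10_tree_complexity / math.log10(branching_factor)


abalone_log10 = math.log10(5) + 154  # log10(5 * 10**154)
print(round(implied_game_length(abalone_log10, 60)))  # 87
```

Under this assumption, an average Abalone game lasts about 87 plies.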



### Related Work:

1. Silver D. et al. (2017). *Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm*. Retrieved from [https://arxiv.org/abs/1712.01815](https://arxiv.org/abs/1712.01815).
2. Camacho Collados J. (2017, 11 December). *Is AlphaZero really a scientific breakthrough in AI?* Retrieved 21 March from [https://medium.com/@josecamachocollados/is-alphazero-really-a-scientific-breakthrough-in-ai-bf66ae1c84f2](https://medium.com/@josecamachocollados/is-alphazero-really-a-scientific-breakthrough-in-ai-bf66ae1c84f2).
3. Loos A. (2012). *Machine Learning for k-in-a-row Type Games Using Random Forest and Genetic Algorithm*. Retrieved from [https://comserv.cs.ut.ee/home/files/thesis_final_mod.pdf?study=ATILoputoo&reference=5D52AF13A55F51ADB1F03E3C1EEAF628BA1BC580](https://comserv.cs.ut.ee/home/files/thesis_final_mod.pdf?study=ATILoputoo&reference=5D52AF13A55F51ADB1F03E3C1EEAF628BA1BC580)
4. Lemmens N. (2005). *Constructing an Abalone Game-Playing Agent*. Retrieved from [https://project.dke.maastrichtuniversity.nl/games/files/bsc/Lemmens_BSc-paper.pdf](https://project.dke.maastrichtuniversity.nl/games/files/bsc/Lemmens_BSc-paper.pdf)
5. Zhao, D., Zhang, Z., & Dai, Y. (2012). *Self-teaching adaptive dynamic programming for Gomoku*. Neurocomputing, 78(1), 23-29.
6. Shao, K., Zhao, D., Tang, Z., & Zhu, Y. (2016, November). *Move prediction in Gomoku using deep learning*. In Youth Academic Annual Conference of Chinese Association of Automation (YAC) (pp. 292-297). IEEE.
7. Tan, Q., & Hu, X. *Gomoku Game Agent*. CS221 Project Final Report.
8. Aichholzer, O., Aurenhammer, F., & Werner, T. *Algorithmic Fun - Abalone*. Retrieved from [http://www.ist.tugraz.at/aichholzer/research/rp/abalone/tele1-02_aich-abalone.pdf](http://www.ist.tugraz.at/aichholzer/research/rp/abalone/tele1-02_aich-abalone.pdf).
