Research activities...

Computer Go (1991-2007): I was strongly involved in Go programming. Go is very complex and was an obstacle for AI for a long time [Bouzy & Cazenave 2001]. I developed Indigo, and I have some publications related to this work including pre Monte-Carlo experiments such as [Bouzy 2005a] or [Bouzy & Helmstetter 2003]. After the success of Monte-Carlo Tree Search (MCTS) - Big MC seminar - in which I contributed with Guillaume Chaslot [Chaslot & al 2008] and others, all go playing programs were the same, and the field was consequently less attractive than before. Once the key is uncovered, everybody use it, and the landscape becomes uniform... However, I still follow the improvements made in this challenging domain... In october 2015, AlphaGo, a software developed by DeepMind at Google, won a series of 5 games against Fan Hui a professional player (2 dan pro) on 19x19 without handicap: 5-0! On january 28th 2016, the work is described in Nature. AlphaGo combines MCTS with Deep Learning used for performing intelligent simulations, and evaluating the positions before the end, all the stuff being distributed on a set of 1000 computers.

Multi-agent learning (2008-2011): leaving computer go, I was interested in multi-agent learning. With Marc Métivier, I made some work using the pursuit evasion game as testbed [Bouzy & Métivier 2008]. In addition, I was interested in assessing the state-of-the-art decision makers - belonging to the field of multi-agent learning or not - and I used matrix games to do so. Matrix games are simple but they capture the essence of multi-agent learning. [Bouzy & Métivier 2010]. Hedging algorithms can improve the level obtained by experts algorithms. [Bouzy & al. 2011a].

Computer Amazons (2005-2010): With Julien Kloetzer, I have been working on Amazons programs [Kloetzer & al 2009].

Voronoi game (2010-2011): After the success of MCTS, I looked for a testbed in which there is an infinite set of actions. The Voronoi game is played on a square of the plan and an action is represented by a point on the plan. Domain-dependent knowledge inserted within the MCTS program is crucial. ( slides and [Bouzy & al. 2011b] ).

Cooperative path-finding (CPF) and Monte-Carlo "Fork" Search (MCFS) (2012-2013): CPF is a crucial sub-problem in video games. Due to the very high branching factor, standard MCTS does not work on problems with many agents. I designed a variant of MCTS, called MCFS, that solves such CPF problems. The principle is to build a tree by "forking" new sequences along the best sequences found so far, and not generating the whole set of legal moves, which is a too complex problem. To choose the next "fork", UCB can be used, not horizontally between siblings like in MCTS, but vertically between the nodes of a same sequence. ( slides and [Bouzy 2013] ). Future work: apply MCFS to any other single-agent planning problem with a large branching factor.

Hex (2013): Hex rules are simpler than Go rules. Hex is a good exercise to make MCTS and its enhancements work. I developed Hexquis, a Hex playing program. Future work: publish this work... or it will perish!

Weak Schur numbers (2014-2015): The weak Schur problem with K columns consists in placing consecutive numbers from 1 up to N into K columns, such that no number in a given column is the sum of two other distinct numbers of the same column. WS(K) is the maximal number respecting this condition. I developed an abstract procedure and a Monte-Carlo program that address this problem. ( slides and [Bouzy 2015b] ).

Pancake problem (2015): A chef prepares a stack of pancakes that come out all different sizes on a plate. The goal of the server is to order them with decreasing sizes, the largest pancake touching the plate, and the smallest pancake being at the top. The server can insert a spatula below a pancake and flip the substack situated above the spatula. He may repeat this action as many times as necessary. In the particular version, the goal of the server is to sort a particular stack with a minimum number of flips. In the global version, the question is to determine the maximum number of flips f(N) - the diameter - to sort any stack of N pancakes. ( slides and [Bouzy 2015a] ).

Burnt Pancake problem (2015-2016): Analogeous to the pancake problem but the pancakes have a burnt side which must be oriented downward at the end. ( slides and [Bouzy 2016a] and [Bouzy 2016c] ).

Cooperative path-finding without Hole (CPFwH) (2015-2016): CPF without Hole is challenging in that the joint actions are made up with cycles of elementary actions. Here, we conduct experiments using IDA* for optimally solving CPFwH on small rectangular boards. ( slides and [Bouzy 2016b] ).

Hanabi (2016-2017): Hanabi is a cooperative card game. The particularity is that a player sees the cards of his partners but not his own cards. The goal of the team is to achieve a perfect score of 25 by putting cards in increasing order into five fireworks, i.e. stacks of cards. At his turn, a player may either play a card, disgard a card or inform another player about his hand. In this game, the hat convention enables the players to play near-optimally. We made some experiments about several kinds of artificial players, including hat players and depth-one players using the hat convention ( slides and [Bouzy 2017] ). To some extent, the hat convention kills computer Hanabi. Then, our current challenge is building shallow neural networks learning to play by reinforcement ( slides ).

Miscellaneous (2017-): I have made some attempts in the following specific problems: Breakthrough and Chinese Checkers (related to Competitive Path-Finding), Dobble, Stable Marriages, Morpion Solitaire (all related to Monte-Carlo planning). Overall, I am interested in AI methods such as Tree Search, Reinforcement Learning, Planning, Monte-Carlo simulations, and including skills such as perceiving, acting, learning, planning, competing, cooperating, guessing, and I like to test them on games and problems as complex as possible.

Last modification 31 October 2017.