In [[computer science]], '''Monte-Carlo tree search''' ('''MCTS''') is a [[heuristic (computer science)|heuristic]] [[search algorithm]] for making decisions in some [[decision process]]es, most notably in game playing. The leading example of its use is in contemporary [[computer go]] programs.<ref>{{cite web|url=http://www.mcts.ai/?q=mcts|title=MCTS.ai: Everything Monte Carlo Tree Search|accessdate=2012-02-19}}</ref> MCTS is also used in programs that play other [[board game]]s (for example [[Hex (board game)|Hex]],<ref>{{cite journal|author=Broderick Arneson, Ryan Hayward, Philip Henderson|title=MoHex Wins Hex Tournament|journal=ICGA Journal|volume=32|issue=2|pages=114–116|date=June 2009|url=http://webdocs.cs.ualberta.ca/~hayward/papers/rptPamplona.pdf}}</ref> [[Havannah]],<ref>{{cite book|author=Timo Ewalds|title=Playing and Solving Havannah|publisher=Master's thesis, University of Alberta|year=2011|url=http://havannah.ewalds.ca/thesis.pdf}}</ref> [[Game of the Amazons]],<ref>{{cite book|author=Richard J. Lorentz|chapter=Amazons Discover Monte-Carlo|pages=13–24|others=H. Jaap van den Herik, Xinhe Xu, Zongmin Ma, Mark H. M. Winands (eds.)|title=Computers and Games, 6th International Conference, CG 2008, Beijing, China, September 29 – October 1, 2008. Proceedings|publisher=Springer|year=2008|isbn=978-3-540-87607-6}}</ref> and [[Arimaa]]<ref>{{cite book|author=Tomáš Kozelek|title=Methods of MCTS and the game Arimaa|publisher=Master's thesis, Charles University in Prague|year=2009|url=http://arimaa.com/arimaa/papers/TomasKozelekThesis/mt.pdf}}</ref>), real-time video games (for instance [[Ms. Pac-Man]]<ref>{{cite journal|author=Xiaocong Gan, Yun Bao, Zhangang Han|title=Real-Time Search Method in Nondeterministic Game – Ms. Pac-Man|pages=209–222|journal=ICGA Journal|volume=34|issue=4|date=December 2011}}</ref>), and nondeterministic games (such as [[skat (card game)|skat]],<ref>{{cite book|author=Michael Buro, Jeffrey Richard Long, Timothy Furtak, Nathan R. Sturtevant|chapter=Improving State Evaluation, Inference, and Search in Trick-Based Card Games|pages=1407–1413|others=Craig Boutilier (ed.)|title=IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11–17, 2009|year=2009|url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.150.3077|doi=10.1.1.150.3077}}</ref> [[poker]],<ref>{{cite journal|author=Jonathan Rubin, Ian Watson|title=Computer poker: A review|journal=Artificial Intelligence|volume=175|issue=5–6|date=April 2011|doi=10.1016/j.artint.2010.12.005|url=http://www.cs.auckland.ac.nz/~jrub001/files/CPReviewPreprintAIJ.pdf}}</ref> [[Magic: The Gathering]],<ref>{{cite book|author=C.D. Ward, P.I. Cowling|chapter=Monte Carlo Search Applied to Card Selection in Magic: The Gathering|title=CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games|publisher=IEEE Press|year=2009|url=http://www.ieee-cig.org/cig-2009/Proceedings/proceedings/papers/cig2009_002e.pdf}}</ref> or [[Settlers of Catan]]<ref>{{cite book|author=István Szita, Guillaume Chaslot, Pieter Spronck|chapter=Monte-Carlo Tree Search in Settlers of Catan|pages=21–32|others=H. Jaap van den Herik, Pieter Spronck (eds.)|title=Advances in Computer Games, 12th International Conference, ACG 2009, Pamplona, Spain, May 11–13, 2009. Revised Papers|publisher=Springer|year=2010|isbn=978-3-642-12992-6|url=http://www.ualberta.ca/~szita/papers/SzitaChaslotSpronck09Monte-Carlo.pdf}}</ref>).
== Principle of operation ==
MCTS concentrates on analysing the most promising moves, basing the expansion of the [[game tree]] on random sampling of the search space.

MCTS in games is based on many ''playouts''. In each playout, the game is played out to the very end by selecting moves at random. The final result of each playout is then used to weight the nodes in the game tree, so that better nodes are more likely to be chosen in future playouts.

The most basic way to use playouts is to play the same number of them after each legal move of the current player, and then choose the move after which the most playouts ended in the player's victory.<ref name="Bruegmann">{{cite book|last=Brügmann|first=Bernd|title=Monte Carlo Go|url=http://www.ideanest.com/vegos/MonteCarloGo.pdf|publisher=Technical report, Department of Physics, Syracuse University|year=1993}}</ref> The efficiency of this method – called ''Pure Monte Carlo Game Search'' – often increases over time, as more playouts are assigned to the moves after which previous playouts have more often led to the current player's victory. Full MCTS employs this principle recursively at many depths of the game tree. Each round of MCTS consists of four steps:<ref name="chaslot2008">{{cite journal|author=G.M.J.B. Chaslot, M.H.M. Winands, J.W.H.M. Uiterwijk, H.J. van den Herik, B. Bouzy|title=Progressive Strategies for Monte-Carlo Tree Search|journal=New Mathematics and Natural Computation|volume=4|issue=3|pages=343–359|year=2008|url=https://dke.maastrichtuniversity.nl/m.winands/documents/pMCTS.pdf|doi=10.1.1.106.3015}}</ref>
* ''Selection'': starting from the root <math>R</math>, select successive child nodes down to a leaf node <math>L</math>. The section on exploration and exploitation below describes a way of choosing child nodes that lets the game tree expand towards the most promising moves, which is the essence of MCTS.
* ''Expansion'': unless <math>L</math> ends the game, create zero or more child nodes of it and choose node <math>C</math> from among them; if no child was created, start the simulation from <math>L</math> instead.
* ''Simulation'': play a random playout from node <math>C</math>.
* ''Backpropagation'': use the result of the playout to update the information in the nodes on the path from <math>C</math> to <math>R</math>.
Sample steps of one round are shown in the figure below. Each tree node stores the number of won/played playouts.

[[File:MCTS (English).svg|frame|center|Steps of Monte-Carlo tree search]]
Such rounds are repeated as long as the time allotted to a move is not used up. Then one of the moves from the root of the tree is chosen, namely the move with the most simulations made, rather than the move with the highest average win rate.
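
The four steps translate into a short program. Below is a minimal illustrative sketch in Python of one MCTS round and of the final move choice; the game-state interface (<code>legal_moves()</code>, <code>play()</code>, <code>is_terminal()</code>, <code>winner()</code>, <code>player_to_move()</code>) is a hypothetical assumption for the example, not part of any particular program.

<syntaxhighlight lang="python">
import math
import random

class Node:
    """One node of the game tree, storing won/played playout counts."""
    def __init__(self, state, move=None, parent=None, player=None):
        self.state, self.move, self.parent = state, move, parent
        self.player = player                      # the player whose move led here
        self.children = []
        self.untried = list(state.legal_moves())  # assumed game-state interface
        self.wins = 0
        self.playouts = 0

def uct(node, c=math.sqrt(2)):
    """UCB1 applied to trees; see the section on exploration and exploitation."""
    return (node.wins / node.playouts
            + c * math.sqrt(math.log(node.parent.playouts) / node.playouts))

def mcts_round(root):
    # Selection: descend through fully expanded nodes by the UCT value.
    node = root
    while not node.untried and node.children:
        node = max(node.children, key=uct)
    # Expansion: unless the position ends the game, add one child node.
    if node.untried:
        move = node.untried.pop()
        mover = node.state.player_to_move()
        child = Node(node.state.play(move), move, node, mover)
        node.children.append(child)
        node = child
    # Simulation: select moves at random until the game is played out to the end.
    state = node.state
    while not state.is_terminal():
        state = state.play(random.choice(list(state.legal_moves())))
    winner = state.winner()                       # assumed: winning player, or None
    # Backpropagation: update the counts on the path back to the root.
    while node is not None:
        node.playouts += 1
        if winner == node.player:                 # draws count as losses here
            node.wins += 1
        node = node.parent

def best_move(state, rounds=10000):
    root = Node(state)
    for _ in range(rounds):
        mcts_round(root)
    # As described above: pick the most-simulated move, not the best win rate.
    return max(root.children, key=lambda n: n.playouts).move
</syntaxhighlight>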
== Pure Monte Carlo game search ==

This basic procedure can be applied to all games whose positions have finitely many moves and which only allow games of finite length. In a position, all feasible moves are determined; for each one, ''k'' random games are played out to the very end and the cumulative scores are recorded. The move with the best score is chosen. Ties are broken by fair coin flips. Pure Monte Carlo Game Search results in strong play in several games with random elements, for instance in [[EinStein würfelt nicht!]]. It converges to perfect play (as ''k'' tends to infinity) in board-filling games with random turn order, for instance in [[Hex (board game)|Hex]] with random turn order.
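
A minimal sketch of this procedure, reusing the hypothetical game-state interface assumed in the sketch above:

<syntaxhighlight lang="python">
import random

def pure_monte_carlo_move(state, k=100):
    """Play k random games after each feasible move; choose the best cumulative score."""
    player = state.player_to_move()              # assumed interface, as above
    best_moves, best_score = [], None
    for move in state.legal_moves():
        score = 0
        for _ in range(k):
            s = state.play(move)
            while not s.is_terminal():
                s = s.play(random.choice(list(s.legal_moves())))
            if s.winner() == player:
                score += 1
        if best_score is None or score > best_score:
            best_moves, best_score = [move], score
        elif score == best_score:
            best_moves.append(move)
    return random.choice(best_moves)             # ties broken by a fair coin flip
</syntaxhighlight>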
== Exploration and exploitation ==

The main difficulty in selecting child nodes is maintaining a balance between the ''exploitation'' of deep variants after moves with a high average win rate and the ''exploration'' of moves with few simulations. The first formula for balancing exploitation and exploration in games, called UCT (''Upper Confidence Bound 1 applied to trees''), was introduced by L. Kocsis and Cs. Szepesvári,<ref name="Kocsis-Szepesvari">{{cite book|last=Kocsis|first=Levente|last2=Szepesvári|first2=Csaba|chapter=Bandit based Monte-Carlo Planning|others=Johannes Fürnkranz, Tobias Scheffer, Myra Spiliopoulou (eds.)|title=Machine Learning: ECML 2006, 17th European Conference on Machine Learning, Berlin, Germany, September 18–22, 2006, Proceedings. Lecture Notes in Computer Science 4212|publisher=Springer|isbn=3-540-45375-X|url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296|pages=282–293|year=2006|doi=10.1.1.102.1296}}</ref> based on the UCB1 formula derived by Auer, Cesa-Bianchi, and Fischer.<ref>{{cite journal|last=Auer|first=Peter|last2=Cesa-Bianchi|first2=Nicolò|last3=Fischer|first3=Paul|title=Finite-time Analysis of the Multiarmed Bandit Problem|journal=Machine Learning|volume=47|pages=235–256|year=2002|url=http://moodle.technion.ac.il/pluginfile.php/192340/mod_resource/content/0/UCB.pdf}}</ref> Kocsis and Szepesvári recommend choosing in each node of the game tree the move for which the expression <math>\frac{w_i}{n_i} + c\sqrt{\frac{\ln t}{n_i}}</math> has the highest value. In this formula:

* <math>w_i</math> stands for the number of wins after the <math>i</math>th move;
* <math>n_i</math> stands for the number of simulations after the <math>i</math>th move;
* <math>c</math> is the exploration parameter – theoretically equal to <math>\sqrt{2}</math>, in practice usually chosen empirically;
* <math>t</math> stands for the total number of simulations in a given node, equal to the sum of all <math>n_i</math>.

The first component of the formula corresponds to exploitation; it is high for moves with a high average win ratio. The second component corresponds to exploration; it is high for moves with few simulations.
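
As a numeric illustration (with counts invented for the example), the formula can be evaluated directly; note how the exploration term lets a barely tried move overtake a well-tried one:

<syntaxhighlight lang="python">
import math

def uct_value(w_i, n_i, t, c=math.sqrt(2)):
    """Exploitation term plus exploration term, as in the formula above."""
    return w_i / n_i + c * math.sqrt(math.log(t) / n_i)

# Under a node with t = 12 simulations in total (invented counts):
print(uct_value(w_i=6, n_i=10, t=12))  # ~1.30: well explored, good win rate
print(uct_value(w_i=0, n_i=2, t=12))   # ~1.58: exploration prevails despite 0 wins
</syntaxhighlight>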
Most contemporary implementations of MCTS are based on some variant of UCT.
== Advantages and disadvantages ==

Although it has been proven that the evaluation of moves in MCTS converges to the [[minimax]] evaluation,<ref>{{cite book|last=Bouzy|first=Bruno|chapter=Old-fashioned Computer Go vs Monte-Carlo Go|title=IEEE Symposium on Computational Intelligence and Games, April 1–5, 2007, Hilton Hawaiian Village, Honolulu, Hawaii|url=http://ewh.ieee.org/cmte/cis/mtsc/ieeecis/tutorial2007/Bruno_Bouzy_2007.pdf}}</ref> the basic version of MCTS may need an enormous amount of time to do so. Besides this disadvantage (partially mitigated by the improvements described below), MCTS has some advantages compared to [[alpha-beta pruning]] and similar algorithms.

Unlike them, MCTS works without an explicit [[evaluation function]]. It is enough to implement the game mechanics, i.e. the generation of allowed moves in a given position and the game-end conditions. Thanks to this, MCTS can be applied in games without a developed theory, and even in [[general game playing]].

The game tree in MCTS grows asymmetrically: the method concentrates on searching its more promising parts. Thanks to this, it achieves better results than classical algorithms in games with a high [[branching factor]].

Moreover, MCTS can be interrupted at any time, yielding the move it currently considers the most promising.
== Improvements ==

Various modifications of the basic MCTS method have been proposed to shorten the time needed to find good moves. They can be divided into improvements based on expert knowledge and domain-independent improvements.

MCTS can use either ''light'' or ''heavy'' playouts. Light playouts consist of random moves, while in heavy playouts various heuristics influence the choice of moves. The heuristics can be based on the results of previous playouts (e.g. the Last Good Reply heuristic<ref>{{cite journal|last=Drake|first=Peter|title=The Last-Good-Reply Policy for Monte-Carlo Go|journal=ICGA Journal|volume=32|issue=4|pages=221–227|date=December 2009}}</ref>) or on expert knowledge of a given game. For instance, in many go-playing programs, certain stone patterns on a part of the board influence the probability of moving into that part.<ref name="Gelly-et-al">{{cite book|author=Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud|title=Modification of UCT with Patterns in Monte-Carlo Go|date=November 2006|publisher=Technical report, INRIA|url=http://hal.inria.fr/docs/00/11/72/66/PDF/MoGoReport.pdf}}</ref> Paradoxically, playing stronger in the simulations does not always make an MCTS program play stronger overall.<ref>{{cite book|author=Seth Pellegrino, Peter Drake|chapter=Investigating the Effects of Playout Strength in Monte-Carlo Go|pages=1015–1018|others=Hamid R. Arabnia, David de la Fuente, Elena B. Kozerenko, José Angel Olivas, Rui Chang, Peter M. LaMonica, Raymond A. Liuzzi, Ashu M. G. Solo (eds.)|title=Proceedings of the 2010 International Conference on Artificial Intelligence, ICAI 2010, July 12–15, 2010, Las Vegas Nevada, USA|publisher=CSREA Press|year=2010|isbn=1-60132-148-1}}</ref>
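
The contrast between the two kinds of playouts can be sketched as follows; <code>heuristic</code> stands for a hypothetical function returning a nonnegative score for a move in a given position (for example, derived from stone patterns in go):

<syntaxhighlight lang="python">
import random

def light_playout_move(state):
    """Light playout: a uniformly random legal move."""
    return random.choice(list(state.legal_moves()))  # assumed interface, as above

def heavy_playout_move(state, heuristic):
    """Heavy playout: pick a move with probability proportional to its score."""
    moves = list(state.legal_moves())
    weights = [heuristic(state, m) for m in moves]   # nonnegative scores assumed
    return random.choices(moves, weights=weights, k=1)[0]
</syntaxhighlight>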
[[File:Mogo-hane.svg|frame|center|Patterns of ''hane'' (surrounding opponent stones) used in playouts by the MoGo program. It is advantageous both for black and for white to put a stone on the middle square, except in the rightmost pattern, where it is advantageous for black only. Based on Gelly et al.<ref name="Gelly-et-al"/>]]
Domain-specific knowledge can also be used while building the game tree, to help the exploitation of some variants. One such method assigns nonzero ''priors'' to the numbers of won and played simulations when creating a child node. Such an artificially raised or lowered average win rate causes the node to be chosen more or less frequently, respectively, in the selection step.<ref name="Gelly-Silver">{{cite book|author=Sylvain Gelly, David Silver|chapter=Combining Online and Offline Knowledge in UCT|pages=273–280|others=Zoubin Ghahramani (ed.)|title=Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007), Corvallis, Oregon, USA, June 20–24, 2007|publisher=ACM|year=2007|isbn=978-1-59593-793-3|url=http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf}}</ref> A related method, called ''progressive bias'', consists in adding to the UCB1 formula a <math>\frac{b_i}{n_i}</math> term, where <math>b_i</math> is a heuristic score of the <math>i</math>th move.<ref name="chaslot2008"/>
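
Both ideas amount to small changes in the selection formula; the sketch below (with parameter names invented for the example) folds prior counts into the statistics and adds the progressive-bias term:

<syntaxhighlight lang="python">
import math

def biased_uct_value(w_i, n_i, t, b_i, prior_wins=0, prior_playouts=0, c=math.sqrt(2)):
    """UCB1 with prior counts folded in, plus the progressive-bias term b_i/n_i."""
    w = w_i + prior_wins      # artificially raised or lowered win statistics
    n = n_i + prior_playouts  # positive prior_playouts keeps n > 0 for new nodes
    return w / n + c * math.sqrt(math.log(t) / n) + b_i / max(n_i, 1)  # guard for n_i = 0
</syntaxhighlight>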
The basic MCTS collects enough information to find the most promising moves only after many rounds; until then, its moves are essentially random. This initial phase can be significantly shortened in a certain class of games thanks to RAVE (''Rapid Action Value Estimation'').<ref name="Gelly-Silver"/> The games in question are those where permutations of a sequence of moves lead to the same position, especially board games where a move consists of putting a piece or a stone on the board. In such games, the value of a move is often only slightly influenced by the moves played elsewhere.

In RAVE, for a given game tree node <math>N</math>, its child nodes <math>C_i</math> store not only the statistics of wins in playouts started in node <math>N</math>, but also the statistics of wins in all playouts started in node <math>N</math> and below it that contain move <math>i</math> (including moves played in the tree, between node <math>N</math> and a playout). This way, the contents of tree nodes are influenced not only by moves played immediately in a given position but also by the same moves played later.
[[File:Tic-tac-toe-RAVE-English.svg|frame|center|RAVE on the example of tic-tac-toe. In red nodes, the RAVE statistics will be updated after the b1-a2-b3 simulation.]]
When RAVE is used, the selection step chooses the node for which the modified UCB1 formula <math>(1-\beta(n_i, \tilde{n}_i))\frac{w_i}{n_i} + \beta(n_i, \tilde{n}_i)\frac{\tilde{w}_i}{\tilde{n}_i} + c\sqrt{\frac{\ln t}{n_i}}</math> has the highest value. In this formula, <math>\tilde{w}_i</math> and <math>\tilde{n}_i</math> stand for the number of won playouts containing move <math>i</math> and the number of all playouts containing move <math>i</math>, and the function <math>\beta(n_i, \tilde{n}_i)</math> should be close to one for relatively small <math>n_i</math> and <math>\tilde{n}_i</math>, and close to zero for relatively big ones. One of many formulas for <math>\beta(n_i, \tilde{n}_i)</math>, proposed by D. Silver,<ref>{{cite book|author=David Silver|title=Reinforcement Learning and Simulation-Based Search in Computer Go|publisher=PhD thesis, University of Alberta|year=2009|url=http://papersdb.cs.ualberta.ca/~papersdb/uploaded_files/1029/paper_thesis.pdf}}</ref> says that in balanced positions one can take <math>\beta(n_i, \tilde{n}_i)=\frac{\tilde{n}_i}{n_i+\tilde{n}_i+4b^2 n_i\tilde{n}_i}</math>, where <math>b</math> is an empirically chosen constant.
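
The modified formula translates into code directly; here <code>w_amaf</code> and <code>n_amaf</code> (names invented for the example) stand for <math>\tilde{w}_i</math> and <math>\tilde{n}_i</math>, and both simulation counts are assumed to be positive:

<syntaxhighlight lang="python">
import math

def rave_value(w_i, n_i, w_amaf, n_amaf, t, b=0.1, c=math.sqrt(2)):
    """Blend plain win statistics with RAVE statistics using Silver's beta formula."""
    beta = n_amaf / (n_i + n_amaf + 4 * b * b * n_i * n_amaf)  # b chosen empirically
    return ((1 - beta) * w_i / n_i
            + beta * w_amaf / n_amaf
            + c * math.sqrt(math.log(t) / n_i))
</syntaxhighlight>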
Heuristics used in MCTS often depend on many parameters. There are methods of tuning these parameters automatically to maximize the win rate.<ref>{{cite book|author=Rémi Coulom|chapter=CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning|title=ACG 2011: Advances in Computer Games 13 Conference, Tilburg, the Netherlands, November 20–22|url=http://remi.coulom.free.fr/CLOP/}}</ref>

MCTS can be concurrently executed by many [[thread (computing)|threads]] or [[process (computing)|processes]]. There are several fundamentally different methods of its [[parallel computing|parallel]] execution:<ref>{{cite book|author=Guillaume M.J-B. Chaslot, Mark H.M. Winands, H. Jaap van den Herik|chapter=Parallel Monte-Carlo Tree Search|pages=60–71|others=H. Jaap van den Herik, Xinhe Xu, Zongmin Ma, Mark H. M. Winands (eds.)|title=Computers and Games, 6th International Conference, CG 2008, Beijing, China, September 29 – October 1, 2008. Proceedings|publisher=Springer|year=2008|isbn=978-3-540-87607-6|url=https://dke.maastrichtuniversity.nl/m.winands/documents/multithreadedMCTS2.pdf}}</ref>
* ''Leaf parallelization'', i.e. parallel execution of many playouts from one leaf of the game tree.
* ''Root parallelization'', i.e. building independent game trees in parallel and making the move based on the root-level branches of all these trees (see the sketch after this list).
* ''Tree parallelization'', i.e. parallel building of the same game tree, protecting data from simultaneous writes either with one global [[mutex]], with several mutexes, or with [[non-blocking algorithm|non-blocking synchronization]].<ref>{{cite book|author=Markus Enzenberger, Martin Müller|chapter=A Lock-free Multithreaded Monte-Carlo Tree Search Algorithm|pages=14–20|others=H. Jaap Van Den Herik, Pieter Spronck (eds.)|title=Advances in Computer Games: 12th International Conference, ACG 2009, Pamplona, Spain, May 11–13, 2009, Revised Papers|publisher=Springer|year=2010|url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.161.1984|doi=10.1.1.161.1984|isbn=978-3-642-12992-6}}</ref>
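
A minimal sketch of root parallelization, reusing the <code>Node</code> class and <code>mcts_round</code> function of the first sketch above (the same assumptions apply, and the game state must additionally be picklable for <code>multiprocessing</code>):

<syntaxhighlight lang="python">
from multiprocessing import Pool

def build_tree(args):
    """Worker: build one independent game tree by running MCTS rounds."""
    state, rounds = args
    root = Node(state)                  # Node and mcts_round as sketched earlier
    for _ in range(rounds):
        mcts_round(root)
    return root

def root_parallel_move(state, workers=4, rounds=10000):
    """Root parallelization: merge root-level statistics of independent trees."""
    with Pool(workers) as pool:
        roots = pool.map(build_tree, [(state, rounds)] * workers)
    totals = {}
    for root in roots:
        for child in root.children:
            totals[child.move] = totals.get(child.move, 0) + child.playouts
    return max(totals, key=totals.get)  # the most-simulated move across all trees
</syntaxhighlight>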
== History ==

The [[Monte Carlo method]], based on random sampling, dates back to the 1940s. Bruce Abramson explored the idea in his 1987 PhD thesis and said it "is shown to be precise, accurate, easily estimable, efficiently calculable, and domain-independent."<ref name=Abramson>{{cite book|last=Abramson|first=Bruce|title=The Expected-Outcome Model of Two-Player Games|publisher=Technical report, Department of Computer Science, Columbia University|year=1987|url=http://academiccommons.columbia.edu/download/fedora_content/download/ac:142327/CONTENT/CUCS-315-87.pdf|accessdate=23 December 2013}}</ref> He experimented in depth with [[tic-tac-toe]] and then with machine-generated evaluation functions for [[Reversi|Othello]] and [[chess]]. In 1992, B. Brügmann employed it for the first time in a go-playing program,<ref name="Bruegmann"/> but his idea was not taken seriously. In 2006, called the year of the Monte-Carlo revolution in go,<ref>{{cite book|author=Rémi Coulom|chapter=The Monte-Carlo Revolution in Go|title=Japanese-French Frontiers of Science Symposium|year=2008|url=http://remi.coulom.free.fr/JFFoS/JFFoS.pdf}}</ref> R. Coulom described the application of the Monte Carlo method to game-tree search and coined the name Monte-Carlo tree search,<ref>{{cite book|author=Rémi Coulom|chapter=Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search|pages=72–83|others=H. Jaap van den Herik, Paolo Ciancarini, H. H. L. M. Donkers (eds.)|title=Computers and Games, 5th International Conference, CG 2006, Turin, Italy, May 29–31, 2006. Revised Papers|publisher=Springer|year=2007|isbn=978-3-540-75537-1|url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.81.6817|doi=10.1.1.81.6817}}</ref> L. Kocsis and Cs. Szepesvári developed the UCT algorithm,<ref name="Kocsis-Szepesvari"/> and S. Gelly et al. implemented UCT in their program MoGo.<ref name="Gelly-et-al"/> In 2008, MoGo achieved [[dan (rank)|dan]] (master) level in 9×9 go,<ref>{{cite journal|author=Chang-Shing Lee, Mei-Hui Wang, Guillaume Chaslot, Jean-Baptiste Hoock, Arpad Rimmel, Olivier Teytaud, Shang-Rong Tsai, Shun-Chin Hsu, Tzung-Pei Hong|title=The Computational Intelligence of MoGo Revealed in Taiwan’s Computer Go Tournaments|journal=IEEE Transactions on Computational Intelligence and AI in Games|pages=73–89|volume=1|issue=1|year=2009|url=http://hal.inria.fr/docs/00/36/97/86/PDF/TCIAIG-2008-0010_Accepted_.pdf}}</ref> and the Fuego program began to win against strong amateur players in 9×9 go.<ref>{{cite book|author=Markus Enzenberger, Martin Müller|title=Fuego – An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search|publisher=Technical report, University of Alberta|year=2008|url=https://www.cs.ualberta.ca/system/files/tech_report/2009/TR09-08_0.pdf}}</ref> In January 2012, the Zen program won a 19×19 go match 3:1 against John Tromp, a 2 dan player.<ref>{{cite web|url=http://dcook.org/gobet/|title=The Shodan Go Bet|accessdate=2012-05-02}}</ref>
[[File:Computer-go-ratings-English.svg|frame|center|The rating of the best go-playing programs on the KGS server since 2007. Since 2006, all the best programs have used MCTS. Source:<ref>{{cite web|url=http://senseis.xmp.net/?KGSBotRatings|title=Sensei's Library: KGSBotRatings|accessdate=2012-05-03}}</ref>]]
== References ==

{{Reflist|2}}
== Bibliography ==

* {{cite journal|author=Cameron Browne, Edward Powley, Daniel Whitehouse, Simon Lucas, Peter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, Simon Colton|title=A Survey of Monte Carlo Tree Search Methods|journal=IEEE Transactions on Computational Intelligence and AI in Games|volume=4|issue=1|date=March 2012|url=http://www.cameronius.com/cv/mcts-survey-master.pdf}}
[[Category:Combinatorial game theory]]
[[Category:Heuristic algorithms]]
[[Category:Monte Carlo methods]]