Experiences in Evaluation with BKG - A Program that Plays Backgammon
Because of very high branching factors, a backgammon program must rely on knowledge rather than search for performance. We here discuss insights gained about the structure of evaluation functions for a large domain such as backgammon. Evaluation began as a single linear polynomial of backgammon features. Later, we introduced state-classes, each with its own evaluation function. This improved the play, but caused problems with edge effects between state-classes. Our latest effort uses models of position potential to select across the set of best members of each represented state-class. This has produced a significant jump in performance of BKG.

Because of the localization of knowledge, state-classes permit relatively easy modification of knowledge used in evaluation. They also permit the building of opponent models based upon what evidence shows the opponent knows in each state-class. Our program plays a generally competent game at an intermediate level of skill. It correctly solves a high percentage of intermediate level problems in books.

I. Why Yet Another Game?

Backgammon is a game of skill and chance. It is an interesting object of study for AI because in any given position (of which there are 10^20 [Le76]), there are 21 possible combinations that the throw of two dice can produce. Each of these can be played legally in the average board position about 40 different ways (about 17 in actual game positions). Thus if one were to investigate a backgammon position by tree searching, it would be necessary to deal with a branching factor of more than 800 (!!) at every node. Clearly this is completely impractical. Therefore backgammon must be approached with evaluation and knowledge in mind.
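The branching-factor estimate above can be checked with a short calculation. The following is only a sketch of that arithmetic: the function and constant names are hypothetical, the figure of 40 plays per roll is taken from the text, and treating every level of the tree as having the same average branching is a simplifying assumption made here for illustration.

```python
from itertools import combinations_with_replacement

# Distinct unordered outcomes of two six-sided dice (a 3-1 is the same roll as a 1-3).
ROLLS = list(combinations_with_replacement(range(1, 7), 2))

# Average number of legal plays per roll in an arbitrary board position,
# as quoted in the text (an average, not an exact count).
AVG_PLAYS_PER_ROLL = 40


def branching_factor(num_rolls: int, plays_per_roll: int) -> int:
    """Successors of one node: every roll combined with every way of playing it."""
    return num_rolls * plays_per_roll


def nodes_at_depth(branching: int, depth: int) -> int:
    """Leaf count of a uniform tree (assumption: same branching at every level)."""
    return branching ** depth


if __name__ == "__main__":
    b = branching_factor(len(ROLLS), AVG_PLAYS_PER_ROLL)
    print(f"distinct rolls: {len(ROLLS)}, branching factor: {b}")
    for depth in (1, 2, 3):
        print(f"depth {depth}: {nodes_at_depth(b, depth):,} terminal nodes")
```

Under these assumptions the product is 21 x 40 = 840, consistent with the "more than 800" quoted above, and a uniform tree already exceeds half a billion terminal nodes at depth 3.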
Position P1 will have to be preferred over position P2 because it has features that more endear it to the player who can produce it than the features that obtain in P2.

In a game such as chess, it has been customary to search very large trees of 5000 to 2 million terminal nodes. In such a paradigm, the execution of a terminal evaluation function requires a certain amount of time, which must then be multiplied by the expected number of terminal nodes in the search. Thus designers of chess programs are very circumspect in creating evaluation functions which require lengthy execution times. For this reason certain features that are not trivial to compute are usually left out, so that the program may operate faster and search more. Since there can be little or no searching in a practical backgammon program, these contingencies will not apply. On the contrary, it is desirable to apply all possible knowledge to successor positions of the root node, in an attempt to find the best next move. Further, the fact that modern backgammon involves doubling places an even greater emphasis on the use of knowledge, since it requires an understanding of a position (not just the ability to discriminate the best move) to know when to double and when to accept or refuse.

This work was supported by the Advanced Research Projects Agency of the Office of the Secretary of Defense (contract F44620-73-C-0074) and is monitored by the Air Force Office of Scientific Research.

II. The Structure of BKG

BKG is an interactive program. For a given roll of the dice, it generates a list of all possible legal plays. If it is the program's turn to play, it serves these potential plays up one at a time to the evaluation procedure. It then selects the best. If it is a human opponent's turn to play, it waits to receive a legal play from its