Comments/Ratings for a Single Item


Chess programs move making

Aurelian Florea wrote on Mon, Sep 19, 2022 08:17 AM UTC in reply to Greg Strong from Sun Sep 18 05:53 PM:

Thanks, Greg. My conundrum comes from the definition of leaf nodes. In the traditional approach you apply the evaluation function at the leaf nodes, but in the MCTS of AlphaZero leaf nodes occur only when the end-of-game conditions apply.

H. G. Muller wrote on Mon, Sep 19, 2022 08:25 AM UTC in reply to Aurelian Florea from 08:17 AM:

I thought AlphaZero used the output of its NN for evaluating leaf nodes. That makes it different from 'normal' MCTS, which would randomly play out games until they satisfy a win or draw condition, and uses the statistics of such 'rollouts' as a measure for the winning probability in the leaf.
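To make the contrast concrete, here is a minimal sketch of the two ways of scoring a leaf. It assumes a toy take-away game (take 1 to 3 stones, taking the last stone wins) rather than chess, and `value_head_stub` is a hypothetical hand-written stand-in for a trained network, not AlphaZero's actual evaluation.

```python
import random

def legal_moves(stones):
    # Toy game (assumption): remove 1-3 stones; whoever takes the last stone wins.
    return [n for n in (1, 2, 3) if n <= stones]

def rollout(stones, rng):
    """Play uniformly random moves to the end of the game.

    Returns 1.0 if the side to move in the starting position wins, else 0.0.
    """
    starter_to_move = True
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return 1.0 if starter_to_move else 0.0
        starter_to_move = not starter_to_move

def rollout_estimate(stones, n_games=2000, seed=0):
    # 'Normal' MCTS leaf evaluation: average the results of many random playouts.
    rng = random.Random(seed)
    return sum(rollout(stones, rng) for _ in range(n_games)) / n_games

def value_head_stub(stones):
    # Hypothetical stand-in for a network's value output: one call, no playouts.
    # For this toy game, multiples of 4 are lost for the side to move.
    return 0.0 if stones % 4 == 0 else 1.0

if __name__ == "__main__":
    for stones in (1, 2, 4, 5):
        print(stones, rollout_estimate(stones), value_head_stub(stones))
```

The rollout version needs thousands of random games per leaf and still only approximates the result under random play, while the value-function version answers in a single call.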

Aurelian Florea wrote on Mon, Sep 19, 2022 11:00 AM UTC in reply to H. G. Muller from 08:25 AM:

The NN outputs a probability distribution over all the possible moves (illegal moves are set to 0 and the probabilities sum to 1). The MCTS calls for this distribution, combines it with an exploration coefficient and Dirichlet noise to form a score, and chooses a move to expand until a certain number of nodes (in chess it is 6000) have been visited. Nodes are expanded until leaf nodes are explored. This link explains it better than I can:

https://joshvarty.github.io/AlphaZero/
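A rough sketch of the move-selection step described above, using only the standard library. The function names, the exact form of the exploration term, and the default constants (`alpha`, `eps`, `c_puct`) are assumptions for illustration and are not taken from the linked article.

```python
import math
import random

def dirichlet(alpha, n, rng):
    # Sample a symmetric Dirichlet(alpha) vector via normalised gamma draws.
    samples = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(samples)
    return [s / total for s in samples]

def add_root_noise(priors, rng, alpha=0.3, eps=0.25):
    # Blend the policy priors with Dirichlet noise (assumed to be applied at the root only).
    noise = dirichlet(alpha, len(priors), rng)
    return {move: (1 - eps) * p + eps * nz
            for (move, p), nz in zip(priors.items(), noise)}

def puct_select(priors, visits, value_sums, c_puct=1.5):
    # Score = average value so far + exploration bonus weighted by the prior.
    total = sum(visits.values())

    def score(move):
        q = value_sums[move] / visits[move] if visits[move] else 0.0
        u = c_puct * priors[move] * math.sqrt(total + 1) / (1 + visits[move])
        return q + u

    return max(priors, key=score)

if __name__ == "__main__":
    rng = random.Random(0)
    priors = add_root_noise({"e4": 0.6, "d4": 0.3, "Nf3": 0.1}, rng)
    visits = {m: 0 for m in priors}
    value_sums = {m: 0.0 for m in priors}
    print(puct_select(priors, visits, value_sums))
```

With no visits yet, the move with the largest prior is chosen; as a move accumulates visits its exploration bonus shrinks, so less-visited moves get their turn.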

H. G. Muller wrote on Tue, Sep 20, 2022 07:05 AM UTC in reply to Aurelian Florea from Mon Sep 19 11:00 AM:

That describes the 'policy head' of the NN, which is used to bias the move choice (which is otherwise based on the number of visits of the move and that of the total for the node, and the move scores) when walking the tree from root to leaf for finding the next leaf to expand. But my understanding was that when the leaf is chosen and expanded, all daughters should receive a score from the 'evaluation head' of the NN in the position after the move, rather than just inheriting their policy weight from the position before the move. These scores are then back-propagated towards the root, by including them in the average score of all nodes in the path to the expanded leaf.
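The expand-and-back-propagate step can be sketched as follows. This is only an illustration: the `Node` class, the sign-flipping convention, and the choice to evaluate just the selected leaf are assumptions. Scoring every daughter, as described above, would call the value function once per child instead.

```python
class Node:
    def __init__(self, prior):
        self.prior = prior        # policy weight inherited from the parent position
        self.visits = 0
        self.value_sum = 0.0      # sum of backed-up values, from this node's side to move
        self.children = {}

    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def expand(leaf, policy):
    # Create one child per move, seeded with the policy-head probability.
    for move, p in policy.items():
        leaf.children[move] = Node(prior=p)

def backup(path, value):
    # 'value' is the value-head output for the leaf, from its side to move.
    # Walking back to the root, the sign flips at every ply (assumed convention).
    for node in reversed(path):
        node.visits += 1
        node.value_sum += value
        value = -value

if __name__ == "__main__":
    root, child = Node(1.0), Node(0.5)
    root.children["e4"] = child
    expand(child, {"e5": 0.7, "c5": 0.3})   # hypothetical policy output
    backup([root, child], 0.2)              # hypothetical value output
    print(root.mean_value(), child.mean_value())
```

Each node on the path keeps a running average, so repeated visits refine the score that the selection step uses on the next walk from the root.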

Aurelian Florea wrote on Tue, Sep 20, 2022 07:54 AM UTC in reply to H. G. Muller from 07:05 AM:

I do not understand "tgat". You probably meant "that".

H. G. Muller wrote on Tue, Sep 20, 2022 08:07 AM UTC in reply to Aurelian Florea from 07:54 AM:

Indeed. Hard to avoid typos on these virtual keyboards of Android devices... I corrected it.

Aurelian Florea wrote on Tue, Sep 20, 2022 12:06 PM UTC in reply to H. G. Muller from 07:05 AM:

I'm not sure I understand what you say, HG!

Gerd Degens wrote on Tue, Sep 20, 2022 04:25 PM UTC in reply to Aurelian Florea from 12:06 PM:

What is not understandable? Typo! What else.
By the way, details about programming are not clear for most people. How to deal with it?

Aurelian Florea wrote on Tue, Sep 20, 2022 04:40 PM UTC in reply to Gerd Degens from 04:25 PM:

I did not mean the typo!

Greg Strong wrote on Tue, Sep 20, 2022 05:18 PM UTC in reply to Gerd Degens from 04:25 PM:

"By the way, details about programming are not clear for most people. How to deal with it?"

I don't think there is any "fix" to this issue. I am not sure there is any issue at all. Some conversations are going to involve things other people don't understand. That said, the talkchess forums are the usual place for these kinds of discussions, but I am happy to have some discussion here as well. Some people who are not chess programmers may still be interested in whether the new neural-network techniques being applied to orthodox chess can be applied to chess variants.

Gerd Degens wrote on Thu, Sep 22, 2022 06:41 AM UTC in reply to Greg Strong from Tue Sep 20 05:18 PM:

That sounds plausible.

11 comments displayed
