[FRIAM] Adversarial Go trick defeats KataGo

Alexander Rasmus alex.m.rasmus at gmail.com
Sat Nov 12 23:17:31 EST 2022


It looks like the adversarial net paper and KataGo use almost the same
algorithm for scoring. The paper Ars is discussing uses the Tromp-Taylor
algorithm (stated in the caption of Fig. 1 in
https://arxiv.org/pdf/2211.00241.pdf), whereas the training for KataGo used
a modified version of Tromp-Taylor that doesn't require capturing isolated
stones explicitly in some circumstances (top of page 4 in
https://arxiv.org/pdf/1902.10565.pdf). That modification appears to apply
only in regions where there's a group that's unconditionally alive, so it
shouldn't matter for scoring the types of games the papers are discussing.
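
For concreteness, here is a minimal sketch of plain Tromp-Taylor area
scoring in Python. It is not code from either paper: the function name
tromp_taylor_score, the board encoding, and the toy 5x5 positions are
illustrative assumptions, and komi and KataGo's modification are ignored.
A player scores one point per stone of their color plus one point per
empty intersection that reaches only their color.

```python
from collections import deque

EMPTY, BLACK, WHITE = ".", "X", "O"  # hypothetical board encoding


def tromp_taylor_score(board):
    """Return (black_points, white_points) under plain Tromp-Taylor area scoring."""
    rows, cols = len(board), len(board[0])
    score = {BLACK: 0, WHITE: 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board counts for its owner
            elif (r, c) not in seen:
                # Flood-fill this empty region and record which colors it touches.
                region_size, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region_size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor == EMPTY:
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    queue.append((ny, nx))
                            else:
                                borders.add(neighbor)
                # The region scores only if it reaches exactly one color.
                if len(borders) == 1:
                    score[borders.pop()] += region_size
    return score[BLACK], score[WHITE]


# Toy position: Black (X) walls off the left side, White (O) the right.
clean = ["..XO.", "..XO.", "..XO.", "..XO.", "..XO."]
# Same position with one uncaptured White stone inside Black's area.
contested = ["..XO.", "..XO.", "O.XO.", "..XO.", "..XO."]

print(tromp_taylor_score(clean))
print(tromp_taylor_score(contested))
```

On the clean board the tally is 15 to 10 for Black. With the single stray
White stone left uncaptured, Black's nine empty points reach both colors
and count for nobody, so the tally becomes 5 to 11 for White. That is the
effect described in the paper's caption: the victim gets no points for
area that still contains adversary stones.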

This is a really nice paper demonstrating how neural networks are sometimes
more robust for real-world problems than you would expect, even exceeding
the limitations of the framework within which they're trained.

Best,
Alex

On Sat, Nov 12, 2022 at 6:26 PM Alexander Rasmus <alex.m.rasmus at gmail.com>
wrote:

> Jon will probably chime in with his annoyance as well, but the headline
> here is fake. In particular, they have severely bungled the implementation
> of how a game of Go is scored to get the paper's result. The key
> "technicality" of the paper, quoting from the article, is:
> "As a result of its [KataGo's] overconfidence in a win—assuming it will
> win if the game ends and the points are tallied—KataGo plays a pass move,
> allowing the adversary to intentionally pass as well, ending the game. (Two
> consecutive passes end the game in Go.) After that, a point tally begins.
> As the paper explains, "The adversary gets points for its corner territory
> (devoid of victim stones) whereas the victim [KataGo] does not receive
> points for its unsecured territory because of the presence of the
> adversary's stones.""
>
> Summarizing: KataGo gets so far ahead that it passes, the adversarial AI
> also passes, and they then hand the game to an incorrect scoring mechanism.
> KataGo also maintains its own score estimate of the game. To neatly
> transpose this to the situation of two human players, we will treat the
> incorrect scoring mechanism as if it is the adversarial net's estimate
> of the score.
>
> So, we have a good player vs. a bad player, they both pass (tentatively
> ending the game), and they disagree on the score of the game. How this
> resolves depends on the ruleset (though if you're playing casually the
> solution is almost always to resume play). Quoting from Wikipedia, which
> seems to do an okay job on this:
> ---quote---
> "Counting phase:
> Customarily, when players agree that there are no useful moves left (most
> often by passing in succession), they attempt to agree which groups are
> alive and which are dead. If disagreement arises, then under Chinese rules
> the players simply play on.
>
> However, under Japanese rules, the game is already considered to have
> ended. The players attempt to ascertain which groups of stones would remain
> if both players played perfectly from that point on. (These groups are said
> to be alive.) In addition, this play is done under rules in which kos are
> treated differently from ordinary play. If the players reach an incorrect
> conclusion, then they both lose.
>
> Unlike most other rulesets, the Japanese rules contain lengthy definitions
> of when groups are considered alive and when they are dead. In fact, these
> definitions do not cover every situation that may arise. Some difficult
> cases not entirely determined by the rules and existing precedent must be
> adjudicated by a go tribunal.
>
> The need for the Japanese rules to address the definition of life and
> death follows from the fact that in the Japanese rules, scores are
> calculated by territory rather than by area. The rules cannot simply
> require a player to play on in order to prove that an opponent's group is
> dead, since playing in their own territory to do this would reduce their
> score. Therefore, the game is divided into a phase of ordinary play, and a
> phase of determination of life and death (which according to the Japanese
> rules is not technically part of the game)."
> ---endquote---
>
> So, for Chinese rules, they play on and KataGo will almost certainly 'win'
> in a final sense, though if we assume both bots continue on with the same
> strategy, the rest of the game will consist of a countable number of passes
> and no other moves. For Japanese rules, we are probably best off just
> taking the guidance to assume 'perfect play' from both players to mean that
> we have either AlphaGo Zero or KataGo take over both positions and carry
> on--almost certainly leading to a victory by KataGo.
>
> If you encountered this in person, it would basically consist of a bad
> player playing terribly, insisting that they're ahead when they're not, and
> then continuing to argue the point indefinitely instead of playing the
> game. I don't think there's a neat transposition to chess due to the
> difference in how the games end, but the idea is probably something like
> someone playing the offensive portion of a fool's mate and then insisting
> they won even though it was successfully defended against. I can 100% make
> a bad scoring algorithm for chess which declares this a winning strategy,
> but that doesn't mean anyone should care...
>
> Assuming that the scoring mechanism in the paper is the same one used when
> scoring, e.g., AlphaGo Zero or KataGo self-matches (I do not know whether
> this is true), the correct headline would be something like "Adversarial
> net identifies inaccuracy in Go scoring algorithms used to train AIs." This
> would still be an interesting result, as you're demonstrating that KataGo
> can be tricked into "losing" when using the determiner of victor it was
> trained with. However, what they've identified is a region of play where
> the scoring mechanism is diverging from the game of Go, rather than a
> region of play where KataGo is bad at Go itself. Perversely, this would
> indicate that KataGo is MORE robust at the game of Go than the scoring
> mechanism it was trained against, which is the opposite of what the paper
> is saying directionally.
>
> Best,
> Alex
>
>
>
> On Thu, Nov 10, 2022 at 11:24 AM Roger Frye <frye.roger at gmail.com> wrote:
>
>>
>> https://arstechnica.com/information-technology/2022/11/new-go-playing-trick-defeats-world-class-go-ai-but-loses-to-human-amateurs/
>>
>>
>