Difference between revisions of "User:Hexanna"
(recursive swap) |
|||
Line 34: | Line 34: | ||
69:(q2 c18) | 69:(q2 c18) | ||
96:(b17 r3) | 96:(b17 r3) | ||
+ | 201:(b18 r2) | ||
77:(a2 s18) | 77:(a2 s18) | ||
67:(b2 r18) | 67:(b2 r18) | ||
56:(c2 q18) | 56:(c2 q18) | ||
100:(a3 s17) | 100:(a3 s17) | ||
+ | 137:(b3 r17) | ||
+ | 157:(c3 q17) | ||
83:(a4 s16) | 83:(a4 s16) | ||
73:(b4 r16) | 73:(b4 r16) | ||
Line 47: | Line 50: | ||
41:(a10 s10) | 41:(a10 s10) | ||
81:(a11 s9) | 81:(a11 s9) | ||
− | + | 115:(a12 s8) | |
78:(a13 s7) | 78:(a13 s7) | ||
56:(a14 s6) | 56:(a14 s6) | ||
Line 93: | Line 96: | ||
** On the other hand, and the author seems to agree, the 37×37 map looks very unreliable. I see percentages as low as 37% but only as high as 54% (for a move like f1, which should almost certainly be a losing move). | ** On the other hand, and the author seems to agree, the 37×37 map looks very unreliable. I see percentages as low as 37% but only as high as 54% (for a move like f1, which should almost certainly be a losing move). | ||
** The 27×27 map looks more reliable. I'm personally very skeptical that moves on Red's 6th row are among the most balanced moves, but there are some interesting (if somewhat noisy) insights to be had still. | ** The 27×27 map looks more reliable. I'm personally very skeptical that moves on Red's 6th row are among the most balanced moves, but there are some interesting (if somewhat noisy) insights to be had still. | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
==Recursive swap== | ==Recursive swap== |
Revision as of 11:59, 4 May 2024
I started learning Hex mid-July of 2021, about 2 weeks before I created my LittleGolem account.
Contents
Insights and tidbits from KataHex (hzy's bot)
- katahex_model_20220618.bin.gz (I'll call this the "strong" net) appears significantly stronger than the "default" net.
- Swap map for 19×19 generated with the strong net, with around 15k visits for most moves. For the red hexes, the number corresponds to Blue's Elo advantage if she swaps Red's move; for the blue hexes, the number corresponds to Blue's Elo advantage if she does not swap Red's move. Smaller numbers correspond to fairer openings. Includes all fair openings as well as a few selected unfair openings, including the strongest move without swap (e10). I used more visits for the fairest moves: 1000k for e3 (49.5% win rate), 500k for n3 (49.2%), 100k for a15 (52.5%).
- Key takeaways: The swap map at Swap rule#Size 19 uses data that is almost certainly from the "weak" net. Compared to the weak net, the strong net notably thinks a19, n3—p3, and k4—l4 are stronger. I personally trust the strong net's evaluations more; I think it's dubious that the weak net thought l4 was a very fair opening. The nets disagree on whether e3 is winning or losing, though it's so close to 50% that the difference isn't meaningful.
Random unsolved questions
Most of these are very difficult to answer, and I would be happy if even a few were answered in the next few years:
- Is the obtuse corner always winning on larger board sizes? What about the b4 opening? Let P(n) be the statement that "the obtuse corner is a winning opening in n×n Hex without swap." There are a few possible cases; an interesting exercise is to come up with subjective probabilities of each case being true.
- A. P(n) is always true. If so, can we prove this?
- B. P(n) is true for infinitely many n, with finitely many counterexamples. If so, what's the smallest counterexample?
- C. P(n) is true for infinitely many n, with infinitely many counterexamples. If so, does P(n) hold "almost always," "almost never," or somewhere in between?
- D. P(n) is true for finitely many n. If so, what's the largest such n?
- Kriegspiel Hex (Dark Hex), a variant with incomplete information
- Under optimal mixed strategies, what is Red's win probability on 4×4?
- For larger boards (say, 19×19), is Red's win probability close to 50%?
- If so, a swap rule might not be needed for Kriegspiel Hex, which would be neat.
- If not, imagine a variant where Red's first move is publicly announced to both players, and Blue has the option to swap it. Which initial moves are the fairest now?
replies by Demer:
- https://zhuanlan.zhihu.com/p/476464087 has percentages, although it doesn't translate these into a guessed swap map and I don't know anything about the bot they came from.
- It suggests that [on 13x13, g3 is the most balanced opening] and [on 14x14, g3 should not be swapped].
- On 27x27 without swap, it likes the 4-4 obtuse corner opening slightly more than anything else nearby.
- As far as I'm aware, even 3×4 Dark Hex has not been solved. (https://content.iospress.com/articles/icga-journal/icg180057 apparently gives "some preliminary results" for that size.)
hexanna:
- Thank you, this is amazing! From the Google Translate, the bot is an adaptation of KataGo trained on 13×13 and smaller, using transfer learning to train larger nets on top of the 13×13 net for a short period of time. I may edit the swap rule article later with some insights.
- The results for up to 15×15 look very reliable to me. This is because many of the subtle patterns suggested by other bots, like leela_bot, appear in these swap maps. For example, on 13×13:
- a1–c1 are stronger than d1; a2–c2 ≥ d2 ≥ e2 in strength; and a similar relation holds for moves on the third row. See Openings on 11 x 11#d2.
- b4 is weaker than all of its neighbors, because Blue can fit the ziggurat in the corner.
- j3 is surprisingly weak and i3 is surprisingly strong. Many people were surprised about this when leela_bot's swap map came out, but the result may be more than just random noise.
- a10 is the weakest of a4–a10, while a5 is the strongest.
- b10 is stronger than all of its neighbors, because Blue cannot fit the ziggurat in the obtuse corner.
- That this bot picked up on all these subtleties, and assigns a win percentage close to 100% for most moves on 13×13, suggest to me that it is probably stronger than leela_bot and gzero_bot. I can't know for sure, though.
- On the other hand, and the author seems to agree, the 37×37 map looks very unreliable. I see percentages as low as 37% but only as high as 54% (for a move like f1, which should almost certainly be a losing move).
- The 27×27 map looks more reliable. I'm personally very skeptical that moves on Red's 6th row are among the most balanced moves, but there are some interesting (if somewhat noisy) insights to be had still.
- The results for up to 15×15 look very reliable to me. This is because many of the subtle patterns suggested by other bots, like leela_bot, appear in these swap maps. For example, on 13×13:
Recursive swap
Not really a serious suggestion, just for fun. One advantage of "recursive swap" over multi-stone swap is that opening preparation plays a smaller role, because both players have control over the first n stones.
RECURSIVE_SWAP'[k, depth, color]: if depth = 0: [color] continues playing as normal. else: [color] plays a move. [~color] can either swap[k], or RECURSIVE_SWAP'[k+1, depth-1, ~color] RECURSIVE_SWAP[n]: RECURSIVE_SWAP'[1, n, Red]
RECURSIVE_SWAP[0]:
Red continues playing as normal.
RECURSIVE_SWAP[1]:
Red plays a move. Blue can either
- swap, or
- continue playing as normal.
RECURSIVE_SWAP[2]:
Red plays a move. Blue can either
- swap, or
- play a move, after which Red can either
- swap2, or
- continue playing as normal.
RECURSIVE_SWAP[3]:
Red plays a move. Blue can either
- swap, or
- play a move, after which Red can either
- swap2, or
- play a move, after which Blue can either
- swap3, or
- continue playing as normal.
Analysis
RECURSIVE_SWAP[0] is the same as playing with no swap.
RECURSIVE_SWAP[1] is the same as playing with the swap rule.
RECURSIVE_SWAP[2]:
- Red shouldn't play a move that's too strong or it'll be swapped.
- If Red plays a weak stone, Blue should try to play a move just strong enough that Red will be indifferent to swap2. (If a "fair" move is half a stone, and Red plays a weak move worth x < 0.5 stones, Blue should play a move worth x + 0.5 stones.)
- Red should try to play a weak move that's also hard for Blue to equalize (so that Red gets a sizable advantage when deciding whether to swap2 or not).
RECURSIVE_SWAP[3]:
- If Red plays a move worth x > 0.5 stones, Blue should swap.
- If Red plays a weak stone worth x < 0.5, Blue should play a move worth less than x + 0.5 (or else Red will swap2). If Blue's move is worth y, then Red should play a move as close to (y - x) + 0.5 as possible, so that Blue's swap3 decision is difficult.
- Red should play a weak move that's hard for Blue to find a tricky reply to (where a "tricky" reply is one that makes it hard for Red to equalize, such that Blue has an easy time deciding whether to swap3 or not).
HexWorld bugs:
On 30x30, ad6 and ad11 are "dead" hexes that you can't click on. They don't show up even if you specify them in the url: here. All other hexes look fine.
On 31x31, the same clicking and url issue occurs for ad6, ad11, and ad31: here
- Interesting. It turns out that this bug is caused by an ad blocker, in my case Adblock Plus, although Comonoid reproduced it with AdGuard as well. The ad blocker doesn't like column 30 because its name is "ad". It adds the following CSS rules, among thousands of others:
ad6 {display: none !important;} ad11 {display: none !important;} ad31 {display: none !important;}
- That's why those 3 cells are disabled, and no others! I will fix the bug soon.
- By the way, your HexWiki user page might not be the most efficient place to report a bug. Selinger (talk) 22:21, 20 April 2024 (UTC)
Fjan2ej57w's question 7, "how much space of an empty board would be filled if both sides play optimally":
Stack Overflow answer for reference. My conjecture is that Hex without swap asymptotically requires Θ(n^2) cells, and more generally, a Demer handicap of Θ(f(n)) stones requires Θ(n^2/f(n)) cells, for all f(n) between Θ(1) and Θ(n). My intuition is that on 1000000×1000000 Hex, the first-player advantage is minuscule, and even a handicap of n^(1/2) = 1000 stones, say spaced out evenly across the short diagonal, would require on the order of "1000 columns and 1000000 rows", n^(3/2), to convert to a final connection. Another interesting question is to find a constructive winning strategy with an o(n) (sub-linear) handicap.
reply by Demer: Even one with n/6 - ω(1) handicap would be interesting. (improving on https://webdocs.cs.ualberta.ca/~hayward/papers/handicap.pdf )