Difference between revisions of "User:Hexanna"
m |
|||
Line 4: | Line 4: | ||
* A self-play game on [https://hexworld.org/board/#15n,f3l4d12g8d8f5c7e4n3k3l3e10c11c12d11f11m5n2m3d9i8n4m4i9g11g10h10h9j9i10k10k12l12k13m13l6m6l7m7l14m14l9j7k2m1k5k8m12j12l8h7h4b6c3b3c2b2b4b10c9g6i3j4k4d4c8a9b7c6b8e7d7e6d5:rb 15×15], where KataHex thinks long enough to have around 50k visits on the top move, and more if it's unsure between two moves. f3 is among the fairest openings on 15×15. | * A self-play game on [https://hexworld.org/board/#15n,f3l4d12g8d8f5c7e4n3k3l3e10c11c12d11f11m5n2m3d9i8n4m4i9g11g10h10h9j9i10k10k12l12k13m13l6m6l7m7l14m14l9j7k2m1k5k8m12j12l8h7h4b6c3b3c2b2b4b10c9g6i3j4k4d4c8a9b7c6b8e7d7e6d5:rb 15×15], where KataHex thinks long enough to have around 50k visits on the top move, and more if it's unsure between two moves. f3 is among the fairest openings on 15×15. | ||
* Two 19×19 self-play games, with [https://hexworld.org/board/#19n,a14j5o9d16n11l4f7e4b12e17p15g16c7h4i7i15d4d5b6k14f17f16d17d15i13c3c15b17b5e2e3f2f3g2h6h7e16d18i6h13c18f14i12p16o16i4g3h2h3i2i3j2j3k2l3k3k4j4k5n13l14m15k16k15l15l13m13n6r4q2q3p3j7h11f12j10k9n8m11n2m1n12m12l7g13h12k10l11f15m10g14f13g12f11d13e12e13e10b9b11c11d12h9h10c13d9c10c8b8c9b10e6f5g6f6g7g10g11f9:rw a14] and [https://hexworld.org/board/#19n,a19j5n8i15d15k14n13n15o15n16p16p17q16q17o17p15r14r15q15o5q6m18n19o3f7c17b17f5c14c16k7i11m10m8e5f4d4d3b3c2e3c15p3q5e16d18f17e19g18d14a16b15a15b14g12h10f11n7a14b13l8m4n6g13o6o7h12e6p5m7l9h14h13j12e12g7p6p4q10e14f14m5d8c7h4k8j8l7i10h11j10i9j9i7r3p8i8k5g6e10c10d11l4l5f8g9b12h5g5a13i6j4j6n12o12n10j11i12r6l12q8q9p9p10:rb a19] openings. Only 1k visits on the top move for these games. I think it's interesting how different the opening strategies are in these two games. | * Two 19×19 self-play games, with [https://hexworld.org/board/#19n,a14j5o9d16n11l4f7e4b12e17p15g16c7h4i7i15d4d5b6k14f17f16d17d15i13c3c15b17b5e2e3f2f3g2h6h7e16d18i6h13c18f14i12p16o16i4g3h2h3i2i3j2j3k2l3k3k4j4k5n13l14m15k16k15l15l13m13n6r4q2q3p3j7h11f12j10k9n8m11n2m1n12m12l7g13h12k10l11f15m10g14f13g12f11d13e12e13e10b9b11c11d12h9h10c13d9c10c8b8c9b10e6f5g6f6g7g10g11f9:rw a14] and [https://hexworld.org/board/#19n,a19j5n8i15d15k14n13n15o15n16p16p17q16q17o17p15r14r15q15o5q6m18n19o3f7c17b17f5c14c16k7i11m10m8e5f4d4d3b3c2e3c15p3q5e16d18f17e19g18d14a16b15a15b14g12h10f11n7a14b13l8m4n6g13o6o7h12e6p5m7l9h14h13j12e12g7p6p4q10e14f14m5d8c7h4k8j8l7i10h11j10i9j9i7r3p8i8k5g6e10c10d11l4l5f8g9b12h5g5a13i6j4j6n12o12n10j11i12r6l12q8q9p9p10:rb a19] openings. Only 1k visits on the top move for these games. I think it's interesting how different the opening strategies are in these two games. | ||
− | * | + | * The b4 opening appears to be weaker than all 6 of its neighbors. On a large enough board, maybe even 27×27, b4 could be a losing opening, and the swap map could contain a hole: |
+ | <hexboard size="5x4" | ||
+ | coords="show" | ||
+ | edges="top left" | ||
+ | contents="S red:all blue:(a1--d1 a2--d2 a3 b4)" | ||
+ | /> | ||
+ | * A 13×13 swap map, with KataHex's self-play Elo estimate of the swap advantage for each opening. Generated using around 30k visits for most moves. For the red hexes, the number corresponds to Blue's Elo advantage if she swaps Red's move; for the blue hexes, the number corresponds to Blue's Elo advantage if she does not swap Red's move. Smaller numbers correspond to fairer openings. Hexes without numbers are unfair openings that confer Blue more than a 300 Elo advantage. For example, the fairest opening is g3 (or g11), which KataHex thinks Blue should swap, leaving Blue with a 51.5% win rate, or 10 Elo. | ||
+ | ** Key takeaways: The "common" human openings c2, k2, a10, a13 are all reasonably fair. g3 has become more popular recently, for good reason. b4 is rarely played, but it seems fair enough to be suitable for even high-level human play. | ||
+ | <hexboard size="13x13" | ||
+ | coords="show" | ||
+ | contents="S red:all | ||
+ | blue:(a1--l1 a2--k2 a3 a11) | ||
+ | blue:(b13--m13 c12--m12 m11 m3) | ||
+ | E 239:(d3 j11) | ||
+ | 187:(e3 i11) | ||
+ | 48:(f3 h11) | ||
+ | 10:(g3 g11) | ||
+ | 68:(h3 f11) | ||
+ | 185:(i3 e11) | ||
+ | 158:(j3 d11) | ||
+ | 107:(a13 m1) | ||
+ | 161:(k2 c12) | ||
+ | 258:(d2 j12) | ||
+ | 110:(c2 k12) | ||
+ | 184:(b2 l12) | ||
+ | 189:(a2 m12) | ||
+ | 207:(a3 m11) | ||
+ | 143:(b4 l10) | ||
+ | 226:(b11 l3) | ||
+ | 247:(a4 m10) | ||
+ | 211:(a6 m8) | ||
+ | 219:(a7 m7) | ||
+ | 197:(a8 m6) | ||
+ | 171:(a9 m5) | ||
+ | 158:(a10 m4) | ||
+ | 131:(a11 m3)" | ||
+ | /> | ||
+ | |||
==Random unsolved questions== | ==Random unsolved questions== | ||
Line 49: | Line 86: | ||
* '''Motifs''' — very loosely related to joseki; small local patterns that occur in the middle of the board, usually representing optimal play from at least one side but not necessarily both sides | * '''Motifs''' — very loosely related to joseki; small local patterns that occur in the middle of the board, usually representing optimal play from at least one side but not necessarily both sides | ||
** Motifs have some notion of '''"local efficiency"''' (not to be confused with [[efficiency]]) — some motifs are, on average, good or bad for a particular player. Strong players anecdotally try to play locally efficient moves on large boards where calculating everything is impractical. It would be useful to have some of these rules of thumb written down. Can be thought of as a generalization of dead/captured cells, where LE(dead cell) = 0, and LE(X) ≤ LE(Y) if Y capture-dominates X. | ** Motifs have some notion of '''"local efficiency"''' (not to be confused with [[efficiency]]) — some motifs are, on average, good or bad for a particular player. Strong players anecdotally try to play locally efficient moves on large boards where calculating everything is impractical. It would be useful to have some of these rules of thumb written down. Can be thought of as a generalization of dead/captured cells, where LE(dead cell) = 0, and LE(X) ≤ LE(Y) if Y capture-dominates X. | ||
− | ** Here are some examples. In the first motif, Red 1 is often a weak move. Blue's best response is usually at a, or sometimes at b or c as part of a minimaxing play. But d is rarely (possibly never) the best move, because Red can respond with a, and Blue's central stone is now a dead stone. So, for any reasonable working definition of "local efficiency" LE, we have LE(d) < LE(a), and LE(b) = LE(c) due to symmetry | + | ** Here are some examples. In the first motif, Red 1 is often a weak move. Blue's best response is usually at a, or sometimes at b or c as part of a minimaxing play. But d is rarely (possibly never) the best move, because Red can respond with a, and Blue's central stone is now a dead stone. So, for any reasonable working definition of "local efficiency" LE, we have LE(d) < LE(a), and LE(b) = LE(c) due to symmetry. KataHex suggests that LE(a) > LE(b). |
<hexboard size="5x5" | <hexboard size="5x5" | ||
Line 79: | Line 116: | ||
edges="none" | edges="none" | ||
contents="B c2 d2 R 1:d3 B 2:b4 R 3:b3 B 4:c3" | contents="B c2 d2 R 1:d3 B 2:b4 R 3:b3 B 4:c3" | ||
/> | /> |
Revision as of 20:17, 4 March 2023
Insights and tidbits from KataHex (hzy's bot)
- Two very fair openings using two-move equalization, on 11×11 and 13×13. Fairer than any opening with one-move equalization; KataHex thinks win probability is very close to 50% even if you let it think for a long time.
- A self-play game on 15×15, where KataHex thinks long enough to have around 50k visits on the top move, and more if it's unsure between two moves. f3 is among the fairest openings on 15×15.
- Two 19×19 self-play games, with a14 and a19 openings. Only 1k visits on the top move for these games. I think it's interesting how different the opening strategies are in these two games.
- The b4 opening appears to be weaker than all 6 of its neighbors. On a large enough board, maybe even 27×27, b4 could be a losing opening, and the swap map could contain a hole:
- A 13×13 swap map, with KataHex's self-play Elo estimate of the swap advantage for each opening. Generated using around 30k visits for most moves. For the red hexes, the number corresponds to Blue's Elo advantage if she swaps Red's move; for the blue hexes, the number corresponds to Blue's Elo advantage if she does not swap Red's move. Smaller numbers correspond to fairer openings. Hexes without numbers are unfair openings that confer Blue more than a 300 Elo advantage. For example, the fairest opening is g3 (or g11), which KataHex thinks Blue should swap, leaving Blue with a 51.5% win rate, or 10 Elo.
- Key takeaways: The "common" human openings c2, k2, a10, a13 are all reasonably fair. g3 has become more popular recently, for good reason. b4 is rarely played, but it seems fair enough to be suitable for even high-level human play.
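The "51.5% win rate, or 10 Elo" figure above is consistent with the usual logistic Elo model, in which a 400-point gap corresponds to 10:1 odds. The sketch below assumes that model; the source does not say how KataHex's Elo estimates were actually computed. It also shows the 180° rotation that pairs equivalent openings such as g3 and g11. The helper names are made up for illustration.

```python
import math


def elo_from_win_rate(p: float) -> float:
    """Elo gap implied by win probability p (assumed logistic model, 400-point scale)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


def win_rate_from_elo(elo: float) -> float:
    """Inverse of elo_from_win_rate: win probability implied by an Elo gap."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def rotate_180(cell: str, size: int) -> str:
    """Cell equivalent to `cell` under the 180-degree rotation of a size x size board."""
    col = ord(cell[0]) - ord("a") + 1   # 1-based column index, a = 1
    row = int(cell[1:])                 # 1-based row index
    return f"{chr(ord('a') + size - col)}{size + 1 - row}"


if __name__ == "__main__":
    # Blue's 51.5% win rate after swapping g3 on 13x13, expressed in Elo.
    print(elo_from_win_rate(0.515))
    # Win rate at the 300 Elo cutoff used for unnumbered hexes.
    print(win_rate_from_elo(300))
    # Each numbered opening in the swap map comes with its rotated twin.
    print(rotate_180("g3", 13), rotate_180("d3", 13), rotate_180("a13", 13))
```

Under this assumed model, 51.5% works out to roughly 10 Elo, and the 300 Elo cutoff corresponds to a win rate of about 85%.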
Random unsolved questions
Most of these are very difficult to answer, and I would be happy if even a few were answered in the next few years:
- Hex on large boards
- If you trained a strong neural net AI on a 19×19 or larger board, what would its swap map look like?
- Is the obtuse corner always winning on larger board sizes?
- What about a move in the middle of Red's third row, like j3 on 19×19?
- Is the 4-4 or 5-5 obtuse corner still a good move in the early opening, or is it better to play closer to the center?
- For instance, imagine a board with an obtuse corner and sides extending to infinity. 4-4 is likely quite locally efficient with respect to this obtuse corner, for the same reason bots think it is optimal in 13×13. But 4-4 might not be locally optimal, and some other move (say, the 7-7 or 8-8 corner move, or something even further from the corner) could be ever-so-slightly more efficient on the infinite board, for deep tactical reasons that require far more space than on the 13×13 board.
- The above questions are partially solved with KataHex. On 19×19, KataHex thinks the obtuse corner is fair but slightly winning, while j3 is losing. The 4-4 obtuse corner is strong, and so is playing near the middle of the opponent's 5th row.
- If top humans or bots played 37×37 without the swap rule, how much of an advantage (in Elo terms) would the first player have in practice?
- Kriegspiel Hex (Dark Hex), a variant with incomplete information
- Under optimal mixed strategies, what is Red's win probability on 4×4?
- For larger boards (say, 19×19), is Red's win probability close to 50%?
- If so, a swap rule might not be needed for Kriegspiel Hex, which would be neat.
- If not, imagine a variant where Red's first move is publicly announced to both players, and Blue has the option to swap it. Which initial moves are the fairest now?
replies by Demer:
- https://zhuanlan.zhihu.com/p/476464087 has percentages, although it doesn't translate these into a guessed swap map and I don't know anything about the bot they came from.
- It suggests that [on 13x13, g3 is the most balanced opening] and [on 14x14, g3 should not be swapped].
- On 27x27 without swap, it likes the 4-4 obtuse corner opening slightly more than anything else nearby.
- As far as I'm aware, even 3×4 Dark Hex has not been solved. (https://content.iospress.com/articles/icga-journal/icg180057 apparently gives "some preliminary results" for that size.)
hexanna:
- Thank you, this is amazing! From the Google Translate output, the bot is an adaptation of KataGo trained on 13×13 and smaller, using transfer learning to train larger nets on top of the 13×13 net for a short period of time. I may edit the swap rule article later with some insights.
- The results for up to 15×15 look very reliable to me. This is because many of the subtle patterns suggested by other bots, like leela_bot, appear in these swap maps. For example, on 13×13:
- a1–c1 are stronger than d1; a2–c2 ≥ d2 ≥ e2 in strength; and a similar relation holds for moves on the third row. See Openings on 11 x 11#d2.
- b4 is weaker than all of its neighbors, because Blue can fit the ziggurat in the corner.
- j3 is surprisingly weak and i3 is surprisingly strong. Many people were surprised about this when leela_bot's swap map came out, but the result may be more than just random noise.
- a10 is the weakest of a4–a10, while a5 is the strongest.
- b10 is stronger than all of its neighbors, because Blue cannot fit the ziggurat in the obtuse corner.
- That this bot picked up on all these subtleties, and assigns a win percentage close to 100% for most moves on 13×13, suggests to me that it is probably stronger than leela_bot and gzero_bot. I can't know for sure, though.
- On the other hand, and the author seems to agree, the 37×37 map looks very unreliable. I see percentages as low as 37% but only as high as 54% (for a move like f1, which should almost certainly be a losing move).
- The 27×27 map looks more reliable. I'm personally very skeptical that moves on Red's 6th row are among the most balanced moves, but there are some interesting (if somewhat noisy) insights to be had still.
Article ideas
- Motifs — very loosely related to joseki; small local patterns that occur in the middle of the board, usually representing optimal play from at least one side but not necessarily both sides
- Motifs have some notion of "local efficiency" (not to be confused with efficiency): some motifs are, on average, good or bad for a particular player. Strong players anecdotally try to play locally efficient moves on large boards where calculating everything is impractical. It would be useful to have some of these rules of thumb written down. Local efficiency can be thought of as a generalization of dead/captured cells, where LE(dead cell) = 0, and LE(X) ≤ LE(Y) if Y capture-dominates X.
- Here are some examples. In the first motif, Red 1 is often a weak move. Blue's best response is usually at a, or sometimes at b or c as part of a minimaxing play. But d is rarely (possibly never) the best move, because Red can respond with a, and Blue's central stone is now a dead stone. So, for any reasonable working definition of "local efficiency" LE, we have LE(d) < LE(a), and LE(b) = LE(c) due to symmetry. KataHex suggests that LE(a) > LE(b).
The motif below seems quite common on large boards, and in my experience it is usually good for Red, who allows Blue to connect 2 and 4 in exchange for territory.
The following motif is quite clearly good for Blue, who captures the two hexes marked (*):
Sometimes, a player will attempt to minimax by placing two stones adjacent to each other, like the unmarked blue stones below. Red has several options, such as the adjacent block 1, though a far block is often possible too. It would be enlightening to know, absent other considerations, which block is the most "efficient" for Red, so that on a large board, Red could play this block without thinking too hard. Of course, in general the best move depends on the other stones on the board, and there's no move that strictly dominates another. The best move may even plausibly be to "play elsewhere."