Comments/Ratings for a Single Item



How about this for a new army?
The Bent Bozos
- a1, h1: Chiral Griffons
- b1, g1: Chiral Aancas
- c1, f1: Sastiks
- d1: Griffon
- e1: King
All pieces that appear in pairs are chiral, i.e. not identical to their mirror image, and thus come in two varieties. The piece used on the King side will be the mirror image of that used on the 'Queen' side. They will be positioned such that their moves in the initial position deflect away from the King. I.e. pieces starting on the left wing will have trajectories that bend to the left, and those that start on the right wing will bend to the right. This makes it possible for the Chiral Aanca to move out from behind the Pawn wall after the d- and e-Pawns have been pushed, and the Sastik has been developed.
G . . . . K a A a    G = Chiral Griffon starting on a8
. g . . . a . a .    A = Chiral Aanca starting on g8
. g . . a . . . a    g = path of a8
. g . a . . . . .    a = path of g8
. g a . . . . . .    x = on both a and g paths
. x . . . . s s .
a g . . s . . . .    S = Sastik starting on f1
. g . . s . S . s    s = squares attacked by f1
The Sastik is a compound of a Dabbaba and a Chiral Knight. It has the Knight moves that lie in the same direction off the orthogonals as the wing it starts on.
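To make the chirality rule concrete, here is a small sketch that writes the Sastik's moves out as (file, rank) offsets. It rests on my reading of the diagram above: the right-wing Sastik gets the Knight leaps that are deflected clockwise from each orthogonal, and the left-wing one gets the mirror image. The names `sastik_offsets`, `DABBABA` and `CLOCKWISE_KNIGHT` are made up for this illustration.

```python
# Dabbaba leaps: two squares along each orthogonal, as (file, rank) offsets.
DABBABA = [(2, 0), (-2, 0), (0, 2), (0, -2)]

# Knight leaps deflected clockwise from each orthogonal direction.
# Assumption: this is the set used by the Sastik that starts on the right wing.
CLOCKWISE_KNIGHT = [(1, 2), (2, -1), (-1, -2), (-2, 1)]


def sastik_offsets(wing):
    """Return the sorted list of leap offsets for a Sastik of the given wing."""
    if wing not in ("left", "right"):
        raise ValueError("wing must be 'left' or 'right'")
    if wing == "right":
        knight = list(CLOCKWISE_KNIGHT)
    else:
        # The left-wing piece is the mirror image: flip the file offset.
        knight = [(-dx, dy) for dx, dy in CLOCKWISE_KNIGHT]
    return sorted(DABBABA + knight)


if __name__ == "__main__":
    for wing in ("left", "right"):
        print(wing, sastik_offsets(wing))
```

Either way the piece has 8 targets, the same number as an ordinary Knight.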
The Griffon is of the variety that also has the Ferz moves.
Mating potential
The Chiral Griffons, and thus the full Griffon, can each force checkmate on a bare King. The Chiral Aanca is a color alternator, and thus cannot. The Sastik covers two orthogonally adjacent squares, and thus in principle can inflict corner mates. But it turns out such mates can be forced only when they are at most 4 moves away, from positions where the bare King is already confined to two corner squares by its rival. A single Sastik is not able to drive the King there.
Pieces that lack mating potential on their own can always force mate in pairs; there are no exceptions like KNNK in orthodox Chess. The mates are usually not of the common corner or edge type, though (i.e. on a1 or b1, with the attacking King on b3, or a symmetry-equivalent constellation).
Strength
The Bent Bozos are about as strong as FIDE, at least in a direct confrontation, where they beat it with a score of 51.5% over 800 games (alternating colors). This would make them one of the weakest armies, as the Clobberers, Nutters and Rookies typically score ~60% or more against FIDE. As for individual piece values, a Griffon tested as 830 cP, Chiral Griffons as 465 cP, Chiral Aancas as 425 cP, while the Sastiks are expected to be of Knight value (i.e. 325 cP). All this on the Kaufman scale, where Queen = 950 cP.
Since Ralph Betza took 20 years to change Chess Unequal Army to Chess Different Army, there is time to work out the exact name; it could be called "Falcon Scorpion Army" too.
The values, re-calibrated for a Scorpion that also cannot make two completely forward moves in a row, are Falcon 4.5, Scorpion 3.5, Dragon 3.0. That makes 22 points and leaves 9 points for the Queen. Notice the Falcon is equal to a Rook on boards 8x10 and up, but on the little 64 it would be just 4.5. Muller or other programmers could test different configurations, but they're bound to find '4.5' on the primitive 64-square board.
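The point count can be checked in a few lines. This is only a sketch: it assumes each of the three piece types appears as a pair, and it takes the 31-point army total from the discussion that follows; the variable names are made up.

```python
# Proposed values on the small board, in points.
PIECE_VALUES = {"Falcon": 4.5, "Scorpion": 3.5, "Dragon": 3.0}

# Assumption: the army is budgeted at 31 points in total.
ARMY_TARGET = 31.0

# Assumption: each piece type is present twice, like Rooks, Knights and Bishops.
paired_total = 2 * sum(PIECE_VALUES.values())

# Whatever is left of the budget is what the Queen substitute may be worth.
queen_budget = ARMY_TARGET - paired_total

if __name__ == "__main__":
    print("paired pieces:", paired_total)
    print("left for Queen:", queen_budget)
```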
The Were-Queen moves to Queen squares like Tim Stiles' Wolf and Fox. WereQueen is two-path and has different pathways getting to its squares than Boyscout or Panda or Strange Bishop that Jeremy Good talks about (in emails) for his defined Mal-Queen. By definition, WereQueen excludes the Stiles' squares not belonging to standard Queen, and by prior definition Fox and Wolf cannot stop adjacently. There is one prior use of Wolf and Fox in a CV; what is it?
If it turns out this Army is 32 not 31 points, we can prohibit the squares up to two away if necessary, thus making the first stops (0,2) and (2,2). As it is, WereQueen is a compound of Fox and Wolf limited to Queen-radial directions, and each arrival square is two-path. The improvement suggested by Jeremy keeps all the piece-types multi-path. Since the potent eight adjacent squares are excluded, WereQueen may just be the wanted 9 points, though benefiting a bit by having a second pathway compared to the 500-year-old orthodox Queen. It should be interesting to play, because no pieces except Pawns and King attack an adjacent square, but both Scorpion and Falcon have solitary mating potential, that is sovereign value of 1! So 5 starting pieces can mate alone with King on King.
Is there any significance that 31 (points) is the 11th prime, 11 being the first two-digit prime? No.
When placing Fantasy Grand of 100 squares, there is this summary of differing forces in CVs that fits in now: http://www.chessvariants.org/index/displaycomment.php?commentid=24682. Another Chess Different Armies, Fantasy Grand is ranked 18 now at Next Chess of 26 placed. Next Chess as my personal take admittedly has bias to larger boards. Yet deliberately excluded from consideration all along has been Betza's CDA. That is because it is too good for "next chess" since it is already well-established among CVers as the best bet to save the small board. So I am all in favor of Jeremy's all of a sudden overstressing CDA. Other GC highly played games are centuries old: Shogi, Xiangqi, OrthoChess. Forty-year-old CDA needs some company and at top of Next Chess so far are Bifurcators playable on 64 and Great Shatranj suitable for the best board of all Capablanca-chosen 8x10. The above linked thread asks, why not have differing yet equally-advantaged rules one side to the other, besides or rather than pieces? One follow-up genre should certainly create different armies for that 8x10 board intermediate between CDA and Fantasy Grand. All it takes to start is adding one paired piece to pre-existing CDA forces.
Re: CDA for 8 x 10, here's one attempt, an off-the-cuff one, pairing an unnamed CDA against Derek Nalls' Carrera contribution where pawn protection is maximized in the opening setup.
Btw, one reason why Waffles are so popular in CDAs is because they are fun to play with and they just feel really good as knight substitutes. Though I too strive for originality, it's not a "no no" per se to use the same pieces in different CDAs. Ralph Betza did it a lot.
Where I have my yet undefined "st-queen" substituted for the queen piece (can go to all the same squares as a queen via different routes), I might actually like to put a more thematic BNW (cardinal-wazir) but I'm not sure if there is such a thing yet on the Alfaerie - Many set. I may need to create one and get it uploaded. Do you know of such a piece? I've been looking for ways to modify the cardinal to make it equal to the queen on an 8 x 8 board and this way occurred to me this morning.
My most questionable substitute might be the gryphon for the marshall but I think what the marshall loses in the expanded board, the gryphon gains.
Now here's another challenge, George: Create a CDA for 8 x 10 that utilizes the Dragon, Scorpion and Falcon. This time our Dragon shall breathe a little easier.
Purely for the sake of diversity, I've taken the liberty of replacing the Shatranjian Knights with crab-ferzes.
Always look for something like tri-compound (Wazir + Knight + Bishop) in Gilman first. http://www.chessvariants.org/index/msdisplay.php?itemid=MSconclaveecumen has Primate and Cardinal in the same sentence under Pieces, but does not combine them, Shogi Dragon Horse and Carrera Centaur. Then Gilman's 'M&B11 Long-Nosed Generals' has a lot of other Wazir compounds. Actually the Dragon slides pretty well back and forth on 8x8 short of the central four, protected by the Pawns.
By saying it would breathe a little easier, I wasn't meaning to imply that the Dragon in an 8 x 8 is not still a fun piece to play with.
Please let me know, here or in email, what you think of this queen, whether it works for you as a were-queen piece or if you think we need a different piece.
That looks perfect, Jeremy, for the GC Preset WereQueen, and I look forward to an actual GC game of it because of the surprise that Scorpion and Dragon work well there.
This game is interesting but unbalanced. Of the three new armies, the Nutty Knights are the most powerful. None of them are colorbound, and five non-royal pieces are major pieces. That is more major pieces than each side in Chess has, which is only three. The Remarkable Rookies are more balanced with the usual Chess army. A Chancellor is weaker than a Queen, a short Rook is weaker than a Rook, and a Half-Duck combines the colorboundness of the Bishop with the short-range of the Knight, making it weaker than both. The only advantage of the Remarkable Rookies over the FIDE army is that the Woody Rook, which replaces the Knight, is a major piece, giving this side five major pieces instead of three. The Colorbound Clobberers are the weakest of all. Each side has four colorbound pieces, and the only major piece is weaker than the most powerful major piece in each of the other armies. The Cardinal is weaker than the Queen and Chancellor and probably the Colonel too, because these all have Rook moves, and the Cardinal doesn't.
In order to put data behind what I was saying, I ran Chess with Different Armies on Zillions of Games at 2 seconds per move 12 times, once for each combination of different armies against each other. These are the results:
- Colorbound Clobberers (W) beat Fabulous FIDEs and Nutty Knights, drawing with Remarkable Rookies.
- Fabulous FIDEs (W) beat Remarkable Rookies and Nutty Knights, losing to Colorbound Clobberers
- Nutty Knights (W) drew games with Remarkable Rookies and Fabulous FIDEs, losing to Colorbound Clobberers
- Remarkable Rookies (W) beat Fabulous FIDEs and Nutty Knights, drawing with Colorbound Clobberers
Giving one point for a win, half a point for a draw, here is how they did:
- Colorbound Clobberers: 4.5
- Remarkable Rookies: 4.5
- Fabulous FIDEs: 2
- Nutty Knights: 1
These results overturn my initial estimations. It may be that I overestimated endgame values and underestimated the value of mid-game tactics. But this is only one set of trials, and at two seconds a move, there is still room for bad moves. I'll do more trials later with longer periods of time to move.

Note that the Half-Duck is not color bound, through its (3,0) move, and actually is a major piece, and worth about as much as a Rook. I once did the same kind of testing as you are doing now, only with Fairy-Max (which also features CwDA as one of its standard games, selectable as 'fairy' from WinBoard's variant selection menu) instead of Zillions, and playing several hundred games between each of the armies. My conclusion was that the Nutty Knights were the strongest, more than 1 Pawn value stronger than FIDE, (which translates to ~75% score) then the Clobberers, which were just a bit better than the Rookies, and then FIDE. Being color bound hardly hurts, if the pieces come in a pair. Mating potential also only contributes very little to piece value; the Woody Rook has it, but is (slightly) weaker than a Knight over the game as a whole, because of its lower speed, and low forwardness. So the FAD and the DB are really Rook-class pieces, (with 12 move targets, while 8 move targets is already enough to rival the Knight in power), and much easier to develop than the Rook. A pair of them can perform a checkmate similar to the hand-over-hand mating by a pair of Rooks. This more than makes up for the Archbishop being nearly a Pawn weaker than a Queen. The Nutty Knights deserve their name, as they are really weird pieces. It surprised me how strong that Charging Knight is. The combination of speed through the fhN moves and concentration of attacks by the King moves seems to be a very productive one.
Thanks for mentioning that about the Half-Duck. I overlooked that, because its three-space leap had no link to a piece description. This means the Remarkable Rookies are all major pieces, which may give this army an even greater advantage. The results from my latest tests seem to bear this out. I ran another set of tests today. I ran 12 simultaneous instances of Zillions of Games playing CWDA in expert mode at three minutes per move, which was the maximum time it could take for a move (short of forever, which would have required constant human intervention). I then let the 12 games play while I went to work. The last one just finished recently. So now I can give the results.
Colorbound Clobberers, playing White, beat Nutty Knights and Fabulous FIDEs but lost to Remarkable Rookies. Playing Black, they beat Fabulous FIDEs but lost to Nutty Knights and drew with Remarkable Rookies. Total score: 3.5
Fabulous FIDEs lost every game. Total score: 0
Nutty Knights, playing White, beat Colorbound Clobberers and Fabulous FIDEs, losing to Remarkable Rookies. Playing Black, they beat Fabulous FIDEs and Remarkable Rookies, losing to Colorbound Clobberers. Total score: 4
Remarkable Rookies, playing White, beat Fabulous FIDEs, drew Colorbound Clobberers, and lost to Nutty Knights. Playing Black, they beat everyone. Total score: 4.5
So now the ranking is Remarkable Rookies first, Nutty Knights second, Colorbound Clobberers third, and Fabulous FIDEs last. These results are more in line with my initial evaluations. Given that the Remarkable Rookies have seven major pieces, the greatest number any army has, it might be the most powerful, and these results suggest that. Since these results are based on much longer thinking time than my earlier results, I would consider them more reliable.

I never saw any difference depending on the quality of play, in such tests. The score from a particular starting position was the same, whether I played it at 40 moves/10 min, 5 min, 2 min or 1 min. So I don't think there is any reason to assume longer thinking time will make the results more reliable. What is always a concern, however, is the number of games. With normal draw rates (about 30%) the typical error in the measured score percentage would be 40%/sqrt(N), where N is the number of games. So even with 100 games the error is still some 4%, while Pawn odds in orthoChess results in a score advantage of about 18% (i.e. 68% total). So with 100 games you will typically get errors of about 0.25 Pawn, in the strength determination. With only 10 games between two armies the error would be about 12.5%, i.e. about 0.7 Pawn. This is more than the typical difference between the armies (although the extreme case of Nutters vs FIDE came out above 1 Pawn), so in most cases any observed advantage with so few games would just be due to luck rather than strength. For this reason I now always test at 40 moves/min, as that makes it easier to collect the several hundred games I need to reduce the random noise to below 0.1 Pawn. I don't know how that would translate to Zillions time controls. I suppose that Zillions at 3 min/move could beat Fairy-Max at my average 1.5 sec/move. (I do not have Zillions.) One should take into account that when engines get a maximum or fixed time per move, they rarely can use all the time effectively, and might waste more than half of it, while in classical time controls the engine can allocate time such that none of it is wasted.
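For anyone who wants to plug in their own game counts, the rule of thumb above can be sketched as follows. The 40%/sqrt(N) figure and the 18% per Pawn are taken from the text. The `score_error_pct` variant is an added assumption: it treats each game as win, draw or loss with equal win and loss chances, which gives a per-game spread of 0.5*sqrt(1 - draw rate), about 42% at 30% draws. All function names are made up.

```python
import math


def rule_of_thumb_error_pct(n_games):
    """Typical error of the score percentage, using the 40%/sqrt(N) rule."""
    return 40.0 / math.sqrt(n_games)


def score_error_pct(n_games, draw_rate=0.30):
    """Same error from first principles (assumes equal win and loss chances)."""
    per_game_sd = 0.5 * math.sqrt(1.0 - draw_rate)
    return 100.0 * per_game_sd / math.sqrt(n_games)


def error_in_pawns(n_games, pawn_pct=18.0):
    """Convert the score error into Pawn units, taking Pawn odds as 18%."""
    return rule_of_thumb_error_pct(n_games) / pawn_pct


if __name__ == "__main__":
    for n in (10, 12, 100, 400, 800):
        print(n, round(rule_of_thumb_error_pct(n), 1), round(error_in_pawns(n), 2))
```

With these numbers 100 games give a 4% error and 400 games give 2%, while a 12-game round robin is far too small to separate armies that differ by less than a Pawn.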
I ran another set of trials with Chessv. I had it play all 12 combinations of armies playing white and black for 15 seconds per move. This is equivalent to the average time of 40 moves in 10 minutes, and it is also equivalent to the time 12 games playing at once for 3 minutes per move would get if they all shared the same processor. Zillions-of-Games was written before multi-core processors existed and so probably doesn't make use of them, but hopefully Windows 7 was spreading the load of 12 games among my four cores. Assuming it was, my ZOG trials had more than 15 seconds per move. Also, when some games finished, the remaining games got more processing power. Anyway, these games were each played sequentially, so that each one had as much processing power as it could get. Here are the results:
Playing white, Colorbound Clobberers beat Fabulous FIDEs and Nutty Knights but lost to Remarkable Rookies. Playing black, they beat Nutty Knights, drawing Fabulous FIDEs and Remarkable Rookies. Total score: 4
Playing white, Fabulous FIDEs drew Colorbound Clobberers and Remarkable Rookies, losing to Nutty Knights. Playing black, they drew Nutty Knights but lost to Colorbound Clobberers and Remarkable Rookies. Total score: 1.5
Playing white, Nutty Knights drew Fabulous FIDEs and Remarkable Rookies, losing to Colorbound Clobberers. Playing black, they beat Fabulous FIDEs, losing to Colorbound Clobberers and Remarkable Rookies. Total score: 2
Playing white, Remarkable Rookies beat Fabulous FIDEs and Nutty Knights, drawing Colorbound Clobberers. Playing black, they beat Colorbound Clobberers, drawing Fabulous FIDEs and Nutty Knights. Total score: 4.5
This gives a ranking of Remarkable Rookies first, Colorbound Clobberers second, Nutty Knights third, and Fabulous FIDEs last. Like my three-minutes-per-move trials with ZOG, Remarkable Rookies are first, and Fabulous FIDEs are last. One thing that is notable about the Chessv results is that there were several more draws. This is a good sign of balance between the armies. It's also notable that the first and last place armies each drew every other army once. The only two armies who never drew each other were Colorbound Clobberers and Nutty Knights, where the Colorbound Clobberers beat the Nutty Knights twice.
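The bookkeeping for such a 12-game round robin is easy to get wrong by hand, so here is a small sketch that tallies the Chessv games listed above. Each game is entered once, from White's point of view (1 for a win, 0.5 for a draw, 0 for a loss); the shortened army names and the `tally` helper are made up for this example.

```python
# One entry per game: (white army, black army) -> White's score.
WHITE_RESULTS = {
    ("Clobberers", "FIDEs"): 1, ("Clobberers", "Knights"): 1, ("Clobberers", "Rookies"): 0,
    ("FIDEs", "Clobberers"): 0.5, ("FIDEs", "Rookies"): 0.5, ("FIDEs", "Knights"): 0,
    ("Knights", "FIDEs"): 0.5, ("Knights", "Rookies"): 0.5, ("Knights", "Clobberers"): 0,
    ("Rookies", "FIDEs"): 1, ("Rookies", "Knights"): 1, ("Rookies", "Clobberers"): 0.5,
}


def tally(results):
    """Sum each army's points over all its games as White and as Black."""
    totals = {}
    for (white, black), white_score in results.items():
        totals[white] = totals.get(white, 0) + white_score
        totals[black] = totals.get(black, 0) + (1 - white_score)
    return totals


def ranking(totals):
    """Army names ordered from highest to lowest total."""
    return sorted(totals, key=totals.get, reverse=True)


if __name__ == "__main__":
    totals = tally(WHITE_RESULTS)
    for army in ranking(totals):
        print(army, totals[army])
```

Because every game hands out exactly one point, the four totals must add up to 12, which is a quick sanity check on any hand count.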
It's noteworthy that the Fabulous FIDEs have come in last place in my last two sets of trials. Ralph Betza came up with this game before the software for running computerized trials of the game was available. Over the course of 20 years, he came up with these armies by playing with people, presumably those who were very skilled at chess like he was. His target audience, after all, was the skilled chess player looking for a new challenge. When these armies are handled by people skilled and experienced in Chess, their familiarity with the Fabulous FIDEs would have made it easier for them to use them well, and their lack of familiarity with the new armies would lead them to not fully maximize their potential. So, armies that were technically stronger than the FIDEs would appear balanced with FIDEs in human trials with skilled chess players. I expect that is what happened here. All of the new armies are stronger than the FIDEs, but this didn't show up in human trials.
It showed up in the computerized trials, because the computer programs treated each army as a mathematical construct rather than as something familiar or unfamiliar. To the computer programs, no army was more familiar than another. It played each side with mathematical precision, bringing to light differences of strength between them. The Remarkable Rookies are probably the strongest, having seven major pieces.
Combining the results of the last two trials, the scores are:
- Remarkable Rookies: 9
- Colorbound Clobberers: 7.5
- Nutty Knights: 6
- Fabulous FIDEs: 1.5
This gives the same order as the Chessv trial alone.
Combining the results of all three trials, the scores are:
- Remarkable Rookies: 13.5
- Colorbound Clobberers: 12
- Nutty Knights: 7
- Fabulous FIDEs: 3.5
Again, we get the same order. So what factors might account for this? The Remarkable Rookies have seven major pieces, as I mentioned. So, in the endgame, Remarkable Rookies is likely to still have one or two major pieces, whereas other armies might not. Also, the Chancellor is more powerful than the Cardinal or Colonel. One of the strengths that all three new armies have that FIDE doesn't have is that their pieces can back each other up more easily. They each have different pieces with some moves in common, which allows them to continually protect each other more easily. One of the weaknesses of the Nutty Knights is in backward mobility. Although they are very good attack pieces, they can be useless in stopping a pawn promotion if they are too far away.

I used that to do some more accurate Rookies measurements, by playing the Rookies against all other armies at 40 move/min with Fairy-Max 4.8U, both with black and with white. Each of the color/army combinations played 400 games. The table below summarizes the results; +13% in the table means that the white army scored 13% better than equality, i.e. 63%, etc. Draw percentages were only 20-25%, i.e. significantly lower than the 32% seen in FIDE vs FIDE. For completeness I am redoing matches between the other armies as well. I am playing 4 matches simultaneously, on my 4-core (8 hyper threads) i7 CPU; this way each of the matches will use its own physical core, so that they won't affect each other. (Fairy-Max also uses only 1 core; using more cores in these kinds of tests is quite counter-productive anyway, as a lot of computational effort is wasted when multiple cores have to cooperate in finding the best move.)
                      black
white        Rook   Nutt   Clob   FIDE   total
Rookies      ####    +3%   +13%   +15%    +62%
Nutters       -3%   ####    +3%   +12%    +19%
Clobberers   -13%    +0%   ####    +8%    -11%
FIDE         -16%   -10%   -10%   ####    -71%
(All results based on 400 games.)
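As a cross-check on the "total" column, here is a sketch that rebuilds it from the individual entries: an army's total is what it scored above equality as White, plus what its opponents scored below equality when it had Black. The data structure and the `army_total` helper are made up for this example. Run on the rounded entries it reproduces the totals for Nutters, Clobberers and FIDE, and gives +63 rather than +62 for the Rookies, which presumably reflects rounding of the underlying results.

```python
# WHITE_EXCESS[white][black] = White's score above equality, in percent.
WHITE_EXCESS = {
    "Rookies":    {"Nutters": 3, "Clobberers": 13, "FIDE": 15},
    "Nutters":    {"Rookies": -3, "Clobberers": 3, "FIDE": 12},
    "Clobberers": {"Rookies": -13, "Nutters": 0, "FIDE": 8},
    "FIDE":       {"Rookies": -16, "Nutters": -10, "Clobberers": -10},
}


def army_total(army):
    """Sum of an army's excess scores over its games as White and as Black."""
    as_white = sum(WHITE_EXCESS[army].values())
    # As Black, the army gains whatever White fell short of equality.
    as_black = -sum(row[army] for white, row in WHITE_EXCESS.items() if white != army)
    return as_white + as_black


if __name__ == "__main__":
    for army in WHITE_EXCESS:
        print(army, army_total(army))
```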
A Pawn is in general worth 15-18% (I did not test that in these particular setups). So the strength differences seem to run up to about 1 Pawn. Indeed it seems that the Rookies are the strongest of all, in agreement with your ChessV and Zillions tests. I find them only marginally stronger than the Nutters, however, while the Clobberers seem to be about half-way in strength between them and FIDE. Statistical error in 400 games is 2%, so the 3% of the Rookies-Nutters result lets us decide with >90% confidence that the Rookies indeed have the better chances here.
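The confidence figure can be reproduced with a short sketch, assuming the measured score is approximately normally distributed around the true value, so that a 3% excess with a 2% error is 1.5 standard deviations. The function name is made up; `math.erf` is from the Python standard library.

```python
import math


def one_sided_confidence(excess_pct, error_pct):
    """Probability that the true excess is positive, under a normal approximation."""
    z = excess_pct / error_pct
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Rookies vs Nutters: 3% excess, 2% statistical error in 400 games.
    print(round(one_sided_confidence(3.0, 2.0), 3))
```

For 3% and 2% this comes out at about 93%, in line with the >90% quoted above.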
As to the explanation why FIDE is the weakest, I think you are correct: humans fail to get the most out of an unfamiliar army. This explanation is a bit suspect though (although I do not have any better one): it is not like the FIDE side is playing normal Chess, and failing to grasp what the opponent can do should also expose him to unpleasant surprises, as well as making his opponent miss opportunities. These computer self-tests are really highly insensitive to poor handling of the pieces of one side. If I intentionally mess up the play for one of the armies by putting in wrong piece values (e.g. give a piece that is stronger than a Rook a value lower than that of the Rook) it hardly changes things. The side handling the piece will now try to (inappropriately) trade it for a Rook, but the Rook side will now try to avoid that trading (equally inappropriately), so that in the end very little trading takes place, and both pieces will remain on the board to see the effect of which one is more effective in, say, gobbling up Pawns or delivering checkmates. As long as both sides share the same misconception, (and it is not too hilarious, like swapping Pawn and Queen value) the results are practically the same. It is not clear to me why this would be different for humans.
With the new results on the relative strengths of the different armies, how can they be fine-tuned to the FIDE standard? For the Nutty Knights several proposals exist (replacing the charging knight with a drunken knight or with a charging moo, e.g.); but what about the other armies? The Rookies can be weakened in two obvious ways: (a) replacing the Short Rook R4 with R3 or (b) making the Woody Rook WD a non-jumping R2. I think both adjustments will have the right size of effect. The Colorbound Clobberers are more difficult because the adjustment needed is smaller. Maybe replacing the Bede (BD) with a BzF2 (Bishop + Crooked Bishop aka Boyscout restricted to 2 moves) has the right size of effect. What would be a good name for the BzF2? EDIT: Changing the notation from BzF2 to BzB2 suggests the nice name "Busy Beaver" for this piece.

I am still a bit worried about the reliability of these strength determinations, because the Nutters seem to get an inconsistent result. They seem to underperform against the Clobberers, and I don't understand why. The Nutters are a very tricky army to handle. They are highly deficient in backward moves, and the only piece that at least can step 2 backward (the Fibnif) can only step 1 sideways at a time. This makes it very difficult for them to catch a passer. Fairy-Max often is not aware of that before it is too late. Perhaps an engine that would intentionally keep at least one piece as 'rear guard' would do much better with the Nutters. But this problem should occur against any army, so it cannot explain the poor score of the Nutters against the Clobberers.
I have finished running a new set of 12 trials. These ran on Chessv at four minutes per move. Since the last three games to finish were taking a long time, I stopped them occasionally and finally finished them at two minutes per move. By this time, that would probably still give them more thinking time than they had when all 12 games were going simultaneously. Here are the results:
Colorbound Clobberers, as white, beat Fabulous FIDEs and Remarkable Rookies, losing to Nutty Knights. As black, they beat Nutty Knights, losing to Fabulous FIDEs and Remarkable Rookies. These are perfectly even results. Total score: 3
Fabulous FIDEs, as white, beat Colorbound Clobberers, drew Nutty Knights, and lost to Remarkable Rookies. As black, they drew Nutty Knights but lost to Colorbound Clobberers and Remarkable Rookies. Total score: 2
Nutty Knights, as white, drew Fabulous FIDEs and Remarkable Rookies but lost to Colorbound Clobberers. As black, they beat Colorbound Clobberers and Remarkable Rookies, drawing Fabulous FIDEs. Total score: 3.5
Remarkable Rookies, as white, beat Colorbound Clobberers and Fabulous FIDEs, losing to Nutty Knights. As black, they beat Fabulous FIDEs, drew Nutty Knights, and lost to Colorbound Clobberers. Total score: 3.5
This time, Nutty Knights and Remarkable Rookies tied for first, though in games against each other, the Nutty Knights did better. The FIDEs still came in last, but they did better this time. Notably, these are the most even scores I have gotten so far, and they result from longer thinking time than previous trials have had. Two of the longest games were Fabulous FIDEs vs Remarkable Rookies. These games seemed drawish at times but were eventually won by the Rookies. The other longest game was Fabulous FIDEs (w) vs Nutty Knights (b), which drew despite the Nutty Knights having a major piece vs. a Bishop and equal numbers of Pawns.
I thought the Nutty Knights had a chance if the King would just move along the squares the Bishop couldn't reach and help out the other piece, but it didn't happen.
There is of course one dark spot in all strength measurements by computer ... chess programs aren't very good in the opening without an opening book. Some good opening book (but where to get it from?) could change all evaluations. Nevertheless, testing without an opening book is all we have for a new chess variant.

As long as both sides play the opening equally poorly, I don't expect this to matter very much.
Despite their intractability (in most cases), it is true (as an existential theorem) for all turn-based, two-player chess variants that, with perfect play, a decisive, game-winning advantage exists for either white or black. Furthermore, this advantage will be amplified where the armies are unequal and/or asymmetrical. The fact that the problem fails to "bite us" because the quality of play needed to reveal it is out of reach for both human and state-of-the-art AI players does not render it insignificant. It just has little practical effect.

> for all turn-based, two-player chess variants that, with perfect play, a decisive, game-winning advantage exists for either white or black
I am not sure if I read you correctly, but this sounds to me like you exclude the possibility that the initial position is a theoretical draw. Yet this is the consensus opinion on orthodox Chess. And for a 'Chess variant' where both sides would start with just King + Rook (on d1/e1/d8/e8, say) this has even been proven.
I have no idea why any competent game designer would choose to imitate the maze of overcomplicated rules associated with checkmate from FIDE Chess that unnecessarily create a wide "draw gulf" but unfortunately, there are many thousands of such unimaginative, similar chess variants available. Sorry, I have a tendency to conceive of chess variants potentially as the infinite variety of unique, non-trivial differences from one another that they are in theory. In practice, they fall short.
25 comments displayed