Simulator limits: we are ignoring good decks

NETRAT’s Evaluate Decks simulator is great, and I use it a lot. I believe that it has dramatically improved the deck-building skills of the Tyrant community. In fact, it has done such a good job that our deck-building skills now exceed a simulator’s ability to measure the performance of some decks accurately. There is nothing wrong with EvalDecks; this is a limitation of simulators.

I have come across a few different examples, but most of them are complicated and would take many pages to explain. This example is relatively simple, but it is also relatively unimportant. I know it is unimportant, so there is no need to say, “who cares? The win rate is 99.76% anyway.” That’s not my point.

I am using this as an example of how the simulator is sometimes wrong. But on the Wiki and the Fansite, we currently rank decks based on the simulator results. Because our deck-building skills are more sophisticated now, I am absolutely certain that we are ignoring good decks because the simulator says they are inferior to other decks. This example shows one limitation of the simulator.

Again, this example is not about improving the win rate for this deck.

On the Fansite, the “best” deck currently listed for Enclave Flagship has a 99.76% win rate and average net points of 24.840958.

Notice that the order includes playing 1 Daemon before Hephatat, and especially notice that the order includes playing 1 Daemon and Hephatat before Elusive Panzer.

According to the simulator, if you take the same cards and rearrange them slightly, the deck is inferior.

Win rate goes down to 99.66% and average net points to 24.807638.

I know that this is a relatively insignificant drop. My point is that I think it is wrong. If this is wrong, then there are other decks that may be wrong and we might be ignoring good decks. Let me show you why I think it is wrong.

When a human looks at the card order, the first deck suggests, when given a choice, play 1 Daemon before you play Elusive Panzer; the second deck says, when given a choice, always play Elusive Panzer before Daemon.

But when the simulator looks at the second deck, it sees something different. It sees that when it has a choice it should play Hephatat before playing any Daemon or Iron Maiden or Electromagnetic Pulse. That’s fine if you have Longshot or Elusive Panzer in your hand. But if your opening hand has Hephatat and Daemon, but not Longshot or Elusive Panzer, then the simulator will play Hephatat first. That is a bad play and no human would do that. A human would play Daemon first.

And that is what the card order in the first deck tells the simulator to do. In the first deck, the chance that Hephatat will be played first is low. It will only happen if you don’t have Longshot or Daemon in your opening hand and you do have Hephatat. I’m not going to do the full math, but I know that will occur less than 5% of the time.

By comparison, with the second deck, the simulator will play Hephatat first about 17.8% of the time. (If I did the math correctly; whatever the number is, it is much higher than the first deck.)
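
The percentages above can be checked by enumerating every possible opening hand. The sketch below is a rough check, not NETRAT's code. It assumes a 10-card deck, a 3-card opening hand, and a simulator that opens with whichever card in hand appears earliest in the card order, matched by name. The deck composition and the names `first_play_probability`, `deck_first` and `deck_second` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def first_play_probability(order, target, hand_size=3):
    """Exact chance that `target` is the first card played.

    Assumption: the opening play is the card in hand whose name
    appears earliest in `order` (copies of a card are interchangeable).
    """
    # Priority of each card name = position of its first occurrence.
    rank = {}
    for position, name in enumerate(order):
        rank.setdefault(name, position)

    hits = 0
    total = 0
    # combinations() treats each physical copy as distinct, so every
    # opening hand is counted with the correct weight.
    for hand in combinations(order, hand_size):
        total += 1
        if min(hand, key=rank.__getitem__) == target:
            hits += 1
    return Fraction(hits, total)


# Hypothetical composition: 1 Longshot, 4 Daemon, 1 Hephatat,
# 1 Elusive Panzer, 2 Iron Maiden, 1 Electromagnetic Pulse.
deck_first = ["Longshot", "Daemon", "Hephatat", "Elusive Panzer",
              "Daemon", "Daemon", "Daemon",
              "Iron Maiden", "Iron Maiden", "Electromagnetic Pulse"]
deck_second = ["Longshot", "Elusive Panzer", "Hephatat",
               "Daemon", "Daemon", "Daemon", "Daemon",
               "Iron Maiden", "Iron Maiden", "Electromagnetic Pulse"]

print(float(first_play_probability(deck_first, "Hephatat")))
print(float(first_play_probability(deck_second, "Hephatat")))
```

With this assumed composition, the first order opens with Hephatat in 6 of 120 hands (5%) and the second in 21 of 120 (17.5%), which is in the same ballpark as the figures quoted above. The real decks may differ.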

Plus, both of these decks are “wrong” in the sense that a human and the simulator see Hephatat very differently. A human looks at this deck and thinks, “play Hephatat after two AntiAir units.” The simulator does not do that. The simulator follows the card order as best it can by looking at the cards already played. It doesn’t understand that you want Hephatat to be played after two AntiAir units—any units—it thinks you want it after Longshot and Daemon. (In the first deck.)

So a human will play this deck differently than the simulator simulates it. Therefore, the results are “wrong” in the sense that they don’t accurately reflect game play. (There is nothing wrong with EvalDecks; this is a limitation of simulation programs.)

In fact, once I saw the Elusive Panzer inaccuracy, I did a ton of testing, and I am convinced that Hephatat should actually be played second, after an AntiAir unit, in order of preference: Longshot, Elusive Panzer, Daemon. But the simulator will say that is a bad idea. Why? Because if you put the card order as Longshot, Hephatat, [everything else], then Hephatat will be played first quite often and that is a bad thing.

Now you know why I decided to use a simple example instead of a complicated example.

I don’t know what the solution is, however. It’s ironic that the tool that made us into better deck builders—Evaluate Decks—is now the same tool that is sometimes causing us to ignore good ideas.

In fact, for Enclave Flagship, given the cards that I own, I think this is the best deck. If I run this exact deck, however, the simulator says it is inferior. I did a lot of testing using other methods (like comparing 8 Elusive Panzer + Electromagnetic Pulse to 8 Elusive Panzer + Wasteland Skimmer), and I think this is the best deck I can construct right now.

So what do we do? How could I “prove” to you that this last deck is the best? The simulator says it is not. Again, in this situation, the differences are trivial. But there are certainly many other situations where the differences are not trivial. The more Unique cards you use, for example, the more likely the simulator will not accurately reflect your game play.

How do we know which good decks we are ignoring?

 

The simulator algorithm tries to play the cards in order. In cases where it can’t, it’ll prioritize. So, my best guess would be to ensure that the key cards come down first, then 1 filler before the more fragile key cards, then have said fragile cards, then the rest of the fillers.

 

Yes, I have been thinking about this too lately. I understand the simulator’s logic, and I can build decks that are the best for the evaluator’s playstyle but that could probably be improved when a human plays them.
That’s because, to get a good rating on the sim, you always have to give it a good opening card in every combination of three cards you can draw, so you have to change the play order, remove cards that the sim will not play properly, etc.
The only solution seems to be making the sim smarter (because I cannot play 1 million battles to see whether my deck is good or not). Maybe NETRAT could change it a bit to let us specify a card order for every possible first draw (and the successive ones, if needed). I mean, we give the sim 10 cards, we set an order for the first card, then a conditioned order for the second, then a double-conditioned order for the third, etc.
This would give us perfect control over the deck’s behaviour, but at a high price: it would be such a giant mass of information to enter and store that nobody would find it useful. I mean, you could build a 99.9999999% deck, but the playstyle would include so many little details (which you normally learn through playing experience) that it would be overcomplicated for anyone.
Maybe we should be able to give the sim some basic instructions (start with an anti-air unit, play EMP only if the opponent has more than 3 assault units, etc.), but that would not solve the problem either; it’s a compromise.
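
To get a feel for how much information a fully conditioned play order would need, one can count the decision points. This is a rough sketch under simplifying assumptions that are not taken from the simulator: 10 distinct cards, a full 3-card hand at each decision, and only the set of cards already played matters (not their board order or the enemy's cards). The function name `count_decision_points` is hypothetical.

```python
from math import comb


def count_decision_points(deck_size=10, hand_size=3):
    """Count (cards already played, current hand) situations.

    Assumption: a situation is identified by the set of cards
    already played plus the set of cards currently in hand.
    """
    total = 0
    # k = number of cards already played; stop once fewer than
    # hand_size cards remain, because no full hand is possible.
    for k in range(deck_size - hand_size + 1):
        played_sets = comb(deck_size, k)          # which cards are on the board
        hands = comb(deck_size - k, hand_size)    # which cards are in hand
        total += played_sets * hands
    return total


print(count_decision_points())
```

Under these assumptions there are 15,360 distinct situations, each needing its own choice. The count grows much larger once board order and enemy cards are included.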

Very good thread Hunterhogan, I’ll think about that.
Sorry for bad english.

 
Originally posted by hunterhogan:

How could I “prove” to you that this last deck is the best?

Actually, this is not that important to me. If you don’t believe me, that’s fine. But I learn a lot from other players—if the simulator says the deck is inferior how can you show me that the simulator is wrong? Wasteland Skimmer is a great example. PsychoticSoul posted a deck for this raid in which he replaced Electromagnetic Pulse with Wasteland Skimmer. It was a great idea and the simulator agreed. But when I tried to do it for the “best” deck, the simulator gave it a lower rating. I think PsychoticSoul is right, and in this case, I spent a lot of time trying to figure out what was happening.

But that is inefficient. The point of having a Wiki and a Fansite for posting decks is so 1000 players don’t all build decks in isolation. On the other hand, if there is a good idea (like Wasteland Skimmer) but the simulator says it is a bad idea, then we, as a group, will ignore the idea.

That is the problem I want to solve.

 

The first step is commenting on the deck so that people will read it and understand why it may have a worse winrate / average damage.

 
Originally posted by Shadowhopeful:

The first step is commenting on the deck so that people will read it and understand why it may have a worse winrate / average damage.

Agreed.

 
Originally posted by Shadowhopeful:

The first step is commenting on the deck so that people will read it and understand why it may have a worse winrate / average damage.

QFT

 

So the proper idea might be to make some rules and build a simulator around using those rules. So you said, “we want 1 antiair, then heph”, why not define parameters for that, and then swap in the best antiair cards. (It would take way more processing time though, which is always going to be a limitation of any “better” system).

 

Interesting idea, and you didn’t even mention the most obvious and damaging example: action cards. The simulator has no chance of using these correctly and with the introduction of on play abilities (effectively giving us more playable action cards) the problem is even more wide spread.

 

Hunter, it should be all Elusive Panzers, not Daemons. Evade is good not only for avoiding the strikes but also for evading the weaken.

also when you run the simulator with “cards played in order” i’m fairly confident it assumes your hand has all 10 cards and just simply plays those cards in order. As a progarmer I know the work involved and extra (unneccisary) computation power required to factor in the fact that a hand has only 3 cards. I find it highly unlikely for that to be part of the coding.

When looking at percentages, you have to add for players being able to adapt to situations (this is why running the sim won’t help you much in tournaments).

You have to subtract for the fact that you might not always draw the best card to play.

 

@hunter, in your test case the sim is correct; your analysis of the two test decks is wrong.

Notice that you play Longshot first. Longshot INTERCEPTS the strike, which makes Panzer’s evade less useful. Daemon’s higher base attack must then take precedence over the off chance of Panzer evading weaken and facing an azr. Also, Longshot protects a Daemon from strike, and Panzer still has a chance to evade on. Placing Longshot by Panzer makes Panzer’s evade inefficient.

Did I just successfully nullify your entire argument? ;)

 

This does remind me of a more important point:

“cards played in order” =/= true manual

For GoG1 I got a deck that I tested with 2 bars to over 50% success. It involved lining up anti-air with Xeno Overlord and Mothership. However, no sim could do that, whereas a human can.

Now, the sim gave something like a 7% success rate, so on the Fansite it was voted down (by idiots who don’t read and try before voting).

 

This is news? Of course the simulator isn’t perfect at evaluating decks when played manually. The play-in-order measure is at best a minimum expected win rate; how much higher the maximum may be is very hard to determine. Without dedicating far more computing power than is reasonable, it’s really not possible to get an accurate gauge of the maximum expected win rate with intelligent playing from a simulator.

It could be improved with the ability to use rule-based play, so people could set a list of rules for how to play the cards, but depending on how NETRAT’s simulator currently works, that may not be an easy thing to just tack on to the simulator. If the simulator lets you feed in a play order and leverage its battle simulation logic, it mightn’t be too tricky to script the play order rules yourself, though.

 

Too bad you can’t specify how to “counter” a card in eval.

Example: You load an enemy deck, then go to a “Manual” tab. “Manual” lists each enemy card; then, for each enemy card, you rate the cards in your current deck from best to worst counters.

(So if the defending deck was Sundering Ogres, I’d rate something like Thundercrag as the highest counter, followed by Kilgore, etc.)

Action cards would still be a problem though.
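
The per-enemy counter ranking described above could be stored as a simple lookup table. This is only a sketch of the idea, not a feature of the evaluator; the names `COUNTERS` and `best_counter` are hypothetical, and the single entry reuses the Sundering Ogre example from the post.

```python
from typing import Dict, List, Optional

# Hypothetical table: enemy card -> your cards, best counter first.
COUNTERS: Dict[str, List[str]] = {
    "Sundering Ogre": ["Thundercrag", "Kilgore"],
}


def best_counter(enemy_card: str, hand: List[str],
                 counters: Dict[str, List[str]] = COUNTERS) -> Optional[str]:
    """Return the highest-rated counter currently in hand, if any."""
    for name in counters.get(enemy_card, []):
        if name in hand:
            return name
    # No rated counter in hand: the caller falls back to the normal order.
    return None


print(best_counter("Sundering Ogre", ["Kilgore", "Daemon", "Thundercrag"]))
```

A table like this still says nothing about when to hold or spend an action card, which matches the caveat above.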

 
Originally posted by shian:
… As a progarmer I know the work involved and extra (unneccisary) computation power required to factor in the fact that a hand has only 3 cards….

Oh lord, this made me lol. Shian claiming to be a “progarmer”? Anyone who’s ever programmed knows that precise spelling is important for variable names, among other reasons. Considering Shian’s posts, any code he’d write would be a complete train wreck.

As for the main subject: Hunter, I think you may have answered your own question. If you’ve ever played Final Fantasy XII and used Gambits, or Sorcery Quest’s Automata (on Kong), they provide a very simple interface for making rudimentary algorithms. I.e. Use heal if hp < 30%, or in this case, play Hephatat after two units with anti air have been deployed. Of course this would totally be up to NETRAT, and it still wouldn’t be perfect, but you get the idea.
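
A Gambit-style rule list like the one suggested above might look roughly like this. It is a sketch only: the simulator exposes nothing like it as far as this thread shows, and `Gambit`, `GAMBITS`, `choose_card` and the helper functions are hypothetical names. The anti-air set uses the three units named earlier in the thread.

```python
from dataclasses import dataclass
from typing import Callable, List

# Anti-air units as named earlier in the thread.
ANTI_AIR = {"Longshot", "Elusive Panzer", "Daemon"}


def anti_air_played(played: List[str]) -> int:
    """Number of anti-air units already on the board."""
    return sum(1 for name in played if name in ANTI_AIR)


def always(played: List[str]) -> bool:
    return True


@dataclass
class Gambit:
    card: str
    condition: Callable[[List[str]], bool]  # receives the cards already played


# Rules are checked top to bottom; the first one that applies wins.
GAMBITS = [
    Gambit("Hephatat", lambda played: anti_air_played(played) >= 2),
    Gambit("Longshot", always),
    Gambit("Elusive Panzer", always),
    Gambit("Daemon", always),
    Gambit("Iron Maiden", always),
    Gambit("Electromagnetic Pulse", always),
    Gambit("Hephatat", always),  # last resort when nothing else is in hand
]


def choose_card(hand: List[str], played: List[str],
                gambits: List[Gambit] = GAMBITS) -> str:
    for gambit in gambits:
        if gambit.card in hand and gambit.condition(played):
            return gambit.card
    return hand[0]  # no rule matched: play whatever is first in hand


print(choose_card(["Hephatat", "Daemon", "Iron Maiden"], []))
```

The point of the design is that the Hephatat rule depends on the board state (two anti-air units of any kind) rather than on specific cards preceding it in a fixed list.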

 

gaahh the eval is right. Hunter forgot longshot has intercept which is changing his results.

(Longshot + daemon being protected by intercept + a panzer still with evade) > (longshot + panzer protected by intercept (useless) + a vulnerable daemon)

 

@shian, I have no proof (and am prepared to be proved wrong), but I’m sure that the sim does factor your hand into the percentages, as it would otherwise be a deeply flawed system (again, no real proof).

I looked into your GoG1 deck, and on the face of it the win rate does look low. But when you look at the different win percentages by card order, you can get anywhere from 20% wins (at a maximum heap size of 700) to 67% wins (at 7000).
It might be that your deck (as Hogan has pointed out with the sim) varies strongly with the opponent’s cards and is locked into one strategy, i.e. killing Xeno Mothership and Overlord with Soot Launcher.

Most high % win decks have a certain autonomous nature built in them like the one hogan pointed out; it has 4 daemons to reduce the possibility of hephatat being played first.

I’ve played a simulated deck from the Fansite with a 90% auto win rate; it was slow and filled with healing, and I lost twice in a row.
I then played a quick-strike deck, since the losses suggested that I can win as long as a card is dead before it can activate, and I completed the mission without any more losses.
Of course, if I could play 1 million battles in 7 seconds, my wins and losses might be closer to the sim percentage.
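
The gap between a handful of real games and the simulator's percentage is largely a matter of sample size. The sketch below assumes each battle is an independent win-or-lose trial with a fixed win rate, which is a simplification; the function names are hypothetical.

```python
from math import sqrt


def win_rate_standard_error(win_rate: float, games: int) -> float:
    """Standard error of a win rate measured over `games` battles."""
    return sqrt(win_rate * (1.0 - win_rate) / games)


def chance_of_losing_streak(win_rate: float, streak: int) -> float:
    """Probability that the next `streak` battles are all losses."""
    return (1.0 - win_rate) ** streak


print(round(chance_of_losing_streak(0.90, 2), 4))
print(round(win_rate_standard_error(0.90, 100), 4))
```

For a 90% deck, two straight losses from any given starting point happen about 1% of the time, and even after 100 games the measured win rate still carries a standard error of about 3 percentage points.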

Those percentages seem to depend less on actual skill while playing and more on an overpowered combination that the opponent’s cards can’t win against.

 
Originally posted by Shadowhopeful:

The simulator algorithm tries to play the cards in order. In cases where it can’t, it’ll prioritize.


Is it possible to add an “exact order” option? With this option, the simulator would ALWAYS assume we have the required cards available, so the cards can be played in that exact order. I know this is impossible when we actually play the deck in the game, but it may show the best card order under optimal conditions.

 
Originally posted by shian:

also when you run the simulator with “cards played in order” i’m fairly confident it assumes your hand has all 10 cards and just simply plays those cards in order. As a progarmer I know the work involved and extra (unneccisary) computation power required to factor in the fact that a hand has only 3 cards. I find it highly unlikely for that to be part of the coding.

In some other thread I can’t be arsed to track down right now, it was plainly stated that the evaluator takes draw order into account, and attempts to adhere as closely as possible to the specified play order.

The “work” involved in shuffling the deck and restricting choices to one of the three cards in hand at any point in time is utterly trivial, and given that you are unlikely to draw your cards in the same order every match, this restriction helps account for the probability of placing key cards at particular moments — it is certainly not “unneccisary[sic]”.
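
To illustrate how little work the hand restriction takes, here is a minimal sketch of a shuffle-and-draw loop. It is not the evaluator's code. It assumes the hand is refilled after each play and that the simulator plays the card in hand that fills the earliest open slot in the specified order, matched by name. The names `simulate_play_order` and `example_order` are hypothetical, and the deck list is made up.

```python
import random


def simulate_play_order(order, hand_size=3, rng=None):
    """Return the sequence of cards played for one random shuffle."""
    if rng is None:
        rng = random.Random()

    draw_pile = list(order)
    rng.shuffle(draw_pile)
    opening = min(hand_size, len(draw_pile))
    hand = [draw_pile.pop() for _ in range(opening)]

    open_slots = list(order)  # slots of the play order not yet filled
    played = []
    while hand:
        # Earliest open slot whose card is actually in hand.
        choice = next(name for name in open_slots if name in hand)
        hand.remove(choice)
        open_slots.remove(choice)  # removes the earliest matching slot
        played.append(choice)
        if draw_pile:
            hand.append(draw_pile.pop())  # draw back up
    return played


# Hypothetical deck list, used only to exercise the function.
example_order = ["Longshot", "Daemon", "Hephatat", "Elusive Panzer",
                 "Daemon", "Daemon", "Daemon",
                 "Iron Maiden", "Iron Maiden", "Electromagnetic Pulse"]

print(simulate_play_order(example_order, rng=random.Random(1)))
```

Setting `hand_size` to the deck size reproduces the “exact order” case requested earlier in the thread, since every card is then always available.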

I can’t decide if you were trying to call yourself a “programmer” or a “pro gamer”, but both are highly improbable.

 
Originally posted by BlankZero:
Originally posted by shian:

also when you run the simulator with “cards played in order” i’m fairly confident it assumes your hand has all 10 cards and just simply plays those cards in order. As a progarmer I know the work involved and extra (unneccisary) computation power required to factor in the fact that a hand has only 3 cards. I find it highly unlikely for that to be part of the coding.

In some other thread I can’t be arsed to track down right now, it was plainly stated that the evaluator takes draw order into account, and attempts to adhere as closely as possible to the specified play order.

The “work” involved in shuffling the deck and restricting choices to one of the three cards in hand at any point in time is utterly trivial, and given that you are unlikely to draw your cards in the same order every match, this restriction helps account for the probability of placing key cards at particular moments — it is certainly not “unneccisary[sic]”.

I can’t decide if you were trying to call yourself a “programmer” or a “pro gamer”, but both are highly improbable.

Wow, good to know. I mean, I can think of the algorithm to take into account that you only draw 3 cards and pick the “highest priority” of the 3; I just thought it seemed like trivial and unnecessary work.

 
Originally posted by Atsuiai:
Originally posted by shian:
… As a progarmer I know the work involved and extra (unneccisary) computation power required to factor in the fact that a hand has only 3 cards….

Oh lord, this made me lol. Shian claiming to be a “progarmer”? Anyone who’s ever programmed knows that precise spelling is important for variable names, among other reasons. Considering Shian’s posts, any code he’d write would be a complete train wreck.

As for the main subject: Hunter, I think you may have answered your own question. If you’ve ever played Final Fantasy XII and used Gambits, or Sorcery Quest’s Automata (on Kong), they provide a very simple interface for making rudimentary algorithms. I.e. Use heal if hp < 30%, or in this case, play Hephatat after two units with anti air have been deployed. Of course this would totally be up to NETRAT, and it still wouldn’t be perfect, but you get the idea.

compile errors is like spell check =P. Then you gota figure out if it’s a logic issue, syntax, or run time error ya it can be a pain. I admit speling and syntax is not my strong suit.

 
Originally posted by BlankZero:

The “work” involved in shuffling the deck and restricting choices to one of the three cards in hand at any point in time is utterly trivial, and given that you are unlikely to draw your cards in the same order every match, this restriction helps account for the probability of placing key cards at particular moments — it is certainly not “unneccisary[sic]”.


I think this restriction is useful for normal players but unnecessary for advanced players, because if we cannot control all the parameters in the simulation program, we may get a wrong answer.

So, I think if we can split the “Ordered” option into “Priority order”(= old one) and “Exact order”(assume we can always play the right card), we can manually find the best order and the second best order.

 

For those looking for exact order, you can get a small sample size with win ratio in the “Calculate best order” feature under the Card Order tab.


Also, I suggest having Hephatat in third position even if it’s a key card you don’t want mimicked or double striked. It’s more important to have the first few cards two-shot (or one) to get a disassembly line going.

Longshot isn’t that great in terms of damage potential in this raid since it only does 3 damage to air units (can’t one-shot Troopers and Interceptors, can’t two-shot Vaporwings) and 1 damage to ground units (stalled by Dominated Hatchlings) but it makes up for it by survivability.
Perhaps this is why EMP is preferred over Skimmer since some damage from flanking isn’t guaranteed for the quicker win.


As for the issue with uniques, there isn’t much of a problem as most of the time the best counter to all cards will be recommended as first drop in auto evaluation. (e.g. Lucina the Wicked + Tiamat for Sentinel Reborn)
If the deck is based around a combo, then you might have to use a little intuition.

 

Another idea that came to me today is to use heuristics to roughly guess which is the best card to play in any scenario (the same could also be used to find an optimal order). This would help us determine what the best card is in any given situation without having to do thousands or millions of simulations. The only problem is that developing truly comprehensive heuristics that would come close to matching the strength of an evaluator is nearly impossible (i.e. what a human thinks of in roughly 3-4 seconds would take hundreds of lines of code to express as heuristics). And then the heuristics would have to be tested. All in all, it (probably) is the optimal solution in terms of balancing all the issues, but it would require major work.

 

I know that, but the sim evaluates how the deck performs in all cases, even if you misplay the cards. By playing cards in a certain order, the sim evaluates the number of chances it has to improve the win rate of the deck, despite the fact that the card will die soon.

I can say this because I also posted a raid deck that had a 70% win chance on auto, but played manually it could win up to 85% or close to that because of the good use of other cards, like Epicenter, Thundercrag and an action card.

Action cards are unpredictable and best used by a human player to evaluate whether or not they should be used, and usually the evaluator gives it a lower rating because it can’t calculate how useful it can actually be.

Also the evaluator can’t calculate the experience of the user which usually comes in handy when playing certain high risk high reward cards.