New hand evaluation method


#21 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-14, 17:51

Hi Tim,

Minor detail:
On page 4 in the pdf, the example says:

Quote

AKQT6 = 2. "Any 4 out of 4 top cards" rule.

That looks like a typo?
AKQT6 is "3 out of 4" and "4 out of 5" top cards -- but not "4 out of 4".

#22 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-14, 18:07

Just wondering also on page 3, under "Trump model" => "Number of cards in trump suit" table...

If the pair has, say, 4-4 in trumps, I understand from the table that each player should add 3 points (total 6 points) for the trump length -- is that correct?
Assuming 3 points = 1 trick, this amounts to at least 2 extra tricks.
(In addition, you also add more extra points later for any shortnesses.)

But if you have 4-4 trumps, I think you mostly only gain 1 trick compared to NT-play? (3 rounds to draw trumps + 1 ruff in each hand)

So, do I understand your table correctly?

#23 m1cha

Group: Full Members
Posts: 397
Joined: 2014-February-23
Gender:Male
Location:Germany

Posted 2016-July-14, 18:41

Hi, tnevolin:

tnevolin, on 2016-July-13, 20:02, said:

I'm just afraid a ton of people would ask the same question in confusion: "What to do with high cards in NT model? Should it change? Your document doesn't say anything."

Probably. You may want to add a phrase under that table saying: "In other situations, HCP points do not deviate from the base values." That might do it.

tnevolin, on 2016-July-13, 20:02, said:

So 26 total points are not even close to 26 of my points.

I suggest you call your points Evolin points. It carries the ideas of your last name, "evaluation", "evolution" and can be remembered with the name "Evelyn".

tnevolin, on 2016-July-13, 20:02, said:

After all, remembering a few more numbers for different contract-level requirements is a piece of cake compared to three pages of preceding calculation rules. You play it 20 times and will memorize it easily. Yet, if you think nobody cares about it, then sure - I'll give up precision in favor of convenience.

You are right. I am not sure how much other people care about this or that. Time will tell.

I believe I found a small error in your new description within the Examples of section "High card combinations in side suit (out of 4 top cards)". AKQT6 gets +2 points for 4 out of 4 top cards, but it is 4 out of 5. I think you wanted to write AKQJ6.

And in the section "Value duplication: high cards in your hand against partner's shortness (out of 3 top cards)", I cannot understand why a Q opposite a void should get +1 point; and I also cannot understand why a K opposite a singleton should get -1 while opposite a void it gets 0. I tend to think that these are statistical glitches, is that possible? I would expect -1 for both opposite a void (and perhaps also opposite a singleton), but this is just a guess. I think I would keep the figures at 0 until I have better evidence. Sorry, I have no idea where to get you more data.

#24 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-14, 18:41

tnevolin, on 2016-July-12, 07:17, said:

I've used real tournament boards. That was my initial goal - to get the average number of tricks taken in real games. That would account for everything: strength of play, mistakes, deception, etc. In other words, my method produces a number that is the "average number of tricks other average pairs would get from this hand(s)".

By the way, where do you find all those 400k tournament recorded deals?
Are they freely available somewhere?

#25 m1cha

Group: Full Members
Posts: 397
Joined: 2014-February-23
Gender:Male
Location:Germany

Posted 2016-July-14, 19:04

Stefan_O, on 2016-July-14, 18:07, said:

Yes, you do. But you cannot compare trump-play and NT-play evaluations that easily.

If you play trumps, you get +6 points for playing in a 4-4 fit, correct.

Then you subtract one point for each honor in a side suit, typically ~ -7 for a full game of 4M, so you are down 1 point in trumps, so far.

Then you get points for shortages, maybe +3 for playing 4432 against 4315. So you end up +2 for trumps and play a full game in 4M.

While if you had 4432 against 4333 you'd only get +1 for shortages and prefer to play 3NT because you are unlikely to generate an additional trick if you play trumps. That is the idea as far as I have understood it.

#26 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-14, 19:22

@M1cha

ummm... I see... yes, makes sense, then

#27 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 05:00

Stefan_O, on 2016-July-14, 17:51, said:

Hi Tim,

Minor detail:
On page 4 in the pdf, the example says:

Quote

AKQT6 = 2. "Any 4 out of 4 top cards" rule.

That looks like a typo?
AKQT6 is "3 out of 4" and "4 out of 5" top cards -- but not "4 out of 4".

That's right. Thanks for the good catch. Corrected to AKQJ6 and updated the documents.

#28 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 05:31

Stefan_O, on 2016-July-14, 18:07, said:

I thought about it myself a lot. Here is my speculation on it.
Let's take, for example, 25 HCP combined, 4333-4333 distribution in both hands, and a 4-4 fit. This is said to be enough for 3NT.
Now let's evaluate the trump contract. 6 points for the 4-4 fit. Then you need to correct for high cards in side suits. The average value of a high card is (4+3+2+1)/4 = 2.5, so you get 25/2.5 = 10 high cards on average. That is 10 * 3/4 = 7.5 high cards in side suits. Subtract 1 point for each = -7.5 points. Out of these 7.5 cards in 6 different suits (2 hands * 3 side suits) you probably end up with 2-3 side suits having more than one card (or having an ace), so you may add 2-3 points for high card combinations in side suits. Altogether: 25 + 6 - 7.5 + 3 = 26.5, which is a little below the major game requirement. That is the expected result: you cannot ruff anything, so your trumps are no better than they would be in NT.
Now let's consider a 4432-4234 distribution, giving each hand one doubleton. It adds 2 points for short suits in total (1 point in each hand), resulting in 26.5 + 2 = 28.5 points - just above the 4-major requirement! That is again quite explainable, because with doubletons in both hands you end up with two trumps scoring separately, delivering one more trick.
You see, everything works like a charm.
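
For readers who want to check the arithmetic, here is a minimal sketch of the calculation in this post. The function name and parameters are made up for illustration, and the 3 points for side-suit combinations is the upper end of the "2-3 points" estimate above; this is not the actual model, only the averaged example.

```python
# Hypothetical helper reproducing the averaged example from the post.
def trump_estimate(hcp, fit_points, combo_points, shortness_points):
    avg_honor_value = (4 + 3 + 2 + 1) / 4   # A=4, K=3, Q=2, J=1 -> 2.5
    honors = hcp / avg_honor_value           # expected number of high cards
    side_honors = honors * 3 / 4             # three of the four suits are side suits
    # Subtract 1 point per side-suit high card, then add the other adjustments.
    return hcp + fit_points - side_honors + combo_points + shortness_points

flat = trump_estimate(25, fit_points=6, combo_points=3, shortness_points=0)
with_doubletons = trump_estimate(25, fit_points=6, combo_points=3, shortness_points=2)
print(flat, with_doubletons)
```

The first call corresponds to 4333-4333, the second to 4432-4234 with one doubleton in each hand.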

Yet let me reiterate that trying to rationally explain each single coefficient is futile, because they are all part of one equation and none of them means anything as a stand-alone number. You are supposed to add them all up, and then the combined total predicts the number of tricks. That's all. No individual part of the equation can predict anything on its own. We can only speculate in an attempt to explain it rationally, as I did in the paragraph above.
Also keep in mind that my example above is an average of averages. With a real hand your high cards may be distributed differently, resulting in a different combined strength and giving you a hint whether to play this particular hand in 3NT or 4 of a major.

@Stefan_O, @m1cha
Sorry guys. I just saw your reply after I posted mine with similar example calculation. Looks like we came up with same explanation.

#29 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 06:05

m1cha, on 2016-July-14, 18:41, said:

Quote

I'm just afraid a ton of people would ask the same question in confusion: "What to do with high cards in NT model? Should it change? Your document doesn't say anything."

Probably. You may want to add a phrase under that table saying: "In other situations, HCP points do not deviate from the base values." That might do it.

I replaced it with a spelled-out rule in the places where such a rule can be spelled out at all. I also kept the table as a reference in case the rule itself is not straightforward (like for short suits).

m1cha, on 2016-July-14, 18:41, said:

I suggest you call your points Evolin points. It carries the ideas of your last name, "evaluation", "evolution" and can be remembered with the name "Evelyn".

Wow! Thanks for a great name. Now my forum post really starts to pay out.

"Evolin" is cool but I'm afraid it won't stick as a new word. "Evelyn" is more promising. Same word already exists and people would just transfer the meaning and remember it easier. What do you think?
http://www.sheknows....mes/name/evelyn
In American the meaning of the name Evelyn is: Life.

m1cha, on 2016-July-14, 18:41, said:

And in the section "Value duplication: high cards in your hand against partner's shortness (out of 3 top cards)", I cannot understand why a Q opposite a void should get +1 point; and I also cannot understand why a K opposite a singleton should get -1 while opposite a void it gets 0. I tend to think that these are statistical glitches, is that possible? I would expect -1 for both opposite a void (and perhaps also opposite a singleton), but this is just a guess. I think I would keep the figures at 0 until I have better evidence. Sorry, I have no idea where to get you more data.

That is the same story again - statistics. I load a pile of games into the machine and it gives out the coefficients it thinks are the best fit. I just round them to the nearest whole number, that's all. In my last document I explicitly highlighted this irregularity with a little speculation. The document describes the most exact version I can come up with. People are free to ignore or change parts of it, giving up a little precision in favor of memorability. That's fine and this is how it works. In fact this document IS a simplification of an even more complex version that I intentionally slashed so as not to scare people away at first sight.

Now here is my speculation about K against a singleton. We are talking about duplication of controls. First round controls are the ace and the void. Second round controls are the king and the singleton. Practically, A against a singleton is not a duplication, as A controls the first round and the singleton the second. Q against a singleton is not a duplication either, as Q doesn't control either the first or the second round but the singleton does. K against a singleton, however, is a duplication by definition, as they both control the second round. Same story with the void: A against a void is a duplication while K or Q is not, and you see it in the table. As for two- and three-card combinations, I cannot say anything - it is all statistics.
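
The control-duplication reasoning above can be written as a small lookup. This is only a sketch of the speculation in this post, with made-up names; it does not reproduce the fitted coefficients in the table, which come from the statistics.

```python
# Round in which each holding gives control, per the post.
# None means no control in the first two rounds (assumption: Q counts as none).
HONOR_CONTROL = {"A": 1, "K": 2, "Q": None}
SHORTNESS_CONTROL = {"void": 1, "singleton": 2}

def is_duplicated(honor, shortness):
    """True when the honor and partner's shortness control the same round."""
    honor_round = HONOR_CONTROL[honor]
    return honor_round is not None and honor_round == SHORTNESS_CONTROL[shortness]

for shortness in ("singleton", "void"):
    for honor in ("A", "K", "Q"):
        print(honor, "opposite", shortness, "->", is_duplicated(honor, shortness))
```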

#30 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 06:12

Stefan_O, on 2016-July-14, 18:41, said:

Quote

I've used real tournament boards. That was my initial goal - to get the average number of tricks taken in real games. That would account for everything: strength of play, mistakes, deception, etc. In other words, my method produces a number that is the "average number of tricks other average pairs would get from this hand(s)".

By the way, where do you find all those 400k tournament recorded deals?
Are they freely available somewhere?

This is my pain. I couldn't find any freely available source anywhere on the internet. There are plenty of tournament results but they do not include each pair's score! Damn. I got part from swan games and part from one of the online bridge web sites. Even there they are not prepared for download. I had to write my own crawler to scrape the screens.
If you find any web source where pair results are at least visible on a page - let me know.

#31 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 06:20

#32 Zelandakh

Group: Advanced Members
Posts: 10,830
Joined: 2006-May-18
Gender:Not Telling

Posted 2016-July-15, 06:38

tnevolin, on 2016-July-15, 05:31, said:

Now let's consider 4432-4234 distribution giving each hand one doubleton.

How about 4234-4234?

tnevolin, on 2016-July-15, 06:05, said:

"Evolin" is cool but I'm afraid it won't stick as a new word.

The last major system to be released is generally known as Zar Points. You will notice that zar is also not a word. Similarly, the 4321 method is the Milton Work Count. Milton is also not a word so it seems that there is enough precedent to believe that that should in itself not be an impediment.

(-: Zel :-)

#33 jogs

Group: Advanced Members
Posts: 1,316
Joined: 2011-March-01
Gender:Male
Interests:student of the game

Posted 2016-July-15, 07:58

tnevolin, on 2016-July-10, 12:19, said:

I've analyzed 400k deals on the computer to find all the parameters and adjusted values.

Do you mean 400K observations or 400K boards? Are you actually able to examine the results by computer? Meaning you do not need to visually inspect each observation?

#34 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-15, 08:05

Zelandakh, on 2016-July-15, 06:38, said:

The last major system to be released is generally known as Zar Points.

Umm.... "Zar points"... "major system"...
That one never really took off, did it...?

#35 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-15, 08:15

jogs, on 2016-July-15, 07:58, said:

Do you mean 400K observations or 400K boards? Are you actually able to examine the results by computer? Meaning you do not need to visually inspect each observation?

Once you have all deals+scoreboards in digital form, you can let the computer scan the deals.

For each pair of hands, check the popular contracts how many tricks the majority of pairs make,
and compare to how many tricks your eval-method predicts for the hands....

Then run the same comparison on Milton/LTC/Zar/whatever..., and see which is more accurate in the long run...
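
A rough sketch of that comparison, assuming each record already holds the observed tricks and each method's predicted tricks. The records, the method names and the choice of mean absolute error as the accuracy measure are all made up for illustration.

```python
from statistics import mean

# Made-up records: observed tricks plus each method's prediction.
observations = [
    {"tricks": 10, "method_a": 9.5, "method_b": 9.0},
    {"tricks": 9, "method_a": 9.0, "method_b": 10.0},
    {"tricks": 11, "method_a": 10.0, "method_b": 9.0},
]

def mean_abs_error(records, method):
    """Average distance between predicted and observed tricks."""
    return mean(abs(r[method] - r["tricks"]) for r in records)

errors = {m: mean_abs_error(observations, m) for m in ("method_a", "method_b")}
best = min(errors, key=errors.get)  # the method with the smallest average error
print(errors, best)
```

With real data, "method_a" and "method_b" would be replaced by Milton/LTC/Zar/whatever predictions computed from the hands.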

#36 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 08:21

Zelandakh, on 2016-July-15, 06:38, said:

Quote

Now let's consider 4432-4234 distribution giving each hand one doubleton.

How about 4234-4234?

That would count the same. You won't be able to use the shortness because of the mirrored distribution, though. It is a downward fluctuation in the statistics which you cannot protect yourself against when you don't know partner's exact distribution. The only comforting thing is that doubleton opposite doubleton is statistically much rarer, so you should be fine in most cases but not all of them.
In general, most bidding systems do not show doubletons, so you'll end up with the same flaw regardless of bidding/calculation system.
If you are interested, I can recalculate the model with this feature added to see how much a specific distribution is worth opposite partner's specific distribution. It will make it much more complicated but it is doable.

Zelandakh, on 2016-July-15, 06:38, said:

Sadly, the opposite.

https://en.wikipedia...wiki/Zar_Points - named after Zar Petkov
http://www.dictionar...lton-work-count - named after Milton Work

#37 jogs

Group: Advanced Members
Posts: 1,316
Joined: 2011-March-01
Gender:Male
Interests:student of the game

Posted 2016-July-15, 08:22

The_Badger, on 2016-July-11, 02:01, said:

Hello tnevolin (Tim)

The point I'm making is that your system is probably very good for teaching novices and beginners how to evaluate hands initially, and that's great, but until it is used successfully at higher level expert bridge, then it's a hand evaluation system that helps but might not be conclusively the ultimate solution to many a bridge player's woes. It's worth reading the Wikipedia entry (https://en.wikipedia...Hand_evaluation) to get a taste of what's currently out there, too. (Anyone using Zar points these days?)

Good luck!

From (https://en.wikipedia...Hand_evaluation)

Quote

In contract bridge, various bidding systems have been devised to enable partners to describe their hands to each other so that they may reach the optimum contract. Key to this process is that players evaluate and re-evaluate the trick-taking potential of their hands as the auction proceeds and additional information about partner's hand and the opponent's hands becomes available.

I believe systems should attempt to estimate tricks. All point count systems are artificial. Tricks are real. Losing trick count attempts to estimate tricks.

Quote

Hand evaluation methods assess various features of a hand, including: its high card strength, shape or suit distribution, controls, fit with partner, quality of suits and quality of the whole hand. The methods range from basic to complex, requiring partners to have the same understandings and agreements about their application in their bidding system.

There are two independent random variables for estimating tricks, power and pattern. Power is high card strength, quality of suits, fit of suits within the partnership, etc. Pattern is shape or suit distribution, fit of the suits within the partnership, etc. Controls are partly power and partly pattern.
I believe all systems should stop adjusting points for distribution. These are parameters of separate independent random variables. Adjust the estimated tricks directly. That makes it easier to detect duplication of values.

#38 Stefan_O

Group: Full Members
Posts: 469
Joined: 2016-April-01

Posted 2016-July-15, 08:28

tnevolin, on 2016-July-15, 08:21, said:

If you are interested, I can recalculate the model with this feature added to see how much specific distribution is worth against your partner's specific distribution.

Out of curiosity, another idea/suggestion...

If you use double-dummy-analysis results on the deals, instead of actual play results,
would your eval-system come out significantly different, or would it be mostly the same?

#39 jogs

Group: Advanced Members
Posts: 1,316
Joined: 2011-March-01
Gender:Male
Interests:student of the game

Posted 2016-July-15, 08:30

Let's look at joint partnership patterns.

a) 4324 // 4324
b) 4324 // 4234

The doubletons in different suits make pattern b stronger than pattern a. Only how can we learn this quickly during the auction?

a) AKxx xxx xx QJxx // QJxx xxx xx AKxx
b) AKxx xxx xx QJxx // QJxx xx xxx AKxx

5 losers in a. Only 4 losers in b.
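
The two loser counts can be reproduced with a crude sketch. The rule used here is an assumption for illustration, not an established counting method: in each suit the partnership loses one trick per round played by the shorter hand, minus the top cards held in unbroken sequence from the ace, and there are always enough trumps to ruff once the shorter hand runs out.

```python
TOP_CARDS = "AKQJ"

def suit_losers(hand1_suit, hand2_suit):
    """Crude combined losers in one suit (assumed rule, see lead-in)."""
    combined = hand1_suit + hand2_suit
    winners = 0
    for card in TOP_CARDS:          # count A, K, Q, J only while unbroken
        if card not in combined:
            break
        winners += 1
    rounds = min(len(hand1_suit), len(hand2_suit))  # after that, the short hand ruffs
    return max(0, rounds - winners)

def partnership_losers(hand1, hand2):
    """Hands are written suit by suit, separated by spaces."""
    return sum(suit_losers(s1, s2) for s1, s2 in zip(hand1.split(), hand2.split()))

pattern_a = partnership_losers("AKxx xxx xx QJxx", "QJxx xxx xx AKxx")
pattern_b = partnership_losers("AKxx xxx xx QJxx", "QJxx xx xxx AKxx")
print(pattern_a, pattern_b)
```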

#40 tnevolin

Group: Full Members
Posts: 64
Joined: 2011-November-12
Gender:Male

Posted 2016-July-15, 08:30

Stefan_O, on 2016-July-15, 08:15, said:

Quote

Do you mean 400K observations or 400K boards? Are you actually able to examine the results by computer? Meaning you do not need to visually inspect each observation?

Once you have all deals+scoreboards in digital form, you can let the computer scan the deals.

For each pair of hands, check the popular contracts how many tricks people make on average (or median?),
and compare to how many tricks your eval-method predicts for the hands....

Then run the same comparison on Milton/LTC/Zar/whatever..., and see which is more accurate in the long run...

Thanks, Stefan. Couldn't have it explained better.

Regarding observations: 400k boards with ~10-20 pairs playing in roughly 2-3 popular strains = 1200k observations. However, I stripped very low strength (below 7 tricks) and kept only those played by at least 4 pairs to maintain good statistics. So I ended up with somewhere around 600-800k observations.
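
A minimal sketch of that filtering step, under stated assumptions: each pair result is a made-up (board, strain, tricks) tuple, one observation is the average over all pairs that played the same strain on the same board, and "below 7 tricks" is read as an average below 7. The actual crawler output and filter may well differ.

```python
from collections import defaultdict
from statistics import mean

# Made-up pair results: (board id, strain, tricks taken).
results = [
    (1, "S", 10), (1, "S", 10), (1, "S", 9), (1, "S", 11),  # kept
    (1, "NT", 9), (1, "NT", 9),                             # only 2 pairs
    (2, "H", 6), (2, "H", 6), (2, "H", 5), (2, "H", 7),     # average below 7
]

def aggregate(pair_results, min_pairs=4, min_tricks=7):
    """One observation per (board, strain): the average tricks, if it passes both filters."""
    groups = defaultdict(list)
    for board, strain, tricks in pair_results:
        groups[(board, strain)].append(tricks)
    return {
        key: mean(tricks)
        for key, tricks in groups.items()
        if len(tricks) >= min_pairs and mean(tricks) >= min_tricks
    }

observations = aggregate(results)
print(observations)
```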

I compared my results to an ideal player with X-ray vision to see how far I am from perfection, not to Milton directly. However, I can easily do that right away if someone is interested.
