State of the Art Monte Carlo simulations and Bridgify
#1
Posted 2012-January-26, 00:50
#2
Posted 2012-January-28, 14:30
The reason humans can process this hand quickly is that we "know" a 3-2 split is a better percentage play than a finesse. With enough simulations, the robots will figure this out too. If you have East throw in some kind of crazy bid on the way to 3NT, all or nearly all of the simulations will place the heart king with East, and then GIB will go back to going down, since it will find two heart finesses attractive. The same thing happens in the real world with real people: imagine if East had dealt and opened 1♦. That would surely alter your play. The bots take the bidding into account in their simulations.
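To make this concrete, here is a minimal Python sketch of that kind of simulation -- not GIB's actual code. The 26 unseen cards and the "East holds 12+ HCP" test (a crude stand-in for "East opened 1♦") are invented for illustration; the point is just how the bidding constraint shifts what the samples say about the 3-2 split and the heart king.

```python
import random

HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}

# Hypothetical layout: 26 unseen cards, 5 outstanding hearts incl. the king.
UNSEEN = ([("H", r) for r in "K9642"] +
          [("S", r) for r in "AQ87532"] +
          [("D", r) for r in "KJ9654"] +
          [("C", r) for r in "QT876432"])

def simulate(n=100_000, east_opened=False):
    split_32 = east_has_hk = accepted = 0
    for _ in range(n):
        cards = UNSEEN[:]
        random.shuffle(cards)
        east = cards[:13]                      # West gets the rest
        if east_opened and sum(HCP.get(r, 0) for _, r in east) < 12:
            continue                           # reject layouts East wouldn't open on
        accepted += 1
        hearts_east = sum(1 for suit, _ in east if suit == "H")
        split_32 += hearts_east in (2, 3)      # 3-2 heart split
        east_has_hk += ("H", "K") in east
    return split_32 / accepted, east_has_hk / accepted

print("no constraint:  P(3-2), P(East has HK) =", simulate())
print("East opened 1D: P(3-2), P(East has HK) =", simulate(east_opened=True))
```

With no constraint the king sits with East about half the time; once every sample must give East an opening bid, the large majority of them hold the king, which is exactly the shift described above.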
#3
Posted 2012-January-29, 20:36
I appreciate your analysis of the Monte Carlo approach: in my language this is akin to saying that humans use a priori probabilities while the robots use a posteriori probabilities, which require random trials.
I started this series of posts with the robots on their default settings and have now set them all to the slowest, strongest play level.
Would you agree that at their best the robots are very good, and at their worst unbelievably bad? I started with the belief that I could easily rank these robots in order of strength; now I do not think so: they seem equally prone to blunder or brilliance.
#4
Posted 2012-January-30, 07:54
Scarabin, on 2012-January-29, 20:36, said:
I appreciate your analysis of the Monte Carlo approach: in my language this is akin to saying that humans use a priori probabilities while the robots use a posteriori probabilities, which require random trials.
I would argue that the difference is how the humans / robots generate their priors...
#5
Posted 2012-January-30, 16:58
Scarabin, on 2012-January-29, 20:36, said:
Would you agree that at their best the robots are very good, and at their worst unbelievably bad?
Yes
#7
Posted 2012-January-30, 23:49
I think there is an essential difference in the way humans and robots approach bridge play (obvious, but bear with me for just a moment, please).
The human approach is pragmatic or logical and takes into account, or should take into account, all available information.
The robot approach is restricted-random: a series of random double dummy layouts restricted by taking into account the known distribution of declarer's and dummy's hands and any information about the other hands revealed by the bidding.
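In code, that restriction is most simply a rejection filter over random layouts. A toy sketch, not any particular robot's implementation, and the "West's overcall promised 5+ spades" constraint is an invented example:

```python
import random

def restricted_layouts(unseen, n, consistent):
    """Deal the 26 unseen cards (as (suit, rank) tuples) at random,
    keeping only the n layouts consistent with the bidding."""
    layouts = []
    while len(layouts) < n:
        cards = unseen[:]
        random.shuffle(cards)
        west, east = cards[:13], cards[13:]
        if consistent(west, east):
            layouts.append((west, east))
    return layouts

# Example constraint: West's (hypothetical) overcall promised 5+ spades.
def west_overcalled_spades(west, east):
    return sum(1 for suit, _ in west if suit == "S") >= 5
```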
This is a bit tortuous; what, if any, are the practical consequences?
First, the only criterion the robot uses is the double dummy question: "How many tricks will I win if I lead this card?".
Second, a human player can say that both of these lines of play may fulfill the contract but that one is superior to the other (in all circumstances). The robot will simply choose the line of play which succeeds in the majority of its samples.
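As a sketch, that selection rule might look like this, where `dd_tricks` stands in for a real double-dummy solver (far too large to show here) and `target` is the number of tricks the contract needs -- all names hypothetical:

```python
def choose_card(candidates, layouts, dd_tricks, target=9):
    """Play the candidate card that makes the contract in the largest
    fraction of the sampled layouts (the 'majority of samples' rule)."""
    def fraction_making(card):
        return sum(dd_tricks(card, lay) >= target
                   for lay in layouts) / len(layouts)
    return max(candidates, key=fraction_making)
```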
Next, the human will evolve a plan and keep to it until it is found to be faulty; the robot seems to run a new simulation for every card played, and hence may change to a new plan merely because it has generated a different set of samples.
Another point which interests me, though it may have no relevance: "computer random" usually means pseudo-random, does it not?
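(One can verify the pseudo-randomness directly: seed the generator twice with the same value and it replays the identical "random" deals -- a toy snippet, nothing bridge-specific:)

```python
import random

random.seed(42)
first = [random.sample(range(52), 13) for _ in range(3)]
random.seed(42)
second = [random.sample(range(52), 13) for _ in range(3)]
assert first == second   # same seed, same three "random" deals
```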
Having said all that, I hope it's not a hopeless muddle; perhaps SoA3 will help to clarify my thinking.
#8
Posted 2012-January-31, 00:04
#11
Posted 2012-February-01, 01:13
Scarabin, on 2012-January-31, 22:13, said:
#12
Posted 2012-February-01, 09:31
#14
Posted 2012-March-16, 08:43
ahydra, on 2012-March-16, 08:32, said:
The underscores are his interpretation of instructions I gave him. I didn't want someone (anyone) to come and promote one computer bridge program over all the others -- and especially over BBO's GIB program. Also, what a computer program does depends in large part on what settings it is on (the slower the bidding and play, the better, in theory, it does). One could run hands through all the programs and then show only the ones where your favorite robot shines. He decided not to list what each program did, so he uses underscores. I am not sure it had to go that far, but his way is the safest way to make sure the points he wants to make get left alone after he posted them.
#15
Posted 2012-March-16, 10:19
inquiry, on 2012-March-16, 08:43, said:
Makes sense. But then the post just looks silly - why not just omit the underscores, or (better) say "Robot A threw his king under the ace, while Robot B played perfectly right up until the point it cashed its last stop in the opponents' suit before exiting".
Then of course there's Human A, who signals with the 8 of hearts only to find himself unable to overruff dummy's 6 a trick later. Oh wait, that was me.
ahydra
#16
Posted 2012-March-16, 17:50
Apologies if the underscores appeared to be a petulant protest. I find the prohibition on promoting other robots very reasonable.
Perhaps I should have gone back and edited these posts, but I thought it probably was not worthwhile.
#17
Posted 2012-April-15, 13:40
Scarabin, on 2012-January-30, 23:49, said:
First, the only criterion the robot uses is the double dummy question: "How many tricks will I win if I lead this card?".
That's not entirely true. It's possible to incorporate different types of scoring. It's still based on the number of tricks a line will make, but it can make overtricks less (and down tricks more) important. For example, if you calculate your average tricks, one card can make an average of 9.5 tricks while another has an average of 9.2 tricks. However, if line one goes down in 50% of the cases and makes two overtricks in the other 50%, then its average score will be quite low. Compare that with line two, which makes nine tricks all the time and an overtrick in about 20% of the cases: its average score will be much higher. Based on number of tricks (for example MP play) a computer may choose line one; based on average score (for example IMP play) the computer will choose line two.
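Putting numbers on that example, assuming ordinary duplicate scores for 3NT not vulnerable (+400/+430/+460 for 9/10/11 tricks, -50 per undertrick):

```python
# Trick probabilities taken from the example above.
SCORE_3NT_NV = {7: -100, 8: -50, 9: 400, 10: 430, 11: 460}

def averages(dist):  # dist maps tricks taken -> probability
    avg_tricks = sum(t * p for t, p in dist.items())
    avg_score = sum(SCORE_3NT_NV[t] * p for t, p in dist.items())
    return avg_tricks, avg_score

line_one = {8: 0.5, 11: 0.5}   # down half the time, two overtricks otherwise
line_two = {9: 0.8, 10: 0.2}   # always makes, overtrick 20% of the time

print(averages(line_one))      # (9.5, 205.0): more tricks, worse score
print(averages(line_two))      # (9.2, 406.0): fewer tricks, better score
```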
Scarabin, on 2012-January-30, 23:49, said:
Next, the human will evolve a plan and keep to it until it is found to be faulty; the robot seems to run a new simulation for every card played, and hence may change to a new plan merely because it has generated a different set of samples.
The best computer player would generate all possible hands, analyze the DD result of each card in each of these deals, and come up with the percentage line of play. In practice that's impossible, so it has to stick to some number of generated deals. The more deals it generates and analyzes, the more accurate (and better) the results. That's also a reason why computer players are much better in the endgame than in planning the play of the entire hand: they can analyze hands a lot faster in the later stages of the game.
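A toy illustration of why more deals help: estimating even a simple probability (a 5-card suit splitting 3-2 among 26 unseen cards) from n sampled layouts, the run-to-run spread of the estimate shrinks roughly as 1/sqrt(n):

```python
import random
import statistics

def estimate(n):
    hits = 0
    for _ in range(n):
        cards = [1] * 5 + [0] * 21      # the 5 outstanding cards among 26
        random.shuffle(cards)
        if sum(cards[:13]) in (2, 3):   # one defender holds 2 or 3 of them
            hits += 1
    return hits / n

for n in (50, 500, 5000):
    runs = [estimate(n) for _ in range(20)]
    print(n, round(statistics.mean(runs), 3), round(statistics.stdev(runs), 4))
```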
Also note that it's quite difficult to tell a computer what is a possible distribution and what is not. Some people may overcall on AQxxx and out, while others won't. Against certain opponents the computer should include such hands in its analysis; against others it shouldn't. When generating a small number of deals this may have a big influence on the result, but when analyzing all deals the influence would be negligible.
Scarabin, on 2012-January-30, 23:49, said:
Another point which interests me, though it may have no relevance: "computer random" usually means pseudo-random, does it not?
The number of deals and possible distributions is large enough for this to be completely irrelevant.
#18
Posted 2012-April-16, 22:04
I have now discovered another program, Oxford Bridge, which incorporates pragmatic reasoning, and I propose to attempt a statistical analysis of these five programs' performance on hands from educational software.
#19
Posted 2012-October-04, 00:26
Piotr has also written a single-dummy solver, which I have downloaded but cannot run because it cannot find some DLL. I would ask Piotr for help, but I do not speak Polish and cannot find his email address.
#20
Posted 2012-October-04, 01:38