GIB vs Ben vs Humans: Latest Robot Performance Stats


GIB vs Ben vs Humans: Latest Robot Performance Stats Robot Rankings Hub

#1 diana_eva

Group: Admin
Posts: 5,115
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2024-November-18, 18:12

We periodically run simulations to see how different types of robots perform in games originally played by humans. By replaying these games with robots, we can evaluate their performance relative to the human field and to each other.

The robots competing in these simulations are Basic GIB, Advanced GIB, BBO's Ben AI and the latest candidate for a Ben upgrade, which we fondly call "Big Ben".

We plan to provide regular updates on these simulations, either monthly or once every few months, as well as after major robot upgrades. We'll kick off this thread with the latest results from our November simulations.

About the Robots

GIB is the classic BBO bridge robot, first integrated into BBO in July 2005. GIB, which stands for Ginsberg's Intelligent Bridgeplayer, was created by Matt Ginsberg. Over the years, BBO has maintained and refined GIB, with a focus mainly on updating the 2/1 system it plays.

Basic GIB is the most used bridge robot on BBO -- you can try it in all our free robot games and as a substitute when a player leaves mid-hand; you can recognize it by the name BasicGIB 2/1.
Advanced GIB uses simulations during close decisions, in addition to the programmed rules, and analyzes multiple possible outcomes to choose the best course of action. If you see AdvGIB 2/1 on BBO, it's the advanced GIB robot -- you will have it as your partner and opponent in almost all premium games, as well as when you register with a robot partner in pair games.
While both versions of GIB use the same bidding system, the difference lies in how much they "think" before acting.

Ben (short for Bridge ENgine) is a machine learning-based robot developed by Lorand Dali, who is now part of the BBO team. BBO's version of the Ben AI was trained on hundreds of millions of deals played by humans on BBO. You can play with Ben at any time by visiting Robot World and then choosing "Try our AI Bridge Engine".

Big Ben is an enhanced version of Ben, with more training on the bidding and numerous improvements to the play engine. 'Big Ben' will soon be introduced as an upgrade to the Ben AI. You can read more about Ben and its variations here.

#2 diana_eva

Group: Admin
Posts: 5,115
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2024-November-18, 18:37

November Highlights: Robot Performance at a Glance

Below are the BBO robot results in various games played in November 2024.

Advanced GIB consistently outperformed the field, achieving the highest average matchpoint scores in nearly every game format. It stands out most in the Just Declare formats, where the focus is on declarer play.
BBO Big Ben appears to be almost as good as Advanced GIB, and even surpassed it in specific games. Its results in the Zenith Daylong Reward, one of BBO's most popular and competitive formats, highlight the significant improvements made to its bidding and play compared to the 'simpler' BBO Ben AI.
Basic GIB continues to hold its own as a reliable and solid bridge partner, with above-average performance across most simulations.
BBO Ben, our first AI child, excelled in massive free games like the Free BBO Super Sunday Daylong, where it remained competitive with Advanced GIB and Big Ben.

#3 johnu

Group: Advanced Members
Posts: 5,301
Joined: 2008-September-10
Gender:Male

Posted 2024-November-18, 19:31

The like button is turned off for posts from BBO staff, so I'm posting this just to like your post.

Interesting results for the robots. The results for Advanced and Basic GIB are not much different from previous comparisons, but Big Ben seems to be a noticeable improvement over Ben. Ben seemed very weak based on some of the earlier bug reports in this forum, so it's encouraging that Big Ben is scoring almost as well as Advanced GIB.

#4 lorserker

Group: Full Members
Posts: 141
Joined: 2007-November-26

Posted 2024-November-19, 05:55

Updated the robot in Ben & Friends to be the new improved Ben

#5 pilowsky

Group: Advanced Members
Posts: 3,885
Joined: 2019-October-04
Gender:Male
Location:Poland

Posted 2024-November-19, 15:47

Is there an update log?
Do the improvements include bidding the same way in response to the same bids?

Fortuna Fortis Felix

#6 diana_eva

Group: Admin
Posts: 5,115
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2024-November-19, 16:53

pilowsky, on 2024-November-19, 15:47, said:

Is there an update log?
Do the improvements include bidding the same way in response to the same bids?

If you mean for Ben, no release notes yet, as it's still in Beta, but I added the beta release to the Changelog section here: https://news.bridgeb...out-ben-on-bbo/

#7 pilowsky

Group: Advanced Members
Posts: 3,885
Joined: 2019-October-04
Gender:Male
Location:Poland

Posted 2024-November-19, 17:49

diana_eva, on 2024-November-19, 16:53, said:

From the link above.

Efficiency

...
Simulation algorithm allows to do a lot of simulation during bidding (e.g basic GIB doesn't use any simulation at all).

Is this the reason for the variations in bidding during auctions that are otherwise identical?

Fortuna Fortis Felix

#8 benellis58

Group: Full Members
Posts: 188
Joined: 2022-July-07

Posted 2024-November-20, 21:48

It's amazing that the GIB robot has these results, since it plays a very poor system, can't "think", doesn't signal, has abominable "judgment", seems to treat all spot cards interchangeably, is an incredibly poor leader, is not only a wildly untalented bidder but also a ridiculously erratic and inconsistent one, frequently misdefends atrociously, and can't even declare very well. I may have lived a sheltered existence, but I cannot ever remember having seen even a single human bridge competitor who played as consistently and constantly poorly in all facets of the game as the GIB robots.

#9 smerriman

Group: Advanced Members
Posts: 4,713
Joined: 2014-March-15
Gender:Male

Posted 2024-November-20, 22:36

benellis58, on 2024-November-20, 21:48, said:

I assume you've never played in the Main Bridge Club on BBO. GIB has its flaws, but I disagree that it is inconsistent; for those who have learnt its system - most humans in the MBC are far worse.

A better handviewer
Reverse engineering GIB, part 1: A treatise on the insanity of bidding simulations

#10 benellis58

Group: Full Members
Posts: 188
Joined: 2022-July-07

Posted 2024-November-21, 02:10

Stephen, you're correct in assuming that I have never played in the Main Bridge Club on BBO, but I have played "real" bridge F2F for decades, and in that milieu I have never...ever...seen a human as completely and consistently incompetent as the GIB robots. If, as you say, "most humans in the MBC are far worse", then those poor souls have my deepest sympathy, because it is virtually impossible for me to imagine anyone playing worse than a GIB robot on a regular basis.

You're also VERY correct when you say that "GIB has its flaws". Lord KNOWS that you are correct with that statement!

I'm well aware that you are very knowledgeable about GIB and I always look forward to and appreciate your comments, but based on the many thousands of hands I have played in the robot world, I must disagree with you on the issue of robotic inconsistency. I will concede, however, that they are so abysmally weak in so many other areas that the inconsistency is one of their less egregious failings (bad as it nonetheless is). When I speak of their inconsistency, by the way, I am referring more to their inconsistency within a single auction than to inconsistency regarding their (very poor) system.

#11 johnu

Group: Advanced Members
Posts: 5,301
Joined: 2008-September-10
Gender:Male

Posted 2024-November-21, 03:01

benellis58, on 2024-November-20, 21:48, said:

I may have lived a sheltered existence, but I cannot ever remember having seen even a single human bridge competitor who played as consistently and constantly poorly in all facets of the game as the GIB robots.

Yes, you have led a sheltered existence. Or maybe just have a bad memory. Have you ever played in a novice game, e.g. ACBL game with limit of 0-20 masterpoints?

Yes, there are actual world class players who play in some of the BBO games. They are a very small minority. The vast majority of players are very bad players. Some are new players who have only played online a short time, or maybe in home games. Others may have played a long time but they basically have never improved after years of playing.

Have you played in a robot tournament? For the most part, just bidding a game and taking all your tricks is like a 60-70% matchpoint score. A few people will end up in a ridiculous contract. Others will fail to take the obvious tricks.

#12 lorserker

Group: Full Members
Posts: 141
Joined: 2007-November-26

Posted 2024-November-21, 06:14

benellis58, on 2024-November-20, 21:48, said:

I suggest you do an experiment.
Challenge GIB a few times (under Challenges->Challenge a robot). The basic GIB is free to challenge.
See how you fare. You may find it hard to beat.
If you can consistently beat it, try challenging in "Just Declare" mode. That is even harder to beat.
I regularly do this to practice.
Then try to challenge your friends who are much better than the robot. See how you fare against them.
(Irrelevant to the experiment, you may also challenge me, I would be happy to play)

My impression is that GIB is a pretty strong player. Yes, it plays very differently from humans.
With Ben we tried to build a robot that is supposed to be more "human". I invite you to try it and see what you think.

#13 lorserker

Group: Full Members
Posts: 141
Joined: 2007-November-26

Posted 2024-November-21, 08:56

pilowsky, on 2024-November-19, 15:47, said:

Is there an update log?
Do the improvements include bidding the same way in response to the same bids?

Yes, I have fixed the randomness.

#14 mycroft

Secretary Bird

Group: Advanced Members
Posts: 8,143
Joined: 2003-July-12
Gender:Male
Location:Calgary, D18; Chapala, D16

Posted 2024-November-21, 10:50

GIB, playing with GIB, scores 54% almost like clockwork in the average club, when asked to fill in. Even against pairs that don't play GIB (or even play totally not-GIB, Precision! 10-12 NT! Even DONT instead of Cappelletti!)

Does GIB bid, lead, signal, or play *differently* to human players? Of course. Is GIB able to adjust to partner? 100% no - you play with GIB, you play GIB's style and system or you lose.

But evidence - including the evidence in this thread - shows that *it is better than the average club player* if given a partner it can trust and understand.

If that means you've never seen a human as bad as GIB, then you play in *much better* clubs than I do.

If you're going to "never", I will repeat my "church 'kitchen' bridge" story. I open 2NT (22-24, remember, this is kitchen bridge). Partner raises me to 3, and puts down a boring flat 12 count. After I play the squeeze for the 13th trick, he asks me "should I have looked for slam?" He had been playing for decades at this point (but never duplicate). If you say "no bridge player", you need to deal with the ones who GIB will beat just because it can count to 33...

Long live the Republic-k. -- Major General J. Golding Frederick (tSCoSI)

#15 pescetom

Group: Advanced Members
Posts: 9,071
Joined: 2014-February-18
Gender:Male
Location:Italy

Posted 2024-November-21, 16:31

mycroft, on 2024-November-21, 10:50, said:

GIB can count to 33, but it frequently does so where humans would count 31 or 35 or admit a bidding error.

Having said that, it is competent (despite the vestigial system notes and untruthful explanations) in uncontested partial and game seeking situations and better than most humans in competitive situations.
I dispute (as often before) the 54% vs club number, but that is probably because an Italian club bids in a different way and the robot is too dumb to understand explanations.

#16 johnu

Group: Advanced Members
Posts: 5,301
Joined: 2008-September-10
Gender:Male

Posted 2024-November-21, 22:43

mycroft, on 2024-November-21, 10:50, said:

If you're going to "never", I will repeat my "church 'kitchen' bridge" story.

My work lunchtime bridge story.

When I was in graduate school, I had a part time job working at a large company that had a cafeteria. There was a long running bridge game that happened, and one of the regulars was out sick, so I was recruited to play. None of the players in the game ever played duplicate bridge.

I had a decent 14 or so HCP as South, and passed the 4♠ bid. Partner put down a 17 HCP hand, and 6♠ was laydown. I made the mistake of asking why partner didn't make a stronger bid, and was informed that 3♠ showed a good opening hand (OK, limit raises had not made it to the game yet), and that 4♠ showed a stronger hand than 3♠ (not sure when that may have been in style, if ever). I made another mistake when I tried to point out that 4♠ showed a preemptive type of hand, and the other players plus a couple of kibitzers assured me that 4♠ was indeed a very strong hand.

That was the last time I played lunchtime bridge.

#17 diana_eva

Group: Admin
Posts: 5,115
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2024-November-22, 05:52

johnu, on 2024-November-21, 22:43, said:

.... other players plus a couple of kibitzers assured me that 4♠ was indeed a very strong hand.

This is how I learned to play too FWIW, for quite some time. 2x opening was also strong. Our only forcing bids were jump and jump more

#18 pescetom

Group: Advanced Members
Posts: 9,071
Joined: 2014-February-18
Gender:Male
Location:Italy

Posted 2024-November-22, 09:22

diana_eva, on 2024-November-22, 05:52, said:

This is how I learned to play too FWIW, for quite some time. 2x opening was also strong. Our only forcing bids were jump and jump more

No leaning forwards in a hunch of expectation ? B-)

#19 hrothgar

Group: Advanced Members
Posts: 15,724
Joined: 2003-February-13
Gender:Male
Location:Natick, MA
Interests:Travel, Cooking, Brewing, Hiking

Posted 2024-November-22, 09:29

Thanks very much for posting how well GIB / Ben do in the various events

If possible, it would be interesting to see the Standard Deviation in addition to the means.

To me, the most interesting result is how much worse GIB does in the Zenith daylong than any other event...

Are the conditions of contest significantly different?

Alderaan delenda est

#20 diana_eva

Group: Admin
Posts: 5,115
Joined: 2009-July-26
Gender:Female
Location:bucharest / romania

Posted 2024-November-22, 11:46

hrothgar, on 2024-November-22, 09:29, said:

I'm not sure how to compute that, but Lorand will probably know.

About the conditions of contest: yes, they are widely different. We tried to choose games which are very different and appeal to different audiences.

Free Super Sunday is 12 boards, free to play, Best Hand, huge attendance (because free) and huge number of deal pools.
Premium BBO Daylong is 8 boards, Best Hand, has an entry fee, large number of players but significantly lower than any of the free games.
Premium Just Declare is 8 boards, "just declare", random (ie, not best-hand), has an entry fee.
ACBL Daylong is 12 boards, Best Hand, has an entry fee, specific audience and ACBL masterpoints are awarded using ACBL's formula, so the awards are different too, compared to BBO Daylongs.
Zenith Daylong Reward is 16 boards, random deals, has an entry fee and BB$ prizes (70% of the entry fees go back to top scorers).
Stars & Platinum is restricted to star players and players with at least 5 Platinum ACBL Masterpoints, it's 3 sessions over 3 days, 16 boards each session, Just Declare, random deals.
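
On the standard deviation question raised above, here is a minimal sketch of one way it could be computed, assuming the matchpoint percentages a robot scored in one event are available as a plain list of numbers. The values below are made up for illustration and are not results from the simulations, and `summarize` is a hypothetical helper name.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of matchpoint percentages."""
    if len(scores) < 2:
        # The sample standard deviation needs at least two data points.
        raise ValueError("need at least two scores")
    return mean(scores), stdev(scores)


# Hypothetical matchpoint percentages for one robot in one event
# (illustrative values only, not taken from the November simulations).
example_scores = [52.0, 58.0, 49.0, 61.0, 55.0]

avg, sd = summarize(example_scores)
print(f"mean = {avg:.2f}%, standard deviation = {sd:.2f}%")
```

`stdev` gives the sample standard deviation (dividing by n - 1); if the list covers every simulated game rather than a sample, `statistics.pstdev` would be the population version.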
