Is Facebook’s new poker AI really the best in the world?

Facebook released a paper and a blog post about a new poker AI called Pluribus that can beat human pros at six-player no-limit Hold'em. The paper's title (in Science!) calls it "superhuman", and the popular media are using words like "unbeatable".

But I think this is overblown.

If you look at the confidence intervals in the FB blog post above, you'll see that while Pluribus was definitely better against the human pros on average, Linus Loeliger "was down 0.5 bb/100 (standard error of 1.0 bb/100)." The post also mentions that "Loeliger is considered by many to be the best player in the world at six-player no-limit Hold'em cash games." Given that prior and the data, I'd assign something like a 65-75% probability that Pluribus is actually better than Loeliger. That's certainly impressive. But it's not "superhuman".
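For what it's worth, here's the back-of-the-envelope calculation behind that kind of number. This is just my own sketch, not anything from the paper: if you treat the reported estimate as approximately normal and ignore the prior, the data alone puts Pluribus's edge about half a standard error above zero, which works out to roughly a 69% chance that its true edge over Loeliger is positive. Shading that around a bit to account for the prior that Loeliger is one of the best in the world is how I land in the 65-75% range.

```python
# Back-of-the-envelope only: assumes the reported estimate is roughly
# normally distributed and uses a flat prior (i.e. ignores any prior
# belief about Loeliger's skill). Numbers are from the FB blog post.
from statistics import NormalDist

loeliger_result = -0.5   # bb/100: Loeliger was down 0.5 against Pluribus
standard_error = 1.0     # bb/100

# Pluribus's estimated edge over Loeliger is +0.5 bb/100, i.e. 0.5 standard
# errors above zero. The probability that the true edge is positive is then:
z = -loeliger_result / standard_error
p_pluribus_better = NormalDist().cdf(z)
print(f"P(Pluribus better than Loeliger | data) ≈ {p_pluribus_better:.2f}")
# prints ≈ 0.69
```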

I don't know enough about poker or the AIVAT technique they used for variance reduction to get much deeper into this. How do people currently quantify the skill differences among top pros?

I'm also a bit skeptical about the compensation scheme that was adopted – if the human players were paid based on anything other than the exact inverse of the outcome metric being reported, I'd find that shady – but the paper didn't include those details.

Thoughts?