// results — bob is watching
Bob is judging all five… this might sting.
Bob's global verdict:
// top 10 worst answers — all time
// model analytics — who is winning, failing, and slowing down
average judge latency: 0ms
average spend per run: $0.00
Best Overall: none (waiting for runs)
Most Reliable: none (waiting for runs)
Fastest: none (waiting for runs)
Best Value: none (waiting for runs)
Recommended Default: none (waiting for runs)
Cheap Fallback: none (waiting for runs)
Premium Option: none (waiting for runs)
Cheapest Reliable: none (waiting for runs)
Top Spend Driver: none (waiting for runs)
Promote: none (waiting for runs)
Demote: none (waiting for runs)
// recent runs — receipts included
none
most affected provider
Latest Judge Parse Failures
Select a run
Pick a recent run to inspect prompt, providers, timings, and Bob's verdict.