benchmarks
The numbers other vendors won't publish.
Bot join time. Recording-finalize latency. Per-platform reliability. Anti-bot detection escape rates. Recorded by a synthetic test suite that runs against our production API every 15 minutes per platform. The repo is public: fork it and run it against any vendor in this category.
No production data yet; we're pre-launch. Numbers populate weekly once the synthetic test suite starts emitting; cells stay as an em dash until then.
Why we publish this
Recall.ai claims a 99.9% SLA but doesn't publish the underlying numbers. MeetingBaaS, Vexa, and Attendee don't publish any. We instrument every bot we dispatch and emit the raw observations to a Postgres warehouse that the synthetic-test-suite repo reads from.
If you're picking infra for production, you should be able to read the numbers, run the test suite, and audit the methodology. The link below is the repo with the harness, the queries, and the historical CSVs.
github.com/meetbot-dev/benchmarks
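For orientation, here is roughly the shape of a raw observation row the harness writes to the warehouse. The field names below are illustrative assumptions, not the actual schema; the migrations in the repo are authoritative.

```ts
// Illustrative shape of one raw observation row (field names are assumptions,
// not the real warehouse schema; see the repo's migrations for that).
interface Observation {
  platform: "google_meet" | "ms_teams" | "zoom";
  dispatchedAt: string;           // ISO timestamp of the API dispatch
  joinedAt: string | null;        // null if the bot never made it into the meeting
  finalizeSeconds: number | null; // meeting-end-detected -> file in S3 + webhook delivered
  outcome: "joined" | "anti_bot_rejected" | "platform_error" | "our_error";
  kickedMidCall: boolean;         // joined successfully, then removed before meeting end
}
```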
Bot join time, by platform
Wall-clock seconds from API dispatch to bot-in-meeting. Lower is better. Aggregated over the last 30 days of synthetic runs (one dispatch per platform every 15 min · ~2,880 samples per platform).
| platform | mean | p50 | p95 | p99 | samples |
|---|---|---|---|---|---|
| Google Meet | — | — | — | — | — |
| Microsoft Teams | — | — | — | — | — |
| Zoom | — | — | — | — | — |
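As a sketch of how one join-time sample feeding this table could be taken: start the clock at dispatch, stop it when the bot reports it is in the meeting. `dispatchBot` and `getBotStatus` are hypothetical stand-ins, not a published API.

```ts
// Hypothetical stand-ins for the real client (not a published API).
declare function dispatchBot(opts: { meetingUrl: string }): Promise<{ botId: string }>;
declare function getBotStatus(botId: string): Promise<"dispatching" | "in_meeting" | "failed">;

// Measure wall-clock seconds from API dispatch to bot-in-meeting.
async function measureJoinSeconds(meetingUrl: string): Promise<number | null> {
  const dispatchedAt = Date.now();
  const { botId } = await dispatchBot({ meetingUrl });

  // Poll until the bot is in the meeting, or give up after 5 minutes.
  const deadline = dispatchedAt + 5 * 60 * 1000;
  while (Date.now() < deadline) {
    const status = await getBotStatus(botId);
    if (status === "in_meeting") {
      return (Date.now() - dispatchedAt) / 1000;
    }
    await new Promise((r) => setTimeout(r, 1000));
  }
  return null; // counted as a join failure, not as a join-time sample
}
```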
Recording finalize latency
Wall-clock seconds from meeting-end-detected to file-available-in-S3 + signed webhook delivered. Lower is better.
| platform | mean | p50 | p95 | p99 | samples |
|---|---|---|---|---|---|
| Google Meet | — | — | — | — | — |
| Microsoft Teams | — | — | — | — | — |
| Zoom | — | — | — | — | — |
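A minimal sketch of how this clock could work: it starts when meeting end is detected and stops only when both the S3 object exists and a signed webhook has been received and verified. The signature scheme and parameter names below are assumptions for illustration.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed HMAC-SHA256 webhook signature check (header and secret names are illustrative).
function verifyWebhookSignature(rawBody: string, signatureHeader: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  return a.length === b.length && timingSafeEqual(a, b);
}

// The metric requires the file AND the webhook, so the clock stops at the later of the two.
function finalizeSeconds(meetingEndedAt: number, s3ObjectSeenAt: number, webhookVerifiedAt: number): number {
  return (Math.max(s3ObjectSeenAt, webhookVerifiedAt) - meetingEndedAt) / 1000;
}
```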
Per-platform reliability
Percentage of dispatched bots that successfully joined the meeting. Failures are bucketed: anti-bot rejection, platform-side error, our error.
| platform | joined | anti-bot | platform err | our err |
|---|---|---|---|---|
| Google Meet | — | — | — | — |
| Microsoft Teams | — | — | — | — |
| Zoom | — | — | — | — |
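To make the bucketing concrete, here is a sketch of how tagged observations could be rolled up into the percentages in this table. The outcome names reuse the illustrative shape shown earlier, not the real schema.

```ts
type Outcome = "joined" | "anti_bot_rejected" | "platform_error" | "our_error";

// Turn a window of tagged observations into per-bucket percentages.
function reliabilityBreakdown(outcomes: Outcome[]): Record<Outcome, number> {
  const counts: Record<Outcome, number> = {
    joined: 0,
    anti_bot_rejected: 0,
    platform_error: 0,
    our_error: 0,
  };
  for (const o of outcomes) counts[o] += 1;

  const total = outcomes.length || 1; // avoid divide-by-zero on an empty window
  const pct: Record<Outcome, number> = { ...counts };
  for (const k of Object.keys(pct) as Outcome[]) pct[k] = (pct[k] / total) * 100;
  return pct;
}
```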
Anti-bot detection escapes
Percentage of dispatched bots flagged or kicked by the platform's anti-bot logic. Lower is better. Tracked separately from join failures because a bot can join then be kicked mid-call.
| platform | anti-bot | samples |
|---|---|---|
| Google Meet | — | — |
| Microsoft Teams | — | — |
| Zoom | — | — |
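Because a bot can be rejected at the door or kicked after a clean join, a row counts toward this table in either case. A sketch of that classification, reusing the illustrative fields from earlier:

```ts
// A row counts as an anti-bot detection if the platform's anti-bot logic acted on the bot,
// whether at join time or mid-call. Field names are assumptions, not the real schema.
function isAntiBotDetection(obs: { outcome: string; kickedMidCall: boolean }): boolean {
  return obs.outcome === "anti_bot_rejected" || obs.kickedMidCall;
}
```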
Marketing claims vs measured numbers
What each vendor claims on their pricing/landing page, vs what we can measure (where measurable). We re-run this when their pages change. Sources linked.
| vendor | marketing claim | measured (us) | source |
|---|---|---|---|
| Recall.ai | 99.9% SLA on enterprise tier | — | recall.ai/pricing |
| MeetingBaaS | no published reliability number | — | meetingbaas.com |
| Vexa | no published reliability number | — | vexa.ai |
| Attendee | no published reliability number | — | attendee.dev |
Methodology
A Trigger.dev cron fires every 15 minutes per platform. Each run dispatches a bot to a synthetic test meeting (a long-lived Meet/Teams/Zoom URL hosted on a control account), measures the join time, lets the bot record for exactly 60 seconds, then ends the call and measures finalize latency.
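A minimal sketch of that scheduled probe, assuming the Trigger.dev v3 schedules API; the real task definition lives in the repo, and `runSyntheticProbe` is a hypothetical stand-in for the dispatch/measure/record/end sequence described above.

```ts
import { schedules } from "@trigger.dev/sdk/v3";

// Hypothetical stand-in for the harness internals (dispatch, join timing,
// 60-second recording, call end, finalize timing); not a published API.
declare function runSyntheticProbe(platform: "google_meet" | "ms_teams" | "zoom"): Promise<void>;

// The real setup runs one schedule per platform; collapsed into a loop here for brevity.
export const syntheticProbe = schedules.task({
  id: "synthetic-probe",
  cron: "*/15 * * * *", // every 15 minutes
  run: async () => {
    for (const platform of ["google_meet", "ms_teams", "zoom"] as const) {
      await runSyntheticProbe(platform);
    }
  },
});
```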
Failures are tagged with a structured failure code (anti_bot_rejected · platform_error · our_error) so the reliability table separates 'platform kicked us' from 'we crashed'. The synthetic test suite is open under MIT — anyone can fork it and point it at a different vendor's API.
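A sketch of how a raw failure could be mapped onto those codes. The error fields and the string heuristics are assumptions for illustration; the actual tagging logic is in the harness.

```ts
type FailureCode = "anti_bot_rejected" | "platform_error" | "our_error";

// Map a raw failure to a structured code. The `source` field and message substrings
// are illustrative assumptions, not the harness's real error model.
function classifyFailure(err: { source?: "platform" | "bot"; message?: string }): FailureCode {
  const msg = (err.message ?? "").toLowerCase();
  if (msg.includes("denied entry") || msg.includes("removed by host")) return "anti_bot_rejected";
  if (err.source === "platform") return "platform_error";
  return "our_error"; // default to blaming ourselves, not the platform
}
```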
We don't smooth or hand-pick. The aggregations above come from a SQL query that runs on the raw observations table; the query is in the repo. If you find a bug in the methodology, open an issue.
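For flavor, here is the kind of aggregation that query performs, wrapped in a small Node reader using the `pg` client. Table and column names are assumptions; the checked-in SQL is authoritative.

```ts
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Illustrative join-time rollup: mean, percentiles, and sample count per platform
// over the last 30 days of raw observations.
export async function joinTimeAggregates() {
  const { rows } = await pool.query(`
    SELECT platform,
           avg(join_seconds)                                          AS mean,
           percentile_cont(0.50) WITHIN GROUP (ORDER BY join_seconds) AS p50,
           percentile_cont(0.95) WITHIN GROUP (ORDER BY join_seconds) AS p95,
           percentile_cont(0.99) WITHIN GROUP (ORDER BY join_seconds) AS p99,
           count(*)                                                   AS samples
    FROM observations
    WHERE dispatched_at > now() - interval '30 days'
      AND join_seconds IS NOT NULL
    GROUP BY platform
  `);
  return rows;
}
```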