Best-in-Class Signal-to-Noise

Tanagram Canon meaningfully outperforms other code review bots in signal-to-noise, with some users seeing a greater than 3× improvement.

A chart showing "Comment addressed rate" for various code review bots: Qodo at 5.9%, Greptile at 12.0%, Coderabbit at 13.1%, Cubic at 15.3%, Bugbot at 15.4%, and Tanagram ranging from 16.5–48%, with an outlier at 64% for one customer who mandated that all Tanagram comments be addressed before people reviewed a PR.

Methodology

We consider a comment on file f at line n on commit C_i "addressed" if there exists a later commit C_j (j > i) that also modifies line n in file f. This is admittedly a loose definition of "addressed", but we found that it correlated well with bots' self-reporting of whether a given comment was resolved.
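In code, the predicate looks roughly like this (a minimal sketch, assuming each later commit's diff has already been reduced to a mapping from file path to the set of touched line numbers; `is_addressed` and the data shapes are illustrative, not part of our actual pipeline):

```python
def is_addressed(path, line, later_diffs):
    """Return True if any later commit C_j (j > i) modifies `line` in `path`.

    `later_diffs` is assumed to hold one dict per subsequent commit,
    mapping each modified file path to the set of line numbers it touched.
    """
    return any(line in diff.get(path, set()) for diff in later_diffs)

# Hypothetical example: a comment on line 42 of app.py, followed by two commits.
later = [
    {"app.py": {41, 42, 43}},  # this commit touches the commented line
    {"util.py": {7}},          # this one does not
]
print(is_addressed("app.py", 42, later))  # True
```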

For non-Tanagram data, we started by searching open-source projects on GitHub to identify repos with between 100 and 10,000 stars that contained comments from any of the identified bots. We ended up with 65 arbitrary repos, most of which had only one of the bots installed.

For each such repo, we selected up to 50 comments per bot, drawing only from merged PRs and taking no more than one comment from any given PR to avoid per-PR bias (e.g. a hotfix PR that attracted many comments but was merged in haste).
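The sampling constraints can be sketched roughly as follows (assuming comments have already been fetched into dicts with `bot`, `pr`, and `pr_merged` fields; these names are illustrative, not GitHub's schema):

```python
import random

def sample_comments(comments, per_bot_cap=50, seed=0):
    """Sample bot comments for one repo: merged PRs only, at most one
    comment per PR (to avoid per-PR bias), at most `per_bot_cap` per bot."""
    rng = random.Random(seed)
    pool = [c for c in comments if c["pr_merged"]]
    rng.shuffle(pool)  # avoid ordering bias when applying the caps

    sampled, seen_prs, per_bot = [], set(), {}
    for c in pool:
        bot, pr = c["bot"], c["pr"]
        if pr in seen_prs or per_bot.get(bot, 0) >= per_bot_cap:
            continue
        sampled.append(c)
        seen_prs.add(pr)
        per_bot[bot] = per_bot.get(bot, 0) + 1
    return sampled
```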

Given rate-limit constraints, we ended up with 2052 comments:

  • 502 from Greptile
  • 947 from Coderabbit
  • 17 from Qodo
  • 262 from Cubic
  • 324 from Bugbot

In GitHub's API, each comment is anchored to a specific commit C_i in a PR. For each comment, we listed the commits in its corresponding PR, generated the git diff for all subsequent commits C_j (j > i), and checked whether the comment's file and line were contained in that diff.
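The line-membership check depends on parsing the unified-diff hunks GitHub returns in each file's `patch` field. A simplified sketch (it tracks old-side numbers for deletions and new-side numbers for additions, and ignores renames and other edge cases):

```python
import re

# Hunk header, e.g. "@@ -10,3 +10,3 @@": captures old and new start lines.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")

def touched_lines(patch):
    """Collect line numbers modified by a unified-diff patch: old-side
    numbers for deletions, new-side numbers for additions."""
    touched, old_line, new_line = set(), 0, 0
    for raw in patch.splitlines():
        m = HUNK_RE.match(raw)
        if m:
            old_line, new_line = int(m.group(1)), int(m.group(2))
            continue
        if raw.startswith("\\"):   # "\ No newline at end of file"
            continue
        if raw.startswith("-"):
            touched.add(old_line)
            old_line += 1
        elif raw.startswith("+"):
            touched.add(new_line)
            new_line += 1
        else:                      # context line advances both sides
            old_line += 1
            new_line += 1
    return touched

patch = "@@ -10,3 +10,3 @@\n context\n-old line\n+new line\n context"
print(touched_lines(patch))  # {11}
```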

For Tanagram, we started from our own database records of the comments we generated rather than searching GitHub, but the data was otherwise derived the same way. It is segmented by user, hence the range in the results.
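Per-user rates then reduce to a simple tally (a sketch; representing records as `(user, addressed)` pairs is our shorthand here, not a real schema):

```python
from collections import defaultdict

def addressed_rates(records):
    """Compute the fraction of addressed comments per user from
    (user, addressed_bool) pairs."""
    tallies = defaultdict(lambda: [0, 0])  # user -> [addressed, total]
    for user, addressed in records:
        tallies[user][0] += int(addressed)
        tallies[user][1] += 1
    return {user: hit / total for user, (hit, total) in tallies.items()}

print(addressed_rates([("u1", True), ("u1", False), ("u2", True)]))
# {'u1': 0.5, 'u2': 1.0}
```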

What Makes Tanagram Different

With so many players in the market, code review is commoditized¹. Everyone has access to the same market beta in model intelligence and harnesses.

Our alpha comes from what we choose to evaluate.

Most other bots are designed to find a bug, any bug. That's certainly useful — we use Bugbot and it's caught major issues — but it leads to noisy, inconsistent results:

  • Low-priority nits
  • Overlooked problems
  • Different problems coming up every time you push a change

In contrast, Tanagram focuses on the exact rules your team cares about. Because every check is something your team has explicitly chosen, we can give precise instructions to our agent, improving both precision and recall.

Why It Matters

Writing software is easier than ever, but knowing what to write (architecture, design, evolving patterns) becomes the bottleneck. In an org of 20 engineers, one or two of them are the subject-matter experts who get pulled onto every project to offer their expertise and judgment.

This doesn't scale. As an industry, we've sped up other aspects of our software factory, but expert review remains the rate-limiting step.

Tanagram solves this bottleneck by indexing every team's history, insights, and expertise, and automatically using that context to guide engineering output across the development lifecycle. It's a copy of your best principal engineer, available to every engineer.

Try Tanagram

Tanagram Canon is a repository of team-specific rules that powers code reviews on GitHub.

It works alongside our CLI, which uses those same rules to steer agents while they're generating code, before PR time.

The CLI also powers Lore, which enables teams to archive and collaborate on coding agent threads.

We encourage you to explore the documentation or try it out.

Footnotes

  1. That's why we consider code review to be a feature within our broader product offering.