Qodo.
How a quality-first AI platform actually thinks — and what to steal from it.
The 30-second version
Read this and you can hold a conversation. The rest of the page earns you the deep version.
1. Qodo is a "quality-first" AI dev platform. Its bet is that AI should verify and explain code, not just generate it.
2. Three surfaces over one brain: Gen (IDE), Merge (PR review), Command (CLI agents), all fed by a cross-repo Context Engine.
3. Qodo Merge = the old PR-Agent, which is open source (Apache-2.0) and the single best thing to read in the whole Qodo universe.
4. Two transferable ideas: flow engineering (don't one-shot hard tasks) and agentic RAG (retrieve code by structure + intent, not raw tokens).
5. The craft secret everywhere: force the model into typed structured output, then render and gate on it.
Quality is the product, not speed.
Qodo (Tel Aviv, founded 2022 by Itamar Friedman & Dedy Kredo; rebranded from CodiumAI in September 2024 alongside a $40M Series A) sells a thesis: most AI tooling optimizes for writing code faster, and almost nothing optimizes for trusting it afterwards.
That framing is exactly why it's worth studying if you care about explanation and evaluation. Qodo's center of gravity — review, test, audit, justify — is precisely the territory of a tool that exists to make people want to read an analysis of their code.
Everything below is one of two things: a product you might use, or an idea you should internalize. The ideas outlast the products.
Three surfaces, one brain.
Naming churns constantly — PR-Agent became Qodo Merge; Qodo Gen CLI became Qodo Command. Don't let the renames confuse you.
Qodo Merge: Formerly PR-Agent. AI pull-request review & automation across GitHub, GitLab, Bitbucket, Azure. Open-source core + hosted Pro tier.
Qodo Command: A CLI to build, run and schedule custom AI agents defined as TOML files. Can emit validated JSON, write files, run as an MCP server.
Qodo Gen: The IDE plugin (VS Code, JetBrains): code generation, test generation, chat, local review.
Context Engine: Agentic RAG across thousands of repos (semantic + architectural + temporal). Open Aware is the free MCP variant for public repos.
Internalize these and the rest is detail.
Flow engineering
Their AlphaCodium research (arXiv 2401.08500) lifted GPT-4's pass@5 accuracy on the CodeContests validation set from 19% to 44%, not with a bigger model but by structuring the task into verified stages: reflect → propose → generate tests → run & fix.
Lesson for you: a good audit is never "dump the diff, ask for a review." It's describe → analyze → evaluate → rank → explain, each stage feeding the next.
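To make the staging concrete, here is a minimal sketch of the describe → analyze → evaluate → rank → explain chain. It is not AlphaCodium's or PR-Agent's code: `run_flow`, `STAGES`, and the `ask(stage, prompt)` callable are hypothetical names, and `ask` stands in for whatever model call you use.

```python
from typing import Callable

# Stage names come from the lesson above; everything else is illustrative.
STAGES = ["describe", "analyze", "evaluate", "rank", "explain"]


def run_flow(diff: str, ask: Callable[[str, str], str]) -> dict[str, str]:
    """Run the stages in order, showing each stage every earlier result."""
    results: dict[str, str] = {}
    for stage in STAGES:
        # Earlier outputs become context for the current stage.
        context = "\n\n".join(f"[{name}]\n{text}" for name, text in results.items())
        prompt = f"Diff:\n{diff}\n\nEarlier stages:\n{context or '(none)'}"
        results[stage] = ask(stage, prompt)
    return results


def echo_model(stage: str, prompt: str) -> str:
    """Stub model for local experiments: returns the stage name in upper case."""
    return stage.upper()
```

Calling `run_flow(diff, echo_model)` exercises the plumbing without a real model. The design point is that no stage sees only the raw diff: each one builds on output that an earlier stage already produced.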
Agentic RAG
You can't paste a repo into a prompt. Qodo indexes with AST-aware chunking (a method kept with its class signature + imports) and embeds an LLM-written description next to each chunk, so retrieval matches intent, not just tokens.
Three retrieval axes — semantic, architectural, temporal — map almost 1:1 onto your scopes: module/repo, dependencies, and commit/branch history.
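A rough sketch of what "AST-aware chunking" can mean for Python source, using only the standard-library `ast` module. This is an illustration, not Qodo's indexer: `chunk_methods` is a hypothetical helper, it assumes the class header fits on one line, and it leaves out the LLM-written description that would be embedded next to each chunk.

```python
import ast


def chunk_methods(source: str) -> list[str]:
    """Return one chunk per method: module imports + class header + method source."""
    tree = ast.parse(source)
    lines = source.splitlines()
    # Top-level imports travel with every chunk so names in the body stay resolvable.
    imports = [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]
    chunks = []
    for cls in (n for n in tree.body if isinstance(n, ast.ClassDef)):
        header = lines[cls.lineno - 1]  # assumes a single-line `class Foo(Base):`
        for item in cls.body:
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                body = "\n".join(lines[item.lineno - 1:item.end_lineno])
                chunks.append("\n".join([*imports, header, body]))
    return chunks
```

Each chunk is small enough to embed, yet still says which class the method belongs to and what it imports, which is the context a naive fixed-size splitter throws away.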
The reference implementation.
It's open source and it already does the "analyze a changeset" half of any code-audit tool. Here is how a /review actually flows from a diff to a comment.
Two pieces are pure gold to copy. First, the diff is reformatted with explicit __new hunk__ / __old hunk__ markers plus line numbers — that's the trick that lets the model cite an exact file:line. Second, the git-provider abstraction includes a local provider, so it diffs a local branch, not only a hosted PR.
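The hunk-splitting trick is easy to prototype. The sketch below is an approximation of the idea, not PR-Agent's actual implementation: `split_hunk` is a hypothetical function, and the exact spacing of the output is an assumption. It numbers the lines of the new side and keeps the old side unnumbered.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,2 +10,2 @@".
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def split_hunk(hunk: str) -> str:
    """Rewrite one unified-diff hunk as a line-numbered new view plus an old view."""
    header, *body = hunk.splitlines()
    match = HUNK_HEADER.match(header)
    if match is None:
        raise ValueError(f"not a hunk header: {header!r}")
    new_no = int(match.group(2))  # first line number on the new side
    new_view, old_view = [], []
    for line in body:
        if line.startswith("+"):
            new_view.append(f"{new_no} {line}")
            new_no += 1
        elif line.startswith("-"):
            old_view.append(line)
        else:  # context line: present on both sides
            new_view.append(f"{new_no} {line}")
            old_view.append(line)
            new_no += 1
    return "\n".join([header, "__new hunk__", *new_view, "__old hunk__", *old_view])
```

Because every line in the new view carries its real line number, a finding like `auth/session.py:88` can be checked against the file instead of being taken on faith.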
Run it in ten minutes
pip install pr-agent
export OPENAI_KEY=sk-... # or ANTHROPIC_KEY, via litellm
# review a hosted pull request
pr-agent --pr_url https://github.com/owner/repo/pull/123 review
pr-agent --pr_url ... describe
pr-agent --pr_url ... improve
# steer it inline
pr-agent --pr_url ... review --pr_reviewer.extra_instructions="focus on security"
# a local branch with no hosted PR: set config.git_provider="local"
Config lives in settings/configuration.toml, overridable per-repo with a root .pr_agent.toml. The gap worth noticing: PR-Agent emits Markdown only and has no single-commit / module / whole-repo audit mode. That gap is the opportunity.
What the artifacts actually look like.
Seeing the real shapes is what turns "I know what Qodo is" into "I know how Qodo thinks." Examples are faithful in shape, paraphrased for clarity.
A. The output of /review
Note it's a table of typed fields, not prose — typed fields render into UI and gate CI; free prose does neither.
## PR Review
**Estimated effort to review [1-5]:** 3 — touches auth middleware + a refresh path.
**Score:** 72
**Relevant tests:** No — the token-refresh path has no coverage.
**Possible security concern:** Yes — `refresh_token` logged at INFO in `auth/session.py:88`.
### Key issues to review
| Issue | File | Why it matters |
|-------|------|----------------|
| Missing `None` check on `user` | `api/handler.py:42` | `AttributeError` on anonymous requests |
| Broad `except:` swallows errors | `core/db.py:130` | Hides connection failures |
B. The most instructive artifact — a prompt
Prompts are Jinja2 templates stored inside .toml files. The schema is the prompt: define the YAML shape, demand conformance, parse defensively.
[pr_review_prompt]
system = """You are PR-Reviewer, an AI reviewing a Git Pull Request.
Respond in YAML conforming exactly to this schema:
review:
  estimated_effort_to_review_[1-5]: int # 1 = trivial, 5 = very hard
  score: str # 0-100
  relevant_tests: str # "yes" / "no"
  security_concerns: str # "No" or describe + file:line
  key_issues_to_review:
    - relevant_file: str
      issue_header: str
      issue_content: str
      start_line: int
"""
user = """PR title: {{ title }}
The diff (line-numbered, with '__new hunk__' / '__old hunk__' markers):
{{ diff }}
Respond ONLY with valid YAML matching the schema above."""
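"Parse defensively" deserves a concrete shape. The sketch below covers the two steps around the YAML parser itself: stripping a stray code fence before parsing, and coercing loosely typed fields afterwards. The names `Review`, `strip_fences`, and `to_review` are hypothetical, the defaults are arbitrary choices, and the actual parsing (for example with PyYAML's `yaml.safe_load`) is left to the caller. One real trap is handled explicitly: YAML 1.1 loaders such as PyYAML read an unquoted `yes` or `no` as a boolean.

```python
import re
from dataclasses import dataclass

FENCE = "`" * 3  # built indirectly so the example can itself sit inside a fence


@dataclass
class Review:
    effort: int  # clamped to 1-5
    score: int  # clamped to 0-100
    relevant_tests: bool
    security_concerns: str


def strip_fences(raw: str) -> str:
    """Drop a wrapping code fence that models often add despite instructions."""
    match = re.search(FENCE + r"(?:yaml)?\s*\n(.*?)" + FENCE, raw, flags=re.DOTALL)
    return (match.group(1) if match else raw).strip()


def first_int(value: object, default: int) -> int:
    """Pull the first integer out of a value such as '3, touches auth'."""
    match = re.search(r"-?\d+", str(value))
    return int(match.group()) if match else default


def to_review(data: dict) -> Review:
    """Coerce the parsed mapping into strict types, tolerating missing keys."""
    review = data.get("review", data)  # tolerate a missing top-level key
    tests = review.get("relevant_tests")
    return Review(
        effort=max(1, min(5, first_int(review.get("estimated_effort_to_review_[1-5]"), 3))),
        score=max(0, min(100, first_int(review.get("score"), 0))),
        # The loader may already have turned yes/no into True/False.
        relevant_tests=tests is True or str(tests).strip().lower() in {"yes", "y", "true"},
        security_concerns=str(review.get("security_concerns", "No")).strip(),
    )
```

Once the result is a `Review`, rendering and gating are ordinary code: `review.score < 60` is a comparison, not a regex over prose.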
C. A Qodo Command agent
A "custom agent" is just a TOML block. output_schema forces validated JSON; execution_strategy="plan" is flow-engineering as a knob; exit_expression turns analysis into a CI gate.
[commands.audit_module]
description = "Audit a module and emit structured findings"
instructions = "Analyze files under {{ module_path }}; rank issues by severity."
arguments = [
{ name = "module_path", type = "string", required = true },
]
tools = ["filesystem", "ripgrep", "git", "qodo_aware_context_retriever"]
execution_strategy = "plan" # multi-step (vs "act" = one-shot)
exit_expression = "issues.length == 0" # CI pass / fail gate
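If you are not using Qodo Command, the same gate is a few lines of your own code. This sketch assumes the analysis step emits a JSON object with an `issues` array, mirroring the `issues.length == 0` expression above; `exit_code` is a hypothetical helper, and failing closed on malformed output is a design choice rather than anything Qodo documents.

```python
import json


def exit_code(agent_output: str) -> int:
    """Return 0 when the agent reported no issues, 1 otherwise.

    Malformed or unexpected output fails closed: a broken analysis
    should not let a build pass.
    """
    try:
        issues = json.loads(agent_output)["issues"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return 1
    return 0 if isinstance(issues, list) and len(issues) == 0 else 1
```

In a pipeline you would pass the agent's output to `exit_code` and hand the result to `sys.exit`, so the CI job status follows the findings.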
D. The pattern you'll reuse most — annotate the diff
The single most persuasive format: render the real change, drop a severity-colored callout at the exact line. Showing beats telling.
 def refresh(refresh_token):
-    log.debug("refreshing")
+    log.info("refreshing %s", refresh_token)
The raw refresh_token is now written to INFO-level logs, which are typically shipped off-box and retained. Log a redacted prefix instead, e.g. refresh_token[:6] + "…".
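The fix suggested in the callout can be a small helper. This is a sketch: `redact` and its `keep` parameter are hypothetical names, and whether a six-character prefix is acceptable depends on your token format.

```python
def redact(secret: str, keep: int = 6) -> str:
    """Return a short prefix of a secret followed by an ellipsis, for logging."""
    if len(secret) <= keep:
        return "…"  # too short to show any prefix without revealing the whole value
    return secret[:keep] + "…"
```

The logging call then becomes `log.info("refreshing %s", redact(refresh_token))`, which keeps enough of the value to correlate log lines without storing the credential.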
How a huge diff fits in a small window.
The hardest part of any code-analysis tool. Worth knowing cold.
- Additions > deletions. Collapse deleted files to a one-line list; drop deletion-only hunks.
- Token-cost sort. Estimate tokens per file with tiktoken; group by language.
- Greedy packing. Add patches until a buffer threshold below the model max; clip any single oversized patch.
- List the rest. Files that didn't fit are named ("other modified files") so the model knows they exist.
- Reserve output budget. Hold back tokens so the response isn't truncated mid-YAML.
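The packing steps above can be sketched in a few dozen lines. This is a simplified illustration, not PR-Agent's `pr_processing.py`: `FilePatch`, `estimate_tokens`, and `pack` are hypothetical names, the four-characters-per-token estimate is a crude stand-in for tiktoken, and largest-first ordering is an assumption about what "token-cost sort" means.

```python
from dataclasses import dataclass


@dataclass
class FilePatch:
    path: str
    patch: str


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly four characters per token.
    return max(1, len(text) // 4)


def pack(
    patches: list[FilePatch], model_max: int, output_reserve: int, buffer: int
) -> tuple[list[FilePatch], list[str]]:
    """Greedily pick patches, largest first, within the prompt budget.

    Returns the included patches and the paths that did not fit, so the
    prompt can still list them as "other modified files".
    """
    budget = model_max - output_reserve - buffer
    included: list[FilePatch] = []
    remaining: list[str] = []
    used = 0
    for fp in sorted(patches, key=lambda p: estimate_tokens(p.patch), reverse=True):
        cost = estimate_tokens(fp.patch)
        room = budget - used
        if cost <= room:
            included.append(fp)
            used += cost
        elif cost > budget and room > 0 and not included:
            # A single patch larger than the whole budget: clip it rather than drop it.
            clipped = fp.patch[: room * 4]
            included.append(FilePatch(fp.path, clipped))
            used += estimate_tokens(clipped)
        else:
            remaining.append(fp.path)
    return included, remaining
```

The two subtractions at the top are the part people forget: `output_reserve` protects the response from being cut off mid-YAML, and `buffer` absorbs the error in the token estimate.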
Lock in the vocabulary.
- Flow engineering: Structuring an LLM task into verified multi-step stages instead of one prompt. AlphaCodium's core idea.
- Agentic RAG: Retrieval where an agent iteratively re-queries and reasons over results, vs one-shot fetch-then-answer.
- PR compression: The token-aware diff-packing that fits a large changeset into a context window.
- Structured output: Forcing the model into a typed YAML/JSON schema so the result is machine-renderable and gate-able.
- Context Engine: Qodo's multi-repo index: AST chunking + NL descriptions + semantic / architectural / temporal retrieval.
- MCP: Model Context Protocol, the open standard by which Qodo exposes tools/context to, and consumes them from, Claude, Cursor, etc.
The canonical sources, in order.
- github.com/qodo-ai/pr-agent
Read settings/*_prompts.toml and algo/pr_processing.py. The highest-signal artifact in the whole Qodo universe.
- qodo-merge-docs.qodo.ai
The "core-abilities" pages on compression strategy and metadata.
- arXiv:2401.08500 — AlphaCodium
The intellectual foundation: flow engineering + test-based iteration.
- qodo.ai/blog — "RAG for large-scale code repos"
The Context Engine internals: AST chunking, embedded descriptions, retrieval axes.
- github.com/qodo-ai/agents
Example agents — look at ArchMind and Changelog Generator, the closest analogs to "audit → document."
- deepwiki.com/qodo-ai/pr-agent
An auto-generated architecture map if you want orientation before the source.
What the marketing won't tell you.
Names drift constantly. PR-Agent ↔ Qodo Merge, Gen CLI ↔ Command. Always check which era a blog post is from.
Pricing numbers in third-party articles conflict ($19 vs $30/user; 75 vs 250 free credits). Trust only qodo.ai/pricing.
Some 2026 claims are unverified. "Series B", "Qodo 2.0", SWE-bench scores mostly trace to SEO sites, not primary sources.
Qodo is not deterministic SAST. It complements Sonar's rules and gates; it doesn't replace them. AI review is probabilistic — it can miss, and it can hallucinate.