Methodology

How a protocol's letter grade is computed. The rubric is open and deterministic.

Last updated 2026-06-11 · Version v1.7.0 · License CC-BY 4.0

Rubric overview

DeFi Risk grades a protocol on 13 evidence categories and rolls those into a single letter A through F. The rubric is open, deterministic, and versioned. Two reviewers reading the same evidence should arrive at the same letter.

The letter is produced by a three-stage pipeline: each category receives a severity score (0–100) based on the proportion of red and yellow factors it contains; those category severities are aggregated into a protocol risk score (0–100) with core-five categories weighted 1.5×; the risk score is bucketed into a letter band. Critical-flag reds carry an additional penalty and can force the bottom two grades regardless of the score. There is no hidden weight vector and no trained model — every step of the derivation is published.

Shape of a grade

A protocol page carries a letter, a one-word meaning, a risk score, and a verdict sentence. The letter and score are computed; the meaning is fixed per letter; the verdict is human-written prose under a 240-character cap. Where a single core-five category severity is high enough to override the score-based letter, the page also shows the cap reason.
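As a minimal sketch, the page fields described above could be modeled as a small record. The class name `GradeCard`, its field names, and the validation behavior are assumptions made for illustration, not the site's actual schema. The letter-to-meaning mapping is taken from the letter-threshold table.

```python
from dataclasses import dataclass
from typing import Optional

# One-word meanings, fixed per letter (from the letter-threshold table).
MEANINGS = {"A": "Resilient", "B": "Sound", "C": "Watch",
            "D": "Compromised", "F": "Failing"}

VERDICT_MAX_CHARS = 240  # cap on the human-written verdict sentence


@dataclass(frozen=True)
class GradeCard:
    """Hypothetical record for the fields shown on a protocol page."""
    letter: str
    risk_score: float
    verdict: str
    # Set only when a core-five category cap overrode the score-based letter.
    cap_reason: Optional[str] = None

    def __post_init__(self) -> None:
        if self.letter not in MEANINGS:
            raise ValueError(f"unknown letter: {self.letter!r}")
        if not 0 <= self.risk_score <= 100:
            raise ValueError("risk score must be within 0-100")
        if len(self.verdict) > VERDICT_MAX_CHARS:
            raise ValueError("verdict exceeds the 240-character cap")

    @property
    def meaning(self) -> str:
        # The meaning is looked up from the letter, never stored separately.
        return MEANINGS[self.letter]
```

Deriving `meaning` from `letter` rather than storing it keeps the two from drifting apart, which matches the statement that the meaning is fixed per letter.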

Inputs

Each of the 184 evidence factors is reviewed against a fixed citation list: audit reports, on-chain state, governance forums, public incident records, source repos, and operator disclosures. Every factor row on a protocol page links to its sources. If a citation cannot be made public, the factor is marked n/a — under embargo and the underlying claim does not contribute to the grade.

What we do not grade

We do not score user experience, frontend hosting, token price, marketing surface, or community sentiment. Where a category technically applies but the protocol does not use the relevant primitive — e.g. an isolated AMM with no oracle dependency — the category is marked n/a and excluded from the severity calculation entirely.

Letter thresholds

The letter is derived from a protocol risk score (0–100) and a count of critical-flag reds. The first matching rule wins; precedence runs F > D > C > B > A.

Grade	Meaning	Threshold (first match wins)	Critical-flag trigger
A	Resilient	Risk score ≤ 12. No critical flags.	0
B	Sound	Risk score > 12 and ≤ 20 with 0 critical flags; or exactly 1 critical flag with risk score ≤ 20 (a critical flag prevents A regardless of score).	≤ 1
C	Watch	Risk score > 20 and ≤ 35. Material gaps warrant monitoring.	≤ 1
D	Compromised	Risk score > 35 and ≤ 55; or a single core-five category severity ≥ 60 overrides a higher natural letter.	≥ 2
F	Failing	Risk score > 55, or a core-five category severity ≥ 90 overrides any natural letter.	≥ 3
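The table can be read as a chain of checks evaluated from F down to A. The sketch below is one interpretation, not the published implementation: the function name `assign_letter` and its parameters are hypothetical, and it assumes that the "Critical-flag trigger" column gives the flag count that forces D (2 or more) or F (3 or more), and that `risk_score` already includes any critical-red penalty.

```python
from typing import Iterable


def assign_letter(risk_score: float,
                  critical_flags: int,
                  core_five_severities: Iterable[float] = ()) -> str:
    """Return the first matching letter, checking F, D, C, B, then A."""
    # Worst severity among the assessed core-five categories (0 if none given).
    worst_core = max(core_five_severities, default=0.0)

    if risk_score > 55 or critical_flags >= 3 or worst_core >= 90:
        return "F"
    if risk_score > 35 or critical_flags >= 2 or worst_core >= 60:
        return "D"
    if risk_score > 20:
        return "C"
    # A single critical flag prevents A even when the score is 12 or lower.
    if risk_score > 12 or critical_flags == 1:
        return "B"
    return "A"


if __name__ == "__main__":
    print(assign_letter(10, 0))           # low score, no flags
    print(assign_letter(10, 1))           # same score, one critical flag
    print(assign_letter(10, 0, [65.0]))   # one core-five category at severity 65
```

Because the checks run from the worst letter upward, each branch only needs to test its lower bound; the upper bound is implied by the branches that did not match.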

Note

A protocol with no oracle dependency cannot be downgraded for missing oracle controls. The same applies to the other core-five categories. Categories with no assessed factors (n/a) are excluded from the severity and risk-score calculations entirely.

Scoring pipeline

The three steps from raw factors to letter:

Per-category severity. For each of the 13 categories, count assessed (non-gray) factors only: severity = (red × 3 + yellow × 1) / (assessed × 3) × 100. A category with all green factors scores 0; all red scores 100. Gray factors are excluded from the denominator — data gaps neither help nor hurt the protocol.
Protocol risk score. A weighted average of per-category severities, where core-five categories (code, governance, oracle, operational history, fork lineage) contribute at 1.5× and all other categories at 1.0×. Gray categories (no assessed factors) are excluded from the weighted average. A critical-red penalty of 5 points per critical-flag red (at most 15 points in total) is then added to the weighted average, and the resulting risk score is capped at 100.
Letter band. The risk score and critical-flag count are compared against the threshold table above. After the natural letter is assigned, a cap override checks each core-five category: a severity ≥ 60 caps the letter at D; a severity ≥ 90 forces F, regardless of the natural letter. The cap reason is shown on the protocol page.
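The first two steps above can be sketched in a few lines. This is an illustrative reading of the published formulas rather than the project's own code: the function names, the use of `None` for a gray category, and the `ValueError` raised when nothing is assessed are all assumptions. Categories are keyed by the IDs listed in the core-five table (1, 2, 3, 5, 8); any other ID is treated as a 1.0× category.

```python
from typing import Mapping, Optional

CORE_FIVE_IDS = {1, 2, 3, 5, 8}          # category IDs from the core-five table
CORE_WEIGHT, OTHER_WEIGHT = 1.5, 1.0
PENALTY_PER_CRITICAL_RED, PENALTY_CAP = 5, 15


def category_severity(red: int, yellow: int, green: int) -> Optional[float]:
    """Severity 0-100 over assessed factors; None for a gray (n/a) category."""
    assessed = red + yellow + green       # gray factors are never counted
    if assessed == 0:
        return None
    return (red * 3 + yellow * 1) / (assessed * 3) * 100


def protocol_risk_score(severities: Mapping[int, Optional[float]],
                        critical_reds: int) -> float:
    """Weighted average of category severities plus the critical-red penalty."""
    weighted_sum = 0.0
    total_weight = 0.0
    for category_id, severity in severities.items():
        if severity is None:              # gray category: excluded entirely
            continue
        weight = CORE_WEIGHT if category_id in CORE_FIVE_IDS else OTHER_WEIGHT
        weighted_sum += weight * severity
        total_weight += weight
    if total_weight == 0:
        # Assumption: the source does not say what happens with no assessed categories.
        raise ValueError("no assessed categories")
    penalty = min(PENALTY_PER_CRITICAL_RED * critical_reds, PENALTY_CAP)
    return min(weighted_sum / total_weight + penalty, 100.0)
```

For example, a category with 1 red, 1 yellow, and 2 green factors has severity (3 + 1) / 12 × 100, or about 33.3. A protocol whose only assessed categories are code (ID 1) at severity 20 and one 1.0× category at severity 10, with a single critical red, scores (1.5 × 20 + 1.0 × 10) / 2.5 + 5 = 21.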

Critical factors

A critical factor is an evidence item that, when red, prevents an A grade on its own and adds 5 points to the protocol risk score (at most 15 points across all critical reds). A single critical-flag red caps the grade at B; two cap it at D; three or more force F. Critical status is asserted explicitly in the factor definition file and cannot be inferred at grading time. As of rubric v1.7.0 there are 20 critical factors out of 184 total.

Examples of critical-factor categories include:

Live admin without delay. Any path that lets a single key — or a single multisig threshold — execute a state-changing upgrade with no timelock delay.
Unaudited live primitive. A core primitive (matching engine, lending core, oracle adapter) deployed without a public audit covering the live commit.
Oracle without staleness floor. An oracle adapter with no on-chain staleness or deviation guard against the underlying feed.
Solvency self-attestation only. A protocol whose reserve solvency is asserted only off-chain, with no on-chain or proof-of-reserve mechanism.

The complete critical-factor list lives in rubric/critical.yml in the public repo. Each entry carries the exact predicate, the citation requirement, and the chain of reasoning from evidence to grade.

Core-five categories

The 13-category taxonomy on a protocol page is the full evidence surface. Five of those categories are load-bearing for the letter:

Category	ID	What it covers
Code & audits	1	Source-level review state, audit coverage of the live commit, severity-by-severity remediation log.
Governance & admin controls	2	Multisig topology, timelock delays, emergency paths, signer rotation, upgrade scope.
Oracle & external deps	3	Feed providers, heartbeat & deviation, fallback logic, cross-chain relay risk.
Operational history	5	Incident record, postmortem cadence, recovery posture, response times.
Fork / dependency lineage	8	Upstream code provenance, divergence from canonical fork, dependency CVE exposure.

Core-five categories carry a 1.5× weight in the protocol risk score, and are the only categories subject to the single-category cap override (severity ≥ 60 caps the letter at D; severity ≥ 90 forces F). The remaining eight categories (economic risk, real-time signals, dev identity, post-deploy hygiene, cross-chain, threat intelligence, tooling, and response hygiene) are weighted at 1.0× and cannot trigger the cap. They contribute to the risk score in full, but a single weakness in one of those eight categories has less leverage than the same weakness in a core-five category.
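The difference in leverage can be made concrete with a small calculation. The sketch below assumes a hypothetical protocol in which all 13 categories are assessed, 12 of them score 0, and there are no critical reds; the function name is illustrative only.

```python
CORE_WEIGHT, OTHER_WEIGHT = 1.5, 1.0
N_CORE, N_OTHER = 5, 8


def score_with_single_weakness(severity: float, in_core_five: bool) -> float:
    """Weighted average when one category has `severity` and the other 12 score 0."""
    total_weight = N_CORE * CORE_WEIGHT + N_OTHER * OTHER_WEIGHT  # 7.5 + 8.0 = 15.5
    weight = CORE_WEIGHT if in_core_five else OTHER_WEIGHT
    return weight * severity / total_weight


core = score_with_single_weakness(60.0, in_core_five=True)
other = score_with_single_weakness(60.0, in_core_five=False)
print(f"core-five: {core:.2f}, other: {other:.2f}")
```

Under these assumptions the same severity moves the risk score 1.5 times as far when it sits in a core-five category. At severity 60 the core-five case also triggers the D cap, while the other case leaves the letter to the score alone.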

Appeals

A protocol team that disputes a grade can file an appeal. Appeals are public from the moment they are filed.

Open a dispute against a specific factor — not against the letter. Letters are downstream of factors, so a successful appeal targets the underlying evidence.
Submit counter-citations that bear on the predicate. New audits, on-chain changes, or updated governance configurations all qualify.
Two-reviewer adjudication against the rubric. The first reviewer is the one who originally signed off; the second is drawn from the standing reviewer pool.
Decision published within 14 days, with full reasoning. The factor either flips, stays, or is rewritten — and the resulting letter is recomputed.

Appeals do not pause the grade

A grade stays live throughout the appeal window. We do not gate publication on the appeal calendar. If an appeal succeeds, the historical record shows both the pre-appeal and post-appeal grade with the date of change.

Changelog

The rubric itself is versioned. Material changes are documented in the rubric changelog; every protocol assessment is stamped with the rubric version it was graded against.

A protocol's letter can change for two reasons: the evidence changed, or the rubric changed. Both are tracked separately. When a rubric version flips a letter without any evidence change, the assessment carries a small rubric-shift note explaining which rule moved.
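One way to keep the two causes separate is to stamp each assessment with both the rubric version and a fingerprint of its evidence, then compare consecutive stamps. The sketch below is purely illustrative: `StampedAssessment`, `evidence_digest`, and both functions are hypothetical names, and the source does not describe how evidence changes are actually detected.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StampedAssessment:
    """Hypothetical snapshot of one published grade."""
    letter: str
    rubric_version: str    # the rubric version the grade was computed against
    evidence_digest: str   # assumed fingerprint of factor colors and citations


def letter_change_reasons(before: StampedAssessment,
                          after: StampedAssessment) -> List[str]:
    """List why the letter moved; empty when the letter is unchanged."""
    if before.letter == after.letter:
        return []
    reasons = []
    if before.evidence_digest != after.evidence_digest:
        reasons.append("evidence")
    if before.rubric_version != after.rubric_version:
        reasons.append("rubric")
    return reasons


def needs_rubric_shift_note(before: StampedAssessment,
                            after: StampedAssessment) -> bool:
    """True when the letter flipped with a rubric change and no evidence change."""
    return letter_change_reasons(before, after) == ["rubric"]
```

With stamps like these, a letter that moves while the evidence fingerprint stays the same can be attributed to the rubric alone, which is the case that carries the rubric-shift note.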