## Scoring guidance
The goal is not theoretical precision. The goal is consistent execution governance.
For each task, ask:

- How deep is the **context** (C) the task requires?
- How likely is **iteration** (I) before the work converges?
- How heavy is the **review** (R) burden?
- How high is the **rollout / integration risk** (K)?

Then assign a value from 1 to 3 for each dimension.
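A minimal sketch of what a recorded score might look like, assuming the four dimensions are C (context), I (iteration), R (review), and K (rollout/integration risk); the field names and `total` helper are illustrative, not part of CIRK itself:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CIRKScore:
    """One value from 1 to 3 per CIRK dimension (names are assumed)."""
    context: int    # C: context depth
    iteration: int  # I: iteration likelihood
    review: int     # R: review burden
    rollout: int    # K: rollout / integration risk

    def __post_init__(self):
        # Enforce the 1-3 scale from the scoring guidance.
        for name, value in vars(self).items():
            if value not in (1, 2, 3):
                raise ValueError(f"{name} must be 1, 2, or 3, got {value}")

    @property
    def total(self) -> int:
        """Sum of all four dimensions (ranges from 4 to 12)."""
        return self.context + self.iteration + self.review + self.rollout
```

Validating at construction time keeps downstream policy code simple: any `CIRKScore` that exists is already on the 1-3 scale.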
## Practical advice
If one team scores slightly differently from another, that is acceptable. What matters is that the scoring is stable enough to drive predictable policy.
Do not lower a score just because the team wants faster approval. Do not raise a score just to block work. CIRK should describe the task honestly.
If review is truly critical, score R3 even if the rest of the task feels simple. If rollout is sensitive, score K3 even if implementation is small.
## Contrast

Consider two tasks:

- Update copy on a marketing page
- Modify billing webhook behavior

Both may be quick to draft with AI, but their execution conditions are radically different.
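As a hedged illustration, the two tasks might score like this; the vectors below are assumptions chosen for the example, not canonical CIRK values:

```python
# Illustrative CIRK vectors (assumed values, not canonical scores).
marketing_copy = {"C": 1, "I": 1, "R": 1, "K": 1}   # autonomous: no gates
billing_webhook = {"C": 3, "I": 2, "R": 3, "K": 3}  # supervised: review + rollout gates

# Similar drafting effort, very different totals and execution conditions.
print(sum(marketing_copy.values()))   # → 4
print(sum(billing_webhook.values()))  # → 11
```

The drafting step is comparable; the difference lives entirely in the R and K gates.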
## Execution window
For teams that need a time-adjacent signal — particularly during transition from traditional planning — CIRK vectors can be mapped to an execution window.
| Vector profile | Execution window | Why |
|---|---|---|
| All ≤ 1 | Minutes | Autonomous, no gates |
| Low R, low K | Hours | Fast execution, light review |
| High I (I3) | Hours to 1 day | Iteration cycles add elapsed time |
| High R (R3) | +1 day | Human review is calendar-bound |
| High K (K3) | +1–2 days | Deploy coordination, runbooks, staging |
| Score ≥ 10 | Days | Multiple approval gates compound |
| C3 I3 R3 K3 | Blocked | Must decompose before estimating |
This is not a CIRK output. It is a derived heuristic that teams can calibrate to their own cadence.
High R and high K add calendar time — not because the work is large, but because humans and processes are in the critical path.
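The table above can be sketched as a calibratable lookup; the thresholds mirror the rows, but the function name and return strings are assumptions teams should tune to their own cadence:

```python
def execution_window(c: int, i: int, r: int, k: int) -> str:
    """Map a CIRK vector to a rough execution window.

    A derived heuristic, not a CIRK output; calibrate to your cadence.
    """
    if (c, i, r, k) == (3, 3, 3, 3):
        return "blocked: decompose before estimating"
    if c + i + r + k >= 10:
        return "days: multiple approval gates compound"
    if max(c, i, r, k) <= 1:
        return "minutes: autonomous, no gates"
    # I3 stretches the base window through iteration cycles.
    base = "hours to 1 day" if i == 3 else "hours"
    # R3 and K3 add calendar time: humans and processes in the critical path.
    extra_days = (1 if r == 3 else 0) + (2 if k == 3 else 0)  # K3's +1-2 taken at the upper bound
    if extra_days:
        return f"{base} + ~{extra_days} day(s) for review/rollout gates"
    return base
```

Note that the gates, not the code size, drive the window: `execution_window(1, 1, 3, 1)` adds a day purely for review.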
## Scoring FAQ
**"Stakeholders still ask how long a task will take. What do we tell them?"**

Give them an Execution Window derived from the vector. Explain that the window is governed by review and rollout constraints, not code complexity. This is more accurate than story points ever were, because it models the actual bottleneck.
**"Can we still measure velocity?"**

Not in the story-point sense. But you can track throughput by execution mode: how many autonomous tasks shipped this week vs. how many supervised tasks completed. That tells you more about team capacity than a velocity chart.
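A minimal sketch of that mode-based throughput tally; the task log and mode names below are hypothetical:

```python
from collections import Counter

# Hypothetical shipped-task log for one week: (task, execution mode) pairs.
shipped = [
    ("fix typo in settings page", "autonomous"),
    ("refactor auth middleware", "supervised"),
    ("update API docs", "autonomous"),
    ("modify billing webhook", "supervised"),
    ("tweak marketing copy", "autonomous"),
]

# Count completions per execution mode instead of summing story points.
by_mode = Counter(mode for _, mode in shipped)
print(by_mode["autonomous"], "autonomous vs", by_mode["supervised"], "supervised this week")
# → 3 autonomous vs 2 supervised this week
```

The ratio itself is informative: a shrinking autonomous share suggests tasks are accumulating review and rollout gates.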
**"What if our team doesn't use AI at all?"**

CIRK still works. Even in fully human workflows, the dimensions hold: context depth, iteration likelihood, review burden, and integration risk are real constraints regardless of who writes the code. The execution policies apply the same way.