A proposed standard that replaces story points with a model that defines how AI should execute, review, and deploy work.
Task: fix bug in payment calculation
C: 3 | I: 2 | R: 3 | K: 3
→ requires human review + controlled deployment
AI writes the code. CIRK governs the execution.
Story points ask
"How long will this take?"
CIRK asks
"How should this run?"
The shift
But AI changed the bottleneck. What matters now is not just how long work takes. It is how much context the work needs, how many iterations it takes to converge, how much review it demands, and how risky it is to deploy.
CIRK models that reality.
Story points governed human delivery for 20 years. CIRK is designed to govern AI delivery.
Time is a side effect of the vector — not a dimension of it.
The model
Each task is scored from 1 to 3 across four dimensions. The vector defines how it runs.
Context (C): how much context does the AI need?

| Level | Description |
| --- | --- |
| C1 | Isolated logic, local change, minimal context |
| C2 | Multiple components, shared patterns, moderate system understanding |
| C3 | Cross-system behavior, architectural reasoning, or domain-critical context |
Iteration (I): how many iterations until it works?

| Level | Description |
| --- | --- |
| I1 | Deterministic or near one-pass execution |
| I2 | Some iteration expected, usually 2–3 cycles |
| I3 | Ambiguous, exploratory, or high-complexity convergence |
Review (R): how much human review is required?

| Level | Description |
| --- | --- |
| R1 | Quick validation or spot-check |
| R2 | Rule-level or functional review required |
| R3 | Deep architectural, security, or mission-critical review |
Integration Risk (K): how dangerous is deploying this?

| Level | Description |
| --- | --- |
| K1 | Safe, additive, low-risk rollout |
| K2 | Existing behavior changes, moderate release caution |
| K3 | Coordinated rollout, migration, or cross-team dependency |
Composite score = C + I + R + K
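The model above can be sketched as a small value type. This assumes a Python implementation; the class and field names are illustrative, not part of the proposal:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CirkVector:
    """A CIRK score: each dimension ranges from 1 to 3."""
    c: int  # Context depth
    i: int  # Iteration cycles expected
    r: int  # Review intensity
    k: int  # Rollout (integration) risk

    def __post_init__(self):
        for name, value in (("C", self.c), ("I", self.i),
                            ("R", self.r), ("K", self.k)):
            if value not in (1, 2, 3):
                raise ValueError(f"{name} must be 1-3, got {value}")

    @property
    def composite(self) -> int:
        # Composite score = C + I + R + K (ranges from 4 to 12)
        return self.c + self.i + self.r + self.k

# The payment-bug example from the top of the page: C3 I2 R3 K3
assert CirkVector(c=3, i=2, r=3, k=3).composite == 11
```

Keeping the vector frozen matters: the point of CIRK is that the four numbers travel with the task, unchanged, from scoring to execution.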
"Score the execution reality, not the political preference."
What CIRK is not
CIRK is not a framework, not a tool, and not a time estimator. It is a proposed execution standard for AI-native development.
CIRK in 60 seconds
1. Pick a task. Any unit of work: a feature, a fix, a refactor, a migration.
2. Score it. Assign C, I, R, K values from 1 to 3. Ask: how much context? how many cycles? how much review? how risky is rollout?
3. Map it. The vector maps to an execution mode: autonomous, guided, draft-first, supervised, or blocked. Agents and humans follow the same rules.
4. Speak it. Teams gain a shared language: "This is high R." "Low K, we can ship." "I3 — let the agent draft first."
Rule of thumb
A composite of 4–5 can run autonomously, 6–9 needs review before merge, and 10+ calls for controlled, step-by-step execution.
Execution mapping
The vector defines what happens — not just how big the task is.
| Mode | Behavior |
| --- | --- |
| Autonomous | Agent executes without intervention. Auto-approval allowed. No checkpoints. |
| Guided | Agent executes. Human review required before merge. |
| Draft-first | Agent produces a draft. Human validates before any commit. |
| Supervised | Step-by-step execution. Approval per step. Deploy runbook required. |
| Blocked | Task must be decomposed. Execution not allowed. |
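One possible way to encode this mapping in code, as a sketch. The score bands follow the rule of thumb (4–5 automate, 6–9 review, 10+ control); treating I3 as a draft-first trigger and a malformed vector as blocked are assumptions of this sketch, not rules the proposal fixes:

```python
def execution_mode(c: int, i: int, r: int, k: int) -> str:
    """Map a CIRK vector to an execution mode.

    Bands follow the rule of thumb; the I3 draft-first override
    is based on the "I3 - let the agent draft first" guidance.
    """
    if not all(v in (1, 2, 3) for v in (c, i, r, k)):
        return "blocked"      # malformed vector: decompose and rescore
    total = c + i + r + k
    if total >= 10:
        return "supervised"   # step-by-step, approval per step
    if i == 3:
        return "draft-first"  # agent drafts, human validates pre-commit
    if total >= 6:
        return "guided"       # agent executes, review before merge
    return "autonomous"       # no checkpoints, auto-approval allowed

assert execution_mode(1, 1, 1, 1) == "autonomous"
assert execution_mode(3, 2, 3, 3) == "supervised"  # the payment-bug example
```

The precedence (supervised before draft-first before guided) is a design choice: the riskiest constraint on a task wins.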
Policy rules
R3 requires human review before merge. K3 requires a controlled rollout. I3 favors letting the agent draft first. A composite of 10+ runs supervised, step by step.
Quick start
1. Pick a task. Any task your team or agent is about to execute.
2. Score it. C(1–3) I(1–3) R(1–3) K(1–3).
3. Apply the policy. Sum the vector: 4–5 automate, 6–9 require review, 10+ control execution.
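The three policy bands (4–5 automate, 6–9 require review, 10+ control execution) can be sketched as a single function; the function name is illustrative:

```python
def apply_policy(total: int) -> str:
    """Map a composite CIRK score to its policy band.

    Bands: 4-5 automate, 6-9 require review, 10+ control execution.
    """
    if not 4 <= total <= 12:
        raise ValueError("composite must be between 4 and 12")
    if total <= 5:
        return "automate"
    if total <= 9:
        return "require review"
    return "control execution"

# The payment-bug example: C3 + I2 + R3 + K3 = 11
assert apply_policy(3 + 2 + 3 + 3) == "control execution"
```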
Examples
Each vector maps to concrete execution behavior, with reasoning for each dimension.
Backend: Fix null check in cart total; Login API; Billing webhook change.
Frontend: Update button color token; Dashboard layout revision; Auth UI redesign (login + recovery flows).
Agent: Rename internal variable and update references; Refactor repository layer for shared patterns; Migrate payment gateway provider.
Open standard
CIRK is a proposed open standard for execution governance in AI-native software development. We are looking for teams willing to test it.
It can be adopted in issue trackers, coding agents, pull request workflows, internal platforms, or governance layers such as Orbit618.
FAQ
Does CIRK replace story points?
Yes, in AI-assisted development contexts where execution governance matters more than effort estimation. Story points estimate human effort; CIRK estimates execution conditions for AI-native workflows.
Does CIRK estimate time?
Not directly. CIRK is about execution complexity, review intensity, and rollout risk. Teams may later derive time insights from calibration data, but duration is not the primary output.
Why is time not a dimension?
Time obscures the real constraint. CIRK makes it explicit.
Duration is a byproduct of complexity, iteration depth, review burden, and rollout sensitivity — not an independent variable. Adding time to the model would conflate cause and effect.
Teams that need execution windows can derive them from the vector: high R means longer review cycles, high K means wider rollout windows, high I means more iteration rounds. The vector already encodes the information — time just reads it.
Does CIRK apply when humans write the code?
Yes. Even when humans perform the implementation, CIRK still helps classify review burden, context depth, and deployment sensitivity.
Is CIRK tied to Orbit618?
No. Orbit618 is one possible implementation environment for CIRK, but CIRK is designed as a standalone proposed open standard.
Why move away from story points at all?
Because effort is no longer the most important variable for AI-assisted execution. CIRK makes the real constraints explicit instead of treating them as side rules layered on top of an effort model.
What matters more, the vector or the composite score?
The vector, not just the score. C3 I1 R1 K3 and C1 I3 R3 K1 may have the same sum but require very different execution policies. One is deploy-sensitive; the other is review-sensitive.
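The difference can be made concrete with a small helper that flags the dominant dimensions. The helper and its labels are illustrative only, not part of the proposal:

```python
def sensitivity(c: int, i: int, r: int, k: int) -> list[str]:
    """Flag which dimensions dominate a vector (illustrative helper)."""
    flags = []
    if k == 3:
        flags.append("deploy-sensitive")
    if r == 3:
        flags.append("review-sensitive")
    if i == 3:
        flags.append("iteration-heavy")
    if c == 3:
        flags.append("context-heavy")
    return flags

# Same composite (8), very different execution policies:
a = (3, 1, 1, 3)  # C3 I1 R1 K3
b = (1, 3, 3, 1)  # C1 I3 R3 K1
assert sum(a) == sum(b) == 8
assert sensitivity(*a) == ["deploy-sensitive", "context-heavy"]
assert sensitivity(*b) == ["review-sensitive", "iteration-heavy"]
```

This is why the composite score alone is a summary, not a policy: two tasks with identical sums can demand opposite safeguards.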
What exactly is CIRK?
CIRK is a proposed standard — not a framework, not a tool. It defines a shared language for execution governance. Any team, tool, or platform can implement it. MIT licensed. We are looking for teams willing to test it.
How do I get started?
Pick a task. Score it across C, I, R, K (1–3 each). Sum the vector. Apply the policy: 4–5 automate, 6–9 require review, 10+ control execution. That's it.
Does CIRK fit into existing agile processes?
Yes. CIRK replaces the estimation layer (story points), not the process. Sprints, standups, and backlogs stay the same. The difference is that each task carries an execution policy instead of an effort guess.
Open questions
These are tensions we are still debating. If you have answers or counterexamples, we want to hear them.
What happens to roadmap predictability without story points?
CIRK governs execution, not timelines. Teams that depend on velocity charts and burndowns may lose a planning signal. Is execution governance enough, or does a complementary time layer still matter?
Does CIRK cover governance beyond engineering?
CIRK models Context, Iteration, Review, and Integration Risk for software tasks. But execution governance in production systems also involves compliance, reversibility, and audit trails. Should CIRK expand or stay scoped to engineering?
How does CIRK coexist with Scrum, Kanban, and SAFe?
CIRK replaces the estimation layer, not the process. But in practice, story points are deeply embedded in sprint planning, capacity allocation, and stakeholder reporting. Can CIRK slot in without disrupting those flows?
Have a perspective? Join the discussion on GitHub →
MIT licensed. No dependencies. Works with anything. We want to see where it fails.