feat: add gitea agentic runtime control plane
266
docs/superpowers/plans/2026-03-13-gitea-agentic-runtime-plan.md
Normal file
@@ -0,0 +1,266 @@
# Gitea Agentic Runtime Implementation Plan

> **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Build a Gitea-backed workflow spec/compiler/runtime subsystem that turns the existing delivery skill into a policy-enforced execution engine with automated tests and real Gitea acceptance.

**Architecture:** Add a small Python control-plane package under `engine/devops_agent`, keep provider logic isolated behind interfaces, compile workflow specs into JSON lock artifacts, and run policy-guarded issue workflows against Gitea. Persist plans and evidence locally so tests and operators can inspect every run deterministically.

**Tech Stack:** Python 3.11+, pytest, requests-free stdlib HTTP where practical, Markdown/JSON workflow artifacts, existing repository docs and Gitea workflow templates

---

## Chunk 1: Project Scaffold and Spec Format

### Task 1: Create the Python package scaffold and test harness

**Files:**
- Create: `pyproject.toml`
- Create: `engine/devops_agent/__init__.py`
- Create: `engine/devops_agent/cli.py`
- Create: `tests/unit/test_smoke_imports.py`

- [ ] **Step 1: Write the failing package smoke test**

Create a test that imports the CLI entrypoint and key modules that will exist after scaffolding.

- [ ] **Step 2: Run the smoke test and confirm failure**

Run: `python -m pytest tests/unit/test_smoke_imports.py -q`
Expected: import failure because package files do not exist yet

- [ ] **Step 3: Add the minimal package scaffold**

Create the package directories, `__init__.py`, and a CLI module with a placeholder command parser.

- [ ] **Step 4: Add the Python project file**

Create `pyproject.toml` with the minimum metadata and `pytest` test configuration.

- [ ] **Step 5: Re-run the smoke test**

Run: `python -m pytest tests/unit/test_smoke_imports.py -q`
Expected: pass

### Task 2: Define the workflow spec shape and a sample Gitea workflow

**Files:**
- Create: `engine/devops_agent/spec.py`
- Create: `workflows/gitea-issue-delivery.md`
- Create: `tests/unit/test_spec_loader.py`

- [ ] **Step 1: Write the failing spec loader tests**

Test frontmatter parsing, Markdown body extraction, and required field detection.

- [ ] **Step 2: Run the spec loader tests and confirm failure**

Run: `python -m pytest tests/unit/test_spec_loader.py -q`
Expected: missing loader implementation

- [ ] **Step 3: Implement the minimal spec loader**

Parse frontmatter and Markdown body into a structured object.

- [ ] **Step 4: Add the sample Gitea workflow spec**

Define one workflow that models fixed-issue delivery with safe comment output.

- [ ] **Step 5: Re-run the spec loader tests**

Run: `python -m pytest tests/unit/test_spec_loader.py -q`
Expected: pass

## Chunk 2: Compiler, Validator, and Policy Enforcement

### Task 3: Implement lock compilation

**Files:**
- Create: `engine/devops_agent/compiler.py`
- Create: `tests/unit/test_compiler.py`

- [ ] **Step 1: Write failing compiler tests**

Cover default normalization, trigger expansion, and lock artifact emission.

- [ ] **Step 2: Run the compiler tests and confirm failure**

Run: `python -m pytest tests/unit/test_compiler.py -q`
Expected: missing compiler implementation

- [ ] **Step 3: Implement the minimal compiler**

Compile the loaded spec into a deterministic JSON-compatible lock payload.

- [ ] **Step 4: Re-run the compiler tests**

Run: `python -m pytest tests/unit/test_compiler.py -q`
Expected: pass

### Task 4: Implement validation and safe output policies

**Files:**
- Create: `engine/devops_agent/validator.py`
- Create: `engine/devops_agent/policies.py`
- Create: `tests/unit/test_validator.py`
- Create: `tests/unit/test_policies.py`

- [ ] **Step 1: Write failing validation and policy tests**

Cover missing provider, invalid trigger combinations, undeclared write actions, and path-scope checks.

- [ ] **Step 2: Run those tests and confirm failure**

Run: `python -m pytest tests/unit/test_validator.py tests/unit/test_policies.py -q`
Expected: missing implementations

- [ ] **Step 3: Implement the validator**

Return explicit validation errors for incomplete or unsafe specs.

- [ ] **Step 4: Implement the policy layer**

Enforce read-only default, safe output declarations, and bounded write operations.

- [ ] **Step 5: Re-run the validation and policy tests**

Run: `python -m pytest tests/unit/test_validator.py tests/unit/test_policies.py -q`
Expected: pass

## Chunk 3: Provider and Runtime

### Task 5: Implement the provider interface and Gitea provider

**Files:**
- Create: `engine/devops_agent/providers/__init__.py`
- Create: `engine/devops_agent/providers/base.py`
- Create: `engine/devops_agent/providers/gitea.py`
- Create: `tests/fixtures/gitea/issue.json`
- Create: `tests/fixtures/gitea/comment_event.json`
- Create: `tests/unit/test_gitea_provider.py`

- [ ] **Step 1: Write failing provider tests**

Cover issue fetch, comment post request shaping, and trigger event parsing using fixtures.

- [ ] **Step 2: Run the provider tests and confirm failure**

Run: `python -m pytest tests/unit/test_gitea_provider.py -q`
Expected: provider implementation missing

- [ ] **Step 3: Implement the provider interface and Gitea provider**

Add a minimal provider abstraction and the first concrete implementation for Gitea.

- [ ] **Step 4: Re-run the provider tests**

Run: `python -m pytest tests/unit/test_gitea_provider.py -q`
Expected: pass

### Task 6: Implement runtime and evidence persistence

**Files:**
- Create: `engine/devops_agent/runtime.py`
- Create: `engine/devops_agent/evidence.py`
- Create: `tests/integration/test_runtime_flow.py`

- [ ] **Step 1: Write a failing runtime integration test**

Cover compile -> validate -> runtime execution -> evidence artifact output using a fake provider transport.

- [ ] **Step 2: Run the runtime integration test and confirm failure**

Run: `python -m pytest tests/integration/test_runtime_flow.py -q`
Expected: runtime implementation missing

- [ ] **Step 3: Implement the runtime and evidence writer**

Persist run artifacts and plan/evidence summaries under a deterministic output directory.

- [ ] **Step 4: Re-run the runtime integration test**

Run: `python -m pytest tests/integration/test_runtime_flow.py -q`
Expected: pass

## Chunk 4: CLI, Acceptance, and Documentation

### Task 7: Expose compile, validate, run, and acceptance commands

**Files:**
- Modify: `engine/devops_agent/cli.py`
- Create: `tests/unit/test_cli.py`

- [ ] **Step 1: Write failing CLI tests**

Cover `compile`, `validate`, and `run` command dispatch with filesystem outputs.

- [ ] **Step 2: Run the CLI tests and confirm failure**

Run: `python -m pytest tests/unit/test_cli.py -q`
Expected: placeholder CLI insufficient

- [ ] **Step 3: Implement the CLI commands**

Add command handling for `compile`, `validate`, `run`, and `acceptance`.

- [ ] **Step 4: Re-run the CLI tests**

Run: `python -m pytest tests/unit/test_cli.py -q`
Expected: pass

### Task 8: Add real Gitea acceptance and update docs

**Files:**
- Create: `tests/acceptance/test_gitea_acceptance.py`
- Modify: `README.md`
- Modify: `skills/gitea-issue-devops-agent/SKILL.md`

- [ ] **Step 1: Write acceptance tests**

Make the suite skip when `GITEA_BASE_URL`, `GITEA_REPO`, `GITEA_TOKEN`, and `GITEA_ISSUE_NUMBER` are absent, and perform real provider read/comment flow when present.

- [ ] **Step 2: Run the acceptance suite without env vars**

Run: `python -m pytest tests/acceptance/test_gitea_acceptance.py -q`
Expected: clean skip

- [ ] **Step 3: Update README and skill docs**

Document the new spec/runtime model, CLI commands, safe outputs, and Gitea acceptance procedure.

- [ ] **Step 4: Run real acceptance**

Run, with the four `GITEA_*` environment variables set: `python -m pytest tests/acceptance/test_gitea_acceptance.py -q`
Expected: pass against the configured Gitea repository

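
The skip condition from Step 1 can be isolated in a small helper so it is testable without a Gitea instance. A minimal sketch follows; the helper name `missing_acceptance_env` is hypothetical, and treating an empty value as missing (and skipping when any one variable is missing) is an assumption about the intended behavior.

```python
import os
from collections.abc import Mapping

# Variable names taken from Task 8, Step 1.
REQUIRED_ENV = ("GITEA_BASE_URL", "GITEA_REPO", "GITEA_TOKEN", "GITEA_ISSUE_NUMBER")


def missing_acceptance_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty, in declaration order."""
    return [name for name in REQUIRED_ENV if not env.get(name)]
```

In the acceptance module this could feed a module-level marker such as `pytestmark = pytest.mark.skipif(bool(missing_acceptance_env()), reason="Gitea acceptance env vars not set")`, which would produce the clean skip expected in Step 2.
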
## Chunk 5: End-to-End Verification and Delivery

### Task 9: Run the full automated test suite and review repository state

**Files:**
- Modify: repo index and generated lock artifacts as needed

- [ ] **Step 1: Run unit and integration tests**

Run: `python -m pytest tests/unit tests/integration -q`
Expected: all pass

- [ ] **Step 2: Run full suite including acceptance**

Run: `python -m pytest tests/unit tests/integration tests/acceptance -q`
Expected: all pass, with acceptance either passing or explicitly skipped when env vars are absent

- [ ] **Step 3: Compile the sample workflow and verify output**

Run: `python -m engine.devops_agent.cli compile workflows/gitea-issue-delivery.md --output workflows/gitea-issue-delivery.lock.json`
Expected: lock file generated successfully

- [ ] **Step 4: Validate the sample workflow**

Run: `python -m engine.devops_agent.cli validate workflows/gitea-issue-delivery.md`
Expected: success output and zero exit code

- [ ] **Step 5: Review diff formatting**

Run: `git diff --check`
Expected: no formatting errors
@@ -0,0 +1,268 @@
# Gitea Agentic Runtime Design

**Date:** 2026-03-13
**Status:** Approved for implementation
**Scope:** Gitea single-platform real acceptance, with provider boundaries that keep later GitHub support incremental instead of invasive.

## Goal

Upgrade `gitea-issue-devops-agent` from a documentation-heavy delivery skill into a controlled execution subsystem that can:

- parse repository-local workflow specs
- compile and validate those specs into a locked execution plan
- run a Gitea-triggered issue workflow under hard policies
- persist plan and evidence artifacts
- pass automated unit, integration, and real Gitea acceptance checks

The external workflow remains:

`Issue -> Plan -> Branch -> Draft PR -> Preview -> Test Loop -> Human Merge`

The internal workflow gains a deterministic control plane.

## Why This Design

The repository currently has strong delivery guidance, but not a runtime that enforces it. `gh-aw` demonstrates the value of turning natural-language workflow definitions into constrained executable automations. We do not want to clone the GitHub-native product shape. We do want equivalent execution discipline:

- a spec format users can author
- a compiler/validator that rejects unsafe or incomplete workflows
- a provider layer for platform APIs
- hard policy checks before write actions
- run evidence suitable for audit and acceptance

This keeps the product positioned as an enterprise AI DevOps control layer, not a GitHub Actions clone.

## In Scope

- Gitea provider
- workflow spec parsing
- compilation into lock artifacts
- validation of triggers, permissions, safe outputs, edit scope, and evidence contract
- runtime execution for selected issue flows
- safe output enforcement for Gitea writes
- evidence persistence
- CLI entrypoints for compile, validate, run, and acceptance
- automated tests
- one real Gitea acceptance path

## Out of Scope

- GitHub provider implementation
- autonomous queue-wide issue fixing
- automatic merge
- hosted webhook service
- UI control plane
- inference cost dashboards

## Architecture

### 1. Workflow Spec

Workflow specs live in-repo and use frontmatter plus Markdown body:

- frontmatter defines triggers, provider, permissions, safe outputs, required evidence, and policy defaults
- body captures workflow intent, operator notes, and execution hints

The spec is the source of truth. Runtime behavior never depends on free-form Markdown alone.

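
Splitting frontmatter from the body can be sketched with the standard library alone. The example below assumes `---` delimiters and flat `key: value` pairs, neither of which the design specifies; nested fields such as permissions or safe outputs would need a real YAML parser. The `WorkflowSpec` and `load_spec` names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowSpec:
    """Parsed spec: machine-readable frontmatter plus the free-form Markdown body."""
    frontmatter: dict[str, str]
    body: str


def load_spec(text: str) -> WorkflowSpec:
    """Split a `---`-delimited frontmatter block from the Markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("spec must start with a '---' frontmatter delimiter")
    try:
        # Index of the closing delimiter, searching from the second line onward.
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter block is not closed") from None

    frontmatter: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        frontmatter[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return WorkflowSpec(frontmatter=frontmatter, body=body)
```

Keeping the body as an opaque string reflects the rule above: only the frontmatter drives behavior, while the Markdown is carried along as instructions.
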
### 2. Compiler

The compiler loads a workflow spec and emits a lock artifact:

- normalizes defaults
- expands trigger shorthand into explicit configuration
- resolves safe output declarations into executable policy objects
- records required evidence and path scope

The lock artifact is immutable input to runtime. It is designed as JSON in this iteration because JSON is easy to diff, assert in tests, and consume from Python.

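
A rough sketch of the compile step, assuming the spec has already been parsed into a dict. The shorthand form (a bare event name such as `issue_comment`), the `read_only` default, and the `version` value are assumptions for illustration; the lock field names come from the Data Model section, with `plan_defaults` and `instructions` omitted for brevity. `compiled_at` is passed in rather than read from the clock so the output stays reproducible in tests.

```python
import json

DEFAULT_POLICY = {"read_only": True}  # assumed default, mirroring "read-only by default"


def expand_triggers(on: object) -> list[dict]:
    """Expand shorthand (a string or list of strings) into explicit trigger objects."""
    if isinstance(on, str):
        on = [on]
    return [{"event": item} if isinstance(item, str) else dict(item) for item in on]


def compile_lock(spec: dict, source: str, compiled_at: str) -> str:
    """Return the lock artifact as a canonical JSON string."""
    lock = {
        "version": 1,
        "compiled_at": compiled_at,
        "source": source,
        "provider": spec["provider"],
        "triggers": expand_triggers(spec["on"]),
        "policy": {**DEFAULT_POLICY, **spec.get("policy", {})},
        "safe_outputs": spec.get("safe_outputs", {}),
        "required_evidence": sorted(spec.get("evidence", [])),
    }
    # sort_keys and a fixed indent keep the output byte-stable for diffs and tests.
    return json.dumps(lock, sort_keys=True, indent=2) + "\n"
```

Because keys are sorted on output, two specs that differ only in field order compile to identical lock text.
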
### 3. Validator

Validation rejects unsafe or incomplete specs before runtime:

- unsupported trigger combinations
- missing provider
- unsafe write permissions
- missing safe outputs for write behaviors
- invalid path scope syntax
- missing evidence requirements

This is the first hard boundary between intent and execution.

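
Several of the checks above could look like the following sketch, which collects every problem instead of stopping at the first. The `permissions` and `path_scope` shapes, the permission level `"write"`, and the error wording are assumptions; trigger-combination rules are reduced here to "at least one trigger".

```python
SUPPORTED_PROVIDERS = {"gitea"}


def validate_lock(lock: dict) -> list[str]:
    """Return a list of human-readable validation errors (empty when valid)."""
    errors: list[str] = []

    provider = lock.get("provider")
    if not provider:
        errors.append("missing provider")
    elif provider not in SUPPORTED_PROVIDERS:
        errors.append(f"unsupported provider: {provider}")

    if not lock.get("triggers"):
        errors.append("at least one trigger is required")

    # Write permissions are acceptable only when safe outputs are declared.
    wants_write = any(level == "write" for level in lock.get("permissions", {}).values())
    if wants_write and not lock.get("safe_outputs"):
        errors.append("write permissions declared without safe outputs")

    # Path scopes must be repository-relative and must not climb out of the repo.
    for pattern in lock.get("path_scope", []):
        if pattern.startswith("/") or ".." in pattern.split("/"):
            errors.append(f"invalid path scope: {pattern}")

    if not lock.get("required_evidence"):
        errors.append("missing evidence requirements")
    return errors
```

Returning a list rather than raising on the first failure lets the `validate` command print all messages in one pass.
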
### 4. Runtime

Runtime consumes:

- a compiled lock artifact
- an event payload
- provider configuration and credentials

Runtime responsibilities:

- load and sanitize issue-trigger context
- initialize or update the persisted plan state
- enforce policy gates before any write action
- call provider APIs through a narrow interface
- collect evidence and write run artifacts

The runtime is not a general autonomous coding engine in this iteration. It is a control and orchestration layer for issue delivery actions.

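
The responsibilities above can be illustrated with an in-memory provider, similar in spirit to the fake transport used by the integration test. Everything named here (`FakeProvider`, `run_workflow`, the `add_comment` safe-output key, the `issue_number` event field, the comment text) is hypothetical; only the returned field names follow the Run Artifact data model.

```python
class FakeProvider:
    """In-memory stand-in for a real provider; records comments instead of sending them."""

    def __init__(self, issues: dict[int, dict]) -> None:
        self.issues = issues
        self.comments: list[tuple[int, str]] = []

    def get_issue(self, number: int) -> dict:
        return self.issues[number]

    def post_comment(self, number: int, body: str) -> None:
        self.comments.append((number, body))


def run_workflow(lock: dict, event: dict, provider) -> dict:
    """Handle one issue event and return a run artifact as a plain dict."""
    number = int(event["issue_number"])
    issue = provider.get_issue(number)
    operations = [{"op": "get_issue", "issue": number}]

    # Policy gate: the write happens only if the lock declares it as a safe output.
    if "add_comment" in lock.get("safe_outputs", {}):
        provider.post_comment(number, f"Plan initialized for: {issue['title']}")
        operations.append({"op": "add_comment", "issue": number})
        result = "commented"
    else:
        result = "read_only"

    return {
        "workflow_name": lock.get("name", "unnamed"),
        "provider": lock["provider"],
        "event": event,
        "plan_state": {"issue": number, "status": "planned"},
        "operations": operations,
        "result": result,
    }
```

Recording each provider call in `operations` gives the evidence layer an audit trail without the runtime knowing how artifacts are stored.
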
### 5. Provider Layer

Provider interfaces isolate platform behavior from workflow logic.

The first provider is `GiteaProvider`, which supports:

- repository metadata reads
- issue reads
- issue comments
- branch hint extraction inputs
- draft PR preparation primitives

The abstraction must make adding a `GitHubProvider` later additive rather than structural.

### 6. Policy and Safe Outputs

The policy layer turns delivery rules into code:

- read-only by default
- no merge action
- comment/create/update operations only if declared in safe outputs
- path-scope enforcement for file writes
- evidence-required status promotion
- bounded output counts where relevant

This is the key product maturity improvement over pure skill text.

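
Three of these rules (no merge, declared-only writes with bounded counts, and path scope) could be enforced along the lines of this sketch. The class and exception names, the `max` key, and the use of `fnmatch` glob patterns for path scope are assumptions; evidence-required status promotion is not shown.

```python
import posixpath
from fnmatch import fnmatch


class PolicyViolation(Exception):
    """Raised when an action falls outside what the lock artifact allows."""


class SafeOutputPolicy:
    def __init__(self, safe_outputs: dict[str, dict], path_scope: list[str]) -> None:
        self.safe_outputs = safe_outputs
        self.path_scope = path_scope
        self.counts: dict[str, int] = {}

    def authorize(self, action: str) -> None:
        """Allow a write action only if declared and still under its limit."""
        if action == "merge":
            raise PolicyViolation("merge is never allowed")
        if action not in self.safe_outputs:
            raise PolicyViolation(f"undeclared write action: {action}")
        limit = self.safe_outputs[action].get("max")
        used = self.counts.get(action, 0)
        if limit is not None and used >= limit:
            raise PolicyViolation(f"{action} exceeded its limit of {limit}")
        self.counts[action] = used + 1

    def check_path(self, path: str) -> None:
        """Reject file writes outside the declared scope."""
        # Normalize first so 'docs/../engine/x.py' cannot sneak past a 'docs/*' scope.
        normalized = posixpath.normpath(path)
        # Note: fnmatch's '*' also matches '/', so 'docs/*' covers nested files.
        if not any(fnmatch(normalized, pattern) for pattern in self.path_scope):
            raise PolicyViolation(f"path outside declared scope: {path}")
```

With an empty `safe_outputs` mapping every call to `authorize` fails, which is one way to realize the read-only default.
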
### 7. Evidence Layer

Every run produces durable artifacts:

- resolved plan state
- execution summary
- provider operations executed
- evidence bundle for commit/PR/test/preview placeholders
- acceptance result metadata

Evidence is stored locally under a deterministic run directory so tests and operators can inspect it.

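
A deterministic run directory can be derived purely from fields already in the run artifact. The layout `<root>/<workflow_name>/<run_id>/run.json` and the function name below are assumptions for illustration, not the layout the implementation uses.

```python
import json
from pathlib import Path


def write_run_artifact(root: Path, run: dict) -> Path:
    """Write one run artifact as JSON and return the path that was written."""
    # The directory depends only on the artifact's own fields, so it is predictable.
    run_dir = root / run["workflow_name"] / run["run_id"]
    run_dir.mkdir(parents=True, exist_ok=True)
    target = run_dir / "run.json"
    target.write_text(json.dumps(run, sort_keys=True, indent=2) + "\n", encoding="utf-8")
    return target
```

Tests can point `root` at a temporary directory and read the file back, while operators get a path they can compute from the run ID alone.
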
## File Structure

New code will be added under:

- `pyproject.toml`
- `engine/devops_agent/__init__.py`
- `engine/devops_agent/spec.py`
- `engine/devops_agent/compiler.py`
- `engine/devops_agent/validator.py`
- `engine/devops_agent/runtime.py`
- `engine/devops_agent/policies.py`
- `engine/devops_agent/evidence.py`
- `engine/devops_agent/cli.py`
- `engine/devops_agent/providers/__init__.py`
- `engine/devops_agent/providers/base.py`
- `engine/devops_agent/providers/gitea.py`
- `workflows/gitea-issue-delivery.md`
- `tests/unit/...`
- `tests/integration/...`
- `tests/fixtures/gitea/...`
- `tests/acceptance/test_gitea_acceptance.py`

Existing scripts will remain, but runtime may call them or share logic conceptually:

- `skills/gitea-issue-devops-agent/scripts/issue_audit.py`
- `skills/gitea-issue-devops-agent/scripts/change_scope.py`
- `skills/gitea-issue-devops-agent/scripts/preview_slot_allocator.py`

## Data Model

### Workflow Spec

Core fields:

- `name`
- `provider`
- `on`
- `permissions`
- `safe_outputs`
- `plan`
- `policy`
- `evidence`
- `body`

### Lock Artifact

Core fields:

- `version`
- `compiled_at`
- `source`
- `provider`
- `triggers`
- `policy`
- `safe_outputs`
- `required_evidence`
- `plan_defaults`
- `instructions`

### Run Artifact

Core fields:

- `run_id`
- `workflow_name`
- `provider`
- `event`
- `plan_state`
- `operations`
- `evidence`
- `result`

## Test Strategy

### Unit Tests

Verify:

- frontmatter parsing
- default normalization
- validator failures and successes
- policy enforcement
- provider request shaping

### Integration Tests

Verify:

- spec to lock compilation
- lock to runtime execution
- runtime interaction with a fake Gitea transport
- evidence persistence

### Acceptance Tests

Verify against a real Gitea repo using env vars:

- workflow compile and validate pass
- runtime can load a selected issue
- runtime can publish an evidence comment
- acceptance artifacts are produced locally

If env vars are absent, the acceptance suite must skip cleanly instead of failing misleadingly.

## Acceptance Criteria

This design is complete only when:

1. A repo-local Gitea workflow spec compiles into a lock artifact.
2. Invalid specs fail validation with clear messages.
3. Runtime enforces safe outputs and refuses undeclared writes.
4. Real Gitea acceptance can read an issue and publish an evidence comment.
5. Automated unit and integration tests pass.
6. README and skill docs describe the new execution model and CLI usage.

## Rollout Notes

- Gitea is the first real provider.
- GitHub support is intentionally deferred, but the provider interface must be stable enough to add later.
- `jj` remains an internal reliability layer. This subsystem must not require `jj` for external usage.