Operator essentials¶
This is the day-to-day SDETKit runbook. Keep it small: prove the release decision first, investigate failures second, and only consider guarded remediation after the evidence is attached.
If a team is new to SDETKit, start here and expand only after this lane is deterministic both locally and in CI.
Safety baseline¶
Investigation, reporting, recommendation, and planning paths are diagnostic-only by default. They recommend proof commands and next actions; they do not approve mutation.
Use this rule for every lane on this page:
prove first -> diagnose from artifacts -> remediate only through explicit guarded policy
Day 0 — First run and artifact handoff¶
Run the canonical release-confidence path first:
python -m sdetkit gate fast --format json --stable-json --out build/gate-fast.json
python -m sdetkit gate release --format json --out build/release-preflight.json
python -m sdetkit doctor --format json --out build/doctor.json
Expected first artifacts:
- build/gate-fast.json
- build/release-preflight.json
- build/doctor.json
Read ok and failed_steps before raw logs. For artifact meanings, use Artifact reference and generated sample map.
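As a minimal sketch of reading ok and failed_steps before any raw log, the helper below loads each artifact and returns only those two fields. It assumes both are top-level JSON keys and that failed_steps is a list; the function names summarize_artifact and summarize_all are hypothetical and are not part of SDETKit.

```python
import json
from pathlib import Path


def summarize_artifact(path):
    """Return (ok, failed_steps) from one JSON artifact.

    Assumption: `ok` and `failed_steps` are top-level keys. A missing
    `ok` is read conservatively as False.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    ok = bool(data.get("ok", False))
    failed_steps = list(data.get("failed_steps") or [])
    return ok, failed_steps


def summarize_all(paths):
    """Map each path to its (ok, failed_steps) pair, or None if the file is absent."""
    summary = {}
    for path in paths:
        if Path(path).is_file():
            summary[str(path)] = summarize_artifact(path)
        else:
            summary[str(path)] = None
    return summary
```

Calling summarize_all with the three build/ paths listed above gives one line of evidence per artifact, and a None entry flags an artifact that was never written.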
Day 1 — Failed CI or PR check triage¶
When a log, check, or gate fails, collect evidence without mutating the repository:
python -m sdetkit review . --no-workspace --format operator-json
python -m sdetkit investigate failure --log build/quality.log --format markdown
python -m sdetkit investigate failure --log build/quality.log --format json --out build/investigation/failure.json
# One-shot handoff bundle when the operator needs diagnosis, comment, learning, safe-fix boundary, and brief artifacts together.
python -m sdetkit adaptive failure-bundle \
--log build/quality.log \
--out-dir build/sdetkit/failure-intelligence \
--proof-failed
Read these fields first in investigation output:
- classification
- summary
- next_actions
- proof_commands
- diagnostic_only
- automation_allowed
- requires_human_review
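The first-read fields can be pulled out of a parsed investigation JSON with a few lines of Python. This is a sketch under assumptions: the seven fields are taken to be top-level keys, the last three are taken to be booleans, and first_read and is_diagnostic_only are hypothetical helper names.

```python
FIRST_FIELDS = (
    "classification",
    "summary",
    "next_actions",
    "proof_commands",
    "diagnostic_only",
    "automation_allowed",
    "requires_human_review",
)


def first_read(investigation):
    """Return the first-read fields in reading order, None where a field is absent."""
    return {name: investigation.get(name) for name in FIRST_FIELDS}


def is_diagnostic_only(investigation):
    """Conservative reading of the three safety flags.

    The output counts as diagnostic-only unless all three flags
    explicitly say otherwise; a missing flag never unlocks automation.
    """
    return (
        investigation.get("diagnostic_only", True) is not False
        or investigation.get("automation_allowed", False) is not True
        or investigation.get("requires_human_review", True) is not False
    )
```

The conservative default matches the safety baseline: an incomplete or unexpected payload is treated as evidence only.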
If the owner is unclear, narrow the repository surface:
python -m sdetkit investigate repo --root . --format json --out build/investigation/repo.json
python -m sdetkit investigate surface --root . --surface <surface> --format markdown
See Investigation operator guide for the complete diagnostic-only flow.
For bounded local processing of already-created diagnostic jobs, use the Local diagnostic queue operator guide. This path is local, reporting-only, explicitly bounded by --max-jobs, and stops after the first failed job without retrying it.
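The bounded, stop-on-first-failure behavior described above can be illustrated with a short loop. This only illustrates the semantics and is not SDETKit's implementation: run_bounded, run_job, and the job values are hypothetical, and max_jobs stands in for the --max-jobs bound.

```python
from itertools import islice


def run_bounded(jobs, run_job, max_jobs):
    """Run at most max_jobs jobs in order and stop at the first failure.

    A failed job is recorded once and never retried; jobs after it are
    left untouched.
    """
    if max_jobs < 0:
        raise ValueError("max_jobs must be non-negative")
    results = []
    for job in islice(jobs, max_jobs):
        ok = bool(run_job(job))
        results.append((job, ok))
        if not ok:
            break  # no retry, no further jobs
    return results
```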
Day 2 — Maintenance/autopilot artifact review¶
For maintenance-autopilot runs, start with the uploaded artifact bundle rather than individual logs:
- Open build/maintenance/autopilot/autopilot-report.md for the run summary.
- Open build/maintenance/autopilot/adaptive-diagnosis.md for the failure explanation and proof commands.
- Open build/maintenance/autopilot/safe-fix-plan.json only as audit evidence.
- Check .sdetkit/maintenance/failure-memory.jsonl and .sdetkit/maintenance/adaptive-safe-fix-memory.jsonl for recurring patterns.
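One read-only way to look for recurring patterns in a JSONL memory file is to count how often a given key's value repeats. The record schema is not documented on this page, so the classification key used as the default below is an assumption, and recurring is a hypothetical helper name.

```python
import json
from collections import Counter
from pathlib import Path


def recurring(path, key="classification", min_count=2):
    """Return (value, count) pairs for values of `key` seen at least min_count times.

    Assumption: each non-empty line is one JSON object. Malformed lines
    and records without `key` are skipped so the review never fails on them.
    """
    counts = Counter()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and key in record:
            counts[str(record[key])] += 1
    return [(value, n) for value, n in counts.most_common() if n >= min_count]
```

The helper only reads the file, which keeps it inside the diagnostic-only boundary.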
A safe-fix plan is not permission to apply a fix. Treat candidate, probation, policy proposal, dry-run, and guardrail outputs as evidence until a reviewed policy path explicitly authorizes the next step.
Day 3 — Guarded remediation review¶
Use remediation docs only after the diagnostic artifacts identify a specific failure class:
- Remediation cookbook for first-failure playbooks.
- Premium quality gate for guarded quality-gate remediation posture.
- PR automation for audit auto-fixes for explicit opt-in PR-fix behavior.
Before any mutation, confirm all of the following are true:
- The branch is not main.
- The policy path explicitly allows the guarded lane.
- The generated plan and diff are attached to the PR or workflow run.
- The proof command from the investigation output has been rerun on the reviewed branch.
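The checklist above can be expressed as a single check that reports every unmet condition instead of stopping at the first one. This is a sketch: unmet_preconditions and its parameter names are hypothetical, and the caller is assumed to supply each value from its own evidence (for example, the current branch name from git).

```python
def unmet_preconditions(
    *,
    branch,
    policy_allows_guarded_lane,
    plan_and_diff_attached,
    proof_rerun_on_reviewed_branch,
):
    """Return the list of checklist items that are not satisfied.

    An empty list means all four conditions hold. It does not grant
    approval by itself; it only mirrors the checklist.
    """
    unmet = []
    if branch == "main":
        unmet.append("branch is main")
    if not policy_allows_guarded_lane:
        unmet.append("policy path does not allow the guarded lane")
    if not plan_and_diff_attached:
        unmet.append("plan and diff are not attached")
    if not proof_rerun_on_reviewed_branch:
        unmet.append("proof command has not been rerun on the reviewed branch")
    return unmet
```

Returning the full list makes it easier to attach the gap analysis to the PR or workflow run as a single artifact.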
Rollout and CI contract commands (secondary)¶
These commands are kept here for rollout contract visibility, not as the first-time operator path:
- python scripts/validate_enterprise_contracts.py
- python scripts/check_primary_docs_map.py
- make operations-baseline
- make operations-status
- make operations-next-action
- make operations-complete
- make release-readiness-start
- make release-readiness-workflow
- make release-readiness-status
- make release-readiness-start-contract
- make release-readiness-seed
- make release-readiness-complete
- make release-readiness-progress
- make release-readiness-surface-clarity
- make quality-contract-check
- make governance-contract-check
- make ecosystem-contract-check
- make scale-readiness-start
- make scale-readiness-status
- make scale-readiness-progress
- make scale-readiness-complete
- make metrics-contract-check
Expansion trigger rules¶
Expand beyond this page only when all of the following are true:
- Day-0 commands are deterministic both locally and in CI.
- Release artifacts are reviewed before raw logs.
- Blockers are triaged from machine-readable fields (ok, failed_steps, diagnostic_only, automation_allowed) first.
Next-step expansion map¶
After operator essentials is stable, expand in this order:
- Investigation and diagnosis: Investigation operator guide -> Adaptive Diagnosis Intelligence -> Remediation cookbook.
- Artifact interpretation: Artifact reference and generated sample map -> CI artifact walkthrough.
- Quality gates: Premium quality gate -> Security gate -> Determinism checklist.
- Advanced inspection lanes (inspect, inspect-compare, inspect-project) only when needed.
- Migration/legacy compatibility lanes only when required.