tech-debt-analyzer
Systematic tech debt inventory with complexity analysis, dead code detection, and remediation planning. Track debt over time. NOT for code review (honest-review) or refactoring.
Systematic tech debt inventory with complexity analysis, dead code detection, and remediation planning. Track debt over time. NOT for code review (honest-review) or refactoring.
Quick Start
Install:
npx skills add github:wyattowalsh/agents --skill tech-debt-analyzer -y -g --agent antigravity --agent claude-code --agent codex --agent crush --agent cursor --agent gemini-cli --agent github-copilot --agent grok --agent opencode Use: /tech-debt-analyzer <mode> [path]
Works with Claude Code, Gemini CLI, OpenCode, and other agentskills.io-compatible agents.
What It Does
Section titled “What It Does”Systematic technical debt inventory, prioritization, and remediation planning. Multi-pass analysis with confidence scoring and evidence-based findings.
| $ARGUMENTS | Mode |
|---|---|
scan or scan <path> | Full codebase debt inventory (or scoped to path) |
analyze <file/dir> | Targeted deep analysis of specific file or directory |
prioritize | Rank all debt items by risk x effort matrix |
roadmap | Generate phased remediation plan |
report | Render dashboard visualization |
track | Compare current scan against previous baseline |
| Empty | Show mode menu with descriptions and examples |
Critical Rules
Section titled “Critical Rules”- Run all 4 analysis scripts before presenting findings — partial scans are labeled as such
- Every finding must have a confidence score backed by evidence (script output or codebase grep)
- Confidence < 0.3 = discard; 0.3-0.7 = flag as uncertain; >= 0.7 = report
- Never execute remediation — this skill inventories and plans, not fixes
- Store baseline after every full scan — longitudinal tracking depends on it
- Do not report style preferences as debt — only structural, behavioral, or maintainability issues
- Deduplicate across script outputs — same file:line should not appear twice
- Prioritize mode requires a prior scan — prompt user to run scan first if no baseline exists
- Always present the debt score (aggregate metric) in scan output
- Track mode must show delta (new/resolved/changed) — raw numbers without comparison are useless
- Dead code detection requires high confidence (>= 0.8) — false positives erode trust
- Load ONE reference file at a time — do not preload all references
Canonical Vocabulary
Section titled “Canonical Vocabulary”| Term | Definition |
|---|---|
| debt item | A discrete tech debt finding with category, severity, confidence, and evidence |
| category | Debt classification: design, test, documentation, dependency, infrastructure |
| severity | Impact level: CRITICAL, HIGH, MEDIUM, LOW |
| confidence | Score 0.0-1.0 per item; >=0.7 report, 0.3-0.7 flag, <0.3 discard |
| complexity | Cyclomatic (decision paths) or cognitive (human comprehension difficulty) |
| dead code | Functions, classes, or imports with no references in the codebase |
| staleness | Days since a dependency’s current version was superseded |
| inconsistency | Same pattern implemented differently across files |
| remediation | Specific fix action with effort estimate and risk level |
| debt score | Aggregate metric: sum of (severity_weight x confidence) across all items |
| baseline | Previous scan stored at ~/.{gemini |
| heatmap | Visual density of debt items per file or directory |
| risk x effort | Prioritization matrix: impact vs. remediation cost |
Mode: Scan
Section titled “Mode: Scan”Full codebase debt inventory. Run all 4 analysis scripts, aggregate results, assign categories and severities.
Scan Step 1: Project Profile
Section titled “Scan Step 1: Project Profile”Run uv run python skills/tech-debt-analyzer/scripts/complexity-scanner.py <path> to get complexity metrics.
Parse JSON output. Flag functions with cyclomatic_complexity > 10 as HIGH, > 5 as MEDIUM.
Scan Step 2: Dead Code Detection
Section titled “Scan Step 2: Dead Code Detection”Run uv run python skills/tech-debt-analyzer/scripts/dead-code-detector.py <path> to find unused code.
Parse JSON output. Each unused item becomes a debt item (category: design, severity by confidence).
Scan Step 3: Dependency Staleness
Section titled “Scan Step 3: Dependency Staleness”Run uv run python skills/tech-debt-analyzer/scripts/dependency-staleness-checker.py <path> to check outdated packages.
Parse JSON output. Deprecated packages are CRITICAL. Staleness > 365 days is HIGH.
Scan Step 4: Pattern Consistency
Section titled “Scan Step 4: Pattern Consistency”Run uv run python skills/tech-debt-analyzer/scripts/pattern-consistency-checker.py <path> to detect inconsistencies.
Parse JSON output. Each inconsistency becomes a debt item (category: design).
Scan Step 5: AI-Augmented Analysis
Section titled “Scan Step 5: AI-Augmented Analysis”After script-based detection, perform additional analysis:
- Documentation gaps — scan for undocumented public APIs, missing README sections, stale comments
- Test coverage gaps — Grep for untested modules, missing edge cases, test-to-code ratio
- Infrastructure debt — outdated CI configs, missing linting, inconsistent tooling
- Design smells — God classes, feature envy, shotgun surgery patterns
Assign confidence scores (0.0-1.0) per finding. Research-validate HIGH/CRITICAL items using Grep and codebase evidence.
Scan Step 6: Aggregate and Classify
Section titled “Scan Step 6: Aggregate and Classify”Merge all findings into a unified inventory:
- Deduplicate across script outputs
- Assign categories from debt taxonomy (references/debt-taxonomy.md)
- Calculate debt score: sum of (severity_weight x confidence)
- Store baseline at
~/.{gemini\|copilot\|codex\|claude}/tech-debt/<project-slug>-<date>.json
Present findings grouped by category, sorted by severity within each group.
Mode: Analyze
Section titled “Mode: Analyze”Targeted deep analysis of a specific file or directory. Run all 4 scripts scoped to the target. Apply the same 6-step scan process but with deeper per-function analysis. Include: function-level complexity breakdown, inline dead code, local pattern violations.
Mode: Prioritize
Section titled “Mode: Prioritize”Rank debt items using risk x effort matrix. Load references/prioritization-framework.md.
| Low Effort | Medium Effort | High Effort | |
|---|---|---|---|
| High Risk | P0: Fix immediately | P1: Schedule next sprint | P2: Plan for next quarter |
| Medium Risk | P1: Schedule next sprint | P2: Plan for next quarter | P3: Backlog |
| Low Risk | P2: Quick wins batch | P3: Backlog | P4: Accept or defer |
For each debt item, estimate:
- Risk: blast radius x severity x confidence
- Effort: LOC affected x complexity x dependency count
Output a ranked list with priority labels (P0-P4).
Mode: Roadmap
Section titled “Mode: Roadmap”Generate a phased remediation plan. Requires a prior scan (reads baseline from ~/.{gemini\|copilot\|codex\|claude}/tech-debt/).
Phase structure:
- Quick Wins (P0 + low-effort P1): immediate fixes, minimal risk
- Structural (remaining P1 + high-risk P2): design improvements, refactoring
- Maintenance (P2 + P3): documentation, test coverage, dependency updates
- Strategic (P3 + P4): architecture changes, long-term improvements
Each phase includes: items, estimated effort, dependencies, success criteria.
Mode: Report
Section titled “Mode: Report”Render dashboard visualization. Requires a prior scan.
- Read the most recent baseline from
~/.{gemini\|copilot\|codex\|claude}/tech-debt/ - Copy
templates/dashboard.htmlto a temporary file - Inject findings JSON into the
<script id="data">tag - Open in browser or report the path
Dashboard sections: category pie chart, complexity heatmap, trend chart (if multiple baselines), prioritized backlog table.
Mode: Track
Section titled “Mode: Track”Compare current scan against previous baseline for longitudinal tracking.
- Run a fresh scan (Mode: Scan steps 1-6)
- Load previous baseline from
~/.{gemini\|copilot\|codex\|claude}/tech-debt/<project-slug>-*.json(most recent) - Compute delta: new items, resolved items, changed severities, score trend
- Present comparison report with trend indicators
State Management
Section titled “State Management”- State directory:
~/.{gemini\|copilot\|codex\|claude}/tech-debt/ - Create directory on first use with
mkdir -p - Filename:
<project-slug>-<YYYY-MM-DD>.json - Project slug: sanitized basename of the project root directory
- Store after every scan; track mode reads historical baselines
- Schema:
{ "project": str, "date": str, "score": float, "items": [...], "summary": {...} }
Reference Files
Section titled “Reference Files”Load ONE reference at a time. Do not preload all references into context.
| File | Content | Read When |
|---|---|---|
references/debt-taxonomy.md | 5 debt categories with subcategories and remediation templates | Scan Step 6, classifying findings |
references/complexity-metrics.md | Cyclomatic and cognitive complexity definitions, thresholds, interpretation | Interpreting complexity-scanner output |
references/prioritization-framework.md | Risk x effort matrix, scoring rubric, priority definitions | Prioritize mode |
references/remediation-templates.md | Fix patterns by issue type, effort estimates, risk ratings | Roadmap mode, generating fix plans |
| Script | When to Run |
|---|---|
scripts/complexity-scanner.py | Scan Steps 1, Analyze mode |
scripts/dead-code-detector.py | Scan Step 2, Analyze mode |
scripts/dependency-staleness-checker.py | Scan Step 3 |
scripts/pattern-consistency-checker.py | Scan Step 4, Analyze mode |
| Template | When to Render |
|---|---|
templates/dashboard.html | Report mode — inject findings JSON into data tag |
Debt Item Structure
Section titled “Debt Item Structure”Every debt item follows this format:
- Location:
[file:line]or[file:start-end]— exact source location - Category: design | test | documentation | dependency | infrastructure
- Severity: CRITICAL | HIGH | MEDIUM | LOW
- Confidence: 0.0-1.0 score with evidence basis
- Description: What the debt is (1-2 sentences)
- Impact: Why it matters (blast radius, risk)
- Remediation: Recommended fix approach with effort estimate
| Field | Value |
|---|---|
| Source Type | repo-owned |
| Display Source | github:wyattowalsh/agents |
| Source Kind | repo |
| Installability | portable command |
| Review State | reviewed |
| Target Agents | antigravity, claude-code, codex, crush, cursor, gemini-cli, github-copilot, grok, opencode |
| Field | Value |
|---|---|
| Name | tech-debt-analyzer |
| License | MIT |
| Version | 1.0.0 |
| Author | wyattowalsh |
| Field | Value |
|---|---|
| Model | opus |
| Argument Hint | [mode] [path] |
View Full SKILL.md
---name: tech-debt-analyzerdescription: >- Systematic tech debt inventory with complexity analysis, dead code detection, and remediation planning. Track debt over time. NOT for code review (honest-review) or refactoring.argument-hint: "<mode> [path]"model: opuslicense: MITmetadata: author: wyattowalsh version: "1.0.0"---
# Tech Debt Analyzer
Systematic technical debt inventory, prioritization, and remediation planning.Multi-pass analysis with confidence scoring and evidence-based findings.
**Scope:** Debt inventory and tracking only. NOT for code review (honest-review), refactoring execution, or dependency updates.
## Canonical Vocabulary
| Term | Definition ||------|------------|| **debt item** | A discrete tech debt finding with category, severity, confidence, and evidence || **category** | Debt classification: design, test, documentation, dependency, infrastructure || **severity** | Impact level: CRITICAL, HIGH, MEDIUM, LOW || **confidence** | Score 0.0-1.0 per item; >=0.7 report, 0.3-0.7 flag, <0.3 discard || **complexity** | Cyclomatic (decision paths) or cognitive (human comprehension difficulty) || **dead code** | Functions, classes, or imports with no references in the codebase || **staleness** | Days since a dependency's current version was superseded || **inconsistency** | Same pattern implemented differently across files || **remediation** | Specific fix action with effort estimate and risk level || **debt score** | Aggregate metric: sum of (severity_weight x confidence) across all items || **baseline** | Previous scan stored at ~/.{gemini|copilot|codex|claude}/tech-debt/ for longitudinal comparison || **heatmap** | Visual density of debt items per file or directory || **risk x effort** | Prioritization matrix: impact vs. remediation cost |
## Dispatch
| $ARGUMENTS | Mode ||------------|------|| `scan` or `scan <path>` | Full codebase debt inventory (or scoped to path) || `analyze <file/dir>` | Targeted deep analysis of specific file or directory || `prioritize` | Rank all debt items by risk x effort matrix || `roadmap` | Generate phased remediation plan || `report` | Render dashboard visualization || `track` | Compare current scan against previous baseline || Empty | Show mode menu with descriptions and examples |
## Mode: Scan
Full codebase debt inventory. Run all 4 analysis scripts, aggregate results, assign categories and severities.
### Scan Step 1: Project Profile
Run `uv run python skills/tech-debt-analyzer/scripts/complexity-scanner.py <path>` to get complexity metrics.Parse JSON output. Flag functions with cyclomatic_complexity > 10 as HIGH, > 5 as MEDIUM.
### Scan Step 2: Dead Code Detection
Run `uv run python skills/tech-debt-analyzer/scripts/dead-code-detector.py <path>` to find unused code.Parse JSON output. Each unused item becomes a debt item (category: design, severity by confidence).
### Scan Step 3: Dependency Staleness
Run `uv run python skills/tech-debt-analyzer/scripts/dependency-staleness-checker.py <path>` to check outdated packages.Parse JSON output. Deprecated packages are CRITICAL. Staleness > 365 days is HIGH.
### Scan Step 4: Pattern Consistency
Run `uv run python skills/tech-debt-analyzer/scripts/pattern-consistency-checker.py <path>` to detect inconsistencies.Parse JSON output. Each inconsistency becomes a debt item (category: design).
### Scan Step 5: AI-Augmented Analysis
After script-based detection, perform additional analysis:
1. **Documentation gaps** — scan for undocumented public APIs, missing README sections, stale comments2. **Test coverage gaps** — Grep for untested modules, missing edge cases, test-to-code ratio3. **Infrastructure debt** — outdated CI configs, missing linting, inconsistent tooling4. **Design smells** — God classes, feature envy, shotgun surgery patterns
Assign confidence scores (0.0-1.0) per finding. Research-validate HIGH/CRITICAL items using Grep and codebase evidence.
### Scan Step 6: Aggregate and Classify
Merge all findings into a unified inventory:- Deduplicate across script outputs- Assign categories from debt taxonomy (references/debt-taxonomy.md)- Calculate debt score: sum of (severity_weight x confidence)- Store baseline at `~/.{gemini|copilot|codex|claude}/tech-debt/<project-slug>-<date>.json`
Present findings grouped by category, sorted by severity within each group.
## Mode: Analyze
Targeted deep analysis of a specific file or directory. Run all 4 scripts scoped to the target.Apply the same 6-step scan process but with deeper per-function analysis.Include: function-level complexity breakdown, inline dead code, local pattern violations.
## Mode: Prioritize
Rank debt items using risk x effort matrix. Load `references/prioritization-framework.md`.
| | Low Effort | Medium Effort | High Effort ||---|---|---|---|| **High Risk** | P0: Fix immediately | P1: Schedule next sprint | P2: Plan for next quarter || **Medium Risk** | P1: Schedule next sprint | P2: Plan for next quarter | P3: Backlog || **Low Risk** | P2: Quick wins batch | P3: Backlog | P4: Accept or defer |
For each debt item, estimate:- **Risk**: blast radius x severity x confidence- **Effort**: LOC affected x complexity x dependency count
Output a ranked list with priority labels (P0-P4).
## Mode: Roadmap
Generate a phased remediation plan. Requires a prior scan (reads baseline from `~/.{gemini|copilot|codex|claude}/tech-debt/`).
**Phase structure:**1. **Quick Wins** (P0 + low-effort P1): immediate fixes, minimal risk2. **Structural** (remaining P1 + high-risk P2): design improvements, refactoring3. **Maintenance** (P2 + P3): documentation, test coverage, dependency updates4. **Strategic** (P3 + P4): architecture changes, long-term improvements
Each phase includes: items, estimated effort, dependencies, success criteria.
## Mode: Report
Render dashboard visualization. Requires a prior scan.
1. Read the most recent baseline from `~/.{gemini|copilot|codex|claude}/tech-debt/`2. Copy `templates/dashboard.html` to a temporary file3. Inject findings JSON into the `<script id="data">` tag4. Open in browser or report the path
Dashboard sections: category pie chart, complexity heatmap, trend chart (if multiple baselines), prioritized backlog table.
## Mode: Track
Compare current scan against previous baseline for longitudinal tracking.
1. Run a fresh scan (Mode: Scan steps 1-6)2. Load previous baseline from `~/.{gemini|copilot|codex|claude}/tech-debt/<project-slug>-*.json` (most recent)3. Compute delta: new items, resolved items, changed severities, score trend4. Present comparison report with trend indicators
## State Management
- State directory: `~/.{gemini|copilot|codex|claude}/tech-debt/`- Create directory on first use with `mkdir -p`- Filename: `<project-slug>-<YYYY-MM-DD>.json`- Project slug: sanitized basename of the project root directory- Store after every scan; track mode reads historical baselines- Schema: `{ "project": str, "date": str, "score": float, "items": [...], "summary": {...} }`
## Reference Files
Load ONE reference at a time. Do not preload all references into context.
| File | Content | Read When ||------|---------|-----------|| `references/debt-taxonomy.md` | 5 debt categories with subcategories and remediation templates | Scan Step 6, classifying findings || `references/complexity-metrics.md` | Cyclomatic and cognitive complexity definitions, thresholds, interpretation | Interpreting complexity-scanner output || `references/prioritization-framework.md` | Risk x effort matrix, scoring rubric, priority definitions | Prioritize mode || `references/remediation-templates.md` | Fix patterns by issue type, effort estimates, risk ratings | Roadmap mode, generating fix plans |
| Script | When to Run ||--------|-------------|| `scripts/complexity-scanner.py` | Scan Steps 1, Analyze mode || `scripts/dead-code-detector.py` | Scan Step 2, Analyze mode || `scripts/dependency-staleness-checker.py` | Scan Step 3 || `scripts/pattern-consistency-checker.py` | Scan Step 4, Analyze mode |
| Template | When to Render ||----------|----------------|| `templates/dashboard.html` | Report mode — inject findings JSON into data tag |
## Debt Item Structure
Every debt item follows this format:
1. **Location**: `[file:line]` or `[file:start-end]` — exact source location2. **Category**: design | test | documentation | dependency | infrastructure3. **Severity**: CRITICAL | HIGH | MEDIUM | LOW4. **Confidence**: 0.0-1.0 score with evidence basis5. **Description**: What the debt is (1-2 sentences)6. **Impact**: Why it matters (blast radius, risk)7. **Remediation**: Recommended fix approach with effort estimate
## Critical Rules
1. Run all 4 analysis scripts before presenting findings — partial scans are labeled as such2. Every finding must have a confidence score backed by evidence (script output or codebase grep)3. Confidence < 0.3 = discard; 0.3-0.7 = flag as uncertain; >= 0.7 = report4. Never execute remediation — this skill inventories and plans, not fixes5. Store baseline after every full scan — longitudinal tracking depends on it6. Do not report style preferences as debt — only structural, behavioral, or maintainability issues7. Deduplicate across script outputs — same file:line should not appear twice8. Prioritize mode requires a prior scan — prompt user to run scan first if no baseline exists9. Always present the debt score (aggregate metric) in scan output10. Track mode must show delta (new/resolved/changed) — raw numbers without comparison are useless11. Dead code detection requires high confidence (>= 0.8) — false positives erode trust12. Load ONE reference file at a time — do not preload all references