---
type: spec
feature_id: qa2
feature_name: CI/CD Foundation & Testing Infrastructure
owner: Engineering Team
version: 1.0
status: Draft
created_date: 2025-10-30
last_updated: 2025-10-30
related_prd: ../prds/qa2-prd.md
related_brd: ../brds/qa2-brd.md
---

# Technical Specification: QA2 - CI/CD Foundation & Testing Infrastructure

**Document:** Technical Specification  
**Version:** 1.0  
**Feature ID:** qa2  
**Status:** Draft  
**Created:** October 30, 2025  
**Target Release:** v1.6.7

---

## 1. Executive Summary

Implement automated testing infrastructure, CI/CD pipelines, structured logging with data redaction, and refactor critical complexity to establish engineering foundations. Organized into 4 sequential epics: (1) CI/CD Infrastructure, (2) Security Hardening, (3) Testing Foundation, (4) Code Quality Refactoring.

**References:**
- [PRD § 5 Functional Requirements](../prds/qa2-prd.md#5-functional-requirements)
- [BRD § 6 Success Metrics](../brds/qa2-brd.md#6-success-metrics)

---

## 2. Technical Requirements

### Functional Requirements (from PRD)
1. **R1: Lint + type-check on every commit** → Epic 1, Issue 1.1 ✅
2. **R2: Run tests on all PRs automatically** → Epic 1, Issue 1.1 ✅
3. **R3: Show coverage diff in PR comments** → Epic 1, Issue 1.2
4. **R4: Block merge if tests fail** → Epic 1, Issue 1.3
5. **R5: Redact sensitive data from logs** → Epic 2, Issue 2.1
6. **R6: Track code complexity trends** → Not in scope (removed refactoring)

### Non-Functional Requirements (from PRD)
- **Performance:** CI pipeline <10 min per PR
- **Reliability:** <5% false positive rate on gates
- **Security:** 100% sensitive data redaction
- **Compliance:** No tokens/emails/API keys in any logs
- **Error Handling:** Graceful failures with proper logging fallbacks

---

## 3. Architecture Overview

```
┌─────────────────────────────────────────────────────────┐
│                GitHub Repository (main/dev)             │
└────┬────────────────────────────────────────────────────┘
     │ Push/PR Event
     ↓
┌─────────────────────────────────────────────────────────┐
│            GitHub Actions CI Workflow                   │
│  (lint → type-check → test → coverage)                 │
└────┬──────────────────┬────────────────────┬───────────┘
     │                  │                    │
     ↓                  ↓                    ↓
 Lint Report     Coverage Report       Build Complete
     │                  │                    │
     └──────────────────┼────────────────────┘
                        │
                        ↓
            ┌────────────────────────┐
            │   Codecov Integration  │
            │  (coverage tracking)   │
            └────────────────────────┘
                        │
                        ↓
            ┌────────────────────────┐
            │   PR Status Checks     │
            │   + Comments           │
            └────────────────────────┘
                        │
                        ↓
            ┌────────────────────────┐
            │  Branch Protection     │
            │  (block on gate fails)  │
            └────────────────────────┘
```

**Security Layer:**
- All logs processed through `scripts/core/logger.ts`
- Sensitive data patterns redacted (tokens, emails, API keys)
- Error handling: graceful fallback if redaction fails
- 100% redaction rate target

---

## 4. Implementation Plan

### Phase 1: CI/CD Infrastructure (Epic 1, 7-11 hours)
**Timeline:** Days 1-3

1. **Complete `.github/workflows/ci.yml`** (2-3 hours)
   - Existing workflow is 70% complete and functional
   - Add verification steps for all jobs
   - Reference: Existing file at `.github/workflows/ci.yml` (lines 29-33 show lint/type-check)
   - Jobs confirmed: lint (line 30), type-check (line 32), tests (line 36), coverage upload (line 38)

2. **Configure `codecov.yml`** (2-3 hours)
   - Precision: 2 decimal places
   - Thresholds: 35% overall, 50% on patches
   - PR comments with diff

3. **Create `setup-branch-protection.sh`** (2-3 hours)
   - Protect main/dev branches
   - Require: lint, type-check, test status checks
   - Require 1 approval before merge

### Phase 2: Security Hardening (Epic 2, 8-12 hours)
**Timeline:** Days 4-6

1. **Create `scripts/core/logger.ts`** (6-8 hours)
   - Levels: debug, info, warn, error
   - Redaction patterns with regex (tokens, emails, API keys)
   - 90%+ test coverage
   - Error handling with try-catch fallback

2. **Migrate 10 critical files to structured logging** (2-4 hours)
   - Replace console.log with logger
   - Use migration pattern provided below
   - No sensitive values logged
   - Error handling for failed logging attempts

### Phase 3: Testing Foundation (Epic 3, 6-10 hours)
**Timeline:** Days 7-8

Create 6 test files achieving ≥35% overall coverage:
1. `scripts/utils/__tests__/format-helpers.test.ts` (100% target)
2. `scripts/utils/__tests__/cost-calculator.test.ts` (100%)
3. `scripts/utils/__tests__/mode-detector.test.ts` (90%+)
4. `scripts/utils/__tests__/check-submodule-name.test.ts` (100%)
5. `scripts/utils/__tests__/classification-zones.test.ts` (85%+)
6. `scripts/utils/__tests__/path-validation.test.ts` (90%+)

---

## 5. Data Model Changes

### No database schema changes (code-only release)

**Configuration files to create/update:**
- `.github/workflows/ci.yml` - VERIFY/COMPLETE (mostly exists)
- `codecov.yml` (NEW)
- `setup-branch-protection.sh` (NEW)
- `scripts/core/logger.ts` (NEW)
- `.env.local` (add CODECOV_TOKEN)

---

## 6. API Changes

**No external API changes** (internal infrastructure only)

**New internal interfaces:**
```typescript
// From logger.ts
interface Logger {
  debug(message: string): void;
  info(message: string): void;
  warn(message: string): void;
  error(message: string, error?: Error): void;
}

// From code migration
interface RedactionConfig {
  tokenPattern: RegExp;
  emailPattern: RegExp;
  apiKeyPattern: RegExp;
}
```

---

## 7. Implementation Approach

### Technology Stack
- **CI/CD:** GitHub Actions (existing)
- **Coverage:** Jest + Codecov
- **Logging:** Custom logger (no external dependencies)
- **Testing:** Jest with Arrange-Act-Assert pattern
- **Error Handling:** Try-catch with graceful fallbacks

### Quality Standards
- **Logging:** No `console.log`, only structured logger
- **Security:** 100% sensitive data redaction in logs
- **Coverage:** ≥35% overall, ≥85% per utility
- **Error Handling:** All operations wrapped with error recovery

---

## 8. Issues & Breakdown

### EPIC 1: CI/CD Infrastructure & Deployment Automation (7-11 hours)

#### Issue 1.1: Complete GitHub Actions CI Workflow

**Classification:** 5 (ai-led)  
**Overview:** Verify and finalize GitHub Actions workflow. Foundation already exists at `.github/workflows/ci.yml`.

**Existing Workflow Reference:**
- ✅ Lint job: `npm run lint` (line 30)
- ✅ Type-check job: `npm run type-check` (line 32)
- ✅ Tests: `npm test -- --coverage` (line 36)
- ✅ Codecov upload: Configured (line 38-46)

**Acceptance Criteria:**
- [ ] All jobs execute successfully on push and PR
- [ ] Lint outputs clear, actionable error messages
- [ ] Type-check catches TS errors before tests
- [ ] Tests run with coverage reporting
- [ ] Codecov upload succeeds with token from secrets
- [ ] Workflow completes in <10 minutes
- [ ] Error messages are clear and actionable

**Technical Implementation:**
- Use existing workflow as base
- Verify all npm scripts exist (npm run lint, npm run type-check, npm test)
- Test complete workflow with sample PR
- Confirm Node.js 18.x compatibility

**Error Scenarios to Handle:**
- npm script not found → descriptive error message
- Coverage upload fails → continue (don't block merge)
- Test failures → block merge with clear error details

**Dependencies:** None (foundation in place)  
**Estimated Effort:** 2-3 hours

---

#### Issue 1.2: Configure Codecov Integration & Coverage Tracking

**Classification:** 5 (ai-led)  
**Overview:** Create `codecov.yml` configuration file and authorize Codecov app.

**Acceptance Criteria:**
- [ ] `codecov.yml` created with thresholds
- [ ] Codecov app authorized on GitHub
- [ ] PR shows coverage status check
- [ ] PR comment shows coverage diff
- [ ] Historical tracking enabled

**Configuration Template (`codecov.yml`):**
```yaml
coverage:
  precision: 2
  round: down
  range: [70, 100]

codecov:
  require_ci_to_pass: true

ignore:
  - "node_modules"
  - "coverage"
  - "**/*.test.ts"
  - "**/*.mock.ts"

flags:
  unittests:
    description: Unit tests
    carryforward: true

comment:
  layout: "reach,diff,flags,tree"
  behavior: default
  require_changes: false
  require_base: no
  require_head: yes

github_checks:
  annotations: true

```

**Dependencies:** Issue 1.1 (workflow uploads coverage)  
**Estimated Effort:** 2-3 hours

---

#### Issue 1.3: Implement Branch Protection Rules

**Classification:** 5 (ai-led)  
**Overview:** Create and apply branch protection using `setup-branch-protection.sh` script.

**Script Template (`setup-branch-protection.sh`):**
```bash
#!/bin/bash
set -e

OWNER="tailwind-ai"
REPO="roadcrew-internal"

echo "🔒 Setting up branch protection for $REPO..."

for BRANCH in main dev; do
  echo "  Protecting $BRANCH branch..."
  
  gh api repos/$OWNER/$REPO/branches/$BRANCH/protection \
    --input - << 'EOF'
{
  "required_status_checks": {
    "strict": true,
    "contexts": ["lint", "type-check", "test"]
  },
  "required_pull_request_reviews": {
    "dismiss_stale_reviews": true,
    "require_code_owner_reviews": false,
    "required_approving_review_count": 1
  },
  "enforce_admins": true,
  "allow_force_pushes": false,
  "allow_deletions": false
}
EOF

  if [ $? -eq 0 ]; then
    echo "  ✅ $BRANCH branch protected"
  else
    echo "  ❌ Failed to protect $BRANCH branch"
    exit 1
  fi
done

echo "✅ All branches protected successfully"
```

**Error Handling:**
- Check GitHub CLI installed: `gh --version`
- Verify auth: `gh auth status`
- Handle API errors with meaningful messages
- Rollback on failure (show how to undo)

**Acceptance Criteria:**
- [ ] Both main and dev branches protected
- [ ] Status checks required: lint, type-check, test
- [ ] 1 approval required before merge
- [ ] Stale reviews dismissed on new push
- [ ] Force pushes prevented
- [ ] Script exits with error if protection fails

**Dependencies:** Issue 1.1 (status checks exist)  
**Estimated Effort:** 2-3 hours

---

### EPIC 2: Security Hardening & Structured Logging (8-12 hours)

#### Issue 2.1: Create Structured Logger Module with Redaction

**Classification:** 6 (ai-led)  
**Overview:** Build logger module with automatic sensitive data redaction.

**Acceptance Criteria:**
- [ ] `scripts/core/logger.ts` exports createLogger(moduleName)
- [ ] 4 log levels: debug, info, warn, error
- [ ] Redaction patterns for tokens, emails, API keys
- [ ] All redacted values use `[REDACTED_*]` format
- [ ] 90%+ test coverage
- [ ] Error handling with graceful fallback

**Implementation Details:**
- Redaction regex patterns (exact):
  * Token: `/gh[pousr]_[A-Za-z0-9_]{36,}/g` → `[REDACTED_TOKEN]`
  * Email: `/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g` → `[REDACTED_EMAIL]`
  * API key: `/api[_-]?key[_-]?[a-zA-Z0-9]{20,}/gi` → `[REDACTED_API_KEY]`

**Error Handling Code:**
```typescript
function redactSensitiveData(message: string): string {
  try {
    let redacted = message;
    redacted = redacted.replace(/gh[pousr]_[A-Za-z0-9_]{36,}/g, '[REDACTED_TOKEN]');
    redacted = redacted.replace(/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g, '[REDACTED_EMAIL]');
    redacted = redacted.replace(/api[_-]?key[_-]?[a-zA-Z0-9]{20,}/gi, '[REDACTED_API_KEY]');
    return redacted;
  } catch (error) {
    // Redaction failed - don't crash, fall back to original
    console.error('[LOGGER_ERROR] Redaction failed:', error instanceof Error ? error.message : String(error));
    return `[REDACTION_ERROR] ${message.substring(0, 50)}...`;
  }
}
```

**Dependencies:** None  
**Estimated Effort:** 6-8 hours

---

#### Issue 2.2: Migrate Critical Files to Structured Logging

**Classification:** 6 (ai-led)  
**Overview:** Replace console.log with logger in 10 critical files.

**Migration Pattern Example - Before/After:**

**BEFORE (scripts/utils/github-auth.ts):**
```typescript
console.log('Validating GitHub token...');
console.log('Token:', token); // ❌ LEAKS SENSITIVE DATA
console.log('API endpoint:', endpoint);
console.log('Response:', response);
if (!response.ok) {
  console.error('Auth failed:', response.statusText);
}
```

**AFTER:**
```typescript
import { createLogger } from '../core/logger';
const logger = createLogger('github-auth');

logger.debug('Validating GitHub token');
logger.debug(`Token length: ${token.length} chars`); // ✅ Metadata, not value
logger.debug(`Using endpoint: ${endpoint}`);
logger.info(`Auth response status: ${response.status}`);
if (!response.ok) {
  logger.error(`Auth failed with status ${response.status}`);
}
```

**Key Rules for All Files:**
- ✅ DO log: operation status, IDs, counts, endpoints, status codes
- ✅ DO log: errors (without sensitive values)
- ❌ NEVER log: token values, email addresses, API keys, request/response bodies
- ❌ NEVER log: passwords, secrets, issue content

**10 Files to Migrate:**
1. `scripts/utils/github-auth.ts`
2. `scripts/utils/github-issue-creator.ts`
3. `scripts/utils/token-tracker.ts`
4. `scripts/utils/cost-calculator.ts`
5. `scripts/utils/expert-protection.ts`
6. `scripts/utils/github-projects.ts`
7. `scripts/utils/issue-classification.ts`
8. `scripts/utils/release-parser.ts`
9. `scripts/utils/validation.ts`
10. `scripts/utils/template-engine.ts`

**Error Handling for Migration:**
```typescript
// Wrap logger calls in try-catch to handle logger failures
try {
  logger.info(`Processing: ${fileName}`);
  // ... do work ...
  logger.info(`Completed: ${fileName}`);
} catch (error) {
  // Logger failed - fall back to stderr
  console.error('[LOGGER_FALLBACK] Error:', error instanceof Error ? error.message : String(error));
  // Continue processing - don't crash
}
```

**Acceptance Criteria:**
- [ ] All 10 files have 0 console.log statements
- [ ] All use logger import and createLogger()
- [ ] No sensitive values logged anywhere
- [ ] Grep audit: `grep -r "console\.(log|debug|error|warn)" scripts/utils/*.ts` = 0 results
- [ ] Error handling in place: failures don't crash the app
- [ ] All tests passing after migration

**Dependencies:** Issue 2.1 (logger module exists)  
**Estimated Effort:** 2-4 hours

---

### EPIC 3: Testing Foundation & Coverage Baseline (6-10 hours)

#### Issue 3.1: Create Test Suite for Foundation Utilities

**Classification:** 5 (ai-led)  
**Overview:** Create Jest tests for 6 foundation utilities, achieving ≥35% coverage.

**Acceptance Criteria:**
- [ ] 6 test files created with comprehensive test cases
- [ ] Overall coverage ≥35% (verified with npm run test:coverage)
- [ ] Each target file ≥85% individual coverage
- [ ] All tests pass (npm test = exit 0)
- [ ] Coverage report in coverage/lcov-report/

**Technical Implementation:**
- Test template: Arrange-Act-Assert pattern
- Test structure: describe() → describe(function) → it(scenario)
- Happy path + edge cases + error scenarios
- Mock external dependencies
- 8-15 test cases per file

**Dependencies:** Issue 1.1 (CI runs tests)  
**Estimated Effort:** 6-10 hours

---

## 9. Testing Strategy

### Unit Tests
- Logger redaction patterns (Issue 2.1)
- Foundation utilities (Issue 3.1)

### Integration Tests
- CI/CD workflow with sample PR
- Logging migration error scenarios
- Logger fallback behavior

### Coverage Targets
- ≥35% overall (Issue 3.1)
- ≥85% per utility file
- ≥90% for logger.ts

---

## 10. Success Criteria

### All Issues Complete
- [ ] Issue 1.1: CI workflow verified and working
- [ ] Issue 1.2: Codecov integration active
- [ ] Issue 1.3: Branch protection enforced
- [ ] Issue 2.1: Logger module tested
- [ ] Issue 2.2: 10 files migrated, 0 console.log
- [ ] Issue 3.1: 35%+ coverage achieved

### Quality Gates
- ✅ All CI checks passing (lint, type-check, test)
- ✅ Coverage ≥35% overall, ≥85% per utility
- ✅ 100% sensitive data redaction in logs
- ✅ Zero regressions in existing functionality
- ✅ Error handling in place for all new code

---

## 11. Dependencies & Risks

### External Dependencies
- GitHub Actions (available)
- Jest (configured)
- Codecov API (requires token)

### Internal Dependencies
- Issues 1.1 → 1.2 → 1.3 (sequential)
- Issues 2.1 → 2.2 (logger before migration)
- Issue 3.1 independent

### Risks & Mitigations
| Risk | Mitigation |
|------|-----------|
| npm script not found | Verify scripts exist before workflow runs |
| Codecov token missing | Add CODECOV_TOKEN to GitHub secrets |
| Logger redaction fails | Wrap in try-catch, fall back gracefully |
| Migration breaks existing code | Verify no console.log references break tests |
| Coverage below 35% | Add test cases incrementally |

---

## 12. Timeline

| Week | Epic | Deliverable | Hours |
|------|------|-------------|-------|
| Week 1 | 1, 2 | CI verified, logger module, migration started | 10-15 |
| Week 1-2 | 2, 3 | Migration complete, test baseline | 7-12 |
| Week 2-3 | 3 | Coverage ≥35%, all tests passing | 4-6 |

**Total:** 21-33 hours over 2-3 weeks

---

## 13. Glossary

- **Redaction:** Replacement of sensitive data with `[REDACTED_*]` tag
- **Structured Logging:** Logs with consistent format [LEVEL] [MODULE] message
- **CI/CD Gate:** Status check blocking merge if fails
- **Coverage Diff:** Change in test coverage between PR and main branch

---

**Document Status:** Ready for implementation

**Next Steps:** Create GitHub issues from this spec using `/scope-release --current`