Issues Discovered

Last Updated: 2026-03-06
Sources: All 14 repositories, .state/upstream-status.json, .state/discovered-accounts.json, .state/org-ous.json, .state/discovered-scps.json

Executive Summary

This document catalogs all issues, inconsistencies, security concerns, and improvement recommendations discovered during the comprehensive NDX:Try AWS architecture analysis. A total of 30 issues have been identified across 10 analysis phases, with 2 rated critical, 6 high, 12 medium, and 10 low severity. The most significant findings relate to the ISB fork being 12 commits behind upstream (now missing v1.2.0 and v1.2.1 feature releases), the high quarantine rate for pool accounts (46 of 240 accounts, 19%), and mixed dependency versions creating compatibility risks.


Issue Severity Ratings

| Severity | Description | Count |
| --- | --- | --- |
| CRITICAL | Security vulnerability, data loss risk, or system outage potential | 2 |
| HIGH | Significant operational impact or compliance concern | 6 |
| MEDIUM | Technical debt, inconsistency, or moderate risk | 12 |
| LOW | Documentation gap, naming inconsistency, or minor improvement | 10 |
| Total | | 30 |

All Issues

Phase 1: Repository Discovery

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P1-001 | MEDIUM | Empty placeholder repository | ndx-try-aws-isb | Contains only .gitignore and LICENSE. No source code or documentation. Should be archived. |
| P1-002 | MEDIUM | Deprecated component still deployed | innovation-sandbox-on-aws-billing-seperator | README states: "This is a temporary workaround." Tracking ISB issue #70. |
| P1-003 | LOW | Inconsistent repository naming | All repos | Mix of hyphens and underscores: ndx-try-aws-* vs ndx_try_aws_scenarios. |
| P1-004 | CRITICAL | Fork 10 commits behind upstream | innovation-sandbox-on-aws | Fork at commit cf75b87, upstream at 90488e0. 10 commits behind, potentially missing security patches and features. See .state/upstream-status.json. |

Phase 2: AWS Organization Structure

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P2-001 | HIGH | High quarantine rate | Pool accounts | 4 of 9 initial pool accounts (44%) were in Quarantine OU. Indicates cleanup failures or extended cooldown issues. |
| P2-002 | LOW | Empty environment OUs | Workloads OU | Dev, Test, and Sandbox OUs exist under Workloads but are unused. All workloads are in Prod. |
| P2-003 | HIGH | Dual SCP management | Organization SCPs | SCPs are managed by both LZA (ndx-try-aws-lza) and Terraform (ndx-try-aws-scp). Creates risk of conflicting policies and configuration drift. |
| P2-004 | LOW | Role naming inconsistency | IAM Roles | Mix of github-actions-* and GitHubActions-* naming patterns across repos. |
| P2-005 | LOW | Missing GitHub OIDC roles | Multiple repos | The billing-seperator, costs, utils, scenarios, lza, scp, and terraform repos do not have visible OIDC roles in the Hub account. |

Phase 3: ISB Core Architecture

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P3-001 | HIGH | ISB Core has no CI/CD pipeline | innovation-sandbox-on-aws | Deployment is manual (npm run deploy:all with .env file). All satellite repos have GitHub Actions workflows but core does not. |
| P3-002 | MEDIUM | Outdated AWS SDK in ISB Core | innovation-sandbox-on-aws | AWS SDK ranges from v3.654.0 to v3.758.0, while satellites use v3.987.0 to v3.1000.0. A gap of at least 229 minor versions (v3.758.0 to v3.987.0). |
| P3-003 | MEDIUM | 21 Lambda functions without centralised monitoring | innovation-sandbox-on-aws | No evidence of unified observability dashboard. Individual CloudWatch logs but no centralised metrics aggregation. |

Phase 4: ISB Satellite Components

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P4-001 | HIGH | AWS SDK v3 version spread (346 minor versions) | All satellites | Ranges from v3.654.0 (ISB Core) to v3.1000.0 (Billing Separator). Creates risk of API incompatibilities. |
| P4-002 | HIGH | CDK version mismatch | Costs, Billing Sep vs Core | CDK v2.240.0 in Costs and Billing Sep vs v2.170.0 in Core and Approver (70 minor versions apart). |
| P4-003 | CRITICAL | No EventBridge event schema versioning | All satellites | Events lack a schemaVersion field. If ISB Core changes event schema, all consuming satellites break simultaneously with no backward compatibility. |
| P4-004 | HIGH | Zod major version split (v3 vs v4) | Approver vs Costs/Billing Sep | Approver uses zod v3.24.0. Costs and Billing Sep use zod v4.3.6. These are incompatible major versions with breaking API changes. |
| P4-005 | MEDIUM | ISB Client version skew | Approver vs Costs/Deployer | Approver uses @co-cddo/isb-client v2.0.1 while Costs and Deployer use v2.0.0. |
| P4-006 | MEDIUM | ISB Client distributed as GitHub tarball | @co-cddo/isb-client | Not published to npm or GitHub Packages. Dependencies reference tarball URLs, making version resolution opaque. |

Phase 5: NDX Websites

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P5-001 | LOW | ndx package.json description is "TODO:" | ndx | The description field in package.json still reads "TODO:" and was never completed. |
| P5-002 | MEDIUM | Scenarios repo has heavy devDependencies | ndx_try_aws_scenarios | AWS SDK clients (CloudFormation, CloudWatch, DynamoDB, Lambda, S3, SNS, STS) are all in devDependencies. This suggests they are used only for tooling or tests, and they may bloat CI installs. |

Phase 6: Infrastructure (LZA & Terraform)

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P6-001 | MEDIUM | Two IaC tools manage overlapping concerns | LZA + Terraform | LZA manages core guardrails and security baselines. Terraform manages ISB-specific SCPs. Ownership boundary is unclear. |
| P6-002 | LOW | Terraform repos lack CI/CD | ndx-try-aws-terraform | Only ndx-try-aws-scp has a terraform.yaml workflow. ndx-try-aws-terraform is deployed manually. |

Phase 7: CI/CD Pipelines

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P7-001 | MEDIUM | Mixed test frameworks | All TypeScript repos | Some repos use vitest (v4.0.10-v4.0.18), others use jest (v30.2.0). Billing separator and ISB client use jest; all others use vitest. |
| P7-002 | MEDIUM | No automated dependency updates | All repos | No evidence of Renovate or Dependabot configured across repositories. Manual dependency management leads to version drift. |

Phase 8: Security & Compliance

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P8-001 | MEDIUM | GitHub PAT for deployer | innovation-sandbox-on-aws-deployer | Uses a Personal Access Token stored in Secrets Manager. PATs have broad permissions and require manual rotation. Consider GitHub App tokens instead. |
| P8-002 | LOW | No VPC endpoints for AWS services | Hub account | Lambda functions access Bedrock, Cost Explorer, and other AWS APIs via NAT Gateway rather than VPC endpoints. Traffic leaves the VPC through the NAT Gateway to public service endpoints instead of staying on private paths. |

Phase 9: Data Flows

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P9-001 | MEDIUM | Cost Explorer 100 req/hour limit | Costs satellite | With 110 pool accounts, batch querying is essential. A scaling event could hit the rate limit. |
| P9-002 | LOW | No data retention policy documentation | DynamoDB tables | TTL values are set (30 days for LeaseTable, 90 days for QuarantineStatus) but there is no formal data retention policy document. |

Phase 10: Naming & Documentation

| ID | Severity | Issue | Affected | Details |
| --- | --- | --- | --- | --- |
| P10-001 | LOW | Typo in repository name: "seperator" | innovation-sandbox-on-aws-billing-seperator | Should be "separator". This typo is baked into the GitHub repo name, CI/CD pipelines, and documentation references. |
| P10-002 | LOW | Inconsistent license types | All repos | ISB Core uses Apache-2.0, Approver and Deployer use MIT, Costs uses ISC, NDX uses MIT. No unified licensing policy. |

Detailed Issue Analysis

P1-004: Fork 10 Commits Behind Upstream (CRITICAL)

Repository: innovation-sandbox-on-aws
Upstream: aws-solutions/innovation-sandbox-on-aws

Current State (from .state/upstream-status.json):

{
  "upstreamUrl": "https://github.com/aws-solutions/innovation-sandbox-on-aws",
  "upstreamSha": "90488e0a554a3a76f41ceaf39a0d4127b4e47c28",
  "localSha": "cf75b87e1764611d794343640136cf3fb047a801",
  "divergence": {
    "ahead": 0,
    "behind": 10
  }
}

Risk: The fork is missing 10 upstream commits that may include security patches, bug fixes, or feature improvements. It has zero local commits ahead of upstream, meaning no customisations have been applied, so the merge should be straightforward.

Recommendation: Immediately merge upstream changes. Establish a regular (monthly) upstream sync process.
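A regular sync process could start from an automated check of the divergence snapshot. The following is a minimal sketch, assuming the shape of .state/upstream-status.json shown above; the `classifySync` helper and the `SyncAction` labels are hypothetical and do not exist in any of the repositories.

```typescript
// Shape of .state/upstream-status.json as reproduced in this document.
interface UpstreamStatus {
  upstreamUrl: string;
  upstreamSha: string;
  localSha: string;
  divergence: { ahead: number; behind: number };
}

// Hypothetical labels for what a sync job would need to do.
type SyncAction = "up-to-date" | "fast-forward" | "merge-required";

function classifySync(status: UpstreamStatus): SyncAction {
  const { ahead, behind } = status.divergence;
  if (behind === 0) return "up-to-date";
  // With no local commits ahead, the fork can move straight to the upstream SHA.
  return ahead === 0 ? "fast-forward" : "merge-required";
}

const snapshot: UpstreamStatus = {
  upstreamUrl: "https://github.com/aws-solutions/innovation-sandbox-on-aws",
  upstreamSha: "90488e0a554a3a76f41ceaf39a0d4127b4e47c28",
  localSha: "cf75b87e1764611d794343640136cf3fb047a801",
  divergence: { ahead: 0, behind: 10 },
};

console.log(classifySync(snapshot)); // "fast-forward"
```

A scheduled job could run a check like this monthly and open an issue whenever the result is anything other than "up-to-date".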


P4-003: No Event Schema Versioning (CRITICAL)

Problem: All EventBridge events exchanged between ISB Core and satellites lack a version identifier:

{
  "source": "leases-api",
  "detail-type": "LeaseApproved",
  "detail": {
    "leaseId": { "userEmail": "...", "uuid": "..." },
    "awsAccountId": "...",
    "approvedBy": "..."
  }
}

There is no schemaVersion, eventVersion, or equivalent field. If the ISB Core team modifies the event payload structure (adds fields, renames fields, changes types), all four satellite consumers (Approver, Deployer, Costs, Billing Separator) will fail simultaneously.

Recommendation: Add a schemaVersion: "1.0" field to all events. Implement consumer-side schema validation (already partially done with zod). Document event schemas in a shared repository or as EventBridge Schema Registry entries.
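One way a consumer might act on this recommendation is sketched below. It assumes the LeaseApproved payload shown above plus the proposed schemaVersion field; `parseLeaseApproved`, `SUPPORTED_MAJOR`, and the "major.minor" version format are illustrative assumptions, not existing ISB code. The satellites already use zod for validation; a hand-written type guard is used here only to keep the example dependency-free.

```typescript
// Hypothetical payload type: schemaVersion is the proposed addition.
interface LeaseApprovedDetail {
  schemaVersion: string;
  leaseId: { userEmail: string; uuid: string };
  awsAccountId: string;
  approvedBy: string;
}

// Assumed policy: a consumer accepts any minor version of one major version.
const SUPPORTED_MAJOR = 1;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns the validated detail, or throws with a reason the consumer can log.
function parseLeaseApproved(detail: unknown): LeaseApprovedDetail {
  if (!isRecord(detail)) throw new Error("detail is not an object");
  const { schemaVersion, leaseId, awsAccountId, approvedBy } = detail;

  if (typeof schemaVersion !== "string") throw new Error("missing schemaVersion");
  const [majorPart = ""] = schemaVersion.split(".");
  if (parseInt(majorPart, 10) !== SUPPORTED_MAJOR) {
    throw new Error(`unsupported schemaVersion ${schemaVersion}`);
  }

  if (!isRecord(leaseId)) throw new Error("leaseId is not an object");
  const { userEmail, uuid } = leaseId;
  if (
    typeof userEmail !== "string" ||
    typeof uuid !== "string" ||
    typeof awsAccountId !== "string" ||
    typeof approvedBy !== "string"
  ) {
    throw new Error("payload does not match LeaseApproved v1");
  }
  return { schemaVersion, leaseId: { userEmail, uuid }, awsAccountId, approvedBy };
}

// Example values are placeholders, not real accounts or users.
const exampleDetail = parseLeaseApproved({
  schemaVersion: "1.0",
  leaseId: { userEmail: "user@example.com", uuid: "example-uuid" },
  awsAccountId: "123456789012",
  approvedBy: "approver@example.com",
});
console.log(exampleDetail.schemaVersion); // "1.0"
```

Rejecting an unknown major version explicitly lets a consumer route the event to a dead-letter queue or log it, rather than failing later on a missing field.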


P2-001: High Quarantine Rate (HIGH)

Current State (at time of initial analysis):

  • Available: 5 accounts (pool-003, 004, 005, 006, 009)
  • Quarantine: 4 accounts (pool-001, 002, 007, 008)
  • Active: 0 accounts

44% of the initial 9 pool accounts were in quarantine. With the full pool of 110 accounts now provisioned, this ratio may have improved, but the underlying cleanup reliability concern remains.
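The percentage can be recomputed as the pool grows so the ratio is tracked over time. This is a small sketch using the counts listed above; the `PoolCounts` type and `quarantineRate` function are hypothetical names introduced for illustration.

```typescript
// Account counts per pool state, as listed in this section.
interface PoolCounts {
  available: number;
  quarantine: number;
  active: number;
}

// Percentage of pool accounts in quarantine, rounded to the nearest whole percent.
function quarantineRate(counts: PoolCounts): number {
  const total = counts.available + counts.quarantine + counts.active;
  if (total === 0) return 0; // An empty pool has no quarantine rate to report.
  return Math.round((counts.quarantine / total) * 100);
}

const initialPool: PoolCounts = { available: 5, quarantine: 4, active: 0 };
console.log(quarantineRate(initialPool)); // 44
```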

Possible Causes:

  1. AWS Nuke failing to delete certain resource types
  2. Protected resources not properly configured in nuke-config.yaml
  3. SCP restrictions preventing Nuke from deleting resources
  4. CodeBuild timeout before Nuke completes

Recommendation: Review CodeBuild logs for failed cleanup jobs. Update nuke-config.yaml to handle edge cases. Consider increasing max retries from 3.


P4-001: AWS SDK Version Spread (HIGH)

ISB Core: v3.654.0 - v3.758.0 (oldest)
ISB Client: v3.992.0 (exact pin)
Approver: v3.987.0 (caret)
Deployer: v3.993.0 (caret)
Costs: v3.995.0 (caret)
Billing Separator: v3.1000.0 (caret)

The spread of 346 minor versions across the ecosystem means API changes, bug fixes, and security patches in the SDK are inconsistently applied. ISB Core is the furthest behind.
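The 346 figure is the difference between the highest and lowest minor versions in the list above. A minimal sketch of that arithmetic follows, with the versions hard-coded from this document; in practice they would presumably be read from each repository's package.json. The `ComponentSdk` type and both helper functions are hypothetical.

```typescript
interface ComponentSdk {
  component: string;
  versions: string[]; // "major.minor.patch", without the leading "v"
}

// Versions as listed in this section; ISB Core contributes both ends of its range.
const sdkVersions: ComponentSdk[] = [
  { component: "ISB Core", versions: ["3.654.0", "3.758.0"] },
  { component: "ISB Client", versions: ["3.992.0"] },
  { component: "Approver", versions: ["3.987.0"] },
  { component: "Deployer", versions: ["3.993.0"] },
  { component: "Costs", versions: ["3.995.0"] },
  { component: "Billing Separator", versions: ["3.1000.0"] },
];

// Extracts the minor component of a "major.minor.patch" string.
function minorOf(version: string): number {
  const [, minor = ""] = version.split(".");
  return parseInt(minor, 10);
}

// Difference between the newest and oldest minor version across all components.
function minorSpread(components: ComponentSdk[]): number {
  const minors: number[] = [];
  for (const entry of components) {
    for (const version of entry.versions) minors.push(minorOf(version));
  }
  return Math.max(...minors) - Math.min(...minors);
}

console.log(minorSpread(sdkVersions)); // 346
```

Running the same calculation over the satellites alone gives a spread of 13, which shows that ISB Core accounts for nearly all of the drift.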


Recommendations Summary

Immediate Actions (Critical)

| ID | Action | Effort |
| --- | --- | --- |
| P1-004 | Merge 10 upstream commits into ISB Core fork | Low (straightforward merge, 0 ahead) |
| P4-003 | Add schemaVersion field to all EventBridge events | Medium (cross-repo change) |

Short-Term Actions (High)

| ID | Action | Effort |
| --- | --- | --- |
| P4-001 | Standardise AWS SDK to v3.995.0+ across all repos | Medium |
| P4-002 | Align CDK version to v2.240.0 across all repos | Medium |
| P4-004 | Migrate Approver from zod v3 to zod v4 | Medium |
| P2-003 | Consolidate SCP management to single IaC source | High |
| P2-001 | Investigate and remediate quarantine backlog | Medium |

Medium-Term Actions (Medium)

| ID | Action | Effort |
| --- | --- | --- |
| P3-001 | Add CI/CD pipeline for ISB Core | High |
| P4-006 | Publish ISB Client to GitHub Packages (npm) | Low |
| P7-001 | Standardise on vitest across all TypeScript repos | Medium |
| P7-002 | Configure Renovate/Dependabot across all repos | Low |
| P8-001 | Replace GitHub PAT with GitHub App token | Medium |
| P1-001 | Archive ndx-try-aws-isb (empty repo) | Low |
| P1-002 | Track ISB issue #70 for billing separator replacement | Low |
| P6-001 | Document IaC ownership boundaries (LZA vs Terraform) | Low |
| P9-001 | Implement batch cost collection strategy | Medium |
| P3-002 | Update AWS SDK in ISB Core workspaces | Medium |
| P3-003 | Create unified CloudWatch dashboard | Medium |
| P5-002 | Move AWS SDK clients from devDeps if not needed | Low |

Long-Term Actions (Low)

| ID | Action | Effort |
| --- | --- | --- |
| P1-003 | Standardise repository naming conventions | High (rename = breaking) |
| P10-001 | Fix "seperator" typo (requires repo rename) | High (GitHub URL changes) |
| P10-002 | Adopt unified license policy (recommend MIT or Apache-2.0) | Low |
| P2-002 | Document or remove empty environment OUs | Low |
| P2-004 | Standardise IAM role naming conventions | Medium |
| P2-005 | Create OIDC roles for all repos that need them | Medium |
| P5-001 | Fix "TODO:" in ndx package.json description | Low |
| P8-002 | Deploy VPC endpoints for AWS services | Medium |
| P9-002 | Create formal data retention policy document | Low |
P9-002Create formal data retention policy documentLow

Generated from source analysis. See 00-repo-inventory.md for full inventory.