Data Flows

Last Updated: 2026-03-06
Sources: repos/innovation-sandbox-on-aws, repos/innovation-sandbox-on-aws-costs, repos/innovation-sandbox-on-aws-approver, repos/innovation-sandbox-on-aws-billing-seperator, repos/innovation-sandbox-on-aws-deployer, .state/discovered-accounts.json

Executive Summary

The NDX:Try AWS ecosystem relies on an event-driven data architecture where DynamoDB tables serve as the primary data stores, EventBridge acts as the central nervous system for cross-component communication, and Step Functions orchestrate multi-step workflows. Data flows span the Hub account (568672915267), the Organization Management account (955063685555), and up to 110 pool accounts within the ndx_InnovationSandboxAccountPool OU.


Data Flow 1: User Signup to ISB Lease

Data Objects

LeaseRequest (Input)

{
leaseTemplateUuid: string, // Selected template
userEmail?: string, // Optional (defaults to requestor)
comments?: string, // Justification text
tags?: Record<string, string> // Custom tags
}

Lease (Created in DynamoDB)

{
// Partition Key
userEmail: string,
// Sort Key
uuid: string,

// Status
status: 'PendingApproval' | 'Active' | 'Frozen' | 'Expired' | 'BudgetExceeded',

// Template Info
originalLeaseTemplateUuid: string,
originalLeaseTemplateName: string,

// Resource Allocation
awsAccountId?: string, // Assigned after approval

// Temporal
createdDate: string, // ISO8601
startDate?: string, // After approval
expirationDate?: string, // startDate + duration
endDate?: string, // After termination

// Budget
maxSpend: number, // GBP
totalCostAccrued: number, // Updated hourly

// Lifecycle
leaseDurationInHours: number,
approvedBy?: string,
ttl?: number // Epoch for auto-deletion
}
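
The temporal fields are related as the comments indicate: expirationDate is startDate plus the lease duration. The following is a minimal Python sketch of that calculation, assuming ISO 8601 timestamps that carry a UTC offset; the function name `expiration_date` is hypothetical and not taken from the ISB codebase.

```python
from datetime import datetime, timedelta


def expiration_date(start_date: str, lease_duration_in_hours: float) -> str:
    """Return startDate + leaseDurationInHours as an ISO 8601 string."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    start = datetime.fromisoformat(start_date.replace("Z", "+00:00"))
    return (start + timedelta(hours=lease_duration_in_hours)).isoformat()


print(expiration_date("2024-01-01T00:00:00Z", 48))
```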

DynamoDB Write Patterns

1. Check User Quota

Operation: Query
TableName: LeaseTable
KeyConditionExpression: userEmail = :email
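FilterExpression: #status IN (:active, :pending, :frozen)
ExpressionAttributeNames: {"#status": "status"} (status is a DynamoDB reserved word)
ExpressionAttributeValues: {":active": "Active", ":pending": "PendingApproval", ":frozen": "Frozen"}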
Purpose: Enforce maxLeasesPerUser limit (typically 3)
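
Once the query returns, the quota decision itself is a simple count. The sketch below assumes the query results have already been converted to plain dictionaries; `can_request_lease` and its default limit of 3 are illustrative names and values, not confirmed ISB identifiers.

```python
# Statuses that count against the user's quota, as listed in the query above.
COUNTED_STATUSES = {"Active", "PendingApproval", "Frozen"}


def can_request_lease(leases: list, max_leases_per_user: int = 3) -> bool:
    """Return True if the user is still below the lease limit."""
    in_use = sum(1 for lease in leases if lease.get("status") in COUNTED_STATUSES)
    return in_use < max_leases_per_user


existing = [{"status": "Active"}, {"status": "Expired"}, {"status": "Frozen"}]
print(can_request_lease(existing))
```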

2. Find Available Account

Operation: Query
TableName: SandboxAccountTable
IndexName: AccountsByStatus
KeyConditionExpression: accountStatus = :available
Limit: 1
Purpose: Allocate account for new lease

3. Create Lease Record

Operation: PutItem
TableName: LeaseTable
Item: {lease object}
Purpose: Persist lease for tracking

4. Update Account Status

Operation: UpdateItem
TableName: SandboxAccountTable
Key: {accountId}
UpdateExpression: SET accountStatus = :active, leaseUuid = :leaseId
Purpose: Mark account as in-use

EventBridge Events Published

LeaseRequested (if manual approval)

{
"source": "leases-api",
"detail-type": "LeaseRequested",
"detail": {
"leaseId": {
"userEmail": "user@example.gov.uk",
"uuid": "lease-abc-123"
},
"templateId": "template-xyz",
"userEmail": "user@example.gov.uk",
"templateName": "Production-Like",
"justification": "Need to test new microservice..."
}
}

LeaseApproved (if auto-approve or after manual approval)

{
"source": "leases-api",
"detail-type": "LeaseApproved",
"detail": {
"leaseId": {
"userEmail": "user@example.gov.uk",
"uuid": "lease-abc-123"
},
"awsAccountId": "340601547583",
"approvedBy": "AUTO_APPROVED"
}
}
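
For illustration, a LeaseApproved event of this shape could be assembled as an EventBridge PutEvents entry as sketched below. The helper name and the `bus_name` default are assumptions; the actual event bus used by ISB is not stated in this document.

```python
import json


def lease_approved_entry(user_email, lease_uuid, aws_account_id,
                         approved_by="AUTO_APPROVED", bus_name="default"):
    """Build one PutEvents entry mirroring the LeaseApproved payload above."""
    detail = {
        "leaseId": {"userEmail": user_email, "uuid": lease_uuid},
        "awsAccountId": aws_account_id,
        "approvedBy": approved_by,
    }
    return {
        "Source": "leases-api",
        "DetailType": "LeaseApproved",
        "Detail": json.dumps(detail),  # PutEvents expects Detail as a JSON string
        "EventBusName": bus_name,      # assumed bus name
    }


entry = lease_approved_entry("user@example.gov.uk", "lease-abc-123", "340601547583")
```

With a boto3 EventBridge client, such an entry would be sent with `events.put_events(Entries=[entry])`.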

Data Flow 2: Lease Approval to Deployment

Data Transformations

1. Approver Scoring

Input: Raw lease request

{
"leaseId": "lease-abc-123",
"userEmail": "user@example.gov.uk",
"budget": 1000,
"duration": 48,
"justification": "Testing new AI service integration..."
}

Intermediate: Rule scores (weighted across 5 categories, 19 rules total)

{
"R01_PreviousLeaseCompliance": {"score": 92, "weight": 10, "passed": true},
"R02_CostOverrunHistory": {"score": 100, "weight": 8, "passed": true},
"R09_JustificationQuality": {"score": 78, "weight": 8, "passed": true},
"R13_CurrentSpendVsQuota": {"score": 100, "weight": 6, "passed": true}
}

Output: Approval decision

{
"compositeScore": 87,
"decision": "APPROVED",
"decisionBy": "AUTO",
"ruleBreakdown": {}
}
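
One plausible way to combine rule scores into a composite is a weighted mean. The sketch below is an assumption about the aggregation, not the Approver's confirmed formula. Applied to only the four rules shown above it yields 92, so the example compositeScore of 87 presumably reflects all 19 rules.

```python
def composite_score(rule_scores: dict) -> float:
    """Weighted mean of rule scores (assumed aggregation method)."""
    total_weight = sum(rule["weight"] for rule in rule_scores.values())
    if total_weight == 0:
        raise ValueError("no weighted rules to score")
    weighted = sum(rule["score"] * rule["weight"] for rule in rule_scores.values())
    return weighted / total_weight


rules = {
    "R01_PreviousLeaseCompliance": {"score": 92, "weight": 10, "passed": True},
    "R02_CostOverrunHistory": {"score": 100, "weight": 8, "passed": True},
    "R09_JustificationQuality": {"score": 78, "weight": 8, "passed": True},
    "R13_CurrentSpendVsQuota": {"score": 100, "weight": 6, "passed": True},
}
print(composite_score(rules))  # 92.0
```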

2. Deployer Parameter Enrichment

The deployer uses the @co-cddo/isb-client library to fetch lease details from the ISB API, then enriches CloudFormation parameters:

{
"StackName": "NDXTry-CouncilChatbot",
"TemplateURL": "https://...",
"Parameters": [
{"ParameterKey": "Budget", "ParameterValue": "1000"},
{"ParameterKey": "Environment", "ParameterValue": "sandbox"}
],
"Tags": [
{"Key": "LeaseId", "Value": "lease-abc-123"},
{"Key": "CostCentre", "Value": "Innovation"}
]
}

Cross-Account Data Access

Deployer Assumes Role in Target Account

Source: Hub Account (568672915267)
Target: Pool Account (e.g., pool-003 / 340601547583)
Role: OrganizationAccountAccessRole
Permissions:
- cloudformation:CreateStack
- cloudformation:DescribeStacks
- cloudformation:UpdateStack
- iam:CreateRole (for CFN execution)
- s3:GetObject (for templates)
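
The role assumption can be sketched as follows. This assumes the standard `aws` partition and a role with no IAM path; the session name `ndx-deployer` and both function names are hypothetical. The STS client is passed in so the sketch stays independent of how credentials are configured.

```python
def role_arn(account_id: str, role_name: str = "OrganizationAccountAccessRole") -> str:
    """Build the ARN of the role in the target pool account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def assume_pool_role(sts_client, account_id: str, session_name: str = "ndx-deployer") -> dict:
    """Assume the cross-account role and return the temporary credentials."""
    response = sts_client.assume_role(
        RoleArn=role_arn(account_id),
        RoleSessionName=session_name,
    )
    # Credentials holds AccessKeyId, SecretAccessKey, SessionToken, Expiration.
    return response["Credentials"]
```

In practice `sts_client` would be `boto3.client("sts")` running in the Hub account, and the returned credentials would be used to create a CloudFormation client for the pool account.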

Data Flow 3: Cost Data Collection

Cost Data Structure

Cost Explorer Query (from Hub to Org Management Account)

# Cost collector assumes CostExplorerReadRole in 955063685555.
# `ce` is a boto3 Cost Explorer client created with those temporary credentials.
# start_date and end_date are 'YYYY-MM-DD' strings; End is exclusive.
response = ce.get_cost_and_usage(
    TimePeriod={
        'Start': start_date,
        'End': end_date
    },
    Granularity='DAILY',
    Metrics=['UnblendedCost', 'UsageQuantity'],
    GroupBy=[
        {'Type': 'DIMENSION', 'Key': 'SERVICE'},
        {'Type': 'DIMENSION', 'Key': 'REGION'}
    ],
    Filter={
        'Dimensions': {
            'Key': 'LINKED_ACCOUNT',
            'Values': [account_id]
        }
    }
)
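
The response groups costs per day by service and region. A minimal sketch of rolling those groups up into per-service totals is shown below; the function name is hypothetical, and pagination via NextPageToken is deliberately left out.

```python
from collections import defaultdict


def cost_by_service(response: dict) -> dict:
    """Sum UnblendedCost per service across all days in one response page."""
    totals = defaultdict(float)
    for day in response.get("ResultsByTime", []):
        for group in day.get("Groups", []):
            service = group["Keys"][0]  # Keys follow GroupBy order: SERVICE, REGION
            amount = group["Metrics"]["UnblendedCost"]["Amount"]  # returned as a string
            totals[service] += float(amount)
    return dict(totals)


sample = {
    "ResultsByTime": [
        {"Groups": [{"Keys": ["EC2", "us-east-1"],
                     "Metrics": {"UnblendedCost": {"Amount": "1.5", "Unit": "USD"}}}]},
        {"Groups": [{"Keys": ["EC2", "us-east-1"],
                     "Metrics": {"UnblendedCost": {"Amount": "2.5", "Unit": "USD"}}}]},
    ]
}
print(cost_by_service(sample))  # {'EC2': 4.0}
```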

DynamoDB CostReports Record

{
"leaseId": "lease-abc-123",
"accountId": "340601547583",
"collectedAt": 1704067200,
"totalCost": 856.34,
"budget": 1000,
"overBudget": false,
"variance": -143.66,
"variancePercent": -14.4,
"duration": 30,
"costPerDay": 28.54,
"dailyCosts": "[{\"date\":\"2024-01-01\",\"cost\":25.67}]",
"topServices": "[{\"service\":\"EC2\",\"cost\":456.78}]",
"topRegions": "[{\"region\":\"us-east-1\",\"cost\":850.00}]",
"dataSource": "AWS_COST_EXPLORER",
"ownerId": "user@example.gov.uk",
"orgUnit": "Innovation"
}
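
The derived fields in this record follow from totalCost, budget, and duration. The sketch below reproduces the sample values, assuming duration is expressed in days and that the rounding matches the record shown (two decimal places for money, one for percentages); the function name is hypothetical.

```python
def cost_report_metrics(total_cost: float, budget: float, duration_days: int) -> dict:
    """Derive the budget comparison fields stored alongside totalCost."""
    if budget <= 0 or duration_days <= 0:
        raise ValueError("budget and duration must be positive")
    variance = total_cost - budget
    return {
        "overBudget": total_cost > budget,
        "variance": round(variance, 2),
        "variancePercent": round(variance / budget * 100, 1),
        "costPerDay": round(total_cost / duration_days, 2),
    }


print(cost_report_metrics(856.34, 1000, 30))
```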

Data Retention

Table            | Retention     | Purpose
-----------------|---------------|-----------------------
LeaseTable       | 30 days (TTL) | Active tracking
CostReports      | Indefinite    | Business intelligence
ApprovalHistory  | 2 years       | Compliance audit
QuarantineStatus | 90 days (TTL) | Operational monitoring
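
DynamoDB TTL attributes are stored as epoch seconds. A sketch of computing the `ttl` value for a 30-day retention follows, assuming the retention period is counted from the lease end date (this document does not state the reference point); the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch(end_date: datetime, retention_days: int = 30) -> int:
    """Return the epoch-seconds value to store in the ttl attribute."""
    if end_date.tzinfo is None:
        raise ValueError("end_date must be timezone-aware")
    return int((end_date + timedelta(days=retention_days)).timestamp())


print(ttl_epoch(datetime(2024, 1, 1, tzinfo=timezone.utc)))  # 1706659200
```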

Data Flow 4: Billing Separation & Cleanup

72-Hour Quarantine Flow

SQS Message Structure

Initial Message (at termination)

{
"leaseId": "lease-abc-123",
"accountId": "340601547583",
"terminatedAt": 1704067200,
"ownerId": "user@example.gov.uk",
"finalBudget": 1000,
"duration": 30,
"quarantineReason": "AWAITING_COST_DATA"
}

Decision Matrix

Hours Elapsed | Cost Data | Action
--------------|-----------|------------------
24h | No | Extend delay 24h
48h | No | Extend delay 24h
72h | Yes | Release account
72h | No | Extend delay 24h
96h | No | Force release + alert
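
The matrix can be expressed as a small decision function. The sketch below covers the rows listed and makes two assumptions for the cases the matrix omits: cost data that arrives before 72 hours still waits out the quarantine, and cost data at 96 hours or later leads to a normal release. The action names are hypothetical.

```python
def quarantine_action(hours_elapsed: int, has_cost_data: bool) -> str:
    """Map elapsed quarantine time and cost-data availability to an action."""
    if has_cost_data and hours_elapsed >= 72:
        return "RELEASE"
    if hours_elapsed >= 96:
        return "FORCE_RELEASE_AND_ALERT"
    return "EXTEND_24H"


for hours, has_data in [(24, False), (48, False), (72, True), (72, False), (96, False)]:
    print(hours, has_data, quarantine_action(hours, has_data))
```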

Event-Driven Architecture Summary

EventBridge Event Catalog

Event               | Source                 | Consumers                     | Payload
--------------------|------------------------|-------------------------------|-------------------------------------
LeaseRequested      | leases-api             | Approver, Email               | leaseId, templateId, userEmail
LeaseApproved       | leases-api, approver   | Lifecycle Mgr, Deployer, Email | leaseId, accountId, approvedBy
LeaseDenied         | approver               | Email                         | leaseId, reason, deniedBy
LeaseTerminated     | leases-api, monitoring | Costs, Billing Sep, Cleanup   | leaseId, accountId, terminatedAt
LeaseExpired        | monitoring             | Lifecycle Mgr, Cleanup, Email | leaseId, accountId, expirationDate
LeaseBudgetExceeded | monitoring             | Lifecycle Mgr, Email, Cleanup | leaseId, budget, totalSpend
CostDataCollected   | costs                  | Billing Sep, Reporting        | leaseId, totalCost, overBudget
BudgetOverage       | costs                  | Email, Reporting              | leaseId, variance, variancePercent
DeploymentComplete  | deployer               | Leases API, Email             | leaseId, stackId, status
AccountCleaned      | cleanup                | Lifecycle Mgr                 | accountId, cleanupDuration
AccountQuarantined  | cleanup                | Ops alerts                    | accountId, reason, failureCount

Data Consistency & Error Handling

Transaction Patterns

Lease Creation (DynamoDB)

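import boto3

dynamodb = boto3.client('dynamodb')

# Transactional write ensures atomicity.
# lease_item must be in DynamoDB attribute-value format, e.g. {'uuid': {'S': '...'}}.
# The account status strings and lease_uuid variable below are assumed for illustration.
dynamodb.transact_write_items(
    TransactItems=[
        {
            'Put': {
                'TableName': 'LeaseTable',
                'Item': lease_item,
                # The table key is (userEmail, uuid); fail if this item already exists
                'ConditionExpression': 'attribute_not_exists(userEmail)'
            }
        },
        {
            'Update': {
                'TableName': 'SandboxAccountTable',
                'Key': {'accountId': {'S': account_id}},
                'UpdateExpression': 'SET accountStatus = :active, leaseUuid = :leaseId',
                'ConditionExpression': 'accountStatus = :available',
                'ExpressionAttributeValues': {
                    ':active': {'S': 'Active'},
                    ':available': {'S': 'Available'},
                    ':leaseId': {'S': lease_uuid}
                }
            }
        }
    ]
)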

Rollback on Failure: Transaction fails atomically if account is not available, preventing partial states.
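
When a transaction is cancelled, boto3 raises TransactionCanceledException, and the error response is expected to carry a CancellationReasons list with one entry per item in TransactItems. The sketch below shows how a caller might tell which step failed; the helper name is hypothetical and the response shape should be checked against the boto3 version in use.

```python
def failed_transaction_steps(error_response: dict) -> list:
    """Return the indexes of TransactItems whose condition or write failed."""
    reasons = error_response.get("CancellationReasons", [])
    # Items that did not cause the cancellation report the code "None".
    return [index for index, reason in enumerate(reasons)
            if reason.get("Code") not in (None, "None")]


example = {
    "CancellationReasons": [
        {"Code": "None"},
        {"Code": "ConditionalCheckFailed", "Message": "The conditional request failed"},
    ]
}
print(failed_transaction_steps(example))  # [1]
```

Here index 1 corresponds to the SandboxAccountTable update, meaning the account was claimed by another lease and the caller could select a different account and retry.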

Idempotency

The Approver uses @aws-lambda-powertools/idempotency to ensure duplicate EventBridge deliveries do not result in duplicate processing. The deployer checks for existing CloudFormation stacks before creating new ones.
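
The idea behind idempotent processing can be illustrated with a short Python sketch. This is a conceptual illustration only, not the Powertools API: the key derivation, the function names, and the in-memory dictionary (standing in for a persistent store such as a DynamoDB table) are all assumptions.

```python
import hashlib
import json


def idempotency_key(event: dict) -> str:
    """Derive a stable key from the event detail."""
    payload = json.dumps(event["detail"], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def process_once(event: dict, handler, store: dict):
    """Run handler at most once per distinct event detail."""
    key = idempotency_key(event)
    if key in store:
        return store[key]  # duplicate delivery: return the saved result
    result = handler(event)
    store[key] = result
    return result
```

A duplicate EventBridge delivery with the same detail produces the same key, so the handler's stored result is returned without scoring the lease a second time.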

Eventual Consistency Handling

  1. 24-hour delay before first cost collection attempt (Cost Explorer data lag)
  2. 72-hour quarantine before cleanup (billing propagation)
  3. Retry logic with exponential backoff (via exponential-backoff library in billing separator)
  4. Dead Letter Queues for permanent failures
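
The retry schedule produced by exponential backoff can be sketched as below. The base delay, multiplier, and cap are illustrative values rather than the billing separator's actual configuration, and a production schedule would usually add random jitter.

```python
def backoff_delays(attempts: int, base_seconds: float = 1.0,
                   factor: float = 2.0, max_seconds: float = 60.0) -> list:
    """Return the wait before each retry, doubling up to a fixed cap."""
    return [min(base_seconds * factor ** n, max_seconds) for n in range(attempts)]


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```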

Performance Characteristics

Throughput Limits

Operation          | Limit                 | Mitigation
-------------------|-----------------------|----------------------------------
Cost Explorer API  | 100 req/hour          | Batch queries, 24h delay
DynamoDB writes    | On-demand (unlimited) | Auto-scaling
EventBridge events | 10,000/sec            | Well within capacity
Lambda concurrency | 1000 (default)        | Reserved concurrency per function

Latency Targets

Flow               | Target | Typical (P95)
-------------------|--------|--------------------------
Lease creation     | < 2s   | ~1.3s
Approval scoring   | < 10s  | ~8.7s (includes Bedrock)
Deployment trigger | < 5s   | ~3.2s
Cost collection    | < 30s  | ~18.4s

Generated from source analysis. See 00-repo-inventory.md for full inventory.