Lease Lifecycle
Last Updated: 2026-03-02 | Source: co-cddo/innovation-sandbox-on-aws | Captured SHA: cf75b87
Executive Summary
A lease represents a user's temporary access to a sandboxed AWS account within the Innovation Sandbox ecosystem. Each lease passes through a well-defined state machine, from request through approval, active monitoring, and eventual termination with automated account cleanup. The lifecycle is orchestrated by the Leases Lambda (API-driven state changes), the Lease Monitoring Lambda (scheduled budget/duration checks), and the Account Lifecycle Manager Lambda (event-driven OU transitions and IDC assignments). Account cleanup is handled by a Step Functions state machine that invokes CodeBuild running AWS Nuke in a container.
Complete Lease State Machine
Lease States
| State | Schema | Account Allocated | OU | Monitoring | Terminal |
|---|---|---|---|---|---|
| PendingApproval | PendingLeaseSchema | No | -- | No | No |
| ApprovalDenied | ApprovalDeniedLeaseSchema | No | -- | No | Yes |
| Active | MonitoredLeaseSchema | Yes | Active | Yes | No |
| Frozen | MonitoredLeaseSchema | Yes | Frozen | Yes | No |
| Expired | ExpiredLeaseSchema | Yes (cleanup queued) | CleanUp | No | Yes |
| BudgetExceeded | ExpiredLeaseSchema | Yes (cleanup queued) | CleanUp | No | Yes |
| ManuallyTerminated | ExpiredLeaseSchema | Yes (cleanup queued) | CleanUp | No | Yes |
| AccountQuarantined | ExpiredLeaseSchema | Yes (quarantined) | Quarantine | No | Yes |
| Ejected | ExpiredLeaseSchema | Yes (ejected) | Exit | No | Yes |
Source: source/common/data/lease/lease.ts
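The state table above can be encoded as a small type plus a few predicates. This is a minimal sketch derived only from the table; the helper names (`isMonitored`, `isTerminal`, `hasAllocatedAccount`) are hypothetical and are not taken from `lease.ts`.

```typescript
// Lease statuses, copied from the "Lease States" table.
type LeaseStatus =
  | "PendingApproval"
  | "ApprovalDenied"
  | "Active"
  | "Frozen"
  | "Expired"
  | "BudgetExceeded"
  | "ManuallyTerminated"
  | "AccountQuarantined"
  | "Ejected";

// Statuses whose "Monitoring" column is "Yes".
const MONITORED: ReadonlyArray<LeaseStatus> = ["Active", "Frozen"];

// Statuses whose "Terminal" column is "No".
const NON_TERMINAL: ReadonlyArray<LeaseStatus> = ["PendingApproval", "Active", "Frozen"];

// Statuses whose "Account Allocated" column is "No".
const NO_ACCOUNT: ReadonlyArray<LeaseStatus> = ["PendingApproval", "ApprovalDenied"];

// Hypothetical helpers, named for illustration only.
function isMonitored(status: LeaseStatus): boolean {
  return MONITORED.indexOf(status) !== -1;
}

function isTerminal(status: LeaseStatus): boolean {
  return NON_TERMINAL.indexOf(status) === -1;
}

function hasAllocatedAccount(status: LeaseStatus): boolean {
  return NO_ACCOUNT.indexOf(status) === -1;
}
```

For example, `isTerminal("Frozen")` is `false` while `isTerminal("Ejected")` is `true`, matching the last column of the table.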
Phase 1: Lease Request
Sequence Diagram
DynamoDB Writes
Auto-approved path:
- `LeaseTable` INSERT: New `MonitoredLease` with status `Active`, `awsAccountId`, `startDate`, `expirationDate`, `approvedBy: "AUTO_APPROVED"`
- `SandboxAccountTable` UPDATE: Account status from `Available` to `Active`, set lease association
Pending approval path:
- `LeaseTable` INSERT: New `PendingLease` with status `PendingApproval`
- No account table changes
Validation rules (from global AppConfig):
- Template must exist and be active
- User's concurrent active lease count < `maxLeasesPerUser` (default 3)
- If auto-approved, at least one account must be in `Available` status
- If `userEmail` differs from requester, requester must be Manager or Admin
Source: source/lambdas/api/leases/src/leases-handler.ts
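The validation rules listed above can be sketched as a pure function. This is an illustration only: the `LeaseRequestContext` shape, its field names, and the `validateLeaseRequest` function are assumptions made for this example and are not the handler's actual types.

```typescript
type Role = "User" | "Manager" | "Admin";

// Hypothetical input shape gathering everything the rules need.
interface LeaseRequestContext {
  templateFound: boolean;
  templateActive: boolean;
  activeLeaseCount: number;      // user's concurrent active leases
  maxLeasesPerUser: number;      // from global AppConfig (default 3)
  autoApproved: boolean;
  availableAccountCount: number; // accounts currently in Available status
  requesterEmail: string;
  userEmail: string;             // the user the lease is requested for
  requesterRoles: Role[];
}

// Returns a list of rule violations; an empty list means the request is valid.
function validateLeaseRequest(ctx: LeaseRequestContext): string[] {
  const problems: string[] = [];

  if (!ctx.templateFound || !ctx.templateActive) {
    problems.push("template missing or inactive");
  }
  // The rule is count < maxLeasesPerUser, so count >= max is a violation.
  if (ctx.activeLeaseCount >= ctx.maxLeasesPerUser) {
    problems.push("maxLeasesPerUser reached");
  }
  if (ctx.autoApproved && ctx.availableAccountCount < 1) {
    problems.push("no account in Available status");
  }
  const onBehalfOfOther = ctx.userEmail !== ctx.requesterEmail;
  const privileged = ctx.requesterRoles.some((r) => r === "Manager" || r === "Admin");
  if (onBehalfOfOther && !privileged) {
    problems.push("requester must be Manager or Admin");
  }
  return problems;
}
```

Returning all violations at once, rather than failing on the first, is a design choice of this sketch; the real handler may short-circuit.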
Phase 2: Lease Approval
Sequence Diagram
Account Lifecycle Manager Actions on LeaseApproved
The Account Lifecycle Manager Lambda handles the physical account provisioning:
- Move account to Active OU: `organizations:MoveAccount` from Available OU to Active OU
- Grant IDC access: `sso:CreateAccountAssignment` with the user's permission set (User, Manager, or Admin PS)
- Update DynamoDB: Record the IDC assignment state on the account
The Write Protection SCP stops applying once the account leaves the Available OU (it is attached only to the Available, CleanUp, Quarantine, Entry, and Exit OUs), so the user can create resources.
Source: source/lambdas/account-management/account-lifecycle-management/src/account-lifecycle-manager.ts
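The two AWS calls in the provisioning steps above take request payloads roughly like the following. The field names follow the AWS Organizations `MoveAccount` and IAM Identity Center `CreateAccountAssignment` API parameters; everything else (the `ProvisioningConfig` shape, the `buildProvisioningRequests` helper, and the choice of a `USER` principal) is an assumption made for this sketch, which builds plain objects and does not call AWS.

```typescript
// Parameters of organizations:MoveAccount.
interface MoveAccountInput {
  AccountId: string;
  SourceParentId: string;
  DestinationParentId: string;
}

// Parameters of sso:CreateAccountAssignment.
interface CreateAccountAssignmentInput {
  InstanceArn: string;
  PermissionSetArn: string;
  PrincipalId: string;
  PrincipalType: "USER" | "GROUP";
  TargetId: string;
  TargetType: "AWS_ACCOUNT";
}

// Hypothetical configuration; the real code resolves these values elsewhere.
interface ProvisioningConfig {
  availableOuId: string;
  activeOuId: string;
  idcInstanceArn: string;
  permissionSetArn: string;
}

// Hypothetical helper: builds both request payloads for a LeaseApproved event.
function buildProvisioningRequests(
  awsAccountId: string,
  idcUserId: string,
  config: ProvisioningConfig,
): { moveAccount: MoveAccountInput; createAssignment: CreateAccountAssignmentInput } {
  return {
    moveAccount: {
      AccountId: awsAccountId,
      SourceParentId: config.availableOuId,
      DestinationParentId: config.activeOuId,
    },
    createAssignment: {
      InstanceArn: config.idcInstanceArn,
      PermissionSetArn: config.permissionSetArn,
      PrincipalId: idcUserId,
      PrincipalType: "USER", // assumption: assignment is made to the individual user
      TargetId: awsAccountId,
      TargetType: "AWS_ACCOUNT",
    },
  };
}
```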
Phase 3: Active Lease Monitoring
Monitoring Schedule
The LeaseMonitoringLambda runs on a scheduled EventBridge rule and evaluates all Active and Frozen leases.
Alert-to-Action Mapping
| Alert Event | Current State | Action | New State |
|---|---|---|---|
| LeaseBudgetExceeded | Active/Frozen | Terminate lease, queue cleanup | BudgetExceeded |
| LeaseExpired | Active/Frozen | Terminate lease, queue cleanup | Expired |
| LeaseBudgetThresholdAlert | Active | Send notification only | Active (unchanged) |
| LeaseDurationThresholdAlert | Active | Send notification only | Active (unchanged) |
| LeaseFreezingThresholdAlert | Active | Freeze account | Frozen |
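The mapping in the table above amounts to a small transition function. The sketch below is illustrative; `nextStatus` is a hypothetical name, and the behaviour for notification-only alerts on a Frozen lease (status left unchanged) is an assumption, since the table lists those alerts only for Active leases.

```typescript
type AlertEvent =
  | "LeaseBudgetExceeded"
  | "LeaseExpired"
  | "LeaseBudgetThresholdAlert"
  | "LeaseDurationThresholdAlert"
  | "LeaseFreezingThresholdAlert";

type MonitoredStatus = "Active" | "Frozen";
type ResultingStatus = MonitoredStatus | "BudgetExceeded" | "Expired";

// Hypothetical helper: the lease status after an alert event is handled.
function nextStatus(event: AlertEvent, current: MonitoredStatus): ResultingStatus {
  if (event === "LeaseBudgetExceeded") return "BudgetExceeded"; // terminate, queue cleanup
  if (event === "LeaseExpired") return "Expired";               // terminate, queue cleanup
  if (event === "LeaseFreezingThresholdAlert") return "Frozen"; // freeze account
  // Budget/duration threshold alerts only send a notification.
  return current;
}
```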
Threshold Configuration
Thresholds are defined per lease template:
Budget thresholds: `[{ dollarsSpent: number, action: "ALERT" | "FREEZE_ACCOUNT" }]`
- `ALERT`: Publishes `LeaseBudgetThresholdAlert` (notification only)
- `FREEZE_ACCOUNT`: Publishes `LeaseFreezingThresholdAlert` (triggers freeze)

Duration thresholds: `[{ hoursRemaining: number, action: "ALERT" | "FREEZE_ACCOUNT" }]`
- Same action types as budget thresholds
Source: source/lambdas/account-management/lease-monitoring/src/lease-monitoring-handler.ts
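A rough sketch of how the threshold arrays above could be turned into alert events follows. The function names and the comparison operators (`>=` for spend, `<=` for hours remaining) are assumptions; the real monitoring Lambda presumably also avoids re-publishing alerts for thresholds that already fired, which this sketch does not model.

```typescript
type ThresholdAction = "ALERT" | "FREEZE_ACCOUNT";

interface BudgetThreshold {
  dollarsSpent: number;
  action: ThresholdAction;
}

interface DurationThreshold {
  hoursRemaining: number;
  action: ThresholdAction;
}

type ThresholdEvent =
  | "LeaseBudgetThresholdAlert"
  | "LeaseDurationThresholdAlert"
  | "LeaseFreezingThresholdAlert";

// Hypothetical helper: events for every budget threshold the spend has reached.
function budgetEvents(thresholds: BudgetThreshold[], totalCostAccrued: number): ThresholdEvent[] {
  return thresholds
    .filter((t) => totalCostAccrued >= t.dollarsSpent)
    .map((t): ThresholdEvent =>
      t.action === "FREEZE_ACCOUNT" ? "LeaseFreezingThresholdAlert" : "LeaseBudgetThresholdAlert",
    );
}

// Hypothetical helper: events for every duration threshold the lease has crossed.
function durationEvents(thresholds: DurationThreshold[], hoursLeft: number): ThresholdEvent[] {
  return thresholds
    .filter((t) => hoursLeft <= t.hoursRemaining)
    .map((t): ThresholdEvent =>
      t.action === "FREEZE_ACCOUNT" ? "LeaseFreezingThresholdAlert" : "LeaseDurationThresholdAlert",
    );
}
```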
Phase 4: Lease Freeze and Unfreeze
Freeze Flow
Unfreeze Flow
Freezing preserves existing resources but the Frozen OU may have additional restrictions. Unfreezing restores full access. Both operations require Manager or Admin role.
Source: source/common/events/lease-frozen-event.ts, lease-unfrozen-event.ts
Phase 5: Lease Termination and Cleanup
Termination Triggers
A lease enters a terminal state through three paths:
- Manual termination: `POST /leases/{id}/terminate` (Manager/Admin)
- Budget exceeded: Lease Monitoring detects `totalCostAccrued > maxSpend`
- Duration expired: Lease Monitoring detects `now > expirationDate`
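The two automatic triggers above reduce to a pair of comparisons. A minimal sketch, assuming `expirationDate` is stored as an ISO 8601 string and that the budget check takes precedence when both conditions hold (the ordering is an assumption); `detectTermination` and the snapshot type are hypothetical names.

```typescript
// Hypothetical subset of a monitored lease record.
interface MonitoredLeaseSnapshot {
  totalCostAccrued: number;
  maxSpend: number;
  expirationDate: string; // assumed ISO 8601 timestamp
}

type TerminationEvent = "LeaseBudgetExceeded" | "LeaseExpired";

// Returns the event to publish, or null if the lease should keep running.
function detectTermination(lease: MonitoredLeaseSnapshot, now: Date): TerminationEvent | null {
  if (lease.totalCostAccrued > lease.maxSpend) {
    return "LeaseBudgetExceeded";
  }
  if (now.getTime() > new Date(lease.expirationDate).getTime()) {
    return "LeaseExpired";
  }
  return null;
}
```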
Account Lifecycle Manager on Terminal Events
For the tracked events LeaseBudgetExceeded and LeaseExpired, the Account Lifecycle Manager processes the transition:
- Update lease record to terminal status (`Expired` / `BudgetExceeded` / `ManuallyTerminated`)
- Set `endDate` and `ttl` on the lease
- Revoke IDC access: `sso:DeleteAccountAssignment`
- Move account to CleanUp OU: `organizations:MoveAccount`
- Publish `CleanAccountRequest` event to trigger the cleanup Step Function
Account Cleaner Step Function
Key parameters (from Global AppConfig cleanup section):
- `numberOfSuccessfulAttemptsToFinishCleanup`: Number of consecutive AWS Nuke successes required (default: 2)
- `waitBeforeRerunSuccessfulAttemptSeconds`: Delay between successful runs (default: 30s)
- `numberOfFailedAttemptsToCancelCleanup`: Max failures before quarantine (default: 3)
- `waitBeforeRetryFailedAttemptSeconds`: Delay between failed retries (default: 5s)
- Step Function total timeout: 12 hours
- CodeBuild timeout: 60 minutes per run
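How the success and failure counters interact can be illustrated with a small simulation. This is a sketch under stated assumptions, not the Step Function definition: it assumes a failed run resets the success streak (implied by "consecutive") and that failures are counted cumulatively; the waits and timeouts are not simulated. `evaluateCleanupRuns` is a hypothetical name.

```typescript
// Field names and defaults as listed in the Global AppConfig cleanup section.
interface CleanupConfig {
  numberOfSuccessfulAttemptsToFinishCleanup: number;
  waitBeforeRerunSuccessfulAttemptSeconds: number;
  numberOfFailedAttemptsToCancelCleanup: number;
  waitBeforeRetryFailedAttemptSeconds: number;
}

const DEFAULT_CLEANUP_CONFIG: CleanupConfig = {
  numberOfSuccessfulAttemptsToFinishCleanup: 2,
  waitBeforeRerunSuccessfulAttemptSeconds: 30,
  numberOfFailedAttemptsToCancelCleanup: 3,
  waitBeforeRetryFailedAttemptSeconds: 5,
};

type CleanupOutcome = "AccountCleanupSucceeded" | "AccountCleanupFailed" | "InProgress";

// Replays a sequence of AWS Nuke run results (true = success) and reports
// which outcome the loop would have reached, if any.
function evaluateCleanupRuns(
  runSucceeded: boolean[],
  config: CleanupConfig = DEFAULT_CLEANUP_CONFIG,
): CleanupOutcome {
  let consecutiveSuccesses = 0;
  let failures = 0;
  for (let i = 0; i < runSucceeded.length; i++) {
    if (runSucceeded[i]) {
      consecutiveSuccesses += 1;
      if (consecutiveSuccesses >= config.numberOfSuccessfulAttemptsToFinishCleanup) {
        return "AccountCleanupSucceeded";
      }
    } else {
      failures += 1;
      consecutiveSuccesses = 0; // assumption: a failure breaks the success streak
      if (failures >= config.numberOfFailedAttemptsToCancelCleanup) {
        return "AccountCleanupFailed";
      }
    }
  }
  return "InProgress";
}
```

Requiring more than one successful run guards against resources that AWS Nuke cannot see or delete on the first pass, for example because of dependency ordering.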
AWS Nuke Execution
CodeBuild runs an AWS Nuke container that:
- Assumes the `IntermediateRole` in the hub account
- Then assumes the `{namespace}_IsbCleanupRole` in the target sandbox account
- Loads nuke config from AppConfig (with placeholder substitution)
- Deletes all resources except those in the blocklist/filters
- Returns exit code to Step Functions
Protected resources (from nuke-config.yaml):
- CloudFormation StackSet instances (`StackSet-Isb-*`)
- AWS Control Tower resources (trails, rules, roles, functions, logs)
- SSO-related roles (`AWSReservedSSO_*`)
- `OrganizationAccountAccessRole`
- StackSet execution roles (`stacksets-exec-*`)
- SAML providers (`AWSSSO`)
- Config Service recorders/channels
Source: source/infrastructure/lib/components/account-cleaner/step-function.ts, cleanup-buildspec.yaml, source/infrastructure/lib/components/config/nuke-config.yaml
Phase 6: Post-Cleanup
On AccountCleanupSucceeded
The Account Lifecycle Manager:
- Moves account from CleanUp OU to Available OU
- Resets the account record in SandboxAccountTable (clears lease association, sets status to `Available`)
- Account is now ready for the next lease
On AccountCleanupFailed
The Account Lifecycle Manager:
- Moves account from CleanUp OU to Quarantine OU
- Updates account status to `Quarantine`
- Updates lease status to `AccountQuarantined`
- Publishes `AccountQuarantined` event
- Sends admin notification for manual review
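The two post-cleanup branches above can be summarised as a plan object per event. This is an illustrative sketch; `PostCleanupPlan` and `planPostCleanup` are hypothetical names, and whether the lease association is kept on a quarantined account is not stated in the text, so the `false` value in the failure branch is an assumption.

```typescript
type CleanupResultEvent = "AccountCleanupSucceeded" | "AccountCleanupFailed";

// Hypothetical description of what the Account Lifecycle Manager does next.
interface PostCleanupPlan {
  moveToOu: "Available" | "Quarantine";
  accountStatus: "Available" | "Quarantine";
  clearLeaseAssociation: boolean;
  leaseStatus: "AccountQuarantined" | null; // null = lease record left as-is
  publishEvents: string[];
  notifyAdmins: boolean;
}

function planPostCleanup(event: CleanupResultEvent): PostCleanupPlan {
  if (event === "AccountCleanupSucceeded") {
    return {
      moveToOu: "Available",
      accountStatus: "Available",
      clearLeaseAssociation: true,
      leaseStatus: null,
      publishEvents: [],
      notifyAdmins: false,
    };
  }
  return {
    moveToOu: "Quarantine",
    accountStatus: "Quarantine",
    clearLeaseAssociation: false, // assumption: not specified in the text
    leaseStatus: "AccountQuarantined",
    publishEvents: ["AccountQuarantined"],
    notifyAdmins: true,
  };
}
```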
Admin Recovery Options
- Retry cleanup: `POST /accounts/{id}/retryCleanup` -- moves account back to CleanUp OU and re-triggers cleanup
- Eject account: `POST /accounts/{id}/eject` -- moves account to Exit OU, removes from pool permanently
Account OU Transition Diagram
Each OU has specific SCPs applied:
- Available, CleanUp, Quarantine, Entry, Exit: Write Protection SCP (blocks create/modify)
- Active: Full access within allowed services and regions
- Frozen: Full access but practically limited (no active user sessions)
- All OUs: AWS Nuke Supported Services SCP, Restrictions SCP, Protect ISB SCP, Limit Regions SCP
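The SCP assignment described above can be expressed as a lookup. A minimal sketch: the OU and SCP names come from the list, while `scpsFor` is a hypothetical helper and the strings are labels rather than real policy IDs.

```typescript
type Ou = "Entry" | "Available" | "Active" | "Frozen" | "CleanUp" | "Quarantine" | "Exit";

// OUs where the Write Protection SCP blocks create/modify operations.
const WRITE_PROTECTED_OUS: ReadonlyArray<Ou> = ["Available", "CleanUp", "Quarantine", "Entry", "Exit"];

// SCPs applied to every OU.
const COMMON_SCPS: ReadonlyArray<string> = [
  "AWS Nuke Supported Services",
  "Restrictions",
  "Protect ISB",
  "Limit Regions",
];

// Hypothetical helper: the SCP labels in effect for an account in the given OU.
function scpsFor(ou: Ou): string[] {
  const scps = COMMON_SCPS.slice();
  if (WRITE_PROTECTED_OUS.indexOf(ou) !== -1) {
    scps.push("Write Protection");
  }
  return scps;
}
```

With this mapping, `scpsFor("Active")` and `scpsFor("Frozen")` return only the four common SCPs, which is why the move into the Active OU is what lets a user create resources.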
EventBridge Event Routing
Event-to-Lambda Routing
| Event | Rule Target | Delivery | Concurrency |
|---|---|---|---|
| LeaseApproved, LeaseBudgetExceeded, LeaseExpired, AccountCleanupSucceeded, AccountCleanupFailed, AccountDriftDetected, LeaseFreezingThresholdAlert | Account Lifecycle Manager | SQS -> Lambda | Reserved: 1 |
| CleanAccountRequest | Account Cleaner Step Function | Direct | -- |
| LeaseRequested, LeaseApproved, LeaseDenied, LeaseTerminated, LeaseFrozen, LeaseUnfrozen, alerts | Email Notification Lambda | SQS -> Lambda | -- |
| All events | CloudWatch Logs | Direct | -- |
The Account Lifecycle Manager uses reserved concurrency of 1 to ensure serialized processing of events, preventing race conditions in account OU transitions and DynamoDB updates.
Source: source/infrastructure/lib/components/events/isb-internal-core.ts, source/infrastructure/lib/components/account-management/account-lifecycle-management-lambda.ts
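The routing table above can be read as a function from event name to targets. The sketch below is for illustration only: the real routing is done by EventBridge rules, `targetsFor` is a hypothetical name, and the table's "alerts" entry is assumed here to mean the three threshold alert events.

```typescript
// Events delivered to the Account Lifecycle Manager (via SQS).
const LIFECYCLE_EVENTS: ReadonlyArray<string> = [
  "LeaseApproved",
  "LeaseBudgetExceeded",
  "LeaseExpired",
  "AccountCleanupSucceeded",
  "AccountCleanupFailed",
  "AccountDriftDetected",
  "LeaseFreezingThresholdAlert",
];

// Events delivered to the Email Notification Lambda (via SQS).
const NOTIFICATION_EVENTS: ReadonlyArray<string> = [
  "LeaseRequested",
  "LeaseApproved",
  "LeaseDenied",
  "LeaseTerminated",
  "LeaseFrozen",
  "LeaseUnfrozen",
  // Assumption: "alerts" in the table refers to these threshold alerts.
  "LeaseBudgetThresholdAlert",
  "LeaseDurationThresholdAlert",
  "LeaseFreezingThresholdAlert",
];

// Hypothetical helper: every target that would receive the given event.
function targetsFor(eventName: string): string[] {
  const targets: string[] = [];
  if (LIFECYCLE_EVENTS.indexOf(eventName) !== -1) {
    targets.push("Account Lifecycle Manager");
  }
  if (eventName === "CleanAccountRequest") {
    targets.push("Account Cleaner Step Function");
  }
  if (NOTIFICATION_EVENTS.indexOf(eventName) !== -1) {
    targets.push("Email Notification Lambda");
  }
  targets.push("CloudWatch Logs"); // all events are logged
  return targets;
}
```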
Error Handling and Recovery
SQS-based Retry Pattern
Events routed through SQS queues benefit from:
- Visibility timeout: Prevents re-processing during Lambda execution
- Max receive count: 3 retries before DLQ
- Max event age: 4 hours for lifecycle events
- DLQ: Dead letter queue for manual investigation
Step Function Error Handling
- The `InitializeCleanupLambda` invoke has a catch-all that publishes `AccountCleanupFailed`
- The CodeBuild step has a catch-all that increments the failure counter and retries
- The entire state machine has a 12-hour timeout
Idempotency
- The `InitializeCleanupLambda` checks if cleanup is already in progress (by querying the `cleanupExecutionContext` on the account record) and skips if so
- Lease state transitions use DynamoDB conditional writes to prevent conflicting updates
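A conditional write for a lease state transition might look roughly like this. The sketch builds the input object for a DynamoDB document-client `UpdateItem` call without invoking AWS; the key attributes (`userEmail`, `uuid`) follow the query patterns table in this document, while the function name, the table name argument, and the exact expressions are assumptions.

```typescript
interface LeaseKey {
  userEmail: string;
  uuid: string;
}

// Hypothetical helper: update input that moves a lease from one status to
// another only if the stored status still equals the expected one.
function buildStatusTransition(tableName: string, key: LeaseKey, from: string, to: string) {
  return {
    TableName: tableName,
    Key: { userEmail: key.userEmail, uuid: key.uuid },
    UpdateExpression: "SET #status = :next",
    // The write is rejected if another writer changed the status first.
    ConditionExpression: "#status = :expected",
    // "status" is a DynamoDB reserved word, so it is aliased.
    ExpressionAttributeNames: { "#status": "status" },
    ExpressionAttributeValues: { ":next": to, ":expected": from },
  };
}
```

If the condition fails, DynamoDB rejects the write with a `ConditionalCheckFailedException`, which the caller can treat as "someone else already moved this lease".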
DynamoDB Query Patterns
| Query | Method | Key/Index | Filter |
|---|---|---|---|
| Get lease by ID | Query | PK: userEmail, SK: uuid | -- |
| User's leases | Query | PK: userEmail | Optional status filter |
| Leases by status | Query | GSI StatusIndex PK: status | -- |
| Available accounts | Scan | -- | status = "Available" |
| Account by ID | GetItem | PK: awsAccountId | -- |
| Template by ID | GetItem | PK: uuid | -- |
Note: The StatusIndex GSI on LeaseTable uses status as partition key and originalLeaseTemplateUuid as sort key, enabling efficient queries for all leases in a given state.
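As an illustration of the "Leases by status" pattern, the following sketch builds a DynamoDB document-client `Query` input against the `StatusIndex` GSI. The index name and key attributes come from the note above; the helper name, the table name argument, and the optional template filter are assumptions, and no AWS call is made.

```typescript
// Hypothetical helper: query input for all leases in a given status,
// optionally narrowed to one lease template through the GSI sort key.
function buildLeasesByStatusQuery(tableName: string, status: string, templateUuid?: string) {
  const names: { [alias: string]: string } = { "#status": "status" }; // reserved word alias
  const values: { [placeholder: string]: string } = { ":status": status };
  let keyCondition = "#status = :status";

  if (templateUuid !== undefined) {
    keyCondition += " AND originalLeaseTemplateUuid = :template";
    values[":template"] = templateUuid;
  }

  return {
    TableName: tableName,
    IndexName: "StatusIndex",
    KeyConditionExpression: keyCondition,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}
```

Because the monitoring Lambda needs every Active and Frozen lease, it would issue one such query per status rather than scanning the table.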
Related Documentation
- 10-isb-core-architecture.md -- CDK stacks, Lambda catalog, API endpoints
- 12-isb-frontend.md -- Frontend UI for lease management
- 13-isb-customizations.md -- CDDO extensions (Costs, Deployer, Approver)
- 05-service-control-policies.md -- SCP analysis per OU
Generated from source analysis. See 00-repo-inventory.md for full inventory.