6R Disposition Distribution
Composite Score Distribution
Application Portfolio
AWS-Powered Rationalization Pipeline
End-to-end data pipeline. All processing runs in GovCloud (us-gov-west-1) with cross-partition Bedrock access. Click any service below for details.
Data Ingestion
5 FAA-provided Excel data extracts uploaded to the S3 bucket (atlas-rationalization-092359260389/raw/):
CMDB Extract (101 CIs) | Cloud Export (17 resources) | Business Context (19 apps) | Security/Compliance (20 apps) | Interface Inventory (42 interfaces)
Why S3?
FedRAMP-authorized object storage in GovCloud. Provides durable, encrypted-at-rest storage for source data. Serves as the single source of truth — all downstream processing reads from these immutable source files.
Data Processing & Scoring Engine
atlas-data-loader Lambda (Python 3.12) parses all 5 Excel files, merges records across sources by Application ID, and computes 4-dimension scores using deterministic rule-based logic:
- Mission Value (30%): Business criticality, user count, system-of-record status, regulatory driver, workaround availability
- Technical Fit (25%): Language/framework currency, hosting model, support status, DB modernity
- Cost Efficiency (20%): Cloud spend, DB complexity, server count, interface complexity
- Risk Posture (25%): ATO status/expiration, POA&M items, continuous monitoring, critical interfaces
6R dispositions (Retain/Rehost/Replatform/Refactor/Replace/Retire) assigned via decision tree logic. Results stored in 4 DynamoDB tables with GSIs for fast querying.
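The weighted composite and decision-tree logic above can be sketched in plain Python. The weights match the stated methodology; the threshold values and the simplified tree below are illustrative, not the production rules, which draw on more inputs (ATO status, support status, hosting model, etc.).

```python
# Weights from the stated methodology (Mission Value 30%, Technical Fit 25%,
# Cost Efficiency 20%, Risk Posture 25%).
WEIGHTS = {
    "mission_value": 0.30,
    "technical_fit": 0.25,
    "cost_efficiency": 0.20,
    "risk_posture": 0.25,
}

def composite_score(scores: dict) -> float:
    """Weighted sum of the four dimension scores (each 0-100)."""
    return round(sum(scores[dim] * w for dim, w in WEIGHTS.items()), 1)

def disposition(scores: dict) -> str:
    """Illustrative 6R decision tree; thresholds here are made up for the sketch."""
    if scores["mission_value"] < 30:
        return "Retire"
    if scores["technical_fit"] < 40:
        return "Replace" if scores["cost_efficiency"] < 40 else "Refactor"
    if scores["technical_fit"] >= 70:
        return "Retain"
    return "Rehost" if scores["risk_posture"] >= 60 else "Replatform"

app = {"mission_value": 80, "technical_fit": 44, "cost_efficiency": 60, "risk_posture": 72}
print(composite_score(app), disposition(app))  # → 65.0 Rehost
```

Because the scoring is rule-based rather than model-generated, the same inputs always produce the same composite and disposition, which is what makes the disclosure claim verifiable.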
Why Lambda + DynamoDB?
Lambda provides serverless compute — no infrastructure to manage, scales to zero when idle. DynamoDB offers single-digit-ms reads with pay-per-request billing. GSIs on sourceAppId, targetAppId, appKey, and ciType enable fast lookups for dependency analysis. Scoring is deterministic (not AI-generated) — satisfying the disclosure requirement.
Knowledge Base & Vector Indexing
All 5 data extracts + scoring methodology converted to structured markdown documents and indexed into a Bedrock Knowledge Base. OpenSearch Serverless vector collection stores embeddings generated by Amazon Titan Embed Text v2. Enables semantic retrieval — the AI can find relevant data points even when questions don't use exact field names.
Why RAG (Retrieval-Augmented Generation)?
Instead of sending the entire dataset in every prompt, RAG retrieves only the most relevant records for each query. This enables: (1) More focused answers citing specific data points, (2) Scalability to the full 200+ app portfolio, (3) Traceability — every AI response shows which source documents were retrieved.
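The retrieve-then-augment step can be sketched as follows. The knowledge base ID and prompt template are placeholders; the retrieve call follows the bedrock-agent-runtime API shape.

```python
def augment_prompt(question: str, passages: list[str]) -> str:
    """Assemble the RAG prompt from retrieved KB passages, numbered for citation."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below and cite passage numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def retrieve_passages(question: str, kb_id: str = "KB_ID_PLACEHOLDER") -> list[str]:
    """Semantic retrieval from the Bedrock Knowledge Base (kb_id is a placeholder)."""
    import boto3
    client = boto3.client("bedrock-agent-runtime")
    resp = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
    )
    return [r["content"]["text"] for r in resp["retrievalResults"]]
```

Numbering the retrieved passages in the prompt is what lets the model cite specific source documents in its answer, giving the traceability described above.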
AI-Powered Analysis (Cross-Partition)
atlas-bedrock-proxy Lambda performs cross-partition Bedrock calls from GovCloud to commercial AWS using service-specific credentials with bearer token authentication. Credentials stored in Secrets Manager and rotated automatically.
Agentic RAG workflow: Query → Retrieve relevant documents from KB → Augment prompt with retrieved context + full portfolio scores → Generate analysis with Claude Opus → Cache response in DynamoDB (TTL: 24h)
Why Cross-Partition?
Amazon Bedrock with Anthropic Claude is not available in the GovCloud partition. The cross-partition pattern uses service-specific credentials (bearer token auth) to securely call Bedrock in commercial us-east-1 from a GovCloud Lambda. All credentials are encrypted via Secrets Manager. Only the bedrock:InvokeModel and bedrock:Converse actions are permitted, keeping the blast radius minimal.
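A sketch of the proxy's flow, under these assumptions: the secret name, model ID, and cache table layout are placeholders, and the bearer token is assumed to be a Bedrock API key that the SDK picks up from the AWS_BEARER_TOKEN_BEDROCK environment variable.

```python
import hashlib
import os
import time

def cache_item(prompt: str, answer: str, ttl_hours: int = 24) -> dict:
    """DynamoDB cache record; `ttl` is an epoch-seconds TTL attribute
    so expired responses are purged automatically after 24h."""
    return {
        "promptHash": hashlib.sha256(prompt.encode()).hexdigest(),
        "answer": answer,
        "ttl": int(time.time()) + ttl_hours * 3600,
    }

def call_bedrock(prompt: str, secret_name: str = "atlas/bedrock-token") -> str:
    """Cross-partition call: fetch the bearer token from Secrets Manager in
    GovCloud, then invoke Bedrock Converse in commercial us-east-1.
    Secret name and model ID are illustrative placeholders."""
    import boto3
    token = boto3.client("secretsmanager").get_secret_value(
        SecretId=secret_name
    )["SecretString"]
    os.environ["AWS_BEARER_TOKEN_BEDROCK"] = token  # assumed pickup by the SDK
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    resp = bedrock.converse(
        modelId="MODEL_ID_PLACEHOLDER",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]
```

Hashing the prompt gives a stable cache key, so repeated questions during the demo hit DynamoDB instead of making a second cross-partition call.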
API Layer
atlas-api-handler Lambda serves RESTful endpoints for the frontend:
GET /api/applications | GET /api/interfaces | GET /api/scores/summary | POST /api/analyze | GET /api/deliverables
Data endpoints serve directly from DynamoDB (sub-100ms). Analysis endpoint routes to Bedrock proxy with RAG context injection.
Why API Gateway + Lambda?
HTTP API (v2) provides low-latency request routing with built-in CORS. Lambda functions scale automatically and cost nothing when idle. Separation of data queries (fast, DynamoDB-backed) from AI analysis (Bedrock-backed) ensures the demo UI is responsive.
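The split between the fast data path and the AI path can be sketched as a routeKey dispatch in the handler. The response bodies below are placeholders standing in for the real DynamoDB reads and Bedrock-proxy invocation.

```python
import json

def handler(event: dict, context=None) -> dict:
    """Route an HTTP API (v2) event: data endpoints go to DynamoDB,
    the analyze endpoint goes to the Bedrock proxy (both stubbed here)."""
    route = event.get("routeKey", "")
    if route in ("GET /api/applications", "GET /api/interfaces",
                 "GET /api/scores/summary", "GET /api/deliverables"):
        body = {"source": "dynamodb"}        # fast, sub-100ms data path
    elif route == "POST /api/analyze":
        body = {"source": "bedrock-proxy"}   # AI path with RAG context injection
    else:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Keeping the two paths behind separate routes means a slow Bedrock call never blocks the dashboard's data queries.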
Deliverable Generation
7 deliverables generated programmatically from the scored data:
- Inventory & Scoring Workbook — Consolidated inventory with 4-dimension scoring and methodology
- Dependency Diagrams — 20×20 interface matrix, data flow by domain, modernization sequencing
- 6R Disposition List — Disposition with data-traced rationale + 6 consolidation opportunities
- Outcome Matrix — Per-app cost/performance/resilience/security outcomes + portfolio KPIs
- Risk Register — 12 scored risks with mitigation strategies and heatmap
- Cutover & Rollback Approach — 4-wave cutover with rollback triggers and T&E requirements
- Schedule with Assumptions — 36-activity master schedule (18 months) + 14 stated assumptions
Data Traceability
Every recommendation in every deliverable cites specific data points: App IDs, CI IDs, Interface IDs, ATO dates, FISMA levels, and POA&M counts. During the 45-minute Q&A, evaluators can trace any recommendation back to the source data.
Presentation & Authentication
Static web application hosted on S3 with Cognito user pool authentication. Dashboard, scoring matrix, interactive dependency graph, AI chat interface, and deliverable downloads — all served from GovCloud.
Security
Cognito provides user authentication with email-based login. All API calls authenticated. S3 website hosting in GovCloud with HTTPS. No data leaves the GovCloud boundary except cross-partition Bedrock calls (encrypted, bearer-token-authenticated).
Architecture Summary
Scoring Matrix - All Applications Ranked by Composite Score
Weights: Mission Value 30% | Technical Fit 25% | Cost Efficiency 20% | Risk Posture 25%. All scores tool-calculated (deterministic rules).
| Rank | App | Name | Mission Value | Technical Fit | Cost Efficiency | Risk Posture | Composite | Disposition |
|---|---|---|---|---|---|---|---|---|
Application Dependency Graph with CSP Boundaries
What this shows: Every application and every data connection between them, grouped by where they run (AWS, Azure, On-Prem, SaaS). Use this to understand which apps depend on each other before planning any migration.
How to read it: Each box is an app. Lines are data flows (APIs, shared databases, batch files). Thicker/redder lines = more critical connections. Dashed lines cross cloud boundaries. Click any app to see its details.
AI-Powered Portfolio Analysis (RAG)
1. Query sent to Bedrock Knowledge Base which retrieves relevant records from the indexed portfolio data
2. Retrieved context + scored portfolio data augmented into prompt
3. Claude Opus generates analysis with mandatory data citations
4. Scoring and 6R dispositions shown are tool-calculated (deterministic, not AI)
Generated Deliverables
All 7 deliverables generated from FAA-provided data using the AWS pipeline above.
Upload Source Data
Upload updated Excel files to replace source data. After uploading, run the Glue ETL pipeline to reprocess scores and dispositions.
ETL Pipeline (AWS Glue)
Run the Glue ETL job to reprocess uploaded data, recompute scores and 6R dispositions, and reload DynamoDB tables.
1. Read 5 Excel files from S3 (raw/ prefix)
2. Parse and merge records by Application ID
3. Compute 4-dimension scores (Mission Value, Technical Fit, Cost, Risk)
4. Assign 6R dispositions via decision tree
5. Load results to DynamoDB (4 tables)
Engine: AWS Glue 4.0 (Spark), 2x G.1X workers
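Step 2 of the pipeline above, merging records by Application ID, can be illustrated in plain Python (the Glue job does this in Spark; the field names below are hypothetical).

```python
def merge_sources(*sources: list) -> dict:
    """Fold rows from each extract into one record per Application ID.
    Later sources add or overwrite fields on the same app's record."""
    merged: dict = {}
    for rows in sources:
        for row in rows:
            merged.setdefault(row["appId"], {}).update(row)
    return merged

# Hypothetical rows from two of the five extracts:
cmdb = [{"appId": "APP-001", "ciCount": 6}]
biz = [{"appId": "APP-001", "criticality": "High"}]
print(merge_sources(cmdb, biz)["APP-001"])
# → {'appId': 'APP-001', 'ciCount': 6, 'criticality': 'High'}
```

The merged records then feed the same deterministic scoring and 6R decision-tree logic before being loaded back into the four DynamoDB tables.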
Recent Runs
Import AWS Migration Evaluator Results
Upload the ME Data Export XLSX to enrich ATLAS with EC2 right-sizing, pricing models, Graviton eligibility, and RDS recommendations. Each app detail will show ME cost analysis alongside ATLAS scores.