| AWS Service | Role |
| --- | --- |
| Amazon EMR (Hadoop + Hive) | Baseline batch-oriented data warehouse |
| Amazon MSK (Kafka) | Real-time streaming ingestion backbone |
| Amazon EMR (Spark Streaming) | Real-time stream processing engine |
| Amazon Kinesis Data Analytics (Flink) | Alternative for stream analytics |
| Amazon S3 | Centralized data lake (raw + transformed) |
| Amazon Redshift | Data warehouse for analytical queries |
| Amazon Athena | Interactive queries across S3 datasets |
| AWS Lambda | Event-driven orchestration & stream triggers |
| Amazon CloudWatch | Monitoring performance, cost, and system health |
| Amazon QuickSight | Visualization of migration impact & performance metrics |
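
To make the Lambda "stream triggers" role more concrete, the following is a minimal sketch of a handler invoked by an Amazon MSK event source mapping. It relies on the documented event shape, in which records are grouped under topic-partition keys and each record's `value` is base64-encoded. The function names, the `orders` topic used in the usage note, and the assumption that message values are JSON are all hypothetical and not taken from the architecture above.

```python
import base64
import json


def decode_msk_records(event):
    """Flatten an MSK-triggered Lambda event into a list of decoded messages."""
    messages = []
    # Records arrive grouped under "<topic>-<partition>" keys.
    for records in event.get("records", {}).values():
        for record in records:
            # Kafka message values are delivered base64-encoded.
            payload = base64.b64decode(record["value"]).decode("utf-8")
            messages.append({
                "topic": record["topic"],
                "partition": record["partition"],
                "offset": record["offset"],
                "body": json.loads(payload),  # assumption: JSON-encoded values
            })
    return messages


def handler(event, context):
    """Hypothetical Lambda entry point; downstream writes (e.g. to S3) are omitted."""
    messages = decode_msk_records(event)
    return {"decoded": len(messages)}
```

Locally, the handler can be exercised with a hand-built event such as `{"records": {"orders-0": [{"topic": "orders", "partition": 0, "offset": 5, "value": "<base64 JSON>"}]}}`. Keeping the decoding in a separate function makes it testable without any AWS dependency.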