| Category | AWS Service / Technology | Purpose |
| --- | --- | --- |
| Big Data Processing | Amazon EMR | Distributed preprocessing of large image datasets using Spark/Hadoop. |
| Machine Learning | Amazon SageMaker | Train deep learning models using distributed GPU/CPU resources. |
| Data Storage | Amazon S3 | Store raw images, processed data, and trained model artifacts. |
| Monitoring | Amazon CloudWatch | Track EMR cluster health, SageMaker job status, and model performance metrics. |
| Compute Scaling | SageMaker Managed Spot Training / EMR Auto Scaling | Use Spot capacity for training and scale EMR nodes with load to reduce cost. |
| Security & Access | AWS IAM | Control access to S3, EMR, and SageMaker resources with roles and policies. |
| Workflow Orchestration | AWS Step Functions (Optional) | Orchestrate preprocessing, training, and deployment tasks. |
| Notification | Amazon SNS | Notify stakeholders when training or inference completes. |
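
To make the orchestration row more concrete, here is a minimal sketch of how the EMR, SageMaker, and SNS pieces could be chained in a Step Functions state machine. It only builds the Amazon States Language definition as a Python dictionary and does not call AWS. Every identifier in it (bucket name, cluster ID, role and topic ARNs, training image, script path, instance type) is a hypothetical placeholder rather than a value from this project, and the deployment task is left out for brevity.

```python
import json

# All identifiers below are hypothetical placeholders, not values from this project.
BUCKET = "example-image-bucket"
CLUSTER_ID = "j-EXAMPLECLUSTER"
ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:example-training-events"
TRAINING_IMAGE = "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-train:latest"


def build_definition():
    """Return an Amazon States Language definition as a plain dict."""
    return {
        "Comment": "Preprocess on EMR, train on SageMaker, notify via SNS",
        "StartAt": "Preprocess",
        "States": {
            # Submit a Spark step to an existing EMR cluster and wait for it.
            "Preprocess": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId": CLUSTER_ID,
                    "Step": {
                        "Name": "spark-preprocess",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", f"s3://{BUCKET}/code/preprocess.py"],
                        },
                    },
                },
                "Next": "Train",
            },
            # Run a SageMaker training job on Spot capacity and wait for it.
            "Train": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {
                    # Reuse the execution name so each run gets a unique job name.
                    "TrainingJobName.$": "$$.Execution.Name",
                    "RoleArn": ROLE_ARN,
                    "AlgorithmSpecification": {
                        "TrainingImage": TRAINING_IMAGE,
                        "TrainingInputMode": "File",
                    },
                    "InputDataConfig": [{
                        "ChannelName": "train",
                        "DataSource": {"S3DataSource": {
                            "S3DataType": "S3Prefix",
                            "S3Uri": f"s3://{BUCKET}/processed/",
                            "S3DataDistributionType": "FullyReplicated",
                        }},
                    }],
                    "OutputDataConfig": {"S3OutputPath": f"s3://{BUCKET}/models/"},
                    "ResourceConfig": {
                        "InstanceType": "ml.p3.2xlarge",
                        "InstanceCount": 2,
                        "VolumeSizeInGB": 50,
                    },
                    "EnableManagedSpotTraining": True,
                    # With Spot enabled, the wait limit must cover the runtime limit.
                    "StoppingCondition": {
                        "MaxRuntimeInSeconds": 3600,
                        "MaxWaitTimeInSeconds": 7200,
                    },
                },
                "Next": "Notify",
            },
            # Publish a completion message to the SNS topic.
            "Notify": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sns:publish",
                "Parameters": {
                    "TopicArn": TOPIC_ARN,
                    "Message": "Training pipeline finished.",
                },
                "End": True,
            },
        },
    }


def state_order(definition):
    """Follow the Next links from StartAt and return the state names in order."""
    order, name = [], definition["StartAt"]
    while name is not None:
        order.append(name)
        name = definition["States"][name].get("Next")
    return order


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

The printed JSON could serve as the definition of a state machine whose IAM role is allowed to call EMR, SageMaker, and SNS. CloudWatch does not appear as a state here because the services emit their metrics and logs on their own.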