
| Setting | Select / Enter |
|---|---|
| Name | My-EMR-Spark-Hive-Cluster (or any name) |
| Application Bundle | Custom |
| Select Applications | ✅ Hadoop ✅ Hive ✅ Spark (Others can remain, but ensure these 3 are selected) |
| AWS Glue Data Catalog (optional) | You can leave unchecked for now |
| Operating System | ✅ Amazon Linux (default) |

| Option | Select |
|---|---|
| Cluster configuration method | Uniform instance groups |

| Node Type | Instance Type | Count |
|---|---|---|
| Primary (Master) | m5.xlarge (or r8g.xlarge if available in your region) | 1 |
| Core | m5.xlarge | 2 |
| Task | Leave empty (not required for now) | 0 |

| Group | Instance Type | Instance Count |
|---|---|---|
| Core | m5.xlarge | 2 |
| Task | Not needed | 0 |

| Setting | Select |
|---|---|
| VPC | Default VPC (or your VPC) |
| Subnet | Any public subnet |
| Security Group | Default (EMR will auto-create security groups if needed) |

| Setting | Choose |
|---|---|
| Security configuration | Leave default |
| EC2 Key Pair | ✅ Select your SSH key (or create new) |
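
The console choices above can also be written as a single API request, which is handy for recreating the cluster later. The following is a minimal sketch using boto3's `run_job_flow`, not a definitive setup. Several values are assumptions that do not appear in the tables: the release label `emr-7.1.0`, the default role names `EMR_DefaultRole` and `EMR_EC2_DefaultRole`, the `KeepJobFlowAliveWhenNoSteps` flag (set so the cluster stays up for SSH access), and the placeholder key pair, subnet ID, and region. The helper names `build_cluster_request` and `create_cluster` are hypothetical.

```python
import json


def build_cluster_request(key_name, subnet_id, release_label="emr-7.1.0"):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    Mirrors the console settings: Hadoop, Hive and Spark on uniform
    instance groups with 1 primary and 2 core m5.xlarge nodes, no task nodes.
    """
    return {
        "Name": "My-EMR-Spark-Hive-Cluster",
        # Assumption: the tables do not name an EMR release.
        "ReleaseLabel": release_label,
        "Applications": [{"Name": name} for name in ("Hadoop", "Hive", "Spark")],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 2,
                },
            ],
            "Ec2KeyName": key_name,    # your SSH key pair name
            "Ec2SubnetId": subnet_id,  # a public subnet in your VPC
            # Assumption: keep the cluster running when no steps are queued.
            "KeepJobFlowAliveWhenNoSteps": True,
            # Security groups are omitted so EMR creates its managed defaults.
        },
        # Assumption: the default EMR roles already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def create_cluster(request, region_name):
    """Submit the request to EMR and return the new cluster (job flow) ID."""
    import boto3  # imported here so the request can be inspected without boto3

    emr = boto3.client("emr", region_name=region_name)
    response = emr.run_job_flow(**request)
    return response["JobFlowId"]


if __name__ == "__main__":
    # Placeholder values; replace them with your own key pair and subnet.
    request = build_cluster_request("my-key-pair", "subnet-0123456789abcdef0")
    print(json.dumps(request, indent=2))
    # To actually launch (this incurs AWS charges), uncomment:
    # print(create_cluster(request, region_name="us-east-1"))
```

Keeping request construction separate from the API call lets you review or version the settings before anything is launched. Leaving out the Glue Data Catalog configuration matches the unchecked option in the first table.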
















