| Google Cloud Service / Technology | Role |
| --- | --- |
| Cloud Storage | Serves as the data lake, storing raw datasets and Iceberg tables. |
| Dataproc (Spark) | Runs Spark clusters for distributed processing and querying of Iceberg tables. |
| Apache Iceberg | Open table format that provides ACID transactions, schema evolution, and time travel for lakehouse datasets. |
| BigQuery + BigLake | Enables serverless querying of, and federated access to, Iceberg tables stored in Cloud Storage. |
| Looker Studio / BigQuery BI Engine | Provides dashboards for data visualization and insight into query performance. |
| Cloud Monitoring & Logging | Captures benchmarking metrics, query latency, job execution times, and system performance logs. |
| IAM & Cloud KMS | IAM handles data access control and permissions; Cloud KMS manages the encryption keys that protect data at rest. |
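
To make the Cloud Storage, Dataproc, Iceberg, and BigLake rows more concrete, the following is a minimal sketch of how the pieces might be wired together. It assumes an Iceberg Hadoop-type catalog whose warehouse lives in a Cloud Storage bucket; the source does not state which catalog type is used, so treat that choice as an assumption. The catalog name `lake`, the bucket `example-bucket`, the table and connection names, and both helper functions are hypothetical and exist only for illustration.

```python
from typing import Dict


def iceberg_spark_conf(catalog: str, warehouse_uri: str) -> Dict[str, str]:
    """Return Spark properties that register an Iceberg catalog on Cloud Storage.

    Assumption: a Hadoop-type Iceberg catalog; other catalog types need
    different properties.
    """
    if not warehouse_uri.startswith("gs://"):
        raise ValueError("warehouse_uri must be a Cloud Storage URI (gs://...)")
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Enables Iceberg SQL extensions (e.g. CALL procedures, MERGE INTO).
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the named catalog as an Iceberg catalog.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hadoop",
        # Table data and metadata are written under this bucket path.
        f"{prefix}.warehouse": warehouse_uri,
    }


def biglake_iceberg_ddl(table: str, connection: str, metadata_uri: str) -> str:
    """Build BigQuery DDL for an external table over existing Iceberg metadata."""
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"WITH CONNECTION `{connection}`\n"
        f"OPTIONS (format = 'ICEBERG', uris = ['{metadata_uri}']);"
    )


if __name__ == "__main__":
    # Hypothetical names throughout; substitute real project resources.
    for key, value in iceberg_spark_conf("lake", "gs://example-bucket/warehouse").items():
        print(f"--conf {key}={value}")
    print(biglake_iceberg_ddl(
        "example_dataset.trips",
        "example-project.us.example-connection",
        "gs://example-bucket/warehouse/db/trips/metadata/v1.metadata.json",
    ))
```

The printed `--conf` pairs could be passed as Spark properties when submitting a job to a Dataproc cluster, and the generated DDL could be run in BigQuery once a Cloud resource connection with read access to the bucket exists. Both steps also require the Iceberg Spark runtime to be available on the cluster, which this sketch does not set up.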