| Service/Technology | Role |
| --- | --- |
| Cloud Storage | Raw data ingestion + staging + processed data storage. |
| Dataproc (Spark/Hadoop) | Distributed batch processing (ETL transformations, aggregations). |
| BigQuery | Data warehouse for analytics and visualization. |
| Cloud Composer (Apache Airflow) | Workflow orchestration, scheduling, dependency management. |
| Pub/Sub (optional) | Event-driven triggers for batch pipeline runs. |
| Cloud Monitoring & Logging | Tracking pipeline performance, error alerts, and logs. |
| IAM & Cloud KMS | Secure access control + data encryption. |
| Looker Studio / BI Engine | Business intelligence dashboards and query acceleration. |
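
To make the "dependency management" role of the orchestrator concrete, here is a minimal, library-free sketch of how the batch stages could be ordered. The stage names (`ingest_raw`, `transform`, and so on) and the linear dependency chain are assumptions made for illustration, not something the table specifies; in a real deployment this ordering would typically be expressed as an Airflow DAG in Cloud Composer rather than with Python's standard `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage names, each mapped to the service from the table
# that would plausibly handle it.
STAGE_SERVICE = {
    "ingest_raw": "Cloud Storage",
    "transform": "Dataproc (Spark/Hadoop)",
    "stage_processed": "Cloud Storage",
    "load_warehouse": "BigQuery",
    "refresh_dashboards": "Looker Studio / BI Engine",
}

# stage -> set of stages that must finish first (assumed ordering).
DEPENDS_ON = {
    "ingest_raw": set(),
    "transform": {"ingest_raw"},
    "stage_processed": {"transform"},
    "load_warehouse": {"stage_processed"},
    "refresh_dashboards": {"load_warehouse"},
}


def run_order(depends_on):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for stage in run_order(DEPENDS_ON):
        print(f"{stage}: {STAGE_SERVICE[stage]}")
```

Because the assumed chain is strictly linear, the resulting order is fully determined. An orchestrator adds scheduling, retries, and alerting on top of this ordering, which is why those concerns appear as separate rows in the table.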