| Azure Service | Purpose in Project |
| --- | --- |
| Azure Data Lake Storage Gen2 (ADLS Gen2) | Stores raw and transformed datasets (ETL input and output). |
| Azure HDInsight (Spark Cluster) | Provides a managed Spark environment for ETL workloads. |
| Azure Databricks | Optimized Spark platform with ML/AI integration; benchmarked against HDInsight. |
| Azure Synapse Analytics | Data warehouse for storing transformed data and running analytical queries. |
| Azure Monitor + Log Analytics | Collects performance metrics and logs, and supports cost analysis. |
| Azure Cost Management + Pricing Calculator | Provides cost benchmarking between HDInsight and Databricks. |
| Azure Key Vault (IAM + Security) | Manages access, credentials, and secure keys for the services. |
| Power BI / Synapse Studio | Data visualization and reporting for the benchmarking results. |
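
Because both Spark platforms read from and write to the same ADLS Gen2 storage, it can help to derive the input and output locations from one helper so the HDInsight and Databricks jobs point at identical data. The following is a minimal sketch, not the project's actual configuration: the storage account name `examplelake`, the container names `raw` and `transformed`, and the `events/` folder are hypothetical placeholders. Only the `abfss://<container>@<account>.dfs.core.windows.net/<path>` URI form is standard for ADLS Gen2.

```python
def adls_uri(container: str, account: str, path: str = "") -> str:
    """Build an abfss:// URI for a path inside an ADLS Gen2 container."""
    if not container or not account:
        raise ValueError("container and account are required")
    # Strip leading slashes so the result never contains a double slash after the host.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical names, used for illustration only.
ACCOUNT = "examplelake"
RAW_EVENTS = adls_uri("raw", ACCOUNT, "events/")                   # ETL input
TRANSFORMED_EVENTS = adls_uri("transformed", ACCOUNT, "/events/")  # ETL output

print(RAW_EVENTS)
print(TRANSFORMED_EVENTS)
```

Passing the same two URIs to the ETL job on each cluster keeps the comparison limited to the compute platform rather than the data location.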