Apache Hadoop Open Source Tools for Big Data Projects
| S. No. | Tool | Type | Used For | Page Link |
|---|---|---|---|---|
| 1 | Apache Accumulo | NoSQL Wide-Column Store | Data Access | Apache Accumulo – Hadoop Open Source Code |
| 2 | Apache Ambari | Distributed Computing | Operations | Apache Ambari – Hadoop Open Source Code |
| 3 | Apache Atlas | Metadata Management | Data Management, Data Governance and Integration | Apache Atlas – Hadoop Open Source Code |
| 4 | Apache Falcon | Data Management Framework | Data Governance and Integration | Apache Falcon – Hadoop Open Source Code |
| 5 | Apache Flume | Data Transfer into HDFS | Data Governance and Integration | Apache Flume – Hadoop Open Source Code |
| 6 | Apache Hadoop | Batch Processing | Data Management, Data Access, Data Governance and Integration, Security, Operations | Apache Hadoop – Hadoop Open Source Code |
| 7 | Apache Hadoop HDFS | Big Data Storage | Data Management, Data Governance and Integration, Security | Apache Hadoop HDFS – Hadoop Open Source Code |
| 8 | Apache Hadoop MapReduce | Batch Processing | Data Access | Apache Hadoop MapReduce – Hadoop Open Source Code |
| 9 | Apache Hadoop YARN | Resource Management and Job Scheduling Layer | Data Management | Apache Hadoop YARN – Hadoop Open Source Code |
| 10 | Apache HBase | NoSQL Database | Data Access | Apache HBase – Hadoop Open Source Code |
| 11 | Apache Hive | Data Warehouse (SQL-like queries) | Data Access | Apache Hive – Hadoop Open Source Code |
| 12 | Apache Kafka | Stream Processing | Data Governance and Integration | Apache Kafka – Hadoop Open Source Code |
| 13 | Apache Knox Gateway | Security Entry Point | Security | Apache Knox Gateway – Hadoop Open Source Code |
| 14 | Apache Oozie | Workflow Scheduler | Operations | Apache Oozie – Hadoop Open Source Code |
| 15 | Apache Phoenix | SQL Database | Data Access | Apache Phoenix – Hadoop Open Source Code |
| 16 | Apache Pig | High-Level Scripting Language | Data Access | Apache Pig – Hadoop Open Source Code |
| 17 | Apache Ranger | Data Security Framework | Security | Apache Ranger – Hadoop Open Source Code |
| 18 | Apache Slider | YARN-Based Framework | Data Access | Apache Slider – Hadoop Open Source Code |
| 19 | Apache Solr | Search Platform | Data Access | Apache Solr – Hadoop Open Source Code |
| 20 | Apache Spark | Hybrid Framework (Batch and Stream) | Data Access | Apache Spark – Hadoop Open Source Code |
| 21 | Apache Sqoop | Data Transfer Tool | Data Governance and Integration | Apache Sqoop – Hadoop Open Source Code |
| 22 | Apache Storm | Distributed Stream Processing | Data Access | Apache Storm – Hadoop Open Source Code |
| 23 | Apache Tez | YARN-Based Framework | Data Access | Apache Tez – Hadoop Open Source Code |
| 24 | Apache Zeppelin | Web-Based Notebook | Data Analytics | Apache Zeppelin – Hadoop Open Source Code |
| 25 | Apache ZooKeeper | Distributed Computing | Operations | Apache ZooKeeper – Hadoop Open Source Code |
| 26 | Druid | Data Store | BI Queries | Druid – Hadoop Open Source Code |
| 27 | Apache Samza | Stream Processing | Data Analytics | Apache Samza – Hadoop Open Source Code |
| 28 | Apache Flink | Stream Processing | Data Analytics, ML Algorithms | Apache Flink – Hadoop Open Source Code |