About

This Pathway introduces the Hadoop ecosystem and explains how to plan capacity for a Hadoop cluster so that it meets both functional and non-functional requirements.

This Pathway on Data Engineering - Processing, Storage, and Capacity Planning gives an overview of the architectures of HBase (a NoSQL database) and Hive (a SQL data warehouse on Hadoop), and shows how to size a Hadoop cluster.

By the end of this Pathway, you will be able to:

  1. Describe the architectures of Hive and HBase for storing data efficiently
  2. Describe the Spark architecture for processing real-time data
  3. Describe how to create Hadoop clusters and size them to meet non-functional requirements