HBase series: Components
HBase Components
HBase is a distributed data store, and the diagram above shows how tables are spread among the different Region Servers (RS). That’s because data is partitioned, and these partitions, called regions, are automatically distributed among Region Servers (auto-sharding).
Data is ultimately written to and read from regions; as I mentioned earlier, think of a region as a database partition. Regions are managed by Region Servers (RS), and Region Servers are in turn handled by the HMaster (HM) and monitored by ZooKeeper (ZK). When a client wants to read or write data in HBase, it first asks ZooKeeper where the hbase:meta table lives, looks up in that table which Region Server serves the region for the row, and then sends the request to that Region Server.
From version 3.X, the client also needs the master in some cases when reading or writing data.
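To make the lookup idea more concrete, here is a minimal sketch of how a row key can be mapped to a region and then to a Region Server. It is a toy model in plain Python, not the HBase client API: the split points, the host names and the functions `locate_region` and `locate_server` are all made up for illustration, and I'm assuming each region covers a contiguous, sorted range of row keys.

```python
from bisect import bisect_right

# Hypothetical table split into three regions. Each region covers the row keys
# from its start key (inclusive) up to the next region's start key (exclusive).
# The empty string marks the first region, which starts at the lowest key.
REGION_START_KEYS = ["", "g", "p"]

# Hypothetical assignment of each region (identified by its start key)
# to a Region Server. In a real cluster HBase keeps track of this itself.
REGION_TO_SERVER = {
    "": "rs1.example.com",
    "g": "rs2.example.com",
    "p": "rs1.example.com",
}


def locate_region(row_key: str) -> str:
    """Return the start key of the region whose key range contains row_key."""
    # bisect_right counts how many start keys are <= row_key; the last of
    # those is the start key of the region that holds the row.
    index = bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_START_KEYS[index]


def locate_server(row_key: str) -> str:
    """Return the Region Server that serves the region containing row_key."""
    return REGION_TO_SERVER[locate_region(row_key)]
```

With these made-up split points, `locate_server("hbase")` falls in the region that starts at `"g"`, so it resolves to `rs2.example.com`, while `locate_server("apple")` and `locate_server("zebra")` both resolve to `rs1.example.com`. Notice that one Region Server can serve several regions of the same table.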
The HMaster's main responsibilities are administration tasks and managing how regions are assigned to Region Servers. A Region Server's responsibility is to serve and manage its regions. Some tasks, such as compaction (which, among other things, physically deletes records previously marked for deletion), are executed by the RS on each of its regions.
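The "marked to be deleted, physically removed later" part of compaction can be sketched in a few lines. This is a deliberately simplified model, not HBase's implementation: `Cell` and `compact` are hypothetical names, a delete is represented by a cell whose value is `None`, and I'm ignoring column families, version limits and the difference between minor and major compactions.

```python
from typing import Dict, List, NamedTuple, Optional


class Cell(NamedTuple):
    row: str
    timestamp: int
    value: Optional[str]  # None marks a delete marker (assumption of this sketch)


def compact(store_files: List[List[Cell]]) -> List[Cell]:
    """Merge several store files into one, physically dropping deleted data."""
    # First pass: remember the newest delete marker seen for each row.
    deleted_at: Dict[str, int] = {}
    for store_file in store_files:
        for cell in store_file:
            if cell.value is None:
                newest = max(deleted_at.get(cell.row, -1), cell.timestamp)
                deleted_at[cell.row] = newest

    # Second pass: keep only real values written after the row's delete marker.
    # The delete markers themselves are dropped as well.
    merged = [
        cell
        for store_file in store_files
        for cell in store_file
        if cell.value is not None and cell.timestamp > deleted_at.get(cell.row, -1)
    ]

    # Return the survivors sorted by row, newest version first within a row.
    return sorted(merged, key=lambda cell: (cell.row, -cell.timestamp))


# Two hypothetical files: the newer one deletes row1 and updates row2.
older_file = [Cell("row1", 1, "a"), Cell("row2", 1, "b")]
newer_file = [Cell("row1", 2, None), Cell("row2", 3, "c")]
compacted = compact([older_file, newer_file])
```

In this example `compacted` contains only the two versions of `row2`: the value of `row1` and its delete marker are both gone, which is the physical deletion mentioned above. Until a compaction like this runs, the deleted value would still be sitting in the older file.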
Resources
- HBase Architecture: official HBase documentation on architecture
- Regions guidelines: documentation on how to determine region count and size