Yushan Lu's Blog: Enterprise Hadoop Cluster Architecture

Saturday, October 22, 2016

Enterprise Hadoop Cluster Architecture

Master services: each is a single point of failure (SPOF) when it runs on one node

NameNode, JobTracker / ResourceManager
Hive Metastore, HiveServer2
Impala StateStore, Catalog Server
Spark Master

Node kernel environment setup

ulimit: configure nofile in /etc/security/limits.conf
By default, Linux allows only 1024 open file descriptors per process.
This is not enough for a Hadoop cluster or HBase, so the limit needs to be raised.
THP (Transparent Huge Pages), ACPI, memory overcommit
THP: A relatively new Linux kernel feature. Its page management may not work well with Hadoop workloads and can cause high memory usage, so it is better to turn it off.
ACPI: Power management. This may also cause high memory usage, so it is better to turn it off.
Memory overcommit: In strict mode the kernel refuses to give an application more memory once total committed memory passes a limit (by default swap plus 50% of RAM), so either raise the ratio or turn strict mode off.
Customize the configuration for each type of functional node
High memory usage? How should swap be configured?
High disk I/O? Then a high overcommit setting may not be needed.
High CPU usage or high system load?
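
The kernel settings listed above can be checked with a short script. The following is a minimal Python sketch, not part of the original setup notes: the function names are made up for illustration, and the commit limit formula is the simplified form (swap plus a percentage of RAM) that ignores huge page reservations.

```python
import re


def current_nofile_limit():
    """Return the (soft, hard) open-file limits of this process (Unix only)."""
    import resource  # imported here because the module exists only on Unix
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def active_thp_mode(sysfs_text):
    """Extract the active THP mode from text such as '[always] madvise never'.

    The kernel marks the active mode with square brackets in
    /sys/kernel/mm/transparent_hugepage/enabled.
    """
    match = re.search(r"\[(\w+)\]", sysfs_text)
    return match.group(1) if match else None


def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50):
    """Approximate commit limit in strict overcommit mode.

    Simplified assumption: swap + RAM * ratio / 100, ignoring huge pages.
    """
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100


if __name__ == "__main__":
    soft, hard = current_nofile_limit()
    print(f"nofile soft={soft} hard={hard}")
    try:
        with open("/sys/kernel/mm/transparent_hugepage/enabled") as fh:
            print("THP mode:", active_thp_mode(fh.read()))
    except OSError:
        print("THP status file not found on this system")
    # Hypothetical node: 64 GiB RAM, 8 GiB swap, default ratio of 50
    print("commit limit (kB):", commit_limit_kb(64 * 1024**2, 8 * 1024**2))
```

Running it on each node type (master, DataNode, HBase region server) gives a quick view of which nodes still use the defaults.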

Enterprise Hadoop Cluster Data Management

HDFS config

HDFS block size: dfs.block.size (named dfs.blocksize in Hadoop 2.x and later)
Replication Factor: dfs.replication, default is 3
Enable permission checking with dfs.permissions? Set the default umask with fs.permissions.umask-mode
User permissions
Keep the DataNode's dfs data partition separate from the Linux system partition; otherwise a full disk may prevent the system from starting.
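
To see how block size, replication factor, and umask interact, here is a small Python sketch. It is only an illustration under stated assumptions: the 128 MiB block size and the 1000 MiB file are example values, and the function names are hypothetical.

```python
def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed for one file (the last block may be partial)."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


def raw_storage_bytes(file_size_bytes, replication=3):
    """Disk space used across DataNodes: each byte is stored `replication` times."""
    return file_size_bytes * replication


def effective_mode(requested_mode, umask):
    """Apply a umask to a requested permission mode."""
    return requested_mode & ~umask


if __name__ == "__main__":
    MiB = 1024 ** 2
    size = 1000 * MiB   # hypothetical file size
    block = 128 * MiB   # assumed block size, set through dfs.block.size
    print("blocks:", block_count(size, block))
    print("raw storage (MiB):", raw_storage_bytes(size) // MiB)
    print("file mode with umask 022:", oct(effective_mode(0o666, 0o022)))
```

Since every replica takes space on the DataNode's data partition, this kind of estimate also helps when sizing that partition.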

Resource Allocation

CPU, Memory, Disk IO, Network IO
HDFS, MapReduce (YARN), Hive jobs
Resource allocation and limits for HBase, Hive, Impala, and Spark
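
As a rough illustration of balancing CPU and memory on a worker node, the sketch below estimates how many equally sized containers fit on one node. This is a back-of-the-envelope calculation with hypothetical parameter names and numbers, not the logic of any real scheduler.

```python
def containers_per_node(node_mem_mb, node_vcores, reserved_mem_mb,
                        reserved_vcores, container_mem_mb, container_vcores=1):
    """Estimate how many equally sized containers fit on one worker node.

    Memory and cores reserved for the OS and other services (for example
    HBase or Impala daemons) are subtracted first; the scarcer resource
    then decides the result.
    """
    usable_mem = max(node_mem_mb - reserved_mem_mb, 0)
    usable_vcores = max(node_vcores - reserved_vcores, 0)
    by_memory = usable_mem // container_mem_mb
    by_cpu = usable_vcores // container_vcores
    return min(by_memory, by_cpu)


if __name__ == "__main__":
    # Hypothetical node: 64 GiB RAM, 16 cores, 8 GiB and 2 cores reserved
    print(containers_per_node(65536, 16, 8192, 2, container_mem_mb=4096))
```

If memory allows far more containers than CPU does (or the reverse), one resource sits idle, which is a hint to resize the containers.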

Enterprise Hadoop Cluster Task Scheduling

Oozie

Dispatches HDFS operations, MapReduce jobs, Pig, Hive, Sqoop, Java applications, shell scripts, and email
Takes advantage of cluster resources
An alternative to cron jobs
Starts jobs in a distributed manner
Starts multiple jobs in parallel
Supports fault tolerance, retries, and alerts within a workflow
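
The workflow ideas above (parallel start, retry, alert on failure) can be sketched in a few lines of Python. This is not Oozie's API or workflow format; it is a toy model with hypothetical function names that only shows the control flow.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retry(action, max_retries=2, on_failure=None):
    """Run `action`; retry on exception, then alert and re-raise."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return action()
        except Exception as exc:
            if attempts > max_retries:
                if on_failure is not None:
                    on_failure(exc)  # stands in for an email or alert step
                raise


def fork_join(actions, max_retries=2, on_failure=None):
    """Start all actions in parallel (fork) and wait for every result (join)."""
    with ThreadPoolExecutor(max_workers=max(len(actions), 1)) as pool:
        futures = [pool.submit(run_with_retry, a, max_retries, on_failure)
                   for a in actions]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Two placeholder jobs standing in for, say, a Hive query and a Sqoop import
    print(fork_join([lambda: "hive done", lambda: "sqoop done"]))
```

A plain cron job has none of this built in, which is the main reason to prefer a workflow scheduler for dependent cluster jobs.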

ZooKeeper

Synchronizes node state
Handles communication between HBase's master server and region servers
Synchronizes region state for HBase tables
Is a region online or splitting?
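
To make the coordination role more concrete, here is an in-memory toy model in Python. It is not a ZooKeeper client and does not talk to a real ensemble; the class, the paths, and the state strings are all assumptions chosen for illustration. It only shows the idea that a server's entries disappear when its session ends, so the master can see which servers are alive.

```python
class ToyCoordinator:
    """In-memory stand-in for session-owned (ephemeral) nodes."""

    def __init__(self):
        self._nodes = {}  # path -> (owner session id, data)

    def create_ephemeral(self, session_id, path, data=""):
        if path in self._nodes:
            raise KeyError(f"node already exists: {path}")
        self._nodes[path] = (session_id, data)

    def set_data(self, path, data):
        owner, _ = self._nodes[path]
        self._nodes[path] = (owner, data)

    def get_data(self, path):
        return self._nodes[path][1]

    def expire_session(self, session_id):
        """Drop every node owned by a session, as when a server dies."""
        self._nodes = {p: v for p, v in self._nodes.items()
                       if v[0] != session_id}

    def children(self, prefix):
        """List the direct children registered under `prefix`."""
        prefix = prefix.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self._nodes if p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)


if __name__ == "__main__":
    zk = ToyCoordinator()
    zk.create_ephemeral("session-1", "/hbase/rs/server1")
    zk.create_ephemeral("session-2", "/hbase/rs/server2")
    zk.expire_session("session-2")      # server2 crashes
    print(zk.children("/hbase/rs"))     # only server1 remains registered
```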

Hadoop Cluster Monitors

Cloudera Manager's Monitor Tool

Powerful, with a rich set of metrics; monitors Impala well
Consumes a lot of system resources

Ganglia

Monitors by collecting Hadoop metrics
Uses its own gmond daemon to collect CPU, memory, and network I/O
Collects JMX, IPC, and RPC data
Weak point: no disk I/O metrics by default

Graphite

Good third-party plugins; Java applications can push KPIs (StatsD, Yammer Metrics)
Can collect server data using plugins (collectd, logster, jmxtrans)
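
Pushing a metric to Graphite is simple because its plaintext protocol is one line per data point: metric path, value, and Unix timestamp. The Python sketch below formats such lines and sends them over TCP. The metric name and host are made-up examples, and port 2003 is assumed to be the Carbon plaintext listener, which is the usual default.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point as 'path value timestamp' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(lines, host, port=2003, timeout=5.0):
    """Send pre-formatted lines to a Carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric name; print instead of sending so this runs offline
    print(graphite_line("hadoop.namenode.heap_used_mb", 512), end="")
    # send_to_graphite([...], host="graphite.example.com")  # assumed host
```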

Splunk: expensive

Nagios, Icinga

Hadoop Cluster Issue and Limitation

Issues

HA: too many services run on a single node
Dangerous to upgrade
Wastes a lot of memory

Limitations

No solution for spanning multiple data centers
High concurrency is a bottleneck
Impala and Spark cannot handle too many queries at the same time, and there is no solution yet
Whatever the model or database type, joins are a big issue
Spark is still not good at this; it is more like MapReduce + Pig
