Saturday, October 22, 2016

Enterprise Hadoop Cluster Architecture

Enterprise Hadoop Cluster Architecture

Master: SPOF to one single node

  • NameNode, JobTracker / ResourceManager
  • Hive Metastore, HiveServer2
  • Impala StateStore, Catalog Server
  • Spark Master
Node kernel environment setup
  • Ulimit, /etc/security/limits.conf to configure nofile
    Since default Linux system only allow 1024 sessions for file system.
    For Hadoop Cluster, Habase, this is not enough, so we need to configure this.
  • THP(Transparent Huge Page), ACPI, Memory overcommit issue
    THP: A new function from Linux kernel, its cache may not compatible with Hadoop, this will cause high memory, so it's better to turn it off.
    ACPI: Power management, this may also cause high memory, better to turn off.
    Memory Overcommit: System don't give more memory to app when app has a lot commit(50% total), so we need to either configure it higher or turn it off.
  • Customize configuration on different functional node
        If High Memery? How to configure swap?
        High disk IO? Then we may not need high OverCommit.
        High CPU, high system load?
Enterprise Hadoop Cluster Data Management

HDFS config
  • HDFS Block Size: dfs.block.size
  • Replication Factor: dfs.replication, default is 3
  • Turn on dfs.permissions? fs.permissions.umask-mode
  • User permission
  • DataNode's dfs disk partition, better to be separate from Linux system partition, otherwise may cause system fail to start because of disk space.
Resource Allocation
  • CPU, Memory, Disk IO, Network IO
  • HDFS, MapReduce(Yarn), Hive jobs
  • HBase, Hive, Impala, Spark resource allocation, limitations
Enterprise Hadoop Cluster Task Scheduling

Oozie
  • dispatch HDFS, MapReduce jobs, Pig, Hive, Sqoop, Java Apps, shell, email
  • take advantage of cluster resources
  • an alternative of cron job
    Distribution start job
    Parallel start multiple jobs
    Can fail tolerance, re-try, alarm in working flow

ZooKeeper
  • Synchronize node situation
  • The communication between HBase's master server and region server
  • Synchronize region's situation for HBase's tables
    online or split?
Hadoop Cluster Monitors

Cloudera Manager's Monitor Tool
  • Strong, Full of Monitor values, Monitors well to Impala
  • Waste of system resources
Ganglia
  • Monitor by collecting Hadoop Metrics
  • Use self's gmond to collect CPU, Memory, NetworkIO
  • Collect JMX, IPC, RPC data
  • Weak point: No disk IO by default
Graphite
  • Good third party plugins, can push KPI in java applications(StatsD, Yammer Metrics)
  • Can collect server data using plugins(collectd, logster, jmxtrans)
Splunk: expensive
Nagios, Icinga


Hadoop Cluster Issue and Limitation

Issues
  • HA: Too many too many single nodes
  • Dangerous to upgrade
  • Waste too much memory
Limitations
  • Missing solutions for cross data center
  • High concurrent bottle neck
  • Impala, Spark no ability to handle too many query at the same time, and no solution yet
  • No matter which type of model, database. Join is a big issue
    Spark is still not good for this, it's more like MapReduce + Pig



































No comments:

Post a Comment