KunTeng's Stories about Big Data and Cloud

HAWQ DBA Tutorial 1

##HAWQ DBA Knowledge Structure
Five layers, eight items, four cores:
Hardware -> OS & Shell -> PostgreSQL (core) -> Greenplum (core) -> HAWQ (core)
-> HDFS (core)
-> YARN (optional)
-> Ambari (recommended for production)
1. YARN is optional because HAWQ can run in standalone mode; it is only needed when the cluster must host HAWQ alongside other analytics stacks (such as Flink).
2. Ambari covers deployment, monitoring, control, alerting, health checks, and failure-recovery operations, and in real production operations it is an indispensable tool for a HAWQ DBA. Knowing how to use it is enough, though; knowledge of Ambari's own implementation details is rarely needed.
3. PostgreSQL, Greenplum, HDFS, and HAWQ form the core of a HAWQ DBA's knowledge, and they are also where the real differences in understanding and hands-on skill show up; they must be studied in depth.
4. Learning process: start with the conceptual explanations in the documentation and practice as you read; when you hit problems, google + experiment + ask for help; then gradually move on to reading the source code.

##HAWQ DBA Resources

##PostgreSQL
1. 《PostgreSQL 8.2.3中文文档》, translated by 何伟平 of the PostgreSQL China community (recommended)
2. 《PostgreSQL 9.3.1 中文手册》, translated by developers from Shandong Highgo (山东瀚高, represented by 媛媛 and 韩悦悦) and other community volunteers
3. 《Greenplum管理员指南》 (the Greenplum Administrator Guide), translated by 陈淼 of Pivotal GC (current version V4.2.2_2015_Revised)
4. Pivotal [Greenplum Documents](http://gpdb.docs.pivotal.io/) (three key documents: 《Installation Guide》, 《Administrator Guide》, 《Greenplum Database Best Practices》)
5. Apache Hadoop Documents
6. Pivotal [HAWQ Documents](http://hdb.docs.pivotal.io/)
7. 《PostgreSQL数据库内核分析》, by 彭智勇 and 彭煜玮 of Wuhan University
8. [《Python 入门教程》](http://docspy3zh.readthedocs.io/en/latest/tutorial/) (the Python tutorial in Chinese)
9. [Basic Bash shell commands](http://cn.linux.vbird.org/linux_basic/0320bash.php)

##HAWQ Development Environment

###1. Recommended: Mac + a VM with 4 GB RAM + HDFS deployed in pseudo-distributed mode + HAWQ installed from the command line
VM - take a snapshot before any risky experiment; if disaster strikes, just roll back.
Mac - Windows is OK, but a Mac is recommended: a largely compatible shell, rich software, easy backups, and good battery life. It genuinely saves time and is worth the 6K+ investment.

###2. The procedure is as follows.

####2.1 Create a VM running CentOS 6.* with an internal IP (details omitted)

####2.2 Install Oracle JDK 1.8 (the Apache Hadoop website states that OpenJDK also works)
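
A quick sanity check that the JDK is installed and visible at the path step 2.6 assumes (standard commands, not part of the original steps; /usr/java/latest is the symlink created by the Oracle RPM install):
$java -version
$ls -l /usr/java/latest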

####2.3 yum install openssh-server openssh-clients rsync

####2.4 Download the [Hadoop](http://www.apache.org/dyn/closer.cgi/hadoop/common/) hadoop-*.tar.gz tarball into /usr/local on the CentOS VM
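
For example (the mirror and the 2.7.3 version number below are only illustrative; use whichever release the HAWQ documentation calls for):
$yum install -y wget
$cd /usr/local
$wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz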

####2.5 cd /usr/local && tar -xzf hadoop-*.tar.gz && ln -s hadoop-*/ hadoop

####2.6 vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh and set JAVA_HOME: "export JAVA_HOME=/usr/java/latest"

####2.7 vi etc/hadoop/core-site.xml


<property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
</property>

####2.8 vi etc/hadoop/hdfs-site.xml


<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>

####2.9 Check that you can ssh to the localhost without a passphrase:
$ssh localhost
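
If that prompts for a password, the usual key setup (as in the Apache Hadoop single-node setup guide) is:
$ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$chmod 0600 ~/.ssh/authorized_keys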

####2.9.1 Create the hdfs user
$useradd -U -m hdfs && echo "hdfs" | passwd --stdin hdfs
$su - hdfs
$sed -i 's/PATH=$PATH:$HOME\/bin/PATH=$PATH:$HOME\/bin:\/usr\/local\/hadoop\/bin:\/usr\/local\/hadoop\/sbin/' .bash_profile

####2.10 Format HDFS
$ bin/hdfs namenode -format

####2.11 Start NameNode daemon and DataNode daemon:
$ sbin/start-dfs.sh
(To stop the daemons later: $ sbin/stop-dfs.sh)
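
To confirm both daemons are running, jps (shipped with the JDK) should list NameNode, DataNode, and SecondaryNameNode:
$jps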

####2.12 Browse the web interface for the NameNode:
http://localhost:50070/

####2.13 echo "vm.overcommit_ratio=50" >> /etc/sysctl.conf

####2.14 vi /etc/sysctl.conf

# HAWQ

kernel.shmmax = 4000000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048

# Why does HAWQ set this to 0?
net.ipv4.tcp_syncookies = 0
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 200000
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1281 65535
net.core.netdev_max_backlog = 200000
vm.overcommit_memory = 2
fs.nr_open = 3000000
kernel.threads-max = 798720
kernel.pid_max = 798720

# Increase network buffer sizes

net.core.rmem_max=2097152
net.core.wmem_max=2097152

# Extra TCP connection quotas for gpfdist

net.core.somaxconn = 1024
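
The values in /etc/sysctl.conf are applied at boot; to load them into the running kernel immediately, the standard command is:
$sysctl -p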

####2.15
$useradd -U -m hawq
$echo "hawq" | passwd --stdin hawq

####2.16
$mkdir /staging
$chown hawq:hawq /staging
$chown -R hawq:hawq /usr/local/hawq
$chown -R hawq:hawq /usr/local/hawq_2_1_0_0

####2.17
$mkdir -p /var/lib/hadoop-hdfs/dn_socket
$chown -R hdfs:hdfs /var/lib/hadoop-hdfs
$chmod -R 755 /var/lib/hadoop-hdfs

####2.18 vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<property>
        <name>dfs.allow.truncate</name>
        <value>true</value>
</property>

<property>
        <name>dfs.block.access.token.enable</name>
        <value>false</value>
        <description>false for an unsecured HDFS cluster, or true for a secure cluster.</description>
</property>

<property>
        <name>dfs.block.local-path-access.user</name>
        <value>hawq</value>
</property>

<property>
        <name>dfs.domain.socket.path</name>
        <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>

<property>
        <name>dfs.client.read.shortcircuit</name>
        <value>true</value>
</property>

<property>
        <name>dfs.client.socket-timeout</name>
        <value>300000000</value>
</property>

<property>
        <name>dfs.client.use.legacy.blockreader.local</name>
        <value>false</value>
</property>

<property>
        <name>dfs.datanode.data.dir.perm</name>
        <value>750</value>
</property>

<property>
        <name>dfs.datanode.handler.count</name>
        <value>60</value>
</property>

<property>
        <name>dfs.datanode.max.transfer.threads</name>
        <value>40960</value>
</property>

<property>
        <name>dfs.datanode.socket.write.timeout</name>
        <value>7200000</value>
</property>

<property>
        <name>dfs.namenode.accesstime.precision</name>
        <value>0</value>
</property>

<property>
        <name>dfs.namenode.handler.count</name>
        <value>600</value>
</property>

<property>
        <name>dfs.support.append</name>
        <value>true</value>
</property>

####2.19 vi /usr/local/hadoop/etc/hadoop/core-site.xml

<property>
        <name>ipc.client.connection.maxidletime</name>
        <value>3600000</value>
</property>

<property>
        <name>ipc.client.connect.timeout</name>
        <value>300000</value>
</property>

<property>
        <name>ipc.server.listen.queue.size</name>
        <value>3300</value>
</property>

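Note that the HDFS daemons started in 2.11 do not pick up the hdfs-site.xml and core-site.xml changes from 2.18 and 2.19 until they are restarted (same scripts as in 2.11, run as the hdfs user from /usr/local/hadoop):
$sbin/stop-dfs.sh
$sbin/start-dfs.sh
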
####2.20 $yum install httpd

####2.21
$sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/sysconfig/selinux
$reboot
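
After the reboot, a quick confirmation that SELinux is off (standard check, not part of the original steps; it should print "Disabled"):
$getenforce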

####2.22
$yum install -y epel-release

####2.23
$su - hawq
$source /usr/local/hawq/greenplum_path.sh
$hawq ssh-exkeys -h localhost

####2.24 vi /usr/local/hawq/etc/hawq-site.xml

<property>
        <name>hawq_master_address_port</name>
        <value>15432</value>
        <description>The port of hawq master.</description>
</property>

<property>
        <name>hawq_segment_address_port</name>
        <value>50000</value>
        <description>The port of hawq segment.</description>
</property>

<property>
        <name>hawq_rm_memory_limit_perseg</name>
        <value>1GB</value>
        <description>The limit of memory usage in a hawq segment when
                                 hawq_global_rm_type is set 'none'.
        </description>
</property>

<property>
        <name>hawq_rm_nvcore_limit_perseg</name>
        <value>2</value>
        <description>The limit of virtual core usage in a hawq segment when
                                 hawq_global_rm_type is set 'none'.
        </description>
</property>
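
One more setting worth checking in the same file (not edited above; the value shown is the usual default and only an assumption about your install) is hawq_dfs_url, which must match the fs.defaultFS configured in 2.7:

<property>
        <name>hawq_dfs_url</name>
        <value>localhost:8020/hawq_default</value>
        <description>URL for accessing HDFS.</description>
</property>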

####2.25
$su - hdfs -c "/usr/local/hadoop/bin/hdfs dfs -chown hawq hdfs://localhost:8020/"

####2.26
$su - hawq
$source /usr/local/hawq/greenplum_path.sh
$hawq init cluster
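
As a quick smoke test after initialization (assuming the master port 15432 configured in 2.24 and a shell with greenplum_path.sh sourced), connect with the bundled psql:
$psql -p 15432 -d postgres -c "SELECT version();"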

####2.27 TODO: reduce the heap sizes of the HDFS service JVMs to shrink the VM's memory consumption.
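
One possible approach (a sketch, not part of the original notes; the variable names are those used in Hadoop 2.x hadoop-env.sh, and 512 MB is an arbitrary example) is to cap the daemon heaps in /usr/local/hadoop/etc/hadoop/hadoop-env.sh:
export HADOOP_NAMENODE_OPTS="-Xmx512m $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Xmx512m $HADOOP_DATANODE_OPTS"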