Greenplum安装|Greenplum数据库

Posted by Tang Blog on September 19, 2019

这篇文章主要介绍Greenplum分布式MPP数据库V5.21版本在CentOS7操作系统环境下的安装过程。 本文分为以下几个部分:

  1. Greenplum基础软件包的安装准备
  2. Greenplum数据库的详细安装步骤

Greenplum5-centos7安装指南

一、基础软件准备

(1)下载RPM安装包 这里安装的GP版本为:5.21.2 greenplum-db-5.5.21.2-rhel6-x86_64.rpm 下载地址:https://network.pivotal.io/products/pivotal-gpdb

(2)安装基础软件包 yum install -y net-tools which openssh-clients openssh-server less zip unzip iproute.x86_64

二、机器规划与准备工作

  • (1)机器规划(一个主节点,一个从节点,两个数据节点: 三台机器的root帐号的密码均为:whistle):

    节点类型 节点IP 节点hostname 节点描述 master 172.16.90.151 mdw master 主节点 standby 172.16.90.152 smdw standby 从节点 segment 172.16.90.153 sdw1 数据节点1 segment 172.16.90.154 sdw2 数据节点2

  • (2)分别设置三台主机的hostname

    在master机器上: hostnamectl –static set-hostname mdw 在standby机器上: hostnamectl –static set-hostname smdw 在segment1机器上: hostnamectl –static set-hostname sdw1 在segment2机器上: hostnamectl –static set-hostname sdw2

    需要在/etc/hosts中设置节点的hostname(如上),并能ping通。

  • (3)关闭防火墙

    systemctl stop firewalld systemctl mask firewalld systemctl stop iptables systemctl disable iptables

  • (4)每台主机上创建gpadmin用户与组

    groupdel gpadmin userdel gpadmin groupadd -g 530 gpadmin useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin passwd gpadmin 注:后面均按照密码为gpadmin编写

-(5)创建节点配置信息 在master上创建文件all_hosts(所有主机名)和 all_segs(2台数据节点主机名)路径可以自选,这里用/home/gpadmin/nodes 文件内容: 文件/home/gpadmin/nodes/all_hosts: mdw smdw sdw1 sdw2

文件/home/gpadmin/nodes/all_segs:
	sdw1
	sdw2

三、配置每台机器的内核参数

vim /etc/sysctl.conf


kernel.shmmax = 500000000

kernel.shmmni = 4096

kernel.shmall = 4000000000

kernel.sem = 500 1024000 200 4096

kernel.sysrq = 1

kernel.core_uses_pid = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1

net.ipv4.ip_forward = 0

net.ipv4.conf.default.accept_source_route = 0

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_max_syn_backlog = 4096

net.ipv4.conf.all.arp_filter = 1

net.ipv4.ip_local_port_range = 10000 65535

net.core.netdev_max_backlog = 10000

net.core.rmem_max = 2097152

net.core.wmem_max = 2097152

vm.overcommit_memory = 2

执行如下使内核参数生效:

sysctl -p

四、配置每台机器的资源限制参数

vim /etc/security/limits.conf

设置文件/etc/security/limits.conf的内容如下:


* soft nofile 65536

* hard nofile 65536

* soft nproc 131072

* hard nproc 131072

五、安装Greenplum

1,准备greenplum数据库安装文件(master节点)

路径:/home/greenplum/installer: greenplum-db-5.X.X-rhel7-x86_64.rpm

2,安装软件(master节点)

(1)安装软件包 cd /home/greenplum/installer/ rpm -Uvh ./greenplum-db-5.X.X-rhel7-x86_64.rpm 默认安装路径 /usr/local/,即安装的目录为:/usr/local/greenplum-db/

(3)设置安装目录的用户及组权限 chown -R gpadmin /usr/local/greenplum(在创建gpadmin后执行) chgrp -R gpadmin /usr/local/greenplum(在创建gpadmin后执行)

(3)查看获取环境参数 source /usr/local/greenplum-db/greenplum_path.sh echo $GPHOME

(4)拷贝master节点公钥至其他各个节点
cd /usr/local/greenplum-db/bin source /usr/local/greenplum-db/greenplum_path.sh gpssh-exkeys -f /home/gpadmin/nodes/all_hosts

[root@mdw bin]# gpssh-exkeys -f /home/gpadmin/nodes/all_hosts
[STEP 1 of 5] create local ID and authorize on local host

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] authorize current user on remote hosts
  ... send to smdw
  ***
  *** Enter password for smdw: (机器的root帐号的密码:whistle)
  ... send to sdw1
  ... send to sdw2

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with smdw
  ... finished key exchange with sdw1
  ... finished key exchange with sdw2

[INFO] completed successfully

(5)运行gpseginstall工具 cd /usr/local/greenplum-db/bin source /usr/local/greenplum-db/greenplum_path.sh gpseginstall -f /home/gpadmin/nodes/all_hosts -u gpadmin

[root@mdw bin]# gpseginstall -f /home/gpadmin/nodes/all_hosts -u gpadmin
20190902:16:51:00:019873 gpseginstall:mdw:root-[INFO]:-Installation Info:
link_name greenplum-db
binary_path /usr/local/greenplum-db-5.21.2
binary_dir_location /usr/local
binary_dir_name greenplum-db-5.21.2
20190902:16:51:00:019873 gpseginstall:mdw:root-[INFO]:-check cluster password access
20190902:16:51:01:019873 gpseginstall:mdw:root-[INFO]:-de-duplicate hostnames
20190902:16:51:01:019873 gpseginstall:mdw:root-[INFO]:-master hostname: mdw
20190902:16:51:01:019873 gpseginstall:mdw:root-[INFO]:-check for user gpadmin on cluster
20190902:16:51:01:019873 gpseginstall:mdw:root-[INFO]:-add user gpadmin on master
20190902:16:51:02:019873 gpseginstall:mdw:root-[INFO]:-add user gpadmin on cluster
20190902:16:51:02:019873 gpseginstall:mdw:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db
20190902:16:51:02:019873 gpseginstall:mdw:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db-5.21.2
20190902:16:51:02:019873 gpseginstall:mdw:root-[INFO]:-rm -f /usr/local/greenplum-db-5.21.2.tar; rm -f /usr/local/greenplum-db-5.21.2.tar.gz
20190902:16:51:02:019873 gpseginstall:mdw:root-[INFO]:-cd /usr/local; tar cf greenplum-db-5.21.2.tar greenplum-db-5.21.2
20190902:16:51:05:019873 gpseginstall:mdw:root-[INFO]:-gzip /usr/local/greenplum-db-5.21.2.tar
20190902:16:51:41:019873 gpseginstall:mdw:root-[INFO]:-remote command: mkdir -p /usr/local
20190902:16:51:42:019873 gpseginstall:mdw:root-[INFO]:-remote command: rm -rf /usr/local/greenplum-db-5.21.2
20190902:16:51:42:019873 gpseginstall:mdw:root-[INFO]:-scp software to remote location
20190902:16:51:44:019873 gpseginstall:mdw:root-[INFO]:-remote command: gzip -f -d /usr/local/greenplum-db-5.21.2.tar.gz
20190902:16:51:51:019873 gpseginstall:mdw:root-[INFO]:-md5 check on remote location
20190902:16:51:53:019873 gpseginstall:mdw:root-[INFO]:-remote command: cd /usr/local; tar xf greenplum-db-5.21.2.tar
20190902:16:51:54:019873 gpseginstall:mdw:root-[INFO]:-remote command: rm -f /usr/local/greenplum-db-5.21.2.tar
20190902:16:51:55:019873 gpseginstall:mdw:root-[INFO]:-remote command: cd /usr/local; rm -f greenplum-db; ln -fs greenplum-db-5.21.2 greenplum-db
20190902:16:51:55:019873 gpseginstall:mdw:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /usr/local/greenplum-db
20190902:16:51:56:019873 gpseginstall:mdw:root-[INFO]:-remote command: chown -R gpadmin:gpadmin /usr/local/greenplum-db-5.21.2
20190902:16:51:56:019873 gpseginstall:mdw:root-[INFO]:-rm -f /usr/local/greenplum-db-5.21.2.tar.gz
Please enter a password: (这里输入:gpadmin)
Confirm password: (这里输入:gpadmin)
20190902:16:52:05:019873 gpseginstall:mdw:root-[INFO]:-Changing system passwords ...
20190902:16:52:06:019873 gpseginstall:mdw:root-[INFO]:-exchange ssh keys for user root
20190902:16:52:09:019873 gpseginstall:mdw:root-[INFO]:-exchange ssh keys for user gpadmin
20190902:16:52:12:019873 gpseginstall:mdw:root-[INFO]:-/usr/local/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20190902:16:52:13:019873 gpseginstall:mdw:root-[INFO]:-remote command: . /usr/local/greenplum-db/./greenplum_path.sh; /usr/local/greenplum-db/./sbin/gpfixuserlimts -f /etc/security/limits.conf -u gpadmin
20190902:16:52:13:019873 gpseginstall:mdw:root-[INFO]:-version string on master: gpssh version 5.21.2 build commit:610b6d777436fe4a281a371cae85ac40f01f4f5e
20190902:16:52:13:019873 gpseginstall:mdw:root-[INFO]:-remote command: . /usr/local/greenplum-db/./greenplum_path.sh; /usr/local/greenplum-db/./bin/gpssh --version
20190902:16:52:14:019873 gpseginstall:mdw:root-[INFO]:-remote command: . /usr/local/greenplum-db-5.21.2/greenplum_path.sh; /usr/local/greenplum-db-5.21.2/bin/gpssh --version
20190902:16:52:14:019873 gpseginstall:mdw:root-[INFO]:-SUCCESS -- Requested commands completed

至此完成其他2台主机的安装

(6)切换到gpadmin用户验证无密码登录 su - gpadmin source /usr/local/greenplum-db/greenplum_path.sh gpssh -f /home/gpadmin/nodes/all_hosts -e ls -l $GPHOME

(7)配置环境变量 在上述(6)切换到gpadmin帐号前提下: vim ~/.bashrc


source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/data/master/gpseg-1/  (改目录事先建好)
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=postgres                   (默认数据库) ---------------------------------------------------------------

使环境变量生效:source ~/.bashrc 

(8)使用gpssh工具在所有segment主机上创建主数据和镜像数据目录,如果没有设置镜像可以不创建mirror目录(执行下面命令):
 gpssh -f /home/gpadmin/nodes/all_segs -e 'mkdir -p /home/gpadmin/data/gpseg'
 
(9)在Master主机上,通过NTP守护进程同步系统时钟(略)

(10)检测系统环境(root帐号下)
   source /usr/local/greenplum-db/greenplum_path.sh
   cd /home/gpadmin/
   
  a)Master上进行主机OS参数检测
   gpcheck -f /home/gpadmin/nodes/all_hosts -m mdw
   ----------------------------------------------------------------------
	gpcheck时遇到的一些报错解决:
	$ gpssh -f /home/gpadmin/nodes/all_hosts -e 'echo deadline > /sys/block/sr0/queue/scheduler'
	$ gpssh -f /home/gpadmin/nodes/all_hosts -e 'echo deadline > /sys/block/sr1/queue/scheduler'
	$ gpssh -f /home/gpadmin/nodes/all_hosts -e 'echo deadline > /sys/block/sda/queue/scheduler'
	$ /sbin/blockdev --setra 16384 /dev/sda* 
	$ /sbin/blockdev --getra /dev/sda*
   ----------------------------------------------------------------------
   
   b)Master上检查网络性能
   $ gpcheckperf -f /home/gpadmin/nodes/all_segs -r N -d /tmp > subnet1.out
   $ cat subnet1.out
   
   c)Master上检查磁盘I/O和内存带宽
   $ gpcheckperf -f /home/gpadmin/nodes/all_hosts -d /home/data/gpseg -r ds

六、初始化Greenplum数据库

1. 初始化数据库

(1)在master节点上切换至gpadmin帐号下:
su - gpadmin

(2)从模板中拷贝一份gpinitsystem_config文件
#cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpinitsystem_config

(3)根据文件中的配置目录在三台机器上创建相应的目录:
mkdir -p /home/gpadmin/data/master    (在mdw主机上)
mkdir -p /home/gpadmin/data/master    (在standby主机上)
mkdir -p /home/gpadmin/data/primary   (在sdw1主机上)
mkdir -p /home/gpadmin/data/primary   (在sdw2主机上)

(4)执行初始化
$ gpinitsystem -c /home/gpadmin/gpinitsystem_config -h /home/gpadmin/nodes/all_segs
待命令初始化成功后,三台机器上即已经成功启动了greenplum数据库了。

(5)增加master的从节点:
$ gpinitstandby -s smdw

2, 访问数据库

psql -d postgres

psql (8.2.15)
Type "help" for help.
postgres=#
出现以上界面,恭喜你已经安装成功了。

输入查询语句,查看是否可以执行。

postgres=# select datname,datdba,encoding,datacl from pg_database;

  datname  | datdba | encoding |              datacl             
-----------+--------+----------+----------------------------------
 postgres  |     10 |        6 |
 template1 |     10 |        6 | {=c/gpadmin,gpadmin=CTc/gpadmin}
 template0 |     10 |        6 | {=c/gpadmin,gpadmin=CTc/gpadmin}
(3 rows)

postgres=# \q(退出)

3 用户创建密码

postgres =# alter role gpadmin with password 'gpadmin';

4,启动和停止数据库测试

(1)启动

gpstart

(2)停止

gpstop 或者:强制停止 gpstop -M fast

(3)删除原有的standby

gpinitstandby -r

(4)启动和同步standby数据

gpinitstandby -n

5 配置远程登录

(1)pg_hba.conf配置文件 greenplum数据库底层封装的是 postgresql 数据库,与 pg 数据库一样,要想登录数据库,需先配置数据库白名单,即允许登录的数据库相关信息。配置文件为位于 MASTER 节点的数据目录之下的 pg_hba.conf 文件。

vim $MASTER_DATA_DIRECTORY/pg_hba.conf
该文件的记录有5个字段,代表的意义为:
1、连接方式
分别是:“local”使用本地unix套接字,“host”使用TCP/IP连接(包括SSL和非SSL),“hostssl”只能使用SSL TCP/IP连接,“hostnossl”不能使用SSL TCP/IP连接。
2、数据库名
可通过使用 all 关键字表示所有的数据库
3、用户名
多个用户通过逗号隔开
4、被允许连接的服务器地址
5、认证方式
ident,md5,password(明文密码),trust(无需密码),reject(拒绝认证)   (2)postgresql.conf配置文件 
 将配置文件postgresql.conf的listen_addresses修改为监听所有,也就是listen_addresses = '*'

6 建议

(1)在master节点上通过crontab设置定期清理pg的日志 /10 * * * * (rm -f /home/gpadmin/data/master/gpseg-1/pg_log/gpdb-.csv)

7 命令记录

根据进程名批量kill进程:

ps -ef | grep postgres | grep -v grep | awk '{print $2}' | xargs kill -9