
Installing and Configuring a ZooKeeper Cluster on CentOS

This article walks through setting up ZooKeeper on a four-node cluster. ZooKeeper can run in three modes: standalone, pseudo-cluster, and cluster; this article uses cluster mode.

Environment

  • Virtual machine: VMware Workstation 12 Player
  • Linux version: CentOS release 6.4 (Final)
  • ZooKeeper version: zookeeper-3.4.5-cdh5.7.6.tar.gz
  • Cluster nodes:
    • master: 192.168.137.11, 1 GB RAM
    • slave1: 192.168.137.12, 512 MB RAM
    • slave2: 192.168.137.13, 512 MB RAM
    • slave3: 192.168.137.14, 512 MB RAM
  • Prerequisites: Java is installed, passwordless SSH login is configured, and the firewall is stopped.

Uploading the Package

Upload the downloaded zookeeper-3.4.5-cdh5.7.6.tar.gz package to a directory on CentOS, for example /opt. There are many ways to upload it; here the rz command in SecureCRT is used.

Extract the archive:

tar -zxf zookeeper-3.4.5-cdh5.7.6.tar.gz

Rename the directory:

mv zookeeper-3.4.5-cdh5.7.6 zookeeper

Editing the Configuration

The configuration template is zoo_sample.cfg in the conf directory under the installation directory; first copy it and rename the copy:

[root@master conf]# pwd
/opt/zookeeper/conf
[root@master conf]# cp zoo_sample.cfg zoo.cfg
[root@master conf]# ll
total 16
-rw-rw-r--. 1 root root 535 Feb 22 2017 configuration.xsl
-rw-rw-r--. 1 root root 2693 Feb 22 2017 log4j.properties
-rw-r--r--. 1 root root 808 Jan 23 10:06 zoo.cfg
-rw-rw-r--. 1 root root 808 Feb 22 2017 zoo_sample.cfg

Edit zoo.cfg:

tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper/tmp
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
dataLogDir=/opt/zookeeper/logs
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
server.4=slave3:2888:3888

Parameter notes:

  • tickTime: the basic time unit used by ZooKeeper, in milliseconds.
  • dataDir: the data directory; any directory will do.
  • dataLogDir: the transaction log directory; likewise any directory. If it is not set, it defaults to the same value as dataDir.
  • clientPort: the port on which client connections are accepted.
  • initLimit: a ZooKeeper ensemble consists of multiple servers, one acting as leader and the rest as followers. initLimit caps the time a follower may take to initially connect and sync with the leader. It is set to 10 here, meaning 10 ticks: 10 * 2000 = 20000 ms = 20 s.
  • syncLimit: caps the time allowed between a leader/follower request and its acknowledgement. It is set to 5 here, meaning 5 ticks: 5 * 2000 = 10000 ms = 10 s.
  • server.X=A:B:C: X is the server's id number; A is that server's hostname or IP address; B is the port the server uses to exchange messages with the leader; C is the port used for leader election.
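As a sanity check on the arithmetic above, the effective time limits implied by this zoo.cfg can be computed directly (plain shell arithmetic with the values copied from the config, not ZooKeeper code):

```shell
# time limits implied by the zoo.cfg above (tickTime is in milliseconds)
tickTime=2000
initLimit=10
syncLimit=5
echo "initial sync limit: $(( tickTime * initLimit )) ms"   # 20000 ms = 20 s
echo "request/ack limit:  $(( tickTime * syncLimit )) ms"   # 10000 ms = 10 s
```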

Because dataDir was changed, create that directory under the zookeeper directory (the myid file will be created in it later), along with the log directory:

mkdir /opt/zookeeper/tmp

mkdir /opt/zookeeper/logs

Copying to the Other Nodes

Copy the zookeeper directory to the other three servers:

scp -r /opt/zookeeper/ root@slave1:/opt
scp -r /opt/zookeeper/ root@slave2:/opt
scp -r /opt/zookeeper/ root@slave3:/opt

On the master node, create the myid file on every node with the following commands; the id in each file must match the corresponding server.X entry in zoo.cfg:

[root@master zookeeper]# echo 1 > /opt/zookeeper/tmp/myid
[root@master zookeeper]# ssh slave1 "echo 2 > /opt/zookeeper/tmp/myid"
[root@master zookeeper]# ssh slave2 "echo 3 > /opt/zookeeper/tmp/myid"
[root@master zookeeper]# ssh slave3 "echo 4 > /opt/zookeeper/tmp/myid"
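The four commands above can also be generated with a short loop. This sketch only prints the ssh commands so they can be reviewed first; pipe the output to sh to actually run them. The host order must match the server.X ids in zoo.cfg:

```shell
# print (do not run) the per-host myid commands; host order = server.X order
id=1
for host in master slave1 slave2 slave3; do
  echo "ssh $host \"echo $id > /opt/zookeeper/tmp/myid\""
  id=$((id + 1))
done
```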

Starting the Cluster

Since no environment variables have been configured, the full path is needed:

[root@master zookeeper]# /opt/zookeeper/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

The intent behind setting dataLogDir was to have the startup log written to that directory, but that is not what it controls: dataLogDir only holds the transaction logs. The zookeeper.out log file is still created in the ZooKeeper installation directory; its location is actually governed by the ZOO_LOG_DIR variable read by zkServer.sh, which defaults to the directory the script is invoked from.

Inspecting zookeeper.out reveals an error:

2018-01-23 10:48:35,470 [myid:] - INFO [main:QuorumPeerConfig@101] - Reading configuration from: /opt/zookeeper/bin/../conf/zoo.cfg
2018-01-23 10:48:35,484 [myid:] - WARN [main:QuorumPeerConfig@290] - Non-optimial configuration, consider an odd number of servers.
2018-01-23 10:48:35,484 [myid:] - INFO [main:QuorumPeerConfig@334] - Defaulting to majority quorums
2018-01-23 10:48:35,512 [myid:4] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2018-01-23 10:48:35,513 [myid:4] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2018-01-23 10:48:35,513 [myid:4] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2018-01-23 10:48:35,536 [myid:4] - INFO [main:QuorumPeerMain@132] - Starting quorum peer
2018-01-23 10:48:35,587 [myid:4] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
2018-01-23 10:48:35,611 [myid:4] - INFO [main:QuorumPeer@913] - tickTime set to 2000
2018-01-23 10:48:35,612 [myid:4] - INFO [main:QuorumPeer@933] - minSessionTimeout set to -1
2018-01-23 10:48:35,612 [myid:4] - INFO [main:QuorumPeer@944] - maxSessionTimeout set to -1
2018-01-23 10:48:35,612 [myid:4] - INFO [main:QuorumPeer@959] - initLimit set to 10
2018-01-23 10:48:35,639 [myid:4] - INFO [main:QuorumPeer@429] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2018-01-23 10:48:35,643 [myid:4] - INFO [main:QuorumPeer@444] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2018-01-23 10:48:35,652 [myid:4] - INFO [Thread-1:QuorumCnxManager$Listener@486] - My election bind port: 0.0.0.0/0.0.0.0:3888
2018-01-23 10:48:35,674 [myid:4] - INFO [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2018-01-23 10:48:35,679 [myid:4] - INFO [QuorumPeer[myid=4]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@740] - New election. My id = 4, proposed zxid=0x0
2018-01-23 10:48:35,692 [myid:4] - INFO [slave3/192.168.137.14:3888:QuorumCnxManager$Listener@493] - Received connection request /192.168.137.11:34491
2018-01-23 10:48:35,704 [myid:4] - INFO [WorkerReceiver[myid=4]:FastLeaderElection@542] - Notification: 4 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 4 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
2018-01-23 10:48:35,706 [myid:4] - WARN [WorkerSender[myid=4]:QuorumCnxManager@368] - Cannot open channel to 2 at election address slave1/192.168.137.12:3888
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:327)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:393)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:365)
at java.lang.Thread.run(Thread.java:748)

The exception reports Connection refused, but there is no need to go searching for the cause just yet: the election port it is trying to reach is closed simply because the other peers have not been started. Start ZooKeeper on all the nodes first and then look at the status again. For now every node reports that it is not running, and no matching process can be found:

[root@master zookeeper]# /opt/zookeeper/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@master zookeeper]# /opt/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.

Start ZooKeeper on the other nodes too; after a short while, check the status on each server:

[root@master zookeeper]# /opt/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[root@master zookeeper]# jps
5488 QuorumPeerMain
5539 Jps

If zkServer.sh status reports a Mode and jps shows a QuorumPeerMain process, the node has started successfully.
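Incidentally, the "consider an odd number of servers" warning seen in the log earlier comes from majority-quorum arithmetic: a 4-node ensemble needs 3 votes and therefore tolerates only 1 failure, no better than a 3-node ensemble. A quick illustration (plain shell arithmetic, not ZooKeeper code):

```shell
# majority quorum is floor(n/2) + 1; failures tolerated is n minus that
for n in 3 4 5; do
  q=$(( n / 2 + 1 ))
  echo "$n servers: quorum $q, tolerates $(( n - q )) failure(s)"
done
```

With 4 servers you pay the cost of an extra machine without gaining any fault tolerance, which is why odd-sized ensembles are recommended.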

To shut ZooKeeper down, run the following on every node:

/opt/zookeeper/bin/zkServer.sh stop

Also, starting with the command below runs the server in the foreground and prints the log output to the console:

/opt/zookeeper/bin/zkServer.sh start-foreground

Batch Start and Stop

Running the command on every server one by one is tedious, so here is a script that does it in one go:

#!/bin/bash
# set zooHome to the zookeeper installation directory
zooHome=/opt/zookeeper
if [ -n "$1" ]; then
    confFile=$zooHome/conf/zoo.cfg
    # pull the host names out of the server.X lines of zoo.cfg
    hosts=$(sed '/^server/!d;s/^.*=//;s/:.*$//g;/^$/d' "$confFile")
    for host in $hosts; do
        ssh "$host" "$zooHome/bin/zkServer.sh $1"
    done
else
    echo "parameter empty! parameter: start|stop|status"
fi

Save the script above as a file named zooManager and invoke it as follows (any first argument, including status, is forwarded straight to zkServer.sh):

sh zooManager start

sh zooManager stop

[root@master opt]# sh zooManager start
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
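The host list the script iterates over is pulled straight from the server.X lines of zoo.cfg. The sed pipeline can be tried in isolation against a throwaway file to see exactly what it yields:

```shell
# reproduce the script's host extraction on sample zoo.cfg lines
printf 'tickTime=2000\nserver.1=master:2888:3888\nserver.2=slave1:2888:3888\n' > /tmp/zoo_test.cfg
sed '/^server/!d;s/^.*=//;s/:.*$//g' /tmp/zoo_test.cfg
# prints:
# master
# slave1
```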

Since every node here runs as the root user, permissions were not a concern; in a real deployment they would have to be taken into account.

Reference: http://coolxing.iteye.com/blog/1871009
