Technical blog post · 2021-03-26

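keepalived 1.3.5: the VIP shows up on both servers

Today, while installing keepalived-1.3.5 + nginx for high availability, I found that keepalived simply would not start.

The problem is solved now, so here is a record of the painful troubleshooting process.

1. Installation (skipped)

There is plenty of documentation to be found via Baidu or Google.

2. Configure keepalived to start at boot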

[root@master1 keepalived-1.3.5] # cp /usr/local/src/keepalived-1.3.5/keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
[root@master1 keepalived-1.3.5] # cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@master1 keepalived-1.3.5] # mkdir /etc/keepalived/
[root@master1 keepalived-1.3.5] # cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@master1 keepalived-1.3.5] # cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
[root@master1 keepalived-1.3.5] # echo "/etc/init.d/keepalived start" >> /etc/rc.local
[root@master1 keepalived-1.3.5] # systemctl enable keepalived
[root@master1 keepalived-1.3.5] # systemctl start keepalived

3. Startup fails

The error message:

[root@cqdsrmyy-app-01 keepalived]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):  Job for keepalived.service failed because a timeout was exceeded. See "systemctl status keepalived.service" and "journalctl -xe" for details.
                                                           [FAILED]


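Solution:

Most problems are already spelled out clearly in the logs. You only need to follow the hints in the log when troubleshooting.

Here the log reports that a PID file cannot be found. We can trace it through the systemd unit file: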


[root@cqdsrmyy-app-01 keepalived]# cat /usr/lib/systemd/system/keepalived.service 
[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network-online.target

[Service]
Type=forking
PIDFile=/usr/local/keepalived/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-/usr/local/keepalived/etc/sysconfig/keepalived
ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
[root@cqdsrmyy-app-01 keepalived]# 

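The PID file specified here does not actually exist on the server. Because the unit uses Type=forking, systemd waits for that file to appear and fails with a timeout when it never does. So the line needs to be changed to PIDFile=/var/run/keepalived.pid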
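A quick way to confirm this kind of mismatch is to read the PIDFile= value out of the unit file and check whether that path exists. The following is a minimal sketch; the function names unit_pidfile and check_pidfile are my own, and keep in mind that the PID file only exists while the keepalived process is running.

```shell
#!/bin/sh
# Hypothetical helpers, not part of keepalived or systemd.

# Print the PIDFile= value declared in a systemd unit file.
unit_pidfile() {
    sed -n 's/^PIDFile=//p' "$1"
}

# Report whether the PID file named in the unit file exists right now.
check_pidfile() {
    pidfile=$(unit_pidfile "$1")
    if [ -e "$pidfile" ]; then
        echo "ok: $pidfile exists"
    else
        echo "missing: $pidfile"
        return 1
    fi
}
```

For example, check_pidfile /usr/lib/systemd/system/keepalived.service prints "missing: ..." when the daemon writes its PID somewhere other than where systemd is looking.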

Save the file, run systemctl daemon-reload so that systemd picks up the edited unit file, then start keepalived again:
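The edit can also be scripted instead of done by hand. This is only a sketch: set_pidfile is a hypothetical helper name, and the commented commands at the end assume the same paths as on my server.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the PIDFile= line of a systemd unit file in place.
# $1 = unit file, $2 = new PID file path
set_pidfile() {
    unit=$1
    new_path=$2
    tmp_file=$(mktemp) || return 1
    # Write the modified copy to a temp file, then copy the content back
    # so the unit file keeps its owner and permissions.
    sed "s|^PIDFile=.*|PIDFile=$new_path|" "$unit" > "$tmp_file" &&
        cat "$tmp_file" > "$unit"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Intended use on the server (run as root):
#   set_pidfile /usr/lib/systemd/system/keepalived.service /var/run/keepalived.pid
#   systemctl daemon-reload
#   systemctl start keepalived
```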

[root@cqdsrmyy-app-01 run]# systemctl start keepalived
[root@cqdsrmyy-app-01 run]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-03-26 03:31:17 EDT; 8s ago
  Process: 25043 ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 25044 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─ 8983 nginx: master process /opt/nginx/sbin/nginx
           ├─ 8984 nginx: worker process
           ├─ 8985 nginx: worker process
           ├─ 8986 nginx: worker process
           ├─ 8987 nginx: worker process
           ├─ 8988 nginx: worker process
           ├─ 8989 nginx: worker process
           ├─ 8991 nginx: worker process
           ├─ 8992 nginx: worker process
           ├─25044 /usr/local/keepalived/sbin/keepalived -f /etc/keepalived/keepalived.conf -D -S 0
           ├─25045 /usr/local/keepalived/sbin/keepalived -f /etc/keepalived/keepalived.conf -D -S 0
           └─25046 /usr/local/keepalived/sbin/keepalived -f /etc/keepalived/keepalived.conf -D -S 0

Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived[25043]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived[25044]: Starting Healthcheck child process, pid=25045
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived_healthcheckers[25045]: Initializing ipvs
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived[25044]: Starting VRRP child process, pid=25046
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived_healthcheckers[25045]: Opening file '/etc/keepalived/kee....
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived_vrrp[25046]: Registering Kernel netlink reflector
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived_vrrp[25046]: Registering Kernel netlink command channel
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived_vrrp[25046]: Registering gratuitous ARP shared channel
Mar 26 03:31:17 cqdsrmyy-app-01 Keepalived_vrrp[25046]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 26 03:31:17 cqdsrmyy-app-01 systemd[1]: Started LVS and VRRP High Availability Monitor.
Hint: Some lines were ellipsized, use -l to show in full.

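The startup problem was solved, but during testing I found that the VIP was present on both the master and the backup at the same time. After some investigation this turned out to be caused by a mistake in keepalived.conf.

For the record, here is the configuration: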

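! Configuration File for keepalived
global_defs {
  script_user root
  enable_script_security
}
vrrp_script check_nginx {
    script "/etc/keepalived/nginx_check.sh"
    interval 10
}
vrrp_instance VI_1 {  # define a VRRP instance
    state BACKUP     # initial role of this node: MASTER for the primary server, BACKUP for the standby; the MASTER should get a higher priority than the BACKUP. With nopreempt the initial state has to be BACKUP, and which node becomes master is decided by priority.
    nopreempt    # do not preempt
    interface eth0   # network interface to monitor; on takeover the VIP is added to this interface unless virtual_ipaddress names another dev
    virtual_router_id 101      # virtual router ID; within the same vrrp_instance it MUST be identical on MASTER and BACKUP, and it must differ from other VRRP instances
    priority 100       # priority of this instance
    unicast_src_ip 192.168.1.14  # unicast source address (this node)
    unicast_peer { 
        192.168.1.15       # unicast destination address (the peer node)
    }    # In multicast mode keepalived sends all advertisements to the multicast address 224.0.0.18, which produces a lot of unnecessary traffic and can cause interference and conflicts. Switching to unicast is a safer approach that avoids virtual router ID conflicts when many keepalived instances share the LAN.
    advert_int 1      # advertisement (heartbeat) interval in seconds
    authentication {
        auth_type PASS    # authentication type, either PASS or AH
        auth_pass test123   # authentication password; MASTER and BACKUP of the same vrrp_instance must use the same password to communicate
    }
    virtual_ipaddress {    # virtual IP addresses; several can be listed, one per line
        118.24.101.16/24 dev eth1 
    }
    track_interface {  # additional interfaces to monitor; a failure of any of them triggers a failover
        eth0
    }
    track_script {
        check_nginx
    }
}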

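The key point is virtual_router_id, the virtual router ID: within the same vrrp_instance, MASTER and BACKUP must use the same value. I had treated it like the server IDs in MySQL master/slave replication, which have to differ, and configured a different ID on each node. As a result the two nodes could not find each other during VRRP communication, and each one kept the VIP. After setting the IDs to the same value the problem was solved.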
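To catch this mistake before starting the service, the virtual_router_id values of the two nodes can be compared. The sketch below assumes the peer's keepalived.conf has been copied over to a local path first (for example with scp); router_ids and same_router_ids are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two keepalived.conf files.

# Print every virtual_router_id value in a config file, one per line.
# Trailing comments and commented-out lines are ignored.
router_ids() {
    awk '$1 == "virtual_router_id" { print $2 }' "$1"
}

# Succeed only if both files define the same, non-empty list of IDs.
same_router_ids() {
    ids_a=$(router_ids "$1")
    [ -n "$ids_a" ] && [ "$ids_a" = "$(router_ids "$2")" ]
}
```

Something like same_router_ids /etc/keepalived/keepalived.conf /tmp/peer-keepalived.conf || echo "virtual_router_id mismatch" would have pointed at my mistake right away; the /tmp path is just a placeholder.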

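Re-running the test: after killing nginx on one of the servers, keepalived quickly brings it back up, because that is what we wrote in the nginx_check.sh check script.

Then kill keepalived and nginx at the same time: killall nginx keepalived

The VIP quickly floats over to the other server. With that, the keepalived configuration and testing are complete.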
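The check script itself is not reproduced in this post. A script of this kind typically looks roughly like the sketch below, which is an assumption and not my actual file: the function names are made up, and the nginx path is taken from the systemctl status output above.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/keepalived/nginx_check.sh (the real script is not shown in the post).
NGINX_BIN=${NGINX_BIN:-/opt/nginx/sbin/nginx}   # assumed nginx binary path

# Succeed if at least one process named exactly "nginx" is running.
nginx_running() {
    pgrep -x nginx >/dev/null 2>&1
}

# If nginx is down, try to start it once; fail if it is still down afterwards.
check_nginx() {
    nginx_running && return 0
    "$NGINX_BIN"
    sleep 2
    nginx_running
}

# As a vrrp_script, the file would end by calling: check_nginx
```

A non-zero exit status tells keepalived that the tracked script failed, which is what lets it react when nginx cannot be restarted.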
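To see which node currently holds the VIP, both during the "VIP on both nodes" problem and during the failover test, the output of ip addr can be checked on each server. A minimal sketch, with has_vip as a hypothetical helper that reads the output of ip -o -4 addr show from standard input:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IPv4 address appears in
# `ip -o -4 addr show` output supplied on standard input.
has_vip() {
    # Fixed-string match on "inet <ip>/" so that dots are literal and
    # 118.24.101.1 does not match 118.24.101.16.
    grep -qF "inet $1/"
}

# Intended use on each node:
#   ip -o -4 addr show | has_vip 118.24.101.16 && echo "VIP is on this node"
```

In a healthy setup exactly one of the two nodes reports the VIP at any time.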