HA
High Availability
Linux Cluster: LB, HA, HP
LB: lvs
- Directory: HA
- RS: 健康状态监测方式
前段调度器高可用
RS:健康状态检测方式
a.网络层:
icmp ping
b.传输层:
tcp ping
c.应用层:对相应服务下的关键点发起请求
检测超时限定:timeout (限定范围内超时时间)
检测次数限定:times (限定范围内次数限制)
- soft failure(第一次失败) -> soft failure(第二次失败) -> hard failure(真失败)
- OK -> problem: 3+检测
- Problem -> OK: 1+检测
各检测之间的时间间隔:duration (限定范围内时间间隔)
Avalability=MTBF/(MTBF+MTTR)
MTBF:mean time between failure 平均无故障时间
MTTR:mean time to repair 平均修复时间
- 0<A<1: 90%,95%,99%,99.9%, 99.99%
- 99.99%比99.9提高了10倍
- 减低MTTR: 冗余(redundant)技术
- 0<A<1: 90%,95%,99%,99.9%, 99.99%
拉锯战:A正常,发送B, C正常,B,C收不到A发送的信息
- 切断电源(brain split, network partition)
- B,C的总票数大于总票数的一半,B,C为主
- width quorun = total/2
- 两个节点的不可用
- A自己放弃
仅有A,B独特的集群
建议:奇数节点集群(3,5,7)
HA Service: ip, httpd, store
HA Cluster的实现方案
- 1.vrrp协议的实现
- keepalived
- 2.AIS 家族(补刀、应用程序心跳):完备HA集群
- hearbear
- corosync
- cman(早期)
- 1.vrrp协议的实现
VRRP协议
Virtual Redundant Routing Protocol,虚拟冗余路由协议
VRRP 协议术语
虚拟路由器:由一个Master路由器和多个Backup路由器组成。主机将虚拟路由器当作默认网关。
VRID:0-255,虚拟路由器的标识。有相同VRID一组路由器构成一个虚拟路由器。
Master(Active):虚拟路由器中承担报文转发任务的路由器
Backup(Passive):Master路由器出现故障时,能够代替Master路由器的路由器
虚拟IP地址,虚拟路由器的IP地址,一个虚拟路由器可以拥有一个或多个IP地址
IP地址拥有者:接口IP地址与虚拟IP地址相同的路由器被称为IP地址拥有者
虚拟MAC地址:一个虚拟罗尤其拥有一个虚拟MAC地址。格式是00-00-5E-00-01-{VRID}。虚拟路由器回应ARP请求使用的虚拟MAC地址,只有虚拟路由器做特殊配置的时候,才回应接口的真实MAC地址。
优先级:VRRP根据优先级来确定虚拟路由器中路由器的地位
工作方式
- 抢占式
- 非抢占式
工作模式
- 主备
- 主/主:配置多个路由器(主备)
认证方式:
- 无认证
- 简单字符串认证:与共享密钥
- md5认证
gracious arp: 免费的arp
keepalived架构
是vrrp协议的实现,原生设计目的为高可用ipvs服务;keepalived能够配置文件中定义生成ipvs规则,并能够对各RS的将健康状态进行检测
vrrp_script, vrrp_track;
- 组件:
- 控制组件:配置文件分析器
- 内存管理组件
- IO复用组件
- 核心组件
- vrrp stack
- checker
- ipvs wrapper
- watch dog
HA Cluster配置的前提
- 各节点时间同步;
ntp协议
,chrony服务
(CentOS 7) - 确保
iptables及selinux不会阻碍
- 各节点之间可通过
主机名互相通信
(对ka并非必须);名称解析服务的解析结果必须与uname -n
命令的结果一致 - 各节点之间的root用户可以
基于密钥认证的ssh通信
(对ka并非必须)
示例
- HA1: 172.18.100.6
- HA2: 172.18.100.5
- RS1: 172.18.100.11
- RS2: 172.18.100.12
- 配置IP地址
- HA1, HA2分配时间同步设置
# ntpdate 172.16.0.1
# crontab -e
*/5 * * * * /sbin/ntpdate 172.18.0.1 &> /dev/null
# crontab -l
# ntpdate 172.18.0.1
# which ntpdate
# iptables -nL
# getenforce
- HA1, HA2分别安装keepalived
- 安装keepalived
- CentOS 6.4+,程序包已经在base源
- CentOS 6.4- 程序包在epel源
# yum -y install keepalived
# rpm -ql keepalived
主配置文件:
/etc/keepalived/keepalived.conf
Unit file:
/usr/lib/systemd/system/keepalived.service
配置文件:
/etc/sysconfig/keepalived
配置文件内容块
# cd /etc/keepalived
# cp keepalived.conf{,.bak}
# vim keepalived.conf
# man keepalivcd.conf
GLOBAL CONFIGURATIONS 全局配置
global_defs {
...
}
VRRP CONFIGURATION
vrrp_sync_group GRP_NAME {
...
}
vrrp_instance INST_NAME{
...
}
LVS CONFIGURATIONS
virtual_server_group GRP_NAME {
...
}
virtual_server IP port | virtual_server fwmark int {
protocol TCP
...
real_server <IPADDR> <PORT> {
...
}
real_server <IPADDR> <PORT> {
...
}
}
}
- HA2配置
# cd /etc/keepalived
# cp keepalived.conf{,.bak}
# man keepalived
# vim keepalived.conf
global_defs {
notification_email { 接受通告邮件的地址
root@localhost
}
notification_email_from [email protected] 发件人邮箱地址
smtp_server 127.0.0.1 邮件服务器(163,126,qq smtp)
smtp_connect_timeout 30 超时时长
router_id node1 路由器设备的ID号(必须是主机名)
vrrp_mcast_group4 224.0.100.18 ipv4多播地址
vrrp_mcast_group6 ff02::12 ipv6多播地址
}
vrrp_instance VI_1 { 虚拟路由配置
state MASTER 当前节点在此虚拟路由器中初始状态 MASTER|BACKUP
interface eno16777736 哪个接口上配置,vrrp实例工作的网络接口
virtual_route_id 171 当前虚拟路由器ID号,0-255之间
priority 100 当前物理节点在此虚拟路由器中的优先级
advert_int 1 广播通告的时间间隔(心跳时间间隔),1秒
authentication { 认证机制
auth_type PASS|AH 默认PASS简单字符串认证,AH:IPSEC(验证头协议)
auth_pass 12345678 默认8位,# openssl rand -base64 8
}
virtual_ipaddress { 定义虚拟IP,默认使用eno16777736
172.18.100.66 别名
}
复制以上文件到:scp keepalived.conf 172.18.100.6:/etc/keepavlied/
track_interface { 定义要监控的接口
eth0
eth1
...
}
nopreempt: 非抢占模式
Dreempt_delay 300 抢占延迟(等待多长时间抢占)
通告脚本定义,脚本文件路径
notify_master <STRONG>|<QUOTED-STRING>
notify_backup <STRONG>|<QUOTED-STRING>
notify_fault <STRONG>|<QUOTED-STRING>
notify <STRONG>|<QUOTED-STRING>
}
- HA1
state BACKUP
priority 98
- HA2
# systemctl start keepalived.service
# systemctl status keepalived.service
# ip addr show
172.18.100.66/32
# tail /var/log/messages
- HA1
# systemctl start keepalived.service
# systemctl -l status keepalived.service
- 下线HA2
在另一个ssh查看
# tail -f /var/log/messages
# systemctl stop keepalived.service
- 查看HA1
# tail -f /var/log/messages
# ip addr show
- 启用HA2
# systemctl start keepalived.service
在另一个ssh查看
# tail -f /var/log/messages
- 查看HA1
在另一个ssh查看
# tail -f /var/log/messages
Received higher prio advert
Enter BACKUP STATE
...
- HA2
vitual_ipaddress {
172.18.100.66 dev eno16777736 label eno16777736:0
}
- HA1
vitual_ipaddress {
172.18.100.66 dev eno16777736 label eno16777736:0
}
- HA1停止并重启使用
# systemctl stop keepalived.service
# systemctl start keepalived.service
在另一个ssh查看
# tail -f /var/log/messages
# ip addr show
- 通告脚本定义-发送邮件
postfix - 本地发送邮件
# vim /etc/vimrc
set nohlsearch
# vim /etc/keepavlied/notify.sh
#!/bin/bash
contact='root@localhost'
notify() {
mailsubject="$(hostname) to be $1: vip floating"
mailbody="$(date +'%F %T'): vrrp transition, $(hostname) change to be $1"
echo $mailbody | mail -s "$mailsubject" $contact
}
case $1
master)
notify master
;;
backup)
notify backup
;;
fault)
notify fault
;;
*)
echo "Usage: $(basename $) {master|backup|fault}"
;;
esac
# chmod +x notify.sh
# ./notify.sh master
& 1
& exit
# scp notify.sh [email protected]:/etc/keepavlied/
# vim /etc/keepalived.conf
virtual_ipaddress {}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
# systemctl stop keepalived.service
- 172.18.100.6 配置
# systemctl stop keepalived.service
# vim /etc/keepalived.conf
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
# systemctl start keepalived.service
# tail -f /var/log/messages
# mail
& 1
& d 1
& exit
- .100.5停止服务,是否发送邮件到100.6
- 配置虚拟服务器
- 100.5
# vim /etc/keepalived.conf
virtual_server IP PORT |
virtual_server PWM # {
lb_algo rr|wrr|lc|wlc|lblc|sh|dh
调度方法
deplay_loop <INT>
服务轮询时间间隔
lb_kind NAT|DR|TUN
集群类型
persistence_timeout <INT>
持久连接时长
protocol TCP
服务协议
sorry_server <IPADDR><PORT>
所有RS均故障时,提胸say sorry的服务器
real_server <IPADDR> <PORT> {
weight <INT>
权重
notify_up <STRING>|<QUOTED-STRING>
节点上线时调用通知脚本
notify_down <STRING>|<QUOTED-STRING>
节点离线时调用通知脚本
HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK
支持的所有健康状态检测方式
HTTP_GET|SSL_GET {
url {
path<STRING> 健康状态检测时请求的资源的URL
digest <STRING> 基于获取的内容摘要码进行健康状态码判定
status_code <STRING>
}
nb_get_retry <INT> 尝试的次数
delay_before_retry <INT> 再次尝试之间的时间间隔
connect_timeout <INTEGER> 连接的超时时长
connect_ip <IP ADDRESS> 向此处指定的地址发测试请求
connect_port <PORT> 向此处指定的PORT发测试请求
bindto <IP ADDRESS> 指定测试请求报文的源IP
bind_port <PORT> 指定测试请求报文的PORT
}
TCP_CHECK {
}
}
}
RS: 100.6,100.5
- 100.5
# yum -y install httpd
# vim /var/www/html/index.html
sorry server 1
# systemct start httpd.service
- 100.6
# vim /var/www/html/index.html
sorry server 2
# systemct start httpd.service
- 100.5
# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
...
virtual_ipaddress {
172.18.100.7 dev eno16777736 label eno16777736:0
}
}
virtual_server 172.18.100.7 80 {
deplay_loop 6
lb_algo rr
lb_kind DR
protocol TCP
real_server 172.18.100.11 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
real_server 172.18.100.12 80 {
weight 2
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}
# scp keepalived.conf 172.18.100.6:/etc/keepalived/
# systemctl stop keepalived.service
- 100.6
# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
state BACKUP
priority 98
...
}
# systemctl stop keepalived.service
- 100.5
# systemctl start keepalived.service
# ifconfig
# yum -y install ipvsadm
# ipvsadm -Ln
- 100.6
# systemctl start keepalived.service
# yum -y install ipvsadm
# ipvsadm -Ln
# ifconfig
- 打开终端
# curl http://172.18.100.7
- RS1下线httpd
# systemctl stop httpd.service
- 打开终端
# curl http://172.18.100.7
# curl http://172.18.100.7
# curl http://172.18.100.7
- 100.5
# ipvsadm -Ln
仅有一个
- 100.6
# ipvsadm -Ln
仅有一个
- RS1上线httpd
# systemctl stop httpd.service
- 打开终端
# curl http://172.18.100.7
# curl http://172.18.100.7
# curl http://172.18.100.7
- 100.5
# ipvsadm -Ln
两个
- 100.6
# ipvsadm -Ln
两个
- RS1和RS2全部下线httpd
# systemctl stop httpd.service
- RS1,RS2
# vim /etc/keepalived/keepalived.conf
protocol TCP
sorry_server 127.0.0.1 80
- 100.5
# systemctl stop keepalived.service
- 100.6
# systemctl start keepalived.service
# ipvsadm -Ln
# ipvsadm -Ln
- 打开终端
# curl http://172.18.100.7
# curl http://172.18.100.7
# curl http://172.18.100.7
- 100.5
# systemctl start keepalived.service
# ipvsadm -Ln
# ipvsadm -Ln
- 打开终端
# curl http://172.18.100.7
# curl http://172.18.100.7
# curl http://172.18.100.7
- RS1启动httpd服务
# systemctl start httpd.service
- 100.5
# ipvsadm -Ln
# ipvsadm -Ln
- 打开终端
# curl http://172.18.100.7
# curl http://172.18.100.7
# curl http://172.18.100.7
- 博客作业:
- keepalived高可用ipvs集群
- ip集群提供PHP应用,例如phpwind