当前位置: 首页 > news >正文

网站前端设计是什么意思苏州百度快照优化排名

网站前端设计是什么意思,苏州百度快照优化排名,建设银行网上银行网站打不开,地图网站制作一切的开始 打开Redis5.0.5的源码中server.c#xff0c;找到如下代码#xff0c;这里运行了一个定时任务#xff0c;每隔100毫秒执行一次。 /* Run the Redis Cluster cron. *//** 每隔100毫秒执行一次* 要求开启集群模式*/run_with_period(100) {if (server.cluster_enabl…一切的开始 打开Redis5.0.5的源码中server.c找到如下代码这里运行了一个定时任务每隔100毫秒执行一次。 /* Run the Redis Cluster cron. *//** 每隔100毫秒执行一次* 要求开启集群模式*/run_with_period(100) {if (server.cluster_enabled) clusterCron();}clusterCron定时任务中做了些什么 我们可以找到cluster.c源码文件找到其中的clusterCron()定时任务其详细注释的源码已经贴在下文了。在这个定时文件中需要特别关注以下几点 定时任务大约每100毫秒执行一次每秒执行10次。遍历集群所有节点如果其他节点没有与本节点建立有效的连接link则尝试重新建立连接连接建立后立即发送PING命令。每执行 10 次至少间隔一秒钟仅此该随机发送PING消息间隔1秒就向一个随机节点发送 PING 信息。随机选择指的是选出 5 个随机节点并从中选择最近一次接收 PONG 回复距离现在时间最久的一个节点。每隔100毫秒进行主观下线检测如果等待某个节点回复PONG的时间超过cluster-node-timeout时间的一半则会重新连接链接。如果重启连接成功则立即向该节点发送PING消息。如果某个节点的 PONG 延迟超过了超时时间则将节点标记为主观下线PFAIL。如果当前节点是从节点则检查是否满足手动故障转移条件执行方法为clusterHandleManualFailover()这个方法将会mf_can_start标记设置为1表示接下来可以执行手动故障转移逻辑。如果手动故障转移超时则中止手动故障转移。手动故障转移超时时间默认5秒。如果当前节点是主节点且从节点数量最多则将一个从节点送给一个孤儿主节点。每隔100毫秒执行处理自动故障转移自动主从切换的逻辑包括选举新的主节点和更新集群状态执行方法为clusterHandleSlaveFailover()。这个我们后面会重点讲解先知道即可。 /* -----------------------------------------------------------------------------* CLUSTER cron job* -------------------------------------------------------------------------- *//* This is executed 10 times every second */ // 集群定时任务每0.1秒执行一次用于处理集群的心跳、故障检测、状态同步、故障转移等任务。 void clusterCron(void) {dictIterator *di; // 字典迭代器用于迭代集群中的所有节点dictEntry *de; // 用于保存迭代器返回的节点int update_state 0; // 是否更新集群状态// 孤儿主节点的数量孤儿节点即没有工作的从节点的主节点int orphaned_masters; /* How many masters there are without ok slaves. */// 单个主节点的最大从节点数量int max_slaves; /* Max number of ok slaves for a single master. */// 当前主节点的从节点数量int this_slaves; /* Number of ok slaves for our master (if we are slave). */mstime_t min_pong 0, now mstime(); // 最小的 PONG 时间当前时间clusterNode *min_pong_node NULL; // 最小 PONG 时间的节点即最晚发送 PONG 的节点static unsigned long long iteration 0; // 函数被调用的次数大概100毫秒调用一次mstime_t handshake_timeout; // 握手超时时间// 函数被调用的次数大概100毫秒调用一次iteration; /* Number of times this function was called so far. *//* We want to take myself-ip in sync with the cluster-announce-ip option.* The option can be set at runtime via CONFIG SET, so we periodically check* if the option changed to reflect this into myself-ip. *//** cluster-announce-ip 是一个重要的配置参数用于指定 Redis 节点对外宣告的 IP 地址。* 它的作用是解决 Redis 集群在复杂网络环境如 Docker、云服务器、NAT 等中可能遇到的通信问题。* 容器内的 Redis 节点可能会获取到 Docker 内部的 IP 地址如 172.17.0.2而外部节点无法通过这个 IP 地址访问它。* 此时可以通过 cluster-announce-ip 指定宿主机的 IP 地址。可以通过config set命令动态修改因此需要定时检查。*/// 检查集群的 announce ip 是否发生了变化如果发生了变化则更新 myself-ip{static char *prev_ip NULL; // 之前的 announce ipchar *curr_ip server.cluster_announce_ip;int changed 0;// 比较集群的 announce ip 和 之前的 announce ip 是否相同if (prev_ip NULL curr_ip ! NULL) changed 1;else if (prev_ip ! NULL curr_ip NULL) changed 1;else if (prev_ip curr_ip strcmp(prev_ip,curr_ip)) changed 1;// 如果发生了变化则更新 myself-ipif (changed) {if (prev_ip) zfree(prev_ip);prev_ip curr_ip;if (curr_ip) {/* We always take a copy of the previous IP address, by* duplicating the string. This way later we can check if* the address really changed. */prev_ip zstrdup(prev_ip); // 复制 announce ipstrncpy(myself-ip,server.cluster_announce_ip,NET_IP_STR_LEN);myself-ip[NET_IP_STR_LEN-1] \0;} else {myself-ip[0] \0; /* Force autodetection. 强制自动检测 */ }}}/* The handshake timeout is the time after which a handshake node that was* not turned into a normal node is removed from the nodes. Usually it is* just the NODE_TIMEOUT value, but when NODE_TIMEOUT is too small we use* the value of 1 second. */// 握手超时时间如果节点超时时间太小则使用 1 秒handshake_timeout server.cluster_node_timeout;if (handshake_timeout 1000) handshake_timeout 1000;/* Update myself flags. */// 更新 myself 的标志位flags 是一个 64 位的无符号整数用于记录节点是否允许故障转移// flags 还记录了节点的角色主节点或从节点、节点的状态在线或下线是否有地址等信息clusterUpdateMyselfFlags();/* Check if we have disconnected nodes and re-establish the connection.* Also update a few stats while we are here, that can be used to make* better decisions in other part of the code.* 检查集群中的节点是否断开连接并尝试重新建立连接。* 同时它还更新了一些统计信息这些信息可以在代码的其他部分用于做出更好的决策* *//* 获取一个安全的字典迭代器用于遍历集群中的所有节点 */ di dictGetSafeIterator(server.cluster-nodes);/* 初始化统计信息标记为 PFAIL(主观下线) 的节点数量 */server.cluster-stats_pfail_nodes 0;/* 遍历集群中的所有节点主要是与这些节点建立网络连接 */while((de dictNext(di)) ! NULL) {// 获取当前节点clusterNode *node dictGetVal(de);/* Not interested in reconnecting the link with myself or nodes* for which we have no address. *//* 如果节点是自身节点或者没有地址信息则跳过 */if (node-flags (CLUSTER_NODE_MYSELF|CLUSTER_NODE_NOADDR)) continue;/* 如果节点被标记为 PFAIL主观下线则增加 pfail 统计计数 */if (node-flags CLUSTER_NODE_PFAIL)server.cluster-stats_pfail_nodes;/* A Node in HANDSHAKE state has a limited lifespan equal to the* configured node timeout. *//* 如果节点处于 HANDSHAKE 状态并且超过了握手超时时间则集群删除该节点 */if (nodeInHandshake(node) now - node-ctime handshake_timeout) {clusterDelNode(node); continue;}/* 如果node节点没有与本节点建立有效的连接link则尝试重新建立连接 */if (node-link NULL) {int fd; // 用于保存连接的文件描述符mstime_t old_ping_sent; // 记录节点最近一次发送 PING 消息的时间clusterLink *link; // 用于保存连接的 clusterLink 结构/* 尝试以非阻塞方式连接到节点的 IP 和端口 */fd anetTcpNonBlockBindConnect(server.neterr, node-ip,node-cport, NET_FIRST_BIND_ADDR);/* 如果连接失败记录错误日志并跳过该节点 */if (fd -1) {/* We got a synchronous error from connect before* clusterSendPing() had a chance to be called.* If node-ping_sent is zero, failure detection cant work,* so we claim we actually sent a ping now (that will* be really sent as soon as the link is obtained). */// 记录本次发送 PING 消息的时间声明现在实际上已经“发送”了一个 ping 消息if (node-ping_sent 0) node-ping_sent mstime(); serverLog(LL_DEBUG, Unable to connect to Cluster Node [%s]:%d - %s, node-ip,node-cport, server.neterr);continue;}/* 创建新的 clusterLink 结构并将其与节点关联 */link createClusterLink(node);link-fd fd;node-link link;// 将连接的文件描述符添加到事件循环中监听读事件aeCreateFileEvent(server.el,link-fd,AE_READABLE,clusterReadHandler,link);/* Queue a PING in the new connection ASAP: this is crucial* to avoid false positives in failure detection.** If the node is flagged as MEET, we send a MEET message instead* of a PING one, to force the receiver to add us in its node* table. *//* 在新的连接上尽快发送一个 PING 消息这对于避免故障检测中的误报至关重要。如果节点被标记为 MEET则发送 MEET 消息而不是 PING以强制接收方将我们添加到其节点表中ping_sent 0 表示没有发送过 ping 消息,如果我们已经接收到PONG则node-ping_sent为零ping_sent ! 0 表示最近一次发送 ping 消息的时间*/old_ping_sent node-ping_sent;// clusterSendPing() 函数会发送一个 PING 消息,重新设置 ping_sent 时间clusterSendPing(link, node-flags CLUSTER_NODE_MEET ?CLUSTERMSG_TYPE_MEET : CLUSTERMSG_TYPE_PING);if (old_ping_sent) {/* If there was an active ping before the link was* disconnected, we want to restore the ping time, otherwise* replaced by the clusterSendPing() call. */// 如果连接断开之前有一个活动的 ping则我们要恢复 ping 时间// 否则由 clusterSendPing() 调用替换。node-ping_sent old_ping_sent;}/* We can clear the flag after the first packet is sent.* If well never receive a PONG, well never send new packets* to this node. Instead after the PONG is received and we* are no longer in meet/handshake status, we want to send* normal PING packets. */// 在发送第一个数据包后我们可以清除标志。node-flags ~CLUSTER_NODE_MEET;serverLog(LL_DEBUG,Connecting with Node %.40s at %s:%d,node-name, node-ip, node-cport);}}// 释放字典迭代器dictReleaseIterator(di);/* Ping some random node 1 time every 10 iterations, so that we usually ping* one random node every second. */// 每执行 10 次至少间隔一秒钟就向一个随机节点发送 gossip 信息if (!(iteration % 10)) {int j;/* Check a few random nodes and ping the one with the oldest* pong_received time. */// 随机 5 个节点选出其中一个最久与本节点通信的节点向其发送 PING 消息for (j 0; j 5; j) {// 随机在集群中挑选节点de dictGetRandomKey(server.cluster-nodes);clusterNode *this dictGetVal(de);/* Dont ping nodes disconnected or with a ping currently active. */// 不要 PING 连接断开的节点也不要 PING 最近已经 PING 过的节点if (this-link NULL || this-ping_sent ! 0) continue;// 跳过自身节点、无地址节点和握手节点if (this-flags (CLUSTER_NODE_MYSELF|CLUSTER_NODE_HANDSHAKE))continue;// 选出 5 个随机节点中最近一次接收 PONG 回复距离现在最旧的节点if (min_pong_node NULL || min_pong this-pong_received) {min_pong_node this; // 记录最久未接收 PONG 回复的节点min_pong this-pong_received; // 记录最久未接收 PONG 回复的时间}}// min_pong_node 就是五个节点中相对很久没通信的节点if (min_pong_node) {serverLog(LL_DEBUG,Pinging node %.40s, min_pong_node-name);// 发送 ping 消息clusterSendPing(min_pong_node-link, CLUSTERMSG_TYPE_PING);}}/* Iterate nodes to check if we need to flag something as failing.* This loop is also responsible to:* 1) Check if there are orphaned masters (masters without non failing* slaves).* 2) Count the max number of non failing slaves for a single master.* 3) Count the number of slaves for our master, if we are a slave. * /* 迭代节点以检查是否需要将某些节点标记为失败。* 此循环还负责* 1) 检查是否存在孤儿主节点即没有非失败从节点的主节点。* 2) 统计单个主节点拥有的最大非失败从节点数。* 3) 如果当前节点是从节点统计其主节点的从节点数。 */orphaned_masters 0; // 初始化孤儿主节点数max_slaves 0; // 初始化最大从节点数this_slaves 0; // 初始化当前主节点的从节点数di dictGetSafeIterator(server.cluster-nodes); // 获取集群节点的安全迭代器while((de dictNext(di)) ! NULL) {// 遍历所有节点clusterNode *node dictGetVal(de);now mstime(); /* 每次迭代都使用更新的时间 Use an updated time at every iteration. */mstime_t delay;// 如果节点是自身节点、无地址节点或握手节点则跳过if (node-flags (CLUSTER_NODE_MYSELF|CLUSTER_NODE_NOADDR|CLUSTER_NODE_HANDSHAKE))continue;/* Orphaned master check, useful only if the current instance* is a slave that may migrate to another master. *//* 孤儿主节点检查仅在当前实例是从节点且可能迁移到其他主节点时有用。 */if (nodeIsSlave(myself) nodeIsMaster(node) !nodeFailed(node)) {// 统计主节点的从节点数int okslaves clusterCountNonFailingSlaves(node);/* A master is orphaned if it is serving a non-zero number of* slots, have no working slaves, but used to have at least one* slave, or failed over a master that used to have slaves. */if (okslaves 0 node-numslots 0 node-flags CLUSTER_NODE_MIGRATE_TO){orphaned_masters;}if (okslaves max_slaves) max_slaves okslaves;if (nodeIsSlave(myself) myself-slaveof node)this_slaves okslaves;}/* If we are waiting for the PONG more than half the cluster* timeout, reconnect the link: maybe there is a connection* issue even if the node is alive. *//* 如果我们等待PONG的时间超过集群超时时间的一半则重新连接链接* 即使节点仍然存活也可能存在连接问题。 */if (node-link /* is connected */now - node-link-ctime // 链接已经存在的时间超过了超时时间server.cluster_node_timeout /* was not already reconnected */node-ping_sent /* 已经发送了PING we already sent a ping */node-pong_received node-ping_sent /*仍在等待PONG still waiting pong *//* 并且等待PONG的时间超过超时时间的一半 and we are waiting for the pong more than timeout/2 */now - node-ping_sent server.cluster_node_timeout/2){/* Disconnect the link, it will be reconnected automatically. *//* 断开链接它将自动重新连接。 */freeClusterLink(node-link);}/* If we have currently no active ping in this instance, and the* received PONG is older than half the cluster timeout, send* a new ping now, to ensure all the nodes are pinged without* a too big delay. *//* 如果当前实例没有活动的PING并且接收到的PONG时间超过集群超时时间的一半* 则立即发送一个新的PING以确保所有节点都能在不太大的延迟内被PING到。 */if (node-link node-ping_sent 0 (now - node-pong_received) server.cluster_node_timeout/2){clusterSendPing(node-link, CLUSTERMSG_TYPE_PING); // 发送PINGcontinue;}/* 如果我们是主节点并且某个从节点请求了手动故障转移则持续PING它。 *//* If we are a master and one of the slaves requested a manual* failover, ping it continuously. */if (server.cluster-mf_end // 有节点发起了手动故障转移nodeIsMaster(myself) // 我是主server.cluster-mf_slave node // 从节点请求了手动故障转移node-link) // 连接存活{clusterSendPing(node-link, CLUSTERMSG_TYPE_PING); // 发送PINGcontinue;}/* Check only if we have an active ping for this instance. *//* 仅在我们对该实例有活动的PING时进行检查。 */if (node-ping_sent 0) continue;/* Compute the delay of the PONG. Note that if we already received* the PONG, then node-ping_sent is zero, so cant reach this* code at all. *//* 计算PONG的延迟。注意如果我们已经接收到PONG则node-ping_sent为零* 因此无法执行到此代码。 */delay now - node-ping_sent;// pong的延迟超过cluster_node_timeout标记为主观下线 pfail// 如果节点的 PONG 延迟超过了超时时间则将节点标记为主观下线if (delay server.cluster_node_timeout) {/* Timeout reached. Set the node as possibly failing if it is* not already in this state. */// 如果节点不是已经标记为主观下线则将其标记为主观下线if (!(node-flags (CLUSTER_NODE_PFAIL|CLUSTER_NODE_FAIL))) {serverLog(LL_DEBUG,*** NODE %.40s possibly failing,node-name);node-flags | CLUSTER_NODE_PFAIL; // 将节点标记为主观下线update_state 1; // 需要更新集群状态}}}dictReleaseIterator(di); // 释放迭代器/* If we are a slave node but the replication is still turned off,* enable it if we know the address of our master and it appears to* be up. *//* 如果我是从节点但复制功能仍然关闭且我们知道主节点的地址并且主节点似乎在线* 则启用复制功能。 */if (nodeIsSlave(myself) server.masterhost NULL myself-slaveof nodeHasAddr(myself-slaveof)){replicationSetMaster(myself-slaveof-ip, myself-slaveof-port); // 设置主节点}/* Abourt a manual failover if the timeout is reached. *//* 如果手动故障转移超时到达则中止手动故障转移。超时时间默认5秒 */manualFailoverCheckTimeout();// 如果当前节点是从节点if (nodeIsSlave(myself)) {// 如果myself是从节点则检查是否满足手动故障转移条件// 如果满足条件会设置 mf_can_start 标志允许手动故障转移继续进行clusterHandleManualFailover();// 检查是否设置了 CLUSTER_MODULE_FLAG_NO_FAILOVER 标志。该标志用于禁用自动故障转移。if (!(server.cluster_module_flags CLUSTER_MODULE_FLAG_NO_FAILOVER))// 处理自动故障转移的逻辑包括选举新的主节点和更新集群状态clusterHandleSlaveFailover();/* If there are orphaned slaves, and we are a slave among the masters* with the max number of non-failing slaves, consider migrating to* the orphaned masters. Note that it does not make sense to try* a migration if there is no master with at least *two* working* slaves. */if (orphaned_masters max_slaves 2 this_slaves max_slaves)// 处理从节点迁移的逻辑将当前从节点迁移到孤儿主节点clusterHandleSlaveMigration(max_slaves);}// 更新集群if (update_state || server.cluster-state CLUSTER_FAIL)clusterUpdateState(); }手动故障转移设置 设置手动故障转移的代码如下假设在某个从节点上执行了cluster failover则会设置server.cluster-mf_end mstime() CLUSTER_MF_TIMEOUT;因此会执行下面的方法将 server.cluster-mf_can_start 1;设置为1表示接下来的代码可以通过该标记直接执行主从切换逻辑。注意server.cluster-mf_end 0表示没有接收到手动故障转移命令。 /* This function is called from the cluster cron function in order to go* forward with a manual failover state machine. 手动故障转移*/ void clusterHandleManualFailover(void) {/* Return ASAP if no manual failover is in progress. */// 检查是否正在进行手动故障转移为0则不是手动切换if (server.cluster-mf_end 0) return;/* If mf_can_start is non-zero, the failover was already triggered so the* next steps are performed by clusterHandleSlaveFailover(). */// 检查手动故障转移是否已经触发开始表示主从同步了下一步执行clusterHandleSlaveFailoverif (server.cluster-mf_can_start) return;// 等待主节点的复制偏移量if (server.cluster-mf_master_offset 0) return; /* Wait for offset... */// 检查从节点的复制偏移量是否与主节点一致如果主从同步了if (server.cluster-mf_master_offset replicationGetSlaveOffset()) {/* Our replication offset matches the master replication offset* announced after clients were paused. We can start the failover. */// 表示可以开始故障转移并记录日志等于1表示同步了server.cluster-mf_can_start 1;serverLog(LL_WARNING,All master replication stream processed, manual failover can start.);} }主从切换故障转移 在clusterHandleSlaveFailover()主从切换方法中既包括自动主从切换也执行了手动主存切换的逻辑。手动主从切换是立即要求其他主节点投票本节点的无需要延迟和等待客观下线。但是投票依旧需要。这里面有很多细节阅读下面代码的重点如下 该方法每隔100毫秒会被调用一次。选举超时时间等于两倍的cluster_node_timeout即如果在两倍的cluster_node_timeout时间内还未选举成功则选举失败如果在此时间内收到了半数主节点投票则选举成功。选举失败重试时间等于4倍的cluster_node_timeout即如果选举失败了则需要距离第一次选举开始4倍的cluster_node_timeout时间才可以开始下一轮选举。只有从节点且它的主节点下线了或者收到了手动故障转移标记才会进行下面的故障转移逻辑。下一次选举的时间是当前时间500ms 随机 0~500ms rank * 1000组成。当执行到发起选举的逻辑代码中会将当前纪元增加1将选举状态设置为1表示正在选举然后广播投票请求直接返回。100ms后再次被调用执行判断是否有收到主节点的投票数超过主节点半数。如果是则执行clusterFailoverReplaceYourMaster();方法替换主节点。否则打印日志等待投票。其中向其他节点发送投票广播可以看clusterRequestFailoverAuth方法。 /* This function is called if we are a slave node and our master serving* a non-zero amount of hash slots is in FAIL state.** 如果当前节点是一个从节点并且它正在复制的一个负责非零个槽的主节点处于 FAIL 状态那么执行这个函数。* * 这个函数有三个目标** 1) To check if we are able to perform a failover, is our data updated?* 检查是否可以对主节点执行一次故障转移节点的关于主节点的信息是否准确和最新updated* 2) Try to get elected by masters.* 选举一个新的主节点* 3) Perform the failover informing all the other nodes.* 执行故障转移并通知其他节点*/// 这个方法每100毫秒会被检查是否执行 // 作用当从节点的主节点被标记为 FAIL 状态时从节点会尝试发起故障转移选举自己为新的主节点。 void clusterHandleSlaveFailover(void) {mstime_t data_age; // 从节点与主节点的断开秒数// 距离上一次故障转移的秒数 现在时间 - 上一次故障转移的时间mstime_t auth_age mstime() - server.cluster-failover_auth_time; int needed_quorum (server.cluster-size / 2) 1; //选举所需的法定人数// 是否手动故障转移如果设置了mf_end和mf_can_start就是手动切换int manual_failover server.cluster-mf_end ! 0 server.cluster-mf_can_start;// auth_timeout 选举超时时间等于两倍的cluster_node_timeout // auth_retry_time 选举失败重试时间等于4倍的cluster_node_timeout mstime_t auth_timeout, auth_retry_time; // 清除 server.cluster-todo_before_sleep 中的 CLUSTER_TODO_HANDLE_FAILOVER 标志位server.cluster-todo_before_sleep ~CLUSTER_TODO_HANDLE_FAILOVER;/* Compute the failover timeout (the max time we have to send votes* and wait for replies), and the failover retry time (the time to wait* before trying to get voted again).** Timeout is MAX(NODE_TIMEOUT*2,2000) milliseconds.* Retry is two times the Timeout.*/auth_timeout server.cluster_node_timeout*2; // 默认选举超时时间为 15*2 30 秒if (auth_timeout 2000) auth_timeout 2000; // 最小为 2 秒// 选举失败后的重试时间是 auth_timeout 的两倍。 30*2 60 秒// 即距离第一次选举开始时间需要间隔60s才能再次发起选举auth_retry_time auth_timeout*2; /* Pre conditions to run the function, that must be met both in case* of an automatic or manual failover:* 1) We are a slave. 我是从节点* 2) Our master is flagged as FAIL, or this is a manual failover. 我的主客观下线了fail或者手动切换* 3) We dont have the no failover configuration set, and this is * not a manual failover. 自动切换需要主节点是fail状态而且没有设置cluster-replica-no-failover no* 4) It is serving slots. 我的主节点被分配了slots */// 检查从节点是否有资格发起故障转移,同时需要满足以上四个条件if (nodeIsMaster(myself) ||myself-slaveof NULL ||(!nodeFailed(myself-slaveof) !manual_failover) ||(server.cluster_slave_no_failover !manual_failover) ||myself-slaveof-numslots 0){/* There are no reasons to failover, so we set the reason why we* are returning without failing over to NONE. */// 设置不能发起故障转移的原因为 CLUSTER_CANT_FAILOVER_NONEserver.cluster-cant_failover_reason CLUSTER_CANT_FAILOVER_NONE;return;}/* Set data_age to the number of seconds we are disconnected from* the master. */// 计算从节点与主节点断开连接的时间 data_age 秒// 如果从节点当前与主节点连接则 data_age 为当前时间减去最后一次与主节点交互的时间。if (server.repl_state REPL_STATE_CONNECTED) {data_age (mstime_t)(server.unixtime - server.master-lastinteraction)* 1000;} else {data_age (mstime_t)(server.unixtime - server.repl_down_since) * 1000;}/* Remove the node timeout from the data age as it is fine that we are* disconnected from our master at least for the time it was down to be* flagged as FAIL, thats the baseline. */// 如果从节点与主节点断开连接的时间超过了节点超时时间那么减去节点超时时间// cluster-node-timeout 的时间不计入断线时间之内if (data_age server.cluster_node_timeout)data_age - server.cluster_node_timeout;/* Check if our data is recent enough according to the slave validity* factor configured by the user.** Check bypassed for manual failovers. */// 检查从节点的数据是否足够新// 目前的检测办法是断线时间不能超过 cluster-node-timeout 的十倍if (server.cluster_slave_validity_factor data_age (((mstime_t)server.repl_ping_slave_period * 1000) (server.cluster_node_timeout * server.cluster_slave_validity_factor))){if (!manual_failover) {// 如果不是手动切换clusterLogCantFailover(CLUSTER_CANT_FAILOVER_DATA_AGE);return;}}/* If the previous failover attempt timedout and the retry time has* elapsed, we can setup a new one. */// 如果距离上一次发起选举的时间的秒数auth_age超过了选举的重试时间auth_retry_time// 则重新调度下一次选举if (auth_age auth_retry_time) {// 设置下一次选举的时间500ms 随机 0~500msserver.cluster-failover_auth_time mstime() 500 /* Fixed delay of 500 milliseconds, let FAIL msg propagate. */random() % 500; /* Random delay between 0 and 500 milliseconds. */server.cluster-failover_auth_count 0; // 投票数清零server.cluster-failover_auth_sent 0; // 投票发送状态清零server.cluster-failover_auth_rank clusterGetSlaveRank(); // 获取当前节点的排名/* We add another delay that is proportional to the slave rank.* Specifically 1 second * rank. This way slaves that have a probably* less updated replication offset, are penalized. */// 偏移量最小的从节点等待时间最短偏移量最大的从节点等待时间最长server.cluster-failover_auth_time server.cluster-failover_auth_rank * 1000;/* However if this is a manual failover, no delay is needed. */// 如果是手动切换则不需要等待立即发起选举优先级为0if (server.cluster-mf_end) {server.cluster-failover_auth_time mstime();server.cluster-failover_auth_rank 0;// 设置可以进行故障转移的状态clusterDoBeforeSleep(CLUSTER_TODO_HANDLE_FAILOVER); }// 打印下一次选举的日志serverLog(LL_WARNING,Start of election delayed for %lld milliseconds (rank #%d, offset %lld).,server.cluster-failover_auth_time - mstime(),server.cluster-failover_auth_rank,replicationGetSlaveOffset());/* Now that we have a scheduled election, broadcast our offset* to all the other slaves so that theyll updated their offsets* if our offset is better. */// 广播消息通知其他从节点更新自己的复制偏移量clusterBroadcastPong(CLUSTER_BROADCAST_LOCAL_SLAVES);return;}/* It is possible that we received more updated offsets from other* slaves for the same master since we computed our election delay.* Update the delay if our rank changed.** Not performed if this is a manual failover. */// 如果是自动切换则执行下面的根据rank设置延迟时间if (server.cluster-failover_auth_sent 0 // 没有向其他节点发起选举投票server.cluster-mf_end 0) // 不是手动切换{int newrank clusterGetSlaveRank(); // 获取当前节点的优先级排名if (newrank server.cluster-failover_auth_rank) { // 如果当前节点的优先级排名大于上一次的优先级排名long long added_delay // 计算增加的延迟时间(newrank - server.cluster-failover_auth_rank) * 1000;server.cluster-failover_auth_time added_delay;server.cluster-failover_auth_rank newrank;// 打印延迟多少毫秒后发起选举serverLog(LL_WARNING,Replica rank updated to #%d, added %lld milliseconds of delay.,newrank, added_delay);}}/* Return ASAP if we cant still start the election. */// 如果当前时间mstime()还未达到下一次选举的时间failover_auth_time则不能发起选举。if (mstime() server.cluster-failover_auth_time) {// 打印还需要等待多少毫秒才能发起选举等最少100msclusterLogCantFailover(CLUSTER_CANT_FAILOVER_WAITING_DELAY);return;}// 距离上一次选举开始的时间超过了选举超时时间则本轮选举失败/* Return ASAP if the election is too old to be valid. */if (auth_age auth_timeout) {// 即选举时间超过了 2*cluster_node_timeout选举失败clusterLogCantFailover(CLUSTER_CANT_FAILOVER_EXPIRED);return;}/* Ask for votes if needed. */// 执行到这里表示前面都不符合说明可以发起选举if (server.cluster-failover_auth_sent 0) { // 没有向其他节点发起选举投票server.cluster-currentEpoch; // 当前纪元1server.cluster-failover_auth_epoch server.cluster-currentEpoch; // 设置选举纪元serverLog(LL_WARNING,Starting a failover election for epoch %llu.,(unsigned long long) server.cluster-currentEpoch);// 向所有集群节点发送投票请求clusterRequestFailoverAuth();server.cluster-failover_auth_sent 1; // 选举状态置为1表示正在选举// 设置三个信号量// 在事件循环的 beforeSleep 阶段执行一些与集群相关的任务。beforeSleep 是 Redis 事件循环的一部分// 每次事件循环进入休眠之前都会调用这个阶段用于处理一些需要在事件循环休眠前完成的任务。clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG| // 保存配置CLUSTER_TODO_UPDATE_STATE| // 更新状态CLUSTER_TODO_FSYNC_CONFIG); // 同步配置文件到磁盘return; /* Wait for replies. 等待投票等待时间 cluster-node-timeout*2除非收到投票大于半数 */ }/* Check if we reached the quorum. */// 如果当前节点获得的投票数大于等于法定人数则故障转移成功if (server.cluster-failover_auth_count needed_quorum) {/* We have the quorum, we can finally failover the master. */serverLog(LL_WARNING,Failover election won: Im the new master.);/* Update my configEpoch to the epoch of the election. */if (myself-configEpoch server.cluster-failover_auth_epoch) {myself-configEpoch server.cluster-failover_auth_epoch;serverLog(LL_WARNING,configEpoch set to %llu after successful failover,(unsigned long long) myself-configEpoch);}/* Take responsibility for the cluster slots. */clusterFailoverReplaceYourMaster();} else {// 如果没有获得过半的选票clusterLogCantFailover(CLUSTER_CANT_FAILOVER_WAITING_VOTES);} }广播投票请求代码 void clusterRequestFailoverAuth(void) {unsigned char buf[sizeof(clusterMsg)];clusterMsg *hdr (clusterMsg*) buf;uint32_t totlen;// 构建消息头设置消息类型为 CLUSTERMSG_TYPE_FAILOVER_AUTH_REQUEST// 即故障转移授权请求clusterBuildMessageHdr(hdr,CLUSTERMSG_TYPE_FAILOVER_AUTH_REQUEST);/* If this is a manual failover, set the CLUSTERMSG_FLAG0_FORCEACK bit* in the header to communicate the nodes receiving the message that* they should authorized the failover even if the master is working. */// 如果是手动故障转移设置 CLUSTERMSG_FLAG0_FORCEACK 标志// 通知接收到消息的节点即使主节点正常工作也应该授权故障转移(强制授权)if (server.cluster-mf_end) hdr-mflags[0] | CLUSTERMSG_FLAG0_FORCEACK;totlen sizeof(clusterMsg)-sizeof(union clusterMsgData);hdr-totlen htonl(totlen);// 将消息广播发送给集群中的所有节点clusterBroadcastMessage(buf,totlen); }投票请求处理与收到投票处理 找到clusterProcessPacket()方法这个方法主要是用于处理收到的请求我们找到处理投票请求和手动切换请求的消息处理逻辑。值得注意以下几点 投票必须由主节点进行投票。请求的配置纪元必须大于等于当前节点的配置纪元。如果在cluster_node_timeout * 2默认30秒内已经投过票了则不在进行投票第二次投票的时间间隔必须大于两倍的集群节点超时时间。对于投票请求处理很简单只是检查请求者是否满足条件如果满足代码中的三个条件则进行投票并设置故障转移检查标记尽快执行检查是否大于半数投票的逻辑。另外如果是手动触发的故障转移同样需要投票这里手动故障转移超时时间默认5秒。并且暂停客户端请求防止在故障转移期间写入数据,暂停的时间是 CLUSTER_MF_TIMEOUT * 2默认10秒直到故障转移成功。 // 收到投票请求 else if (type CLUSTERMSG_TYPE_FAILOVER_AUTH_REQUEST) {// 如果不知道发送者直接返回if (!sender) return 1; /* We dont know that node. */// 发送投票消息这段代码详细解释如下clusterSendFailoverAuthIfNeeded(sender,hdr);// 故障转移投票确认其他主节点会通过发送 CLUSTERMSG_TYPE_FAILOVER_AUTH_ACK 消息来表示支持} else if (type CLUSTERMSG_TYPE_FAILOVER_AUTH_ACK) {if (!sender) return 1; /* We dont know that node. *//* We consider this vote only if the sender is a master serving* a non zero number of slots, and its currentEpoch is greater or* equal to epoch where this node started the election. */// 发送者必须是一个主节点只有主节点有投票权// 发送者必须负责至少一个槽即它是一个有效的主节点// 发送者的当前 epoch 必须大于或等于当前节点发起故障转移选举时的 epoch。这是为了防止过期的投票if (nodeIsMaster(sender) sender-numslots 0 senderCurrentEpoch server.cluster-failover_auth_epoch){// 投票计数server.cluster-failover_auth_count;/* Maybe we reached a quorum here, set a flag to make sure* we check ASAP. */// 检查是否达到法定票数clusterDoBeforeSleep(CLUSTER_TODO_HANDLE_FAILOVER);}// 向主节点发送手动切换的消息 } else if (type CLUSTERMSG_TYPE_MFSTART) {/* This message is acceptable only if Im a master and the sender* is one of my slaves. */// 我是master发送者是我的slave,消息才会被接受if (!sender || sender-slaveof ! myself) return 1;/* Manual failover requested from slaves. Initialize the state* accordingly. */// 重设手动故障转移状态resetManualFailover();// 设置手动故障转移超时时间戳server.cluster-mf_end mstime() CLUSTER_MF_TIMEOUT;// 记录发起手动故障转移的从节点server.cluster-mf_slave sender;// 暂停客户端请求防止在故障转移期间写入数据,暂停的时间是 CLUSTER_MF_TIMEOUT * 2默认 10 秒pauseClients(mstime()(CLUSTER_MF_TIMEOUT*2));// 记录日志通知管理员手动故障转移已由某个从节点发起serverLog(LL_WARNING,Manual failover requested by replica %.40s.,sender-name);} clusterSendFailoverAuthIfNeeded()请求投票的逻辑如下 /* Vote for the node asking for our vote if there are the conditions.* 手动故障转移的投票逻辑主要体现在对 force_ack 标志的处理上绕过主节点状态的检查即使主节点没有处于 FAIL 状态投票节点仍然可以投票。其他条件与自动故障转移一致包括配置纪元检查、重复投票检查、槽布局配置纪元检查等。投票行为与自动故障转移一致满足条件后投票节点会记录投票轮次和投票时间并向请求节点发送投票。*/ void clusterSendFailoverAuthIfNeeded(clusterNode *node, clusterMsg *request) {// 请求节点的主节点clusterNode *master node-slaveof;// 请求节点的当前配置纪元uint64_t requestCurrentEpoch ntohu64(request-currentEpoch);// 请求节点想要获得投票的纪元uint64_t requestConfigEpoch ntohu64(request-configEpoch);// 请求节点的槽布局unsigned char *claimed_slots request-myslots;// 如果设置了 CLUSTERMSG_FLAG0_FORCEACK 标志则 force_ack 为 1表示这是一个手动故障转移请求// 手动故障转移的唯一区别在于force_ack 为 1 时可以绕过主节点状态的检查。int force_ack request-mflags[0] CLUSTERMSG_FLAG0_FORCEACK;int j;/* IF we are not a master serving at least 1 slot, we dont have the* right to vote, as the cluster size in Redis Cluster is the number* of masters serving at least one slot, and quorum is the cluster* size 1 */// 如果节点为从节点或者是一个没有处理任何槽的主节点// 那么它没有投票权if (nodeIsSlave(myself) || myself-numslots 0) return;/* Request epoch must be our currentEpoch.* Note that it is impossible for it to actually be greater since* our currentEpoch was updated as a side effect of receiving this* request, if the request epoch was greater. */// 请求的配置纪元必须大于等于当前节点的配置纪元if (requestCurrentEpoch server.cluster-currentEpoch) {serverLog(LL_WARNING,Failover auth denied to %.40s: reqEpoch (%llu) curEpoch(%llu),node-name,(unsigned long long) requestCurrentEpoch,(unsigned long long) server.cluster-currentEpoch);return;}/* I already voted for this epoch? Return ASAP. */// 已经投过票了if (server.cluster-lastVoteEpoch server.cluster-currentEpoch) {serverLog(LL_WARNING,Failover auth denied to %.40s: already voted for epoch %llu,node-name,(unsigned long long) server.cluster-currentEpoch);return;}/* Node must be a slave and its master down.* The master can be non failing if the request is flagged* with CLUSTERMSG_FLAG0_FORCEACK (manual failover). */// 节点必须是 slave 并且它的 master 不能为null并且master节点是挂了if (nodeIsMaster(node) || master NULL ||// 如果master在线且是手动故障转移也执行投票下面逻辑绕过了master必须存活的投票逻辑(!nodeFailed(master) !force_ack)) // force_ack 为 1 时可以绕过主节点 FAIL 状态的检查{if (nodeIsMaster(node)) {serverLog(LL_WARNING,Failover auth denied to %.40s: it is a master node,node-name);} else if (master NULL) {serverLog(LL_WARNING,Failover auth denied to %.40s: I dont know its master,node-name);} else if (!nodeFailed(master)) {// ((n)-flags CLUSTER_NODE_PFAIL)serverLog(LL_WARNING,Failover auth denied to %.40s: its master is up,node-name);}return;}/* We did not voted for a slave about this master for two* times the node timeout. This is not strictly needed for correctness* of the algorithm but makes the base case more linear. */// 如果之前一段时间已经对请求节点进行过投票那么不进行投票// 第二次投票的时间间隔必顋大于两倍的集群节点超时时间if (mstime() - node-slaveof-voted_time server.cluster_node_timeout * 2){serverLog(LL_WARNING,Failover auth denied to %.40s: cant vote about this master before %lld milliseconds,node-name,(long long) ((server.cluster_node_timeout*2)-(mstime() - node-slaveof-voted_time)));return;}/* The slave requesting the vote must have a configEpoch for the claimed* slots that is the one of the masters currently serving the same* slots in the current configuration. */for (j 0; j CLUSTER_SLOTS; j) {// 跳过未指派节点if (bitmapTestBit(claimed_slots, j) 0) continue;// 查找是否有某个槽的配置纪元大于节点请求的纪元if (server.cluster-slots[j] NULL ||server.cluster-slots[j]-configEpoch requestConfigEpoch){continue;}/* If we reached this point we found a slot that in our current slots* is served by a master with a greater configEpoch than the one claimed* by the slave requesting our vote. Refuse to vote for this slave. */// 如果有的话说明节点请求的纪元已经过期没有必要进行投票serverLog(LL_WARNING,Failover auth denied to %.40s: slot %d epoch (%llu) reqEpoch (%llu),node-name, j,(unsigned long long) server.cluster-slots[j]-configEpoch,(unsigned long long) requestConfigEpoch);return;}/* We can vote for this slave. */// 记录投票轮次和投票时间server.cluster-lastVoteEpoch server.cluster-currentEpoch;node-slaveof-voted_time mstime();clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG);// 为节点投票clusterSendFailoverAuth(node);serverLog(LL_WARNING, Failover auth granted to %.40s for epoch %llu,node-name, (unsigned long long) server.cluster-currentEpoch); } 我们再上述代码中总是可以看见clusterDoBeforeSleep(CLUSTER_TODO_HANDLE_FAILOVER);类似代码这里节点解释就是设置集群的标记参数就是标记然后在每次进入事件驱动库的主循环之前理解为另外一个定时任务都会调用clusterBeforeSleep()方法然后会判断设置的标记从而执行对应的操作代码如下 /* This function is called before the event handler returns to sleep for* events. It is useful to perform operations that must be done ASAP in* reaction to events fired but that are not safe to perform inside event* handlers, or to perform potentially expansive tasks that we need to do* a single time before replying to clients. */ /* 这个函数在事件处理程序返回并进入事件等待之前被调用。* 它用于执行那些必须尽快响应事件但又不适合在事件处理程序中执行的操作* 或者用于执行那些在回复客户端之前需要一次性完成的可能耗时的任务。 */ void clusterBeforeSleep(void) {/* Handle failover, this is needed when it is likely that there is already* the quorum from masters in order to react fast. *//* 处理故障转移当已经有足够的主节点达成共识时需要快速响应。 */if (server.cluster-todo_before_sleep CLUSTER_TODO_HANDLE_FAILOVER)clusterHandleSlaveFailover(); // 调用处理从节点故障转移的函数/* Update the cluster state. *//* 更新集群状态。 */if (server.cluster-todo_before_sleep CLUSTER_TODO_UPDATE_STATE)clusterUpdateState();/* 保存配置可能会使用 fsync 进行同步。 *//* Save the config, possibly using fsync. */if (server.cluster-todo_before_sleep CLUSTER_TODO_SAVE_CONFIG) {int fsync server.cluster-todo_before_sleep CLUSTER_TODO_FSYNC_CONFIG;clusterSaveConfigOrDie(fsync);}/* Reset our flags (not strictly needed since every single function* called for flags set should be able to clear its flag). *//* 重置标志位严格来说不是必须的因为每个被调用的函数都应该能够清除自己的标志位。 */server.cluster-todo_before_sleep 0; }以上就是故障转移的核心源码另外重要主观下线和客观下线源码和PING消息发送源码解读将在下一篇文章中继续分析
http://www.dnsts.com.cn/news/150395.html

相关文章:

  • 郑州服装 网站建设公司名字变了网站备案
  • 常平网站开发加强网站内容建设创新
  • 亿度网络网站建设建设银行江西分行官方网站
  • 高校邦营销型网站建设测验答案wordpress模板站如何安装
  • 网站开发入股合作分配比例seo上首页
  • 网站域名收费标准线上交易商城平台开发
  • 保险咨询网站建设网站空间服务多少钱
  • 三水营销网站开发咋建网站
  • wordpress建站成本中山百度网站建设
  • 公司网站优化推广方案中国建设教育协会网站培训中心
  • 网站软件大全免费下外贸销售渠道
  • 网站空间wordpress文件在哪
  • 中国建设造价协会网站产品设计网张
  • 网站建设方案设计心得百度百科分类方法
  • 网页设计个人网站心得体会西安做网站电话
  • 国外一个专门做配乐的网站招聘网站哪个好用
  • 招聘网站竞品分析怎么做去哪找网站建设公司好
  • 公司网站开发费用济南兴田德润o评价小企业做网站怎么做
  • 东营网站建设哪家更好广州免费孕检
  • 做五金外贸哪个网站比较好网站建设公司新闻
  • 驻马店做网站优化网络推广的方法
  • 服装公司网站东莞建设企业网站公司
  • 广州网站建设 领航科技wordpress怎样建立二级菜单
  • 国家精品资源共享课程建设网站网站下载app连接怎么做
  • 宣讲家网站美丽乡村建设2022全国封城名单
  • 郑州餐饮网站建设哪家好网站建设的提成
  • 南京学校网站制作英文网站模版
  • 做配资网站多少钱网站完成上线时间
  • 瑞金网站建设推广响应式网站开发图标
  • php 网站配置电子商务网站平台有哪些