For some reason I can no longer move a resource with pcs.
pacemaker-1.1.16-12.el7_4.8.x86_64
corosync-2.4.0-9.el7_4.2.x86_64
pcs-0.9.158-6.el7.centos.1.x86_64
Linux server_a.test.local 3.10.0-693.el7.x86_64
I have four resources configured as part of one resource group. Below is the log from my attempt to move the ClusterIP resource from server_d to server_a with this command:
pcs resource move ClusterIP server_a.test.local
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Forwarding cib_delete operation for section constraints to all (origin=local/crm_resource/3)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: --- 0.24.0 2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: +++ 0.25.0 (null)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: -- /cib/configuration/constraints/rsc_location[@id='cli-prefer-ClusterIP']
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: + /cib: @epoch=25
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: abort_transition_graph: Transition aborted by deletion of rsc_location[@id='cli-prefer-ClusterIP']: Configuration change | cib=0.25.0 source=te_update_diff:456 path=/cib/configuration/constraints/rsc_location[@id='cli-prefer-ClusterIP'] complete=true
Apr 06 12:16:26 [17292] server_d.test.local crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Completed cib_delete operation for section constraints: OK (rc=0, origin=server_d.test.local/crm_resource/3, version=0.25.0)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: determine_online_status: Node server_a.test.local is online
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: determine_online_status: Node server_d.test.local is online
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: group_print: Resource Group: my_app
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: Apache (systemd:httpd): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: stunnel (systemd:stunnel-my_app): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: common_print: my_app-daemon (systemd:my_app): Started server_d.test.local
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
Apr 06 12:16:26 [17291] server_d.test.local pengine: notice: process_pe_message: Calculated transition 8, saving inputs in /var/lib/pacemaker/pengine/pe-input-18.bz2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Forwarding cib_modify operation for section constraints to all (origin=local/crm_resource/4)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: --- 0.25.0 2
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: Diff: +++ 0.26.0 (null)
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: + /cib: @epoch=26
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_perform_op: ++ /cib/configuration/constraints: <rsc_location id="cli-prefer-ClusterIP" rsc="ClusterIP" role="Started" node="server_a.test.local" score="INFINITY"/>
Apr 06 12:16:26 [17287] server_d.test.local cib: info: cib_process_request: Completed cib_modify operation for section constraints: OK (rc=0, origin=server_d.test.local/crm_resource/4, version=0.26.0)
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: abort_transition_graph: Transition aborted by rsc_location.cli-prefer-ClusterIP 'create': Configuration change | cib=0.26.0 source=te_update_diff:456 path=/cib/configuration/constraints complete=true
Apr 06 12:16:26 [17292] server_d.test.local crmd: info: handle_response: pe_calc calculation pe_calc-dc-1523016986-67 is obsolete
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: determine_online_status: Node server_a.test.local is online
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: determine_online_status: Node server_d.test.local is online
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 1 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: unpack_node_loop: Node 2 is already processed
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: group_print: Resource Group: my_app
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: Apache (systemd:httpd): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: stunnel (systemd:stunnel-my_app): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: common_print: my_app-daemon (systemd:my_app): Started server_d.test.local
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave ClusterIP (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave Apache (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave stunnel (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: info: LogActions: Leave my_app-daemon (Started server_d.test.local)
Apr 06 12:16:27 [17291] server_d.test.local pengine: notice: process_pe_message: Calculated transition 9, saving inputs in /var/lib/pacemaker/pengine/pe-input-19.bz2
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_te_invoke: Processing graph 9 (ref=pe_calc-dc-1523016987-68) derived from /var/lib/pacemaker/pengine/pe-input-19.bz2
Apr 06 12:16:27 [17292] server_d.test.local crmd: notice: run_graph: Transition 9 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-19.bz2): Complete
Apr 06 12:16:27 [17292] server_d.test.local crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
Apr 06 12:16:27 [17292] server_d.test.local crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-34.raw
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Wrote version 0.25.0 of the CIB to disk (digest: 7511cba55b6c2f2f481a51d5585b8d36)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.tPIv7m (digest: /var/lib/pacemaker/cib/cib.OwHiKz)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-35.raw
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Wrote version 0.26.0 of the CIB to disk (digest: 7f962ed676a49e84410eee2ee04bae8c)
Apr 06 12:16:27 [17287] server_d.test.local cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.MnRP4u (digest: /var/lib/pacemaker/cib/cib.B5sWNH)
Apr 06 12:16:31 [17287] server_d.test.local cib: info: cib_process_ping: Reporting our current digest to server_d.test.local: 8182592cb4922cbf007158ab0a277190 for 0.26.0 (0x5575234afde0 0)
One thing to note: if I run pcs cluster stop server_b.test.local, all the resources in the group do move to the other node.
What is going on? As I said, this used to work, and nothing has changed since then.
Thanks in advance!
EDIT: here is the output of pcs config:
[root@server_a ~]# pcs config
Cluster Name: my_app_cluster
Corosync Nodes:
server_a.test.local server_d.test.local
Pacemaker Nodes:
server_a.test.local server_d.test.local
Resources:
Group: my_app
Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: cidr_netmask=24 ip=10.116.63.49
Operations: monitor interval=10s timeout=20s (ClusterIP-monitor-interval-10s)
start interval=0s timeout=20s (ClusterIP-start-interval-0s)
stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
Resource: Apache (class=systemd type=httpd)
Operations: monitor interval=60 timeout=100 (Apache-monitor-interval-60)
start interval=0s timeout=100 (Apache-start-interval-0s)
stop interval=0s timeout=100 (Apache-stop-interval-0s)
Resource: stunnel (class=systemd type=stunnel-my_app)
Operations: monitor interval=60 timeout=100 (stunnel-monitor-interval-60)
start interval=0s timeout=100 (stunnel-start-interval-0s)
stop interval=0s timeout=100 (stunnel-stop-interval-0s)
Resource: my_app-daemon (class=systemd type=my_app)
Operations: monitor interval=60 timeout=100 (my_app-daemon-monitor-interval-60)
start interval=0s timeout=100 (my_app-daemon-start-interval-0s)
stop interval=0s timeout=100 (my_app-daemon-stop-interval-0s)
Stonith Devices:
Fencing Levels:
Location Constraints:
Resource: Apache
Enabled on: server_d.test.local (score:INFINITY) (role: Started) (id:cli-prefer-Apache)
Resource: ClusterIP
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-ClusterIP)
Resource: my_app-daemon
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-my_app-daemon)
Resource: stunnel
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-stunnel)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
Alerts:
No alerts defined
Resources Defaults:
No defaults set
Operations Defaults:
No defaults set
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: my_app_cluster
dc-version: 1.1.16-12.el7_4.8-94ff4df
have-watchdog: false
stonith-enabled: false
Quorum:
Options:
EDIT 2
When I run crm_simulate -sL, I get the following output:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_a.test.local
Apache (systemd:httpd): Started server_a.test.local
stunnel (systemd:stunnel-my_app): Started server_a.test.local
my_app-daemon (systemd:my_app): Started server_a.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: INFINITY
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: INFINITY
group_color: stunnel allocation score on server_a.test.local: INFINITY
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: INFINITY
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: INFINITY
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: INFINITY
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: INFINITY
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
Transition Summary:
I then deleted all the resources and re-added them (exactly as before; I had documented the steps). Now crm_simulate -sL gives a different result:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_a.test.local
Apache (systemd:httpd): Started server_a.test.local
stunnel (systemd:stunnel-my_app.service): Started server_a.test.local
my_app-daemon (systemd:my_app.service): Started server_a.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: 0
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: 0
group_color: stunnel allocation score on server_a.test.local: 0
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: 0
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: 0
native_color: ClusterIP allocation score on server_d.test.local: 0
native_color: Apache allocation score on server_a.test.local: 0
native_color: Apache allocation score on server_d.test.local: -INFINITY
native_color: stunnel allocation score on server_a.test.local: 0
native_color: stunnel allocation score on server_d.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_a.test.local: 0
native_color: my_app-daemon allocation score on server_d.test.local: -INFINITY
I can now move the resources, but after doing so, re-running crm_simulate -sL shows yet another output:
[root@server_a ~]# crm_simulate -sL
Current cluster status:
Online: [ server_a.test.local server_d.test.local ]
Resource Group: my_app
ClusterIP (ocf::heartbeat:IPaddr2): Started server_d.test.local
Apache (systemd:httpd): Started server_d.test.local
stunnel (systemd:stunnel-my_app.service): Started server_d.test.local
my_app-daemon (systemd:my_app.service): Started server_d.test.local
Allocation scores:
group_color: my_app allocation score on server_a.test.local: 0
group_color: my_app allocation score on server_d.test.local: 0
group_color: ClusterIP allocation score on server_a.test.local: 0
group_color: ClusterIP allocation score on server_d.test.local: INFINITY
group_color: Apache allocation score on server_a.test.local: 0
group_color: Apache allocation score on server_d.test.local: 0
group_color: stunnel allocation score on server_a.test.local: 0
group_color: stunnel allocation score on server_d.test.local: 0
group_color: my_app-daemon allocation score on server_a.test.local: 0
group_color: my_app-daemon allocation score on server_d.test.local: 0
native_color: ClusterIP allocation score on server_a.test.local: 0
native_color: ClusterIP allocation score on server_d.test.local: INFINITY
native_color: Apache allocation score on server_a.test.local: -INFINITY
native_color: Apache allocation score on server_d.test.local: 0
native_color: stunnel allocation score on server_a.test.local: -INFINITY
native_color: stunnel allocation score on server_d.test.local: 0
native_color: my_app-daemon allocation score on server_a.test.local: -INFINITY
native_color: my_app-daemon allocation score on server_d.test.local: 0
Transition Summary:
I'm a bit confused. Is this the expected behavior?
Answer 1
I'm not sure whether my last answer was correct, but after looking more closely at man pcs I found the following:
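move <resource id> [destination node] [--master] [lifetime=<lifetime>] [--wait[=n]]
Move the resource off the node it is currently running on by creating a -INFINITY location constraint to ban the node. If destination node is specified the resource will be moved to that node by creating an INFINITY location constraint to prefer the destination node. If --master is used the scope of the command is limited to the master role and you must use the master id (instead of the resource id). If lifetime is specified then the constraint will expire after that time, otherwise it defaults to infinity and the constraint can be cleared manually with 'pcs resource clear' or 'pcs constraint delete'. If --wait is specified, pcs will wait up to 'n' seconds for the resource to move and then return 0 on success or 1 on error. If 'n' is not specified it defaults to 60 minutes. If you want the resource to preferably avoid running on some nodes but be able to failover to them use 'pcs location avoids'.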
Running pcs resource clear removes the constraint, and the resource can then be moved.
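To see which constraints would need clearing, here is a minimal Python sketch. It assumes the "Location Constraints" section of the pcs config output from the question has been pasted in as text, and that constraints left behind by pcs resource move carry ids starting with cli-prefer-, as they do in that output. The function names are illustrative only and are not part of pcs.

```python
import re

# Location-constraint section copied from the "pcs config" output in the question.
PCS_CONFIG_LOCATIONS = """\
Location Constraints:
Resource: Apache
Enabled on: server_d.test.local (score:INFINITY) (role: Started) (id:cli-prefer-Apache)
Resource: ClusterIP
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-ClusterIP)
Resource: my_app-daemon
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-my_app-daemon)
Resource: stunnel
Enabled on: server_a.test.local (score:INFINITY) (role: Started) (id:cli-prefer-stunnel)
"""

# Matches an "Enabled on:" line whose constraint id starts with cli-prefer-.
ENABLED_RE = re.compile(
    r"Enabled on:\s+(\S+)\s+\(score:([^)]+)\).*\(id:(cli-prefer-[^)]+)\)"
)


def leftover_move_constraints(text):
    """Return (resource, node, score, constraint_id) for each cli-prefer-* constraint."""
    found = []
    resource = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Resource:"):
            # Remember which resource the following "Enabled on:" lines belong to.
            resource = line.split(":", 1)[1].strip()
            continue
        match = ENABLED_RE.match(line)
        if match and resource is not None:
            node, score, constraint_id = match.groups()
            found.append((resource, node, score, constraint_id))
    return found


def clear_commands(constraints):
    """Build one 'pcs resource clear <resource>' command per affected resource."""
    return [f"pcs resource clear {resource}" for resource, _n, _s, _c in constraints]


constraints = leftover_move_constraints(PCS_CONFIG_LOCATIONS)
preferred_nodes = {node for _r, node, _s, _c in constraints}

for resource, node, score, constraint_id in constraints:
    print(f"{constraint_id}: {resource} prefers {node} (score {score})")
if len(preferred_nodes) > 1:
    print("Group members are pinned to different nodes:", sorted(preferred_nodes))
for command in clear_commands(constraints):
    print(command)
```

With the configuration from the question, this reports four leftover constraints that do not all point at the same node, and prints one pcs resource clear command per resource. The script only prints the commands; it does not run them.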
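Answer 2
The score:INFINITY location constraints on all of your grouped resources are probably the problem. In Pacemaker, INFINITY is effectively equal to 1,000,000, which is the highest value that can be assigned to a score.
The following applies when INFINITY is used (see the ClusterLabs manual):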
6.1.1. Infinity Math
Pacemaker implements INFINITY (or equivalently, +INFINITY) internally as a score of 1,000,000. Addition and subtraction with it follow these three basic rules:
Any value + INFINITY = INFINITY
Any value - INFINITY = -INFINITY
INFINITY - INFINITY = -INFINITY
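The three rules quoted above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not Pacemaker's actual code. The function name add_scores is made up for this example, and clamping finite sums to the INFINITY range is an assumption on my part.

```python
INFINITY = 1_000_000  # Pacemaker's internal value for INFINITY, per the quoted manual


def add_scores(a, b):
    """Add two scores following the three Infinity Math rules quoted above."""
    # "Any value - INFINITY = -INFINITY" and "INFINITY - INFINITY = -INFINITY":
    # a -INFINITY operand always wins, even against +INFINITY.
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    # "Any value + INFINITY = INFINITY"
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Finite scores add normally; clamping the sum is an assumption of this sketch.
    return max(-INFINITY, min(INFINITY, a + b))


print(add_scores(500, INFINITY))        # any value + INFINITY
print(add_scores(500, -INFINITY))       # any value - INFINITY
print(add_scores(INFINITY, -INFINITY))  # INFINITY - INFINITY
print(add_scores(1_000, 10_000))        # finite scores simply add up
```

The last call shows the difference that matters here: finite scores such as 1,000 and 10,000 still combine and can outweigh one another, whereas an INFINITY score saturates and cannot be outweighed by any finite preference.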
Try changing your preference scores to something like 1,000 or 10,000 instead of INFINITY, then run your tests again.