[Users] oVirt Node keeps rebooting

One of my oVirt Nodes has kept rebooting since I joined it to oVirt.

Here is what I see when I run top or iotop. There are a lot of these:

qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0

I don't see that much activity on the other oVirt node. What do you suggest I verify?

$ top
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
24369 qemu      20   0 7108m 2.1g  10m S 86.5 13.4  16:21.26 qemu-kvm
23852 qemu      20   0 5621m 805m  10m S  1.3  5.0   0:40.05 qemu-kvm
23617 qemu      20   0 5114m 320m  10m S  1.0  2.0   0:25.69 qemu-kvm
 1516 vdsm      15  -5 1911m  39m 7736 S  0.7  0.2   0:21.12 vdsm
 1127 root      20   0 1014m  14m 7716 S  0.3  0.1   0:03.30 libvirtd
16141 root      20   0 15380 1404  936 R  0.3  0.0   0:00.10 top
    1 root      20   0 65680  27m 2052 S  0.0  0.2   0:01.34 systemd

$ iotop
Total DISK READ: 14.87 M/s | Total DISK WRITE: 8.79 K/s
  TID PRIO USER   DISK READ DISK WRITE  SWAPIN     IO> COMMAND
29276 be/4 qemu  156.34 K/s   0.00 B/s  0.00 %  3.70 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29118 be/4 qemu  375.21 K/s   0.00 B/s  0.00 %  2.63 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29322 be/4 qemu  203.24 K/s   0.00 B/s  0.00 %  2.62 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29263 be/4 qemu  250.14 K/s   0.00 B/s  0.00 %  2.58 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29273 be/4 qemu  250.14 K/s   0.00 B/s  0.00 %  2.40 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29262 be/4 qemu  312.68 K/s   0.00 B/s  0.00 %  2.15 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29274 be/4 qemu   70.35 K/s   0.00 B/s  0.00 %  1.91 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
28022 be/4 qemu  297.04 K/s   0.00 B/s  0.00 %  1.82 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29298 be/4 qemu  171.97 K/s   0.00 B/s  0.00 %  1.79 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29324 be/4 qemu  281.41 K/s   0.00 B/s  0.00 %  1.78 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29117 be/4 qemu  187.61 K/s   0.00 B/s  0.00 %  1.62 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
28024 be/4 qemu  379.12 K/s   0.00 B/s  0.00 %  1.49 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29129 be/4 qemu  175.88 K/s   0.00 B/s  0.00 %  1.31 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
25711 be/4 qemu  328.31 K/s   0.00 B/s  0.00 %  1.22 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29127 be/4 qemu  297.04 K/s   0.00 B/s  0.00 %  1.19 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29261 be/4 qemu  351.76 K/s   0.00 B/s  0.00 %  1.16 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29116 be/4 qemu  328.31 K/s   0.00 B/s  0.00 %  1.16 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29302 be/4 qemu  390.85 K/s   0.00 B/s  0.00 %  1.16 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29277 be/4 qemu  281.41 K/s   0.00 B/s  0.00 %  1.12 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29122 be/4 qemu   93.80 K/s   0.00 B/s  0.00 %  1.10 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29260 be/4 qemu  265.78 K/s   0.00 B/s  0.00 %  1.10 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29279 be/4 qemu  285.32 K/s   0.00 B/s  0.00 %  1.09 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29112 be/4 qemu  343.95 K/s   0.00 B/s  0.00 %  1.08 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29272 be/4 qemu  187.61 K/s   0.00 B/s  0.00 %  1.04 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29264 be/4 qemu  179.79 K/s   0.00 B/s  0.00 %  0.94 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29269 be/4 qemu  171.97 K/s   0.00 B/s  0.00 %  0.93 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29303 be/4 qemu  171.97 K/s   0.00 B/s  0.00 %  0.92 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29121 be/4 qemu  254.05 K/s   0.00 B/s  0.00 %  0.87 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29271 be/4 qemu  226.69 K/s   0.00 B/s  0.00 %  0.85 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29280 be/4 qemu  218.87 K/s   0.00 B/s  0.00 %  0.84 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
26280 be/4 qemu  250.14 K/s   0.00 B/s  0.00 %  0.74 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29321 be/4 qemu  156.34 K/s   7.82 K/s  0.00 %  0.70 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
26281 be/4 qemu  234.51 K/s   0.00 B/s  0.00 %  0.69 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29265 be/4 qemu  297.04 K/s   0.00 B/s  0.00 %  0.65 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29268 be/4 qemu  125.07 K/s   0.00 B/s  0.00 %  0.60 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29323 be/4 qemu  332.22 K/s   0.00 B/s  0.00 %  0.58 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29111 be/4 qemu  316.59 K/s   0.00 B/s  0.00 %  0.53 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
29126 be/4 qemu  187.61 K/s   0.00 B/s  0.00 %  0.43 % qemu-kvm -S -M pc-0.14 -cpu kvm64,+lahf_lm,+ssse~irtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

[2012-11-18 15:20:08] Protecting spm lock for vdsm pid 1343
[2012-11-18 15:20:08] Trying to acquire lease - spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 lease_file=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases id=1000 lease_time_ms=60000 io_op_to_ms=10000
[2012-11-18 15:20:28] Lease acquired spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1000 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases, TS=1353270008160373
[2012-11-18 15:20:28] Protecting spm lock for vdsm pid 1343
[2012-11-18 15:20:28] Started renewal process (pid=1912) for spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1000 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases
[2012-11-18 15:20:30] Stopping lease for pool: f0071c9b-cbe2-4555-9ae0-279031764a99 pgrps: -1912
User defined signal 1
[2012-11-18 15:20:30] releasing lease spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1000 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases
[2012-11-18 15:20:33] Protecting spm lock for vdsm pid 1343
[2012-11-18 15:20:33] Trying to acquire lease - spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 lease_file=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases id=1 lease_time_ms=60000 io_op_to_ms=10000
[2012-11-18 15:20:53] Lease acquired spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases, TS=1353270033749998
[2012-11-18 15:20:53] Protecting spm lock for vdsm pid 1343
[2012-11-18 15:20:53] Started renewal process (pid=2072) for spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases
[2012-11-18 16:19:09] Protecting spm lock for vdsm pid 1343
[2012-11-18 16:19:09] Trying to acquire lease - spUUID=8862496a-f326-46cf-8085-7ff982f985da lease_file=/rhev/data-center/mnt/10.10.0.200:_iso/8862496a-f326-46cf-8085-7ff982f985da/dom_md/leases id=1 lease_time_ms=5000 io_op_to_ms=1000
[2012-11-18 16:19:11] Lease acquired spUUID=8862496a-f326-46cf-8085-7ff982f985da id=1 lease_path=/rhev/data-center/mnt/10.10.0.200:_iso/8862496a-f326-46cf-8085-7ff982f985da/dom_md/leases, TS=1353273549413059
[2012-11-18 16:19:11] Protecting spm lock for vdsm pid 1343
[2012-11-18 16:19:11] Started renewal process (pid=25101) for spUUID=8862496a-f326-46cf-8085-7ff982f985da id=1 lease_path=/rhev/data-center/mnt/10.10.0.200:_iso/8862496a-f326-46cf-8085-7ff982f985da/dom_md/leases
[2012-11-18 16:19:13] Stopping lease for pool: 8862496a-f326-46cf-8085-7ff982f985da pgrps: -25101
User defined signal 1
[2012-11-18 16:19:13] releasing lease spUUID=8862496a-f326-46cf-8085-7ff982f985da id=1 lease_path=/rhev/data-center/mnt/10.10.0.200:_iso/8862496a-f326-46cf-8085-7ff982f985da/dom_md/leases
[2012-11-18 18:51:27] Protecting spm lock for vdsm pid 1495
[2012-11-18 18:51:27] Trying to acquire lease - spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 lease_file=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases id=1 lease_time_ms=60000 io_op_to_ms=10000
[2012-11-18 18:53:47] Lease acquired spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases, TS=1353282807712736
[2012-11-18 18:53:47] Protecting spm lock for vdsm pid 1495
[2012-11-18 18:53:47] Started renewal process (pid=2338) for spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases
[2012-11-18 20:17:10] Protecting spm lock for vdsm pid 1492
[2012-11-18 20:17:10] Trying to acquire lease - spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 lease_file=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases id=1 lease_time_ms=60000 io_op_to_ms=10000
[2012-11-18 20:19:30] Lease acquired spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases, TS=1353287950366393
[2012-11-18 20:19:30] Protecting spm lock for vdsm pid 1492
[2012-11-18 20:19:30] Started renewal process (pid=2182) for spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases
[2012-11-19 11:34:48] Protecting spm lock for vdsm pid 1516
[2012-11-19 11:34:48] Trying to acquire lease - spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 lease_file=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases id=1 lease_time_ms=60000 io_op_to_ms=10000
[2012-11-19 11:37:08] Lease acquired spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases, TS=1353343008455126
[2012-11-19 11:37:08] Protecting spm lock for vdsm pid 1516
[2012-11-19 11:37:08] Started renewal process (pid=2184) for spUUID=f0071c9b-cbe2-4555-9ae0-279031764a99 id=1 lease_path=/rhev/data-center/mnt/_ovirt/f0071c9b-cbe2-4555-9ae0-279031764a99/dom_md/leases

2012/11/19 Itamar Heim <iheim@redhat.com>
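
For anyone reading the spm-lock.log excerpt above, one rough way to see how long each lease acquisition took is to pair every "Trying to acquire lease" entry with the next "Lease acquired" entry and subtract the timestamps. The following is only a minimal Python sketch under that assumption; the function name acquire_durations and the shortened sample text are illustrative and are not part of vdsm or its tooling.

```python
import re
from datetime import datetime

# Matches the timestamp and the two event types seen in the spm-lock.log excerpt.
EVENT_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(Trying to acquire lease|Lease acquired)"
)

def acquire_durations(log_text):
    """Return (start_time, seconds) for each acquire attempt that completed.

    Hypothetical helper: it assumes each 'Lease acquired' entry belongs to
    the most recent 'Trying to acquire lease' entry before it.
    """
    results = []
    started = None
    for stamp, event in EVENT_RE.findall(log_text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if event == "Trying to acquire lease":
            started = when
        elif started is not None:
            results.append((started, int((when - started).total_seconds())))
            started = None
    return results

# Shortened sample in the same format as the excerpt (UUIDs and paths trimmed).
sample = (
    "[2012-11-18 15:20:08] Trying to acquire lease - spUUID=... id=1000 "
    "[2012-11-18 15:20:28] Lease acquired spUUID=... id=1000 "
    "[2012-11-18 18:51:27] Trying to acquire lease - spUUID=... id=1 "
    "[2012-11-18 18:53:47] Lease acquired spUUID=... id=1 "
)

for start, seconds in acquire_durations(sample):
    print(start, seconds)
```

Going by the timestamps in the excerpt, the first two acquisitions take 20 seconds, the one on the ISO domain takes 2 seconds, and the three later ones (under vdsm pids 1495, 1492 and 1516) each take 140 seconds. The log alone does not show whether that is related to the reboots.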

vdsm.log is too big (14 MB).
spm-lock.log is attached to this email.

$ lspci | grep -i ether
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 05)
03:00.0 Ethernet controller: D-Link System Inc RTL8139 Ethernet (rev 10)

2012/11/22 Ayal Baron <abaron@redhat.com>

participants (3)
- Ayal Baron
- EricD
- Itamar Heim