oVirt engine VM (XFS on sda3) broken, how to fix the image?

I have a hyperconverged 4.2.7 setup with the Engine VM deployed on two hosts, and we had a 20-30 second network outage. After some ping-ponging to start the engine on host 1, then host 2, then host 1 again, the Engine VM image got stuck at "Probing EDD (edd=off to disable)... _", as here: https://bugzilla.redhat.com/show_bug.cgi?id=1569827

I stopped ha-agent and ha-broker on both hosts (so they would not keep trying to start the Engine VM) and stopped the VM on both hosts via

vdsm-client VM destroy vmID="4f169ca9-1854-4e3f-ad57-24445ec08c79"

but the image was still locked (lease file) anyway... Actually, the lease disappeared while I was writing this message. Now xfs_repair reports:

ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed.

guestmount:

commandrvf: udevadm --debug settle -E /dev/sda3
calling: settle
......
command: mount '-o' 'ro' '/dev/sda3' '/sysroot//'
[    1.478858] SGI XFS with ACLs, security attributes, no debug enabled
[    1.481701] XFS (sda3): Mounting V5 Filesystem
[    1.514183] XFS (sda3): Starting recovery (logdev: internal)
[    1.537299] XFS (sda3): Internal error XFS_WANT_CORRUPTED_GOTO at line 1664 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_extent+0xaa/0x140 [xfs]

and "Structure needs cleaning".

Before that:

[root@ovirtnode6 aa6f3e9b-2eba-4fab-a8ee-a4a1aceddf5e]# ls -l
total 7480047
-rw-rw----. 1 vdsm kvm 83751862272 Dec 21 13:05 38ef3aac-6ecc-4940-9d2c-ffe4e2557482
-rw-rw----. 1 vdsm kvm     1048576 Dec 21 13:27 38ef3aac-6ecc-4940-9d2c-ffe4e2557482.lease
-rw-r--r--. 1 vdsm kvm         338 Nov  2 14:01 38ef3aac-6ecc-4940-9d2c-ffe4e2557482.meta

If I try to use libguestfs,

LIBGUESTFS_BACKEND=direct guestfish --rw -a 38ef3aac-6ecc-4940-9d2c-ffe4e2557482

and then 'run', it results in:

><fs> run
.....
qemu-kvm: -device scsi-hd,drive=hd0: Failed to get "write" lock
Is another process using the image?

In the output of "vdsm-client Host getVMList" I do not see the engine VM (I got its id from "vdsm-client Host getAllVmStats") - is that because it is stopped? And if I want to remove the lease via vdsm-client, I need a JSON file with UUIDs:

usage: vdsm-client Lease info [-h] [arg=value [arg=value ...]]

positional arguments:
  arg=value  lease: The lease to query

             JSON representation:
             {
                 "lease": {
                     "sd_id": "UUID",
                     "lease_id": "UUID"
                 }
             }

In all the docs I could not find any explanation of sd_id and lease_id - where can I get them? See, for example: https://www.ovirt.org/develop/developer-guide/vdsm/vdsm-client.html

Without them I get:

[root@ovirtnode6 aa6f3e9b-2eba-4fab-a8ee-a4a1aceddf5e]# vdsm-client Lease info lease=38ef3aac-6ecc-4940-9d2c-ffe4e2557482
vdsm-client: Command Lease.info with args {'lease': '38ef3aac-6ecc-4940-9d2c-ffe4e2557482'} failed:
(code=100, message='unicode' object has no attribute 'get')

[root@ovirtnode6 ~]# vdsm-client Lease status lease=38ef3aac-6ecc-4940-9d2c-ffe4e2557482
vdsm-client: Command Lease.status with args {'lease': '38ef3aac-6ecc-4940-9d2c-ffe4e2557482'} failed:
(code=100, message=)

Please help me fix this Engine VM image.

-- Mike
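P.S. For anyone else who hits the 'unicode' object has no attribute 'get' error: as far as I can tell it simply means the lease argument must be a JSON object, not a bare UUID string. If I read the vdsm-client page linked above correctly, complex parameters can be passed from a JSON file with -f. A minimal sketch of the shape of the call is below; the two UUIDs are placeholders, and my reading that sd_id is the storage domain UUID and lease_id the UUID of the lease owner (the VM) is an assumption on my part, not something I found documented.

cat > lease.json << EOF
{
    "lease": {
        "sd_id": "STORAGE-DOMAIN-UUID",
        "lease_id": "VM-OR-LEASE-UUID"
    }
}
EOF
# query the lease using the JSON file instead of inline arg=value pairs
vdsm-client Lease info -f lease.json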

On 21.12.2018 14:24, Mike Lykov wrote:
I have a hyperconverged 4.2.7 setup with the Engine VM deployed on two hosts, and we had a 20-30 second network outage. After some ping-ponging to start the engine on host 1, then host 2, then host 1 again, the Engine VM image got stuck at "Probing EDD (edd=off to disable)... _", as here: https://bugzilla.redhat.com/show_bug.cgi?id=1569827
Now I am looking at the logs. Full /var/log archives are here:
https://yadi.sk/d/XZ5jJfQLN6QMlA (HE engine logs) - 36 MB
https://yadi.sk/d/bZ0TYGxFoHGgIQ (ovirtnode6 logs) - 144 MB

I have CC'd some personal addresses on this email; if it is not relevant to you, please ignore it.

The host nodes (CentOS 7.5) are named ovirtnode1, 5 and 6. Timeouts (in the HA agent) are at their defaults. Sanlock is configured (as far as I can tell). The HE was running on ovirtnode6, with a spare HE host deployed on ovirtnode1. There are two network links: ovirtmgmt over "ovirtmgmt: port 1(enp59s0f0)", and the glusterfs storage network over the ib0 interface (a different subnet).

messages log on ovirtnode6, that outage:
---------------------------
Dec 21 12:32:56 ovirtnode6 kernel: bnx2x 0000:3b:00.0 enp59s0f0: NIC Link is Down
Dec 21 12:32:56 ovirtnode6 kernel: ovirtmgmt: port 1(enp59s0f0) entered disabled state
Dec 21 12:33:13 ovirtnode6 kernel: bnx2x 0000:3b:00.0 enp59s0f0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
Dec 21 12:33:13 ovirtnode6 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): enp59s0f0: link becomes ready
Dec 21 12:33:13 ovirtnode6 kernel: ovirtmgmt: port 1(enp59s0f0) entered forwarding state
Dec 21 12:33:13 ovirtnode6 NetworkManager[1715]: <info> [1545381193.2204] device (enp59s0f0): carrier: link connected
-----------------------

That is a 17-second outage; at 12:33:13 the link was back. BUT the events that led to the crash follow later.

HA agent log:
------------------------------
MainThread::INFO::2018-12-21 12:32:59,540::states::444::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
MainThread::INFO::2018-12-21 12:32:59,662::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineUp (score: 3400)
MainThread::INFO::2018-12-21 12:33:09,797::states::136::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Penalizing score by 1280 due to gateway status
MainThread::INFO::2018-12-21 12:33:09,798::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineUp (score: 2120)
MainThread::ERROR::2018-12-21 12:33:19,815::states::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Host ovirtnode1.miac (id 1) score is significantly better than local score, shutting down VM on this host
----------------------------------------------

syslog messages:

Dec 21 12:33:19 ovirtnode6 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Host ovirtnode1.miac (id 1) score is significantly better than local score, shutting down VM on this host
Dec 21 12:33:29 ovirtnode6 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Dec 21 12:33:37 ovirtnode6 kernel: ovirtmgmt: port 3(vnet1) entered disabled state
Dec 21 12:33:37 ovirtnode6 kernel: device vnet1 left promiscuous mode
Dec 21 12:33:37 ovirtnode6 kernel: ovirtmgmt: port 3(vnet1) entered disabled state
Dec 21 12:33:37 ovirtnode6 NetworkManager[1715]: <info> [1545381217.1796] device (vnet1): state change: disconnected -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
Dec 21 12:33:37 ovirtnode6 NetworkManager[1715]: <info> [1545381217.1798] device (vnet1): released from master device ovirtmgmt
Dec 21 12:33:37 ovirtnode6 libvirtd: 2018-12-21 08:33:37.192+0000: 2783: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor   *** WHAT IS THIS? ***
Dec 21 12:33:37 ovirtnode6 kvm: 2 guests now active
Dec 21 12:33:37 ovirtnode6 systemd-machined: Machine qemu-2-HostedEngine terminated.
Dec 21 12:33:37 ovirtnode6 firewalld[1693]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w2 -w -D libvirt-out -m physdev --physdev-is-bridged --physdev-out vnet1 -g FP-vnet1' failed: iptables v1.4.21: goto 'FP-vnet1' is not a chain#012#012Try `iptables -h' or 'iptables --help' for more information.
Dec 21 12:33:55 ovirtnode6 kernel: ovirtmgmt: port 3(vnet1) entered blocking state
Dec 21 12:33:55 ovirtnode6 kernel: ovirtmgmt: port 3(vnet1) entered disabled state
Dec 21 12:33:55 ovirtnode6 kernel: device vnet1 entered promiscuous mode
Dec 21 12:33:55 ovirtnode6 kernel: ovirtmgmt: port 3(vnet1) entered blocking state
Dec 21 12:33:55 ovirtnode6 kernel: ovirtmgmt: port 3(vnet1) entered forwarding state
Dec 21 12:33:55 ovirtnode6 lldpad: recvfrom(Event interface): No buffer space available
Dec 21 12:33:55 ovirtnode6 NetworkManager[1715]: <info> [1545381235.8086] manager: (vnet1): new Tun device (/org/freedesktop/NetworkManager/Devices/37)
Dec 21 12:33:55 ovirtnode6 NetworkManager[1715]: <info> [1545381235.8121] device (vnet1): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Dec 21 12:33:55 ovirtnode6 NetworkManager[1715]: <info> [1545381235.8127] device (vnet1): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'external')
---------------------------

*** WHAT IS THIS? *** The link had already been back up for some time, so why did these bridge state transitions and iptables errors happen?

And this machine tries to start it again:
-----------------
Dec 21 12:33:56 ovirtnode6 systemd-machined: New machine qemu-15-HostedEngine.
Dec 21 12:33:56 ovirtnode6 systemd: Started Virtual Machine qemu-15-HostedEngine.
-------------------

HA agent log on this host (ovirtnode6):
-----------------------------------------
MainThread::INFO::2018-12-21 12:33:49,880::states::510::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down and local host has best score (3400), attempting to start engine VM
MainThread::INFO::2018-12-21 12:33:57,884::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-12-21 12:34:04,898::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
.....
MainThread::INFO::2018-12-21 12:36:24,800::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-12-21 12:36:24,921::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
---------------------------

HA agent log on ovirtnode1 (the spare HE host where the VM tried to start):
----------------------
MainThread::INFO::2018-12-21 12:33:52,984::states::510::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down and local host has best score (3400), attempting to start engine VM
MainThread::INFO::2018-12-21 12:33:56,787::hosted_engine::947::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Engine VM started on localhost
MainThread::INFO::2018-12-21 12:33:59,923::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStart-EngineStarting) sent? sent
MainThread::INFO::2018-12-21 12:33:59,936::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-12-21 12:34:06,950::states::783::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Another host already took over..
-----------------

WHAT IS THIS? What does "took over" mean, and what does the agent do in this case? (See the status-check sketch at the end of this message.)
------------------------------------
MainThread::INFO::2018-12-21 12:34:10,240::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineForceStop (score: 3400)
MainThread::INFO::2018-12-21 12:34:10,246::hosted_engine::1006::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) Shutting down vm using `/usr/sbin/hosted-engine --vm-poweroff`
MainThread::INFO::2018-12-21 12:34:10,797::hosted_engine::1011::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) stdout:
MainThread::INFO::2018-12-21 12:34:10,797::hosted_engine::1012::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) stderr:
MainThread::ERROR::2018-12-21 12:34:10,797::hosted_engine::1020::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm) Engine VM stopped on localhost
-------------------------------------

I mounted the HE VM partitions, including the log partition; here are its syslog messages. WHAT IS THIS guest-shutdown? Why? There were no network problems at that time.

Dec 21 12:28:24 ovirtengine ovsdb-server: ovs|36386|reconnect|WARN|ssl:[::ffff:127.0.0.1]:58834: connection dropped (Protocol error)
Dec 21 12:28:24 ovirtengine python: ::ffff:172.16.10.101 - - [21/Dec/2018 12:28:24] "GET /v2.0/networks HTTP/1.1" 200 -
Dec 21 12:30:44 ovirtengine ovsdb-server: ovs|00111|reconnect|ERR|ssl:[::ffff:172.16.10.5]:42032: no response to inactivity probe after 5 seconds, disconnecting
Dec 21 12:30:44 ovirtengine ovsdb-server: ovs|00112|reconnect|ERR|ssl:[::ffff:172.16.10.1]:45624: no response to inactivity probe after 5 seconds, disconnecting
Dec 21 12:31:07 ovirtengine qemu-ga: info: guest-shutdown called, mode: powerdown
Dec 21 12:33:58 ovirtengine kernel: Linux version 3.10.0-862.14.4.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC) ) #1 SMP Wed Sep 26 15:12:11 UTC 2018
Dec 21 12:33:58 ovirtengine kernel: Command line: BOOT_IMAGE=/vmlinuz-3.10.0-862.14.4.el7.x86_64 root=UUID=091e7022-295b-4b3f-96ad-4a7d90a2a9b0 ro crashkernel=auto rd.lvm.lv=ovirt/swap console=ttyS0 LANG=en_US.UTF-8
.....
Dec 21 12:33:59 ovirtengine kernel: XFS (vda3): Ending clean mount
Dec 21 12:34:01 ovirtengine systemd: Mounted /sysroot.
....
Dec 21 12:34:06 ovirtengine lvm: 6 logical volume(s) in volume group "ovirt" now active
Dec 21 12:34:06 ovirtengine systemd: Started LVM2 PV scan on device 252:2.
Dec 21 12:34:06 ovirtengine systemd: Found device /dev/mapper/ovirt-audit.
Dec 21 12:34:06 ovirtengine kernel: XFS (dm-5): Ending clean mount
Dec 21 12:34:06 ovirtengine systemd: Mounted /home.
Dec 21 12:34:07 ovirtengine kernel: XFS (dm-3): Ending clean mount
Dec 21 12:34:07 ovirtengine systemd: Mounted /var.
Dec 21 12:34:07 ovirtengine systemd: Starting Load/Save Random Seed...
Dec 21 12:34:07 ovirtengine systemd: Mounting /var/log...
Dec 21 12:34:07 ovirtengine kernel: XFS (dm-2): Mounting V5 Filesystem
Dec 21 12:34:07 ovirtengine systemd: Started Load/Save Random Seed.
Dec 21 12:34:08 ovirtengine kernel: XFS (dm-2): Ending clean mount
Dec 21 12:34:08 ovirtengine systemd: Mounted /var/log.
Dec 21 12:34:08 ovirtengine systemd: Starting Flush Journal to Persistent Storage...
Dec 21 12:34:08 ovirtengine systemd: Mounting /var/log/audit...
Dec 21 12:34:08 ovirtengine kernel: XFS (dm-1): Mounting V5 Filesystem
Dec 21 12:34:08 ovirtengine systemd: Started Flush Journal to Persistent Storage.
Dec 21 12:34:08 ovirtengine kernel: XFS (dm-1): Ending clean mount
Dec 21 12:34:08 ovirtengine systemd: Mounted /var/log/audit.
Dec 21 12:34:13 ovirtengine kernel: XFS (dm-4): Ending clean mount
Dec 21 12:34:13 ovirtengine systemd: Mounted /tmp.
Dec 21 12:34:13 ovirtengine systemd: Reached target Local File Systems.
.....
Dec 21 12:34:24 ovirtengine sshd[1324]: Server listening on 0.0.0.0 port 2222.
Dec 21 12:34:24 ovirtengine sshd[1324]: Server listening on :: port 2222.
Dec 21 12:34:25 ovirtengine rsyslogd: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1334" x-info="http://www.rsyslog.com"] start
Dec 21 12:34:25 ovirtengine systemd: Started System Logging Service.
......
Dec 21 12:34:25 ovirtengine aliasesdb: BDB0196 Encrypted checksum: no encryption key specified
Dec 21 12:34:25 ovirtengine aliasesdb: BDB0196 Encrypted checksum: no encryption key specified
Dec 21 12:34:25 ovirtengine kernel: XFS (vda3): Internal error XFS_WANT_CORRUPTED_GOTO at line 1664 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_extent+0xaa/0x140 [xfs]
Dec 21 12:34:25 ovirtengine kernel: CPU: 1 PID: 1379 Comm: postalias Not tainted 3.10.0-862.14.4.el7.x86_64 #1
Dec 21 12:34:25 ovirtengine kernel: Hardware name: oVirt oVirt Node, BIOS 1.11.0-2.el7 04/01/2014
Dec 21 12:34:25 ovirtengine kernel: XFS (vda3): xfs_do_force_shutdown(0x8) called from line 236 of file fs/xfs/libxfs/xfs_defer.c. Return address = 0xffffffffc02709cb
Dec 21 12:34:27 ovirtengine kernel: XFS (vda3): Corruption of in-memory data detected. Shutting down filesystem
Dec 21 12:34:27 ovirtengine kernel: XFS (vda3): Please umount the filesystem and rectify the problem(s)
......
Dec 21 12:34:27 ovirtengine ovirt-websocket-proxy.py: ImportError: No module named websocketproxy
Dec 21 12:34:27 ovirtengine journal: 2018-12-21 12:34:27,189+0400 ovirt-engine-dwhd: ERROR run:554 Error: list index out of range
Dec 21 12:34:27 ovirtengine systemd: ovirt-websocket-proxy.service: main process exited, code=killed, status=7/BUS
Dec 21 12:34:27 ovirtengine systemd: Failed to start oVirt Engine websockets proxy.
Dec 21 12:34:27 ovirtengine systemd: Unit ovirt-websocket-proxy.service entered failed state.
Dec 21 12:34:27 ovirtengine systemd: ovirt-websocket-proxy.service failed.
Dec 21 12:34:27 ovirtengine systemd: ovirt-engine-dwhd.service: main process exited, code=exited, status=1/FAILURE
......
----------------------------
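A note about the "Another host already took over" message: my assumption is that by the time ovirtnode1 re-checked, the HE VM (and its storage lease) was already claimed by the other host, so the local agent moved to EngineForceStop. A quick sketch of how I now check which host actually owns the HE VM and its lease; these are standard commands, the interpretation is mine:

# overall hosted-engine view: per-host score, VM state, and which host runs the VM
hosted-engine --vm-status

# low-level view: sanlock lockspaces/resources currently held on this host
sanlock client status

Whichever node's sanlock output lists the HE VM's resource would be the one that "took over".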

On 24.12.2018 11:30, Mike Lykov wrote:
The host nodes (CentOS 7.5) are named ovirtnode1, 5 and 6. Timeouts (in the HA agent) are at their defaults. Sanlock is configured (as far as I can tell). The HE was running on ovirtnode6, with a spare HE host deployed on ovirtnode1.
Fixed (as it seems) by the guestfish/xfs_repair method. It requires zeroing the XFS metadata log, and it relies heavily on luck (a sketch of what I ran is at the end of this message). Two questions remain:

1. Why, when it cannot boot because of the corruption, does it show NOTHING at all on the console? I can get to the GRUB menu (if I am fast enough), but if I continue the boot I see only a blinking cursor for many minutes and nothing more. The GRUB options do not contain any splash/quiet parameters. (The exception is the EDD message, which is meaningless; if I use edd=off I get only a black console.) Where is the kernel boot log / console output? Does it even try to load the initrd?

2. How can I set timeouts so that the ha-agent does NOT try to restart the HE after 1-2 unsuccessful pings and a 10-second outage? For the HE VM, stability (not crashing or breaking its filesystem) is more important to me than availability: I can live with it being unavailable for 10-15 seconds, but not with a broken VM.
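Roughly what the "guestfish/xfs_repair method" looked like for me, as a sketch rather than a recipe. I believe guestfish's xfs-repair with forcelogzero:true corresponds to xfs_repair -L, i.e. it discards the unreplayed XFS log, so anything in that log is lost; it is a last resort. The image must not be open anywhere else (that is what the earlier "Failed to get 'write' lock" error was about):

# on every HE host, so nothing restarts the VM or holds the image open
systemctl stop ovirt-ha-agent ovirt-ha-broker

# then, from the directory holding the HE disk image:
LIBGUESTFS_BACKEND=direct guestfish --rw -a 38ef3aac-6ecc-4940-9d2c-ffe4e2557482
><fs> run
><fs> xfs-repair /dev/sda3 forcelogzero:true
><fs> exit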

On 25.12.2018 10:14, Mike Lykov wrote:
1. Why, when it cannot boot because of the corruption, does it show NOTHING at all on the console? I can get to the GRUB menu (if I am fast enough), but if I continue the boot I see only a blinking cursor for many minutes and nothing more. The GRUB options do not contain any splash/quiet parameters. (The exception is the EDD message, which is meaningless; if I use edd=off I get only a black console.)

Where is the kernel boot log / console output? Does it even try to load the initrd?

2. How can I set timeouts so that the ha-agent does NOT try to restart the HE after 1-2 unsuccessful pings and a 10-second outage? For the HE VM, stability (not crashing or breaking its filesystem) is more important to me than availability: I can live with it being unavailable for 10-15 seconds, but not with a broken VM.
3. I stopped the ha-agent, ha-broker and the HE VM on all (two) nodes and fixed the partition inside the VM. Then I started the ha-agent on the nodes again, and it BROKE the VM filesystem AGAIN (while the agents were deciding on which host the VM should start). I fixed the VM filesystem once more, put the cluster into maintenance mode, started the VM on one node by hand, checked that its status/health was OK, and only then put the ha-agent back to work (mode none). That is how easy it is to break the cluster by corrupting the HE VM filesystem if you do not put it into global maintenance mode first.
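What I take away from this: put the whole cluster into hosted-engine global maintenance BEFORE touching the image, so that no agent tries to start (and re-corrupt) the HE VM in the middle of the repair. A sketch of the sequence I ended up with; the commands are standard hosted-engine ones, the ordering is just what worked for me:

# on any HE host: agents keep monitoring but stop managing the HE VM
hosted-engine --set-maintenance --mode=global

# ... repair the image (guestfish/xfs_repair as in the previous message) ...

# start the HE VM by hand on the chosen host and wait until the engine health is good
hosted-engine --vm-start
hosted-engine --vm-status

# hand control back to the HA agents
hosted-engine --set-maintenance --mode=none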