The deployment eventually failed.
I am running CentOS 7.5 on the host. After re-reading the documentation, it
seems that my /var partition might not be large enough, as it's only 30GB,
but there was no warning message indicating that this is an issue.
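
Since the installer did not warn about disk space, a quick manual check of /var before retrying may help. The following is a minimal sketch using Python's standard shutil.disk_usage. The required_gib value is a placeholder assumption, not a figure taken from the oVirt documentation, so substitute whatever the documentation for your version states.

```python
import shutil

GIB = 1024 ** 3


def free_gib(path="/var"):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def has_enough_space(path="/var", required_gib=60.0):
    """Return True if `path` has at least `required_gib` GiB free.

    required_gib is a hypothetical threshold chosen for illustration only;
    it is not a documented oVirt requirement.
    """
    return free_gib(path) >= required_gib


def describe(path="/var", required_gib=60.0):
    """Build a one-line human-readable report for `path`."""
    verdict = "ok" if has_enough_space(path, required_gib) else "too small"
    return "%s: %.1f GiB free, need %.1f GiB (%s)" % (
        path, free_gib(path), required_gib, verdict)
```

Calling print(describe("/var")) on the host gives a one-line summary. Note that this reports free space rather than partition size, so a 30GB partition that is already partly used will report less than 30.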
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
On Wed, Aug 15, 2018 at 2:10 PM, Douglas Duckworth <dod2014(a)med.cornell.edu>
wrote:
OK, the Ansible engine deploy now seems to be stuck at the same step:
[ INFO ] TASK [Force host-deploy in offline mode]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Add host]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Wait for the host to be up]
On the hypervisor in syslog I see:
Aug 15 14:09:26 ovirt-hv1 python: ansible-ovirt_hosts_facts Invoked with
pattern=name=ovirt-hv1.pbtech fetch_nested=False nested_attributes=[]
auth={'timeout': 0, 'url':
'https://ovirt-engine.pbtech/ovirt-engine/api',
Within the VM, which I can access over the virtual machine network, I see:
Aug 15 18:08:06 ovirt-engine python: 192.168.122.69 - - [15/Aug/2018
14:08:06] "GET /v2.0/networks HTTP/1.1" 200 -
Aug 15 18:08:11 ovirt-engine ovsdb-server: ovs|00008|stream_ssl|WARN|SSL_read: system error (Connection reset by peer)
Aug 15 18:08:11 ovirt-engine ovsdb-server: ovs|00009|jsonrpc|WARN|ssl:127.0.0.1:50356: receive error: Connection reset by peer
Aug 15 18:08:11 ovirt-engine ovsdb-server: ovs|00010|reconnect|WARN|ssl:127.0.0.1:50356: connection dropped (Connection reset by peer)
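
The 192.168.122.69 client address in the engine log above falls within 192.168.122.0/24, the range that libvirt's default NAT network normally uses. An earlier message in this thread traced one failure to an /etc/hosts entry that pointed the engine name at the local virtualization network, so it may be worth checking what the engine and host names currently resolve to. The following is a rough sketch using only the Python standard library. The function names are made up for illustration, and whether an address in that range is actually wrong depends on the deployment stage; the code only reports it.

```python
import ipaddress
import socket

# Assumption: libvirt's default NAT network uses 192.168.122.0/24.
# Adjust this if the local virtualization network was configured differently.
LIBVIRT_DEFAULT_NET = ipaddress.ip_network("192.168.122.0/24")


def in_network(address, network=LIBVIRT_DEFAULT_NET):
    """Return True if the textual IP `address` lies inside `network`."""
    return ipaddress.ip_address(address) in network


def resolve_ipv4(hostname):
    """Return the sorted, unique IPv4 addresses the system resolver gives.

    This goes through the normal resolver, which on a typical Linux host
    consults /etc/hosts before DNS.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def addresses_on_network(hostname, network=LIBVIRT_DEFAULT_NET):
    """Return the resolved addresses of `hostname` that fall in `network`."""
    return [addr for addr in resolve_ipv4(hostname) if in_network(addr, network)]
```

For example, addresses_on_network("ovirt-engine.pbtech") returns an empty list when the name resolves only to addresses outside that range.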
Thanks,
On Wed, Aug 15, 2018 at 1:21 PM, Douglas Duckworth <
dod2014(a)med.cornell.edu> wrote:
> Same VDSM error
>
> This is the state the service shows after the failed-state messages:
>
> ● vdsmd.service - Virtual Desktop Server Manager
> Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled;
> vendor preset: enabled)
> Active: active (running) since Wed 2018-08-15 13:07:48 EDT; 4min 10s
> ago
> Main PID: 18378 (vdsmd)
> Tasks: 56
> CGroup: /system.slice/vdsmd.service
> ├─18378 /usr/bin/python2 /usr/share/vdsm/vdsmd
> ├─18495 /usr/libexec/ioprocess --read-pipe-fd 45
> --write-pipe-fd 44 --max-threads 10 --max-queued-requests 10
> ├─18504 /usr/libexec/ioprocess --read-pipe-fd 53
> --write-pipe-fd 51 --max-threads 10 --max-queued-requests 10
> └─20825 /usr/libexec/ioprocess --read-pipe-fd 60
> --write-pipe-fd 59 --max-threads 10 --max-queued-requests 10
>
> Aug 15 13:07:49 ovirt-hv1.pbtech vdsm[18378]: WARN Not ready yet,
> ignoring event '|virt|VM_status|c5463d87-c964-4430-9fdb-0e97d56cf812'
> args={'c5463d87-c964-4430-9fdb-0e97d56cf812': {'status': 'Up',
> 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port':
> '5900'}], 'hash': '6802750603520244794', 'cpuUser': '0.00',
> 'monitorResponse': '0', 'cpuUsage': '0.00', 'elapsedTime': '124', 'cpuSys':
> '0.00', 'vcpuPeriod': 100000L, 'timeOffset': '0', 'clientIp': '',
> 'pauseCode': 'NOERR', 'vcpuQuota': '-1'}}
> Aug 15 13:07:49 ovirt-hv1.pbtech vdsm[18378]: WARN MOM not available.
> Aug 15 13:07:49 ovirt-hv1.pbtech vdsm[18378]: WARN MOM not available, KSM
> stats will be missing.
> Aug 15 13:07:49 ovirt-hv1.pbtech vdsm[18378]: ERROR failed to retrieve
> Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted
> Engine setup finished?
> Aug 15 13:07:50 ovirt-hv1.pbtech vdsm[18378]: WARN Not ready yet,
> ignoring event '|virt|VM_status|c5463d87-c964-4430-9fdb-0e97d56cf812'
> args={'c5463d87-c964-4430-9fdb-0e97d56cf812': {'status': 'Up',
> 'username': 'Unknown', 'memUsage': '40', 'guestFQDN': '', 'memoryStats':
> {'swap_out': '0', 'majflt': '0', 'mem_cached': '772684', 'mem_free':
> '1696572', 'mem_buffers': '9348', 'swap_in': '0', 'pageflt': '3339',
> 'mem_total': '3880652', 'mem_unused': '1696572'}, 'session': 'Unknown',
> 'netIfaces': [], 'guestCPUCount': -1, 'appsList': (), 'guestIPs': '',
> 'disksUsage': []}}
> Aug 15 13:08:04 ovirt-hv1.pbtech vdsm[18378]: ERROR failed to retrieve
> Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted
> Engine setup finished?
> Aug 15 13:08:16 ovirt-hv1.pbtech vdsm[18378]: WARN File:
> /var/lib/libvirt/qemu/channels/c5463d87-c964-4430-9fdb-0e97d56cf812.com.redhat.rhevm.vdsm
> already removed
> Aug 15 13:08:16 ovirt-hv1.pbtech vdsm[18378]: WARN File:
> /var/lib/libvirt/qemu/channels/c5463d87-c964-4430-9fdb-0e97d56cf812.org.qemu.guest_agent.0
> already removed
> Aug 15 13:08:16 ovirt-hv1.pbtech vdsm[18378]: WARN File:
> /var/run/ovirt-vmconsole-console/c5463d87-c964-4430-9fdb-0e97d56cf812.sock
> already removed
> Aug 15 13:08:19 ovirt-hv1.pbtech vdsm[18378]: ERROR failed to retrieve
> Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted
> Engine setup finished?
>
> Note 'ipAddress': '0', though I see an IP was leased out by the DHCP
> server:
>
> Aug 15 13:05:55 server dhcpd: DHCPACK on 10.0.0.178 to 00:16:3e:54:fb:7f
> via em1
>
> Meanwhile, I can ping it from my NFS server, which provides the storage
> domain:
>
> 64 bytes from ovirt-hv1.pbtech (10.0.0.176): icmp_seq=1 ttl=64 time=0.253 ms
>
> Thanks,
>
> On Wed, Aug 15, 2018 at 12:50 PM, Douglas Duckworth <
> dod2014(a)med.cornell.edu> wrote:
>
>> OK.
>>
>> I was able to get to this step:
>>
>> Engine replied: DB Up!Welcome to Health Status!
>>
>> by removing a bad entry from /etc/hosts for ovirt-engine.pbtech, which
>> pointed to an IP on the local virtualization network.
>>
>> Now, though, when trying to connect to the engine during deploy, I get:
>>
>> [ ERROR ] The VDSM host was found in a failed state. Please check engine
>> and bootstrap installation logs.
>>
>> [ ERROR ] Unable to add ovirt-hv1.pbtech to the manager
>>
>> It then keeps repeating:
>>
>> [ INFO ] Still waiting for engine to start...
>>
>> Thanks,
>>
>> On Wed, Aug 15, 2018 at 10:34 AM, Douglas Duckworth <
>> dod2014(a)med.cornell.edu> wrote:
>>
>>> Hi
>>>
>>> I keep getting this error after running
>>>
>>> sudo hosted-engine --deploy --noansible
>>>
>>> [ INFO ] Engine is still not reachable, waiting...
>>> [ ERROR ] Failed to execute stage 'Closing up': Engine is still not
>>> reachable
>>>
>>> I do see a VM running
>>>
>>> 10:20 2:51 /usr/libexec/qemu-kvm -name guest=HostedEngine,debug-threads=on
>>>
>>> Though
>>>
>>> sudo hosted-engine --vm-status
>>> [Errno 2] No such file or directory
>>> Cannot connect to the HA daemon, please check the logs
>>> An error occured while retrieving vm status, please make sure the HA
>>> daemon is ready and reachable.
>>> Unable to connect the HA Broker
>>>
>>> Can someone please help?
>>>
>>> Each time this failed, I ran "/usr/sbin/ovirt-hosted-engine-cleanup"
>>> and then tried the deployment again.
>>>
>>> Thanks,
>>>
>>
>>
>