[ovirt-users] Hosted engine
Joel Diaz
mrjoeldiaz at gmail.com
Thu Jun 15 15:23:25 UTC 2017
Sorry, I forgot to attach the requested logs in the previous email.
Thanks,
On Jun 15, 2017 9:38 AM, "Joel Diaz" <mrjoeldiaz at gmail.com> wrote:
Good morning,
Requested info below, along with some additional details.
You'll notice the data volume is not mounted.
Any help in getting HE back running would be greatly appreciated.
Thank you,
Joel
[root@ovirt-hyp-01 ~]# hosted-engine --vm-status
--== Host 1 status ==--
conf_on_shared_storage : True
Status up-to-date : False
Hostname : ovirt-hyp-01.example.lan
Host ID : 1
Engine status : unknown stale-data
Score : 3400
stopped : False
Local maintenance : False
crc32 : 5558a7d3
local_conf_timestamp : 20356
Host timestamp : 20341
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=20341 (Fri Jun 9 14:38:57 2017)
host-id=1
score=3400
vm_conf_refresh_time=20356 (Fri Jun 9 14:39:11 2017)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
--== Host 2 status ==--
conf_on_shared_storage : True
Status up-to-date : False
Hostname : ovirt-hyp-02.example.lan
Host ID : 2
Engine status : unknown stale-data
Score : 3400
stopped : False
Local maintenance : False
crc32 : 936d4cf3
local_conf_timestamp : 20351
Host timestamp : 20337
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=20337 (Fri Jun 9 14:39:03 2017)
host-id=2
score=3400
vm_conf_refresh_time=20351 (Fri Jun 9 14:39:17 2017)
conf_on_shared_storage=True
maintenance=False
state=EngineDown
stopped=False
--== Host 3 status ==--
conf_on_shared_storage : True
Status up-to-date : False
Hostname : ovirt-hyp-03.example.lan
Host ID : 3
Engine status : unknown stale-data
Score : 3400
stopped : False
Local maintenance : False
crc32 : f646334e
local_conf_timestamp : 20391
Host timestamp : 20377
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=20377 (Fri Jun 9 14:39:37 2017)
host-id=3
score=3400
vm_conf_refresh_time=20391 (Fri Jun 9 14:39:51 2017)
conf_on_shared_storage=True
maintenance=False
state=EngineStop
stopped=False
timeout=Thu Jan 1 00:43:08 1970
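
Note: "unknown stale-data" on all three hosts means the values above were last
written to the shared metadata on Jun 9, so the HA agents have not refreshed it
since then. If the service restarts further down take effect, the timestamps
should start advancing again; one way to watch for that (a sketch, not captured
from my session):

[root@ovirt-hyp-01 ~]# watch -n 10 'hosted-engine --vm-status | grep -E "up-to-date|Engine status"'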
[root@ovirt-hyp-01 ~]# gluster peer status
Number of Peers: 2
Hostname: 192.168.170.143
Uuid: b2b30d05-cf91-4567-92fd-022575e082f5
State: Peer in Cluster (Connected)
Other names:
10.0.0.2
Hostname: 192.168.170.147
Uuid: 4e50acc4-f3cb-422d-b499-fb5796a53529
State: Peer in Cluster (Connected)
Other names:
10.0.0.3
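
For the per-brick view Sahina asked about, these commands (which I have not
captured here) list each brick process and the self-heal daemon along with
their online state and PIDs:

[root@ovirt-hyp-01 ~]# gluster volume status engine
[root@ovirt-hyp-01 ~]# gluster volume status data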
[root@ovirt-hyp-01 ~]# gluster volume info all
Volume Name: data
Type: Replicate
Volume ID: 1d6bb110-9be4-4630-ae91-36ec1cf6cc02
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.170.141:/gluster_bricks/data/data
Brick2: 192.168.170.143:/gluster_bricks/data/data
Brick3: 192.168.170.147:/gluster_bricks/data/data (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: enable
Volume Name: engine
Type: Replicate
Volume ID: b160f0b2-8bd3-4ff2-a07c-134cab1519dd
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.170.141:/gluster_bricks/engine/engine
Brick2: 192.168.170.143:/gluster_bricks/engine/engine
Brick3: 192.168.170.147:/gluster_bricks/engine/engine (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: enable
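
Both volumes report Status: Started, so the unmounted data domain looks like a
client-side mount problem rather than a stopped volume. Since these are
replica 3 (arbiter) volumes, pending self-heals would also be worth ruling
out; a sketch, not captured here:

[root@ovirt-hyp-01 ~]# gluster volume heal engine info
[root@ovirt-hyp-01 ~]# gluster volume heal data info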
[root@ovirt-hyp-01 ~]# df -h
Filesystem                                    Size  Used Avail Use% Mounted on
/dev/mapper/centos_ovirt--hyp--01-root         50G  4.1G   46G   9% /
devtmpfs                                      7.7G     0  7.7G   0% /dev
tmpfs                                         7.8G     0  7.8G   0% /dev/shm
tmpfs                                         7.8G  8.7M  7.7G   1% /run
tmpfs                                         7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/mapper/centos_ovirt--hyp--01-home         61G   33M   61G   1% /home
/dev/mapper/gluster_vg_sdb-gluster_lv_engine   50G  7.6G   43G  16% /gluster_bricks/engine
/dev/mapper/gluster_vg_sdb-gluster_lv_data    730G  157G  574G  22% /gluster_bricks/data
/dev/sda1                                     497M  173M  325M  35% /boot
ovirt-hyp-01.example.lan:engine                50G  7.6G   43G  16% /rhev/data-center/mnt/glusterSD/ovirt-hyp-01.example.lan:engine
tmpfs                                         1.6G     0  1.6G   0% /run/user/0
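
As you can see, only the engine volume is mounted under
/rhev/data-center/mnt/glusterSD/; the data domain mount is missing. VDSM
normally creates that mount when the storage pool connects, but as a sketch,
the volume itself can be checked with a manual FUSE mount to a scratch
directory (/mnt/data-test is just a name I picked for illustration):

[root@ovirt-hyp-01 ~]# mkdir -p /mnt/data-test
[root@ovirt-hyp-01 ~]# mount -t glusterfs ovirt-hyp-01.example.lan:/data /mnt/data-test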
[root@ovirt-hyp-01 ~]# systemctl list-unit-files|grep ovirt
ovirt-ha-agent.service enabled
ovirt-ha-broker.service enabled
ovirt-imageio-daemon.service disabled
ovirt-vmconsole-host-sshd.service enabled
[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-agent.service
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 08:56:15 EDT; 21min ago
 Main PID: 3150 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─3150 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
Jun 15 08:56:15 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Jun 15 08:56:15 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
Jun 15 09:17:18 ovirt-hyp-01.example.lan ovirt-ha-agent[3150]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-broker.service
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 08:54:06 EDT; 24min ago
 Main PID: 968 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─968 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
Jun 15 08:54:06 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jun 15 08:54:06 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Jun 15 08:56:16 ovirt-hyp-01.example.lan ovirt-ha-broker[968]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: '...1b55bcf76'
                                                               Traceback (most recent call last):
                                                                 File "/usr/lib/python2.7/site-packages/ovirt...
Hint: Some lines were ellipsized, use -l to show in full.
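
The traceback above is truncated by systemctl; the full text should be in the
journal, or in /var/log/ovirt-hosted-engine-ha/broker.log (attached). For
example:

[root@ovirt-hyp-01 ~]# journalctl -l --no-pager -u ovirt-ha-broker.service | tail -n 50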
[root@ovirt-hyp-01 ~]# systemctl restart ovirt-ha-agent.service
[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-agent.service
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 09:19:21 EDT; 26s ago
 Main PID: 8563 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─8563 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
Jun 15 09:19:21 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Jun 15 09:19:21 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
[root@ovirt-hyp-01 ~]# systemctl restart ovirt-ha-broker.service
[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-broker.service
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 09:20:59 EDT; 28s ago
 Main PID: 8844 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─8844 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
Jun 15 09:20:59 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jun 15 09:20:59 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
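
With both services restarted on each host, the next step would be a manual
start attempt via hosted-engine's own CLI (not run here yet):

[root@ovirt-hyp-01 ~]# hosted-engine --vm-start
[root@ovirt-hyp-01 ~]# hosted-engine --vm-status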
On Jun 14, 2017 4:45 AM, "Sahina Bose" <sabose at redhat.com> wrote:
> What do "hosted-engine --vm-status" and "gluster volume status engine"
> tell you? Are all the bricks running as per "gluster volume status"?
>
> Can you try to restart the ovirt-ha-agent and ovirt-ha-broker services?
>
> If HE still has issues powering up, please provide agent.log and
> broker.log from /var/log/ovirt-hosted-engine-ha and the gluster mount logs
> from /var/log/glusterfs/rhev-data-center-mnt-<engine>.log
>
> On Thu, Jun 8, 2017 at 6:57 PM, Joel Diaz <mrjoeldiaz at gmail.com> wrote:
>
>> Good morning oVirt community,
>>
>> I'm running a three host gluster environment with hosted engine.
>>
>> Yesterday the engine went down and has not been able to come up properly.
>> It tries to start on all three hosts.
>>
>> I have two gluster volumes, data and engine. The data storage domain
>> volume is no longer mounted but the engine volume is up. I've restarted the
>> gluster service and made sure both volumes were running. The data volume
>> still will not mount.
>>
>> How can I get the engine running properly again?
>>
>> Thanks,
>>
>> Joel
>>
[Attachments: agent.log (1618026 bytes)
<http://lists.ovirt.org/pipermail/users/attachments/20170615/e6123f17/attachment-0002.obj>
and broker.log (1824640 bytes)
<http://lists.ovirt.org/pipermail/users/attachments/20170615/e6123f17/attachment-0003.obj>]