Users
November 2014
- 168 participants
- 256 discussions
Hi,
Release criteria discussion started on 2014-10-22 and should end today as per current release process [1].
Current options are:
1) keeping the same release criteria we had for 3.5 [2]
2) review the proposed changes [3] and prepare new release criteria for 3.6
The release management page for 3.6.0 has been created [4].
The key milestones for this release must now be scheduled:
Key Milestones
Release criteria discussion start: 2014-10-22
Release criteria ready: 2014-11-12
Feature freeze: 60 Days before release
First Test Day: 45 days before release
Release Candidate: 30 days before release
Release: 6 months after oVirt 3.5.0 release
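As a rough illustration of how the relative milestones above turn into calendar dates, here is a minimal sketch. The previous-release date passed in is a placeholder, not the actual oVirt 3.5.0 release date, and the helper names (add_months, milestones) are made up for this example.

```python
from datetime import date, timedelta
import calendar


def add_months(d, months):
    """Move forward by whole calendar months, clamping the day to month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def milestones(previous_release, cycle_months=6):
    """Derive milestone dates from the offsets listed in the schedule."""
    release = add_months(previous_release, cycle_months)
    return {
        "feature_freeze": release - timedelta(days=60),
        "first_test_day": release - timedelta(days=45),
        "release_candidate": release - timedelta(days=30),
        "release": release,
    }


# Placeholder input date (an assumption, not taken from this thread).
for name, when in milestones(date(2014, 10, 1)).items():
    print(name, when.isoformat())
```

Changing cycle_months to 10 would model the longer cycle proposed in [5].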
Two different proposals have been made about the above schedule [5]:
1) extend the cycle to 10 months to allow a large feature set to be included
2) reduce the cycle to less than 6 months and split features over 3.6 and 3.7
Features proposed for 3.6.0 must now be collected in the 3.6 Google doc [9]
and reviewed by maintainers.
A tracker bug for 3.6.0 has been created [6] and currently shows no blockers.
There are 399 bugs [7] targeted to 3.6.0.
Excluding node and documentation bugs we have 379 bugs [8] targeted to 3.6.0.
[1] http://www.ovirt.org/Release_process
[2] http://www.ovirt.org/OVirt_3.5_release-management#Release_Criteria
[3] http://lists.ovirt.org/pipermail/devel/2014-September/008695.html
[4] http://www.ovirt.org/OVirt_3.6_Release_Management
[5] http://lists.ovirt.org/pipermail/users/2014-November/028875.html
[6] https://bugzilla.redhat.com/show_bug.cgi?id=1155425
[7] http://goo.gl/zwkF3r
[8] http://goo.gl/ZbUiMc
[9] http://goo.gl/9X3G49
--
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com

Cannot find suitable CPU model for given data
by Thompson, John H. (GSFC-606.2)[Computer Sciences Corporation] 11 Nov '14
When trying to launch a CentOS 6.5 VM (pulled from
ovirt-image-repository) on a Westmere (Dell 6100)
host through oVirt, libvirt reports the following error:
3312: warning : x86Decode:1517 : Preferred CPU model Westmere not allowed by
hypervisor; closest supported model will be used
3312: error : x86Decode:1573 : internal error: Cannot find suitable CPU
model for given data
Googling the issue yields:
https://bugzilla.redhat.com/show_bug.cgi?id=804224
System info as virsh sees it:
$virsh capabilities
<capabilities>
<host>
<uuid>6622b24b-5019-4644-af20-289533f0a7bc</uuid>
<cpu>
<arch>x86_64</arch>
<model>Westmere</model>
<vendor>Intel</vendor>
CPU models qemu-kvm supports:
$/usr/libexec/qemu-kvm -cpu "?" -nodefconfig
x86 Opteron_G5 AMD Opteron 63xx class CPU
x86 Opteron_G4 AMD Opteron 62xx class CPU
x86 Opteron_G3 AMD Opteron 23xx (Gen 3 Class Opteron)
x86 Opteron_G2 AMD Opteron 22xx (Gen 2 Class Opteron)
x86 Opteron_G1 AMD Opteron 240 (Gen 1 Class Opteron)
x86 Haswell Intel Core Processor (Haswell)
x86 SandyBridge Intel Xeon E312xx (Sandy Bridge)
x86 Westmere Westmere E56xx/L56xx/X56xx (Nehalem-C)
x86 Nehalem Intel Core i7 9xx (Nehalem Class Core i7)
x86 Penryn Intel Core 2 Duo P9xxx (Penryn Class Core 2)
x86 Conroe Intel Celeron_4x0 (Conroe/Merom Class Core 2)
x86 cpu64-rhel5 QEMU Virtual CPU version (cpu64-rhel5)
x86 cpu64-rhel6 QEMU Virtual CPU version (cpu64-rhel6)
x86 n270 Intel(R) Atom(TM) CPU N270 @ 1.60GHz
x86 athlon QEMU Virtual CPU version 0.12.1
x86 pentium3
x86 pentium2
x86 pentium
x86 486
x86 coreduo Genuine Intel(R) CPU T2600 @ 2.16GHz
x86 qemu32 QEMU Virtual CPU version 0.12.1
x86 kvm64 Common KVM processor
x86 core2duo Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz
x86 phenom AMD Phenom(tm) 9550 Quad-Core Processor
x86 qemu64 QEMU Virtual CPU version 0.12.1
CPU model name from /proc/cpuinfo:
Intel(R) Xeon(R) CPU X5660 @ 2.80GHz
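To compare the model libvirt reports against the qemu-kvm list without reading it by eye, a small sketch along these lines might help. It only parses text in the format pasted above; the function name parse_cpu_models and the shortened sample are illustrative, and it does not query libvirt or qemu-kvm itself.

```python
def parse_cpu_models(listing):
    """Return the model names from `qemu-kvm -cpu "?"`-style output."""
    models = []
    for line in listing.splitlines():
        parts = line.split()
        # Each entry starts with the architecture tag, then the model name.
        if len(parts) >= 2 and parts[0] == "x86":
            models.append(parts[1])
    return models


# Shortened sample copied from the listing above.
SAMPLE = """\
x86 Haswell Intel Core Processor (Haswell)
x86 SandyBridge Intel Xeon E312xx (Sandy Bridge)
x86 Westmere Westmere E56xx/L56xx/X56xx (Nehalem-C)
x86 pentium3
x86 486
"""

models = parse_cpu_models(SAMPLE)
print("Westmere" in models)
```

A model appearing in this list only shows that qemu-kvm knows the name; it does not by itself explain why libvirt rejects it.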
Hi,
Do you have some free time and want to get involved in oVirt integration?
Here are a couple of bugs you can hopefully fix in less than one day, or you can just try to reproduce them and provide info:
Bug 1080823 - [RFE] make override of iptables configurable when using hosted-engine
Bug 1065350 - hosted-engine should prompt a question at the user when the host was already a host in the engine
Bug 1059952 - hosted-engine --deploy (additional host) will fail if the engine is not using the default self-signed CA
Bug 1073421 - [RFE] allow additional parameter for engine-backup to omit audit_log data
You don't have programming skills but you want to contribute?
Here are some bugs you can take care of, even without writing a line of code:
Bug ID   Status  Whiteboard   Summary
1118354  NEW     integration  [RFE] Automated testing should prevent leaking sensitive data
1120585  NEW     integration  update image uploader documentation
1120586  NEW     integration  update iso uploader documentation
1120588  NEW     integration  update log collector documentation
1083104  NEW     integration  engine-setup --offline does not update versionlock
1074301  NEW     infra        [RFE] ovirt-shell has no man page
Do you prefer to write on the wiki?
Bug ID   Status    Summary
1099998  NEW       Hosted Engine documentation has several errors
1099995  NEW       Migrate to Hosted Engine How-To does not state all pre-reqs
1054303  NEW       Dead links in "Quick Start Guide"
1125933  NEW       Provide a way to change /ca.crt for non-self-signed certs
1111918  NEW       Suggest a repository for installing hosts with an external network provider
1127123  ASSIGNED  cannot figure out how to use show statistic
1142623  NEW       Feature page: AAA 3.5 needs to be updated
1142649  POST      Feature page: AdvancedForemanIntegration needs to be updated
1142671  POST      Feature page: CommandCoordinator needs to be updated
1142616  NEW       Feature page: Gluster Volume Capacity needs to be updated
1142803  NEW       Feature page: Generic Node Registration needs to be updated
1142639  NEW       Feature page: Features/Design/JsonRpc needs to be updated
1142822  NEW       Feature page: Support blkio SLA features needs to be updated
1142846  NEW       Feature page: oVirt Scheduler API needs to be updated
1142806  NEW       Feature page: Node Hosted Engine needs to be updated
1142814  NEW       Feature page: oVirt Appliance needs to be updated
1142662  NEW       Feature page: PMHealthCheck needs to be updated
1142665  NEW       Feature page: Custom Fencing needs to be updated
1142652  NEW       Feature page: DetailedHostPMProxyPreferences needs to be updated
1142783  NEW       Feature page: Separate DWH Host needs to be updated
1111066  NEW       oVirt Hardening Guide
1074545  NEW       Error in API documentation: Create API object in python sdk
Is this your first time contributing to the oVirt project?
You can start from here [1][2]!
Don't know Gerrit very well? You can find some more docs here [3].
Any other questions about development? Feel free to ask on devel(a)ovirt.org or on the IRC channel [4].
[1] http://www.ovirt.org/Develop
[2] http://www.ovirt.org/Working_with_oVirt_Gerrit
[3] https://gerrit-review.googlesource.com/Documentation
[4] http://www.ovirt.org/Community
--
Sandro Bonazzola
On 29.10.2014 11:48, Xavier Naveira wrote:
> On 10/29/2014 11:47 AM, Xavier Naveira wrote:
>> On 10/29/2014 11:40 AM, Daniel Helgenberger wrote:
>>>
>>>
>>> On 29.10.2014 10:21, Xavier Naveira wrote:
>>>> Hi,
>>>>
>>>> We are migrating our infrastructure from kvm+libvirt hypervisors to
>>>> ovirt.
>>>>
>>>> Everything is working fine but we're noticing that all the qemu-kvm
>>>> processes in the hypervisors take a lot of CPU.
>>> Without further details of the workload this is hard to tell. One reason I
>>> can think of might be KSM [1]. Is it enabled on your cluster(s)? What is
>>> your mem over-commitment setting?
>>>
>>> Note, IIRC the KSM policy is currently hard coded; it will start at 80%
>>> host mem usage.
>>>
>>> [1] http://www.ovirt.org/Sla/host-mom-policy
>>>>
>>>> The typical example is an idle machine, running top from the machine
>>>> itself it reports cpu use percentages below 10% and loads (with 2
>>>> processors) of 0.0x. The process running that machine in the hypervisor
>>>> reports CPU usage on the order of 80-100%.
>>>>
>>>> Should the values look like this? Why are the idle machines eating up so
>>>> much CPU time?
>>>>
>>>> Thank you.
>>>> Xavier
>>>>
>>>>
>>>
>>
> Hi, thank you for the answer.
>
> I've been trying to work out some pattern and realized that the VMs
> using that much CPU are all Red Hat 5.x; the Red Hat 6.x ones don't exhibit
> this kind of high CPU use. (We run only Red Hat/CentOS 5.x/6.x on the
> cluster.)
What OS are the hosts running? In case of EL6, make sure you have
tuned-0.2.19-13.el6.noarch installed [1].
To further investigate please post Engine, VDSM, libvirt and kernel
versions from the hosts.
[1] https://access.redhat.com/solutions/358033
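For the KSM question raised earlier in this thread, a quick way to see whether KSM is active on a host is to read its counters from sysfs. This is only a sketch: it assumes the standard Linux location /sys/kernel/mm/ksm, the function name read_ksm_state is made up, and it says nothing about what the oVirt policy will do.

```python
from pathlib import Path


def read_ksm_state(sysfs_dir="/sys/kernel/mm/ksm"):
    """Read a few KSM counters; a missing or unreadable file yields None."""
    state = {}
    for name in ("run", "pages_shared", "pages_sharing"):
        path = Path(sysfs_dir) / name
        try:
            state[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            state[name] = None
    return state


if __name__ == "__main__":
    # run == 1 means the KSM scanner is currently enabled on this host.
    print(read_ksm_state())
```

The directory is a parameter so the function can be pointed at a copy of the files collected from another host.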
>
> I'll take a look at the KSM config.
>
> Cheers,
>
> Xavier
>
--
Daniel Helgenberger
m box bewegtbild GmbH
P: +49/30/2408781-22
F: +49/30/2408781-10
ACKERSTR. 19
D-10115 BERLIN
www.m-box.de www.monkeymen.tv
Managing Directors: Martin Retschitzegger / Michaela Göllner
Commercial Register: Amtsgericht Charlottenburg / HRB 112767
Hi,
today I decided to upgrade my oVirt 3.4 cluster to 3.5.
The engine migration went fine. After that, I went on to upgrade the node:
I installed the RPM on the node and followed:
http://www.ovirt.org/OVirt_3.5_Release_Notes#Install_.2F_Upgrade_from_Previ…
Now my VMs cannot be started. I get this error message:
2014-11-11 10:49:37,305 WARN [org.ovirt.engine.core.bll.RunVmCommand]
(org.ovirt.thread.pool-8-thread-13) CanDoAction of action RunVm failed.
Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,SCHEDULING_NO_HOSTS
2014-11-11 10:49:37,307 INFO [org.ovirt.engine.core.bll.RunVmCommand]
(org.ovirt.thread.pool-8-thread-13) Lock freed to object EngineLock
[exclusiveLocks= key: 0fdea856-9aec-4300-8d88-4cfd330cf4ff value: VM
, sharedLocks= ]
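The Reasons field in the warning above repeats the same codes several times, which makes the relevant one (SCHEDULING_NO_HOSTS) easy to miss. A small sketch for pulling the distinct codes out of an engine log line, assuming the "Reasons:" format shown above (the function name extract_reasons is illustrative):

```python
import re


def extract_reasons(log_text):
    """Collect distinct CanDoAction reason codes in order of first appearance."""
    reasons = []
    for match in re.finditer(r"Reasons:(\S+)", log_text):
        for code in match.group(1).split(","):
            if code and code not in reasons:
                reasons.append(code)
    return reasons


# The warning from the message above, joined back into a single line.
LINE = (
    "2014-11-11 10:49:37,305 WARN [org.ovirt.engine.core.bll.RunVmCommand] "
    "(org.ovirt.thread.pool-8-thread-13) CanDoAction of action RunVm failed. "
    "Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,"
    "VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,"
    "SCHEDULING_NO_HOSTS"
)

print(extract_reasons(LINE))
# → ['VAR__ACTION__RUN', 'VAR__TYPE__VM', 'SCHEDULING_NO_HOSTS']
```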
The full log is attached.
Any help would be appreciated.
James
Hi Jirka,
The patch works. It stabilized the status of my two hosts. The engine migration during failover also works fine. Thanks, guys!
Jaicel
From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
To: "Jaicel" <jaicel(a)asti.dost.gov.ph>
Cc: "Niels de Vos" <ndevos(a)redhat.com>, "Vijay Bellur" <vbellur(a)redhat.com>, users(a)ovirt.org, "Gluster Devel" <gluster-devel(a)gluster.org>
Sent: Monday, November 3, 2014 3:33:16 PM
Subject: Re: [ovirt-users] Hosted-Engine HA problem
On 11/01/2014 07:43 AM, Jaicel wrote:
> Hi,
>
> my engine runs on Host1. current status and agent logs below.
>
> Host 1
Hi,
it seems you ran into [1]; you can either zero out the metadata
file or apply the patch from [1] manually.
--Jirka
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1158925
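In the agent log quoted below, each remote host's metadata appears in an 'extra' field as newline-separated key=value pairs. When checking whether a host's entry is stale, it can help to split that field apart. A minimal sketch, with the sample string copied from the quoted log and a made-up function name parse_extra:

```python
def parse_extra(extra):
    """Split the newline-separated key=value metadata into a dict of strings."""
    fields = {}
    for line in extra.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


# 'extra' value for host id 2, as it appears in the quoted agent log.
EXTRA = (
    "metadata_parse_version=1\nmetadata_feature_version=1\n"
    "timestamp=1413882675 (Tue Oct 21 17:11:15 2014)\nhost-id=2\n"
    "score=2400\nmaintenance=False\nstate=EngineDown\n"
)

fields = parse_extra(EXTRA)
print(fields["state"], fields["score"], fields["timestamp"])
```

Note that the timestamp here (21 Oct) is well before the log date (31 Oct), which matches 'live-data': False in the same log line.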
>
> MainThread::INFO::2014-10-31 16:55:39,918::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 1.1.6 started
> MainThread::INFO::2014-10-31 16:55:39,985::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: 192.168.12.11
> MainThread::INFO::2014-10-31 16:55:40,228::hosted_engine::367::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
> MainThread::INFO::2014-10-31 16:55:40,228::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.12.254'}
> MainThread::INFO::2014-10-31 16:55:40,231::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140634215107920
> MainThread::INFO::2014-10-31 16:55:40,231::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:40,237::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140634215108432
> MainThread::INFO::2014-10-31 16:55:40,237::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:40,240::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 39956688
> MainThread::INFO::2014-10-31 16:55:40,240::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': '41d4aff1-54e1-4946-a812-2e656bb7d3f9', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:40,243::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140634215107664
> MainThread::INFO::2014-10-31 16:55:40,244::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': '41d4aff1-54e1-4946-a812-2e656bb7d3f9', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:40,249::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140634006879632
> MainThread::INFO::2014-10-31 16:55:40,249::hosted_engine::391::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Broker initialized, all submonitors started
> MainThread::INFO::2014-10-31 16:55:40,298::hosted_engine::476::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.lockspace)
> MainThread::INFO::2014-10-31 16:55:40,322::state_machine::153::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
> MainThread::INFO::2014-10-31 16:55:40,322::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host 192.168.12.12 (id 2): {'live-data': False, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1413882675 (Tue Oct 21 17:11:15 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n', 'hostname': '192.168.12.12', 'host-id': 2, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False, 'host-ts': 1413882675}
> MainThread::INFO::2014-10-31 16:55:40,322::state_machine::161::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 1): {'engine-health': None, 'bridge': True, 'mem-free': None, 'maintenance': False, 'cpu-load': None, 'gateway': True}
> MainThread::INFO::2014-10-31 16:55:40,323::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745740.32 type=state_transition detail=StartState-ReinitializeFSM hostname='ovirt1'
> MainThread::INFO::2014-10-31 16:55:40,392::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (StartState-ReinitializeFSM) sent? ignored
> MainThread::INFO::2014-10-31 16:55:40,675::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state ReinitializeFSM (score: 0)
> MainThread::INFO::2014-10-31 16:55:50,710::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745750.71 type=state_transition detail=ReinitializeFSM-EngineUp hostname='ovirt1'
> MainThread::INFO::2014-10-31 16:55:50,710::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (ReinitializeFSM-EngineUp) sent? ignored
> MainThread::INFO::2014-10-31 16:55:51,001::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::CRITICAL::2014-10-31 16:56:01,033::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could not start ha-agent
> Traceback (most recent call last):
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 97, in run
> self._run_agent()
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 154, in _run_agent
> hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 307, in start_monitoring
> for old_state, state, delay in self.fsm:
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 125, in next
> new_data = self.refresh(self._state.data)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 77, in refresh
> stats.update(self.hosted_engine.collect_stats())
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 700, in collect_stats
> stats = self.process_remote_metadata(host_id, remote_data)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 747, in process_remote_metadata
> md['engine-status'] = engine_status(md["engine-status"])
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 79, in engine_status
> in json.loads(status).iteritems()])
> AttributeError: 'NoneType' object has no attribute 'iteritems'
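
The traceback above ends in `json.loads(status).iteritems()` failing on `None`, which is what `json.loads` returns when the stored string is the JSON literal `null`. The sketch below reproduces that condition in isolation and shows one defensive way to decode such a value. It is only an illustration: `parse_engine_status` and the `UNKNOWN_STATUS` fallback are hypothetical names, and this is neither the code in ovirt_hosted_engine_ha nor the patch referenced in this thread.

```python
import json

# Hypothetical fallback value; not taken from ovirt_hosted_engine_ha.
UNKNOWN_STATUS = {"vm": "unknown", "health": "unknown", "detail": "unknown"}


def parse_engine_status(raw):
    """Decode a JSON engine-status string, tolerating empty or null values."""
    try:
        decoded = json.loads(raw) if raw else None
    except ValueError:
        # Malformed JSON (json.JSONDecodeError is a ValueError subclass).
        decoded = None
    if not isinstance(decoded, dict):
        # json.loads("null") yields None; iterating its items would raise
        # AttributeError, as in the traceback, so return a fallback copy.
        return dict(UNKNOWN_STATUS)
    return {str(key): str(value) for key, value in decoded.items()}
```

With a guard like this, a stale or zeroed metadata entry would be reported as "unknown" instead of stopping the caller with an exception.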
> [root@ovirt1 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date : False
> Hostname : 192.168.12.11
> Host ID : 1
> Engine status : unknown stale-data
> Score : 2400
> Local maintenance : False
> Host timestamp : 1414745750
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1414745750 (Fri Oct 31 16:55:50 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date : False
> Hostname : 192.168.12.12
> Host ID : 2
> Engine status : unknown stale-data
> Score : 2400
> Local maintenance : False
> Host timestamp : 1414745821
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1414745821 (Fri Oct 31 16:57:01 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineStart
> [root@ovirt1 ~]# service ovirt-ha-agent status
> ovirt-ha-agent dead but subsys locked
>
> Host2
>
> MainThread::INFO::2014-10-31 16:55:59,642::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 1.1.6 started
> MainThread::INFO::2014-10-31 16:55:59,678::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: 192.168.12.12
> MainThread::INFO::2014-10-31 16:55:59,918::hosted_engine::367::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
> MainThread::INFO::2014-10-31 16:55:59,919::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.12.254'}
> MainThread::INFO::2014-10-31 16:55:59,922::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 25353488
> MainThread::INFO::2014-10-31 16:55:59,922::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:59,928::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 25354128
> MainThread::INFO::2014-10-31 16:55:59,928::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:59,931::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 25353552
> MainThread::INFO::2014-10-31 16:55:59,931::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': '41d4aff1-54e1-4946-a812-2e656bb7d3f9', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:59,934::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 139976608389584
> MainThread::INFO::2014-10-31 16:55:59,934::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': '41d4aff1-54e1-4946-a812-2e656bb7d3f9', 'address': '0'}
> MainThread::INFO::2014-10-31 16:55:59,939::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 139976608447760
> MainThread::INFO::2014-10-31 16:55:59,939::hosted_engine::391::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Broker initialized, all submonitors started
> MainThread::INFO::2014-10-31 16:55:59,983::hosted_engine::476::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 2 is acquired (file: /rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.lockspace)
> MainThread::INFO::2014-10-31 16:56:00,001::state_machine::153::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
> MainThread::INFO::2014-10-31 16:56:00,001::state_machine::158::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host 192.168.12.11 (id 1): {'live-data': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1414745750 (Fri Oct 31 16:55:50 2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n', 'hostname': '192.168.12.11', 'host-id': 1, 'engine-status': {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400, 'maintenance': False, 'host-ts': 1414745750}
> MainThread::INFO::2014-10-31 16:56:00,001::state_machine::161::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': None, 'bridge': True, 'mem-free': None, 'maintenance': False, 'cpu-load': None, 'gateway': True}
> MainThread::INFO::2014-10-31 16:56:00,002::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745760.0 type=state_transition detail=StartState-ReinitializeFSM hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:56:00,045::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (StartState-ReinitializeFSM) sent? ignored
> MainThread::INFO::2014-10-31 16:56:00,325::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state ReinitializeFSM (score: 0)
> MainThread::INFO::2014-10-31 16:56:10,352::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745770.35 type=state_transition detail=ReinitializeFSM-EngineDown hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:56:10,353::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (ReinitializeFSM-EngineDown) sent? ignored
> MainThread::INFO::2014-10-31 16:56:10,638::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-10-31 16:56:20,663::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) The engine is not running, but we do not have enough data to decide which hosts are alive
> MainThread::INFO::2014-10-31 16:56:20,663::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745780.66 type=state_transition detail=EngineDown-EngineDown hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:56:20,664::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
> MainThread::INFO::2014-10-31 16:56:20,943::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-10-31 16:56:30,968::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) The engine is not running, but we do not have enough data to decide which hosts are alive
> MainThread::INFO::2014-10-31 16:56:30,969::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745790.97 type=state_transition detail=EngineDown-EngineDown hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:56:30,969::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
> MainThread::INFO::2014-10-31 16:56:31,248::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-10-31 16:56:41,274::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) The engine is not running, but we do not have enough data to decide which hosts are alive
> MainThread::INFO::2014-10-31 16:56:41,275::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745801.28 type=state_transition detail=EngineDown-EngineDown hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:56:41,276::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
> MainThread::INFO::2014-10-31 16:56:41,555::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-10-31 16:56:51,583::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) The engine is not running, but we do not have enough data to decide which hosts are alive
> MainThread::INFO::2014-10-31 16:56:51,584::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745811.58 type=state_transition detail=EngineDown-EngineDown hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:56:51,584::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineDown) sent? ignored
> MainThread::INFO::2014-10-31 16:56:51,864::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-10-31 16:57:01,897::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down and local host has best score (2400), attempting to start engine VM
> MainThread::INFO::2014-10-31 16:57:01,898::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1414745821.9 type=state_transition detail=EngineDown-EngineStart hostname='ovirt2'
> MainThread::INFO::2014-10-31 16:57:01,906::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineStart) sent? ignored
> MainThread::INFO::2014-10-31 16:57:02,189::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineStart (score: 2400)
> MainThread::CRITICAL::2014-10-31 16:57:02,207::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could not start ha-agent
> Traceback (most recent call last):
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 97, in run
> self._run_agent()
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 154, in _run_agent
> hosted_engine.HostedEngine(self.shutdown_requested).start_monitoring()
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 307, in start_monitoring
> for old_state, state, delay in self.fsm:
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 125, in next
> new_data = self.refresh(self._state.data)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 77, in refresh
> stats.update(self.hosted_engine.collect_stats())
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 662, in collect_stats
> constants.SERVICE_TYPE)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage
> result = self._checked_communicate(request)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 199, in _checked_communicate
> .format(message or response))
> RequestError: Request failed: <type 'exceptions.OSError'>
>
> [root@ovirt2 ~]# hosted-engine --vm-status
> Traceback (most recent call last):
> File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
> File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
> exec code in run_globals
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 111, in <module>
> if not status_checker.print_status():
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 58, in print_status
> all_host_stats = ha_cli.get_all_host_stats()
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 137, in get_all_host_stats
> return self.get_all_stats(self.StatModes.HOST)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 86, in get_all_stats
> constants.SERVICE_TYPE)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage
> result = self._checked_communicate(request)
> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 199, in _checked_communicate
> .format(message or response))
> ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: <type 'exceptions.OSError'>
> [root@ovirt2 ~]# service ovirt-ha-agent status
> ovirt-ha-agent dead but subsys locked
>
>
> Thanks,
> Jaicel
>
> ----- Original Message -----
> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
> To: "Jaicel" <jaicel(a)asti.dost.gov.ph>
> Cc: "Niels de Vos" <ndevos(a)redhat.com>, "Vijay Bellur" <vbellur(a)redhat.com>, users(a)ovirt.org, "Gluster Devel" <gluster-devel(a)gluster.org>
> Sent: Friday, October 31, 2014 11:05:32 PM
> Subject: Re: [ovirt-users] Hosted-Engine HA problem
>
> On 10/31/2014 10:26 AM, Jaicel wrote:
>> I've increased the limit and then restarted the agent and broker. The status normalized, but it has now gone back to "False" again, with both hosts still at a score of 2400. The agent logs remain the same, with "ovirt-ha-agent dead but subsys locked" status. ha-broker logs below:
>>
>> Thread-138::INFO::2014-10-31 17:24:22,981::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>> Thread-138::INFO::2014-10-31 17:24:22,991::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>> Thread-139::INFO::2014-10-31 17:24:38,385::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>> Thread-139::INFO::2014-10-31 17:24:38,395::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>> Thread-140::INFO::2014-10-31 17:24:53,816::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>> Thread-140::INFO::2014-10-31 17:24:53,827::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>> Thread-141::INFO::2014-10-31 17:25:09,172::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>> Thread-141::INFO::2014-10-31 17:25:09,182::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>> Thread-142::INFO::2014-10-31 17:25:24,551::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>> Thread-142::INFO::2014-10-31 17:25:24,562::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>>
>> Thanks,
>> Jaicel
>
> OK, now it seems that the broker runs fine, so I need the recent agent.log
> to debug it further.
>
> --Jirka
>
>>
>> ----- Original Message -----
>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>> To: "Jaicel R. Sabonsolin" <jaicel(a)asti.dost.gov.ph>, "Niels de Vos" <ndevos(a)redhat.com>
>> Cc: "Vijay Bellur" <vbellur(a)redhat.com>, users(a)ovirt.org, "Gluster Devel" <gluster-devel(a)gluster.org>
>> Sent: Friday, October 31, 2014 4:32:02 PM
>> Subject: Re: [ovirt-users] Hosted-Engine HA problem
>>
>> On 10/31/2014 03:53 AM, Jaicel R. Sabonsolin wrote:
>>> Hi guys,
>>>
>>> These logs appear on both hosts, just like the result of --vm-status. I tried running tcpdump on the oVirt hosts and Gluster nodes, but only packets exchanged with my monitoring VM (Zabbix) appeared.
>>>
>>> agent.log
>>> new_data = self.refresh(self._state.data)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 77, in refresh
>>> stats.update(self.hosted_engine.collect_stats())
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 662, in collect_stats
>>> constants.SERVICE_TYPE)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage
>>> result = self._checked_communicate(request)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 199, in _checked_communicate
>>> .format(message or response))
>>> RequestError: Request failed: <type 'exceptions.OSError'>
>>>
>>> broker.log
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
>>> response = "success " + self._dispatch(data)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
>>> .get_all_stats_for_service_type(**options)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
>>> d = self.get_raw_stats_for_service_type(storage_dir, service_type)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
>>> f = os.open(path, direct_flag | os.O_RDONLY)
>>> OSError: [Errno 24] Too many open files: '/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.metadata'
>>
>> - ah, there we go ^^^^^^ you might need to tweak the limit of allowed
>> open files as described here [1], or find out which app keeps so many files open
>>
>>
>> --Jirka
>>
>> [1]
>> http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-file…
>>
>>> Thread-38160::INFO::2014-10-31 10:28:37,989::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>>> Thread-38161::INFO::2014-10-31 10:28:53,656::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>>> Thread-38161::ERROR::2014-10-31 10:28:53,657::listener::190::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'get-stats storage_dir=/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent service_type=hosted-engine'
>>> Traceback (most recent call last):
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
>>> response = "success " + self._dispatch(data)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
>>> .get_all_stats_for_service_type(**options)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
>>> d = self.get_raw_stats_for_service_type(storage_dir, service_type)
>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
>>> f = os.open(path, direct_flag | os.O_RDONLY)
>>> OSError: [Errno 24] Too many open files: '/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.metadata'
>>> Thread-38161::INFO::2014-10-31 10:28:53,658::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>>>
>>> Thanks,
>>> Jaicel
>>>
>>> ----- Original Message -----
>>> From: "Niels de Vos" <ndevos(a)redhat.com>
>>> To: "Vijay Bellur" <vbellur(a)redhat.com>
>>> Cc: "Jiri Moskovcak" <jmoskovc(a)redhat.com>, "Jaicel R. Sabonsolin" <jaicel(a)asti.dost.gov.ph>, users(a)ovirt.org, "Gluster Devel" <gluster-devel(a)gluster.org>
>>> Sent: Friday, October 31, 2014 4:11:25 AM
>>> Subject: Re: [ovirt-users] Hosted-Engine HA problem
>>>
>>> On Thu, Oct 30, 2014 at 09:07:24PM +0530, Vijay Bellur wrote:
>>>> On 10/30/2014 06:45 PM, Jiri Moskovcak wrote:
>>>>> On 10/30/2014 09:22 AM, Jaicel R. Sabonsolin wrote:
>>>>>> Hi Guys,
>>>>>>
>>>>>> I need help with my oVirt Hosted-Engine HA setup. I am running 2
>>>>>> oVirt hosts and 2 Gluster nodes with replicated volumes. I already have
>>>>>> VMs running on my hosts, and they migrate normally when I, for example,
>>>>>> power off the host they are running on. The problem is that the
>>>>>> engine can't migrate once I switch off the host that hosts the engine.
>>>>>>
>>>>>> oVirt 3.4.3-1.el6
>>>>>> KVM 0.12.1.2 - 2.415.el6_5.10
>>>>>> LIBVIRT libvirt-0.10.2-29.el6_5.9
>>>>>> VDSM vdsm-4.14.17-0.el6
>>>>>>
>>>>>>
>>>>>> right now, i have this result from hosted-engine --vm-status.
>>>>>>
>>>>>> File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
>>>>>> "__main__", fname, loader, pkg_name)
>>>>>> File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
>>>>>> exec code in run_globals
>>>>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 111, in <module>
>>>>>> if not status_checker.print_status():
>>>>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 58, in print_status
>>>>>> all_host_stats = ha_cli.get_all_host_stats()
>>>>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 137, in get_all_host_stats
>>>>>> return self.get_all_stats(self.StatModes.HOST)
>>>>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 86, in get_all_stats
>>>>>> constants.SERVICE_TYPE)
>>>>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage
>>>>>> result = self._checked_communicate(request)
>>>>>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 199, in _checked_communicate
>>>>>> .format(message or response))
>>>>>> ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: <type 'exceptions.OSError'>
>>>>>>
>>>>>>
>>>>>> Restarting ha-broker and ha-agent normalizes the status, but eventually
>>>>>> it becomes "false" and then returns to the result above. I hope you
>>>>>> guys can help me with this.
>>>>>>
>>>>>
>>>>> Hi Jaicel,
>>>>> please attach agent.log and broker.log from the host where you are trying to
>>>>> run hosted-engine --vm-status. I have a feeling that you ran into a
>>>>> known problem on gluster: a stalled file descriptor. In that case the
>>>>> only known solution at this time is to restart the broker & agent, as you
>>>>> have already found out.
>>>>>
>>>>
>>>> Adding Niels and gluster-devel to troubleshoot from Gluster NFS perspective.
>>>
>>> I'd welcome any details on this "stalled file descriptor" problem. Is
>>> there a bug filed with some details like logs, sysrq-t and maybe even
>>> tcpdumps? If there is an easy way to reproduce this behaviour, I can
>>> surely look into it and hopefully come up with some advice or a fix.
>>>
>>> Thanks,
>>> Niels
>>>
d_engine_ha/agent/hosted_engine.py", line 79, in engine_status<br>> &nbs=
p; in json.loads(status).iteritems()])<br>> AttributeError:=
'NoneType' object has no attribute 'iteritems'<br>> [root@ovirt1 ~]# ho=
sted-engine --vm-status<br>><br>><br>> --=3D=3D Host 1 status =3D=
=3D--<br>><br>> Status up-to-date =
: False<br>> Hostname &n=
bsp; : 192.1=
68.12.11<br>> Host ID &=
nbsp; : 1<br>> Engine status &n=
bsp; :=
unknown stale-data<br>> Score =
: 2400<br>&g=
t; Local maintenance  =
; : False<br>> Host timestamp &=
nbsp; : 1414745750<br>> Extra metadata (vali=
d at timestamp):<br>> metadata_parse_v=
ersion=3D1<br>> metadata_feature_versi=
on=3D1<br>> timestamp=3D1414745750 (Fr=
i Oct 31 16:55:50 2014)<br>> host-id=
=3D1<br>> score=3D2400<br>> =
maintenance=3DFalse<br>> =
state=3DEngineUp<br>><br>><br>> --=3D=3D Host 2 stat=
us =3D=3D--<br>><br>> Status up-to-date &=
nbsp; : False<br>> Hostname &nb=
sp; :=
192.168.12.12<br>> Host ID &n=
bsp; : 2<br>> Engine sta=
tus &=
nbsp;: unknown stale-data<br>> Score =
: 2400=
<br>> Local maintenance =
: False<br>> Host timestamp &n=
bsp; : 1414745821<br>> Extra metadata=
(valid at timestamp):<br>> metadata_p=
arse_version=3D1<br>> metadata_feature=
_version=3D1<br>> timestamp=3D14147458=
21 (Fri Oct 31 16:57:01 2014)<br>> hos=
t-id=3D2<br>> score=3D2400<br>> &nb=
sp; maintenance=3DFalse<br>> &n=
bsp; state=3DEngineStart<br>> [root@ovirt1 ~]# service ovir=
t-ha-agent status<br>> ovirt-ha-agent dead but subsys locked<br>><br>=
> Host2<br>><br>> MainThread::INFO::2014-10-31 16:55:59,642::agent=
::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engi<br>=
> ne-ha agent 1.1.6 started<br>> MainThread::INFO::2014-10-31 16:55:5=
9,678::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.Hoste=
dEngine:<br>> :(_get_hostname) Found certificate common name: 192.168.12=
.12<br>> MainThread::INFO::2014-10-31 16:55:59,918::hosted_engine::367::=
ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:<br>> :(_initial=
ize_broker) Initializing ha-broker connection<br>> MainThread::INFO::201=
4-10-31 16:55:59,919::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlin=
k.BrokerLink::(start_mo<br>> nitor) Starting monitor ping, options {'add=
r': '192.168.12.254'}<br>> MainThread::INFO::2014-10-31 16:55:59,922::br=
okerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo<=
br>> nitor) Success, id 25353488<br>> MainThread::INFO::2014-10-31 16=
:55:59,922::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLi=
nk::(start_mo<br>> nitor) Starting monitor mgmt-bridge, options {'use_ss=
l': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}<br>> MainThread:=
:INFO::2014-10-31 16:55:59,928::brokerlink::137::ovirt_hosted_engine_ha.lib=
.brokerlink.BrokerLink::(start_mo<br>> nitor) Success, id 25354128<br>&g=
t; MainThread::INFO::2014-10-31 16:55:59,928::brokerlink::126::ovirt_hosted=
_engine_ha.lib.brokerlink.BrokerLink::(start_mo<br>> nitor) Starting mon=
itor mem-free, options {'use_ssl': 'true', 'address': '0'}<br>> MainThre=
ad::INFO::2014-10-31 16:55:59,931::brokerlink::137::ovirt_hosted_engine_ha.=
lib.brokerlink.BrokerLink::(start_mo<br>> nitor) Success, id 25353552<br=
>> MainThread::INFO::2014-10-31 16:55:59,931::brokerlink::126::ovirt_hos=
ted_engine_ha.lib.brokerlink.BrokerLink::(start_mo<br>> nitor) Starting =
monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': '41d4aff=
1-54e1-4946-a812-2e656bb7d3f<br>> 9', 'address': '0'}<br>> MainThread=
::INFO::2014-10-31 16:55:59,934::brokerlink::137::ovirt_hosted_engine_ha.li=
b.brokerlink.BrokerLink::(start_mo<br>> nitor) Success, id 1399766083895=
84<br>> MainThread::INFO::2014-10-31 16:55:59,934::brokerlink::126::ovir=
t_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_mo<br>> nitor) Star=
ting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': '41d4aff=
1-54e1-4946-a812-2e656bb7d3f9', '<br>> address': '0'}<br>> MainThread=
::INFO::2014-10-31 16:55:59,939::brokerlink::137::ovirt_hosted_engine_ha.li=
b.brokerlink.BrokerLink::(start_mo<br>> nitor) Success, id 1399766084477=
60<br>> MainThread::INFO::2014-10-31 16:55:59,939::hosted_engine::391::o=
virt_hosted_engine_ha.agent.hosted_engine.HostedEngine:<br>> :(_initiali=
ze_broker) Broker initialized, all submonitors started<br>> MainThread::=
INFO::2014-10-31 16:55:59,983::hosted_engine::476::ovirt_hosted_engine_ha.a=
gent.hosted_engine.HostedEngine:<br>> :(_initialize_sanlock) Ensuring le=
ase for lockspace hosted-engine, host id 2 is acquired (file: /rhev/data-ce=
nter/mnt/g<br>> luster1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_=
agent/hosted-engine.lockspace)<br>> MainThread::INFO::2014-10-31 16:56:0=
0,001::state_machine::153::ovirt_hosted_engine_ha.agent.hosted_engine.Hoste=
dEngine:<br>> :(refresh) Global metadata: {'maintenance': False}<br>>=
MainThread::INFO::2014-10-31 16:56:00,001::state_machine::158::ovirt_hoste=
d_engine_ha.agent.hosted_engine.HostedEngine:<br>> :(refresh) Host 192.1=
68.12.11 (id 1): {'live-data': True, 'extra': 'metadata_parse_version=3D1\n=
metadata_feature_version=3D<br>> 1\ntimestamp=3D1414745750 (Fri Oct 31 1=
6:55:50 2014)\nhost-id=3D1\nscore=3D2400\nmaintenance=3DFalse\nstate=3DEngi=
neUp\n', 'hostn<br>> ame': '192.168.12.11', 'host-id': 1, 'engine-status=
': {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400, 'm<br>>=
; aintenance': False, 'host-ts': 1414745750}<br>> MainThread::INFO::2014=
-10-31 16:56:00,001::state_machine::161::ovirt_hosted_engine_ha.agent.hoste=
d_engine.HostedEngine:<br>> :(refresh) Local (id 2): {'engine-health': N=
one, 'bridge': True, 'mem-free': None, 'maintenance': False, 'cpu-load': No=
<br>> ne, 'gateway': True}<br>> MainThread::INFO::2014-10-31 16:56:00=
,002::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(n=
otify)<br>> Trying: notify time=3D1414745760.0 type=3Dstate_transition d=
etail=3DStartState-ReinitializeFSM hostname=3D'ovirt2'<br>> MainThread::=
INFO::2014-10-31 16:56:00,045::brokerlink::117::ovirt_hosted_engine_ha.lib.=
brokerlink.BrokerLink::(notify)<br>> Success, was notification of state_=
transition (StartState-ReinitializeFSM) sent? ignored<br>> MainThread::I=
NFO::2014-10-31 16:56:00,325::hosted_engine::327::ovirt_hosted_engine_ha.ag=
ent.hosted_engine.HostedEngine:<br>> :(start_monitoring) Current state R=
einitializeFSM (score: 0)<br>> MainThread::INFO::2014-10-31 16:56:10,352=
::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notif=
y) Trying: notify time=3D1414745770.35 type=3Dstate_transition detail=3DRei=
nitializeFSM-EngineDown hostname=3D'ovirt2'<br>> MainThread::INFO::2014-=
10-31 16:56:10,353::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.=
BrokerLink::(notify) Success, was notification of state_transition (Reiniti=
alizeFSM-EngineDown) sent? ignored<br>> MainThread::INFO::2014-10-31 16:=
56:10,638::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.H=
ostedEngine::(start_monitoring) Current state EngineDown (score: 2400)<br>&=
gt; MainThread::INFO::2014-10-31 16:56:20,663::states::441::ovirt_hosted_en=
gine_ha.agent.hosted_engine.HostedEngine::(consume) The engine is not runni=
ng, but we do not have enough data to decide which hosts are alive<br>> =
MainThread::INFO::2014-10-31 16:56:20,663::brokerlink::108::ovirt_hosted_en=
gine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=3D141474578=
0.66 type=3Dstate_transition detail=3DEngineDown-EngineDown hostname=3D'ovi=
rt2'<br>> MainThread::INFO::2014-10-31 16:56:20,664::brokerlink::117::ov=
irt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notif=
ication of state_transition (EngineDown-EngineDown) sent? ignored<br>> M=
ainThread::INFO::2014-10-31 16:56:20,943::hosted_engine::327::ovirt_hosted_=
engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current stat=
e EngineDown (score: 2400)<br>> MainThread::INFO::2014-10-31 16:56:30,96=
8::states::441::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(c=
onsume) The engine is not running, but we do not have enough data to decide=
which hosts are alive<br>> MainThread::INFO::2014-10-31 16:56:30,969::b=
rokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) =
Trying: notify time=3D1414745790.97 type=3Dstate_transition detail=3DEngine=
Down-EngineDown hostname=3D'ovirt2'<br>> MainThread::INFO::2014-10-31 16=
:56:30,969::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLi=
nk::(notify) Success, was notification of state_transition (EngineDown-Engi=
neDown) sent? ignored<br>> MainThread::INFO::2014-10-31 16:56:31,248::ho=
sted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::=
(start_monitoring) Current state EngineDown (score: 2400)<br>> MainThrea=
d::INFO::2014-10-31 16:56:41,274::states::441::ovirt_hosted_engine_ha.agent=
.hosted_engine.HostedEngine::(consume) The engine is not running, but we do=
not have enough data to decide which hosts are alive<br>> MainThread::I=
NFO::2014-10-31 16:56:41,275::brokerlink::108::ovirt_hosted_engine_ha.lib.b=
rokerlink.BrokerLink::(notify) Trying: notify time=3D1414745801.28 type=3Ds=
tate_transition detail=3DEngineDown-EngineDown hostname=3D'ovirt2'<br>> =
MainThread::INFO::2014-10-31 16:56:41,276::brokerlink::117::ovirt_hosted_en=
gine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of st=
ate_transition (EngineDown-EngineDown) sent? ignored<br>> MainThread::IN=
FO::2014-10-31 16:56:41,555::hosted_engine::327::ovirt_hosted_engine_ha.age=
nt.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown =
(score: 2400)<br>> MainThread::INFO::2014-10-31 16:56:51,583::states::44=
1::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) The e=
ngine is not running, but we do not have enough data to decide which hosts =
are alive<br>> MainThread::INFO::2014-10-31 16:56:51,584::brokerlink::10=
8::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notif=
y time=3D1414745811.58 type=3Dstate_transition detail=3DEngineDown-EngineDo=
wn hostname=3D'ovirt2'<br>> MainThread::INFO::2014-10-31 16:56:51,584::b=
rokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) =
Success, was notification of state_transition (EngineDown-EngineDown) sent?=
ignored<br>> MainThread::INFO::2014-10-31 16:56:51,864::hosted_engine::=
327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monito=
ring) Current state EngineDown (score: 2400)<br>> MainThread::INFO::2014=
-10-31 16:57:01,897::states::454::ovirt_hosted_engine_ha.agent.hosted_engin=
e.HostedEngine::(consume) Engine down and local host has best score (2400),=
attempting to start engine VM<br>> MainThread::INFO::2014-10-31 16:57:0=
1,898::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(=
notify) Trying: notify time=3D1414745821.9 type=3Dstate_transition detail=
=3DEngineDown-EngineStart hostname=3D'ovirt2'<br>> MainThread::INFO::201=
4-10-31 16:57:01,906::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlin=
k.BrokerLink::(notify) Success, was notification of state_transition (Engin=
eDown-EngineStart) sent? ignored<br>> MainThread::INFO::2014-10-31 16:57=
:02,189::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.Hos=
tedEngine::(start_monitoring) Current state EngineStart (score: 2400)<br>&g=
t; MainThread::CRITICAL::2014-10-31 16:57:02,207::agent::103::ovirt_hosted_=
engine_ha.agent.agent.Agent::(run) Could not start ha-agent<br>> Traceba=
ck (most recent call last):<br>> File "/usr/lib/python2.6/s=
ite-packages/ovirt_hosted_engine_ha/agent/agent.py", line 97, in run<br>>=
; self._run_agent()<br>> File "/usr/lib=
/python2.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 154, =
in _run_agent<br>> hosted_engine.HostedEngine(self.s=
hutdown_requested).start_monitoring()<br>> File "/usr/lib/p=
ython2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line=
307, in start_monitoring<br>> for old_state, state,=
delay in self.fsm:<br>> File "/usr/lib/python2.6/site-pack=
ages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 125, in next<br>> =
new_data =3D self.refresh(self._state.data)<br>> &nb=
sp; File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/age=
nt/state_machine.py", line 77, in refresh<br>> stats=
.update(self.hosted_engine.collect_stats())<br>> File "/usr=
/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py"=
, line 662, in collect_stats<br>> constants.SERVICE_=
TYPE)<br>> File "/usr/lib/python2.6/site-packages/ovirt_hos=
ted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage<br>&g=
t; result =3D self._checked_communicate(request)<br>>=
; File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_h=
a/lib/brokerlink.py", line 199, in _checked_communicate<br>> &nbs=
p; .format(message or response))<br>> RequestError: Request failed=
: <type 'exceptions.OSError'><br>><br>> [root@ovirt2 ~]# hosted=
-engine --vm-status<br>> Traceback (most recent call last):<br>> &nbs=
p; File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_=
main<br>> "__main__", fname, loader, pkg_name)<br>&g=
t; File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code=
<br>> exec code in run_globals<br>> =
File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.=
py", line 111, in <module><br>> if not status_=
checker.print_status():<br>> File "/usr/lib/python2.6/site-=
packages/ovirt_hosted_engine_setup/vm_status.py", line 58, in print_status<=
br>> all_host_stats =3D ha_cli.get_all_host_stats()<=
br>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_en=
gine_ha/client/client.py", line 137, in get_all_host_stats<br>> &=
nbsp; return self.get_all_stats(self.StatModes.HOST)<br>> &=
nbsp;File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/c=
lient.py", line 86, in get_all_stats<br>> constants.=
SERVICE_TYPE)<br>> File "/usr/lib/python2.6/site-packages/o=
virt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_stora=
ge<br>> result =3D self._checked_communicate(request=
)<br>> File "/usr/lib/python2.6/site-packages/ovirt_hosted_=
engine_ha/lib/brokerlink.py", line 199, in _checked_communicate<br>> &nb=
sp; .format(message or response))<br>> ovirt_hosted_engine_=
ha.lib.exceptions.RequestError: Request failed: <type 'exceptions.OSErro=
r'><br>> [root@ovirt2 ~]# service ovirt-ha-agent status<br>> ovirt=
-ha-agent dead but subsys locked<br>><br>><br>> Thanks,<br>> Ja=
icel<br>><br>> ----- Original Message -----<br>> From: "Jiri Mosko=
vcak" <jmoskovc(a)redhat.com><br>> To: "Jaicel" <jaicel(a)asti.dost=
.gov.ph><br>> Cc: "Niels de Vos" <ndevos(a)redhat.com>, "Vijay Be=
llur" <vbellur(a)redhat.com>, users(a)ovirt.org, "Gluster Devel" <glus=
ter-devel(a)gluster.org><br>> Sent: Friday, October 31, 2014 11:05:32 P=
M<br>> Subject: Re: [ovirt-users] Hosted-Engine HA problem<br>><br>&g=
t; On 10/31/2014 10:26 AM, Jaicel wrote:<br>>> i've increased the lim=
it and then restarted agent and broker. status normalize, but then right no=
w it went to "False" state again but still both having 2400 score. agent lo=
gs remains the same, with "ovirt-ha-agent dead but subsys locked" status. h=
a-broker logs below<br>>><br>>> Thread-138::INFO::2014-10-31 17=
:24:22,981::listener::134::ovirt_hosted_engine_ha.broker.listener.Connectio=
nHandler::(setup) Connection established<br>>> Thread-138::INFO::2014=
-10-31 17:24:22,991::listener::184::ovirt_hosted_engine_ha.broker.listener.=
ConnectionHandler::(handle) Connection closed<br>>> Thread-139::INFO:=
:2014-10-31 17:24:38,385::listener::134::ovirt_hosted_engine_ha.broker.list=
ener.ConnectionHandler::(setup) Connection established<br>>> Thread-1=
39::INFO::2014-10-31 17:24:38,395::listener::184::ovirt_hosted_engine_ha.br=
oker.listener.ConnectionHandler::(handle) Connection closed<br>>> Thr=
ead-140::INFO::2014-10-31 17:24:53,816::listener::134::ovirt_hosted_engine_=
ha.broker.listener.ConnectionHandler::(setup) Connection established<br>>=
;> Thread-140::INFO::2014-10-31 17:24:53,827::listener::184::ovirt_hoste=
d_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed<b=
r>>> Thread-141::INFO::2014-10-31 17:25:09,172::listener::134::ovirt_=
hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection esta=
blished<br>>> Thread-141::INFO::2014-10-31 17:25:09,182::listener::18=
4::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Conne=
ction closed<br>>> Thread-142::INFO::2014-10-31 17:25:24,551::listene=
r::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) C=
onnection established<br>>> Thread-142::INFO::2014-10-31 17:25:24,562=
::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::=
(handle) Connection closed<br>>><br>>> Thanks,<br>>> Jaic=
el<br>><br>> ok, now it seems that broker runs fine, so I need the re=
cent agent.log<br>> to debug it more.<br>><br>> --Jirka<br>><br=
>>><br>>> ----- Original Message -----<br>>> From: "Jiri =
Moskovcak" <jmoskovc(a)redhat.com><br>>> To: "Jaicel R. Sabonsoli=
n" <jaicel(a)asti.dost.gov.ph>, "Niels de Vos" <ndevos(a)redhat.com>=
;<br>>> Cc: "Vijay Bellur" <vbellur(a)redhat.com>, users(a)ovirt.or=
g, "Gluster Devel" <gluster-devel(a)gluster.org><br>>> Sent: Frid=
ay, October 31, 2014 4:32:02 PM<br>>> Subject: Re: [ovirt-users] Host=
ed-Engine HA problem<br>>><br>>> On 10/31/2014 03:53 AM, Jaicel=
R. Sabonsolin wrote:<br>>>> Hi guys,<br>>>><br>>>&=
gt; these logs appear on both hosts just like the result of --vm-status. tr=
ied to tcpdump on ovirt hosts and gluster nodes but only packets exchange w=
ith my monitoring VM(zabbix) appeared.<br>>>><br>>>> agen=
t.log<br>>>> new_data =3D self.refresh(=
self._state.data)<br>>>> File "/usr/lib/python=
2.6/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 77, =
in refresh<br>>>> stats.update(self.hos=
ted_engine.collect_stats())<br>>>> File "/usr/=
lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",=
line 662, in collect_stats<br>>>> cons=
tants.SERVICE_TYPE)<br>>>> File "/usr/lib/pyth=
on2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in=
get_stats_from_storage<br>>>> result =
=3D self._checked_communicate(request)<br>>>> =
File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlin=
k.py", line 199, in _checked_communicate<br>>>> &nbs=
p; .format(message or response))<br>>>> RequestError: Reques=
t failed: <type 'exceptions.OSError'><br>>>><br>>>>=
broker.log<br>>>> File "/usr/lib/python2.6/si=
te-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle=
<br>>>> response =3D "success " + self.=
_dispatch(data)<br>>>> File "/usr/lib/python2.=
6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _d=
ispatch<br>>>> .get_all_stats_for_servi=
ce_type(**options)<br>>>> File "/usr/lib/pytho=
n2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 4=
1, in get_all_stats_for_service_type<br>>>> &=
nbsp;d =3D self.get_raw_stats_for_service_type(storage_dir, service_type)<b=
r>>>> File "/usr/lib/python2.6/site-packages/o=
virt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_=
for_service_type<br>>>> f =3D os.open(p=
ath, direct_flag | os.O_RDONLY)<br>>>> OSError: [Errno 24] Too man=
y open files: '/rhev/data-center/mnt/gluster1:_engine/6eb220be-daff-4785-8f=
78-111cc24139c4/ha_agent/hosted-engine.metadata'<br>>><br>>> - =
ah, there we go ^^^^^^ you might need to tweak the limit of allowed<br>>=
> open files as described here [1] or find the app keeps so many files o=
pen<br>>><br>>><br>>> --Jirka<br>>><br>>> [1]=
<br>>> http://www.cyberciti.biz/faq/linux-increase-the-maximum-number=
-of-open-files/<br>>><br>>>> Thread-38160::INFO::2014-10-31 =
10:28:37,989::listener::184::ovirt_hosted_engine_ha.broker.listener.Connect=
ionHandler::(handle) Connection closed<br>>>> Thread-38161::INFO::=
2014-10-31 10:28:53,656::listener::134::ovirt_hosted_engine_ha.broker.liste=
ner.ConnectionHandler::(setup) Connection established<br>>>> Threa=
d-38161::ERROR::2014-10-31 10:28:53,657::listener::190::ovirt_hosted_engine=
_ha.broker.listener.ConnectionHandler::(handle) Error handling request, dat=
a: 'get-stats storage_dir=3D/rhev/data-center/mnt/gluster1:_engine/6eb220be=
-daff-4785-8f78-111cc24139c4/ha_agent service_type=3Dhosted-engine'<br>>=
>> Traceback (most recent call last):<br>>>> &=
nbsp;File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/l=
istener.py", line 165, in handle<br>>>>  =
;response =3D "success " + self._dispatch(data)<br>>>> &nbs=
p; File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/brok=
er/listener.py", line 261, in _dispatch<br>>>>  =
; .get_all_stats_for_service_type(**options)<br>>>> &=
nbsp; File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/b=
roker/storage_broker.py", line 41, in get_all_stats_for_service_type<br>>=
;>> d =3D self.get_raw_stats_for_service_t=
ype(storage_dir, service_type)<br>>>> File "/u=
sr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker=
.py", line 74, in get_raw_stats_for_service_type<br>>>> &nb=
sp; f =3D os.open(path, direct_flag | os.O_RDONLY)<br>>>=
> OSError: [Errno 24] Too many open files: '/rhev/data-center/mnt/gluste=
r1:_engine/6eb220be-daff-4785-8f78-111cc24139c4/ha_agent/hosted-engine.meta=
data'<br>>>> Thread-38161::INFO::2014-10-31 10:28:53,658::listener=
::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) C=
onnection closed<br>>>><br>>>> Thanks,<br>>>> Ja=
icel<br>>>><br>>>> ----- Original Message -----<br>>&g=
t;> From: "Niels de Vos" <ndevos(a)redhat.com><br>>>> To: "=
Vijay Bellur" <vbellur(a)redhat.com><br>>>> Cc: "Jiri Moskovca=
k" <jmoskovc(a)redhat.com>, "Jaicel R. Sabonsolin" <jaicel(a)asti.dost=
.gov.ph>, users(a)ovirt.org, "Gluster Devel" <gluster-devel(a)gluster.org=
><br>>>> Sent: Friday, October 31, 2014 4:11:25 AM<br>>>&=
gt; Subject: Re: [ovirt-users] Hosted-Engine HA problem<br>>>><br>=
>>> On Thu, Oct 30, 2014 at 09:07:24PM +0530, Vijay Bellur wrote:<=
br>>>>> On 10/30/2014 06:45 PM, Jiri Moskovcak wrote:<br>>&g=
t;>>> On 10/30/2014 09:22 AM, Jaicel R. Sabonsolin wrote:<br>>&=
gt;>>>> Hi Guys,<br>>>>>>><br>>>>>=
;>> I need help with my ovirt Hosted-Engine HA setup. I am running on=
2<br>>>>>>> ovirt hosts and 2 gluster nodes with replica=
ted volumes. i already have<br>>>>>>> VMs running on my h=
osts and they can migrate normally once i for example<br>>>>>&g=
t;> power off the host that they are running on. the problem is that the=
<br>>>>>>> engine can't migrate once i switch off the hos=
t that hosts the engine.<br>>>>>>><br>>>>>>=
;> oVirt 3.4.3-1.el6<br>=
>>>>>> KVM &nbs=
p; 0.12.1.2 - 2.415.el6_5.10<br>>>>>>> &nbs=
p; LIBVIRT libvirt-0.10.2-29.el6_5.9<br>>>>>>> &nb=
sp; VDSM vdsm-4.14.17-0.el6<br>>>&g=
t;>>><br>>>>>>><br>>>>>>> righ=
t now, i have this result from hosted-engine --vm-status.<br>>>>&g=
t;>><br>>>>>>> Fi=
le "/usr/lib64/python2.6/runpy.py", line 122, in<br>>>>>>>=
; _run_module_as_main<br>>>>>>> &nbs=
p; "__main__", fname, loader, pkg_name)<b=
r>>>>>>> File "/usr/lib=
64/python2.6/runpy.py", line 34, in _run_code<br>>>>>>> &=
nbsp; exec code in run_globals<br>>>=
;>>>> File<br>>>>>=
;>><br>>>>>>> "/usr/lib/python2.6/site-packages/ovi=
rt_hosted_engine_setup/vm_status.py",<br>>>>>>><br>>&g=
t;>>>> line 111, in <module><br>>=
>>>>> if not status=
_checker.print_status():<br>>>>>>> &=
nbsp; File<br>>>>>>><br>>>>>>> "/=
usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py",<br=
>>>>>>><br>>>>>>> =
line 58, in print_status<br>>>>>>> &=
nbsp; all_host_stats =3D ha_cli.get_all_host_stats()<br>>&g=
t;>>>> File<br>>>>&g=
t;>><br>>>>>>> "/usr/lib/python2.6/site-packages/ov=
irt_hosted_engine_ha/client/client.py",<br>>>>>>><br>>=
>>>>> line 137, in get_all_host_stats<b=
r>>>>>>> return =
self.get_all_stats(self.StatModes.HOST)<br>>>>>>> =
File<br>>>>>>><br>>>>=
>>> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/clien=
t/client.py",<br>>>>>>><br>>>>>>>  =
; line 86, in get_all_stats<br>>>>>>>  =
; constants.SERVICE_TYPE)<br>>>>=
>>> File<br>>>>>>=
><br>>>>>>> "/usr/lib/python2.6/site-packages/ovirt_ho=
sted_engine_ha/lib/brokerlink.py",<br>>>>>>><br>>>&=
gt;>>> line 171, in get_stats_from_storage<br=
>>>>>>> result =
=3D self._checked_communicate(request)<br>>>>>>> &=
nbsp; File<br>>>>>>><br>>>>&=
gt;>> "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/br=
okerlink.py",<br>>>>>>><br>>>>>>>  =
; line 199, in _checked_communicate<br>>>>>>&g=
t; .format(message or response))<b=
r>>>>>>> ovirt_hosted_engine_ha.lib.=
exceptions.RequestError: Request failed:<br>>>>>>> =
<type 'exceptions.OSError'><br>>>>>>>=
;<br>>>>>>><br>>>>>>> restarting ha-bro=
ker and ha-agent normalizes the status but eventually<br>>>>>&g=
t;> it would become "false" and then return to the result above. hope yo=
u<br>>>>>>> guys could help me with this.<br>>>>=
>>><br>>>>>><br>>>>>> Hi Jaicel,<br>=
>>>>> please attach agent.log and broker.log from the host w=
here you trying to<br>>>>>> run hosted-engine --vm-status. I=
have a feeling that you ran into a<br>>>>>> known problem o=
n gluster - stalled file descriptor, in that case the<br>>>>>&g=
t; only known solution at this time is to restart the broker & agent as=
you<br>>>>>> have already found out.<br>>>>>>=
;<br>>>>><br>>>>> Adding Niels and gluster-devel to=
troubleshoot from Gluster NFS perspective.<br>>>><br>>>>=
I'd welcome any details on this "stalled file descriptor" problem. Is<br>&=
gt;>> there a bug filed with some details like logs, sysrq-t and mayb=
e even<br>>>> tcpdumps? If there is an easy way to reproduce this =
behaviour, I can<br>>>> surely look into it and hopefully come up =
with some advise or fix.<br>>>><br>>>> Thanks,<br>>>=
;> Niels<br>>>></div></div><br></div></div></body></html>
2
1

Jiří Moskovčák changed the time of the event Ovirt - Hosted Engine iSCSI support (deep dive)
by Google+ (Jiří Moskovčák) 11 Nov '14
Jiří Moskovčák changed the event time to
Wed, November 12, 14:00 CET
This notification was sent to users(a)ovirt.org. To update your address,
go to your notification delivery settings:
https://plus.google.com/_/notifications/ngemlink?&emid=COCoq96I8sECFRKW3Aod…
In subscription management you can choose which emails you receive from Google+:
https://plus.google.com/_/notifications/ngemlink?&emid=COCoq96I8sECFRKW3Aod…
Google Inc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043 USA
1
0
I attempted to make a live snapshot of a Windows VM last night, and got
the message "Failed to create live snapshot <name> for VM <vm>. VM restart
is recommended".
I then shut the VM down normally and attempted to remove the snapshot.
I got an error that the snapshot could not be removed because it was still
in progress. Looking under the covers at the 'images' directory for
this VM on the storage filesystem, I don't see any evidence of a snapshot
in progress (there are no *_MERGE* files).
The Snapshots tab for the VM does not show any snapshots (other than
the 'Active VM' entry).
However, I can't boot the VM. When I attempt to detach the disk and
mount it on another Windows VM for inspection, I receive the message
"Cannot detach virtual disk. The disk is already configured in a snapshot.
In order to detach it, remove the disk's snapshots".
How can I convince oVirt there is no snapshot so I can move forward with
the rest of resurrecting this important VM?
oVirt Engine Version: 3.4.0-1.fc19 (running on Fedora 19 server)
hosts: running vdsm-4.13.0-11.el6.x86_64 on CentOS 6.5
storage: NFS from an in-house NAS based on ZFS on OpenIndiana
Many thanks in advance,
Toby
--
Toby Chappell, RHCE
Director, Enterprise Services
Educational Technology
Georgia Gwinnett College
toby(a)ggc.edu / 678-407-5305
1
0
Hello,
I would like to know whether oVirt's support for mixed clusters (with
AMD and Intel CPUs) has improved.
It appears that KVM itself supports this, because at least PVE/Proxmox
and CloudStack support mixed clusters with migration and HA.
Thanks in advance for any reply!
Mario
1
1