Alerts
by Koen Vanoppen
Hi All,
We recently had a major crash on our oVirt environment due to a strange bug
(it has been reported in the meanwhile).
But now we are left with a bunch of alerts (100+) that are still standing
there... Is there a way I can flush them manually, from the command line or so?
Because right-click + "Clear all" doesn't seem to work that well... :-).
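One possible command-line approach, sketched here as a starting point only: oVirt 3.x exposes alerts as events in the REST API, so listing them and deleting each one might work. The /api/events endpoint and the DELETE verb are assumptions to verify against your engine (e.g. via /api?rsdl) before running anything, and the URL, credentials, and session object are placeholders:

```python
import re

# Hypothetical sketch: bulk-clear oVirt alert events over the REST API.
# Endpoint names and the DELETE verb are assumptions -- verify them
# against your engine before use.

def extract_event_ids(events_xml):
    """Pull event ids out of a GET /api/events response body."""
    return re.findall(r'<event[^>]*\bid="([^"]+)"', events_xml)

def clear_alerts(session, base_url, auth):
    """session: e.g. a requests.Session(); auth: (user, password)."""
    resp = session.get(base_url + "/api/events", auth=auth, verify=False)
    for event_id in extract_event_ids(resp.text):
        session.delete("%s/api/events/%s" % (base_url, event_id),
                       auth=auth, verify=False)
```

The parsing helper is kept separate from the HTTP calls so it can be sanity-checked without a live engine.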
Kind regards,
Koen
9 years, 11 months
Issues with vm start up
by Shanil S
Hi All,
I am using oVirt version 3.5 and having some issues with VM startup
with cloud-init using the API in run-once mode.
Below are the steps I follow:
1. Create the VM via the API from a pre-created template.
2. Start the VM in run-once mode and push the cloud-init data via the API.
3. The VM gets stuck, and the console displays the following:
Booting from DVD/CD...
Boot failed: could not read from CDROM (code 004)
I am using the following XML for this operation:
<action>
  <vm>
    <os>
      <boot dev='cdrom'/>
    </os>
    <initialization>
      <cloud_init>
        <host>
          <address>test</address>
        </host>
        <network_configuration>
          <nics>
            <nic>
              <interface>virtIO</interface>
              <name>eth0</name>
              <boot_protocol>static</boot_protocol>
              <mac address=''/>
              <network>
                <ip address='' netmask='' gateway=''/>
              </network>
              <on_boot>true</on_boot>
              <vnic_profile id=''/>
            </nic>
            <nic>
              <interface>virtIO</interface>
              <name>eth1</name>
              <boot_protocol>static</boot_protocol>
              <mac address=''/>
              <network>
                <ip address='' netmask='255.255.255.0' gateway=''/>
              </network>
              <on_boot>true</on_boot>
              <vnic_profile id=''/>
            </nic>
          </nics>
        </network_configuration>
        <files>
          <file>
            <name>/ignored</name>
            <content><![CDATA[#cloud-config
disable-ec2-metadata: true
disable_root: false
ssh_pwauth: true
ssh_deletekeys: true
chpasswd: { expire: False }
users:
  - name: root
    primary-group: root
    passwd: 8W7RQ5Bh
    lock-passwd: false
runcmd:
  - sed -i '/nameserver/d' /etc/resolv.conf
  - echo 'nameserver 8.8.8.8' >> /etc/resolv.conf
  - echo 'nameserver 8.8.4.4' >> /etc/resolv.conf
  - echo 'root:8W7RQ5Bh' | chpasswd
  - yum -y update
  - yum -y install rdate
  - rdate -s stdtime.gov.hk]]></content>
            <type>plaintext</type>
          </file>
        </files>
      </cloud_init>
      <custom_script><![CDATA[#cloud-config
disable-ec2-metadata: true
disable_root: false
ssh_pwauth: true
ssh_deletekeys: true
chpasswd: { expire: False }
users:
  - name: root
    primary-group: root
    passwd: 8W7RQ5Bh
    lock-passwd: false
runcmd:
  - sed -i '/nameserver/d' /etc/resolv.conf
  - echo 'nameserver 8.8.8.8' >> /etc/resolv.conf
  - echo 'nameserver 8.8.4.4' >> /etc/resolv.conf
  - echo 'root:8W7RQ5Bh' | chpasswd
  - yum -y update
  - yum -y install rdate
  - rdate -s stdtime.gov.hk]]></custom_script>
    </initialization>
  </vm>
</action>
I am also attaching a screenshot.
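For what it's worth, the same run-once payload can be assembled programmatically rather than as a raw XML string. The following is only a sketch of mine: the element names simply mirror the XML above, the NIC and file sections are omitted for brevity, and the boot device is a parameter so that 'hd' can be tried instead of 'cdrom' when chasing boot-order problems:

```python
import xml.etree.ElementTree as ET

# Sketch only: builds the <action> body posted to the run-once start call.
# NIC and <files> sections are left out; they would be added the same way.

def build_run_once_action(boot_dev, cloud_init_hostname, custom_script):
    """Return the serialized <action> body for the run-once start call."""
    action = ET.Element("action")
    vm = ET.SubElement(action, "vm")
    os_el = ET.SubElement(vm, "os")
    ET.SubElement(os_el, "boot", {"dev": boot_dev})  # 'cdrom' or 'hd'
    init = ET.SubElement(vm, "initialization")
    cloud = ET.SubElement(init, "cloud_init")
    host = ET.SubElement(cloud, "host")
    ET.SubElement(host, "address").text = cloud_init_hostname
    ET.SubElement(init, "custom_script").text = custom_script
    return ET.tostring(action)
```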
--
Regards
Shanil
9 years, 11 months
Backup and Restore of VMs
by Soeren Malchow
Dear all,
ovirt: 3.5
gluster: 3.6.1
OS: CentOS 7 (except the ovirt hosted engine = CentOS 6.6)
I have spent quite a while researching backup and restore for VMs; so far I
have come up with this as a start for us:
- API calls to create scheduled snapshots of virtual machines
This is for short-term storage and to guard against accidental deletion
within a VM, but not against storage corruption
- Since we are using a gluster backend, gluster snapshots
I wasn't able to really test this so far, since the LV needs to be thin
provisioned and we did not do that in the setup
For the API calls we have the problem that we cannot find any existing
scripts or something like that to do those snapshots (and I/we are not
developers enough to do that).
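Not from the original mail, but as a possible starting point for such a script: with the oVirt 3.5-era Python SDK (ovirtsdk), a scheduled-snapshot job could look roughly like the sketch below. The URL, credentials, retention count, and the exact ovirtsdk call signatures are assumptions to check against the installed SDK; only the pure helper functions are certain to behave as written.

```python
import datetime

def snapshot_description(prefix, now):
    """Build a sortable description like 'auto-2014-12-30T13:00'."""
    return "%s-%s" % (prefix, now.strftime("%Y-%m-%dT%H:%M"))

def snapshots_to_prune(descriptions, prefix, keep):
    """Return the oldest auto-snapshot descriptions beyond the retention count."""
    auto = sorted(d for d in descriptions if d.startswith(prefix + "-"))
    return auto[:-keep] if len(auto) > keep else []

def run(api, vm_names, prefix="auto", keep=7):
    """api: an ovirtsdk.api.API instance (assumed SDK layout; verify locally)."""
    from ovirtsdk.xml import params  # assumption: oVirt 3.5 Python SDK
    for name in vm_names:
        vm = api.vms.get(name=name)
        desc = snapshot_description(prefix, datetime.datetime.now())
        vm.snapshots.add(params.Snapshot(description=desc))
        descs = [s.get_description() for s in vm.snapshots.list()]
        stale = set(snapshots_to_prune(descs, prefix, keep))
        for snap in vm.snapshots.list():
            if snap.get_description() in stale:
                snap.delete()
```

Run from cron on the engine host, this would give the short-term rotation described above; restores would still go through the admin portal.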
As additional information, we have ZFS-based storage with deduplication that
we use for other backup purposes; it does a great job, especially because of
the deduplication (we can store generations of backups without problems).
This storage can be NFS-exported and used as a backup repository.
Are there any backup and restore procedures you are using that work for you,
and can you point me in the right direction?
I am a little bit lost right now and would appreciate any help.
Regards
Soeren
9 years, 11 months
Re: [ovirt-users] Can't remove a storage domain related to a broken hardware
by Olivier Navas
Hi!
I finally solved my problem by removing the information related to the broken
storage from the engine database.
Below are the SQL queries I executed in postgres (with the engine stopped
while executing them) in order to remove the storage domain. Maybe it can be
useful for somebody else in the same situation.
Connect to the database:
# su - postgres
# psql engine
Identify the broken storage:
engine=# select id, storage_name from storage_domain_static;
                  id                  |      storage_name
--------------------------------------+--------------------------
 34c95a44-db7f-4d0f-ba13-5f06a7feefe7 | my-broken-storage-domain
Identify the OVF disks on this storage:
engine=# select storage_domain_id, ovf_disk_id from storage_domains_ovf_info where storage_domain_id = '34c95a44-db7f-4d0f-ba13-5f06a7feefe7';
          storage_domain_id           |             ovf_disk_id
--------------------------------------+--------------------------------------
 34c95a44-db7f-4d0f-ba13-5f06a7feefe7 | 033f8fba-5145-47e8-b3b5-32a34a39ad11
 34c95a44-db7f-4d0f-ba13-5f06a7feefe7 | 2d9a6a40-1dd0-4180-b7c7-3829a443c825
engine=# delete from storage_domain_dynamic where id = '34c95a44-db7f-4d0f-ba13-5f06a7feefe7';
engine=# delete from storage_domain_static where id = '34c95a44-db7f-4d0f-ba13-5f06a7feefe7';
engine=# delete from base_disks where disk_id = '033f8fba-5145-47e8-b3b5-32a34a39ad11';
engine=# delete from base_disks where disk_id = '2d9a6a40-1dd0-4180-b7c7-3829a443c825';
engine=# delete from storage_domains_ovf_info where storage_domain_id = '34c95a44-db7f-4d0f-ba13-5f06a7feefe7';
engine=# delete from storage_pool_iso_map where storage_id = '34c95a44-db7f-4d0f-ba13-5f06a7feefe7';
Identify and delete the LUNs and connections related to the storage (I did
not take notes of the results of these, but it was easy to find the right
rows). Also, I noticed that lun_storage_server_connection_map only contained
information about iSCSI storage, not Fibre Channel.
engine=# select * from luns;
engine=# delete from luns where physical_volume_id = 'IqOdm6-BWuT-9YBW-uvM1-q41E-a3Cz-zPnWHq';
engine=# delete from lun_storage_server_connection_map where storage_server_connection='1b9f3167-3236-431e-93c2-ab5ee18eba04';
engine=# delete from lun_storage_server_connection_map where storage_server_connection='ea5971f8-e1a0-42e3-826d-b95e9031ce53';
engine=# delete from storage_server_connections where id='1b9f3167-3236-431e-93c2-ab5ee18eba04';
engine=# delete from storage_server_connections where id='ea5971f8-e1a0-42e3-826d-b95e9031ce53';
Delete the remaining disk(s); I had one virtual disk on this storage:
engine=# delete from base_disks where disk_id='03d651eb-14a9-4dca-8c87-605f101a5e0c';
engine=# delete from permissions where object_id='03d651eb-14a9-4dca-8c87-605f101a5e0c';
Then I started the engine and all is fine now.
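To script the same cleanup instead of typing it interactively, here is a rough sketch of mine using psycopg2. It covers only the domain/OVF/disk part of the session above (the LUN and connection deletions would follow the same pattern); the table names are copied verbatim from the queries, but the schema can differ between engine versions, so treat it as a template, run it with the engine stopped, and keep a DB backup first.

```python
def removal_statements(domain_id, ovf_disk_ids, disk_ids):
    """The DELETE statements from the session above, parameterised."""
    stmts = [
        ("delete from storage_domain_dynamic where id = %s", (domain_id,)),
        ("delete from storage_domain_static where id = %s", (domain_id,)),
    ]
    for disk in ovf_disk_ids + disk_ids:
        stmts.append(("delete from base_disks where disk_id = %s", (disk,)))
    stmts.append(("delete from storage_domains_ovf_info"
                  " where storage_domain_id = %s", (domain_id,)))
    stmts.append(("delete from storage_pool_iso_map"
                  " where storage_id = %s", (domain_id,)))
    return stmts

def remove_domain(conn, domain_id, ovf_disk_ids, disk_ids):
    """conn: a psycopg2 connection to the engine DB (engine stopped)."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.cursor()
        for sql, args in removal_statements(domain_id, ovf_disk_ids, disk_ids):
            cur.execute(sql, args)
```

Running everything in one transaction means a typo in any statement rolls the whole batch back, which the interactive session cannot guarantee.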
----- Original Message -----
> From: "Amit Aviram" <aaviram(a)redhat.com>
> To: "Olivier Navas" <olivier.navas(a)sdis33.fr>
> Sent: Tuesday, January 6, 2015 17:24:19
> Subject: Re: [ovirt-users] Can't remove a storage domain related to a broken hardware
>
> Hi, can you please add the engine log?
>
> ----- Original Message -----
> From: "Olivier Navas" <olivier.navas(a)sdis33.fr>
> To: users(a)ovirt.org
> Sent: Tuesday, January 6, 2015 11:28:42 AM
> Subject: [ovirt-users] Can't remove a storage domain related to a broken hardware
>
> Hello oVirt users!
>
> I experienced a hardware failure on an iSCSI storage making it unrecoverable,
> and I would like to remove it from the storage domains.
>
> There was one disk on this storage domain, and this disk isn't attached to any
> VM anymore, but I still can't detach this storage domain from the cluster.
>
> The storage domain is in "inactive" status, and if I try to "detach" it from
> the data center, oVirt tries to activate it. Obviously it can't activate it
> since the hardware is broken, and it fails after several minutes with the event
> "Failed to detach Storage Domain my_storage_domain to Data Center Default.
> (User: admin)". I can post my engine.log if useful.
>
> I need a way to force removal of this storage domain. Any trick would be
> greatly appreciated.
>
> Perhaps oVirt is missing some sort of "force detach, I know what I'm doing"
> button?
>
> I am running an oVirt 3.5 cluster (engine on CentOS 6.6, 4 nodes with
> CentOS 7) with FC and iSCSI storage domains.
>
> Thanks for your help.
>
> This email and all the files attached to it are confidential and intended
> exclusively for the person to whom they are addressed. If you have received
> this email in error, please return it to its sender and destroy it. Note
> that any electronic message may be altered in the course of its transit
> over the internet. Only official SDIS documents are binding on it. The
> ideas or opinions expressed in this email are those of its author and do
> not necessarily represent those of the SDIS de la Gironde.
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
9 years, 11 months
Re: [ovirt-users] VM failover with ovirt3.5
by cong yue
Here is the vdsm.log from just after I turned the host where the HE VM is
running to local maintenance. In the log there are parts like:
---
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,988::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,989::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
GuestMonitor-HostedEngine::DEBUG::2014-12-30
13:01:03,990::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
JsonRpc (StompReactor)::DEBUG::2014-12-30
13:01:04,675::stompReactor::98::Broker.StompAdapter::(handle_frame)
Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2014-12-30
13:01:04,676::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
Waiting for request
Thread-1806995::DEBUG::2014-12-30
13:01:04,677::stompReactor::163::yajsonrpc.StompServer::(send) Sending
response
JsonRpc (StompReactor)::DEBUG::2014-12-30
13:01:04,678::stompReactor::98::Broker.StompAdapter::(handle_frame)
Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2014-12-30
13:01:04,679::__init__::504::jsonrpc.JsonRpcServer::(serve_requests)
Waiting for request
Thread-1806996::DEBUG::2014-12-30
13:01:04,681::vm::486::vm.Vm::(_getUserCpuTuneInfo)
vmId=`0d3adb5c-0960-483c-9d73-5e256a519f2f`::Domain Metadata is not
set
---
Is there something wrong with this?
Thanks,
Cong
> From: Artyom Lukianov <alukiano(a)redhat.com>
> Date: December 29, 2014, 23:13:45 GMT-8
> To: "Yue, Cong" <Cong_Yue(a)alliedtelesis.com>
> Cc: Simone Tiraboschi <stirabos(a)redhat.com>, "users(a)ovirt.org"
> <users(a)ovirt.org>
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> The HE VM is migrated only by ovirt-ha-agent and not by the engine, but the
> FatalError is more interesting; can you provide the vdsm.log for this one, please?
>
> ----- Original Message -----
> From: "Cong Yue" <Cong_Yue(a)alliedtelesis.com>
> To: "Artyom Lukianov" <alukiano(a)redhat.com>
> Cc: "Simone Tiraboschi" <stirabos(a)redhat.com>, users(a)ovirt.org
> Sent: Monday, December 29, 2014 8:29:04 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> I disabled local maintenance mode for all hosts, and then only set the host
> where the HE VM is running to local maintenance mode. The logs are as follows.
> During the migration of the HE VM, a fatal error occurs. By the way, the HE
> VM also cannot do live migration, while other VMs can.
>
> ---
> [root@compute2-3 ~]# hosted-engine --set-maintenance --mode=local
> You have new mail in /var/spool/mail/root
> [root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-29
> 13:16:12,435::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.92 (id: 3, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:22,711::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:22,711::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.92 (id: 3, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:32,978::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:32,978::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:43,272::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:43,272::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:53,316::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-29
> 13:16:53,562::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-29
> 13:16:53,562::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:03,600::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:03,611::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877023.61 type=state_transition
> detail=EngineUp-LocalMaintenanceMigrateVm hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:03,672::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (EngineUp-LocalMaintenanceMigrateVm) sent? sent
> MainThread::INFO::2014-12-29
> 13:17:03,911::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Score is 0 due to local maintenance mode
> MainThread::INFO::2014-12-29
> 13:17:03,912::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenanceMigrateVm (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:03,912::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:03,960::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877023.96 type=state_transition
> detail=LocalMaintenanceMigrateVm-EngineMigratingAway
> hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:03,980::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (LocalMaintenanceMigrateVm-EngineMigratingAway) sent? sent
> MainThread::INFO::2014-12-29
> 13:17:04,218::states::66::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_penalize_memory)
> Penalizing score by 400 due to low free memory
> MainThread::INFO::2014-12-29
> 13:17:04,218::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineMigratingAway (score: 2000)
> MainThread::INFO::2014-12-29
> 13:17:04,219::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::ERROR::2014-12-29
> 13:17:14,251::hosted_engine::867::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitor_migration)
> Failed to migrate
> Traceback (most recent call last):
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 863, in _monitor_migration
> vm_id,
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py",
> line 85, in run_vds_client_cmd
> response['status']['message'])
> DetailedError: Error 12 from migrateStatus: Fatal error during migration
> MainThread::INFO::2014-12-29
> 13:17:14,262::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877034.26 type=state_transition
> detail=EngineMigratingAway-ReinitializeFSM hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:14,263::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (EngineMigratingAway-ReinitializeFSM) sent? ignored
> MainThread::INFO::2014-12-29
> 13:17:14,496::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state ReinitializeFSM (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:14,496::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:24,536::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:24,547::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419877044.55 type=state_transition
> detail=ReinitializeFSM-LocalMaintenance hostname='compute2-3'
> MainThread::INFO::2014-12-29
> 13:17:24,574::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (ReinitializeFSM-LocalMaintenance) sent? sent
> MainThread::INFO::2014-12-29
> 13:17:24,812::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:24,812::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:34,851::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:35,095::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:35,095::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-29
> 13:17:45,130::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-29
> 13:17:45,368::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-29
> 13:17:45,368::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> ^C
> [root@compute2-3 ~]#
>
>
> [root@compute2-3 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.94
> Host ID : 1
> Engine status : {"health": "good", "vm": "up",
> "detail": "up"}
> Score : 0
> Local maintenance : True
> Host timestamp : 1014956
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1014956 (Mon Dec 29 13:20:19 2014)
> host-id=1
> score=0
> maintenance=True
> state=LocalMaintenance
>
>
> --== Host 2 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.93
> Host ID : 2
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 866019
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=866019 (Mon Dec 29 10:19:45 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.92
> Host ID : 3
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 860493
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=860493 (Mon Dec 29 10:20:35 2014)
> host-id=3
> score=2400
> maintenance=False
> state=EngineDown
> [root@compute2-3 ~]#
> ---
> Thanks,
> Cong
>
>
>
> On 2014/12/29, at 8:43, "Artyom Lukianov"
> <alukiano(a)redhat.com> wrote:
>
> I see that the HE VM runs on the host with IP 10.0.0.94, and the two other
> hosts are in "Local Maintenance" state, so the VM will not migrate to either
> of them. Can you try disabling local maintenance on all hosts in the HE
> environment, then enabling "local maintenance" on the host where the HE VM
> runs, and also provide the output of hosted-engine --vm-status?
> Failover works as follows:
> 1) If the host running the HE VM has a score more than 800 points below some
> other host in the HE environment, the HE VM will migrate to the host with
> the best score.
> 2) If something happens to the VM (kernel panic, a service crash, ...), the
> agent will restart the HE VM on another host in the HE environment with a
> positive score.
> 3) If you put the host running the HE VM into local maintenance, the VM will
> migrate to another host with a positive score.
> Thanks.
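[Editor's note: the three rules above can be sketched in Python. This is a simplified, hypothetical model of the agent's policy, not the actual ovirt-hosted-engine-ha code; the constants 2400, 400, and 800 come from this thread, and the function names are invented for illustration.]

```python
# Simplified sketch of the hosted-engine HA migration policy described above.
# Hypothetical re-implementation for illustration only.

BASE_SCORE = 2400
LOW_MEM_PENALTY = 400       # seen in the log: "Penalizing score by 400"
MIGRATION_THRESHOLD = 800   # rule 1: migrate only if another host leads by > 800

def effective_score(base, local_maintenance, low_free_memory):
    """A host in local maintenance scores 0; otherwise penalties apply."""
    if local_maintenance:
        return 0
    score = base
    if low_free_memory:
        score -= LOW_MEM_PENALTY
    return max(score, 0)

def pick_migration_target(current_host, hosts):
    """Return the host the engine VM should migrate to, or None to stay put."""
    best = max((h for h in hosts if h != current_host), key=lambda h: h["score"])
    if current_host["maintenance"]:
        # rule 3: local maintenance forces migration to any positive-score host
        return best if best["score"] > 0 else None
    # rule 1: otherwise require a lead of more than MIGRATION_THRESHOLD points
    if best["score"] - current_host["score"] > MIGRATION_THRESHOLD:
        return best
    return None
```

With this model, a host penalized to 2000 does not trigger migration against a 2400-score peer (the 400-point gap is below the 800-point threshold), which matches the EngineMigratingAway/ReinitializeFSM behavior in the log above; only maintenance (score 0) forces the move.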
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue(a)alliedtelesis.com>
> To: "Artyom Lukianov" <alukiano(a)redhat.com<mailto:alukiano@redhat.com>>
> Cc: "Simone Tiraboschi" <stirabos(a)redhat.com<mailto:stirabos@redhat.com>>,
> users(a)ovirt.org<mailto:users@ovirt.org>
> Sent: Monday, December 29, 2014 6:30:42 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> Thanks and the --vm-status log is as follows:
> [root@compute2-2 ~]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.94
> Host ID : 1
> Engine status : {"health": "good", "vm": "up",
> "detail": "up"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 1008087
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1008087 (Mon Dec 29 11:25:51 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.93
> Host ID : 2
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 0
> Local maintenance : True
> Host timestamp : 859142
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=859142 (Mon Dec 29 08:25:08 2014)
> host-id=2
> score=0
> maintenance=True
> state=LocalMaintenance
>
>
> --== Host 3 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.92
> Host ID : 3
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 0
> Local maintenance : True
> Host timestamp : 853615
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=853615 (Mon Dec 29 08:25:57 2014)
> host-id=3
> score=0
> maintenance=True
> state=LocalMaintenance
> You have new mail in /var/spool/mail/root
> [root@compute2-2 ~]#
>
> Could you please explain how VM failover works inside oVirt? Is there any
> other debug option I can enable to investigate the problem?
>
> Thanks,
> Cong
>
>
> On 2014/12/29, at 1:39, "Artyom Lukianov"
> <alukiano(a)redhat.com>
> wrote:
>
> Can you also provide output of hosted-engine --vm-status please, previous
> time it was useful, because I do not see something unusual.
> Thanks
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue(a)alliedtelesis.com>
> To: "Artyom Lukianov"
> <alukiano(a)redhat.com>
> Cc: "Simone Tiraboschi"
> <stirabos(a)redhat.com>,
> users(a)ovirt.org
> Sent: Monday, December 29, 2014 7:15:24 AM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> I also changed the maintenance mode to local on another host, but the VM on
> that host still cannot be migrated. The logs are as follows.
>
> [root@compute2-2 ~]# hosted-engine --set-maintenance --mode=local
> [root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-28
> 21:09:04,184::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:14,603::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:14,603::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:24,903::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:24,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:35,026::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm is running on host 10.0.0.94 (id 1)
> MainThread::INFO::2014-12-28
> 21:09:35,236::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:35,236::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:45,604::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:45,604::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 21:09:55,691::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 21:09:55,701::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Trying: notify time=1419829795.7 type=state_transition
> detail=EngineDown-LocalMaintenance hostname='compute2-2'
> MainThread::INFO::2014-12-28
> 21:09:55,761::brokerlink::120::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition
> (EngineDown-LocalMaintenance) sent? sent
> MainThread::INFO::2014-12-28
> 21:09:55,990::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Score is 0 due to local maintenance mode
> MainThread::INFO::2014-12-28
> 21:09:55,990::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 21:09:55,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> ^C
> You have new mail in /var/spool/mail/root
> [root@compute2-2 ~]# ps -ef | grep qemu
> root 18420 2777 0 21:10 pts/0
> 00:00:00 grep --color=auto qemu
> qemu 29809 1 0 Dec19 ? 01:17:20 /usr/libexec/qemu-kvm
> -name testvm2-2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem
> -m 500 -realtime mlock=off -smp
> 1,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
> c31e97d0-135e-42da-9954-162b5228dce3 -smbios
> type=1,manufacturer=oVirt,product=oVirt
> Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0059-3610-8033-B4C04F395931,uuid=c31e97d0-135e-42da-9954-162b5228dce3
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2-2.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=2014-12-19T20:17:17,driftfix=slew
> -no-kvm-pit-reinjection
> -no-hpet -no-shutdown -boot strict=on -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
> -drive
> file=/rhev/data-center/00000002-0002-0002-0002-0000000001e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/5cbeb8c9-4f04-48d0-a5eb-78c49187c550/a0570e8c-9867-4ec4-818f-11e102fc4f9b,if=none,id=drive-virtio-disk0,format=qcow2,serial=5cbeb8c9-4f04-48d0-a5eb-78c49187c550,cache=none,werror=stop,rerror=stop,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:db:94:00,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/c31e97d0-135e-42da-9954-162b5228dce3.com.redhat.rhevm.vdsm,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> -chardev
> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/c31e97d0-135e-42da-9954-162b5228dce3.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel2,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
> -spice
> tls-port=5901,addr=10.0.0.93,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on
> -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global
> qxl-vga.vram_size=33554432 -incoming tcp:[::]:49152 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
> [root@compute2-2 ~]#
>
> Thanks,
> Cong
>
>
> On 2014/12/28, at 20:53, "Yue, Cong"
> <Cong_Yue(a)alliedtelesis.com>
> wrote:
>
> I checked it again and confirmed there is one guest VM running on top of
> this host. The log is as follows:
>
> [root@compute2-1 vdsm]# ps -ef | grep qemu
> qemu 2983 846 0 Dec19 ? 00:00:00
> [supervdsmServer] <defunct>
> root 5489 3053 0 20:49 pts/0
> 00:00:00 grep --color=auto qemu
> qemu 26128 1 0 Dec19 ? 01:09:19 /usr/libexec/qemu-kvm
> -name testvm2 -S -machine rhel6.5.0,accel=kvm,usb=off -cpu Nehalem -m
> 500 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1
> -uuid e46bca87-4df5-4287-844b-90a26fccef33 -smbios
> type=1,manufacturer=oVirt,product=oVirt
> Node,version=7-0.1406.el7.centos.2.5,serial=4C4C4544-0030-3310-8059-B8C04F585231,uuid=e46bca87-4df5-4287-844b-90a26fccef33
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/testvm2.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=2014-12-19T20:18:01,driftfix=slew
> -no-kvm-pit-reinjection
> -no-hpet -no-shutdown -boot strict=on -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5
> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial=
> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
> -drive
> file=/rhev/data-center/00000002-0002-0002-0002-0000000001e4/1dc71096-27c4-4256-b2ac-bd7265525c69/images/b4b5426b-95e3-41af-b286-da245891cdaf/0f688d49-97e3-4f1d-84d4-ac1432d903b3,if=none,id=drive-virtio-disk0,format=qcow2,serial=b4b5426b-95e3-41af-b286-da245891cdaf,cache=none,werror=stop,rerror=stop,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:db:94:01,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/e46bca87-4df5-4287-844b-90a26fccef33.com.redhat.rhevm.vdsm,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> -chardev
> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/e46bca87-4df5-4287-844b-90a26fccef33.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel2,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
> -spice
> tls-port=5900,addr=10.0.0.92,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on
> -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global
> qxl-vga.vram_size=33554432 -incoming tcp:[::]:49152 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7
> [root@compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-28
> 20:49:27,315::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 20:49:27,646::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 20:49:27,646::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 20:49:37,732::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 20:49:37,961::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 20:49:37,961::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-28
> 20:49:48,048::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-28
> 20:49:48,319::states::208::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Score is 0 due to local maintenance mode
> MainThread::INFO::2014-12-28
> 20:49:48,319::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-28
> 20:49:48,319::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
> Thanks,
> Cong
>
>
> On 2014/12/28, at 3:46, "Artyom Lukianov"
> <alukiano(a)redhat.com>
> wrote:
>
> I see that you set local maintenance on host 3, which does not have the
> engine VM on it, so there is nothing to migrate from that host.
> If you set local maintenance on host 1, the VM should migrate to another
> host with a positive score.
> Thanks
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue(a)alliedtelesis.com>
> To: "Simone Tiraboschi"
> <stirabos(a)redhat.com>
> Cc:
> users(a)ovirt.org
> Sent: Saturday, December 27, 2014 6:58:32 PM
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> Hi
>
> I had a try with "hosted-engine --set-maintence --mode=local" on
> compute2-1, which is host 3 in my cluster. From the log, it shows
> maintence mode is dectected, but migration does not happen.
>
> The logs are as follows. Is there any other config I need to check?
>
> [root@compute2-1 vdsm]# hosted-engine --vm-status
>
>
> --== Host 1 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.94
> Host ID : 1
> Engine status : {"health": "good", "vm": "up",
> "detail": "up"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 836296
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=836296 (Sat Dec 27 11:42:39 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.93
> Host ID : 2
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 687358
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=687358 (Sat Dec 27 08:42:04 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.92
> Host ID : 3
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 0
> Local maintenance : True
> Host timestamp : 681827
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=681827 (Sat Dec 27 08:42:40 2014)
> host-id=3
> score=0
> maintenance=True
> state=LocalMaintenance
> [root@compute2-1 vdsm]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-27
> 08:42:41,109::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:42:51,198::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-27
> 08:42:51,420::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-27
> 08:42:51,420::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:43:01,507::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-27
> 08:43:01,773::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-27
> 08:43:01,773::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:43:11,859::state_decorators::124::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
> Local maintenance detected
> MainThread::INFO::2014-12-27
> 08:43:12,072::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state LocalMaintenance (score: 0)
> MainThread::INFO::2014-12-27
> 08:43:12,072::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
>
>
> [root@compute2-3 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-27
> 11:36:28,855::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:39,130::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:39,130::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:49,449::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:49,449::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:59,739::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:36:59,739::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:09,779::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-27
> 11:37:10,026::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:10,026::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:20,331::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-27
> 11:37:20,331::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
>
>
> [root@compute2-2 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2014-12-27
> 08:36:12,462::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:22,797::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:22,798::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:32,876::states::437::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm is running on host 10.0.0.94 (id 1)
> MainThread::INFO::2014-12-27
> 08:36:33,169::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:33,169::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:43,567::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:43,567::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:53,858::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:36:53,858::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Global metadata: {'maintenance': False}
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.94 (id 1): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=835987
> (Sat Dec 27 11:37:30
> 2014)\nhost-id=1\nscore=2400\nmaintenance=False\nstate=EngineUp\n',
> 'hostname': '10.0.0.94', 'alive': True, 'host-id': 1, 'engine-status':
> {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'score': 2400,
> 'maintenance': False, 'host-ts': 835987}
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.92 (id 3): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=681528
> (Sat Dec 27 08:37:41
> 2014)\nhost-id=3\nscore=0\nmaintenance=True\nstate=LocalMaintenance\n',
> 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status':
> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
> 'down', 'detail': 'unknown'}, 'score': 0, 'maintenance': True,
> 'host-ts': 681528}
> MainThread::INFO::2014-12-27
> 08:37:04,028::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Local (id 2): {'engine-health': {'reason': 'vm not running on this
> host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'bridge':
> True, 'mem-free': 15300.0, 'maintenance': False, 'cpu-load': 0.0215,
> 'gateway': True}
> MainThread::INFO::2014-12-27
> 08:37:04,265::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-27
> 08:37:04,265::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
> Thanks,
> Cong
>
> On 2014/12/22, at 5:29, "Simone Tiraboschi"
> <stirabos(a)redhat.com>
> wrote:
>
>
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue(a)alliedtelesis.com>
> To: "Simone Tiraboschi"
> <stirabos(a)redhat.com>
> Cc:
> users(a)ovirt.org
> Sent: Friday, December 19, 2014 7:22:10 PM
> Subject: RE: [ovirt-users] VM failover with ovirt3.5
>
> Thanks for the information. This is the log for my three ovirt nodes.
> The output of hosted-engine --vm-status shows that the engine state for
> my 2nd and 3rd oVirt nodes is DOWN.
> Is this the reason why VM failover does not work in my environment?
>
> No, they look OK: you can only run the engine VM on a single host at a time.
>
> How can I make the engine work on my 2nd and 3rd oVirt nodes as well?
>
> If you put the host 1 in local maintenance mode ( hosted-engine
> --set-maintenance --mode=local ) the VM should migrate to host 2; if you
> reactivate host 1 ( hosted-engine --set-maintenance --mode=none ) and put
> host 2 in local maintenance mode the VM should migrate again.
>
> Can you please try that and post the logs if something is going bad?
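Simone's migration smoke test can be sketched as a command sequence. This is a hedged sketch assuming a standard hosted-engine setup; run each command on the host indicated in the comment, not as a single script on one machine:

```shell
# On host 1: enter local maintenance; the engine VM should migrate to host 2.
hosted-engine --set-maintenance --mode=local

# From any host: watch the engine VM state until it reports "up" elsewhere.
hosted-engine --vm-status

# On host 1: leave maintenance so the host can accept the VM back later.
hosted-engine --set-maintenance --mode=none

# On host 2: enter local maintenance; the VM should migrate again.
hosted-engine --set-maintenance --mode=local
```

If a migration stalls at any step, the agent logs under /var/log/ovirt-hosted-engine-ha/ on the source and destination hosts are the first place to look.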
>
>
> --
> --== Host 1 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.94
> Host ID : 1
> Engine status : {"health": "good", "vm": "up",
> "detail": "up"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 150475
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=150475 (Fri Dec 19 13:12:18 2014)
> host-id=1
> score=2400
> maintenance=False
> state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date : True
> Hostname : 10.0.0.93
> Host ID : 2
> Engine status : {"reason": "vm not running on
> this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 2400
> Local maintenance : False
> Host timestamp : 1572
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=1572 (Fri Dec 19 10:12:18 2014)
> host-id=2
> score=2400
> maintenance=False
> state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date : False
> Hostname : 10.0.0.92
> Host ID : 3
> Engine status : unknown stale-data
> Score : 2400
> Local maintenance : False
> Host timestamp : 987
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=987 (Fri Dec 19 10:09:58 2014)
> host-id=3
> score=2400
> maintenance=False
> state=EngineDown
>
> --
> And the /var/log/ovirt-hosted-engine-ha/agent.log for the three oVirt nodes
> is as follows:
> --
> 10.0.0.94(hosted-engine-1)
> ---
> MainThread::INFO::2014-12-19
> 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-19
> 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Global metadata: {'maintenance': False}
> MainThread::INFO::2014-12-19
> 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.93 (id 2): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448
> (Fri Dec 19 10:10:14
> 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
> 'hostname': '10.0.0.93', 'alive': True, 'host-id': 2, 'engine-status':
> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
> 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
> 'host-ts': 1448}
> MainThread::INFO::2014-12-19
> 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Host 10.0.0.92 (id 3): {'extra':
> 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=987
> (Fri Dec 19 10:09:58
> 2014)\nhost-id=3\nscore=2400\nmaintenance=False\nstate=EngineDown\n',
> 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status':
> {'reason': 'vm not running on this host', 'health': 'bad', 'vm':
> 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False,
> 'host-ts': 987}
> MainThread::INFO::2014-12-19
> 13:10:14,658::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh)
> Local (id 1): {'engine-health': {'health': 'good', 'vm': 'up',
> 'detail': 'up'}, 'bridge': True, 'mem-free': 1079.0, 'maintenance':
> False, 'cpu-load': 0.0269, 'gateway': True}
> MainThread::INFO::2014-12-19
> 13:10:14,904::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:14,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:25,210::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:25,210::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:35,499::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:35,499::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:45,784::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:45,785::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:56,070::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:10:56,070::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:06,109::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Engine vm running on localhost
> MainThread::INFO::2014-12-19
> 13:11:06,359::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:06,359::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:16,658::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:16,658::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:26,991::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:26,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:37,341::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19
> 13:11:37,341::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.93 (id: 2, score: 2400)
> ----
>
> 10.0.0.93 (hosted-engine-2)
> MainThread::INFO::2014-12-19
> 10:12:18,339::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:18,339::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:28,651::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:28,652::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:39,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:39,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:49,338::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:49,338::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:59,642::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:12:59,642::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19
> 10:13:10,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19
> 10:13:10,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
> Best remote host 10.0.0.94 (id: 1, score: 2400)
>
>
> 10.0.0.92(hosted-engine-3)
> same as 10.0.0.93
> --
>
> -----Original Message-----
> From: Simone Tiraboschi [mailto:stirabos@redhat.com]
> Sent: Friday, December 19, 2014 12:28 AM
> To: Yue, Cong
> Cc:
> users(a)ovirt.org
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
>
>
> ----- Original Message -----
> From: "Cong Yue"
> <Cong_Yue(a)alliedtelesis.com>
> To:
> users(a)ovirt.org
> Sent: Friday, December 19, 2014 2:14:33 AM
> Subject: [ovirt-users] VM failover with ovirt3.5
>
>
>
> Hi
>
>
>
> In my environment, I have 3 oVirt nodes in one cluster, and on top of
> host-1 there is one VM hosting the oVirt engine.
>
> I also have one external storage server that the cluster uses as the
> data domain for the engine and for data.
>
> I confirmed that live migration works well in my environment.
>
> But VM failover seems very buggy when I force one oVirt node to shut
> down. Sometimes the VM on the node that was shut down can migrate to
> another host, but it takes several minutes or more.
>
> Sometimes it cannot migrate at all. Sometimes the VM only begins to
> move once the host is back.
>
> Can you please check or share the logs under
> /var/log/ovirt-hosted-engine-ha/
> ?
>
> Is there some documentation to explain how VM failover is working? And
> is there some bugs reported related with this?
>
> http://www.ovirt.org/Features/Self_Hosted_Engine#Agent_State_Diagram
>
> Thanks in advance,
>
> Cong
>
>
>
>
> This e-mail message is for the sole use of the intended recipient(s)
> and may contain confidential and privileged information. Any
> unauthorized review, use, disclosure or distribution is prohibited. If
> you are not the intended recipient, please contact the sender by reply
> e-mail and destroy all copies of the original message. If you are the
> intended recipient, please be advised that the content of this message
> is subject to access, review and disclosure by the sender's e-mail System
> Administrator.
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
9 years, 11 months
Recovering from an aborted hosted-engine --deploy
by Michael Schefczyk
Dear All,
after a failing hosted-engine --deploy, I am trying to recover the system based on the following description:
http://lists.ovirt.org/pipermail/users/2014-May/024423.html
Whatever I do, however, I receive the following message during the next hosted-engine --deploy:
[ ERROR ] Failed to execute stage 'Environment setup': Failed to reconfigure libvirt for VDSM
Is there a way to initiate such a reconfiguration without completely reinstalling the server from scratch?
Thank you very much for any efforts,
Michael
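One avenue worth trying before a reinstall — a hedged suggestion, not a confirmed fix: vdsm-tool can redo the configuration step that hosted-engine --deploy failed on. Verify that the `configure` sub-command exists on your VDSM version first:

```shell
# Stop VDSM so its configuration can be rewritten safely.
systemctl stop vdsmd          # on EL6: service vdsmd stop

# Rewrite the libvirt (and related) configuration for VDSM, overwriting
# whatever the aborted deploy left behind.
vdsm-tool configure --force

# Bring VDSM back up, then retry hosted-engine --deploy.
systemctl start vdsmd         # on EL6: service vdsmd start
```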
9 years, 11 months
Problems starting VM
by Jeremy Utley
Hello everyone!
We have been working on our testing oVirt cluster again today, for the
first time in a few weeks, and all of a sudden a new problem has cropped
up. VMs that I created weeks ago and had working properly no longer
start. When we try to start one of them, we get this error in the engine
console:
VM CentOS1 is down with error. Exit message: Bad volume specification
{'index': 0, 'iface': 'virtio', 'type': 'disk', 'format': 'raw',
'bootOrder': '1', 'volumeID': 'a737621e-6e66-4cd9-9014-67f7aaa184fb',
'apparentsize': '53687091200', 'imageID':
'702440a9-cd53-4300-8369-28123e8a095e', 'specParams': {}, 'readonly':
'false', 'domainID': 'fa2f828c-f98a-4a17-99fb-1ec1f46d018c', 'reqsize':
'0', 'deviceId': '702440a9-cd53-4300-8369-28123e8a095e', 'truesize':
'53687091200', 'poolID': 'a0781e2b-6242-4043-86c2-cd6694688ed2', 'device':
'disk', 'shared': 'false', 'propagateErrors': 'off', 'optional': 'false'}.
Looking at the VDSM log files, I think I've found what's actually
triggering this, but I honestly do not know how to decipher it. Here's
the message:
Thread-418::ERROR::2015-01-09
15:59:57,874::task::863::Storage.TaskManager.Task::(_setError)
Task=`11a740b7-4391-47ab-8575-919bd1e0c3fb`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 870, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3242, in prepareImage
    leafInfo = dom.produceVolume(imgUUID, leafUUID).getVmVolumeInfo()
  File "/usr/share/vdsm/storage/glusterVolume.py", line 35, in getVmVolumeInfo
    volTrans = VOLUME_TRANS_MAP[volInfo[volname]['transportType'][0]]
KeyError: u'_gf-os'
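The last line of that traceback is the clue: the failing lookup indexes the gluster volume-info dict with the name derived from the mount point (`_gf-os`, with slashes mangled to underscores), while gluster keys the dict by the bare volume name (`gf-os`). A minimal standalone reproduction of that mismatch — the dicts below are hypothetical stand-ins, not the real vdsm structures:

```python
# Maps gluster's reported transport type to vdsm's name for it
# (hypothetical contents for illustration).
VOLUME_TRANS_MAP = {'TCP': 'tcp', 'RDMA': 'rdma'}

def lookup_transport(vol_info, volname):
    """Mirror of the line that fails in glusterVolume.py."""
    return VOLUME_TRANS_MAP[vol_info[volname]['transportType'][0]]

# gluster keys its volume info by the bare volume name...
vol_info = {'gf-os': {'transportType': ['TCP']}}

# ...but the key derived from the mount path 'gf-os01-ib:_gf-os' is '_gf-os':
try:
    lookup_transport(vol_info, '_gf-os')
except KeyError as exc:
    print('KeyError:', exc)                  # same failure as the traceback

print(lookup_transport(vol_info, 'gf-os'))   # the intended lookup works
```

If this reading is right, the fix is on the vdsm/name-derivation side, not in the gluster volume itself.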
This is oVirt 3.5, with a 2-node Gluster volume as the storage domain (no
oVirt components running there) and 5 virtualization nodes, all machines
running CentOS 6.6. We also have the patched RPMs that *should* enable
libgfapi access to gluster, but I can't confirm those are working
properly. The gluster filesystem is mounted on the virtualization node:
gf-os01-ib:/gf-os on /rhev/data-center/mnt/glusterSD/gf-os01-ib:_gf-os type
fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
Anyone got any ideas? More logs available upon request.
9 years, 11 months