uncaught exception with engine vm in 3.6.1 selecting hosted_storage
by Gianluca Cecchi
Hello, after updating my HE environment from 3.6.0 to 3.6.1, I'm not able to
import/activate the hosted_storage domain because it shows as unattached.
Moreover, I get the popup message
Uncaught exception occurred. Please try reloading the page. Details:
(TypeError) __gwt$exception: <skipped>: c is null
when I select the hosted_storage domain line in the Storage tab. See:
https://drive.google.com/file/d/0BwoPbcrMv8mvVlc1XzNUZ18yWWs/view?usp=sha...
I apparently don't find any ERROR messages when I do this, so I don't know
where to look.
On the engine, where the DB lives, I have this under /var/lib/pgsql/data/pg_log:
[root@ractorshe pg_log]# ls -lrt
total 12
-rw-------. 1 postgres postgres 0 Dec 16 01:00 postgresql-Wed.log
-rw-------. 1 postgres postgres 0 Dec 17 01:00 postgresql-Thu.log
-rw-------. 1 postgres postgres 0 Dec 18 01:00 postgresql-Fri.log
-rw-------. 1 postgres postgres 7480 Dec 20 00:39 postgresql-Sat.log
-rw-------. 1 postgres postgres 3488 Dec 20 01:29 postgresql-Sun.log
-rw-------. 1 postgres postgres 0 Dec 21 01:00 postgresql-Mon.log
-rw-------. 1 postgres postgres 0 Dec 22 01:00 postgresql-Tue.log
In Sunday's log, when the engine was started, I see
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 1/254C6710
LOG: record with zero length at 1/254EB6B8
LOG: redo done at 1/254EB688
LOG: last completed transaction was at log time 2015-12-19
23:49:47.920147+00
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
ERROR: insert or update on table "storage_domain_dynamic" violates foreign
key constraint "fk_stora
ge_domain_dynamic_storage_domain_static"
DETAIL: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
CONTEXT: SQL statement "INSERT INTO
storage_domain_dynamic(available_disk_size, id, used_disk_size)
VALUES(v_available_disk_size, v_id, v_used_disk_size)"
PL/pgSQL function
insertstorage_domain_dynamic(integer,uuid,integer) line 3 at SQL statement
STATEMENT: select * from insertstorage_domain_dynamic($1, $2, $3) as result
and then many lines of the kind:
LOG: autovacuum: found orphan temp table "pg_temp_8"."tt_temp22" in
database "engine"
so there are messages related to storage domains, and these could be the reason
for the exception and for the inability to import/attach my HE storage domain.
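If it helps to narrow this down: the failing key is the all-zero "nil" UUID, so the engine appears to be inserting dynamic stats for a storage domain id it never resolved. Below is a minimal sketch (plain Python, not an oVirt tool) of scanning a pg_log excerpt for such foreign-key violations; the sample text is taken from the log quoted above:

```python
# Sketch: scan a PostgreSQL log excerpt for FK violations and pull out
# the constraint name and the offending key, to correlate with the UI error.
import re

def find_fk_violations(log_text):
    """Return (constraint, key) pairs for each FK violation in the log."""
    pattern = re.compile(
        r'violates foreign\s+key constraint "([^"]+)".*?'
        r'Key \(id\)=\(([0-9a-f-]+)\)',
        re.DOTALL)
    return pattern.findall(log_text)

# Excerpt quoted from pg_log above (joined onto single lines here).
sample = '''ERROR: insert or update on table "storage_domain_dynamic" violates foreign key constraint "fk_storage_domain_dynamic_storage_domain_static"
DETAIL: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in table "storage_domain_static".'''

for constraint, key in find_fk_violations(sample):
    print(constraint, key)
```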
Let me know if I can provide more logs or output of queries from the
database.
Gianluca
How to run "engine-backup"?
by Will Dennis
Yay, I *finally* have my 3-host hyper-converged oVirt datacenter stood up :)
[root@ovirt-node-01 ~]# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : True
Hostname : ovirt-node-01
Host ID : 1
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 65c41ca5
Host timestamp : 217522
--== Host 2 status ==--
Status up-to-date : True
Hostname : ovirt-node-02
Host ID : 2
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : a7a599d8
Host timestamp : 56101
--== Host 3 status ==--
Status up-to-date : True
Hostname : ovirt-node-03
Host ID : 3
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 6e138d0b
Host timestamp : 432658
Now in the oVirt webadmin UI, down in the "Alerts" section, I am seeing this
message:
"There is no full backup available, please run engine-backup to prevent data
loss in case of corruption."
I do not see an "engine-backup" CLI command on my hosts; how does one do
this? (I have searched ovirt.org to no avail...)
Thanks,
Will
mount a usb
by alireza sadeh seighalan
hi everyone
I want to mount a USB device to a Windows VM in oVirt 3.6.1. How can I do it?
Thanks in advance.
virtual Disk
by Taste-Of-IT
Hello,
I am testing oVirt 3.6 as a Self-Hosted Engine and created a virtual machine.
Now I want to change the size of its disk, and I found both a field to change
the disk size and a field to grow the disk. The oVirt manual describes
changing the value of the grow field. My question is: what is the difference,
and what does each do? E.g., what happens if I only change the disk size from
8 to 10? Is it the same as changing the grow size from 0 to 2?
Thanks for a technical explanation.
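My understanding (an assumption worth checking against the oVirt documentation) is that the "extend size by" field adds to the current provisioned size, so the two operations in the question should end in the same place. The arithmetic as a tiny sketch:

```python
# Sketch of the extend semantics as I understand them (assumption, not
# confirmed from oVirt source): the grow field adds to the current size,
# so 8 GB extended by 2 GB equals setting the size to 10 GB directly.
def extend_disk(current_gb, extend_by_gb):
    if extend_by_gb < 0:
        raise ValueError("disks can only grow, never shrink")
    return current_gb + extend_by_gb

print(extend_disk(8, 2))
```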
After update to 3.6.1 profile internal does not exist message
by Gianluca Cecchi
Hello.
I updated my test self-hosted engine VM to 3.6.1 and to CentOS 7.2.
Now it seems I'm not able to log in to the webadmin, due to this in engine.log
(the profile field is empty in the web admin GUI):
2015-12-19 13:59:05,182 ERROR
[org.ovirt.engine.core.bll.aaa.LoginBaseCommand] (default task-17) []
Can't login because authentication profile 'internal' doesn't exist.
Is it an already known problem?
The first ERROR message I find in it is dated 13:34 and it is an SQL one:
2015-12-19 13:34:39,882 ERROR [org.ovirt.engine.core.bll.Backend]
(ServerService Thread Pool -- 42) [] Failed to run compensation on startup
for Command
'org.ovirt.engine.core.bll.storage.AddExistingFileStorageDomainCommand',
Command Id 'bb7ee4a3-6b35-4c62-9823-a434a40e5b38':
CallableStatementCallback; SQL [{call insertstorage_domain_dynamic(?, ?,
?)}]; ERROR: insert or update on table "storage_domain_dynamic" violates
foreign key constraint "fk_storage_domain_dynamic_storage_domain_static"
Detail: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
Where: SQL statement "INSERT INTO
storage_domain_dynamic(available_disk_size, id, used_disk_size)
VALUES(v_available_disk_size, v_id, v_used_disk_size)"
PL/pgSQL function insertstorage_domain_dynamic(integer,uuid,integer) line 3
at SQL statement; nested exception is org.postgresql.util.PSQLException:
ERROR: insert or update on table "storage_domain_dynamic" violates foreign
key constraint "fk_storage_domain_dynamic_storage_domain_static"
Detail: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
Where: SQL statement "INSERT INTO
storage_domain_dynamic(available_disk_size, id, used_disk_size)
VALUES(v_available_disk_size, v_id, v_used_disk_size)"
PL/pgSQL function insertstorage_domain_dynamic(integer,uuid,integer) line 3
at SQL statement
2015-12-19 13:34:39,882 ERROR [org.ovirt.engine.core.bll.Backend]
(ServerService Thread Pool -- 42) [] Exception:
org.springframework.dao.DataIntegrityViolationException:
CallableStatementCallback; SQL [{call insertstorage_domain_dynamic(?, ?,
?)}]; ERROR: insert or update on table "storage_domain_dynamic" violates
foreign key constraint "fk_storage_domain_dynamic_storage_domain_static"
Detail: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
I think it is not related to the update from CentOS 7.1 to 7.2 (which also
updates PostgreSQL), because that came about 10 minutes later:
from the installed version
Nov 04 12:55:30 Installed: postgresql-server-9.2.13-1.el7_1.x86_64
to the 7.2 one
Dec 19 13:45:51 Updated: postgresql-server-9.2.14-1.el7_1.x86_64
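A quick sanity check of that timing, using the timestamps quoted above: the PostgreSQL update landed about eleven minutes after the first SQL error, so the error clearly predates the package update.

```python
# Compute the gap between the first SQL error (13:34:39) and the
# postgresql-server rpm update (13:45:51), both quoted above.
from datetime import datetime

first_error = datetime(2015, 12, 19, 13, 34, 39)
pg_update   = datetime(2015, 12, 19, 13, 45, 51)
gap = pg_update - first_error
print(gap)  # the update came after the error, not before
```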
BTW: my order was
yum update "ovirt-engine-setup-*"
engine-setup
the engine-setup completed successfully with
...
[ INFO ] Stage: Misc configuration
[ INFO ] Backing up database localhost:engine to
'/var/lib/ovirt-engine/backups/engine-20151219132925.5OoFQA.dump'.
[ INFO ] Creating/refreshing Engine database schema
[ INFO ] Creating/refreshing Engine 'internal' domain database schema
[ INFO ] Upgrading CA
[ INFO ] Configuring WebSocket Proxy
[ INFO ] Generating post install configuration file
'/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf'
[ INFO ] Stage: Transaction commit
[ INFO ] Stage: Closing up
--== SUMMARY ==--
SSH fingerprint: 19:56:8d:3e:50:fc:90:37:5a:ba:6c:57:30:b1:7d:93
Internal CA
DA:E6:04:34:99:A0:DB:CE:3F:0A:7B:A2:96:67:4C:7F:19:CA:95:5F
Note! If you want to gather statistical information you can
install Reports and/or DWH:
http://www.ovirt.org/Ovirt_DWH
http://www.ovirt.org/Ovirt_Reports
Web access is enabled at:
http://ractorshe.mydomain:80/ovirt-engine
https://ractorshe.mydomain:443/ovirt-engine
--== END OF SUMMARY ==--
[ INFO ] Starting engine service
[ INFO ] Restarting httpd
[ INFO ] Restarting ovirt-vmconsole proxy service
[ INFO ] Stage: Clean up
Log file is located at
/var/log/ovirt-engine/setup/ovirt-engine-setup-20151219132415-2pfee1.log
[ INFO ] Generating answer file
'/var/lib/ovirt-engine/setup/answers/20151219133045-setup.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ INFO ] Execution of setup completed successfully
Strangely, the VM was shut down...
In /var/log/libvirt/qemu/HostedEngine.log I saw (in UTC)
2015-12-19 12:31:19.494+0000: shutting down
So I exited from global maintenance, the VM was started, and I verified that
the admin web GUI was OK with the new version.
Then I put it into maintenance again and ran, on the engine VM:
systemctl stop ovirt-engine
yum update (I forgot to also shut down PostgreSQL first...)
shutdown -r now
As it was in maintenance, the VM wasn't started again, so I exited from
maintenance and the VM was started; but now I'm not able to see the internal
profile...
Let me know if you need further info.
Gianluca
Re: [ovirt-users] Cannot retrieve answer file from 1st HE host when setting up 2nd host
by Simone Tiraboschi
On Tue, Dec 22, 2015 at 3:06 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
> See attached for requested logs
>
Thanks, the issue is here:
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22
00:40:53.496109] C [MSGID: 106002]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume engine. Stopping local bricks.
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22
00:40:53.496410] C [MSGID: 106002]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume vmdata. Stopping local bricks.
So at that point gluster lost its quorum and the file system became read-only.
On getStorageDomainsList, VDSM internally raises an exception because the
file system is read-only:
Thread-141::DEBUG::2015-12-21
11:29:59,666::fileSD::157::Storage.StorageDomainManifest::(__init__)
Reading domain in path
/rhev/data-center/mnt/glusterSD/localhost:_engine/e89b6e64-bd7d-4846-b970-9af32a3295ee
Thread-141::DEBUG::2015-12-21
11:29:59,666::__init__::320::IOProcessClient::(_run) Starting IOProcess...
Thread-141::DEBUG::2015-12-21
11:29:59,680::persistentDict::192::Storage.PersistentDict::(__init__)
Created a persistent dict with FileMetadataRW backend
Thread-141::ERROR::2015-12-21
11:29:59,686::hsm::2898::Storage.HSM::(getStorageDomainsList) Unexpected
error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/hsm.py", line 2882, in getStorageDomainsList
dom = sdCache.produce(sdUUID=sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 100, in produce
domain.getRealDomain()
File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 124, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
dom = findMethod(sdUUID)
File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
return GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
File "/usr/share/vdsm/storage/fileSD.py", line 198, in __init__
validateFileSystemFeatures(manifest.sdUUID, manifest.mountpoint)
File "/usr/share/vdsm/storage/fileSD.py", line 93, in
validateFileSystemFeatures
oop.getProcessPool(sdUUID).directTouch(testFilePath)
File "/usr/share/vdsm/storage/outOfProcess.py", line 350, in directTouch
ioproc.touch(path, flags, mode)
File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 543,
in touch
self.timeout)
File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 427,
in _sendCommand
raise OSError(errcode, errstr)
OSError: [Errno 30] Read-only file system
But instead of reporting a failure to hosted-engine-setup, it reported a
successful execution in which it wasn't able to find any storage domain
there (this one is a real bug; I'm going to open a bug on that, can I
attach your logs there?):
Thread-141::INFO::2015-12-21
11:29:59,702::logUtils::51::dispatcher::(wrapper) Run and protect:
getStorageDomainsList, Return response: {'domlist': []}
Thread-141::DEBUG::2015-12-21
11:29:59,702::task::1191::Storage.TaskManager.Task::(prepare)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::finished: {'domlist': []}
Thread-141::DEBUG::2015-12-21
11:29:59,702::task::595::Storage.TaskManager.Task::(_updateState)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::moving from state preparing ->
state finished
Thread-141::DEBUG::2015-12-21
11:29:59,703::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-141::DEBUG::2015-12-21
11:29:59,703::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-141::DEBUG::2015-12-21
11:29:59,703::task::993::Storage.TaskManager.Task::(_decref)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::ref 0 aborting False
Thread-141::INFO::2015-12-21
11:29:59,704::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request
handler for 127.0.0.1:39718 stopped
And so, because VDSM doesn't report any existing storage domain,
hosted-engine-setup assumes that you are going to deploy the first host,
hence your original issue.
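The control flow described here can be sketched like this (hypothetical names, not the real VDSM code): the probe swallows the OSError and returns an empty list, so the caller cannot tell "no domains exist" apart from "storage was unreadable" and falls through to the first-host path.

```python
# Hedged sketch of the bug (hypothetical names, not actual VDSM/setup code):
# the error is reported as an empty result, which the setup layer then
# misreads as "no domains yet, so this must be the first host".
def get_storage_domains_list(probe):
    try:
        return probe()
    except OSError:
        return []  # bug: failure surfaces as "no domains found"

def is_first_host(domains):
    return len(domains) == 0  # empty list => assume first-host deployment

def unreadable_storage():
    raise OSError(30, "Read-only file system")

print(is_first_host(get_storage_domains_list(unreadable_storage)))
```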
>
>
> *From:* Simone Tiraboschi [mailto:stirabos@redhat.com]
> *Sent:* Tuesday, December 22, 2015 8:56 AM
> *To:* Will Dennis
> *Cc:* Sahina Bose; Yedidyah Bar David
>
> *Subject:* Re: [ovirt-users] Cannot retrieve answer file from 1st HE host
> when setting up 2nd host
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:44 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> Which logs are needed?
>
>
> Let's start with vdsm.log and /var/log/messages
> Then it's quite strange that you have that amount of data in mom.log so
> also that one could be interesting.
>
>
>
>
>
> /var/log/vdsm
>
> total 24M
>
> drwxr-xr-x 3 vdsm kvm 4.0K Dec 18 20:10 .
>
> drwxr-xr-x. 13 root root 4.0K Dec 20 03:15 ..
>
> drwxr-xr-x 2 vdsm kvm 6 Dec 9 03:24 backup
>
> -rw-r--r-- 1 vdsm kvm 2.5K Dec 21 11:29 connectivity.log
>
> -rw-r--r-- 1 vdsm kvm 173K Dec 21 11:21 mom.log
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 17 10:09 mom.log.1
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 17 04:06 mom.log.2
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 16 22:03 mom.log.3
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 16 16:00 mom.log.4
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 16 09:57 mom.log.5
>
> -rw-r--r-- 1 root root 115K Dec 21 11:29 supervdsm.log
>
> -rw-r--r-- 1 root root 2.7K Oct 16 11:38 upgrade.log
>
> -rw-r--r-- 1 vdsm kvm 13M Dec 22 08:44 vdsm.log
>
>
>
>
>
> *From:* Simone Tiraboschi [mailto:stirabos@redhat.com]
> *Sent:* Tuesday, December 22, 2015 3:58 AM
> *To:* Will Dennis; Sahina Bose
> *Cc:* Yedidyah Bar David; users
> *Subject:* Re: [ovirt-users] Cannot retrieve answer file from 1st HE host
> when setting up 2nd host
>
>
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:09 AM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> http://ur1.ca/ocstf
>
>
>
>
> 2015-12-21 11:28:39 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:SEND Please specify the full
> shared storage connection path to use (example: host:/path):
> 2015-12-21 11:28:55 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:RECEIVE localhost:/engine
>
>
>
> OK, so you are trying to deploy hosted-engine on GlusterFS in a
> hyper-converged way (using the same hosts for virtualization and for
> serving GlusterFS). Unfortunately I have to advise you that this is not a
> supported configuration on oVirt 3.6, due to several open bugs.
>
> So I'm glad you can help us test it, but I must caution that today
> this schema is not production ready.
>
>
>
> In your case it seems that VDSM correctly connects the GlusterFS volume
> seeing all the bricks
>
>
> 2015-12-21 11:28:55 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.nfs plugin.execute:936
> execute-output: ('/sbin/gluster', '--mode=script', '--xml', 'volume',
> 'info', 'engine', '--remote-host=localhost') stdout:
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <cliOutput>
> <opRet>0</opRet>
> <opErrno>0</opErrno>
> <opErrstr/>
> <volInfo>
> <volumes>
> <volume>
> <name>engine</name>
> <id>974c9da4-b236-4fc1-b26a-645f14601db8</id>
> <status>1</status>
> <statusStr>Started</statusStr>
> <brickCount>6</brickCount>
> <distCount>3</distCount>
>
>
>
> but then VDSM doesn't find any storage domain there:
>
>
>
>
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage.Plugin._late_customization
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getExistingDomain:476 _getExistingDomain
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._storageServerConnection:638 connectStorageServer
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._storageServerConnection:701 {'status': {'message': 'OK', 'code':
> 0}, 'statuslist': [{'status': 0, 'id':
> '67ece152-dd66-444c-8d18-4249d1b8f488'}]}
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getStorageDomainsList:595 getStorageDomainsList
> 2015-12-21 11:29:59 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getStorageDomainsList:598 {'status': {'message': 'OK', 'code': 0},
> 'domlist': []}
>
>
>
> Can you please attach also the correspondent VDSM logs?
>
>
>
> Adding Sahina here.
>
>
>
>
>
> On Dec 21, 2015, at 11:58 AM, Simone Tiraboschi <stirabos(a)redhat.com> wrote:
>
>
> On Mon, Dec 21, 2015 at 5:52 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> However, when I went to the 3rd host and did the setup, I selected
> 'glusterfs' and gave the path of the engine volume, it came back and
> incorrectly identified it as the first host, instead of an additional
> host... How does setup determine that? I confirmed that on this 3rd host
> the engine volume is available and has the GUID subfolder of the
> hosted engine...
>
>
> Can you please attach a log of hosted-engine-setup also from there?
>
>
>
>
>
Re: [ovirt-users] Hosted Engine crash - state = EngineUp-EngineUpBadHealth
by Simone Tiraboschi
On Tue, Dec 22, 2015 at 4:03 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
> I believe IPtables may be the culprit...
>
>
>
> Host 1:
>
> -------
>
> [root@ovirt-node-01 ~]# iptables -L
>
> Chain INPUT (policy ACCEPT)
>
> target prot opt source destination
>
> ACCEPT all -- anywhere anywhere state
> RELATED,ESTABLISHED
>
> ACCEPT icmp -- anywhere anywhere
>
> ACCEPT all -- anywhere anywhere
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:54321
>
> ACCEPT tcp -- anywhere anywhere tcp
> dpt:sunrpc
>
> ACCEPT udp -- anywhere anywhere udp
> dpt:sunrpc
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:ssh
>
> ACCEPT udp -- anywhere anywhere udp dpt:snmp
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:16514
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rockwell-csp2
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rfb:6923
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports 49152:49216
>
> REJECT all -- anywhere anywhere reject-with
> icmp-host-prohibited
>
>
>
> Chain FORWARD (policy ACCEPT)
>
> target prot opt source destination
>
> REJECT all -- anywhere anywhere PHYSDEV
> match ! --physdev-is-bridged reject-with icmp-host-prohibited
>
>
>
> Chain OUTPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Host 2:
>
> -------
>
> [root@ovirt-node-02 ~]# iptables -L
>
> Chain INPUT (policy ACCEPT)
>
> target prot opt source destination
>
> ACCEPT all -- anywhere anywhere state
> RELATED,ESTABLISHED
>
> ACCEPT icmp -- anywhere anywhere
>
> ACCEPT all -- anywhere anywhere
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:54321
>
> ACCEPT tcp -- anywhere anywhere tcp
> dpt:sunrpc
>
> ACCEPT udp -- anywhere anywhere udp
> dpt:sunrpc
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:ssh
>
> ACCEPT udp -- anywhere anywhere udp dpt:snmp
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:16514
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rockwell-csp2
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rfb:6923
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports 49152:49216
>
> REJECT all -- anywhere anywhere reject-with
> icmp-host-prohibited
>
>
>
> Chain FORWARD (policy ACCEPT)
>
> target prot opt source destination
>
> REJECT all -- anywhere anywhere PHYSDEV
> match ! --physdev-is-bridged reject-with icmp-host-prohibited
>
>
>
> Chain OUTPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Host 3:
>
> -------
>
> [root@ovirt-node-03 ~]# iptables -L
>
> Chain INPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Chain FORWARD (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Chain OUTPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
>
>
> An example of my Gluster engine volume status (off host #2):
>
>
>
> [root@ovirt-node-02 ~]# gluster volume status
>
> Status of volume: engine
>
> Gluster process TCP Port RDMA Port Online
> Pid
>
>
> ------------------------------------------------------------------------------
>
> Brick ovirt-node-02:/gluster_brick2/engine_
>
> brick 49217 0 Y
> 2973
>
> Brick ovirt-node-03:/gluster_brick3/engine_
>
> brick N/A N/A N
> N/A
>
> Brick ovirt-node-02:/gluster_brick4/engine_
>
> brick 49218 0 Y
> 2988
>
> Brick ovirt-node-03:/gluster_brick5/engine_
>
> brick N/A N/A N
> N/A
>
> NFS Server on localhost 2049 0 Y
> 3007
>
> Self-heal Daemon on localhost N/A N/A Y
> 3012
>
> NFS Server on ovirt-node-03 2049 0 Y
> 1671
>
> Self-heal Daemon on ovirt-node-03 N/A N/A Y
> 1707
>
>
>
> I had changed the base port # per instructions found at
> http://www.ovirt.org/Features/Self_Hosted_Engine_Hyper_Converged_Gluster_...
> :
>
> “By default gluster uses a port that vdsm also wants, so we need to change
> base-port setting avoiding the clash between the two daemons. We need to add
>
>
>
> option base-port 49217
>
> to /etc/glusterfs/glusterd.vol
>
>
>
> and ensure glusterd service is enabled and started before proceeding.”
>
>
>
> So I did that on all the hosts:
>
>
>
> [root@ovirt-node-02 ~]# cat /etc/glusterfs/glusterd.vol
>
> volume management
>
> type mgmt/glusterd
>
> option working-directory /var/lib/glusterd
>
> option transport-type socket,rdma
>
> option transport.socket.keepalive-time 10
>
> option transport.socket.keepalive-interval 2
>
> option transport.socket.read-fail-log off
>
> option ping-timeout 30
>
> # option base-port 49152
>
> option base-port 49217
>
> option rpc-auth-allow-insecure on
>
> end-volume
>
>
>
>
>
> Question: does oVirt really need IPtables to be enforcing rules, or can I
> just set everything wide open? If I can, how to specify that in setup?
>
hosted-engine-setup asks:
iptables was detected on your computer, do you wish setup to
configure it? (Yes, No)[Yes]:
You just have to say no here.
If you say no, it's completely up to you to configure it, opening the
required ports, or to disable it entirely if you don't care.
The issue with the gluster ports is that hosted-engine-setup simply
configures iptables for what it knows you'll need, and on 3.6 it always
assumes that the gluster volume is served by external hosts.
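If you keep iptables enforcing and answer "no" at setup, you then have to open the gluster ports yourself. A sketch that generates the extra rules a hyperconverged node would need; the brick range assumes the base-port 49217 override used in this thread, and 24007:24008 is glusterd's usual management range, so verify both against your gluster version:

```python
# Hedged sketch: build the extra iptables rules for locally served gluster
# bricks. 24007:24008 covers glusterd management; the brick range assumes
# the base-port 49217 override from this thread and a generous brick count.
def gluster_iptables_rules(brick_base=49217, brick_count=100):
    ports = ["24007:24008", f"{brick_base}:{brick_base + brick_count - 1}"]
    return [
        f"-A INPUT -p tcp -m multiport --dports {p} -j ACCEPT"
        for p in ports
    ]

for rule in gluster_iptables_rules():
    print(rule)
```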
>
>
>
> W.
>
>
>
>
>
> *From:* Sahina Bose [mailto:sabose@redhat.com]
> *Sent:* Tuesday, December 22, 2015 9:19 AM
> *To:* Will Dennis; Simone Tiraboschi; Dan Kenigsberg
>
> *Subject:* Re: [ovirt-users] Hosted Engine crash - state =
> EngineUp-EngineUpBadHealth
>
>
>
>
>
> On 12/22/2015 07:47 PM, Sahina Bose wrote:
>
>
>
> On 12/22/2015 07:28 PM, Will Dennis wrote:
>
> See attached for requested log files
>
>
> From gluster logs
>
> [2015-12-22 00:40:53.501341] W [MSGID: 108001]
> [afr-common.c:3924:afr_notify] 0-engine-replicate-1: Client-quorum is not
> met
> [2015-12-22 00:40:53.502288] W [socket.c:588:__socket_rwv]
> 0-engine-client-2: readv on 138.15.200.93:49217 failed (No data available)
>
> [2015-12-22 00:41:17.667302] W [fuse-bridge.c:2292:fuse_writev_cbk]
> 0-glusterfs-fuse: 3875597: WRITE => -1 (Read-only file system)
>
> Could you check if the gluster ports are open on all nodes?
>
>
> It's possible you ran into this ? -
> https://bugzilla.redhat.com/show_bug.cgi?id=1288979
>
>
>
>
>
>
>
> *From:* Sahina Bose [mailto:sabose@redhat.com <sabose(a)redhat.com>]
> *Sent:* Tuesday, December 22, 2015 4:59 AM
> *To:* Simone Tiraboschi; Will Dennis; Dan Kenigsberg
> *Cc:* users
> *Subject:* Re: [ovirt-users] Hosted Engine crash - state =
> EngineUp-EngineUpBadHealth
>
>
>
>
>
> On 12/22/2015 02:38 PM, Simone Tiraboschi wrote:
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:31 AM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> OK, another problem :(
>
> I was having the same problem with my second oVirt host that I had with my
> first one, where when I ran "hosted-engine --deploy" on it, after it
> completed successfully, then I was experiencing a ~50sec lag when SSH’ing
> into the node…
>
> vpnp71:~ will$ time ssh root@ovirt-node-02 uptime
> 19:36:06 up 4 days, 8:31, 0 users, load average: 0.68, 0.70, 0.67
>
> real 0m50.540s
> user 0m0.025s
> sys 0m0.008s
>
>
> So, in the oVirt web admin console, I put the "ovirt-node-02” node into
> Maintenance mode, then SSH’d to the server and rebooted it. Sure enough,
> after the server came back up, SSH was fine (no delay), which again was the
> same experience I had had with the first oVirt host. So, I went back to the
> web console, and choose the “Confirm host has been rebooted” option, which
> I thought would be the right action to take after a reboot. The system
> opened a dialog box with a spinner, which never stopped spinning… So
> finally, I closed the dialog box with the upper right (X) symbol, and then
> for this same host choose “Activate” from the menu. It was then I noticed I
> had recieved a state transition email notifying me that
> "EngineUp-EngineUpBadHealth” and sure enough, the web UI was then
> unresponsive. I checked on the first oVirt host, the VM with the name
> “HostedEngine” is still running, but obviously isn’t working…
>
> So, looks like I need to restart the HostedEngine VM or take whatever
> action is needed to return oVirt to operation… Hate to keep asking this
> question, but what’s the correct action at this point?
>
>
>
> ovirt-ha-agent should always restart it for you after a few minutes but
> the point is that the network configuration seems to be not that stable.
>
>
>
> I know from another thread that you are trying to deploy hosted-engine
> over GlusterFS in an hyperconverged way and this, as I said, is currently
> not supported.
>
> I think that it can also require some specific configuration on the network
> side.
>
>
> For hyperconverged gluster+engine , it should work without any specific
> configuration on network side. However if the network is flaky, it is
> possible that there are errors with gluster volume access. Could you
> provide the ovirt-ha-agent logs as well as gluster mount logs?
>
>
>
>
> Adding Sahina and Dan here.
>
>
>
> Thanks, again,
> Will
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
>
>
>
>
>
Restart of vdsmd
by gflwqs gflwqs
Hi list,
Do I need to put the host into maintenance before restarting vdsmd if I need
to change a parameter in /etc/vdsm/vdsm.conf?
Thanks!
Christian
iLO2
by Eriks Goodwin
I have been tinkering with various settings and configurations trying to get power management working properly--but it keeps giving me "Fail:" (without any details) every time I test the settings. The servers I am using are HP Proliant DL380 G6 with integrated iLO2. Any tips?
I can use the same credentials to log into the system via the web interface--so I'm sure I have the address, username, and passwd right. :-)
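One way to take the engine's UI out of the picture is to drive the fence agent by hand from a host; fence_ilo is the agent used for iLO. A hedged dry-run sketch that only assembles the command (host and user are placeholders, and iLO2 firmware may need extra options), as a starting point for testing outside the UI:

```python
# Dry-run sketch: assemble (but do not execute) a manual fence_ilo test.
# Address/user/password are placeholders; a "status" action only queries
# power state, so it is a safe first test before on/off/reboot.
def fence_ilo_cmd(addr, user, status_only=True):
    cmd = ["fence_ilo", "-a", addr, "-l", user, "-p", "PASSWORD"]
    if status_only:
        cmd += ["-o", "status"]  # query power state without acting on it
    return cmd

print(" ".join(fence_ilo_cmd("ilo2.example.com", "admin")))
```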