uncaught exception with engine vm in 3.6.1 selecting hosted_storage
by Gianluca Cecchi
Hello, after updating my HE environment from 3.6.0 to 3.6.1, I'm not able to
import/activate the hosted_storage domain because it shows as unattached.
Moreover, I get the popup message
Uncaught exception occurred. Please try reloading the page. Details:
(TypeError) __gwt$exception: <skipped>: c is null
when I select the hosted_storage domain line in the Storage tab. See:
https://drive.google.com/file/d/0BwoPbcrMv8mvVlc1XzNUZ18yWWs/view?usp=sha...
I apparently don't find any ERROR messages when I do this, so I don't know
where to look.
On the engine, where the DB lives, I have this under /var/lib/pgsql/data/pg_log:
[root@ractorshe pg_log]# ls -lrt
total 12
-rw-------. 1 postgres postgres 0 Dec 16 01:00 postgresql-Wed.log
-rw-------. 1 postgres postgres 0 Dec 17 01:00 postgresql-Thu.log
-rw-------. 1 postgres postgres 0 Dec 18 01:00 postgresql-Fri.log
-rw-------. 1 postgres postgres 7480 Dec 20 00:39 postgresql-Sat.log
-rw-------. 1 postgres postgres 3488 Dec 20 01:29 postgresql-Sun.log
-rw-------. 1 postgres postgres 0 Dec 21 01:00 postgresql-Mon.log
-rw-------. 1 postgres postgres 0 Dec 22 01:00 postgresql-Tue.log
In Sunday's log, when the engine was started, I see
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 1/254C6710
LOG: record with zero length at 1/254EB6B8
LOG: redo done at 1/254EB688
LOG: last completed transaction was at log time 2015-12-19
23:49:47.920147+00
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
ERROR: insert or update on table "storage_domain_dynamic" violates foreign
key constraint "fk_stora
ge_domain_dynamic_storage_domain_static"
DETAIL: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
CONTEXT: SQL statement "INSERT INTO
storage_domain_dynamic(available_disk_size, id, used_disk_size)
VALUES(v_available_disk_size, v_id, v_used_disk_size)"
PL/pgSQL function
insertstorage_domain_dynamic(integer,uuid,integer) line 3 at SQL statement
STATEMENT: select * from insertstorage_domain_dynamic($1, $2, $3) as result
and then many lines of the kind:
LOG: autovacuum: found orphan temp table "pg_temp_8"."tt_temp22" in
database "engine"
so there are messages related to storage domains, and these could be the reason
for the exception and for the inability to import/attach my HE storage domain.
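If it helps to narrow this down: the failing key is the all-zero "nil" UUID, so the engine appears to be inserting dynamic stats for a storage domain id it never resolved. Below is a minimal sketch (plain Python, not an oVirt tool) of scanning a pg_log excerpt for such foreign-key violations; the sample text is taken from the log quoted above:

```python
# Sketch: scan a PostgreSQL log excerpt for FK violations and pull out
# the constraint name and the offending key, to correlate with the UI error.
import re

def find_fk_violations(log_text):
    """Return (constraint, key) pairs for each FK violation in the log."""
    pattern = re.compile(
        r'violates foreign\s+key constraint "([^"]+)".*?'
        r'Key \(id\)=\(([0-9a-f-]+)\)',
        re.DOTALL)
    return pattern.findall(log_text)

# Excerpt quoted from pg_log above (joined onto single lines here).
sample = '''ERROR: insert or update on table "storage_domain_dynamic" violates foreign key constraint "fk_storage_domain_dynamic_storage_domain_static"
DETAIL: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in table "storage_domain_static".'''

for constraint, key in find_fk_violations(sample):
    print(constraint, key)
```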
Let me know if I can provide more logs or output of queries from the
database.
Gianluca
How to run "engine-backup"?
by Will Dennis
Yay, I *finally* have my 3-host hyper-converged oVirt datacenter stood up :)
[root@ovirt-node-01 ~]# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : True
Hostname : ovirt-node-01
Host ID : 1
Engine status : {"health": "good", "vm": "up", "detail": "up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 65c41ca5
Host timestamp : 217522
--== Host 2 status ==--
Status up-to-date : True
Hostname : ovirt-node-02
Host ID : 2
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : a7a599d8
Host timestamp : 56101
--== Host 3 status ==--
Status up-to-date : True
Hostname : ovirt-node-03
Host ID : 3
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 6e138d0b
Host timestamp : 432658
Now in the oVirt webadmin UI, down in the "Alerts" section, I am seeing this
message:
"There is no full backup available, please run engine-backup to prevent data
loss in case of corruption."
I do not see an "engine-backup" CLI command on my hosts; how does one do
this? (I have searched ovirt.org to no avail...)
Thanks,
Will
mount a usb
by alireza sadeh seighalan
hi everyone
I want to mount a USB device to a Windows VM in oVirt 3.6.1. How can I do it?
Thanks in advance.
virtual Disk
by Taste-Of-IT
Hello,
I am testing oVirt 3.6 as a Self-Hosted Engine and created a virtual machine.
Now I want to change the size of its disk, and I found both a field to change
the disk size and a field to grow the disk. The oVirt manual describes
changing the value of the grow field. My question is: what is the difference,
and what does each do? E.g., what happens if I only change the disk size from
8 to 10? Is it the same as changing the grow size from 0 to 2?
Thanks for a technical explanation.
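My understanding (an assumption worth checking against the oVirt documentation) is that the "extend size by" field adds to the current provisioned size, so the two operations in the question should end in the same place. The arithmetic as a tiny sketch:

```python
# Sketch of the extend semantics as I understand them (assumption, not
# confirmed from oVirt source): the grow field adds to the current size,
# so 8 GB extended by 2 GB equals setting the size to 10 GB directly.
def extend_disk(current_gb, extend_by_gb):
    if extend_by_gb < 0:
        raise ValueError("disks can only grow, never shrink")
    return current_gb + extend_by_gb

print(extend_disk(8, 2))
```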
After update to 3.6.1 profile internal does not exist message
by Gianluca Cecchi
Hello.
I updated my test self-hosted engine VM to 3.6.1 and to CentOS 7.2.
Now it seems I'm not able to log in to the webadmin, due to this in engine.log
(the profile field is empty in the web admin GUI):
2015-12-19 13:59:05,182 ERROR
[org.ovirt.engine.core.bll.aaa.LoginBaseCommand] (default task-17) []
Can't login because authentication profile 'internal' doesn't exist.
Is it an already known problem?
The first ERROR message I find in it is dated 13:34 and it is an SQL one:
2015-12-19 13:34:39,882 ERROR [org.ovirt.engine.core.bll.Backend]
(ServerService Thread Pool -- 42) [] Failed to run compensation on startup
for Command
'org.ovirt.engine.core.bll.storage.AddExistingFileStorageDomainCommand',
Command Id 'bb7ee4a3-6b35-4c62-9823-a434a40e5b38':
CallableStatementCallback; SQL [{call insertstorage_domain_dynamic(?, ?,
?)}]; ERROR: insert or update on table "storage_domain_dynamic" violates
foreign key constraint "fk_storage_domain_dynamic_storage_domain_static"
Detail: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
Where: SQL statement "INSERT INTO
storage_domain_dynamic(available_disk_size, id, used_disk_size)
VALUES(v_available_disk_size, v_id, v_used_disk_size)"
PL/pgSQL function insertstorage_domain_dynamic(integer,uuid,integer) line 3
at SQL statement; nested exception is org.postgresql.util.PSQLException:
ERROR: insert or update on table "storage_domain_dynamic" violates foreign
key constraint "fk_storage_domain_dynamic_storage_domain_static"
Detail: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
Where: SQL statement "INSERT INTO
storage_domain_dynamic(available_disk_size, id, used_disk_size)
VALUES(v_available_disk_size, v_id, v_used_disk_size)"
PL/pgSQL function insertstorage_domain_dynamic(integer,uuid,integer) line 3
at SQL statement
2015-12-19 13:34:39,882 ERROR [org.ovirt.engine.core.bll.Backend]
(ServerService Thread Pool -- 42) [] Exception:
org.springframework.dao.DataIntegrityViolationException:
CallableStatementCallback; SQL [{call insertstorage_domain_dynamic(?, ?,
?)}]; ERROR: insert or update on table "storage_domain_dynamic" violates
foreign key constraint "fk_storage_domain_dynamic_storage_domain_static"
Detail: Key (id)=(00000000-0000-0000-0000-000000000000) is not present in
table "storage_domain_static".
I think it is not related to the update from CentOS 7.1 to 7.2 (which also
updates PostgreSQL), because that came about 10 minutes later:
from the installed version
Nov 04 12:55:30 Installed: postgresql-server-9.2.13-1.el7_1.x86_64
to the 7.2 one
Dec 19 13:45:51 Updated: postgresql-server-9.2.14-1.el7_1.x86_64
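A quick sanity check of that timing, using the timestamps quoted above: the PostgreSQL update landed about eleven minutes after the first SQL error, so the error clearly predates the package update.

```python
# Compute the gap between the first SQL error (13:34:39) and the
# postgresql-server rpm update (13:45:51), both quoted above.
from datetime import datetime

first_error = datetime(2015, 12, 19, 13, 34, 39)
pg_update   = datetime(2015, 12, 19, 13, 45, 51)
gap = pg_update - first_error
print(gap)  # the update came after the error, not before
```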
BTW: my order was
yum update "ovirt-engine-setup-*"
engine-setup
the engine-setup completed successfully with
...
[ INFO ] Stage: Misc configuration
[ INFO ] Backing up database localhost:engine to
'/var/lib/ovirt-engine/backups/engine-20151219132925.5OoFQA.dump'.
[ INFO ] Creating/refreshing Engine database schema
[ INFO ] Creating/refreshing Engine 'internal' domain database schema
[ INFO ] Upgrading CA
[ INFO ] Configuring WebSocket Proxy
[ INFO ] Generating post install configuration file
'/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf'
[ INFO ] Stage: Transaction commit
[ INFO ] Stage: Closing up
--== SUMMARY ==--
SSH fingerprint: 19:56:8d:3e:50:fc:90:37:5a:ba:6c:57:30:b1:7d:93
Internal CA
DA:E6:04:34:99:A0:DB:CE:3F:0A:7B:A2:96:67:4C:7F:19:CA:95:5F
Note! If you want to gather statistical information you can
install Reports and/or DWH:
http://www.ovirt.org/Ovirt_DWH
http://www.ovirt.org/Ovirt_Reports
Web access is enabled at:
http://ractorshe.mydomain:80/ovirt-engine
https://ractorshe.mydomain:443/ovirt-engine
--== END OF SUMMARY ==--
[ INFO ] Starting engine service
[ INFO ] Restarting httpd
[ INFO ] Restarting ovirt-vmconsole proxy service
[ INFO ] Stage: Clean up
Log file is located at
/var/log/ovirt-engine/setup/ovirt-engine-setup-20151219132415-2pfee1.log
[ INFO ] Generating answer file
'/var/lib/ovirt-engine/setup/answers/20151219133045-setup.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ INFO ] Execution of setup completed successfully
Strangely, the VM was shut down...
In /var/log/libvirt/qemu/HostedEngine.log I saw (in UTC)
2015-12-19 12:31:19.494+0000: shutting down
So I exited from global maintenance, the VM was started, and I verified that
the admin web GUI was OK with the new version.
Then I put it into maintenance again and ran, on the engine VM:
systemctl stop ovirt-engine
yum update (I forgot to also shut down PostgreSQL first...)
shutdown -r now
As it was in maintenance, the VM wasn't started again, so I exited from
maintenance and the VM was started; but now I'm not able to see the internal
profile...
Let me know if you need further info.
Gianluca
Re: [ovirt-users] Cannot retrieve answer file from 1st HE host when setting up 2nd host
by Simone Tiraboschi
On Tue, Dec 22, 2015 at 3:06 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
> See attached for requested logs
>
Thanks, the issue is here:
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22
00:40:53.496109] C [MSGID: 106002]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume engine. Stopping local bricks.
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22
00:40:53.496410] C [MSGID: 106002]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume vmdata. Stopping local bricks.
So at that point gluster lost its quorum and the file system became read-only.
On getStorageDomainsList, VDSM internally raises an exception because the
file system is read-only:
Thread-141::DEBUG::2015-12-21
11:29:59,666::fileSD::157::Storage.StorageDomainManifest::(__init__)
Reading domain in path
/rhev/data-center/mnt/glusterSD/localhost:_engine/e89b6e64-bd7d-4846-b970-9af32a3295ee
Thread-141::DEBUG::2015-12-21
11:29:59,666::__init__::320::IOProcessClient::(_run) Starting IOProcess...
Thread-141::DEBUG::2015-12-21
11:29:59,680::persistentDict::192::Storage.PersistentDict::(__init__)
Created a persistent dict with FileMetadataRW backend
Thread-141::ERROR::2015-12-21
11:29:59,686::hsm::2898::Storage.HSM::(getStorageDomainsList) Unexpected
error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/hsm.py", line 2882, in getStorageDomainsList
dom = sdCache.produce(sdUUID=sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 100, in produce
domain.getRealDomain()
File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 124, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
dom = findMethod(sdUUID)
File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
return GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
File "/usr/share/vdsm/storage/fileSD.py", line 198, in __init__
validateFileSystemFeatures(manifest.sdUUID, manifest.mountpoint)
File "/usr/share/vdsm/storage/fileSD.py", line 93, in
validateFileSystemFeatures
oop.getProcessPool(sdUUID).directTouch(testFilePath)
File "/usr/share/vdsm/storage/outOfProcess.py", line 350, in directTouch
ioproc.touch(path, flags, mode)
File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 543,
in touch
self.timeout)
File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 427,
in _sendCommand
raise OSError(errcode, errstr)
OSError: [Errno 30] Read-only file system
But instead of reporting a failure to hosted-engine-setup, it reported a
successful execution in which it wasn't able to find any storage domain
there (this one is a real bug; I'm going to open a bug on that, can I
attach your logs there?):
Thread-141::INFO::2015-12-21
11:29:59,702::logUtils::51::dispatcher::(wrapper) Run and protect:
getStorageDomainsList, Return response: {'domlist': []}
Thread-141::DEBUG::2015-12-21
11:29:59,702::task::1191::Storage.TaskManager.Task::(prepare)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::finished: {'domlist': []}
Thread-141::DEBUG::2015-12-21
11:29:59,702::task::595::Storage.TaskManager.Task::(_updateState)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::moving from state preparing ->
state finished
Thread-141::DEBUG::2015-12-21
11:29:59,703::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-141::DEBUG::2015-12-21
11:29:59,703::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-141::DEBUG::2015-12-21
11:29:59,703::task::993::Storage.TaskManager.Task::(_decref)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::ref 0 aborting False
Thread-141::INFO::2015-12-21
11:29:59,704::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request
handler for 127.0.0.1:39718 stopped
And so, because VDSM doesn't report any existing storage domain,
hosted-engine-setup assumes that you are going to deploy the first host,
hence your original issue.
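The control flow described here can be sketched like this (hypothetical names, not the real VDSM code): the probe swallows the OSError and returns an empty list, so the caller cannot tell "no domains exist" apart from "storage was unreadable" and falls through to the first-host path.

```python
# Hedged sketch of the bug (hypothetical names, not actual VDSM/setup code):
# the error is reported as an empty result, which the setup layer then
# misreads as "no domains yet, so this must be the first host".
def get_storage_domains_list(probe):
    try:
        return probe()
    except OSError:
        return []  # bug: failure surfaces as "no domains found"

def is_first_host(domains):
    return len(domains) == 0  # empty list => assume first-host deployment

def unreadable_storage():
    raise OSError(30, "Read-only file system")

print(is_first_host(get_storage_domains_list(unreadable_storage)))
```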
>
>
> *From:* Simone Tiraboschi [mailto:stirabos@redhat.com]
> *Sent:* Tuesday, December 22, 2015 8:56 AM
> *To:* Will Dennis
> *Cc:* Sahina Bose; Yedidyah Bar David
>
> *Subject:* Re: [ovirt-users] Cannot retrieve answer file from 1st HE host
> when setting up 2nd host
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:44 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> Which logs are needed?
>
>
> Let's start with vdsm.log and /var/log/messages
> Then it's quite strange that you have that amount of data in mom.log so
> also that one could be interesting.
>
>
>
>
>
> /var/log/vdsm
>
> total 24M
>
> drwxr-xr-x 3 vdsm kvm 4.0K Dec 18 20:10 .
>
> drwxr-xr-x. 13 root root 4.0K Dec 20 03:15 ..
>
> drwxr-xr-x 2 vdsm kvm 6 Dec 9 03:24 backup
>
> -rw-r--r-- 1 vdsm kvm 2.5K Dec 21 11:29 connectivity.log
>
> -rw-r--r-- 1 vdsm kvm 173K Dec 21 11:21 mom.log
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 17 10:09 mom.log.1
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 17 04:06 mom.log.2
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 16 22:03 mom.log.3
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 16 16:00 mom.log.4
>
> -rw-r--r-- 1 vdsm kvm 2.0M Dec 16 09:57 mom.log.5
>
> -rw-r--r-- 1 root root 115K Dec 21 11:29 supervdsm.log
>
> -rw-r--r-- 1 root root 2.7K Oct 16 11:38 upgrade.log
>
> -rw-r--r-- 1 vdsm kvm 13M Dec 22 08:44 vdsm.log
>
>
>
>
>
> *From:* Simone Tiraboschi [mailto:stirabos@redhat.com]
> *Sent:* Tuesday, December 22, 2015 3:58 AM
> *To:* Will Dennis; Sahina Bose
> *Cc:* Yedidyah Bar David; users
> *Subject:* Re: [ovirt-users] Cannot retrieve answer file from 1st HE host
> when setting up 2nd host
>
>
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:09 AM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> http://ur1.ca/ocstf
>
>
>
>
> 2015-12-21 11:28:39 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:SEND Please specify the full
> shared storage connection path to use (example: host:/path):
> 2015-12-21 11:28:55 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:RECEIVE localhost:/engine
>
>
>
> OK, so you are trying to deploy hosted-engine on GlusterFS in a
> hyper-converged way (using the same hosts for virtualization and for
> serving GlusterFS). Unfortunately I have to advise you that this is not a
> supported configuration on oVirt 3.6, due to several open bugs.
>
> So I'm glad you can help us test it, but I must caution that today
> this schema is not production ready.
>
>
>
> In your case it seems that VDSM correctly connects the GlusterFS volume
> seeing all the bricks
>
>
> 2015-12-21 11:28:55 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.nfs plugin.execute:936
> execute-output: ('/sbin/gluster', '--mode=script', '--xml', 'volume',
> 'info', 'engine', '--remote-host=localhost') stdout:
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <cliOutput>
> <opRet>0</opRet>
> <opErrno>0</opErrno>
> <opErrstr/>
> <volInfo>
> <volumes>
> <volume>
> <name>engine</name>
> <id>974c9da4-b236-4fc1-b26a-645f14601db8</id>
> <status>1</status>
> <statusStr>Started</statusStr>
> <brickCount>6</brickCount>
> <distCount>3</distCount>
>
>
>
> but then VDSM doesn't find any storage domain there:
>
>
>
>
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage.Plugin._late_customization
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getExistingDomain:476 _getExistingDomain
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._storageServerConnection:638 connectStorageServer
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._storageServerConnection:701 {'status': {'message': 'OK', 'code':
> 0}, 'statuslist': [{'status': 0, 'id':
> '67ece152-dd66-444c-8d18-4249d1b8f488'}]}
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getStorageDomainsList:595 getStorageDomainsList
> 2015-12-21 11:29:59 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getStorageDomainsList:598 {'status': {'message': 'OK', 'code': 0},
> 'domlist': []}
>
>
>
> Can you please attach also the correspondent VDSM logs?
>
>
>
> Adding Sahina here.
>
>
>
>
>
> On Dec 21, 2015, at 11:58 AM, Simone Tiraboschi <stirabos(a)redhat.com> wrote:
>
>
> On Mon, Dec 21, 2015 at 5:52 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> However, when I went to the 3rd host and did the setup, I selected
> 'glusterfs' and gave the path of the engine volume, it came back and
> incorrectly identified it as the first host, instead of an additional
> host... How does setup determine that? I confirmed that on this 3rd host
> the engine volume is available and has the GUID subfolder of the
> hosted engine...
>
>
> Can you please attach a log of hosted-engine-setup also from there?
>
>
>
>
>
Re: [ovirt-users] Hosted Engine crash - state = EngineUp-EngineUpBadHealth
by Simone Tiraboschi
On Tue, Dec 22, 2015 at 4:03 PM, Will Dennis <wdennis(a)nec-labs.com> wrote:
> I believe IPtables may be the culprit...
>
>
>
> Host 1:
>
> -------
>
> [root@ovirt-node-01 ~]# iptables -L
>
> Chain INPUT (policy ACCEPT)
>
> target prot opt source destination
>
> ACCEPT all -- anywhere anywhere state
> RELATED,ESTABLISHED
>
> ACCEPT icmp -- anywhere anywhere
>
> ACCEPT all -- anywhere anywhere
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:54321
>
> ACCEPT tcp -- anywhere anywhere tcp
> dpt:sunrpc
>
> ACCEPT udp -- anywhere anywhere udp
> dpt:sunrpc
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:ssh
>
> ACCEPT udp -- anywhere anywhere udp dpt:snmp
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:16514
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rockwell-csp2
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rfb:6923
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports 49152:49216
>
> REJECT all -- anywhere anywhere reject-with
> icmp-host-prohibited
>
>
>
> Chain FORWARD (policy ACCEPT)
>
> target prot opt source destination
>
> REJECT all -- anywhere anywhere PHYSDEV
> match ! --physdev-is-bridged reject-with icmp-host-prohibited
>
>
>
> Chain OUTPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Host 2:
>
> -------
>
> [root@ovirt-node-02 ~]# iptables -L
>
> Chain INPUT (policy ACCEPT)
>
> target prot opt source destination
>
> ACCEPT all -- anywhere anywhere state
> RELATED,ESTABLISHED
>
> ACCEPT icmp -- anywhere anywhere
>
> ACCEPT all -- anywhere anywhere
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:54321
>
> ACCEPT tcp -- anywhere anywhere tcp
> dpt:sunrpc
>
> ACCEPT udp -- anywhere anywhere udp
> dpt:sunrpc
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:ssh
>
> ACCEPT udp -- anywhere anywhere udp dpt:snmp
>
> ACCEPT tcp -- anywhere anywhere tcp dpt:16514
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rockwell-csp2
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports rfb:6923
>
> ACCEPT tcp -- anywhere anywhere multiport
> dports 49152:49216
>
> REJECT all -- anywhere anywhere reject-with
> icmp-host-prohibited
>
>
>
> Chain FORWARD (policy ACCEPT)
>
> target prot opt source destination
>
> REJECT all -- anywhere anywhere PHYSDEV
> match ! --physdev-is-bridged reject-with icmp-host-prohibited
>
>
>
> Chain OUTPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Host 3:
>
> -------
>
> [root@ovirt-node-03 ~]# iptables -L
>
> Chain INPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Chain FORWARD (policy ACCEPT)
>
> target prot opt source destination
>
>
>
> Chain OUTPUT (policy ACCEPT)
>
> target prot opt source destination
>
>
>
>
>
> An example of my Gluster engine volume status (off host #2):
>
>
>
> [root@ovirt-node-02 ~]# gluster volume status
>
> Status of volume: engine
>
> Gluster process TCP Port RDMA Port Online
> Pid
>
>
> ------------------------------------------------------------------------------
>
> Brick ovirt-node-02:/gluster_brick2/engine_
>
> brick 49217 0 Y
> 2973
>
> Brick ovirt-node-03:/gluster_brick3/engine_
>
> brick N/A N/A N
> N/A
>
> Brick ovirt-node-02:/gluster_brick4/engine_
>
> brick 49218 0 Y
> 2988
>
> Brick ovirt-node-03:/gluster_brick5/engine_
>
> brick N/A N/A N
> N/A
>
> NFS Server on localhost 2049 0 Y
> 3007
>
> Self-heal Daemon on localhost N/A N/A Y
> 3012
>
> NFS Server on ovirt-node-03 2049 0 Y
> 1671
>
> Self-heal Daemon on ovirt-node-03 N/A N/A Y
> 1707
>
>
>
> I had changed the base port # per instructions found at
> http://www.ovirt.org/Features/Self_Hosted_Engine_Hyper_Converged_Gluster_...
> :
>
> “By default gluster uses a port that vdsm also wants, so we need to change
> base-port setting avoiding the clash between the two daemons. We need to add
>
>
>
> option base-port 49217
>
> to /etc/glusterfs/glusterd.vol
>
>
>
> and ensure glusterd service is enabled and started before proceeding.”
>
>
>
> So I did that on all the hosts:
>
>
>
> [root@ovirt-node-02 ~]# cat /etc/glusterfs/glusterd.vol
>
> volume management
>
> type mgmt/glusterd
>
> option working-directory /var/lib/glusterd
>
> option transport-type socket,rdma
>
> option transport.socket.keepalive-time 10
>
> option transport.socket.keepalive-interval 2
>
> option transport.socket.read-fail-log off
>
> option ping-timeout 30
>
> # option base-port 49152
>
> option base-port 49217
>
> option rpc-auth-allow-insecure on
>
> end-volume
>
>
>
>
>
> Question: does oVirt really need IPtables to be enforcing rules, or can I
> just set everything wide open? If I can, how to specify that in setup?
>
hosted-engine-setup asks:
iptables was detected on your computer, do you wish setup to
configure it? (Yes, No)[Yes]:
You just have to say no here.
If you say no, it's completely up to you to configure it, opening the
required ports, or to disable it entirely if you don't care.
The issue with the gluster ports is that hosted-engine-setup simply
configures iptables for what it knows you'll need, and on 3.6 it always
assumes that the gluster volume is served by external hosts.
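If you keep iptables enforcing and answer "no" at setup, you then have to open the gluster ports yourself. A sketch that generates the extra rules a hyperconverged node would need; the brick range assumes the base-port 49217 override used in this thread, and 24007:24008 is glusterd's usual management range, so verify both against your gluster version:

```python
# Hedged sketch: build the extra iptables rules for locally served gluster
# bricks. 24007:24008 covers glusterd management; the brick range assumes
# the base-port 49217 override from this thread and a generous brick count.
def gluster_iptables_rules(brick_base=49217, brick_count=100):
    ports = ["24007:24008", f"{brick_base}:{brick_base + brick_count - 1}"]
    return [
        f"-A INPUT -p tcp -m multiport --dports {p} -j ACCEPT"
        for p in ports
    ]

for rule in gluster_iptables_rules():
    print(rule)
```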
>
>
>
> W.
>
>
>
>
>
> *From:* Sahina Bose [mailto:sabose@redhat.com]
> *Sent:* Tuesday, December 22, 2015 9:19 AM
> *To:* Will Dennis; Simone Tiraboschi; Dan Kenigsberg
>
> *Subject:* Re: [ovirt-users] Hosted Engine crash - state =
> EngineUp-EngineUpBadHealth
>
>
>
>
>
> On 12/22/2015 07:47 PM, Sahina Bose wrote:
>
>
>
> On 12/22/2015 07:28 PM, Will Dennis wrote:
>
> See attached for requested log files
>
>
> From gluster logs
>
> [2015-12-22 00:40:53.501341] W [MSGID: 108001]
> [afr-common.c:3924:afr_notify] 0-engine-replicate-1: Client-quorum is not
> met
> [2015-12-22 00:40:53.502288] W [socket.c:588:__socket_rwv]
> 0-engine-client-2: readv on 138.15.200.93:49217 failed (No data available)
>
> [2015-12-22 00:41:17.667302] W [fuse-bridge.c:2292:fuse_writev_cbk]
> 0-glusterfs-fuse: 3875597: WRITE => -1 (Read-only file system)
>
> Could you check if the gluster ports are open on all nodes?
>
>
> It's possible you ran into this ? -
> https://bugzilla.redhat.com/show_bug.cgi?id=1288979
>
>
>
>
>
>
>
> *From:* Sahina Bose [mailto:sabose@redhat.com <sabose(a)redhat.com>]
> *Sent:* Tuesday, December 22, 2015 4:59 AM
> *To:* Simone Tiraboschi; Will Dennis; Dan Kenigsberg
> *Cc:* users
> *Subject:* Re: [ovirt-users] Hosted Engine crash - state =
> EngineUp-EngineUpBadHealth
>
>
>
>
>
> On 12/22/2015 02:38 PM, Simone Tiraboschi wrote:
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:31 AM, Will Dennis <wdennis(a)nec-labs.com> wrote:
>
> OK, another problem :(
>
> I was having the same problem with my second oVirt host that I had with my
> first one, where when I ran "hosted-engine --deploy" on it, after it
> completed successfully, then I was experiencing a ~50sec lag when SSH’ing
> into the node…
>
> vpnp71:~ will$ time ssh root@ovirt-node-02 uptime
> 19:36:06 up 4 days, 8:31, 0 users, load average: 0.68, 0.70, 0.67
>
> real 0m50.540s
> user 0m0.025s
> sys 0m0.008s
>
>
> So, in the oVirt web admin console, I put the "ovirt-node-02” node into
> Maintenance mode, then SSH’d to the server and rebooted it. Sure enough,
> after the server came back up, SSH was fine (no delay), which again was the
> same experience I had had with the first oVirt host. So, I went back to the
> web console, and choose the “Confirm host has been rebooted” option, which
> I thought would be the right action to take after a reboot. The system
> opened a dialog box with a spinner, which never stopped spinning… So
> finally, I closed the dialog box with the upper right (X) symbol, and then
> for this same host choose “Activate” from the menu. It was then I noticed I
> had recieved a state transition email notifying me that
> "EngineUp-EngineUpBadHealth” and sure enough, the web UI was then
> unresponsive. I checked on the first oVirt host, the VM with the name
> “HostedEngine” is still running, but obviously isn’t working…
>
> So, looks like I need to restart the HostedEngine VM or take whatever
> action is needed to return oVirt to operation… Hate to keep asking this
> question, but what’s the correct action at this point?
>
>
>
> ovirt-ha-agent should always restart it for you after a few minutes but
> the point is that the network configuration seems to be not that stable.
>
>
>
> I know from another thread that you are trying to deploy hosted-engine
> over GlusterFS in an hyperconverged way and this, as I said, is currently
> not supported.
>
> I think that it can also require some specific configuration on the network
> side.
>
>
> For hyperconverged gluster+engine , it should work without any specific
> configuration on network side. However if the network is flaky, it is
> possible that there are errors with gluster volume access. Could you
> provide the ovirt-ha-agent logs as well as gluster mount logs?
>
>
>
>
> Adding Sahina and Dan here.
>
>
>
> Thanks, again,
> Will
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
>
>
>
>
>
Restart of vdsmd
by gflwqs gflwqs
Hi list,
Do I need to put the host into maintenance before restarting vdsmd if I need
to change a parameter in /etc/vdsm/vdsm.conf?
Thanks!
Christian
iLO2
by Eriks Goodwin
I have been tinkering with various settings and configurations trying to get power management working properly--but it keeps giving me "Fail:" (without any details) every time I test the settings. The servers I am using are HP Proliant DL380 G6 with integrated iLO2. Any tips?
I can use the same credentials to log into the system via the web interface--so I'm sure I have the address, username, and passwd right. :-)
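One way to take the engine's UI out of the picture is to drive the fence agent by hand from a host; fence_ilo is the agent used for iLO. A hedged dry-run sketch that only assembles the command (host and user are placeholders, and iLO2 firmware may need extra options), as a starting point for testing outside the UI:

```python
# Dry-run sketch: assemble (but do not execute) a manual fence_ilo test.
# Address/user/password are placeholders; a "status" action only queries
# power state, so it is a safe first test before on/off/reboot.
def fence_ilo_cmd(addr, user, status_only=True):
    cmd = ["fence_ilo", "-a", addr, "-l", user, "-p", "PASSWORD"]
    if status_only:
        cmd += ["-o", "status"]  # query power state without acting on it
    return cmd

print(" ".join(fence_ilo_cmd("ilo2.example.com", "admin")))
```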