Change compatibility level for data center/cluster during upgrade - what exactly are we changing?
by Patrick Chiang
Hi all,
From the upgrade procedure, after we complete most of the upgrade
tasks we have to change the compatibility level of the cluster and
the data center, and then restart all guest VMs afterwards. According
to the procedure, no downtime seems to be expected from these two
changes.
I wonder what exactly we are changing (a setting in the database/a
file?) and, most important of all, whether there will be a few
seconds of downtime when we make those changes to the "data center"?
I am worried that a data center change could lead to storage domain
downtime and eventually force hypervisors and guest VMs to restart
earlier than expected. (My hosts/guest VMs directly use HP 3PAR
storage as boot disks/data disks...)
I hope someone can share their experience with this.
Thank you,
Patrick Chiang
How to set up a (rh)el8 machine for running OST
by Marcin Sobczyk
Hi All,
there are multiple pieces of information floating around on how to
set up a machine for running OST. Some of them are outdated (like
dealing with el7), some are more recent, but still a bit messy.
Not long ago, in an email conversation, Milan presented an Ansible
playbook with the steps necessary to do that. We picked up the
playbook, tweaked it a bit, wrote a convenience shell script wrapper
that runs it, and pushed that into the OST project [1].
This script, along with the playbook, should be our
single-source-of-truth, one-stop solution for the job. It has been
tested by a couple of people and proved able to set up everything on
a bare (rh)el8 machine. If you encounter any problems with the
script, please report them on the devel mailing list, report them
directly to me, or simply file a patch.
Let's keep it maintained.
Regards, Marcin
[1] https://gerrit.ovirt.org/#/c/111749/
virt-sparsify failed (was: [oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 479 - Failure!)
by Yedidyah Bar David
Hi all,
On Mon, Oct 12, 2020 at 5:17 AM <jenkins(a)jenkins.phx.ovirt.org> wrote:
>
> Project: https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_nightly/
> Build: https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
Above failed with:
https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
vdsm.log has:
https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
2020-10-11 22:05:14,695-0400 INFO (jsonrpc/1) [api.host] FINISH
getJobs return={'jobs': {'05eaea44-7e4c-4442-9926-2bcb696520f1':
{'id': '05eaea44-7e4c-4442-9926-2bcb696520f1', 'status': 'failed',
'description': 'sparsify_volume', 'job_type': 'storage', 'error':
{'code': 100, 'message': 'General Exception: (\'Command
[\\\'/usr/bin/virt-sparsify\\\', \\\'--machine-readable\\\',
\\\'--in-place\\\',
\\\'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/8b292c13-fd8a-4a7c-903c-5724ec742c10/images/a367c179-2ac9-4930-abeb-848229f81c97/515fcf06-8743-45d1-9af8-61a0c48e8c67\\\']
failed with rc=1 out=b\\\'3/12\\\\n{ "message": "libguestfs error:
guestfs_launch failed.\\\\\\\\nThis usually means the libguestfs
appliance failed to start or crashed.\\\\\\\\nDo:\\\\\\\\n export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\\\\\\\\nand run the command
again. For further information, read:\\\\\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\\\\\nYou
can also run \\\\\\\'libguestfs-test-tool\\\\\\\' and post the
*complete* output\\\\\\\\ninto a bug report or message to the
libguestfs mailing list.", "timestamp":
"2020-10-11T22:05:08.397538670-04:00", "type": "error" }\\\\n\\\'
err=b"virt-sparsify: error: libguestfs error: guestfs_launch
failed.\\\\nThis usually means the libguestfs appliance failed to
start or crashed.\\\\nDo:\\\\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\\\\nand run the command again. For further
information, read:\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\nYou
can also run \\\'libguestfs-test-tool\\\' and post the *complete*
output\\\\ninto a bug report or message to the libguestfs mailing
list.\\\\n\\\\nIf reporting bugs, run virt-sparsify with debugging
enabled and include the \\\\ncomplete output:\\\\n\\\\n virt-sparsify
-v -x [...]\\\\n"\',)'}}}, 'status': {'code': 0, 'message': 'Done'}}
from=::ffff:192.168.201.4,43318,
flow_id=365642f4-2fe2-45df-937a-f4ca435eea38 (api:54)
2020-10-11 22:05:14,695-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
Return 'Host.getJobs' in bridge with
{'05eaea44-7e4c-4442-9926-2bcb696520f1': {'id':
'05eaea44-7e4c-4442-9926-2bcb696520f1', 'status': 'failed',
'description': 'sparsify_volume', 'job_type': 'storage', 'error':
{'code': 100, 'message': 'General Exception: (\'Command
[\\\'/usr/bin/virt-sparsify\\\', \\\'--machine-readable\\\',
\\\'--in-place\\\',
\\\'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/8b292c13-fd8a-4a7c-903c-5724ec742c10/images/a367c179-2ac9-4930-abeb-848229f81c97/515fcf06-8743-45d1-9af8-61a0c48e8c67\\\']
failed with rc=1 out=b\\\'3/12\\\\n{ "message": "libguestfs error:
guestfs_launch failed.\\\\\\\\nThis usually means the libguestfs
appliance failed to start or crashed.\\\\\\\\nDo:\\\\\\\\n export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\\\\\\\\nand run the command
again. For further information, read:\\\\\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\\\\\nYou
can also run \\\\\\\'libguestfs-test-tool\\\\\\\' and post the
*complete* output\\\\\\\\ninto a bug report or message to the
libguestfs mailing list.", "timestamp":
"2020-10-11T22:05:08.397538670-04:00", "type": "error" }\\\\n\\\'
err=b"virt-sparsify: error: libguestfs error: guestfs_launch
failed.\\\\nThis usually means the libguestfs appliance failed to
start or crashed.\\\\nDo:\\\\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\\\\nand run the command again. For further
information, read:\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\nYou
can also run \\\'libguestfs-test-tool\\\' and post the *complete*
output\\\\ninto a bug report or message to the libguestfs mailing
list.\\\\n\\\\nIf reporting bugs, run virt-sparsify with debugging
enabled and include the \\\\ncomplete output:\\\\n\\\\n virt-sparsify
-v -x [...]\\\\n"\',)'}}} (__init__:356)
/var/log/messages has:
Oct 11 22:04:51 lago-basic-suite-master-host-0 kvm[80601]: 1 guest now active
Oct 11 22:05:06 lago-basic-suite-master-host-0 journal[80557]: Domain
id=1 name='guestfs-hl0ntvn92rtkk2u0'
uuid=05ea5a53-562f-49f8-a8ca-76b45c5325b4 is tainted: custom-argv
Oct 11 22:05:06 lago-basic-suite-master-host-0 journal[80557]: Domain
id=1 name='guestfs-hl0ntvn92rtkk2u0'
uuid=05ea5a53-562f-49f8-a8ca-76b45c5325b4 is tainted: host-cpu
Oct 11 22:05:06 lago-basic-suite-master-host-0 kvm[80801]: 2 guests now active
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]:
internal error: End of file from qemu monitor
Oct 11 22:05:08 lago-basic-suite-master-host-0 kvm[80807]: 1 guest now active
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
resolve symlink /tmp/libguestfseTG8xF/console.sock: No such file or
directory
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
resolve symlink /tmp/libguestfseTG8xF/guestfsd.sock: No such file or
directory
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
lookup default selinux label for /tmp/libguestfs1WkcF7/overlay1.qcow2
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
lookup default selinux label for
/var/tmp/.guestfs-36/appliance.d/kernel
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
lookup default selinux label for
/var/tmp/.guestfs-36/appliance.d/initrd
Oct 11 22:05:08 lago-basic-suite-master-host-0 vdsm[74096]: ERROR Job
'05eaea44-7e4c-4442-9926-2bcb696520f1' failed#012Traceback (most
recent call last):#012 File
"/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run#012
self._run()#012 File
"/usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/sparsify_volume.py",
line 57, in _run#012
virtsparsify.sparsify_inplace(self._vol_info.path)#012 File
"/usr/lib/python3.6/site-packages/vdsm/virtsparsify.py", line 40, in
sparsify_inplace#012 commands.run(cmd)#012 File
"/usr/lib/python3.6/site-packages/vdsm/common/commands.py", line 101,
in run#012 raise cmdutils.Error(args, p.returncode, out,
err)#012vdsm.common.cmdutils.Error: Command ['/usr/bin/virt-sparsify',
'--machine-readable', '--in-place',
'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/8b292c13-fd8a-4a7c-903c-5724ec742c10/images/a367c179-2ac9-4930-abeb-848229f81c97/515fcf06-8743-45d1-9af8-61a0c48e8c67']
failed with rc=1 out=b'3/12\n{ "message": "libguestfs error:
guestfs_launch failed.\\nThis usually means the libguestfs appliance
failed to start or crashed.\\nDo:\\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\\nand run the command again. For further
information, read:\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\nYou
can also run \'libguestfs-test-tool\' and post the *complete*
output\\ninto a bug report or message to the libguestfs mailing
list.", "timestamp": "2020-10-11T22:05:08.397538670-04:00", "type":
"error" }\n' err=b"virt-sparsify: error: libguestfs error:
guestfs_launch failed.\nThis usually means the libguestfs appliance
failed to start or crashed.\nDo:\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\nand run the command again. For further
information, read:\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou can
also run 'libguestfs-test-tool' and post the *complete* output\ninto a
bug report or message to the libguestfs mailing list.\n\nIf reporting
bugs, run virt-sparsify with debugging enabled and include the
\ncomplete output:\n\n virt-sparsify -v -x [...]\n"
The next run of the job (480) did finish successfully. I have no idea
whether it was already fixed by a patch or is simply a random/env
issue.
Is it possible to pass LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1 without
patching vdsm? I am not sure. In any case, if this does not cause too
much excess debug logging, it would perhaps be better to always pass
it, to help analyze such failures retroactively. Or, patch
virt-sparsify/libguestfs/whatever to always log at least enough
information on failure even without passing these.
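For a manual reproduction outside vdsm, the same command from the log
above can be rerun with the debug variables set. A rough sketch (the
image path is the one from this failure, so adjust it for your
environment; this is not how vdsm itself would pass the variables):
import os
import subprocess

# Sketch: rerun the virt-sparsify command from the failed job with
# libguestfs debugging enabled. The path below is taken from the log
# above; replace it with the volume you want to inspect.
IMAGE = (
    "/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/"
    "8b292c13-fd8a-4a7c-903c-5724ec742c10/images/"
    "a367c179-2ac9-4930-abeb-848229f81c97/"
    "515fcf06-8743-45d1-9af8-61a0c48e8c67"
)

env = dict(os.environ, LIBGUESTFS_DEBUG="1", LIBGUESTFS_TRACE="1")
cmd = ["/usr/bin/virt-sparsify", "--machine-readable", "--in-place", IMAGE]

result = subprocess.run(
    cmd, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    universal_newlines=True,
)
print("rc:", result.returncode)
print(result.stdout)
print(result.stderr)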
Best regards,
> Build Number: 479
> Build Status: Failure
> Triggered By: Started by timer
>
> -------------------------------------
> Changes Since Last Success:
> -------------------------------------
> Changes for Build #479
> [hbraha] network: bond active slave test
>
>
>
>
> -----------------
> Failed Tests:
> -----------------
> 1 tests failed.
> FAILED: basic-suite-master.test-scenarios.004_basic_sanity.test_sparsify_disk1
>
> Error Message:
> AssertionError: False != True after 600 seconds
>
> Stack Trace:
> api_v4 = <ovirtsdk4.Connection object at 0x7fe717c60e50>
>
> @order_by(_TEST_LIST)
> def test_sparsify_disk1(api_v4):
> engine = api_v4.system_service()
> disk_service = test_utils.get_disk_service(engine, DISK1_NAME)
> with test_utils.TestEvent(engine, 1325): # USER_SPARSIFY_IMAGE_START event
> disk_service.sparsify()
>
> with test_utils.TestEvent(engine, 1326): # USER_SPARSIFY_IMAGE_FINISH_SUCCESS
> > pass
>
> ../basic-suite-master/test-scenarios/004_basic_sanity.py:295:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> /usr/lib64/python2.7/contextlib.py:24: in __exit__
> self.gen.next()
> ../ost_utils/ost_utils/engine_utils.py:44: in wait_for_event
> lambda:
> ../ost_utils/ost_utils/assertions.py:97: in assert_true_within_long
> assert_equals_within_long(func, True, allowed_exceptions)
> ../ost_utils/ost_utils/assertions.py:82: in assert_equals_within_long
> func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> func = <function <lambda> at 0x7fe7176dfb18>, value = True, timeout = 600
> allowed_exceptions = [], initial_wait = 0
> error_message = 'False != True after 600 seconds'
>
> def assert_equals_within(
> func, value, timeout, allowed_exceptions=None, initial_wait=10,
> error_message=None
> ):
> allowed_exceptions = allowed_exceptions or []
> with _EggTimer(timeout) as timer:
> while not timer.elapsed():
> try:
> res = func()
> if res == value:
> return
> except Exception as exc:
> if _instance_of_any(exc, allowed_exceptions):
> time.sleep(3)
> continue
>
> LOGGER.exception("Unhandled exception in %s", func)
> raise
>
> if initial_wait == 0:
> time.sleep(3)
> else:
> time.sleep(initial_wait)
> initial_wait = 0
> try:
> if error_message is None:
> error_message = '%s != %s after %s seconds' % (res, value, timeout)
> > raise AssertionError(error_message)
> E AssertionError: False != True after 600 seconds
>
> ../ost_utils/ost_utils/assertions.py:60: AssertionError
> _______________________________________________
> Infra mailing list -- infra(a)ovirt.org
> To unsubscribe send an email to infra-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/infra@ovirt.org/message/65FECSE7EBW...
--
Didi
Testing image transfer and backup with OST environment
by Nir Soffer
I want to share useful info from the OST hackathon we had this week.
Image transfer must work with real hostnames to allow server
certificate verification.
Inside the OST environment, the engine and host names are resolvable,
but on the host (or VM) running OST, the names are not available.
This can be fixed by adding the engine and hosts to /etc/hosts like this:
$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.2 engine
192.168.200.3 lago-basic-suite-master-host-0
192.168.200.4 lago-basic-suite-master-host-1
It would be nice if this were automated by OST. You can get the details using:
$ cd src/ovirt-system-tests/deployment-xxx
$ lago status
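Until that is automated, a small helper could do it. A minimal
sketch, hardcoding the example addresses from above (a real version
would take them from the lago status output instead):
# Sketch: append the OST names to /etc/hosts if they are missing.
HOSTS = {
    "192.168.200.2": "engine",
    "192.168.200.3": "lago-basic-suite-master-host-0",
    "192.168.200.4": "lago-basic-suite-master-host-1",
}

def add_ost_hosts(path="/etc/hosts"):
    with open(path) as f:
        known = f.read().split()
    with open(path, "a") as f:
        for ip, name in HOSTS.items():
            if name not in known:
                f.write("%s %s\n" % (ip, name))

if __name__ == "__main__":
    add_ost_hosts()  # writing /etc/hosts requires root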
OST keeps the deployment directory in the source directory. Be
careful if you like to run 'git clean -dxf', since it will delete the
deployment and you will have to kill the VMs manually later.
The next thing we need is the engine CA certificate. It can be fetched like this:
$ curl -k 'https://engine/ovirt-engine/services/pki-resource?resource=ca-certificate...'
> ca.pem
I would expect OST to do this and put the file in the deployment directory.
To upload or download images, back up VMs, or use the other modern
examples from the SDK, you need a configuration file like this:
$ cat ~/.config/ovirt.conf
[engine]
engine_url = https://engine
username = admin@internal
password = 123
cafile = ca.pem
With this, uploading from the directory where ca.pem is located will
work. If you want it to work from any directory, use an absolute path
to the file.
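The SDK examples pick this file up via the -c option. For your own
scripts, the same file can be consumed with configparser; a minimal
sketch, assuming the [engine] section shown above:
import configparser
import os

import ovirtsdk4 as sdk

# Sketch: build an SDK connection from ~/.config/ovirt.conf.
cfg = configparser.ConfigParser()
cfg.read(os.path.expanduser("~/.config/ovirt.conf"))
engine = cfg["engine"]

connection = sdk.Connection(
    url=engine["engine_url"] + "/ovirt-engine/api",
    username=engine["username"],
    password=engine["password"],
    ca_file=engine["cafile"],
)

api = connection.system_service().get()
print("connected to", api.product_info.name)
connection.close()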
I created a test image using qemu-img and qemu-io:
$ qemu-img create -f qcow2 test.qcow2 1g
To write some data to the test image we can use qemu-io. This writes
64k of data (b"\xf0" * 64 * 1024) at offset 1 MiB:
$ qemu-io -f qcow2 -c "write -P 240 1m 64k" test.qcow2
Since this image contains only 64k of data, uploading it should be instant.
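If you want to double-check the data before uploading, qemu-io can
read the range back and compare it against the same pattern. A sketch
(read's -P option verifies the pattern, 240 == 0xf0):
import subprocess

# Sketch: verify that the 64k written above really contains the 0xf0 pattern.
result = subprocess.run(
    ["qemu-io", "-f", "qcow2", "-c", "read -P 240 1m 64k", "test.qcow2"],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
    universal_newlines=True,
)
print(result.stdout)
if "Pattern verification failed" in result.stdout:
    raise SystemExit("test image does not contain the expected data")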
The last part we need is the imageio client package:
$ dnf install ovirt-imageio-client
To upload the image, we need at least one host up and the storage
domains created. I did not find a way to prepare OST for just that,
so I simply ran this after run_tests completed. It took about an
hour.
To upload the image to a raw sparse disk we can use:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
-c engine --sd-name nfs --disk-sparse --disk-format raw test.qcow2
[ 0.0 ] Checking image...
[ 0.0 ] Image format: qcow2
[ 0.0 ] Disk format: raw
[ 0.0 ] Disk content type: data
[ 0.0 ] Disk provisioned size: 1073741824
[ 0.0 ] Disk initial size: 1073741824
[ 0.0 ] Disk name: test.raw
[ 0.0 ] Disk backup: False
[ 0.0 ] Connecting...
[ 0.0 ] Creating disk...
[ 36.3 ] Disk ID: 26df08cf-3dec-47b9-b776-0e2bc564b6d5
[ 36.3 ] Creating image transfer...
[ 38.2 ] Transfer ID: de8cfac9-ead2-4304-b18b-a1779d647716
[ 38.2 ] Transfer host name: lago-basic-suite-master-host-1
[ 38.2 ] Uploading image...
[ 100.00% ] 1.00 GiB, 1.79 seconds, 571.50 MiB/s
[ 40.0 ] Finalizing image transfer...
[ 44.1 ] Upload completed successfully
I uploaded this before I added the hosts to /etc/hosts, so the upload
was done via the proxy.
Yes, it took 36 seconds to create the disk.
To download the disk use:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py
-c engine 5ac63c72-6296-46b1-a068-b1039c8ecbd1 downlaod.qcow2
[ 0.0 ] Connecting...
[ 0.2 ] Creating image transfer...
[ 1.6 ] Transfer ID: a99e2a43-8360-4661-81dc-02828a88d586
[ 1.6 ] Transfer host name: lago-basic-suite-master-host-1
[ 1.6 ] Downloading image...
[ 100.00% ] 1.00 GiB, 0.32 seconds, 3.10 GiB/s
[ 1.9 ] Finalizing image transfer...
We can verify the transfers using checksums. Here we create a checksum
of the remote
disk:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_disk.py
-c engine 26df08cf-3dec-47b9-b776-0e2bc564b6d5
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum":
"a79a1efae73484e0218403e6eb715cdf109c8e99c2200265b779369339cf347b"
}
And a checksum of the downloaded image; they should match:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_image.py
downlaod.qcow2
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum": "a79a1efae73484e0218403e6eb715cdf109c8e99c2200265b779369339cf347b"
}
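The comparison can also be scripted. A sketch that runs both example
scripts and compares the reported checksums, assuming they print JSON
exactly as in the transcripts above:
import json
import subprocess

EXAMPLES = "/usr/share/doc/python3-ovirt-engine-sdk4/examples"

def checksum(script, *args):
    # Run one of the SDK example scripts and parse its JSON output.
    out = subprocess.check_output(
        ["python3", EXAMPLES + "/" + script] + list(args),
        universal_newlines=True,
    )
    return json.loads(out)["checksum"]

remote = checksum("checksum_disk.py", "-c", "engine",
                  "26df08cf-3dec-47b9-b776-0e2bc564b6d5")
local = checksum("checksum_image.py", "downlaod.qcow2")

print("match" if remote == local else "MISMATCH")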
The same upload to the iSCSI domain, using qcow2 format:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
-c engine --sd-name iscsi --disk-sparse --disk-format qcow2 test.qcow2
[ 0.0 ] Checking image...
[ 0.0 ] Image format: qcow2
[ 0.0 ] Disk format: cow
[ 0.0 ] Disk content type: data
[ 0.0 ] Disk provisioned size: 1073741824
[ 0.0 ] Disk initial size: 458752
[ 0.0 ] Disk name: test.qcow2
[ 0.0 ] Disk backup: False
[ 0.0 ] Connecting...
[ 0.0 ] Creating disk...
[ 27.8 ] Disk ID: e7ef253e-7baa-4d4a-a9b2-1a6b7db13f41
[ 27.8 ] Creating image transfer...
[ 30.0 ] Transfer ID: 88328857-ac99-4ee1-9618-6b3cd14a7db8
[ 30.0 ] Transfer host name: lago-basic-suite-master-host-0
[ 30.0 ] Uploading image...
[ 100.00% ] 1.00 GiB, 0.31 seconds, 3.28 GiB/s
[ 30.3 ] Finalizing image transfer...
[ 35.4 ] Upload completed successfully
Again, creating the disk is very slow; I am not sure why. Probably
having a storage server on a nested VM is not a good idea.
We can compare the checksum with the source image, since checksums
are computed from the guest content:
[nsoffer@ost ~]$ python3
/usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_disk.py -c
engine e7ef253e-7baa-4d4a-a9b2-1a6b7db13f41
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum":
"a79a1efae73484e0218403e6eb715cdf109c8e99c2200265b779369339cf347b"
}
Finally, we can try real images using virt-builder:
$ virt-builder fedora-32
This will create a new Fedora 32 server image in the current
directory. See --help for many useful options to create a different
format, set the root password, or install packages.
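For example, scripting a build with some of those options could look
like this (a sketch; the output name and password are made up, and
the upload below still uses the plain fedora-32.img produced by the
command above):
import subprocess

# Sketch: build a Fedora 32 qcow2 image with a known root password.
subprocess.run(
    [
        "virt-builder", "fedora-32",
        "--format", "qcow2",
        "--root-password", "password:secret",
        "-o", "fedora-32.qcow2",
    ],
    check=True,
)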
Uploading this image is much slower:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
-c engine --sd-name nfs --disk-sparse --disk-format raw fedora-32.img
[ 0.0 ] Checking image...
[ 0.0 ] Image format: raw
[ 0.0 ] Disk format: raw
[ 0.0 ] Disk content type: data
[ 0.0 ] Disk provisioned size: 6442450944
[ 0.0 ] Disk initial size: 6442450944
[ 0.0 ] Disk name: fedora-32.raw
[ 0.0 ] Disk backup: False
[ 0.0 ] Connecting...
[ 0.0 ] Creating disk...
[ 36.8 ] Disk ID: b17126f3-fa03-4c22-8f59-ef599b64a42e
[ 36.8 ] Creating image transfer...
[ 38.5 ] Transfer ID: fe82fb86-b87a-4e49-b9cd-f1f4334e7852
[ 38.5 ] Transfer host name: lago-basic-suite-master-host-0
[ 38.5 ] Uploading image...
[ 100.00% ] 6.00 GiB, 99.71 seconds, 61.62 MiB/s
[ 138.2 ] Finalizing image transfer...
[ 147.8 ] Upload completed successfully
In the current state of OST, we should avoid such long tests.
Using backup_vm.py and the other examples should work in the same way.
I posted this patch to improve NFS performance, please review:
https://gerrit.ovirt.org/c/112067/
Nir
Reply: Re: Host's status how to change to UP when I execute Management--> Active command
by 李伏琼
Thanks very much.
-------------- Original message --------------
From: "Liran Rotenberg" <lrotenbe(a)redhat.com>
Sent: Thursday, November 5, 2020, 11:52 PM
To: "lifuqiong(a)sunyainfo.com" <lifuqiong(a)sunyainfo.com>
Cc: "users" <users(a)ovirt.org>; "devel" <devel(a)ovirt.org>
Subject: [ovirt-devel] Re: Host's status how to change to UP when I execute Management--> Active command
-----------------------------------
Hi,
The engine will monitor that host. Once it's reachable and we can get information from it, the status will change accordingly.
Please check out the HostMonitoring class.
On Thu, Nov 5, 2020 at 11:57 AM lifuqiong(a)sunyainfo.com <lifuqiong(a)sunyainfo.com> wrote:
Hi,
I checked out ovirt-engine's code on branch 4.2. When I execute the Host's Management --> Active command, ovirt-engine just sets the vds's status to "Unassigned", but how does the vds's status change to "UP" in ovirt-engine? The code in ActiveVdsCommand.executeCommand() is as follows:
protected void executeCommand() {
    final VDS vds = getVds();
    try (EngineLock monitoringLock = acquireMonitorLock("Activate host")) {
        executionHandler.updateSpecificActionJobCompleted(vds.getId(), ActionType.MaintenanceVds, false);
        setSucceeded(setVdsStatus(VDSStatus.Unassigned).getSucceeded());
        if (getSucceeded()) {
            TransactionSupport.executeInNewTransaction(() -> {
                // set network to operational / non-operational
                List<Network> networks = networkDao.getAllForCluster(vds.getClusterId());
                networkClusterHelper.setStatus(vds.getClusterId(), networks);
                return null;
            });
            // Start glusterd service on the node, which would have been stopped due to maintenance
            if (vds.getClusterSupportsGlusterService()) {
                runVdsCommand(VDSCommandType.ManageGlusterService,
                        new GlusterServiceVDSParameters(vds.getId(), Arrays.asList("glusterd"), "restart"));
                // starting vdo service
                GlusterStatus isRunning = glusterUtil.isVDORunning(vds.getId());
                switch (isRunning) {
                case DOWN:
                    log.info("VDO service is down in host : '{}' with id '{}', starting VDO service",
                            vds.getHostName(),
                            vds.getId());
                    startVDOService(vds);
                    break;
                case UP:
                    log.info("VDO service is up in host : '{}' with id '{}', skipping starting of VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                case UNKNOWN:
                    log.info("VDO service is not installed host : '{}' with id '{}', ignoring to start VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                }
            }
        }
    }
}
Your Sincerely
Mark Lee
_______________________________________________
Devel mailing list -- devel(a)ovirt.org
To unsubscribe send an email to devel-leave(a)ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/36745S77NLH...
Re: Host's status how to change to UP when I execute Management--> Active command
by Liran Rotenberg
Hi,
The engine will monitor that host. Once it's reachable and we can get
information from it, the status will change accordingly.
Please check out the HostMonitoring class.
On Thu, Nov 5, 2020 at 11:57 AM lifuqiong(a)sunyainfo.com <
lifuqiong(a)sunyainfo.com> wrote:
>
> Hi,
> I checked out ovirt-engine's code on branch 4.2. When I execute the Host's
> Management --> Active command, ovirt-engine just sets the vds's status to
> "Unassigned", but how does the vds's status change to "UP" in ovirt-engine?
> The code in ActiveVdsCommand.executeCommand() is as follows:
> protected void executeCommand() {
>
> final VDS vds = getVds();
> try (EngineLock monitoringLock = acquireMonitorLock("Activate
> host")) {
> executionHandler.updateSpecificActionJobCompleted(vds.getId(),
> ActionType.MaintenanceVds, false);
>
> setSucceeded(setVdsStatus(VDSStatus.Unassigned).getSucceeded());
>
> if (getSucceeded()) {
> TransactionSupport.executeInNewTransaction(() -> {
> // set network to operational / non-operational
> List<Network> networks =
> networkDao.getAllForCluster(vds.getClusterId());
> networkClusterHelper.setStatus(vds.getClusterId(),
> networks);
> return null;
> });
>
> // Start glusterd service on the node, which would haven
> been stopped due to maintenance
> if (vds.getClusterSupportsGlusterService()) {
> runVdsCommand(VDSCommandType.ManageGlusterService,
> new GlusterServiceVDSParameters(vds.getId(),
> Arrays.asList("glusterd"), "restart"));
> // starting vdo service
> GlusterStatus isRunning =
> glusterUtil.isVDORunning(vds.getId());
> switch (isRunning) {
> case DOWN:
> log.info("VDO service is down in host : '{}' with
> id '{}', starting VDO service",
> vds.getHostName(),
> vds.getId());
> startVDOService(vds);
> break;
> case UP:
> log.info("VDO service is up in host : '{}' with
> id '{}', skipping starting of VDO service",
> vds.getHostName(),
> vds.getId());
> break;
> case UNKNOWN:
> log.info("VDO service is not installed host :
> '{}' with id '{}', ignoring to start VDO service",
> vds.getHostName(),
> vds.getId());
> break;
> }
>
> }
> }
> }
> }
>
>
> Your Sincerely
> Mark Lee
> _______________________________________________
> Devel mailing list -- devel(a)ovirt.org
> To unsubscribe send an email to devel-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/36745S77NLH...
>
Building lago from source
by Nir Soffer
I'm trying to test this with OST:
https://github.com/lago-project/lago/pull/815
So I cloned the project on the OST VM and built RPMs:
make
make rpm
The result is:
lago-1.0.2-1.el8.noarch.rpm python3-lago-1.0.2-1.el8.noarch.rpm
But the lago version installed by setup_for_ost.sh is:
$ rpm -q lago
lago-1.0.11-1.el8.noarch
I tried installing lago from master, and then lago_init fails:
$ lago_init /usr/share/ost-images/el8-engine-installed.qcow2 -k
/usr/share/ost-images/el8_id_rsa
Using images ost-images-el8-host-installed-1-202011021248.x86_64,
ost-images-el8-engine-installed-1-202011021248.x86_64 containing
ovirt-engine-4.4.4-0.0.master.20201031195930.git8f858d6c01d.el8.noarch
vdsm-4.40.35.1-1.el8.x86_64
usage: lago [-h] [-l {info,debug,error,warning}] [--logdepth LOGDEPTH]
[--version] [--out-format {default,flat,json,yaml}]
[--prefix-path PREFIX_PATH] [--workdir-path WORKDIR_PATH]
[--prefix-name PREFIX_NAME] [--ssh-user SSH_USER]
[--ssh-password SSH_PASSWORD] [--ssh-tries SSH_TRIES]
[--ssh-timeout SSH_TIMEOUT] [--libvirt_url LIBVIRT_URL]
[--libvirt-user LIBVIRT_USER]
[--libvirt-password LIBVIRT_PASSWORD]
[--default_vm_type DEFAULT_VM_TYPE]
[--default_vm_provider DEFAULT_VM_PROVIDER]
[--default_root_password DEFAULT_ROOT_PASSWORD]
[--lease_dir LEASE_DIR] [--reposync-dir REPOSYNC_DIR]
[--ignore-warnings]
VERB ...
lago: error: unrecognized arguments: --ssh-key
/home/nsoffer/src/ovirt-system-tests/deployment-basic-suite-master
/home/nsoffer/src/ovirt-system-tests/basic-suite-master/LagoInitFile
Do we use a customized lago version for OST? Where is the source?
Nir
Host's status how to change to UP when I execute Management--> Active command
by lifuqiong@sunyainfo.com
Hi,
I checked out ovirt-engine's code on branch 4.2. When I execute the Host's Management --> Active command, ovirt-engine just sets the vds's status to "Unassigned", but how does the vds's status change to "UP" in ovirt-engine? The code in ActiveVdsCommand.executeCommand() is as follows:
protected void executeCommand() {
    final VDS vds = getVds();
    try (EngineLock monitoringLock = acquireMonitorLock("Activate host")) {
        executionHandler.updateSpecificActionJobCompleted(vds.getId(), ActionType.MaintenanceVds, false);
        setSucceeded(setVdsStatus(VDSStatus.Unassigned).getSucceeded());
        if (getSucceeded()) {
            TransactionSupport.executeInNewTransaction(() -> {
                // set network to operational / non-operational
                List<Network> networks = networkDao.getAllForCluster(vds.getClusterId());
                networkClusterHelper.setStatus(vds.getClusterId(), networks);
                return null;
            });
            // Start glusterd service on the node, which would have been stopped due to maintenance
            if (vds.getClusterSupportsGlusterService()) {
                runVdsCommand(VDSCommandType.ManageGlusterService,
                        new GlusterServiceVDSParameters(vds.getId(), Arrays.asList("glusterd"), "restart"));
                // starting vdo service
                GlusterStatus isRunning = glusterUtil.isVDORunning(vds.getId());
                switch (isRunning) {
                case DOWN:
                    log.info("VDO service is down in host : '{}' with id '{}', starting VDO service",
                            vds.getHostName(),
                            vds.getId());
                    startVDOService(vds);
                    break;
                case UP:
                    log.info("VDO service is up in host : '{}' with id '{}', skipping starting of VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                case UNKNOWN:
                    log.info("VDO service is not installed host : '{}' with id '{}', ignoring to start VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                }
            }
        }
    }
}
Your Sincerely
Mark Lee