Change compatibility level for data center/cluster during upgrade - what exactly are we changing?
by Patrick Chiang
Hi all,
According to the upgrade procedure, after completing most of the upgrade
tasks we have to change the compatibility level of the cluster and the data
center, and then restart all guest VMs. The procedure suggests that these two
changes cause no downtime.
I wonder what exactly we are changing (a setting in the database or in a file?)
and, most importantly, whether there will be a few seconds of downtime
when we apply the change to the data center.
I am worried that a data center change could lead to storage
domain downtime and eventually force the hypervisors and guest VMs to restart
earlier than planned. (My hosts and guest VMs use HP 3PAR storage directly for
boot disks and data disks.)
I hope someone can share their experience with this.
Thank you,
Patrick Chiang
4 years, 2 months
How to set up a (rh)el8 machine for running OST
by Marcin Sobczyk
Hi All,
there are multiple pieces of information floating around on how to set
up a machine for running OST. Some of them are outdated (like those dealing
with el7), some of them are more recent, but still a bit messy.
Not long ago, in an email conversation, Milan presented an ansible
playbook that provides the steps necessary to do that. We've picked up the
playbook, tweaked it a bit, made a convenience shell script wrapper that runs
it, and pushed that into the OST project [1].
This script, along with the playbook, should be our single-source-of-truth,
one-stop solution for the job. It's been tested by a couple of people and
proved able to set up everything on a bare (rh)el8 machine. If you encounter
any problems with the script, please either report them on the devel mailing
list or directly to me, or simply file a patch.
Let's keep it maintained.
Regards, Marcin
[1] https://gerrit.ovirt.org/#/c/111749/
4 years, 2 months
virt-sparsify failed (was: [oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 479 - Failure!)
by Yedidyah Bar David
Hi all,
On Mon, Oct 12, 2020 at 5:17 AM <jenkins(a)jenkins.phx.ovirt.org> wrote:
>
> Project: https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_nightly/
> Build: https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
Above failed with:
https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
vdsm.log has:
https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_night...
2020-10-11 22:05:14,695-0400 INFO (jsonrpc/1) [api.host] FINISH
getJobs return={'jobs': {'05eaea44-7e4c-4442-9926-2bcb696520f1':
{'id': '05eaea44-7e4c-4442-9926-2bcb696520f1', 'status': 'failed',
'description': 'sparsify_volume', 'job_type': 'storage', 'error':
{'code': 100, 'message': 'General Exception: (\'Command
[\\\'/usr/bin/virt-sparsify\\\', \\\'--machine-readable\\\',
\\\'--in-place\\\',
\\\'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/8b292c13-fd8a-4a7c-903c-5724ec742c10/images/a367c179-2ac9-4930-abeb-848229f81c97/515fcf06-8743-45d1-9af8-61a0c48e8c67\\\']
failed with rc=1 out=b\\\'3/12\\\\n{ "message": "libguestfs error:
guestfs_launch failed.\\\\\\\\nThis usually means the libguestfs
appliance failed to start or crashed.\\\\\\\\nDo:\\\\\\\\n export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\\\\\\\\nand run the command
again. For further information, read:\\\\\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\\\\\nYou
can also run \\\\\\\'libguestfs-test-tool\\\\\\\' and post the
*complete* output\\\\\\\\ninto a bug report or message to the
libguestfs mailing list.", "timestamp":
"2020-10-11T22:05:08.397538670-04:00", "type": "error" }\\\\n\\\'
err=b"virt-sparsify: error: libguestfs error: guestfs_launch
failed.\\\\nThis usually means the libguestfs appliance failed to
start or crashed.\\\\nDo:\\\\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\\\\nand run the command again. For further
information, read:\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\nYou
can also run \\\'libguestfs-test-tool\\\' and post the *complete*
output\\\\ninto a bug report or message to the libguestfs mailing
list.\\\\n\\\\nIf reporting bugs, run virt-sparsify with debugging
enabled and include the \\\\ncomplete output:\\\\n\\\\n virt-sparsify
-v -x [...]\\\\n"\',)'}}}, 'status': {'code': 0, 'message': 'Done'}}
from=::ffff:192.168.201.4,43318,
flow_id=365642f4-2fe2-45df-937a-f4ca435eea38 (api:54)
2020-10-11 22:05:14,695-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
Return 'Host.getJobs' in bridge with
{'05eaea44-7e4c-4442-9926-2bcb696520f1': {'id':
'05eaea44-7e4c-4442-9926-2bcb696520f1', 'status': 'failed',
'description': 'sparsify_volume', 'job_type': 'storage', 'error':
{'code': 100, 'message': 'General Exception: (\'Command
[\\\'/usr/bin/virt-sparsify\\\', \\\'--machine-readable\\\',
\\\'--in-place\\\',
\\\'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/8b292c13-fd8a-4a7c-903c-5724ec742c10/images/a367c179-2ac9-4930-abeb-848229f81c97/515fcf06-8743-45d1-9af8-61a0c48e8c67\\\']
failed with rc=1 out=b\\\'3/12\\\\n{ "message": "libguestfs error:
guestfs_launch failed.\\\\\\\\nThis usually means the libguestfs
appliance failed to start or crashed.\\\\\\\\nDo:\\\\\\\\n export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\\\\\\\\nand run the command
again. For further information, read:\\\\\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\\\\\nYou
can also run \\\\\\\'libguestfs-test-tool\\\\\\\' and post the
*complete* output\\\\\\\\ninto a bug report or message to the
libguestfs mailing list.", "timestamp":
"2020-10-11T22:05:08.397538670-04:00", "type": "error" }\\\\n\\\'
err=b"virt-sparsify: error: libguestfs error: guestfs_launch
failed.\\\\nThis usually means the libguestfs appliance failed to
start or crashed.\\\\nDo:\\\\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\\\\nand run the command again. For further
information, read:\\\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\\\nYou
can also run \\\'libguestfs-test-tool\\\' and post the *complete*
output\\\\ninto a bug report or message to the libguestfs mailing
list.\\\\n\\\\nIf reporting bugs, run virt-sparsify with debugging
enabled and include the \\\\ncomplete output:\\\\n\\\\n virt-sparsify
-v -x [...]\\\\n"\',)'}}} (__init__:356)
/var/log/messages has:
Oct 11 22:04:51 lago-basic-suite-master-host-0 kvm[80601]: 1 guest now active
Oct 11 22:05:06 lago-basic-suite-master-host-0 journal[80557]: Domain
id=1 name='guestfs-hl0ntvn92rtkk2u0'
uuid=05ea5a53-562f-49f8-a8ca-76b45c5325b4 is tainted: custom-argv
Oct 11 22:05:06 lago-basic-suite-master-host-0 journal[80557]: Domain
id=1 name='guestfs-hl0ntvn92rtkk2u0'
uuid=05ea5a53-562f-49f8-a8ca-76b45c5325b4 is tainted: host-cpu
Oct 11 22:05:06 lago-basic-suite-master-host-0 kvm[80801]: 2 guests now active
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]:
internal error: End of file from qemu monitor
Oct 11 22:05:08 lago-basic-suite-master-host-0 kvm[80807]: 1 guest now active
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
resolve symlink /tmp/libguestfseTG8xF/console.sock: No such file or
directoryKN<F3>L^?
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
resolve symlink /tmp/libguestfseTG8xF/guestfsd.sock: No such file or
directoryKN<F3>L^?
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
lookup default selinux label for /tmp/libguestfs1WkcF7/overlay1.qcow2
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
lookup default selinux label for
/var/tmp/.guestfs-36/appliance.d/kernel
Oct 11 22:05:08 lago-basic-suite-master-host-0 journal[80557]: cannot
lookup default selinux label for
/var/tmp/.guestfs-36/appliance.d/initrd
Oct 11 22:05:08 lago-basic-suite-master-host-0 vdsm[74096]: ERROR Job
'05eaea44-7e4c-4442-9926-2bcb696520f1' failed#012Traceback (most
recent call last):#012 File
"/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run#012
self._run()#012 File
"/usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/sparsify_volume.py",
line 57, in _run#012
virtsparsify.sparsify_inplace(self._vol_info.path)#012 File
"/usr/lib/python3.6/site-packages/vdsm/virtsparsify.py", line 40, in
sparsify_inplace#012 commands.run(cmd)#012 File
"/usr/lib/python3.6/site-packages/vdsm/common/commands.py", line 101,
in run#012 raise cmdutils.Error(args, p.returncode, out,
err)#012vdsm.common.cmdutils.Error: Command ['/usr/bin/virt-sparsify',
'--machine-readable', '--in-place',
'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share1/8b292c13-fd8a-4a7c-903c-5724ec742c10/images/a367c179-2ac9-4930-abeb-848229f81c97/515fcf06-8743-45d1-9af8-61a0c48e8c67']
failed with rc=1 out=b'3/12\n{ "message": "libguestfs error:
guestfs_launch failed.\\nThis usually means the libguestfs appliance
failed to start or crashed.\\nDo:\\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\\nand run the command again. For further
information, read:\\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\\nYou
can also run \'libguestfs-test-tool\' and post the *complete*
output\\ninto a bug report or message to the libguestfs mailing
list.", "timestamp": "2020-10-11T22:05:08.397538670-04:00", "type":
"error" }\n' err=b"virt-sparsify: error: libguestfs error:
guestfs_launch failed.\nThis usually means the libguestfs appliance
failed to start or crashed.\nDo:\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\nand run the command again. For further
information, read:\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou can
also run 'libguestfs-test-tool' and post the *complete* output\ninto a
bug report or message to the libguestfs mailing list.\n\nIf reporting
bugs, run virt-sparsify with debugging enabled and include the
\ncomplete output:\n\n virt-sparsify -v -x [...]\n"
The next run of the job (480) finished successfully. I have no idea whether
the problem was already fixed by a patch, or is simply a random/env issue.
Is it possible to pass LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1 without
patching vdsm? I am not sure. In any case, if this does not cause too much
excess debug logging, it is perhaps better to always pass them, to help
analyze such failures retroactively. Alternatively, we could patch
virt-sparsify/libguestfs/whatever to always log at least enough
information on failure even without passing these.
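For reference, the two variables named in the error message are ordinary environment variables, so whatever process launches virt-sparsify has to put them into the child's environment. The following is only a minimal sketch of that general mechanism, not vdsm code: run_with_libguestfs_debug is a hypothetical helper, and a harmless Python child process stands in for virt-sparsify.

```python
import os
import subprocess
import sys


def run_with_libguestfs_debug(cmd):
    """Run cmd with the libguestfs debug variables added to its environment.

    Hypothetical helper for illustration; this is not how vdsm runs commands.
    """
    env = dict(os.environ)
    env["LIBGUESTFS_DEBUG"] = "1"
    env["LIBGUESTFS_TRACE"] = "1"
    return subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,
    )


# Stand-in for virt-sparsify: a child that reports the variables it received.
child = [
    sys.executable,
    "-c",
    "import os; print(os.environ['LIBGUESTFS_DEBUG'], "
    "os.environ['LIBGUESTFS_TRACE'])",
]
result = run_with_libguestfs_debug(child)
print(result.stdout.decode().strip())
```

Whether vdsm offers a way to inject such variables without a patch is exactly the open question above; the sketch only shows what the caller would need to do.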
Best regards,
> Build Number: 479
> Build Status: Failure
> Triggered By: Started by timer
>
> -------------------------------------
> Changes Since Last Success:
> -------------------------------------
> Changes for Build #479
> [hbraha] network: bond active slave test
>
>
>
>
> -----------------
> Failed Tests:
> -----------------
> 1 tests failed.
> FAILED: basic-suite-master.test-scenarios.004_basic_sanity.test_sparsify_disk1
>
> Error Message:
> AssertionError: False != True after 600 seconds
>
> Stack Trace:
> api_v4 = <ovirtsdk4.Connection object at 0x7fe717c60e50>
>
> @order_by(_TEST_LIST)
> def test_sparsify_disk1(api_v4):
> engine = api_v4.system_service()
> disk_service = test_utils.get_disk_service(engine, DISK1_NAME)
> with test_utils.TestEvent(engine, 1325): # USER_SPARSIFY_IMAGE_START event
> disk_service.sparsify()
>
> with test_utils.TestEvent(engine, 1326): # USER_SPARSIFY_IMAGE_FINISH_SUCCESS
> > pass
>
> ../basic-suite-master/test-scenarios/004_basic_sanity.py:295:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> /usr/lib64/python2.7/contextlib.py:24: in __exit__
> self.gen.next()
> ../ost_utils/ost_utils/engine_utils.py:44: in wait_for_event
> lambda:
> ../ost_utils/ost_utils/assertions.py:97: in assert_true_within_long
> assert_equals_within_long(func, True, allowed_exceptions)
> ../ost_utils/ost_utils/assertions.py:82: in assert_equals_within_long
> func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> func = <function <lambda> at 0x7fe7176dfb18>, value = True, timeout = 600
> allowed_exceptions = [], initial_wait = 0
> error_message = 'False != True after 600 seconds'
>
> def assert_equals_within(
> func, value, timeout, allowed_exceptions=None, initial_wait=10,
> error_message=None
> ):
> allowed_exceptions = allowed_exceptions or []
> with _EggTimer(timeout) as timer:
> while not timer.elapsed():
> try:
> res = func()
> if res == value:
> return
> except Exception as exc:
> if _instance_of_any(exc, allowed_exceptions):
> time.sleep(3)
> continue
>
> LOGGER.exception("Unhandled exception in %s", func)
> raise
>
> if initial_wait == 0:
> time.sleep(3)
> else:
> time.sleep(initial_wait)
> initial_wait = 0
> try:
> if error_message is None:
> error_message = '%s != %s after %s seconds' % (res, value, timeout)
> > raise AssertionError(error_message)
> E AssertionError: False != True after 600 seconds
>
> ../ost_utils/ost_utils/assertions.py:60: AssertionError
> _______________________________________________
> Infra mailing list -- infra(a)ovirt.org
> To unsubscribe send an email to infra-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/infra@ovirt.org/message/65FECSE7EBW...
--
Didi
4 years, 2 months
Testing image transfer and backup with OST environment
by Nir Soffer
I want to share useful info from the OST hackathon we had this week.
Image transfer must work with real hostnames to allow server
certificate verification.
Inside the OST environment, the engine and host names are resolvable, but
on the host (or VM) running OST, the names are not available.
This can be fixed by adding the engine and hosts to /etc/hosts like this:
$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.2 engine
192.168.200.3 lago-basic-suite-master-host-0
192.168.200.4 lago-basic-suite-master-host-1
It would be nice if this was automated by OST. You can get the details using:
$ cd src/ovirt-system-tests/deployment-xxx
$ lago status
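As a rough idea of what automating the /etc/hosts step could look like, here is a minimal sketch that turns a name-to-address mapping into hosts-file lines. The function name format_hosts_entries is made up for this example, and the mapping is simply copied from the /etc/hosts listing in this message; extracting it from the output of "lago status" is not shown and would need to be written separately.

```python
def format_hosts_entries(addresses):
    """Return /etc/hosts lines for a mapping of hostname -> IP address.

    Hypothetical helper; not part of OST or lago.
    """
    return ["{} {}".format(ip, name) for name, ip in addresses.items()]


# Values copied from the /etc/hosts example in this message.
ost_vms = {
    "engine": "192.168.200.2",
    "lago-basic-suite-master-host-0": "192.168.200.3",
    "lago-basic-suite-master-host-1": "192.168.200.4",
}

for line in format_hosts_entries(ost_vms):
    print(line)
```

The printed lines can be reviewed and appended to /etc/hosts by hand.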
OST keeps the deployment directory in the source directory. Be careful if you
like to run "git clean -dxf", since it will delete the whole deployment and
you will have to kill the VMs manually later.
The next thing we need is the engine ca cert. It can be fetched like this:
$ curl -k 'https://engine/ovirt-engine/services/pki-resource?resource=ca-certificate...'
> ca.pem
I would expect OST to do this and put the file in the deployment directory.
To upload or download images, backup vms or use other modern examples from
the sdk, you need to have a configuration file like this:
$ cat ~/.config/ovirt.conf
[engine]
engine_url = https://engine
username = admin@internal
password = 123
cafile = ca.pem
With this, uploading from the directory where ca.pem is located
will work. If you want it to work from any directory, use an absolute path
to the file.
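The relative-path issue can be illustrated with a small sketch that reads a configuration in this format and resolves cafile against a chosen base directory. This is only an illustration using the standard configparser module; load_section and the base directory /home/user/ost are made up for the example, and this is not claimed to be how the SDK examples parse the file.

```python
import configparser
import os

# Same content as the ovirt.conf example in this message.
SAMPLE = """\
[engine]
engine_url = https://engine
username = admin@internal
password = 123
cafile = ca.pem
"""


def load_section(text, section, base_dir):
    """Parse the config text and make cafile independent of the current dir.

    Hypothetical helper for illustration only.
    """
    # interpolation=None keeps values such as passwords containing "%" intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    cfg = dict(parser[section])
    # os.path.join leaves an already absolute cafile unchanged.
    cfg["cafile"] = os.path.normpath(os.path.join(base_dir, cfg["cafile"]))
    return cfg


cfg = load_section(SAMPLE, "engine", "/home/user/ost")
print(cfg["engine_url"], cfg["cafile"])
```

Writing the absolute path directly into the configuration file, as suggested above, avoids the need for any such resolution.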
I created a test image using qemu-img and qemu-io:
$ qemu-img create -f qcow2 test.qcow2 1g
To write some data to the test image we can use qemu-io. This writes 64k of data
(b"\xf0" * 64 * 1024) to offset 1 MiB.
$ qemu-io -f qcow2 -c "write -P 240 1m 64k" test.qcow2
Since this image contains only 64k of data, uploading it should be instant.
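To make the qemu-io arguments concrete: the pattern value 240 is 0xf0, "64k" is 64 KiB and "1m" is an offset of 1 MiB. The sketch below writes the same bytes at the same offset into a small sparse file using plain Python. Note the assumptions: it produces a raw file, not a qcow2 image, the 2 MiB size is chosen only to keep the demo small, and write_pattern is a made-up helper name.

```python
import os
import tempfile

KiB = 1024
MiB = 1024 * KiB

# Same bytes as b"\xf0" * 64 * 1024 in the text (240 == 0xf0).
pattern = bytes([240]) * 64 * KiB
offset = 1 * MiB


def write_pattern(path, size):
    """Create a file of `size` bytes and write the pattern at `offset`."""
    with open(path, "wb") as f:
        f.truncate(size)  # unwritten areas read back as zeros
        f.seek(offset)
        f.write(pattern)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.raw")
    write_pattern(path, 2 * MiB)
    with open(path, "rb") as f:
        data = f.read()

print(len(data), data.count(0xF0))
```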
The last part we need is the imageio client package:
$ dnf install ovirt-imageio-client
To upload the image, we need at least one host up and storage domains
created. I did not find a way to prepare this with OST, so I simply ran the
upload after run_tests completed. It took about an hour.
To upload the image to raw sparse disk we can use:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
-c engine --sd-name nfs --disk-sparse --disk-format raw test.qcow2
[ 0.0 ] Checking image...
[ 0.0 ] Image format: qcow2
[ 0.0 ] Disk format: raw
[ 0.0 ] Disk content type: data
[ 0.0 ] Disk provisioned size: 1073741824
[ 0.0 ] Disk initial size: 1073741824
[ 0.0 ] Disk name: test.raw
[ 0.0 ] Disk backup: False
[ 0.0 ] Connecting...
[ 0.0 ] Creating disk...
[ 36.3 ] Disk ID: 26df08cf-3dec-47b9-b776-0e2bc564b6d5
[ 36.3 ] Creating image transfer...
[ 38.2 ] Transfer ID: de8cfac9-ead2-4304-b18b-a1779d647716
[ 38.2 ] Transfer host name: lago-basic-suite-master-host-1
[ 38.2 ] Uploading image...
[ 100.00% ] 1.00 GiB, 1.79 seconds, 571.50 MiB/s
[ 40.0 ] Finalizing image transfer...
[ 44.1 ] Upload completed successfully
I uploaded this before I added the hosts to /etc/hosts, so the upload
was done via the proxy.
Yes, it took 36 seconds to create the disk.
To download the disk use:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py
-c engine 5ac63c72-6296-46b1-a068-b1039c8ecbd1 downlaod.qcow2
[ 0.0 ] Connecting...
[ 0.2 ] Creating image transfer...
[ 1.6 ] Transfer ID: a99e2a43-8360-4661-81dc-02828a88d586
[ 1.6 ] Transfer host name: lago-basic-suite-master-host-1
[ 1.6 ] Downloading image...
[ 100.00% ] 1.00 GiB, 0.32 seconds, 3.10 GiB/s
[ 1.9 ] Finalizing image transfer...
We can verify the transfers using checksums. Here we create a checksum
of the remote disk:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_disk.py
-c engine 26df08cf-3dec-47b9-b776-0e2bc564b6d5
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum": "a79a1efae73484e0218403e6eb715cdf109c8e99c2200265b779369339cf347b"
}
And checksum of the downloaded image - they should match:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_image.py
downlaod.qcow2
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum": "a79a1efae73484e0218403e6eb715cdf109c8e99c2200265b779369339cf347b"
}
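The idea of comparing two copies by checksum can be sketched with the standard hashlib module. This is an assumption-laden illustration, not the algorithm used by checksum_disk.py or checksum_image.py: it hashes the raw bytes of a local file in 4 MiB reads with a 32-byte blake2b digest, so it will not reproduce the checksum values shown above, and for a qcow2 file it would hash the file format rather than the guest content. The name file_checksum is made up.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 4 * 1024 * 1024  # same block size as reported in the output above


def file_checksum(path, block_size=BLOCK_SIZE):
    """Return a blake2b hex digest of the file's raw bytes.

    Hypothetical helper; not the ovirt-imageio checksum algorithm.
    """
    h = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            h.update(block)
    return h.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.raw")
    copy = os.path.join(tmp, "copy.raw")
    payload = b"\xf0" * 64 * 1024
    for path in (original, copy):
        with open(path, "wb") as f:
            f.write(payload)
    checksum_a = file_checksum(original)
    checksum_b = file_checksum(copy)

print(checksum_a == checksum_b)
```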
Same upload to iscsi domain, using qcow2 format:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
-c engine --sd-name iscsi --disk-sparse --disk-format qcow2 test.qcow2
[ 0.0 ] Checking image...
[ 0.0 ] Image format: qcow2
[ 0.0 ] Disk format: cow
[ 0.0 ] Disk content type: data
[ 0.0 ] Disk provisioned size: 1073741824
[ 0.0 ] Disk initial size: 458752
[ 0.0 ] Disk name: test.qcow2
[ 0.0 ] Disk backup: False
[ 0.0 ] Connecting...
[ 0.0 ] Creating disk...
[ 27.8 ] Disk ID: e7ef253e-7baa-4d4a-a9b2-1a6b7db13f41
[ 27.8 ] Creating image transfer...
[ 30.0 ] Transfer ID: 88328857-ac99-4ee1-9618-6b3cd14a7db8
[ 30.0 ] Transfer host name: lago-basic-suite-master-host-0
[ 30.0 ] Uploading image...
[ 100.00% ] 1.00 GiB, 0.31 seconds, 3.28 GiB/s
[ 30.3 ] Finalizing image transfer...
[ 35.4 ] Upload completed successfully
Again, creating the disk is very slow, and I am not sure why. Probably having
a storage server on a nested VM is not a good idea.
We can compare the checksum with the source image, since checksums are
computed from the guest content:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/checksum_disk.py
-c engine e7ef253e-7baa-4d4a-a9b2-1a6b7db13f41
{
"algorithm": "blake2b",
"block_size": 4194304,
"checksum": "a79a1efae73484e0218403e6eb715cdf109c8e99c2200265b779369339cf347b"
}
Finally, we can try real images using virt-builder:
$ virt-builder fedora-32
This will create a new Fedora 32 server image in the current directory. See
--help for many useful options to create a different format, set the root
password, or install packages.
Uploading this image is much slower:
$ python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py
-c engine --sd-name nfs --disk-sparse --disk-format raw fedora-32.img
[ 0.0 ] Checking image...
[ 0.0 ] Image format: raw
[ 0.0 ] Disk format: raw
[ 0.0 ] Disk content type: data
[ 0.0 ] Disk provisioned size: 6442450944
[ 0.0 ] Disk initial size: 6442450944
[ 0.0 ] Disk name: fedora-32.raw
[ 0.0 ] Disk backup: False
[ 0.0 ] Connecting...
[ 0.0 ] Creating disk...
[ 36.8 ] Disk ID: b17126f3-fa03-4c22-8f59-ef599b64a42e
[ 36.8 ] Creating image transfer...
[ 38.5 ] Transfer ID: fe82fb86-b87a-4e49-b9cd-f1f4334e7852
[ 38.5 ] Transfer host name: lago-basic-suite-master-host-0
[ 38.5 ] Uploading image...
[ 100.00% ] 6.00 GiB, 99.71 seconds, 61.62 MiB/s
[ 138.2 ] Finalizing image transfer...
[ 147.8 ] Upload completed successfully
At the current state of OST, we should avoid such long tests.
Using backup_vm.py and other examples should work in the same way.
I posted this patch to improve nfs performance, please review:
https://gerrit.ovirt.org/c/112067/
Nir
4 years, 2 months
Re: Re: Host's status how to change to UP when I execute Management--> Active command
by 李伏琼
Thanks very much.
-------------- Original Message --------------
From: "Liran Rotenberg" <lrotenbe(a)redhat.com>;
Sent: Thursday, November 5, 2020, 11:52 PM
To: "lifuqiong(a)sunyainfo.com" <lifuqiong(a)sunyainfo.com>;
Cc: "users" <users(a)ovirt.org>; "devel" <devel(a)ovirt.org>;
Subject: [ovirt-devel] Re: Host's status how to change to UP when I execute Management--> Active command
-----------------------------------
Hi,
The engine will monitor that host. Once it's reachable and we can get information from it, the status will change accordingly.
Please check out the HostMonitoring class.
On Thu, Nov 5, 2020 at 11:57 AM lifuqiong(a)sunyainfo.com <lifuqiong(a)sunyainfo.com> wrote:
Hi,
I checked out ovirt-engine's code on branch 4.2. When I execute the host's Management --> Active command, ovirt-engine just sets the vds's status to "Unassigned", but how does the vds's status change to "UP" in ovirt-engine? The code in ActiveVdsCommand.executeCommand() is as follows:
protected void executeCommand() {
    final VDS vds = getVds();
    try (EngineLock monitoringLock = acquireMonitorLock("Activate host")) {
        executionHandler.updateSpecificActionJobCompleted(vds.getId(), ActionType.MaintenanceVds, false);
        setSucceeded(setVdsStatus(VDSStatus.Unassigned).getSucceeded());
        if (getSucceeded()) {
            TransactionSupport.executeInNewTransaction(() -> {
                // set network to operational / non-operational
                List<Network> networks = networkDao.getAllForCluster(vds.getClusterId());
                networkClusterHelper.setStatus(vds.getClusterId(), networks);
                return null;
            });
            // Start glusterd service on the node, which would haven been stopped due to maintenance
            if (vds.getClusterSupportsGlusterService()) {
                runVdsCommand(VDSCommandType.ManageGlusterService,
                        new GlusterServiceVDSParameters(vds.getId(), Arrays.asList("glusterd"), "restart"));
                // starting vdo service
                GlusterStatus isRunning = glusterUtil.isVDORunning(vds.getId());
                switch (isRunning) {
                case DOWN:
                    log.info("VDO service is down in host : '{}' with id '{}', starting VDO service",
                            vds.getHostName(),
                            vds.getId());
                    startVDOService(vds);
                    break;
                case UP:
                    log.info("VDO service is up in host : '{}' with id '{}', skipping starting of VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                case UNKNOWN:
                    log.info("VDO service is not installed host : '{}' with id '{}', ignoring to start VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                }
            }
        }
    }
}
Yours sincerely,
Mark Lee
_______________________________________________
Devel mailing list -- devel(a)ovirt.org
To unsubscribe send an email to devel-leave(a)ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/36745S77NLH...
4 years, 2 months
Re: Host's status how to change to UP when I execute Management--> Active command
by Liran Rotenberg
Hi,
The engine will monitor that host. Once it's reachable and we can get
information from it, the status will change accordingly.
Please check out the HostMonitoring class.
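As a rough mental model of the flow described here, the activate command only moves the host to "Unassigned", and a separate monitoring step later moves it to "Up" once the host answers. The following toy sketch illustrates that split and nothing more: it is not the engine's HostMonitoring implementation, and every name in it (HostStatus, Host, activate, monitor_once) is made up for the example.

```python
from enum import Enum


class HostStatus(Enum):
    MAINTENANCE = "Maintenance"
    UNASSIGNED = "Unassigned"
    UP = "Up"


class Host:
    def __init__(self):
        self.status = HostStatus.MAINTENANCE


def activate(host):
    """Toy version of the activate command: it only requests monitoring."""
    host.status = HostStatus.UNASSIGNED


def monitor_once(host, reachable):
    """Toy monitoring cycle: promote the host once it can be reached."""
    if host.status is HostStatus.UNASSIGNED and reachable:
        host.status = HostStatus.UP


host = Host()
activate(host)
monitor_once(host, reachable=False)  # host not answering yet, stays Unassigned
after_first_cycle = host.status
monitor_once(host, reachable=True)   # host answers, becomes Up
print(after_first_cycle.value, "->", host.status.value)
```

The real engine tracks more states and collects much more information per cycle; the point is only that the status change happens outside executeCommand().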
On Thu, Nov 5, 2020 at 11:57 AM lifuqiong(a)sunyainfo.com <
lifuqiong(a)sunyainfo.com> wrote:
>
> Hi,
> I checked out ovirt-engine's code on branch 4.2. When I execute the host's
> Management --> Active command, ovirt-engine just sets the vds's status to
> "Unassigned", but how does the vds's status change to "UP" in ovirt-engine?
> The code in ActiveVdsCommand.executeCommand() is as follows:
> protected void executeCommand() {
>
> final VDS vds = getVds();
> try (EngineLock monitoringLock = acquireMonitorLock("Activate
> host")) {
> executionHandler.updateSpecificActionJobCompleted(vds.getId(),
> ActionType.MaintenanceVds, false);
>
> setSucceeded(setVdsStatus(VDSStatus.Unassigned).getSucceeded());
>
> if (getSucceeded()) {
> TransactionSupport.executeInNewTransaction(() -> {
> // set network to operational / non-operational
> List<Network> networks =
> networkDao.getAllForCluster(vds.getClusterId());
> networkClusterHelper.setStatus(vds.getClusterId(),
> networks);
> return null;
> });
>
> // Start glusterd service on the node, which would haven
> been stopped due to maintenance
> if (vds.getClusterSupportsGlusterService()) {
> runVdsCommand(VDSCommandType.ManageGlusterService,
> new GlusterServiceVDSParameters(vds.getId(),
> Arrays.asList("glusterd"), "restart"));
> // starting vdo service
> GlusterStatus isRunning =
> glusterUtil.isVDORunning(vds.getId());
> switch (isRunning) {
> case DOWN:
> log.info("VDO service is down in host : '{}' with
> id '{}', starting VDO service",
> vds.getHostName(),
> vds.getId());
> startVDOService(vds);
> break;
> case UP:
> log.info("VDO service is up in host : '{}' with
> id '{}', skipping starting of VDO service",
> vds.getHostName(),
> vds.getId());
> break;
> case UNKNOWN:
> log.info("VDO service is not installed host :
> '{}' with id '{}', ignoring to start VDO service",
> vds.getHostName(),
> vds.getId());
> break;
> }
>
> }
> }
> }
> }
>
>
> Yours sincerely,
> Mark Lee
4 years, 2 months
Building lago from source
by Nir Soffer
I'm trying to test this pull request with OST:
https://github.com/lago-project/lago/pull/815
So I cloned the project on the OST VM and built the rpms:
make
make rpm
The result is:
lago-1.0.2-1.el8.noarch.rpm python3-lago-1.0.2-1.el8.noarch.rpm
But the lago version installed by setup_for_ost.sh is:
$ rpm -q lago
lago-1.0.11-1.el8.noarch
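The two version strings are easy to misread: compared component by component, 1.0.2 built from master is lower than the installed 1.0.11, although a plain string comparison says the opposite. A minimal sketch of the numeric comparison follows; version_key is a made-up helper, and this is a simplification that is not the full rpm version comparison algorithm.

```python
def version_key(version):
    """Split a dotted version into integers so 11 sorts after 2."""
    return tuple(int(part) for part in version.split("."))


built = "1.0.2"       # version of the rpms built from master
installed = "1.0.11"  # version installed by setup_for_ost.sh

numeric_older = version_key(built) < version_key(installed)
string_older = built < installed  # lexicographic comparison, misleading here

print(numeric_older, string_older)
```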
I tried to install lago from master, and then lago_init fails:
$ lago_init /usr/share/ost-images/el8-engine-installed.qcow2 -k
/usr/share/ost-images/el8_id_rsa
Using images ost-images-el8-host-installed-1-202011021248.x86_64,
ost-images-el8-engine-installed-1-202011021248.x86_64 containing
ovirt-engine-4.4.4-0.0.master.20201031195930.git8f858d6c01d.el8.noarch
vdsm-4.40.35.1-1.el8.x86_64
usage: lago [-h] [-l {info,debug,error,warning}] [--logdepth LOGDEPTH]
[--version] [--out-format {default,flat,json,yaml}]
[--prefix-path PREFIX_PATH] [--workdir-path WORKDIR_PATH]
[--prefix-name PREFIX_NAME] [--ssh-user SSH_USER]
[--ssh-password SSH_PASSWORD] [--ssh-tries SSH_TRIES]
[--ssh-timeout SSH_TIMEOUT] [--libvirt_url LIBVIRT_URL]
[--libvirt-user LIBVIRT_USER]
[--libvirt-password LIBVIRT_PASSWORD]
[--default_vm_type DEFAULT_VM_TYPE]
[--default_vm_provider DEFAULT_VM_PROVIDER]
[--default_root_password DEFAULT_ROOT_PASSWORD]
[--lease_dir LEASE_DIR] [--reposync-dir REPOSYNC_DIR]
[--ignore-warnings]
VERB ...
lago: error: unrecognized arguments: --ssh-key
/home/nsoffer/src/ovirt-system-tests/deployment-basic-suite-master
/home/nsoffer/src/ovirt-system-tests/basic-suite-master/LagoInitFile
Do we use a customized lago version for ost? Where is the source?
Nir
4 years, 2 months
Host's status how to change to UP when I execute Management--> Active command
by lifuqiong@sunyainfo.com
Hi,
I checked out ovirt-engine's code on branch 4.2. When I execute the host's Management --> Active command, ovirt-engine just sets the vds's status to "Unassigned", but how does the vds's status change to "UP" in ovirt-engine? The code in ActiveVdsCommand.executeCommand() is as follows:
protected void executeCommand() {
    final VDS vds = getVds();
    try (EngineLock monitoringLock = acquireMonitorLock("Activate host")) {
        executionHandler.updateSpecificActionJobCompleted(vds.getId(), ActionType.MaintenanceVds, false);
        setSucceeded(setVdsStatus(VDSStatus.Unassigned).getSucceeded());
        if (getSucceeded()) {
            TransactionSupport.executeInNewTransaction(() -> {
                // set network to operational / non-operational
                List<Network> networks = networkDao.getAllForCluster(vds.getClusterId());
                networkClusterHelper.setStatus(vds.getClusterId(), networks);
                return null;
            });
            // Start glusterd service on the node, which would haven been stopped due to maintenance
            if (vds.getClusterSupportsGlusterService()) {
                runVdsCommand(VDSCommandType.ManageGlusterService,
                        new GlusterServiceVDSParameters(vds.getId(), Arrays.asList("glusterd"), "restart"));
                // starting vdo service
                GlusterStatus isRunning = glusterUtil.isVDORunning(vds.getId());
                switch (isRunning) {
                case DOWN:
                    log.info("VDO service is down in host : '{}' with id '{}', starting VDO service",
                            vds.getHostName(),
                            vds.getId());
                    startVDOService(vds);
                    break;
                case UP:
                    log.info("VDO service is up in host : '{}' with id '{}', skipping starting of VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                case UNKNOWN:
                    log.info("VDO service is not installed host : '{}' with id '{}', ignoring to start VDO service",
                            vds.getHostName(),
                            vds.getId());
                    break;
                }
            }
        }
    }
}
Yours sincerely,
Mark Lee
4 years, 2 months