vdsmd.service stuck in state: activating
by sceglimilano@gmail.com
= Noob questions =
A) First time here. I couldn't find the rulebook/netiquette for the oVirt mailing
list; could you give me a link?
B) What do you use here, top or bottom posting?
C) What is the line length limit? I've written this email by manually entering a
new line after 80 characters; is that fine?
D) What is the log length limit?
E) When should I use pastebin or similar websites?
= /Noob questions =
= Issue questions =
TL;DR
The self-hosted engine deployment fails, and vdsmd.service logs this error, which
I can't find on Google:
'sysctl: cannot stat /proc/sys/ssl: No such file or directory'
Long description:
I've been trying for a while to deploy an oVirt self-hosted engine, but it failed
every time. After every failed deployment, the node status ended up as 'FAIL' (or
'DEGRADED', I don't remember) and to restore it to 'OK' I had to run
'vdsm-tool configure --force'.
The vdsm-tool configuration never completed fully successfully:
< code >
abrt is not configured for vdsm
lvm is configured for vdsm
libvirt is not configured for vdsm yet
FAILED: conflicting vdsm and libvirt-qemu tls configuration.
vdsm.conf with ssl=True requires the following changes:
libvirtd.conf: listen_tcp=0, auth_tcp="sasl", listen_tls=1
qemu.conf: spice_tls=1.
Current revision of multipath.conf detected, preserving
< /code >
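For clarity, the changes that message asks for correspond to lines like these
(an assumption on my part regarding file locations; libvirtd.conf normally lives
under /etc/libvirt/):
< code >
# /etc/libvirt/libvirtd.conf
listen_tcp = 0
auth_tcp = "sasl"
listen_tls = 1

# /etc/libvirt/qemu.conf
spice_tls = 1
< /code >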
I set 'auth_tcp = "sasl"' (in "/etc/libvirtd.conf") and 'spice_tls=1' (in
"/etc/libvirt/qemu.conf") and restarted vdsmd.service, but it still fails:
< code >
Starting Virtual Desktop Server Manager...
_init_common.sh[2839]: vdsm: Running mkdirs
_init_common.sh[2839]: vdsm: Running configure_coredump
_init_common.sh[2839]: vdsm: Running configure_vdsm_logs
_init_common.sh[2839]: vdsm: Running wait_for_network
_init_common.sh[2839]: vdsm: Running run_init_hooks
_init_common.sh[2839]: vdsm: Running check_is_configured
_init_common.sh[2839]: abrt is already configured for vdsm
_init_common.sh[2839]: lvm is configured for vdsm
_init_common.sh[2839]: libvirt is already configured for vdsm
_init_common.sh[2839]: Current revision of multipath.conf detected, preserving
_init_common.sh[2839]: vdsm: Running validate_configuration
_init_common.sh[2839]: SUCCESS: ssl configured to true. No conflicts
_init_common.sh[2839]: vdsm: Running prepare_transient_repository
_init_common.sh[2839]: vdsm: Running syslog_available
_init_common.sh[2839]: vdsm: Running nwfilter
_init_common.sh[2839]: vdsm: Running dummybr
_init_common.sh[2839]: vdsm: Running tune_system
_init_common.sh[2839]: sysctl: cannot stat /proc/sys/ssl: No such file or directory
systemd[1]: vdsmd.service: control process exited, code=exited status=1
systemd[1]: Failed to start Virtual Desktop Server Manager.
systemd[1]: Unit vdsmd.service entered failed state.
systemd[1]: vdsmd.service failed.
systemd[1]: vdsmd.service holdoff time over, scheduling restart.
< /code >
I have no idea how to investigate or troubleshoot this issue.
I updated yesterday to the latest version, but nothing changed:
ovirt-release-master-4.3.0-0.1.master.20180820000052.gitdd598f0.el7.noarch
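One thing I might try next (just a guess on my part: sysctl maps key names to
paths under /proc/sys, so something must be feeding it a literal 'ssl' key) is
to grep the usual sysctl config locations:
< code >
grep -rn '^ssl' /etc/sysctl.conf /etc/sysctl.d/ /usr/lib/sysctl.d/ /run/sysctl.d/ 2>/dev/null
< /code >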
= /Issue questions =
6 years, 1 month
Networking - How to pass all vlan (trunk) to a guest VM?
by Jonathan Greg
Hi,
I couldn't find documentation about it... How can I pass all VLANs (a trunk) to a guest VM (without SR-IOV)?
My switch is configured to trunk all vlans.
I have configured a logical network with the "Enable VLAN tagging" box unchecked and assigned it to a VM. My understanding of this setting is that both tagged and untagged packets should be forwarded to the guest VM.
Unfortunately it doesn't work at all: if I run a tcpdump on the VM NIC attached to this logical network, I don't see anything... I would really like to run a virtual router or firewall on my setup without having to shut down my VM and attach a new vNIC each time I want to add a new VLAN.
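For context, the idea is to split the VLANs inside the guest itself once tagged
frames arrive, along these lines (hypothetical interface name eth0 and VLAN ID 100):
# inside the guest: create a tagged subinterface on the trunked vNIC
ip link add link eth0 name eth0.100 type vlan id 100
ip link set eth0.100 up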
Any idea?
Jonathan
6 years, 1 month
Re: storage healing question
by Ravishankar N
Hi,
Can you restart the self-heal daemon by doing a `gluster volume start
bgl-vms-gfs force` and then launch the heal again? If you are seeing
different entries and counts each time you run heal info, there is
likely a network issue (disconnect) between the (gluster fuse?) mount
and the bricks of the volume leading to pending heals.
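In other words, something along these lines (volume name taken from the quoted
output below):
gluster volume start bgl-vms-gfs force   # force-start restarts the self-heal daemon
gluster volume heal bgl-vms-gfs          # trigger a heal
gluster volume heal bgl-vms-gfs info     # re-check the pending entries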
Also, there was a bug in arbiter volumes[1] that got fixed in glusterfs
3.12.15. It can cause VMs to pause when you reboot the arbiter node, so
it is recommended to upgrade to this gluster version.
HTH,
Ravi
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1637989
> From: Dev Ops <sipandbite(a)hotmail.com>
> Date: Mon, Nov 12, 2018 at 1:09 PM
> Subject: [ovirt-users] Re: storage healing question
> To: <users(a)ovirt.org>
>
>
> Any help would be appreciated. I have since rebooted the 3rd gluster
> node which is the arbiter. This doesn't seem to want to heal.
>
> gluster volume heal bgl-vms-gfs info |grep Number
> Number of entries: 68
> Number of entries: 0
> Number of entries: 68
6 years, 1 month
New disks cloned from template get wrong quota-id, when quota is disabled on DC
by Florian Schmid
Hi,
we recently upgraded our oVirt environment from 4.1.6 to 4.1.9 and then to 4.2.5.
I don't know exactly when this issue first happened, because we hadn't created new VMs on the affected DCs for some time.
To our setup:
We have several DCs configured and only the former default one has quota enabled.
We have separate templates for each DC, and they haven't been cloned (or otherwise derived) from a template in the default DC.
When I now create a VM from a template (with disks cloned) on a DC where quota is not enabled, the new disks get quota IDs assigned which are not available in that DC!
The template disks all have the correct quota ID assigned.
Example from engine DB:
select * from image_storage_domain_map where storage_domain_id = '73caedd0-6ef3-46e0-a705-fe268f04f9cc';
->
...
a50b46ce-e350-40a4-8f00-968529777446 | 73caedd0-6ef3-46e0-a705-fe268f04f9cc | 58ab004a-0315-00d0-02b8-00000000011d | 4a7a0fea-9bc4-4c3a-b3f4-0e8444641ea3
1cd8f9d9-e2b5-4dec-aa3b-ade2612ed3e7 | 73caedd0-6ef3-46e0-a705-fe268f04f9cc | 58ab004a-009a-00ea-031c-000000000182 | 4a7a0fea-9bc4-4c3a-b3f4-0e8444641ea3
2f982856-7afa-4f18-a676-fe2cc44b14d6 | 73caedd0-6ef3-46e0-a705-fe268f04f9cc | 58ab004a-009a-00ea-031c-000000000182 | 4a7a0fea-9bc4-4c3a-b3f4-0e8444641ea3
...
->
ll ./6d979004-cb4c-468e-b89a-a292407abafb/
total 1420
-rw-rw----. 1 vdsm kvm 1073741824 Nov 6 15:46 a50b46ce-e350-40a4-8f00-968529777446
-rw-rw----. 1 vdsm kvm 1048576 Nov 6 15:46 a50b46ce-e350-40a4-8f00-968529777446.lease
-rw-r--r--. 1 vdsm kvm 271 Nov 6 15:46 a50b46ce-e350-40a4-8f00-968529777446.meta
As you can see, the disk a50b46ce-e350-40a4-8f00-968529777446 was created a few minutes ago, but it has a different default quota ID assigned than the other disks above:
58ab004a-0315-00d0-02b8-00000000011d instead of 58ab004a-009a-00ea-031c-000000000182
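A quick way to double-check which quota a single disk got (assuming the first
and third columns of image_storage_domain_map are image_id and quota_id, as the
output above suggests) is a query like:
select quota_id from image_storage_domain_map where image_id = 'a50b46ce-e350-40a4-8f00-968529777446';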
select * from quota;
id | storage_pool_id | quota_name | description | _create_date | _update_date | threshold_cluster_percentage | threshold_storage_percentage | grace_cluster_percentage | grace_storage_percentage | is_default
58ab004a-0315-00d0-02b8-00000000011d | 00000001-0001-0001-0001-000000000089 | Default | Default unlimited quota | 2017-02-20 14:42:18.967236+00 | | 80 | 80 | 20 | 20 | t
58ab004a-009a-00ea-031c-000000000182 | 5507b0a6-9170-4f42-90a7-80d22d4238c6 | Default | Default unlimited quota | 2017-02-20 14:42:18.967236+00 | | 80 | 80 | 20 | 20 | t
As you can see here, both quota IDs are default IDs and therefore can't belong to the same DC or storage domain.
As soon as I enable quota on that DC, new disks created from a template during VM creation get the correct quota ID.
The problem with the wrong IDs is that you can't edit the disks anymore!
I can reproduce this error every time, and I can provide all the information you need to debug it. Please let me know what data you need.
Best Regards
Florian
6 years, 1 month
storage healing question
by Dev Ops
The switches above our environment had some VPC issues and the port channels went offline. The ports that had issues belonged to 2 of the gfs nodes in our environment. We have 3 storage nodes total, with the 3rd being the arbiter. I wound up rebooting the first 2 nodes and everything came back happy. After a few hours I noticed that the storage was up but complaining about being out of sync and needing healing. Within the hour I noticed a VM had paused itself due to storage issues. This is a small environment, for now, with only 30 VMs. I am new to oVirt, so this is uncharted territory for me. I am tailing some logs and things look sort of normal, and Google is sending me down a wormhole.
If I run "gluster volume heal cps-vms-gfs info", the number of entries seems to change pretty regularly. The logs show lots of entries like this:
[2018-11-08 21:55:05.996675] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-cps-vms-gfs-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2018-11-08 21:55:05.997693] I [MSGID: 108002] [afr-common.c:5312:afr_notify] 0-cps-vms-gfs-replicate-0: Client-quorum is met
[2018-11-08 21:55:05.997717] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-cps-vms-gfs-client-1: Server lk version = 1
I guess I am curious: what else should I be looking for? Is this just taking forever to heal? Is there something else I can run or do to verify things are actually getting better? I ran an actual heal command and it cleared everything for a few seconds, and then the entries started to populate again when I ran the info command.
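Commands I understand can be used to track whether healing is actually
progressing (volume name as in the status output below):
gluster volume heal cps-vms-gfs info                   # per-brick list of entries still pending heal
gluster volume heal cps-vms-gfs statistics heal-count  # just the pending counts per brick
gluster volume heal cps-vms-gfs info split-brain       # should stay empty; anything listed needs manual resolution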
[root@cps-vms-gfs01 glusterfs]# gluster volume status
Status of volume: cps-vms-gfs
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 10.8.255.1:/gluster/cps-vms-gfs01/brick 49152 0 Y 4054
Brick 10.8.255.2:/gluster/cps-vms-gfs02/brick 49152 0 Y 4144
Brick 10.8.255.3:/gluster/cps-vms-gfs03/brick 49152 0 Y 4294
Self-heal Daemon on localhost N/A N/A Y 4279
Self-heal Daemon on cps-vms-gfs02.cisco.com N/A N/A Y 5185
Self-heal Daemon on 10.196.152.145 N/A N/A Y 50948
Task Status of Volume cps-vms-gfs
------------------------------------------------------------------------------
There are no active volume tasks
I am running ovirt 4.2.5 and gluster 3.12.11.
Thanks!
6 years, 1 month
Network/Storage design
by Josep Manel Andrés Moscardó
Hi,
I am new to oVirt and trying to deploy a cluster to see whether we can
move from VMware to oVirt, but my first stopper is how to do a proper
design for the infrastructure: how many networks do I need? I have
2x10Gb SFP+ and 4x1Gb Ethernet.
For storage we have NFS coming from NetApp, plus Ceph.
Could someone point me in the right direction?
Thanks.
6 years, 1 month
libgfapi support are "false" by default in ovirt 4.2 ?
by Mike Lykov
Hi All
I'm trying to set up the latest oVirt version from the ovirt-release42-pre.rpm repository.
Then I installed these (and many dependency) RPMs:
ovirt-hosted-engine-setup-2.2.30-1.el7.noarch
ovirt-engine-appliance-4.2-20181026.1.el7.noarch
vdsm-4.20.43-1.el7.x86_64
vdsm-gluster-4.20.43-1.el7.x86_64
vdsm-network-4.20.43-1.el7.x86_64
All from that repository. I used the web UI installer to create the GlusterFS
volumes (the default suggested engine, data, vmstore) and then installed the
hosted engine on that "engine" volume.
In the cluster I want to use the libgfapi Gluster storage access method, but
when I import the storage domains created by the installer in the first step,
VDSM mounts them on the hosts with FUSE.
For example:
ovirtstor1.miac:/engine on
/rhev/data-center/mnt/glusterSD/ovirtstor1.miac:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
ovirtnode1.miac:/data on
/rhev/data-center/mnt/glusterSD/ovirtnode1.miac:_data type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
I checked the engine config, and it is disabled (false):
[root@ovirtengine ~]# engine-config -a | grep -i libgf
LibgfApiSupported: false version: 3.6
LibgfApiSupported: false version: 4.0
LibgfApiSupported: false version: 4.1
LibgfApiSupported: false version: 4.2
Why? Does it need to be enabled by hand,
as in this presentation?
https://www.slideshare.net/DenisChapligin/improving-hyperconverged-perfor...
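If it does need to be enabled by hand, my understanding (based on that
presentation, so treat this as an assumption) is that it would be something like:
engine-config -s LibgfApiSupported=true --cver=4.2
systemctl restart ovirt-engine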
6 years, 1 month
ovirt 4.2.7 nested not importing she domain
by Gianluca Cecchi
Hello,
I'm configuring a nested self hosted engine environment with 4.2.7 and
CentOS 7.5.
Domain type is NFS.
I deployed with
hosted-engine --deploy --noansible
Everything apparently went well, but after creating the master storage domain I see
that the hosted-engine storage domain is not automatically imported.
At the moment I have only one host.
The ovirt-ha-agent status shows this every 10 seconds:
Nov 09 00:36:30 ovirtdemo01.localdomain.local ovirt-ha-agent[18407]:
ovirt-ha-agent
ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR
Unable to identify the OVF_STORE volume, falling back to initial vm.conf.
Please ensure you already added your first data domain for regular VMs
In engine.log I see, every 15 seconds, a dumpxml output and the message:
2018-11-09 00:31:52,822+01 WARN
[org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder]
(EE-ManagedThreadFactory-engineScheduled-Thread-52) [7fcce3cb] null
architecture type, replacing with x86_64, VM [HostedEngine]
See the full log below.
Any hint?
Thanks
Gianluca
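In case it helps, the obvious re-checks after adding the data domain would be
along these lines (assuming the standard hosted-engine service names):
hosted-engine --vm-status                          # overall hosted-engine / HA state
systemctl restart ovirt-ha-broker ovirt-ha-agent   # my assumption: make the agent re-scan for the OVF_STORE volumes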
2018-11-09 00:31:52,714+01 INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
(EE-ManagedThreadFactory-engineScheduled-Thread-52) [] VM
'21c5fe9f-cd46-49fd-a6f3-009b4d450894' was discovered as 'Up' on VDS
'4de40432-c1f7-4f20-b231-347095015fbd'(ovirtdemo01.localdomain.local)
2018-11-09 00:31:52,764+01 INFO
[org.ovirt.engine.core.bll.AddUnmanagedVmsCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-52) [7fcce3cb] Running
command: AddUnmanagedVmsCommand internal: true.
2018-11-09 00:31:52,766+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-52) [7fcce3cb] START,
DumpXmlsVDSCommand(HostName = ovirtdemo01.localdomain.local,
Params:{hostId='4de40432-c1f7-4f20-b231-347095015fbd',
vmIds='[21c5fe9f-cd46-49fd-a6f3-009b4d450894]'}), log id: 5d5a0a63
2018-11-09 00:31:52,775+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-52) [7fcce3cb] FINISH,
DumpXmlsVDSCommand, return: {21c5fe9f-cd46-49fd-a6f3-009b4d450894=<domain
type='kvm' id='2'>
<name>HostedEngine</name>
<uuid>21c5fe9f-cd46-49fd-a6f3-009b4d450894</uuid>
<metadata xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="
http://ovirt.org/vm/1.0">
<ovirt-tune:qos/>
<ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
<ovirt-vm:destroy_on_reboot
type="bool">False</ovirt-vm:destroy_on_reboot>
<ovirt-vm:memGuaranteedSize type="int">0</ovirt-vm:memGuaranteedSize>
<ovirt-vm:startTime type="float">1541719799.6</ovirt-vm:startTime>
<ovirt-vm:device devtype="console" name="console0">
<ovirt-vm:deviceId>2e8944d3-7ac4-4597-8883-c0b2937fb23b</ovirt-vm:deviceId>
<ovirt-vm:specParams/>
<ovirt-vm:vm_custom/>
</ovirt-vm:device>
<ovirt-vm:device mac_address="00:16:3e:35:d9:2c">
<ovirt-vm:deviceId>0f1d0ce3-8843-4418-b882-0d84ca481717</ovirt-vm:deviceId>
<ovirt-vm:network>ovirtmgmt</ovirt-vm:network>
<ovirt-vm:specParams/>
<ovirt-vm:vm_custom/>
</ovirt-vm:device>
<ovirt-vm:device devtype="disk" name="hdc">
<ovirt-vm:deviceId>6d167005-b547-4190-938f-ce1b82eae7af</ovirt-vm:deviceId>
<ovirt-vm:shared>false</ovirt-vm:shared>
<ovirt-vm:specParams/>
<ovirt-vm:vm_custom/>
</ovirt-vm:device>
<ovirt-vm:device devtype="disk" name="vda">
<ovirt-vm:deviceId>8eb98007-1d9a-4689-bbab-b3c7060efef8</ovirt-vm:deviceId>
<ovirt-vm:domainID>fbcfb922-0103-43fb-a2b6-2bf0c9e356ea</ovirt-vm:domainID>
<ovirt-vm:guestName>/dev/vda</ovirt-vm:guestName>
<ovirt-vm:imageID>8eb98007-1d9a-4689-bbab-b3c7060efef8</ovirt-vm:imageID>
<ovirt-vm:poolID>00000000-0000-0000-0000-000000000000</ovirt-vm:poolID>
<ovirt-vm:shared>exclusive</ovirt-vm:shared>
<ovirt-vm:volumeID>64bdb7cd-60a1-4420-b3a6-607b20e2cd5a</ovirt-vm:volumeID>
<ovirt-vm:specParams/>
<ovirt-vm:vm_custom/>
<ovirt-vm:volumeChain>
<ovirt-vm:volumeChainNode>
<ovirt-vm:domainID>fbcfb922-0103-43fb-a2b6-2bf0c9e356ea</ovirt-vm:domainID>
<ovirt-vm:imageID>8eb98007-1d9a-4689-bbab-b3c7060efef8</ovirt-vm:imageID>
<ovirt-vm:leaseOffset type="int">0</ovirt-vm:leaseOffset>
<ovirt-vm:leasePath>/rhev/data-center/mnt/ovirtdemo01.localdomain.local:_SHE__DOMAIN/fbcfb922-0103-43fb-a2b6-2bf0c9e356ea/images/8eb98007-1d9a-4689-bbab-b3c7060efef8/64bdb7cd-60a1-4420-b3a6-607b20e2cd5a.lease</ovirt-vm:leasePath>
<ovirt-vm:path>/rhev/data-center/mnt/ovirtdemo01.localdomain.local:_SHE__DOMAIN/fbcfb922-0103-43fb-a2b6-2bf0c9e356ea/images/8eb98007-1d9a-4689-bbab-b3c7060efef8/64bdb7cd-60a1-4420-b3a6-607b20e2cd5a</ovirt-vm:path>
<ovirt-vm:volumeID>64bdb7cd-60a1-4420-b3a6-607b20e2cd5a</ovirt-vm:volumeID>
</ovirt-vm:volumeChainNode>
</ovirt-vm:volumeChain>
</ovirt-vm:device>
</ovirt-vm:vm>
</metadata>
<memory unit='KiB'>6270976</memory>
<currentMemory unit='KiB'>6270976</currentMemory>
<vcpu placement='static' current='1'>2</vcpu>
<cputune>
<shares>1020</shares>
</cputune>
<resource>
<partition>/machine</partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>oVirt</entry>
<entry name='product'>oVirt Node</entry>
<entry name='version'>7-5.1804.5.el7.centos</entry>
<entry name='serial'>2820BD92-2B2B-42C5-912B-76FB65E93FBF</entry>
<entry name='uuid'>21c5fe9f-cd46-49fd-a6f3-009b4d450894</entry>
</system>
</sysinfo>
<os>
<type arch='x86_64' machine='pc-i440fx-rhel7.5.0'>hvm</type>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
</features>
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>Skylake-Client</model>
<feature policy='require' name='hypervisor'/>
</cpu>
<clock offset='variable' adjustment='0' basis='utc'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='file' device='cdrom'>
<driver error_policy='stop'/>
<source startupPolicy='optional'/>
<target dev='hdc' bus='ide'/>
<readonly/>
<alias name='ide0-1-0'/>
<address type='drive' controller='0' bus='1' target='0' unit='0'/>
</disk>
<disk type='file' device='disk' snapshot='no'>
<driver name='qemu' type='raw' cache='none' error_policy='stop'
io='threads'/>
<source
file='/var/run/vdsm/storage/fbcfb922-0103-43fb-a2b6-2bf0c9e356ea/8eb98007-1d9a-4689-bbab-b3c7060efef8/64bdb7cd-60a1-4420-b3a6-607b20e2cd5a'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
<serial>8eb98007-1d9a-4689-bbab-b3c7060efef8</serial>
<boot order='1'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06'
function='0x0'/>
</disk>
<controller type='scsi' index='0' model='virtio-scsi'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04'
function='0x0'/>
</controller>
<controller type='usb' index='0' model='piix3-uhci'>
<alias name='usb'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01'
function='0x2'/>
</controller>
<controller type='pci' index='0' model='pci-root'>
<alias name='pci.0'/>
</controller>
<controller type='ide' index='0'>
<alias name='ide'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01'
function='0x1'/>
</controller>
<controller type='virtio-serial' index='0'>
<alias name='virtio-serial0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05'
function='0x0'/>
</controller>
<lease>
<lockspace>fbcfb922-0103-43fb-a2b6-2bf0c9e356ea</lockspace>
<key>64bdb7cd-60a1-4420-b3a6-607b20e2cd5a</key>
<target
path='/rhev/data-center/mnt/ovirtdemo01.localdomain.local:_SHE__DOMAIN/fbcfb922-0103-43fb-a2b6-2bf0c9e356ea/images/8eb98007-1d9a-4689-bbab-b3c7060efef8/64bdb7cd-60a1-4420-b3a6-607b20e2cd5a.lease'/>
</lease>
<interface type='bridge'>
<mac address='00:16:3e:35:d9:2c'/>
<source bridge='ovirtmgmt'/>
<target dev='vnet0'/>
<model type='virtio'/>
<link state='up'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03'
function='0x0'/>
</interface>
<console type='pty' tty='/dev/pts/0'>
<source path='/dev/pts/0'/>
<target type='virtio' port='0'/>
<alias name='console0'/>
</console>
<channel type='unix'>
<source mode='bind'
path='/var/lib/libvirt/qemu/channels/21c5fe9f-cd46-49fd-a6f3-009b4d450894.com.redhat.rhevm.vdsm'/>
<target type='virtio' name='com.redhat.rhevm.vdsm' state='connected'/>
<alias name='channel0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<channel type='unix'>
<source mode='bind'
path='/var/lib/libvirt/qemu/channels/21c5fe9f-cd46-49fd-a6f3-009b4d450894.org.qemu.guest_agent.0'/>
<target type='virtio' name='org.qemu.guest_agent.0'
state='connected'/>
<alias name='channel1'/>
<address type='virtio-serial' controller='0' bus='0' port='2'/>
</channel>
<channel type='unix'>
<source mode='bind'
path='/var/lib/libvirt/qemu/channels/21c5fe9f-cd46-49fd-a6f3-009b4d450894.org.ovirt.hosted-engine-setup.0'/>
<target type='virtio' name='org.ovirt.hosted-engine-setup.0'
state='disconnected'/>
<alias name='channel2'/>
<address type='virtio-serial' controller='0' bus='0' port='3'/>
</channel>
<input type='mouse' bus='ps2'>
<alias name='input0'/>
</input>
<input type='keyboard' bus='ps2'>
<alias name='input1'/>
</input>
<graphics type='vnc' port='5900' autoport='yes' listen='0'
passwdValidTo='1970-01-01T00:00:01'>
<listen type='address' address='0'/>
</graphics>
<video>
<model type='vga' vram='32768' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02'
function='0x0'/>
</video>
<memballoon model='none'>
<alias name='balloon0'/>
</memballoon>
<rng model='virtio'>
<backend model='random'>/dev/urandom</backend>
<alias name='rng0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07'
function='0x0'/>
</rng>
</devices>
<seclabel type='dynamic' model='selinux' relabel='yes'>
<label>system_u:system_r:svirt_t:s0:c500,c542</label>
<imagelabel>system_u:object_r:svirt_image_t:s0:c500,c542</imagelabel>
</seclabel>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+107:+107</label>
<imagelabel>+107:+107</imagelabel>
</seclabel>
</domain>
}, log id: 5d5a0a63
2018-11-09 00:31:52,822+01 WARN
[org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder]
(EE-ManagedThreadFactory-engineScheduled-Thread-52) [7fcce3cb] null
architecture type, replacing with x86_64, VM [HostedEngine]
6 years, 1 month
Storage domain mount error: Lustre file system (Posix compliant FS)
by okok102928@fusiondata.co.kr
Hi.
I am an oVirt user in Korea, working on VDI. It's a pleasure to meet you, the oVirt specialists.
(I do not speak English well... Thank you for your understanding!)
I am testing the Lustre file system in an oVirt / RH(E)V environment.
(The reason is simple: GlusterFS and NFS have performance limits, and SAN storage or a good software-defined storage is quite expensive.)
Testing the file system performance was successful.
As expected, Lustre showed amazing performance.
However, there was an error when adding the Lustre storage as a POSIX compliant FS storage domain.
Domain Function : Data
Storage Type : POSIX compliant FS
Host to Use : [SPM_HOSTNAME]
Name : [STORAGE_DOMAIN_NAME]
Path : 10.10.10.15@tcp:/lustre/vmstore
VFS Type : lustre
Mount Options :
The vdsm debug logs are shown below.
2018-10-25 12:46:58,963+0900 INFO (jsonrpc/2) [storage.xlease] Formatting index for lockspace u'c0ef7ee6-1da9-4eef-9e03-387cd3a24445' (version=1) (xlease:653)
2018-10-25 12:46:58,971+0900 DEBUG (jsonrpc/2) [root] /usr/bin/dd iflag=fullblock of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases oflag=direct,seek_bytes seek=1048576 bs=256512 count=1 conv=notrunc,nocreat,fsync (cwd None) (commands:65)
2018-10-25 12:46:58,985+0900 DEBUG (jsonrpc/2) [root] FAILED: <err> = "/usr/bin/dd: error writing '/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\n"; <rc> = 1 (commands:86)
2018-10-25 12:46:58,985+0900 INFO (jsonrpc/2) [vdsm.api] FINISH createStorageDomain error=Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\n" from=::ffff:192.168.161.104,52188, flow_id=794bd395, task_id=c9847bf3-2267-483b-9099-f05a46981f7f (api:50)
2018-10-25 12:46:58,985+0900 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') Unexpected error (task:875)
Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run return fn(*args, **kargs) File "<string>", line 2, in createStorageDomain
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2591, in createStorageDomain
storageType, domVersion)
File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 87, in create
remotePath, storageType, version)
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 465, in _prepareMetadata
cls.format_external_leases(sdUUID, xleases_path)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 1200, in format_external_leases
xlease.format_index(lockspace, backend)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 661, in format_index
index.dump(file)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 761, in dump
file.pwrite(INDEX_BASE, self._buf)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 994, in pwrite
self._run(args, data=buf[:])
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 1011, in _run
raise cmdutils.Error(args, rc, "[suppressed]", err)
Error: Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\n"
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') Task._run: c9847bf3-2267-483b-9099-f05a46981f7f (6, u'c0ef7ee6-1da9-4eef-9e03-387cd3a24445', u'vmstore', u'10.10.10.15@tcp:/lustre/vmstore', 1, u'4') {} failed - stopping task (task:894)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') stopping in state failed (force False) (task:1256)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') ref 1 aborting True (task:1002)
2018-10-25 12:46:58,986+0900 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') aborting: Task is aborted: u'Command [\'/usr/bin/dd\', \'iflag=fullblock\', u\'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases\', \'oflag=direct,seek_bytes\', \'seek=1048576\', \'bs=256512\', \'count=1\', \'conv=notrunc,nocreat,fsync\'] failed with rc=1 out=\'[suppressed]\' err="/usr/bin/dd: error writing \'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases\': Invalid argument\\n1+0 records in\\n0+0 records out\\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\\n"' - code 100 (task:1181)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') Prepare: aborted: Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\n" (task:1186)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') ref 0 aborting True (task:1002)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') Task._doAbort: force False (task:937)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:947)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') moving from state failed -> state aborting (task:602)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') _aborting: recover policy none (task:557)
2018-10-25 12:46:58,987+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='c9847bf3-2267-483b-9099-f05a46981f7f') moving from state failed -> state failed (task:602)
2018-10-25 12:46:58,987+0900 DEBUG (jsonrpc/2) [storage.ResourceManager.Owner] Owner.releaseAll requests {} resources {} (resourceManager:910)
2018-10-25 12:46:58,987+0900 DEBUG (jsonrpc/2) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:947)
2018-10-25 12:46:58,987+0900 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH createStorageDomain error=Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\n" (dispatcher:86)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73, in wrapper
result = ctask.prepare(func, *args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
return m(self, *a, **kw)
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
raise self.error
Error: Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s, 0.0 kB/s\n"
2018-10-25 12:46:58,987+0900 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call StorageDomain.create failed (error 351) in 0.41 seconds (__init__:573)
2018-10-25 12:46:59,058+0900 DEBUG (jsonrpc/3) [jsonrpc.JsonRpcServer] Calling 'StoragePool.disconnectStorageServer' in bridge with {u'connectionParams': [{u'id': u'316c5f1f-753e-42a1-8e30-4ee6f976906a', u'connection': u'10.10.10.15@tcp:/lustre/vmstore', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'lustre', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 6} (__init__:590)
2018-10-25 12:46:59,058+0900 WARN (jsonrpc/3) [devel] Provided value "6" not defined in StorageDomainType enum for StoragePool.disconnectStorageServer (vdsmapi:275)
2018-10-25 12:46:59,058+0900 WARN (jsonrpc/3) [devel] Provided parameters {u'id': u'316c5f1f-753e-42a1-8e30-4ee6f976906a', u'connection': u'10.10.10.15@tcp:/lustre/vmstore', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'lustre', u'password': '********', u'port': u''} do not match any of union ConnectionRefParameters values (vdsmapi:275)
2018-10-25 12:46:59,059+0900 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='3c6b249f-a47f-47f1-a647-5893b6f60b7c') moving from state preparing -> state preparing (task:602)
2018-10-25 12:46:59,059+0900 INFO (jsonrpc/3) [vdsm.api] START disconnectStorageServer(domType=6, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'316c5f1f-753e-42a1-8e30-4ee6f976906a', u'connection': u'10.10.10.15@tcp:/lustre/vmstore', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'lustre', u'password': '********', u'port': u''}], options=None) from=::ffff:192.168.161.104,52188, flow_id=f1bf4bf8-9033-42af-9329-69960638ba0e, task_id=3c6b249f-a47f-47f1-a647-5893b6f60b7c (api:46)
2018-10-25 12:46:59,059+0900 INFO (jsonrpc/3) [storage.Mount] unmounting /rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore (mount:212)
2018-10-25 12:46:59,094+0900 DEBUG (jsonrpc/3) [storage.Mount] /rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore unmounted: 0.03 seconds (utils:452)
1. If you use the direct flag with the dd command, Lustre only accepts I/O in multiples of 4k (4096 bytes).
Therefore bs=256512, which is not a multiple of 4096, causes the error (see the quick check after point 2 below).
2. The error occurs regardless of the oVirt / RH(E)V version; I have tested 3.6, 4.1, and 4.2 environments.
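A quick way to confirm point 1 outside of vdsm (hypothetical client mount point
/mnt/lustre; the second command uses 256 KiB, which is a multiple of 4096):
dd if=/dev/zero of=/mnt/lustre/ddtest oflag=direct bs=256512 count=1   # fails with "Invalid argument" on Lustre
dd if=/dev/zero of=/mnt/lustre/ddtest oflag=direct bs=262144 count=1   # succeeds: 262144 = 64 * 4096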
I searched hard, but could not find a similar case. I want to ask three questions.
1. Is there a way to fix the problem on the oVirt side? (a safe workaround or some configuration)
2. (Extension of question 1) Where does the block size 256512 come from? Why is it 256512?
3. Is this a problem that needs to be solved in the Lustre file system? (For example, a setting that allows direct I/O in units of 512 bytes.)
I need help. Thank you for your reply.
6 years, 1 month