Hi.
I am an oVirt user in Korea, working on VDI. It is a pleasure to meet the oVirt
specialists here.
(My English is not very good, so thank you for your understanding!)
I am testing the Lustre file system in an oVirt / RH(E)V environment.
(The reason is simple: GlusterFS and NFS have performance limits, and SAN storage and
good software-defined storage products are quite expensive.)
The file system performance tests were successful.
As expected, Lustre showed excellent performance.
However, adding the Lustre storage as a storage domain of type POSIX compliant FS failed
with an error. The settings I used were:
Domain Function : Data
Storage Type : POSIX compliant FS
Host to Use : [SPM_HOSTNAME]
Name : [STORAGE_DOMAIN_NAME]
Path : 10.10.10.15@tcp:/lustre/vmstore
VFS Type : lustre
Mount Options :
The vdsm debug logs are shown below.
2018-10-25 12:46:58,963+0900 INFO (jsonrpc/2) [storage.xlease] Formatting index for
lockspace u'c0ef7ee6-1da9-4eef-9e03-387cd3a24445' (version=1) (xlease:653)
2018-10-25 12:46:58,971+0900 DEBUG (jsonrpc/2) [root] /usr/bin/dd iflag=fullblock
of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases
oflag=direct,seek_bytes seek=1048576 bs=256512 count=1 conv=notrunc,nocreat,fsync (cwd
None) (commands:65)
2018-10-25 12:46:58,985+0900 DEBUG (jsonrpc/2) [root] FAILED: <err> =
"/usr/bin/dd: error writing
'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases':
Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\n"; <rc> = 1 (commands:86)
2018-10-25 12:46:58,985+0900 INFO (jsonrpc/2) [vdsm.api] FINISH createStorageDomain
error=Command ['/usr/bin/dd', 'iflag=fullblock',
u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases',
'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512',
'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1
out='[suppressed]' err="/usr/bin/dd: error writing
'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases':
Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\n" from=::ffff:192.168.161.104,52188, flow_id=794bd395,
task_id=c9847bf3-2267-483b-9099-f05a46981f7f (api:50)
2018-10-25 12:46:58,985+0900 ERROR (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') Unexpected error (task:875)
Traceback (most recent call last): File
"/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
return fn(*args, **kargs) File "<string>", line 2, in
createStorageDomain
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2591, in
createStorageDomain
storageType, domVersion)
File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 87, in
create
remotePath, storageType, version)
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 465, in
_prepareMetadata
cls.format_external_leases(sdUUID, xleases_path)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 1200, in
format_external_leases
xlease.format_index(lockspace, backend)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 661, in
format_index
index.dump(file)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 761, in
dump
file.pwrite(INDEX_BASE, self._buf)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 994, in
pwrite
self._run(args, data=buf[:])
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 1011, in
_run
raise cmdutils.Error(args, rc, "[suppressed]", err)
Error: Command ['/usr/bin/dd', 'iflag=fullblock',
u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases',
'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512',
'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1
out='[suppressed]' err="/usr/bin/dd: error writing
'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases':
Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\n"
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') Task._run:
c9847bf3-2267-483b-9099-f05a46981f7f (6, u'c0ef7ee6-1da9-4eef-9e03-387cd3a24445',
u'vmstore', u'10.10.10.15@tcp:/lustre/vmstore', 1, u'4') {} failed
- stopping task (task:894)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') stopping in state failed (force
False) (task:1256)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') ref 1 aborting True (task:1002)
2018-10-25 12:46:58,986+0900 INFO (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') aborting: Task is aborted:
u'Command [\'/usr/bin/dd\', \'iflag=fullblock\',
u\'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases\',
\'oflag=direct,seek_bytes\', \'seek=1048576\', \'bs=256512\',
\'count=1\', \'conv=notrunc,nocreat,fsync\'] failed with rc=1
out=\'[suppressed]\' err="/usr/bin/dd: error writing
\'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases\':
Invalid argument\\n1+0 records in\\n0+0 records out\\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\\n"' - code 100 (task:1181)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') Prepare: aborted: Command
['/usr/bin/dd', 'iflag=fullblock',
u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases',
'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512',
'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1
out='[suppressed]' err="/usr/bin/dd: error writing
'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases':
Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\n" (task:1186)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') ref 0 aborting True (task:1002)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') Task._doAbort: force False
(task:937)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.ResourceManager.Owner]
Owner.cancelAll requests {} (resourceManager:947)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') moving from state failed -> state
aborting (task:602)
2018-10-25 12:46:58,986+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') _aborting: recover policy none
(task:557)
2018-10-25 12:46:58,987+0900 DEBUG (jsonrpc/2) [storage.TaskManager.Task]
(Task='c9847bf3-2267-483b-9099-f05a46981f7f') moving from state failed -> state
failed (task:602)
2018-10-25 12:46:58,987+0900 DEBUG (jsonrpc/2) [storage.ResourceManager.Owner]
Owner.releaseAll requests {} resources {} (resourceManager:910)
2018-10-25 12:46:58,987+0900 DEBUG (jsonrpc/2) [storage.ResourceManager.Owner]
Owner.cancelAll requests {} (resourceManager:947)
2018-10-25 12:46:58,987+0900 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH
createStorageDomain error=Command ['/usr/bin/dd', 'iflag=fullblock',
u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases',
'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512',
'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1
out='[suppressed]' err="/usr/bin/dd: error writing
'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases':
Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\n" (dispatcher:86)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73,
in wrapper
result = ctask.prepare(func, *args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in
wrapper
return m(self, *a, **kw)
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in
prepare
raise self.error
Error: Command ['/usr/bin/dd', 'iflag=fullblock',
u'of=/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases',
'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512',
'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1
out='[suppressed]' err="/usr/bin/dd: error writing
'/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore/c0ef7ee6-1da9-4eef-9e03-387cd3a24445/dom_md/xleases':
Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.000943896 s,
0.0 kB/s\n"
2018-10-25 12:46:58,987+0900 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call
StorageDomain.create failed (error 351) in 0.41 seconds (__init__:573)
2018-10-25 12:46:59,058+0900 DEBUG (jsonrpc/3) [jsonrpc.JsonRpcServer] Calling
'StoragePool.disconnectStorageServer' in bridge with {u'connectionParams':
[{u'id': u'316c5f1f-753e-42a1-8e30-4ee6f976906a', u'connection':
u'10.10.10.15@tcp:/lustre/vmstore', u'iqn': u'', u'user':
u'', u'tpgt': u'1', u'vfs_type': u'lustre',
u'password': '********', u'port': u''}],
u'storagepoolID': u'00000000-0000-0000-0000-000000000000',
u'domainType': 6} (__init__:590)
2018-10-25 12:46:59,058+0900 WARN (jsonrpc/3) [devel] Provided value "6" not
defined in StorageDomainType enum for StoragePool.disconnectStorageServer (vdsmapi:275)
2018-10-25 12:46:59,058+0900 WARN (jsonrpc/3) [devel] Provided parameters {u'id':
u'316c5f1f-753e-42a1-8e30-4ee6f976906a', u'connection':
u'10.10.10.15@tcp:/lustre/vmstore', u'iqn': u'', u'user':
u'', u'tpgt': u'1', u'vfs_type': u'lustre',
u'password': '********', u'port': u''} do not match any of
union ConnectionRefParameters values (vdsmapi:275)
2018-10-25 12:46:59,059+0900 DEBUG (jsonrpc/3) [storage.TaskManager.Task]
(Task='3c6b249f-a47f-47f1-a647-5893b6f60b7c') moving from state preparing ->
state preparing (task:602)
2018-10-25 12:46:59,059+0900 INFO (jsonrpc/3) [vdsm.api] START
disconnectStorageServer(domType=6, spUUID=u'00000000-0000-0000-0000-000000000000',
conList=[{u'id': u'316c5f1f-753e-42a1-8e30-4ee6f976906a',
u'connection': u'10.10.10.15@tcp:/lustre/vmstore', u'iqn':
u'', u'user': u'', u'tpgt': u'1',
u'vfs_type': u'lustre', u'password': '********',
u'port': u''}], options=None) from=::ffff:192.168.161.104,52188,
flow_id=f1bf4bf8-9033-42af-9329-69960638ba0e, task_id=3c6b249f-a47f-47f1-a647-5893b6f60b7c
(api:46)
2018-10-25 12:46:59,059+0900 INFO (jsonrpc/3) [storage.Mount] unmounting
/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore (mount:212)
2018-10-25 12:46:59,094+0900 DEBUG (jsonrpc/3) [storage.Mount]
/rhev/data-center/mnt/10.10.10.15@tcp:_lustre_vmstore unmounted: 0.03 seconds (utils:452)
What I have found so far:
1. When dd is run with the direct flag (oflag=direct), Lustre only accepts writes whose
size is a multiple of 4 KiB (4096 bytes).
Therefore bs=256512, which is not a multiple of 4096, causes the error.
2. The error occurs regardless of the oVirt / RH(E)V version. I have tested it in 3.6,
4.1, and 4.2 environments.
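To make the alignment point concrete, here is a small sketch of the arithmetic. The two
numbers are copied from the failing dd command in the log above; the helper names
(is_aligned, round_up) are only for illustration and are not part of vdsm.

```python
# Values copied from the failing dd command in the vdsm log.
BLOCK_SIZE = 256512    # bs=256512
SEEK_OFFSET = 1048576  # seek=1048576 (a byte offset, because of oflag=seek_bytes)


def is_aligned(value, alignment):
    """Return True if value is an exact multiple of alignment."""
    return value % alignment == 0


def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


for alignment in (512, 4096):
    print("alignment %d: bs aligned=%s, seek aligned=%s" % (
        alignment,
        is_aligned(BLOCK_SIZE, alignment),
        is_aligned(SEEK_OFFSET, alignment),
    ))

# Smallest 4096-aligned size that could hold the same data.
print("bs rounded up to 4096:", round_up(BLOCK_SIZE, 4096))
```

So the write size is a multiple of 512 (256512 = 501 * 512) but not of 4096, while the
seek offset is aligned to both. Only the write size is the problem here.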
I searched hard, but could not find a similar case. I would like to ask three questions.
1. Is there a way to work around the problem in oVirt? (A safe bypass or some
configuration setting)
2. (Extension of question 1) Where does the block size 256512 come from? Why is it 256512?
3. Is this something that needs to be solved on the Lustre side? (For example, a setting
that allows direct I/O in units of 512 bytes)
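For anyone who wants to reproduce this outside of vdsm, the following is a minimal sketch
of a probe that tries one O_DIRECT write of a given size in a directory. It is not what
vdsm does internally; the function name direct_write_ok and the mount path in the usage
note are my own assumptions, and it only works on Linux (os.O_DIRECT).

```python
import errno
import mmap
import os
import tempfile


def direct_write_ok(directory, size):
    """Try a single O_DIRECT write of `size` bytes in `directory`.

    Returns True if the whole write succeeds, False if the kernel or
    file system rejects it with EINVAL (the error dd reported).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    # An anonymous mmap is page-aligned, so the buffer address itself
    # satisfies O_DIRECT alignment; only the size is under test.
    buf = mmap.mmap(-1, size)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_DIRECT)
        try:
            return os.write(fd, buf) == size
        finally:
            os.close(fd)
    except OSError as e:
        if e.errno == errno.EINVAL:
            return False
        raise
    finally:
        buf.close()
        os.unlink(path)
```

For example, calling direct_write_ok("/mnt/lustre", 256512) and
direct_write_ok("/mnt/lustre", 258048) on a Lustre client mount (the path here is
hypothetical) would show whether only 4096-aligned sizes are accepted.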
I need help. Thank you in advance for your replies.