On Wed, Jul 10, 2019 at 4:18 PM Milan Zamazal <mzamazal@redhat.com> wrote:
Dafna Ron <dron@redhat.com> writes:

> Hi,
>
> We have a failure in test 004_basic_sanity.vdsm_recovery in the basic suite.
> The error seems to come from KSM (invalid argument).
>
> Can you please have a look?
>
> Link and headline of suspected patches:
>
>
> CQ identified this as the cause of the failure:
>
> https://gerrit.ovirt.org/#/c/101603/ - localFsSD: Enable 4k block_size and
> alignments
>
>
> However, I can see some py3 patches merged at the same time:
>
>
> py3: storage: Fix bytes x string in lvm locking type validation -
> https://gerrit.ovirt.org/#/c/101124/

OST was successfully run on this patch before merging, so it's unlikely
to be the cause.

> Link to Job:
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14963/
>
> Link to all logs:
> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/14963/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-004_basic_sanity.py/
>
> (Relevant) error snippet from the log:
>
> <error>
>
> s/da0eeccb-5dd8-47e5-9009-8a848fe17ea5.ovirt-guest-agent.0',) {}
> MainProcess|vm/da0eeccb::DEBUG::2019-07-10
> 07:53:41,003::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper)
> return prepareVmChannel with None
> MainProcess|jsonrpc/1::DEBUG::2019-07-10
> 07:54:05,580::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper)
> call ksmTune with ({u'pages_to_scan': 64, u'run': 1,
> u'sleep_millisecs': 89.25152465623417},) {}
> MainProcess|jsonrpc/1::ERROR::2019-07-10
> 07:54:05,581::supervdsm_server::103::SuperVdsm.ServerCallback::(wrapper)
> Error in ksmTune
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line
> 101, in wrapper
>     res = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_api/ksm.py", line
> 45, in ksmTune
>     f.write(str(v))

This writes to files in /sys/kernel/mm/ksm/ and the values are checked
in the code.  It's also weird that Vdsm starts happily and then fails on
this in recovery.
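
To make the write path concrete, here is a minimal sketch of a ksmTune-style
helper. It is not the Vdsm code: the function names, the tunable list and the
base_dir parameter (a stand-in for /sys/kernel/mm/ksm/) are hypothetical. It
assumes the kernel expects unsigned integers in these files, in which case a
value like the 89.25152465623417 passed for sleep_millisecs in the log above
would be one candidate for the "Invalid argument" error.

```python
import os
import tempfile

# Hypothetical list of tunables, mirroring the keys seen in the log above.
KSM_TUNABLES = ("run", "pages_to_scan", "sleep_millisecs")


def normalize_ksm_params(params):
    """Return a copy of params with every value coerced to a non-negative int.

    Assumption: the kernel parses these files as unsigned integers, so a
    string such as '89.25152465623417' would be rejected.
    """
    clean = {}
    for key, value in params.items():
        if key not in KSM_TUNABLES:
            raise ValueError("unknown KSM tunable: %r" % (key,))
        number = int(value)  # truncates floats such as 89.25 to 89
        if number < 0:
            raise ValueError("negative value for %s: %r" % (key, value))
        clean[key] = number
    return clean


def write_ksm_params(params, base_dir):
    """Write each normalized value to base_dir/<name>, one file per tunable."""
    for key, number in normalize_ksm_params(params).items():
        with open(os.path.join(base_dir, key), "w") as f:
            f.write(str(number))


def demo():
    """Write the values from the failing call into a scratch directory."""
    scratch = tempfile.mkdtemp()
    write_ksm_params(
        {"pages_to_scan": 64, "run": 1, "sleep_millisecs": 89.25152465623417},
        scratch,
    )
    with open(os.path.join(scratch, "sleep_millisecs")) as f:
        return f.read()
```

Calling demo() writes to a temporary directory rather than to sysfs, so it
can be run on any machine. Whether truncating or rounding is the right
policy for the real code is a separate question.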

Can you rule out a problem with the system OST runs on?

I see there was a change to KSM in a patch 7 weeks ago, which would explain
the failure we are seeing. However, I am not sure why it's failing the test
now, and I am not seeing any other error that could cause this.

Adding Ehud and Evgheni.
Are the manual jobs running on containers or physical servers?



https://gerrit.ovirt.org/#/c/95994/ - fix path of ksm files in a comment

> IOError: [Errno 22] Invalid argument
> MainProcess|jsonrpc/5::DEBUG::2019-07-10
> 07:56:33,211::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper)
> call rmAppropriateMultipathRules with
> ('da0eeccb-5dd8-47e5-9009-8a848fe17ea5',) {}
>
>
> </error>