host-deploy is still broken on master fc28
On Mon, Sep 17, 2018 at 8:01 AM, Yuval Turgeman <yturgema(a)redhat.com> wrote:
I'm pretty sure I verified this on el7 as well; I'll check again. Thinking
about it, though: tar stops when it gets to the first empty block. So if the
record size on the engine's side is large and the end of the archive is
padded with zeros, -b1 makes tar on the host stop at that first empty block,
and the next read on the host's side gets the trailing zeros, which is what
otopi reads.
Btw, this could be a problem on deployed el7 systems as well if, for any
reason, the default on the host is set to more than 20 blocks (it can be set
with export TAR_BLOCKING_FACTOR for the root account on the host side).
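
To make the effect described above concrete, here is a rough Python sketch. It is only a toy model of the explanation in this message, not GNU tar's actual reading logic: it relies on Python's tarfile module padding archives to a 20-block record by default, and the function names and the sample payload are made up for illustration.

```python
import io
import tarfile

BLOCK = 512          # size of one tar block in bytes
RECORD = 20 * BLOCK  # default record size (blocking factor 20)


def build_archive(payload: bytes, name: str = "ovirt-host-deploy") -> bytes:
    """Return an in-memory tar archive holding a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def consume_until_empty_block(stream: io.BytesIO) -> int:
    """Toy reader: read 512-byte blocks and stop after the first all-zero one."""
    consumed = 0
    while True:
        block = stream.read(BLOCK)
        if not block:
            break
        consumed += len(block)
        if block == bytes(BLOCK):
            break
    return consumed


def leftover_after_extract(archive: bytes, reply: bytes) -> bytes:
    """Return what the next reader on the same stream would see."""
    stream = io.BytesIO(archive + reply)
    consume_until_empty_block(stream)
    return stream.read()


if __name__ == "__main__":
    archive = build_archive(b"#!/bin/sh\necho deploy\n")
    rest = leftover_after_extract(archive, b"CONFIRM DEPLOY_PROCEED=yes\n")
    nulls = len(rest) - len(rest.lstrip(b"\0"))
    print(f"archive: {len(archive)} bytes, nulls left for the next reader: {nulls}")
```

In this model the archive is padded to a whole record, the reader stops early, and whatever reads the stream next gets the rest of the padding followed by the dialog reply.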
It's OK to revert the patch to fix the regression, but I don't see any way
to do this other than -b1... perhaps add a `cat -` after it to just read
until EOF or something, or have otopi strip the input.
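
As an illustration of the "strip the input" idea, a minimal sketch could look like the following. This is not otopi's code or API: `parse_reply` and `VALID_REPLIES` are hypothetical names, and the three accepted replies are the ones listed in the quoted discussion below.

```python
# Hypothetical helper, not part of otopi: drop NUL padding before validating.
VALID_REPLIES = frozenset({
    "CONFIRM DEPLOY_PROCEED=yes",
    "CONFIRM DEPLOY_PROCEED=no",
    "ABORT DEPLOY_PROCEED",
})


def parse_reply(raw: bytes) -> str:
    """Strip NUL bytes and whitespace around a dialog reply, then validate it."""
    text = raw.strip(b"\0 \t\r\n").decode("utf-8")
    if text not in VALID_REPLIES:
        raise ValueError(f"unexpected reply: {text!r}")
    return text


if __name__ == "__main__":
    padded = b"\0" * 7169 + b"CONFIRM DEPLOY_PROCEED=yes\n"
    print(parse_reply(padded))
```

Stripping on the otopi side would hide the padding rather than remove its cause, so it is a workaround at best.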
On Mon, Sep 17, 2018 at 2:30 PM, Galit Rosenthal <grosenth(a)redhat.com> wrote:
> Didi,
>
> Is this what you are looking for?
> https://ovirt-jira.atlassian.net/browse/OVIRT-2259
>
> Galit
>
> On Mon, Sep 17, 2018 at 1:54 PM Dafna Ron <dron(a)redhat.com> wrote:
>
>> I think that in ovirt-engine we currently only build for CentOS.
>> Since we have not had an engine build for 2 weeks (on master), I think we
>> should merge and worry about fc28 once it becomes relevant.
>>
>> The failure we have now could be another regression that was missed
>> because the project has been broken for two weeks.
>>
>> Thanks,
>> Dafna
>>
>>
>>
>> On Mon, Sep 17, 2018 at 10:30 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
>>
>>> On Mon, Sep 17, 2018 at 11:49 AM Dafna Ron <dron(a)redhat.com> wrote:
>>> >
>>> > Didi, Marin, any update on the patch?
>>>
>>> Yes - it passed. Actually failed, but only after host-deploy:
>>>
>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/3189/
>>>
>>> I'd rather not merge it as-is, because it will break fedora.
>>>
>>> If someone can have a look at the code generating the tar file, and can
>>> see if it's easy to make it work well for both centos and fedora, perhaps
>>> by explicitly setting all relevant params to some reasonable values, great.
>>> Otherwise, I guess we can merge for now, as fedora is still not supported
>>> anyway.
>>>
>>> Thanks,
>>>
>>> >
>>> >
>>> > On Sun, Sep 16, 2018 at 11:09 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
>>> >>
>>> >> On Sun, Sep 16, 2018 at 12:53 PM Yedidyah Bar David <didi(a)redhat.com> wrote:
>>> >> >
>>> >> > On Fri, Sep 14, 2018 at 6:06 PM Martin Perina <mperina(a)redhat.com> wrote:
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > On Fri, Sep 14, 2018 at 4:51 PM, Ravi Shankar Nori <rnori(a)redhat.com> wrote:
>>> >> > >>
>>> >> > >> I see the same errors on my dev env. From the logs attached by Andrej the response received by otopi has a bunch of null chars before the actual response CONFIRM DEPLOY_PROCEED=yes
>>> >> > >>
>>> >> > >>
>>> >> > >>
>>> >> > >> 2018-09-14 15:49:23,018+0200 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:204 DIALOG:SEND ### Response is CONFIRM DEPLOY_PROCEED=yes|no or ABORT DEPLOY_PROCEED
>>> >> > >>
>>> >> > >> ^@^@^@^@^@^@^@^@^@CONFIRM DEPLOY_PROCEED=yes
>>> >> > >
>>> >> > >
>>> >> > > Didi/Sandro, could you please take a look? Below error seems like some issue in otopi, where an error is raised when handling binary input:
>>> >> >
>>> >> > Not sure the issue is "binary input" in general, but simply illegal
>>> >> > input. The prompt expects, as it says, one of these 3 replies:
>>> >> >
>>> >> > CONFIRM DEPLOY_PROCEED=yes
>>> >> > CONFIRM DEPLOY_PROCEED=no
>>> >> > ABORT DEPLOY_PROCEED
>>> >> >
>>> >> > Instead, judging from the file supplied by Andrej, it gets from the engine:
>>> >> > <7169 null bytes>CONFIRM DEPLOY_PROCEED=yes
>>> >> >
>>> >> > So either the engine now sends, for some reason, 7169 null bytes in
>>> >> > this response, or there is some low-level change causing this to be
>>> >> > eventually supplied to otopi - a change in apache-sshd, openssh, some
>>> >> > library, the kernel, no idea.
>>> >> >
>>> >> > Well, thinking a bit, I have a wild guess: Perhaps it's related to the
>>> >> > patch introduced recently to change the tar blocking?
>>> >>
>>> >>
>>> >> https://gerrit.ovirt.org/94357
>>> >>
>>> >> I am leaving soon, perhaps someone can try the manual job with the
>>> >> result of the check-patch job for above patch, to see if it fixes.
>>> >> Otherwise I'll do this tomorrow.
>>> >>
>>> >> >
>>> >> > >
>>> >> > >
>>> >> > > 2018-09-14 15:49:23,032+0200 DEBUG otopi.context context._executeMethod:143 method exception
>>> >> > > Traceback (most recent call last):
>>> >> > >   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 133, in _executeMethod
>>> >> > >     method['method']()
>>> >> > >   File "/tmp/ovirt-O6CfS4aUHI/otopi-plugins/ovirt-host-deploy/core/misc.py", line 87, in _confirm
>>> >> > >     prompt=True,
>>> >> > >   File "/tmp/ovirt-O6CfS4aUHI/otopi-plugins/otopi/dialog/machine.py", line 478, in confirm
>>> >> > >     code=opcode,
>>> >> > >
>>> >> > >
>>> >> > >>
>>> >> > >> On Fri, Sep 14, 2018 at 10:44 AM, Dafna Ron <dron(a)redhat.com> wrote:
>>> >> > >>>
>>> >> > >>> If you run it with mock you would remove any environmental conditions that can affect the outcome, so I recommend using mock.
>>> >> > >>>
>>> >> > >>>
>>> >> > >>> On Fri, Sep 14, 2018 at 3:32 PM, Martin Perina <mperina(a)redhat.com> wrote:
>>> >> > >>>>
>>> >> > >>>>
>>> >> > >>>>
>>> >> > >>>> On Fri, Sep 14, 2018 at 3:49 PM, Dafna Ron <dron(a)redhat.com> wrote:
>>> >> > >>>>>
>>> >> > >>>>> did you use mock to reproduce?
>>> >> > >>>>
>>> >> > >>>>
>>> >> > >>>> No, just run_suite under myself
>>> >> > >>>>>
>>> >> > >>>>>
>>> >> > >>>>> On Fri, Sep 14, 2018 at 2:39 PM, Martin Perina <mperina(a)redhat.com> wrote:
>>> >> > >>>>>>
>>> >> > >>>>>> Hi,
>>> >> > >>>>>>
>>> >> > >>>>>> The problem is that we haven't fetched the temporary host-deploy log from the /tmp directory, so we don't know which string sent by the host-deploy process to the engine is causing the issue. I tried to reproduce it on my local machine but was unable to; the 002_bootstrap phase finished successfully (other phases are still running).
>>> >> > >>>>>>
>>> >> > >>>>>> So if anyone is able to reproduce, please try to fetch the host-deploy log from the /tmp directory after the error is raised and share it.
>>> >> > >>>>>>
>>> >> > >>>>>> Thanks
>>> >> > >>>>>>
>>> >> > >>>>>> Martin
>>> >> > >>>>>>
>>> >> > >>>>>>
>>> >> > >>>>>> On Fri, Sep 14, 2018 at 1:52 PM, Dafna Ron <dron(a)redhat.com> wrote:
>>> >> > >>>>>>>
>>> >> > >>>>>>> Full logs can be found here:
>>> >> > >>>>>>>
>>> >> > >>>>>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/10307/artifact/upgrade-from-release-suite.el7.x86_64/test_logs/upgrade-from-release-suite-master/post-002_bootstrap.py/
>>> >> > >>>>>>>
>>> >> > >>>>>>> On Fri, Sep 14, 2018 at 12:48 PM, Dafna Ron <dron(a)redhat.com> wrote:
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> Hi,
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> The previous regression was resolved and we now have a new regression.
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> I don't think that the reported change is related so can someone from ovirt-engine take a look?
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> The failure is add host on the upgrade suite.
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> Please note that we have not had an engine-ovirt build for over 10 days due to several consecutive regressions and I would ask you to stop merging until we can stabilize the project and have a new package of engine.
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> error:
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> 2018-09-14 05:51:07,670-04 INFO [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] SSH execute 'root@lago-upgrade-from-release-suite-master-host-0' 'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar -b1 --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True'
>>> >> > >>>>>>>> 2018-09-14 05:51:08,550-04 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [5c91fcbd] EVENT_ID: VDS_INSTALL_IN_PROGRESS(509), Installing Host lago-upgrade-from-release-suite-master-host-0. Stage: Initializing.
>>> >> > >>>>>>>> 2018-09-14 05:51:08,565-04 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (VdsDeploy) [5c91fcbd] transaction rolled back
>>> >> > >>>>>>>> 2018-09-14 05:51:08,574-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [5c91fcbd] Error during deploy dialog
>>> >> > >>>>>>>> 2018-09-14 05:51:08,578-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Error during host lago-upgrade-from-release-suite-master-host-0 install
>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during installation of Host lago-upgrade-from-release-suite-master-host-0: CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Error during host lago-upgrade-from-release-suite-master-host-0 install, preferring first exception: CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
>>> >> > >>>>>>>> 2018-09-14 05:51:08,586-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Host installation failed for host 'e475e93a-63b3-4573-b242-162c2ed864f0', 'lago-upgrade-from-release-suite-master-host-0': CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00
>>> >> > >>>>>>>> 2018-09-14 05:51:08,615-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] START, SetVdsStatusVDSCommand(HostName = lago-upgrade-from-release-suite-master-host-0, SetVdsStatusVDSCommandParameters:{hostId='e475e93a-63b3-4573-b242-162c2ed864f0', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 146cdc08
>>> >> > >>>>>>>> 2018-09-14 05:51:08,626-04 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] FINISH, SetVdsStatusVDSCommand, return: , log id: 146cdc08
>>> >> > >>>>>>>> 2018-09-14 05:51:08,639-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] EVENT_ID: VDS_INSTALL_FAILED(505), Host lago-upgrade-from-release-suite-master-host-0 installation failed. CallableStatementCallback; SQL [{call insertauditlog(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: invalid byte sequence for encoding "UTF8": 0x00; nested exception is org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
>>> >> > >>>>>>>> 2018-09-14 05:51:08,652-04 INFO [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [5c91fcbd] Lock freed to object 'EngineLock:{exclusiveLocks='[e475e93a-63b3-4573-b242-162c2ed864f0=VDS]', sharedLocks=''}'
>>> >> > >>>>>>>> 2018-09-14 05:51:37,996-04 INFO [org.ovirt.engine.core.bll.quota.QuotaManager] (EE-ManagedThreadFactory-engineScheduled-Thread-44) [] Quota Cache updated. (19 msec)
>>> >> > >>>>>>>> (END)
>>> >> > >>>>>>>>
>>> >> > >>>>>>>> Thanks,
>>> >> > >>>>>>>> Dafna
>>> >> > >>>>>>>>
>>> >> > >>>>>>>
>>> >> > >>>>>>
>>> >> > >>>>>>
>>> >> > >>>>>>
>>> >> > >>>>>> --
>>> >> > >>>>>> Martin Perina
>>> >> > >>>>>> Associate Manager, Software Engineering
>>> >> > >>>>>> Red Hat Czech s.r.o.
>>> >> > >>>>>
>>> >> > >>>>>
>>> >> > >>>>
>>> >> > >>>>
>>> >> > >>>>
>>> >> > >>>> --
>>> >> > >>>> Martin Perina
>>> >> > >>>> Associate Manager, Software Engineering
>>> >> > >>>> Red Hat Czech s.r.o.
>>> >> > >>>
>>> >> > >>>
>>> >> > >>
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > --
>>> >> > > Martin Perina
>>> >> > > Associate Manager, Software Engineering
>>> >> > > Red Hat Czech s.r.o.
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Didi
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Didi
>>>
>>>
>>>
>>> --
>>> Didi
>>>
>> _______________________________________________
>> Infra mailing list -- infra(a)ovirt.org
>> To unsubscribe send an email to infra-leave(a)ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>> List Archives: https://lists.ovirt.org/archives/list/infra(a)ovirt.org/message/CG2IYPXSSEFTL6XCN72JHUSWOUY7QRSA/
>>
>
>
> --
>
> GALIT ROSENTHAL
>
> SOFTWARE ENGINEER
>
> Red Hat
>
> <https://www.redhat.com/>
>
> galit(a)gmail.com T: 972-9-7692230
> <https://red.ht/sig>
>