Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit

Hi infra,

I have seen this failure today:

16:52:35 Triggered by Gerrit: http://gerrit.ovirt.org/28610
16:52:35 Building remotely on dcaro-ovirt-vm02-fc20 (os1 fedora20) in workspace /home/jenkins/workspace/vdsm_master_storage_functional_tests_localfs_gerrit
...
16:53:00 + sudo sh -c 'echo "" > /var/log/vdsm/vdsm.log'
16:53:00 sh: /var/log/vdsm/vdsm.log: No such file or directory
16:53:02 Build step 'Execute shell' marked build as failure

We need more robust code. Something like:

vdsm_log="/var/log/vdsm/vdsm.log"
if [ -f "$vdsm_log" ]; then
    sudo sh -c "echo '' > '$vdsm_log'"
fi

Thanks,
Nir
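
A minimal sketch of the "only truncate if present" idea, not the job's actual configuration. The path is handed to the inner shell as a positional parameter, so nothing depends on an outer-shell variable being expanded inside a quoted command. The function name truncate_if_present and the scratch directory are illustrative assumptions; the real job would presumably run the inner command under sudo against /var/log/vdsm/vdsm.log.

```shell
#!/bin/sh
# Sketch: empty a log file only if it already exists.
# truncate_if_present is a hypothetical helper, not part of the Jenkins job.
truncate_if_present() {
    log=$1
    if [ -f "$log" ]; then
        # In the job this line would presumably be prefixed with sudo.
        # The path arrives in the inner shell as "$1".
        sh -c ': > "$1"' sh "$log"
    fi
}

# Demo on a scratch directory instead of /var/log/vdsm.
scratch=$(mktemp -d)
printf 'old contents\n' > "$scratch/vdsm.log"

truncate_if_present "$scratch/vdsm.log"      # existing file is emptied
truncate_if_present "$scratch/missing.log"   # missing file: nothing happens
```

Note that ': > file' leaves a zero-length file, whereas the original 'echo "" > file' leaves a file containing a single newline.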

We're not maintaining the functional tests code in Jenkins; I believe Vered is.

Vered - do you need any help with refactoring this?

Eyal.

----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com>
To: "infra" <infra@ovirt.org>, "Vered Volansky" <vvolansk@redhat.com>, "Dan Kenigsberg" <danken@redhat.com>
Sent: Wednesday, June 11, 2014 9:10:40 PM
Subject: Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit
_______________________________________________
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra

The job with this issue is gone; let me know if the failure comes up again.

Vered

----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com>
To: "Nir Soffer" <nsoffer@redhat.com>
Cc: "infra" <infra@ovirt.org>, "Vered Volansky" <vvolansk@redhat.com>, "Dan Kenigsberg" <danken@redhat.com>
Sent: Thursday, June 12, 2014 2:37:33 PM
Subject: Re: Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit

On Sun, Jun 15, 2014 at 04:11:53AM -0400, Vered Volansky wrote:
> The job with this issue is gone, let me know if it's risen again.

The fragile code is still in
http://jenkins.ovirt.org/view/All/job/vdsm_master_storage_functional_tests_l...
Why not make it more robust before /var/log/vdsm disappears and breaks it again?

BTW, if /var/log/vdsm/vdsm.log is missing, the discussed line would create it owned by root, which would make vdsm fail to start. Please use the vdsm user instead.
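
The ownership point can be sketched as follows, under stated assumptions: reset_log_as is a hypothetical helper, and the demo uses the current user and a scratch directory so that it runs without root. In the job, the owner would be vdsm and the path /var/log/vdsm/vdsm.log, which also assumes that the vdsm user may write to that directory.

```shell
#!/bin/sh
# Sketch: create or empty a log file so that it ends up owned by a given
# user, rather than by root.
# reset_log_as is a hypothetical helper, not part of the Jenkins job.
reset_log_as() {
    owner=$1
    log=$2
    if [ "$owner" = "$(id -un)" ]; then
        # Already running as the target user: no sudo needed.
        sh -c ': > "$1"' sh "$log"
    else
        # Run the redirection itself as the target user, so that a newly
        # created file belongs to that user.
        sudo -u "$owner" sh -c ': > "$1"' sh "$log"
    fi
}

# Demo: the current user stands in for vdsm.
scratch=$(mktemp -d)
reset_log_as "$(id -un)" "$scratch/vdsm.log"
ls -ldn "$scratch/vdsm.log"   # third column is the numeric owner
```

The key design choice is that the redirection runs inside the process started for the target user. With 'sudo sh -c ...' the file is opened by root, so a missing file is created with root ownership.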

----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com>
To: "Vered Volansky" <vered@redhat.com>
Cc: "infra" <infra@ovirt.org>
Sent: Monday, June 16, 2014 11:29:42 AM
Subject: Re: Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit

> The fragile code is still in
> http://jenkins.ovirt.org/view/All/job/vdsm_master_storage_functional_tests_l...
> why not make it more robust before /var/log/vdsm disappears and make it break again?

Because I don't understand the issue. The file is only created if missing. The directory should be there.

> BTW, if /var/log/vdsm/vdsm.log is missing, the discussed line would create it owned by root, which would fail the startup of vdsm. Please use the vdsm user instead.

Ack.

----- Original Message -----
From: "Vered Volansky" <vered@redhat.com>
To: "Dan Kenigsberg" <danken@redhat.com>
Cc: "infra" <infra@ovirt.org>
Sent: Tuesday, June 17, 2014 10:40:51 AM
Subject: Re: Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit

> because I don't understand the issue. The file is only created if missing. The directory should be there.

The test failure proves that it may not be there in all cases.

> > BTW, if /var/log/vdsm/vdsm.log is missing, the discussed line would create it owned by root, which would fail the startup of vdsm. Please use the vdsm user instead.
>
> Ack.

Why not remove the file? It is much simpler and cannot fail:

rm -f /var/log/vdsm/vdsm.log
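
The difference between the two approaches when the directory is missing can be shown with a small experiment. This is only a sketch on a scratch path, with variable names chosen for illustration. It covers the missing-path case only: rm -f can still fail for other reasons, such as insufficient permissions, and the thread does not say whether vdsm recreates its log file on startup.

```shell
#!/bin/sh
# Sketch: compare redirection and rm -f when the parent directory is missing.
scratch=$(mktemp -d)
log="$scratch/no-such-dir/vdsm.log"

# Redirecting into a missing directory fails, as in the Jenkins console output.
if sh -c ': > "$1"' sh "$log" 2>/dev/null; then
    redirect_status=ok
else
    redirect_status=failed
fi

# rm -f treats a nonexistent operand as success.
if rm -f "$log"; then
    rm_status=ok
else
    rm_status=failed
fi

echo "redirect: $redirect_status, rm -f: $rm_status"
```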

On Tue, Jun 17, 2014 at 03:40:51AM -0400, Vered Volansky wrote:
> because I don't understand the issue. The file is only created if missing. The directory should be there.

However, apparently it was not there, which made the echo fail, which led to the job failing. We should understand why it disappeared.

dcaro, eedri - do you have any idea?

----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com>
To: "Vered Volansky" <vered@redhat.com>, dcaro@redhat.com, eedri@redhat.com
Cc: "infra" <infra@ovirt.org>
Sent: Tuesday, June 17, 2014 11:24:49 AM
Subject: Re: Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit

> However, apparently it was not there, which made the echo fail, which led to the job failing. We should understand why it disappeared.

Maybe it was removed by a yum remove vdsm running in parallel?

On Tue 17 Jun 2014 10:24:49 AM CEST, Dan Kenigsberg wrote:
> On Tue, Jun 17, 2014 at 03:40:51AM -0400, Vered Volansky wrote:
> > > The fragile code is still in
> > > http://jenkins.ovirt.org/view/All/job/vdsm_master_storage_functional_tests_localfs_gerrit/configure
> > > why not make it more robust before /var/log/vdsm disappears and make it break again?
> >
> > because I don't understand the issue. The file is only created if missing. The directory should be there.

It was fixed by me some time ago (I added the mkdir -p before the touch, just in case):

sudo mkdir -p /var/log/vdsm
sudo chown vdsm:kvm
sudo sh -c 'echo "" > /var/log/vdsm/vdsm.log'
sudo sh -c 'echo "" > /var/log/vdsm/supervdsm.log'

> However, apparently it was not there, which made the echo fail, which led to the job failing. We should understand why it disappeared.
>
> dcaro, eedri - do you have any idea?

Totally agree, and if it was meant to be there, I'll remove the mkdir to make the test fail if it's not there.

But from what I see on the job, nothing ensures that the directory will be there: vdsm might never have been installed on that machine, or might have been properly cleaned at some point (removing logs and leftovers).

So, in my opinion, the issue is that we are not cleaning up properly after the vdsm jobs and are leaving the logs behind.

Also, to what extent can these tests run on docker? Has anyone tried? It would make them fit to be run in parallel and on any slave with any specific deps.

--
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Email: dcaro@redhat.com
Web: www.redhat.com
RHT Global #: 82-62605

----- Original Message -----
From: "David Caro" <dcaroest@redhat.com>
To: "Dan Kenigsberg" <danken@redhat.com>
Cc: dcaro@redhat.com, "Vered Volansky" <vered@redhat.com>, "infra" <infra@ovirt.org>
Sent: Tuesday, June 17, 2014 11:33:24 AM
Subject: Re: Bad setup code in vdsm_master_storage_functional_tests_localfs_gerrit

> It was fixed by me some time ago (I added the mkdir -p before the touch, just in case)
>
> sudo mkdir -p /var/log/vdsm
> sudo chown vdsm:kvm
> sudo sh -c 'echo "" > /var/log/vdsm/vdsm.log'
> sudo sh -c 'echo "" > /var/log/vdsm/supervdsm.log'

I saw that, but this thread was opened after this change.

> So, In my opinion the issue is that we are not cleaning up properly after the vdsm jobs and leaving the logs behind.

We sure are not. In the past I had vdsm logs when vdsm was not installed, which led to this echo. logrotate was suggested to me, but I saw the job is already configured this way, and I was then told this was actually related to something else and that the above is how it should be done. A different suggestion is welcome.

The test is not supposed to run in parallel with another one; this is how it's configured.

> Also, to which point can these tests run on docker? Anyone have tried?

No. When asked how to set it up, this is what was suggested.

> Because it would make it fit to be run in parallel and on any slave with any specific deps.

On Tue, Jun 17, 2014 at 10:33:24AM +0200, David Caro wrote:
> It was fixed by me some time ago (I added the mkdir -p before the touch, just in case)
>
> sudo mkdir -p /var/log/vdsm
> sudo chown vdsm:kvm
> sudo sh -c 'echo "" > /var/log/vdsm/vdsm.log'
> sudo sh -c 'echo "" > /var/log/vdsm/supervdsm.log'
>
> So, In my opinion the issue is that we are not cleaning up properly after the vdsm jobs and leaving the logs behind.

We can never be sure that the former job running on a particular thread has finished properly. It might have aborted before it cleaned the log. Thus, it's prudent to clear the log before a new job starts.

But my original question remains: what could possibly remove the directory between the mkdir and the echo? Are you aware of anything that removes the directory (removal of vdsm.rpm does not)? Whatever it was, it must not be running while a functional test takes place!
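
A setup step that clears logs before each run, and reports the state of the directory if clearing fails, might look like the sketch below. It is only an illustration: prepare_logs is a hypothetical helper, the demo uses a scratch directory and the current user, and the sudo and vdsm:kvm ownership handling discussed in the thread is left out so that the example runs without root.

```shell
#!/bin/sh
# Sketch: idempotent log setup that can run before every job.
# prepare_logs DIR NAME...: make sure DIR exists, then empty each log in it.
prepare_logs() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    for name in "$@"; do
        if ! sh -c ': > "$1"' sh "$dir/$name"; then
            # Show what the directory looks like at the moment of failure,
            # to help find out what removed or changed it.
            echo "cannot clear $dir/$name" >&2
            ls -ld "$dir" >&2
            return 1
        fi
    done
}

scratch=$(mktemp -d)

# First run: nothing exists yet, so the directory and both logs are created.
prepare_logs "$scratch/vdsm" vdsm.log supervdsm.log

# Simulate a previous job that aborted before cleaning up.
echo 'leftover from an aborted job' > "$scratch/vdsm/vdsm.log"

# Second run: the leftover contents are cleared.
prepare_logs "$scratch/vdsm" vdsm.log supervdsm.log
```

Because the step does not depend on what an earlier job left behind, it gives the same result after a clean run, an aborted run, or on a machine where vdsm was never installed.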
participants (5)
- Dan Kenigsberg
- David Caro
- Eyal Edri
- Nir Soffer
- Vered Volansky