Nir,

It looks like it's crashing on the dmidecode system call.

I've attached the output from gdb as well as a dmidecode text dump, a dmidecode binary dump, and the output of each keyword run individually.

From the keywords it looks like my DMI info is corrupted.  I have downloaded an AMI DMI editor, but it only allows access to a limited set of fields.  Do you know of any other tools to rewrite the DMI info?
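
The per-keyword runs can be repeated with something like the sketch below. It is an illustration only, not the exact commands behind the attachments: the helper name read_keywords is made up, the keyword list is a partial selection of what `dmidecode -s` accepts, and it has to run as root on the affected host.

```python
import subprocess

# A partial selection of the keywords accepted by `dmidecode -s`.
KEYWORDS = [
    "bios-vendor",
    "bios-version",
    "system-manufacturer",
    "system-product-name",
    "system-serial-number",
    "system-uuid",
]


def read_keywords(keywords, command=("dmidecode", "-s")):
    """Run the command once per keyword (hypothetical helper).

    Returns a dict mapping each keyword to (exit status, stripped stdout),
    so a single bad field does not hide the others.
    """
    results = {}
    for keyword in keywords:
        proc = subprocess.Popen(
            list(command) + [keyword],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        out, _ = proc.communicate()
        results[keyword] = (proc.returncode, out.decode("utf-8", "replace").strip())
    return results
```

Calling read_keywords(KEYWORDS) and printing the result gives one line per field, which makes empty or garbage values easier to spot than in the full dump.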


Thanks so much for your help.

Cheers,


On Thu, Oct 13, 2016 at 5:34 AM, Nir Soffer <nsoffer@redhat.com> wrote:
On Tue, Oct 11, 2016 at 11:59 PM, David Pinkerton <dpinkert@redhat.com> wrote:
> Logs attached

According to vdsm.log and supervdsm.log, each time vdsm tries to call
getHardwareInfo,
supervdsm shows the start of the call, then shows no logs for 10 seconds,
and then we see the startup message.

So it seems that supervdsm is crashing each time it tries to invoke the
dmidecode code.
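
The EOFError in the vdsm.log traceback is consistent with this: a
multiprocessing connection raises EOFError from recv() when the other end
goes away without sending a reply. Here is a minimal sketch of that behavior.
It is not vdsm code; the function name is made up, and it assumes a POSIX
system because it uses os.fork().

```python
import multiprocessing
import os


def call_dying_server():
    """Wait for a reply from a child process that dies without answering."""
    parent_conn, child_conn = multiprocessing.Pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stand-in for a server that crashes before it can reply.
        os._exit(1)
    # Parent: close our copy of the child's end so recv() can see end of file.
    child_conn.close()
    try:
        parent_conn.recv()
    except EOFError:
        result = "EOFError"
    else:
        result = "got a reply"
    # Reap the child so it does not linger as a zombie.
    os.waitpid(pid, 0)
    return result
```

The caller only learns that the connection was closed, not why, which is why
the details have to come from the crashing process itself.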

To dig deeper, I suggest you try to run the relevant code from the
shell. If this
code crashes, we will see the details in the shell, and we can also run the python
shell in gdb to debug this.

Try this:

1. Open a python shell as root

    $ sudo python

2. In the shell, type this

    >>> from vdsm import dmidecodeUtil
    >>> dmidecodeUtil.getHardwareInfoStructure()

If at this point the python shell crashes, please try:

1. Install python debug-info packages:

    $ sudo debuginfo-install -y python

2. Start python in gdb

    $ sudo gdb python

3. In the gdb shell, run python

    (gdb) run

The Python shell will show; type the code above again.

If this crashes in gdb, please type this in the gdb shell:

    (gdb) thread apply all bt full
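
If it is more convenient than typing into the shell, the same call can be run
in a child interpreter so that a crash does not take down the shell you are
working in. This is only a sketch: run_isolated and describe are names made
up for illustration, and it relies on the POSIX behavior of subprocess, where
a negative exit status means the child was killed by a signal.

```python
import signal
import subprocess
import sys

# The call suggested above, as a one-line program for a child interpreter.
SNIPPET = (
    "from vdsm import dmidecodeUtil; "
    "print(dmidecodeUtil.getHardwareInfoStructure())"
)


def run_isolated(code):
    """Run code in a separate python process and return its exit status."""
    return subprocess.call([sys.executable, "-c", code])


def describe(status):
    """Explain an exit status as returned by subprocess.call() on POSIX."""
    if status == 0:
        return "finished normally"
    if status < 0:
        # A negative status means the child was killed by signal -status.
        return "killed by signal %d" % -status
    return "exited with error status %d" % status


if __name__ == "__main__":
    print(describe(run_isolated(SNIPPET)))
```

A result such as "killed by signal 11" (SIGSEGV) would point to a crash in
native code, and the gdb backtrace above is then the way to see where.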


Nir

>
> On Mon, Oct 10, 2016 at 4:59 PM, Nir Soffer <nsoffer@redhat.com> wrote:
>>
>> On Mon, Oct 10, 2016 at 5:05 AM, Charles Kozler <ckozleriii@gmail.com>
>> wrote:
>>>
>>> Possibly stupid question but are you doing this on a base empty
>>> centos/rhel 7?
>>>
>>>
>>> On Oct 9, 2016 9:48 PM, "David Pinkerton" <dpinkert@redhat.com> wrote:
>>>>
>>>>
>>>> I've spent the weekend trying to get to the bottom of this issue.
>>>>
>>>> Adding a Host fails:
>>>>
>>>> From RHVM
>>>>
>>>>
>>>> VDSM rhv1 command failed: Connection reset by peer
>>>> Could not get hardware information for host rhv1
>>>> VDSM rhv1 command failed: Failed to read hardware information
>>>> Host rhv1 installed
>>>> Network changes were saved on host rhv1
>>>> Installing Host rhv1. Stage: Termination.
>>>> Installing Host rhv1. Retrieving installation logs to:
>>>> '/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20161010115606-192.168.21.71-24d39274.log'.
>>>> Installing Host rhv1. Stage: Pre-termination.
>>>> Installing Host rhv1. Starting ovirt-vmconsole-host-sshd.
>>>> Installing Host rhv1. Starting vdsm.
>>>> Installing Host rhv1. Stopping libvirtd.
>>>> Installing Host rhv1. Stage: Closing up.
>>>> Installing Host rhv1. Setting kernel arguments.
>>>> Installing Host rhv1. Stage: Transaction commit.
>>>> Installing Host rhv1. Enrolling serial console certificate.
>>>> Installing Host rhv1. Enrolling certificate.
>>>> Installing Host rhv1. Stage: Misc configuration.
>>>>
>>>>
>>>>
>>>> This was in /var/log/vdsm/vdsm.log on the host being added:
>>>>
>>>> jsonrpc.Executor/2::ERROR::2016-10-10
>>>> 11:57:10,276::API::1340::vds::(getHardwareInfo) failed to retrieve hardware
>>>> info
>>>> Traceback (most recent call last):
>>>>   File "/usr/share/vdsm/API.py", line 1337, in getHardwareInfo
>>>>     hw = supervdsm.getProxy().getHardwareInfo()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in
>>>> __call__
>>>>     return callMethod()
>>>>   File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in
>>>> <lambda>
>>>>     **kwargs)
>>>>   File "<string>", line 2, in getHardwareInfo
>>>>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
>>>> _callmethod
>>>>     kind, result = conn.recv()
>>>> EOFError
>>
>>
>> If a request to supervdsm fails with EOFError, something bad happened
>> in supervdsm and we would see the exception in the supervdsm log.
>>
>> Can you share supervdsm.log?
>>
>> Nir



--
David Pinkerton
Consultant
Red Hat Asia Pacific Pty. Ltd.
Level 11, Canberra House
40 Marcus Clarke Street
Canberra 2600 ACT

Mobile: +61-488-904-232
Email: david.pinkerton@redhat.com
Web: http://apac.redhat.com/