On 11/10/2015 03:45 AM, Alon Bar-Lev wrote:
----- Original Message -----
> From: "Yedidyah Bar David" <didi(a)redhat.com>
> To: "John Florian" <jflorian(a)doubledog.org>, "Alon Bar-Lev"
<alonbl(a)redhat.com>, "Roy Golan" <rgolan(a)redhat.com>,
> "Eli Mesika" <emesika(a)redhat.com>
> Cc: "users" <users(a)ovirt.org>
> Sent: Tuesday, November 10, 2015 10:39:27 AM
> Subject: Re: [ovirt-users] Moving a Hosted Engine from Fedora 20 to CentOS 7
>
> On Tue, Nov 10, 2015 at 2:16 AM, John Florian <jflorian(a)doubledog.org> wrote:
>> On 11/09/2015 06:25 PM, John Florian wrote:
>>> I don't think it has anything to do with name resolution either. I
>>> believe the telltale clue is this bit... 2015-11-09 18:22:31,738 WARN
>>> [org.apache.sshd.client.session.ClientSessionImpl] (pool-20-thread-3)
>>> Exception caught: java.lang.IllegalStateException: Unable to negotiate
>>> key exchange for kex algorithms (client: diffie-hellman-group1-sha1 /
>>> server:
>>>
curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1)
>>> As mentioned, I can ssh from my engine to the host just fine. It
>>> appears that the Java-based ssh client however cannot.
>> I got past the above problem by adding the following line to the
>> /etc/ssh/sshd_config of the new F22 host:
>>
>> KexAlgorithms
>>
curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
>>
>> This represents the defaults for F22 -- at least according to
>> sshd_config(5) -- but with the addition of diffie-hellman-group1-sha1
>> that the Java-based ssh client seems insistent on using.
>>
> Adding Alon for this. Not sure if we can configure the java ssh client
> and how.
>
I think that newer than apache-sshd-0.14 altered its behavior, can you please try to
downgrade to apache-sshd-0.13 and see if it helps, if it does we will enforce it. Only in
apache-sshd-1.1.0 (unreleased) we will be able to migrate properly (I hope).
This is my fault. I see now that this was mentioned in the 3.6.0
release notes as a Fedora 22 specific issue (see also
https://bugzilla.redhat.com/show_bug.cgi?id=1225531). The work around
is just exactly as I mentioned above, though I figured it out, then read
the release notes; doh!
I don't even see an apache-sshd package installed on my engine (or hosts).
>> However, all is not rosy. The deploy script ground to a halt
with:
>> [ INFO ] Waiting for the host to become operational in the engine. This
>> may take several minutes...
>> The host hosted_engine_2 is in non-operational state.
>> Please try to activate it via the engine webadmin UI.
>> Retry checking host status or ignore this and continue (Retry,
>> Ignore)[Retry]?
>>
>> So I did as suggested and tried to activate the host from the webadmin
>> UI. That didn't work either. The status message at the bottom of the
>> browser page shows:
>>
>> Host hosted_engine_2 is installed with VDSM version (<UNKNOWN>) and
>> cannot join cluster Default which is compatible with VDSM versions
>> [4.13, 4.14, 4.9, 4.16, 4.11, 4.15, 4.12, 4.10].
>>
>> The attempt to activate the host via the web UI also caused the
>> following to be logged on the engine:
>>
>> 2015-11-09 19:12:39,828 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand] (ajp--127.0.0.1-8702-7)
>> [4bf460e8] Lock Acquired to object EngineLock [exclusiveLocks= key:
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df value: VDS
>> , sharedLocks= ]
>> 2015-11-09 19:12:39,838 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (org.ovirt.thread.pool-8-thread-49) [4bf460e8] Running command:
>> ActivateVdsCommand internal: false. Entities affected : ID:
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df Type: VDSAction group
>> MANIPULATE_HOST with role type ADMIN
>> 2015-11-09 19:12:39,851 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (org.ovirt.thread.pool-8-thread-49) [4bf460e8] Before acquiring lock in
>> order to prevent monitoring for host hosted_engine_2 from data-center
>> Default
>> 2015-11-09 19:12:39,856 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (org.ovirt.thread.pool-8-thread-49) [4bf460e8] Lock acquired, from now a
>> monitoring of host will be skipped for host hosted_engine_2 from
>> data-center Default
>> 2015-11-09 19:12:39,861 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (org.ovirt.thread.pool-8-thread-49) [4bf460e8] START,
>> SetVdsStatusVDSCommand(HostName = hosted_engine_2, HostId =
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df, status=Unassigned,
>> nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 1d206899
>> 2015-11-09 19:12:39,870 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (org.ovirt.thread.pool-8-thread-49) [4bf460e8] FINISH,
>> SetVdsStatusVDSCommand, log id: 1d206899
>> 2015-11-09 19:12:39,888 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (org.ovirt.thread.pool-8-thread-49) Activate finished. Lock released.
>> Monitoring can run now for host hosted_engine_2 from data-center Default
>> 2015-11-09 19:12:39,892 INFO
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (org.ovirt.thread.pool-8-thread-49) Correlation ID: 4bf460e8, Job ID:
>> 08a2b1ad-1c1c-425c-b657-7739df72b764, Call Stack: null, Custom Event ID:
>> -1, Message: Host hosted_engine_2 was activated by admin@internal.
>> 2015-11-09 19:12:39,895 INFO
>> [org.ovirt.engine.core.bll.ActivateVdsCommand]
>> (org.ovirt.thread.pool-8-thread-49) Lock freed to object EngineLock
>> [exclusiveLocks= key: fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df value: VDS
>> , sharedLocks= ]
>> 2015-11-09 19:12:40,263 INFO
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [79b24fed] START,
>> GetHardwareInfoVDSCommand(HostName = hosted_engine_2, HostId =
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df,
>> vds=Host[hosted_engine_2,fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df]), log id:
>> 7be846bb
>> 2015-11-09 19:12:40,298 INFO
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [79b24fed] FINISH,
>> GetHardwareInfoVDSCommand, log id: 7be846bb
>> 2015-11-09 19:12:40,326 INFO
>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand]
>> (DefaultQuartzScheduler_Worker-12) [5569d8a6] Running command:
>> SetNonOperationalVdsCommand internal: true. Entities affected : ID:
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df Type: VDS
>> 2015-11-09 19:12:40,328 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [5569d8a6] START,
>> SetVdsStatusVDSCommand(HostName = hosted_engine_2, HostId =
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df, status=NonOperational,
>> nonOperationalReason=NETWORK_UNREACHABLE, stopSpmFailureLogged=false),
>> log id: 56456697
>> 2015-11-09 19:12:40,330 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [5569d8a6] FINISH,
>> SetVdsStatusVDSCommand, log id: 56456697
>> 2015-11-09 19:12:40,332 ERROR
>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand]
>> (DefaultQuartzScheduler_Worker-12) [5569d8a6] Host hosted_engine_2 is
>> set to Non-Operational, it is missing the following networks: ovirtmgmt
>> 2015-11-09 19:12:40,335 WARN
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (DefaultQuartzScheduler_Worker-12) [5569d8a6] Correlation ID: 5569d8a6,
>> Job ID: 98718a7b-4f64-4a3d-bb72-e6d46100edc5, Call Stack: null, Custom
>> Event ID: -1, Message: Host hosted_engine_2 does not comply with the
>> cluster Default networks, the following networks are missing on host:
>> 'ovirtmgmt'
> Didn't you see this one anywhere? There was some problem creating the
> bridge, or something else making the engine think so.
>
> If it seems ok to you, perhaps check also vdsm logs.
>
>> 2015-11-09 19:12:40,341 INFO
>> [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand]
>> (DefaultQuartzScheduler_Worker-12) [a4459ff] Running command:
>> HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities
>> affected : ID: fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df Type: VDS
>> 2015-11-09 19:12:40,383 INFO
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [a4459ff] START,
>> GetHardwareInfoVDSCommand(HostName = hosted_engine_2, HostId =
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df,
>> vds=Host[hosted_engine_2,fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df]), log id:
>> 3cf70a78
>> 2015-11-09 19:12:40,387 INFO
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [a4459ff] FINISH,
>> GetHardwareInfoVDSCommand, log id: 3cf70a78
>> 2015-11-09 19:12:40,407 INFO
>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand]
>> (DefaultQuartzScheduler_Worker-12) [3031cc4e] Running command:
>> SetNonOperationalVdsCommand internal: true. Entities affected : ID:
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df Type: VDS
>> 2015-11-09 19:12:40,408 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [3031cc4e] START,
>> SetVdsStatusVDSCommand(HostName = hosted_engine_2, HostId =
>> fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df, status=NonOperational,
>> nonOperationalReason=NETWORK_UNREACHABLE, stopSpmFailureLogged=false),
>> log id: 17fa8ac0
>> 2015-11-09 19:12:40,411 INFO
>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
>> (DefaultQuartzScheduler_Worker-12) [3031cc4e] FINISH,
>> SetVdsStatusVDSCommand, log id: 17fa8ac0
>> 2015-11-09 19:12:40,413 ERROR
>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand]
>> (DefaultQuartzScheduler_Worker-12) [3031cc4e] Host hosted_engine_2 is
>> set to Non-Operational, it is missing the following networks: ovirtmgmt
>> 2015-11-09 19:12:40,418 INFO
>> [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand]
>> (DefaultQuartzScheduler_Worker-12) [5e83f7aa] Running command:
>> HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities
>> affected : ID: fab55ebe-cc0f-4f95-87aa-fc3a5e08a5df Type: VDS
>> 2015-11-09 19:12:40,439 INFO
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (DefaultQuartzScheduler_Worker-12) [5e83f7aa] Correlation ID: null, Call
>> Stack: null, Custom Event ID: -1, Message: Status of host
>> hosted_engine_2 was set to NonOperational.
>> 2015-11-09 19:12:40,443 ERROR
>> [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
>> (DefaultQuartzScheduler_Worker-12) [5e83f7aa]
>> ResourceManager::refreshVdsRunTimeInfo: Error:
>> DataIntegrityViolationException: CallableStatementCallback; SQL [{call
>> updatevdsdynamic(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
>> ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
>> ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: value too long for type
>> character varying(255)
> That's ugly too, not sure why it happened and how reproducible, but
> I think you should first take care of the ovirtmgmt network.
>
> Adding Eli anyway.
>
>> Where: SQL statement "UPDATE vds_dynamic
>> SET cpu_cores = v_cpu_cores,cpu_threads = v_cpu_threads,
>> cpu_model = v_cpu_model,cpu_speed_mh = v_cpu_speed_mh,
>> if_total_speed = v_if_total_speed,kvm_enabled = v_kvm_enabled,
>> mem_commited = v_mem_commited,physical_mem_mb = v_physical_mem_mb,
>> status = v_status,vm_active = v_vm_active,vm_count = v_vm_count,
>> vm_migrating = v_vm_migrating,reserved_mem = v_reserved_mem,
>> guest_overhead = v_guest_overhead,rpm_version = v_rpm_version,
>> software_version = v_software_version,
>> version_name = v_version_name,build_name =
>> v_build_name,previous_status = v_previous_status,
>> cpu_flags = v_cpu_flags,
>> vms_cores_count = v_vms_cores_count,pending_vcpus_count =
>> v_pending_vcpus_count,
>> pending_vmem_size = v_pending_vmem_size,
>> cpu_sockets = v_cpu_sockets,net_config_dirty = v_net_config_dirty,
>> supported_cluster_levels = v_supported_cluster_levels,
>> supported_engines = v_supported_engines,host_os = v_host_os,
>> kvm_version = v_kvm_version,libvirt_version =
>> v_libvirt_version,spice_version = v_spice_version,
>> gluster_version = v_gluster_version,
>> kernel_version = v_kernel_version,iscsi_initiator_name =
>> v_iscsi_initiator_name,
>> transparent_hugepages_state = v_transparent_hugepages_state,
>> hooks = v_hooks,
>> _update_date = LOCALTIMESTAMP,non_operational_reason =
>> v_non_operational_reason,
>> hw_manufacturer = v_hw_manufacturer, hw_product_name =
>> v_hw_product_name,
>> hw_version = v_hw_version, hw_serial_number = v_hw_serial_number,
>> hw_uuid = v_hw_uuid, hw_family = v_hw_family, hbas = v_hbas,
>> supported_emulated_machines = v_supported_emulated_machines,
>> kdump_status = v_kdump_status, selinux_enforce_mode =
>> v_selinux_enforce_mode,
>> auto_numa_balancing = v_auto_numa_balancing,
>> is_numa_supported = v_is_numa_supported,
>> supported_rng_sources = v_supported_rng_sources,
>> is_live_snapshot_supported = v_is_live_snapshot_supported,
>> is_live_merge_supported = v_is_live_merge_supported,
>> online_cpus = v_online_cpus
>> WHERE vds_id = v_vds_id"
>>
>>
>> The SQL error then seems to repeat several more times.
>>
>> -- John Florian
> Thanks for the report.
>
> Best,
> --
> Didi
>
--
John Florian