
Hi,

What do you mean by "the email to infra-support didn't work"? Was the email rejected? Was the issue not created? Something else?

Thanks.

On Wed, Feb 19, 2020 at 11:21 AM Yedidyah Bar David <didi@redhat.com> wrote:
Forwarding to infra, as the email to infra-support didn't work.
---------- Forwarded message ---------
From: Yedidyah Bar David <didi@redhat.com>
Date: Wed, Feb 19, 2020 at 10:37 AM
Subject: ssh connection fails after restarting engine VM
To: <infra-support@ovirt.org>
Hi all,
Apparently my Jira account was stuck in some state, and all the emails I sent to infra-support over the last years were dropped to /dev/null. I have now seen another case of an issue I tried to open a ticket for a month ago, and considered updating that ticket, so I took the time to actually log in, and found that I can't find it anywhere. So I am copy/pasting my original email below.
The new case is:
https://jenkins.ovirt.org/job/ovirt-system-tests_he-node-ng-suite-4.3/341/
lago.log has:
2020-02-19 05:10:50,890::log_utils.py::__exit__::611::lago.prefix::INFO:: # [Thread-2] lago-he-node-ng-suite-4-3-host-1: Success (in 0:00:08)
2020-02-19 05:10:52,116::ssh.py::get_ssh_client::373::lago.ssh::DEBUG::SSH error connecting to lago-he-node-ng-suite-4-3-engine: No existing session
2020-02-19 05:10:52,116::ssh.py::get_ssh_client::381::lago.ssh::DEBUG::Still got 0 tries for lago-he-node-ng-suite-4-3-engine
2020-02-19 05:10:53,117::log_utils.py::__exit__::611::lago.ssh::DEBUG::end task:e0b52607-6e65-4583-bf43-b615aa901cc7:Get ssh client for lago-he-node-ng-suite-4-3-engine:
2020-02-19 05:10:53,232::log_utils.py::end_log_task::670::root::ERROR:: # [Thread-3] lago-he-node-ng-suite-4-3-engine: ERROR (in 0:00:11)
2020-02-19 05:10:53,245::log_utils.py::__exit__::607::lago.prefix::DEBUG:: File "/usr/lib/python2.7/site-packages/lago/prefix.py", line 1526, in _collect_artifacts
    vm.collect_artifacts(path, ignore_nopath)
  File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 748, in collect_artifacts
    ignore_nopath=ignore_nopath
  File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 468, in extract_paths
    return self.provider.extract_paths(paths, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 259, in extract_paths
    format(self.vm.name())
2020-02-19 05:10:53,245::utils.py::_ret_via_queue::63::lago.utils::DEBUG::Error while running thread Thread-3
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/lago/utils.py", line 58, in _ret_via_queue
    queue.put({'return': func()})
  File "/usr/lib/python2.7/site-packages/lago/prefix.py", line 1526, in _collect_artifacts
    vm.collect_artifacts(path, ignore_nopath)
  File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 748, in collect_artifacts
    ignore_nopath=ignore_nopath
  File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 468, in extract_paths
    return self.provider.extract_paths(paths, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/lago/plugins/vm.py", line 259, in extract_paths
    format(self.vm.name())
ExtractPathError: Unable to extract paths from lago-he-node-ng-suite-4-3-engine: unreachable with SSH
Original email I wrote (and was never received) follows:
Hi all,
See e.g. [1]. In lago.log [2]:
2020-01-19 05:38:06,052::log_utils.py::__exit__::611::ovirtlago.prefix::INFO::@ Run test: 008_restart_he_vm.py: Success (in 0:23:27)
...
2020-01-19 05:38:07,680::log_utils.py::__enter__::600::lago.prefix::INFO:: # [Thread-3] lago-he-basic-suite-4-3-engine:
...
2020-01-19 05:38:07,686::log_utils.py::__enter__::600::lago.ssh::DEBUG::start task:170b4eaa-fbf5-48ca-b81a-4ddf0c9a3bd5:Get ssh client for lago-he-basic-suite-4-3-engine:
...
2020-01-19 05:38:12,415::log_utils.py::__exit__::611::lago.prefix::INFO:: # [Thread-1] lago-he-basic-suite-4-3-host-0: Success (in 0:00:04)
2020-01-19 05:38:17,729::ssh.py::get_ssh_client::373::lago.ssh::DEBUG::SSH error connecting to lago-he-basic-suite-4-3-engine: No existing session
2020-01-19 05:38:17,730::ssh.py::get_ssh_client::381::lago.ssh::DEBUG::Still got 0 tries for lago-he-basic-suite-4-3-engine
2020-01-19 05:38:18,731::log_utils.py::__exit__::611::lago.ssh::DEBUG::end task:170b4eaa-fbf5-48ca-b81a-4ddf0c9a3bd5:Get ssh client for lago-he-basic-suite-4-3-engine:
2020-01-19 05:38:18,739::log_utils.py::end_log_task::670::root::ERROR:: # [Thread-3] lago-he-basic-suite-4-3-engine: ERROR (in 0:00:11)
...
ExtractPathError: Unable to extract paths from lago-he-basic-suite-4-3-engine: unreachable with SSH
I wonder what "No existing session" might mean. Perhaps paramiko caches the connection, and after the engine VM is restarted it does not try to connect again? Or something similar?
Searching the net, [3] looks similar, although I am not at all sure that we don't already do this (or something similar) in lago.
Anyway, instead of (perhaps) spending time on this, it might be better/possible to explicitly expose some "closeconnection" function in lago, to be used by OST after such an engine VM restart (or in similar cases).
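To make the "closeconnection" idea concrete, here is a minimal sketch, not lago's actual code. It assumes the connection is cached per host (which is only the hypothesis raised above) and shows how an explicit close, plus a liveness check before reuse, would force a fresh connection after a restart. All names (ConnectionCache, close_connection, FakeConnection, is_alive) are hypothetical; with paramiko, the liveness check could perhaps be based on the transport's is_active(), but that is an assumption about how it would be wired up.

```python
# Hypothetical sketch: none of these names come from lago or paramiko.


class ConnectionCache(object):
    """Cache one connection per host and let callers drop it explicitly."""

    def __init__(self, connect):
        # `connect` is any callable that takes a host name and returns an
        # object with is_alive() and close() methods (an assumption made
        # for this sketch).
        self._connect = connect
        self._cache = {}

    def get(self, host):
        conn = self._cache.get(host)
        if conn is None or not conn.is_alive():
            # Never connected, or the cached connection went stale
            # (e.g. the VM was restarted): drop it and reconnect.
            self.close_connection(host)
            conn = self._connect(host)
            self._cache[host] = conn
        return conn

    def close_connection(self, host):
        # Forget the cached connection, closing it if there was one.
        conn = self._cache.pop(host, None)
        if conn is not None:
            conn.close()


class FakeConnection(object):
    """Stand-in for a real SSH client, used only to exercise the cache."""

    def __init__(self, host):
        self.host = host
        self.alive = True

    def is_alive(self):
        return self.alive

    def close(self):
        self.alive = False


cache = ConnectionCache(FakeConnection)
first = cache.get("engine")
assert cache.get("engine") is first   # reused while still alive
cache.close_connection("engine")      # e.g. right after restarting the VM
second = cache.get("engine")
assert second is not first            # a fresh connection was made
```

A test that restarts the engine VM would call close_connection() for that host right after the restart, so later steps (such as artifact collection) open a new session instead of reusing the old one.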
[1] https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.3/324/
[2] https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.3/324/arti...
[3] https://stackoverflow.com/questions/57508919/paramiko-ssh-exception-sshexcep...
Thanks,
--
Didi
--
Didi
--
Emil Natan
RHV/CNV DevOps