stty: standard tty: inappropriate ioctl for device

I'd like to share this with the list because it's something that I changed for convenience in .bashrc, but it had a not-so-obvious ripple effect on the oVirt self-hosted installer. I can imagine a few others doing this too, and I'd rather save future them hours of Google time.

Every install failed no matter what, and it was always at the SSO step to revoke the token (here: https://github.com/machacekondra/ansible/blob/71547905fab67a876450f95c9ca714...) and then reissue a new token, but the engine log yielded different information. The SSO failure was a red herring.

engine.log:2020-05-04 22:09:17,150-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [61038b68] Host installation failed for host '672be551-9259-4d2d-823d-07f586b4e0f1', 'node1': Unexpected error during execution: stty: standard input: Inappropriate ioctl for device

engine.log:2020-05-04 22:09:17,145-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [61038b68] SSH error running command root@node1:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True': RuntimeException: Unexpected error during execution: stty: standard input: Inappropriate ioctl for device

I was on the hunt for this for the better part of two days or so (because who else has anything to do during quarantine), racking my brain and trying everything to figure out what was going on.

Well, it was my own fault:

# cat .bashrc | grep -i stty
stty erase ^?

With this set in the .bashrc of the node I was running the installer via Cockpit from, the oVirt installer will fail to install. This was set for convenience to have backspace work in vim, since at some point it stopped working for me.

Should I file this as a bug? The message generated is more of a warning than a failure, but I do not know the internals of oVirt that well. Commands still actually execute fine:

ovirt-engine@192.168.222.84) # ssh 10.0.16.221 "uptime"
root@10.0.16.221's password:
stty: standard input: Inappropriate ioctl for device
22:30:01 up 22:45, 2 users, load average: 0.80, 0.62, 0.80

One thing I think should be called out in the docs, and called out very loudly, is that the entire oVirt installer expects a 110% clean machine that has just been installed and given an IP and hostname. It's not that obvious, but it is now.

On Tue, May 5, 2020 at 6:02 AM Charles Kozler <ckozleriii@gmail.com> wrote:
I'd like to share this with the list because it's something that I changed for convenience in .bashrc, but it had a not-so-obvious ripple effect on the oVirt self-hosted installer. I can imagine a few others doing this too, and I'd rather save future them hours of Google time.
Thanks!
Every install failed no matter what and it was always at the SSO step to revoke token (here: https://github.com/machacekondra/ansible/blob/71547905fab67a876450f95c9ca714...) and then reissue new token but the engine log yielded different information. The SSO failure was a red herring
What do you mean by that? Usually (I am not a native English speaker), "red herring" means, for me, "something that made me look at the error not in the place where it actually occurred". In this case:
engine.log:2020-05-04 22:09:17,150-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (EE-ManagedThreadFactory-engine-Thread-1) [61038b68] Host installation failed for host '672be551-9259-4d2d-823d-07f586b4e0f1', 'node1': Unexpected error during execution: stty: standard input: Inappropriate ioctl for device
engine.log:2020-05-04 22:09:17,145-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [61038b68] SSH error running command root@node1:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True': RuntimeException: Unexpected error during execution: stty: standard input: Inappropriate ioctl for device
The error message does mention 'stty', and I guess this is more-or-less the best the engine could have done, other than trying to analyze your .bashrc.
I was on the hunt for this for the better part of two days or so (because who else has anything to do during quarantine), racking my brain and trying everything to figure out what was going on.
Well, it was my own fault:
# cat .bashrc | grep -i stty
stty erase ^?
A common idiom for such cases is to check PS1 - bash should set it only for interactive shells. E.g.:

    if [ "$PS1" ]; then
        stty erase ^?
    fi

But you might be better off fixing your terminal emulator instead.
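For readers who want a slightly stricter guard than the PS1 check, here is a minimal sketch of a ~/.bashrc fragment. The helper name is_interactive_tty is made up for this example. The idea is to call stty only when the shell is interactive and standard input really is a terminal, which was not the case when the engine ran its command over SSH.

```shell
# Sketch of a ~/.bashrc fragment: touch terminal settings only when it is safe.
# is_interactive_tty is a hypothetical helper name, not a bash builtin.
is_interactive_tty() {
    # "$-" contains "i" in interactive shells; [ -t 0 ] is true only when
    # standard input is attached to a terminal.
    case "$-" in
        *i*) [ -t 0 ] ;;
        *)   return 1 ;;
    esac
}

if is_interactive_tty; then
    stty erase '^?'
fi
```

With this in place, a non-interactive command run over ssh skips the stty call entirely and so prints nothing on stderr.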
With this set in the .bashrc of the node I was running the installer via cockpit from, ovirt installer will fail to install
This was set for convenience to have backspace work in vim since at some point it stopped working for me
:-( (Used to be more common 20-30 years ago, when people actually had many more different physical terminals, terminal emulators, and unix-like OSes. Sadly our industry failed to fix this, requiring such workarounds, so far. See also, likely way-out-of-date: http://www.tldp.org/HOWTO/BackspaceDelete/ )
Should I file this as a bug?
You can, but not sure it will be handled, unless you provide very concrete expected-behavior with sound reasoning...
The message generated is more of a warning than a failure, but I do not know the internals of oVirt that well. Commands still actually execute fine:
ovirt-engine@192.168.222.84) # ssh 10.0.16.221 "uptime"
root@10.0.16.221's password:
stty: standard input: Inappropriate ioctl for device
22:30:01 up 22:45, 2 users, load average: 0.80, 0.62, 0.80
Checking the current 4.3 code, I see that we fail if there is anything at all on stderr. I tried to find out from the git log why we do that and failed to find a concrete reason, although, if you ask me, it might make sense. Changing that is easy, but it might then let real errors, which you should actually notice, go unnoticed.
In 4.4 we do not have this code anymore, and use ansible instead for host-deploy. Since 4.4 GA is expected soon (a few weeks?), and 4.3 will be EOLed soon after that, I do not see much point in investing in this anyway.
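To illustrate the "fail if there is anything at all on stderr" policy, here is a rough shell sketch. It is not the engine's actual code (that lives in the Java SSHDialog class); run_strict and noisy_ok are hypothetical names used only to show how a command that exits 0 can still be treated as failed.

```shell
# run_strict is a hypothetical helper, not oVirt code: it runs a command and
# reports failure if the command exits non-zero OR writes anything to stderr.
run_strict() {
    err_file=$(mktemp) || return 1
    status=0
    "$@" 2>"$err_file" || status=$?
    if [ -s "$err_file" ]; then
        # Replay the captured stderr so the caller can still see it.
        cat "$err_file" >&2
        if [ "$status" -eq 0 ]; then
            status=1
        fi
    fi
    rm -f "$err_file"
    return "$status"
}

# Stand-in for a remote command that works but emits a warning, like the
# stty message in this thread.
noisy_ok() {
    echo "stty: standard input: Inappropriate ioctl for device" >&2
    echo "22:30:01 up 22:45"
}

# The command itself exits 0, yet the strict wrapper reports a failure.
if run_strict noisy_ok; then
    echo "deploy step OK"
else
    echo "deploy step FAILED"
fi
```

The trade-off is the one described above: relaxing the check would let this harmless warning through, but it would also hide real errors that only show up on stderr.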
One thing I think should be called out in the docs, and called out very loudly, is that the entire oVirt installer expects a 110% clean machine that has just been installed and given an IP and hostname. It's not that obvious, but it is now.
Well, I wouldn't say "110%", but the fact that we try to do things automatically, on remote machines, using tools that were mainly intended for manual/interactive use, does mean that we are limited. You are still welcome to open a documentation bug if you feel one is needed.
Again, thanks for the report!
Best regards,
--
Didi

What do you mean by that? Usually (I am not a native English speaker), "red herring" means, for me, "something that made me look at the error not in the place where it actually occurred". In this case:
Yep! The installer would die after failing at the SSO step. In the log for this failure it fails with the linked error "You must specify either url or hostname", like you see in that bug report. Nowhere was I able to deduce that it was a failing remote SSH command. I just happened to decide to look at the other log and saw it.
The funny thing is, it was right in front of my face the whole time (stty) but it never dawned on me until I decided to try running a command myself like you see with uptime below
The error message does mention 'stty', and I guess this is more-or-less the best the engine could have done, other than trying to analyze your .bashrc. {...} A common idiom for such cases is to check PS1 - bash should set it only for interactive shells. E.g.:
    if [ "$PS1" ]; then
        stty erase ^?
    fi
But you might better fix your terminal emulator instead. {...} :-(
(Used to be more common 20-30 years ago, when people actually had many more different physical terminals, terminal emulators, and unix-like OSes. Sadly our industry failed to fix this, requiring such workarounds, so far. See also, likely way-out-of-date: http://www.tldp.org/HOWTO/BackspaceDelete/ )
Yup, going forward I will be putting it like that by checking for PS1. This was just such an edge case that it had completely slipped my mind that it would be the problem
I don't know why or when this started occurring (I want to say around RHEL 7.5 or so), but vim reverted to this behavior, which I haven't seen in almost 15 years.
I run Fedora on all my machines, but colleagues on Win10 also started to see it.
The only fix is the stty thing
Checking the current 4.3 code, I see that we fail if there is anything at all on stderr. Trying to check git log finding out why we do that, I fail to find a concrete reason - although, if you ask me, it might make sense. Changing that is easy, but then might make real errors you should actually notice, go unnoticed.
In 4.4 we do not have this code anymore, and use ansible instead for host-deploy. Since 4.4 GA is expected soon (a few weeks?), and 4.3 will be EOLed soon after that, I do not see much point in investing in this anyway.
I agree. I can see the benefit in it being a little more decisive. However, the trade-off is that the developer has to try to account for every case, and this was of course a complete edge case, so it would likely have been missed when parsing individual stderr output.
However, you can run SSH in a way that avoids the error from the .bashrc environment, so I was thinking of something more along these lines:
[root@node01 ~]# ssh root@localhost uptime
root@localhost's password:
stty: standard input: Inappropriate ioctl for device
07:50:39 up 1 day, 8:05, 2 users, load average: 0.08, 0.07, 0.06
[root@node01 ~]# ssh -t root@localhost uptime
root@localhost's password:
07:51:47 up 1 day, 8:07, 3 users, load average: 0.06, 0.07, 0.06
Connection to localhost closed.
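The difference between the two runs is the -t option, which forces ssh to allocate a pseudo-terminal, so a stty call on the remote side has a terminal to work with. The same failure can be reproduced without SSH at all by taking the terminal away from stty. A minimal sketch (the function name stty_without_tty is just for illustration):

```shell
# stty_without_tty is a hypothetical name used only for this demonstration.
stty_without_tty() {
    # Redirect stdin from /dev/null so that stty has no terminal to act on,
    # much like a command run as "ssh host cmd" without -t.
    if stty erase '^?' </dev/null 2>/dev/null; then
        echo "stty succeeded"
    else
        echo "stty failed: standard input is not a terminal"
    fi
}

stty_without_tty
```

Dropping the 2>/dev/null should show the familiar "Inappropriate ioctl for device" message on stderr.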
Well, I wouldn't say "110%", but the fact that we try to do things automatically, on remote machines, using tools that were mainly intended for manual/interactive use, does mean that we are limited.
I guess 110 is a bit of a biased over-exaggeration :-)
The only reason I say that is because I love oVirt a lot, but every experience I have had getting it installed has ended with a different failure, with no obvious reason as to why there was a problem. It deters me from doing installations unless I have to.
The reason I have found, specifically now with such a minor change to .bashrc, is that in an enterprise environment we have a standard base configuration that we work from. This involves changing SSH parameters, .bashrc environment changes, security changes with sysctl, and others.
Every time I have had a problem, it is apparent now that it was because of one of those changes. My suggestion was more or less to have it called out somewhere obvious that the oVirt installer/ansible generally wants a completely fresh install and that making any changes to the system beforehand could have a negative impact. Example here: a simple convenience change to .bashrc that we make all the time led to a two-day time sink.
Going forward, my standard operating procedure is going to be to install immediately after a bare install from the minimal ISO and bypass any bootstrapping changes we make to our systems. Then, once it is up, I will apply our changes, as I have never had issues with oVirt after it's up. It's the install that is always a very rigid, time-consuming problem for me.

On Tue, May 5, 2020 at 3:10 PM Charles Kozler <ckozleriii@gmail.com> wrote:
What do you mean by that? Usually (I am not a native English speaker), "red herring" means, for me, "something that made me look at the error not in the place where it actually occurred". In this case:
Yep! The installer would die out after failing at the SSO step. In the log for this failure it fails with the linked error "You must specify either url or hostname" like you see in that bug report.
Which one? Anyway, IMO this is indeed a bug. I'd definitely expect it to either fail immediately or not at all. If you do open a bug, please attach complete relevant logs. Thanks.
Nowhere was I able to deduce that it was a failing remote SSH command. I just happened to decide to look at the other log and saw it.
The funny thing is, it was right in front of my face the whole time (stty) but it never dawned on me until I decided to try running a command myself like you see with uptime below
The error message does mention 'stty', and I guess this is more-or-less the best the engine could have done, other than trying to analyze your .bashrc. {...} A common idiom for such cases is to check PS1 - bash should set it only for interactive shells. E.g.:
    if [ "$PS1" ]; then
        stty erase ^?
    fi
But you might better fix your terminal emulator instead. {...} :-(
(Used to be more common 20-30 years ago, when people actually had many more different physical terminals, terminal emulators, and unix-like OSes. Sadly our industry failed to fix this, requiring such workarounds, so far. See also, likely way-out-of-date: http://www.tldp.org/HOWTO/BackspaceDelete/ )
Yup, going forward I will be putting it like that by checking for PS1. This was just such an edge case that it had completely slipped my mind that it would be the problem
I don't know why or when this started occurring (I want to say around RHEL 7.5 or so), but vim reverted to this behavior, which I haven't seen in almost 15 years.
Well, I don't think I saw it myself in recent years (including RHEL 7, both before and after 7.5, and 8). I guess that's a bug as well. Does it happen on the console? ssh from somewhere? Where? We did get reports about a different, unrelated but similar issue, when using ssh from a Mac, https://bugzilla.redhat.com/show_bug.cgi?id=1366916 .
I run Fedora on all my machines, but colleagues on Win10 also started to see it.
The only fix is the stty thing
Checking the current 4.3 code, I see that we fail if there is anything at all on stderr. Trying to check git log finding out why we do that, I fail to find a concrete reason - although, if you ask me, it might make sense. Changing that is easy, but then might make real errors you should actually notice, go unnoticed.
In 4.4 we do not have this code anymore, and use ansible instead for host-deploy. Since 4.4 GA is expected soon (a few weeks?), and 4.3 will be EOLed soon after that, I do not see much point in investing in this anyway.
I agree. I can see the benefit in it being a little more decisive. However, the trade-off is that the developer has to try to account for every case, and this was of course a complete edge case, so it would likely have been missed when parsing individual stderr output.
However, you can run SSH in a way that avoids the error from the .bashrc environment, so I was thinking of something more along these lines:
[root@node01 ~]# ssh root@localhost uptime
root@localhost's password:
stty: standard input: Inappropriate ioctl for device
07:50:39 up 1 day, 8:05, 2 users, load average: 0.08, 0.07, 0.06
[root@node01 ~]# ssh -t root@localhost uptime
root@localhost's password:
07:51:47 up 1 day, 8:07, 3 users, load average: 0.06, 0.07, 0.06
Connection to localhost closed.
That's an option too (assuming apache-sshd has something similar to '-t'), but as I said, we soon move to ansible, so would not invest in this currently.
Well, I wouldn't say "110%", but the fact that we try to do things automatically, on remote machines, using tools that were mainly intended for manual/interactive use, does mean that we are limited.
I guess 110 is a bit of a biased over-exaggeration :-)
The only reason I say that is because I love oVirt a lot, but every experience I have had getting it installed has ended with a different failure, with no obvious reason as to why there was a problem. It deters me from doing installations unless I have to.
The reason I have found, specifically now with such a minor change to .bashrc, is that in an enterprise environment we have a standard base configuration that we work from. This involves changing SSH parameters, .bashrc environment changes, security changes with sysctl, and others.
Every time I have had a problem, it is apparent now that it was because of one of those changes. My suggestion was more or less to have it called out somewhere obvious that the oVirt installer/ansible generally wants a completely fresh install and that making any changes to the system beforehand could have a negative impact. Example here: a simple convenience change to .bashrc that we make all the time led to a two-day time sink.
Going forward, my standard operating procedure is going to be to install immediately after a bare install from the minimal ISO and bypass any bootstrapping changes we make to our systems. Then, once it is up, I will apply our changes, as I have never had issues with oVirt after it's up. It's the install that is always a very rigid, time-consuming problem for me.
I see your point. Generally speaking, we'd love to have things work well either way. It's not always easy, nor is it clear how to do this beforehand. If you have concrete ideas/opinions, by all means, please open bugs. Thanks!
Best regards,
--
Didi
participants (2): Charles Kozler, Yedidyah Bar David