
Test torture_proxycommand fails on ubuntu 18.04
Closed, Resolved, Public

Description

The two tests "torture_options_set_proxycommand_ssh" and "torture_options_set_proxycommand_ssh_stderr" fail on Ubuntu. I traced this to the ssh client that is used as the proxycommand: it believes it is running as root and does not find the appropriate files.
I wrote a workaround for running the other test cases on my system in https://gitlab.com/arisada/libssh-mirror/commit/d4428a1d9f351987406988a356a5d47317d13ae3
Basically it creates an environment for the root user.

I created another test case to check if we're running as user bob. To my surprise, most of the environments in the test infrastructure also report getuid() = 0.
https://gitlab.com/arisada/libssh-mirror/commit/f8d1662f17ca9693e43cae83f20e1569f27db7cf

I don't understand why it works there. I moved these two patches to a separate branch because I don't want to merge them into master until we've understood the root cause.

Event Timeline

aris created this task. Nov 21 2019, 4:14 PM
Jakuje added a subscriber: Jakuje. Nov 21 2019, 6:29 PM

Congratulations on issue #200 :)

Your test is interesting. It looks like it also fails in the Fedora builds, even though the other tests pass correctly there, while it works on SUSE. I wonder whether this is actually an issue with the UID wrapper itself, which could fail to intercept some of the syscalls, or whether the context gets lost somewhere with the ctest threads or whatever ...

aris added a subscriber: asn. Nov 22 2019, 9:46 AM

@asn mentioned on IRC a bug in libuidwrapper that was caused by a particular/weird implementation of libpam. I have no clue how to debug this problem.

@aris Your test does not call session_setup(), which is why this particular case fails for you. I think the following will make it work for you:

diff --git a/tests/client/torture_proxycommand.c b/tests/client/torture_proxycommand.c
index 64385472..9b0a9dcf 100644
--- a/tests/client/torture_proxycommand.c
+++ b/tests/client/torture_proxycommand.c
@@ -165,7 +165,9 @@ static void torture_options_set_proxycommand_ssh_stderr(void **state)
 int torture_run_tests(void) {
     int rc;
     struct CMUnitTest tests[] = {
-        cmocka_unit_test(torture_check_uid),
+        cmocka_unit_test_setup_teardown(torture_check_uid,
+                                        session_setup,
+                                        session_teardown),
         cmocka_unit_test_setup_teardown(torture_options_set_proxycommand,
                                         session_setup,
                                         session_teardown),

So far I have tested this only on my Fedora machine, but I am working on adding Ubuntu to the CI too.

Back to the original issue. I just did a clean build and I also see some issues with the proxy command: ssh is prompting for host key verification. I think this is just because I did not run the rest of the tests (which incidentally create the known_hosts files in the users' directories). So using -o StrictHostKeyChecking=no on the ssh command line, as you propose, might be the correct solution for this issue. I will check whether more issues show up afterward.
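For illustration, a minimal sketch of how a test could assemble such a proxy command with the host key prompt disabled. The helper name format_proxy_command and the macro NO_HOSTKEY_PROMPT are made up for this example and are not part of libssh or its test suite. The %h:%p placeholders are left in the string on the assumption that the library expands them later.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* OpenSSH client option that disables the interactive host key prompt. */
#define NO_HOSTKEY_PROMPT "-o StrictHostKeyChecking=no"

/* Hypothetical helper (not libssh API): write an ssh-based proxy command
 * for `jump_target` (for example "user@host") into `buf`.
 * Returns 0 on success, -1 if the target is empty or `buf` is too small. */
int format_proxy_command(char *buf, size_t len, const char *jump_target)
{
    int n;

    assert(buf != NULL && jump_target != NULL);

    if (strlen(jump_target) == 0) {
        return -1;
    }

    /* "%%" emits a literal '%', so "%h:%p" survives for later expansion. */
    n = snprintf(buf, len, "ssh " NO_HOSTKEY_PROMPT " -W %%h:%%p %s",
                 jump_target);
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }

    return 0;
}
```

Called with a sufficiently large buffer and a made-up target such as "alice@127.0.0.10", this yields "ssh -o StrictHostKeyChecking=no -W %h:%p alice@127.0.0.10". Keeping the option in one macro makes it easy to apply to every ssh-based proxy command in the tests.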

The current changeset passes also on Ubuntu:

https://gitlab.com/jjelen/libssh-mirror/commits/master

I would be glad for a review of the remaining patches (the ones that are not yours :))

Reverting the previous comment -- it still fails on Ubuntu. But at least some of the bugs are fixed.

I assume that some context is lost in the way the ssh client is started (through bash), causing some environment variables to get lost or file descriptors to be closed, so that getuid no longer returns the correct user information in the forked child. Sigh. The short-term solution is to ignore the Ubuntu failures but still run the tests there, as in my current branch.

The whole proxy_command test is a mess ... in a clean image it hangs for me on Fedora too.

ugh ... so I finally got down to the root cause of this issue. The proxycommand is executed in /bin/sh as the current user. On Fedora this is a symlink to /bin/bash, while on Ubuntu it is /bin/dash (sic ...). The two differ in how they handle environment variables; it looks like dash just ignores them. So the poor man's fix is the following in src/socket.c:

-    const char *args[] = {"/bin/sh", "-c", command, NULL};
+    const char *args[] = {"/bin/bash", "-c", command, NULL};

It would be better, though, to do one of the following:

  • run the proxy command in bash too (very friendly)
  • run the proxy command in the user's shell from passwd and set bash in the passwd of the tests (I think OpenSSH does that in cases where commands are executed) -- I like this one most
  • make sure the environment variables are passed through the shells in some more compatible way

OK, OpenSSH is using the $SHELL environment variable, and since the CI runs in bash, this should be a simple fix.
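A rough sketch of what the $SHELL approach could look like, replacing the hard-coded "/bin/sh" in the args array. This is not the actual libssh change: the helper names pick_shell and build_proxy_args are hypothetical, and the fallback to /bin/sh when $SHELL is unset or empty is an assumption made here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libssh API): choose the shell for a given
 * value of $SHELL, falling back to /bin/sh when it is unset or empty. */
const char *pick_shell(const char *shell_env)
{
    if (shell_env == NULL || strlen(shell_env) == 0) {
        return "/bin/sh";
    }
    return shell_env;
}

/* Hypothetical helper: fill the argument vector for `<shell> -c <command>`.
 * The caller would then pass it to execv(args[0], (char * const *)args). */
void build_proxy_args(const char *command, const char *args[4])
{
    assert(command != NULL);

    args[0] = pick_shell(getenv("SHELL"));
    args[1] = "-c";
    args[2] = command;
    args[3] = NULL;
}
```

With SHELL=/bin/bash in the CI environment this would run the proxy command through bash on both Fedora and Ubuntu, while environments without $SHELL would keep the current /bin/sh behavior.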