Page MenuHomePhabricator

ssh_channel_read() incorrectly returns -1 (SSH_ERROR)
Closed, ResolvedPublic

Description

libssh-0.9.0 used with following code on CentOS Linux release 8.1.1911 and OpenSSH_8.0p1 randomly (once in 10-50 attempts) reports SSH_ERROR during the ssh_channel_read() call.

It prints "Remote Channel is Closed" although all the response data was read correctly.

#include <stdio.h>
#include <libssh/libssh.h>

int main()
{

  ssh_session   session;
  int           userauth;
  int           offset;

  char          buffer[1000], userauthlist[64];

  if (NULL == (session = ssh_new()))
  {
        printf("CANNOT INITIALIZE SESSION\n");
        return -1;
  }


  ssh_set_blocking(session, 1);

  int port = 22;
  int verbosity = SSH_LOG_PACKET;

  char* ip = "127.0.0.1";
  char* user = "root";
  char* passwd = "zabbix";

  if (0 != ssh_options_set(session, SSH_OPTIONS_HOST, ip) ||
      0 != ssh_options_set(session, SSH_OPTIONS_PORT, &port) ||
      0 != ssh_options_set(session, SSH_OPTIONS_USER, user)
      //|| 0 != ssh_options_set(session, SSH_OPTIONS_LOG_VERBOSITY, &verbosity)                                                                                                                                  
      )
  {
        printf("Cannot set SSH session options: %s",
	       ssh_get_error(session));
        return -1;
  }


  if (SSH_OK != ssh_connect(session))
  {
      printf( "Cannot establish SSH session: %s \n", ssh_get_error(session));
      return -1;
  }

  /* check which authentication methods are available */
  if (SSH_AUTH_ERROR == ssh_userauth_none(session, user))
  {
      printf("Error during authentication: %s \n", ssh_get_error(session));
      return -1;
  }

  if (SSH_AUTH_SUCCESS != ssh_userauth_password(session, user, passwd))
  {
      printf( "Password authentication failed: %s", ssh_get_error(session));
      return -1;
  }
  else
  {
      printf("password authentication succeeded");
  }

  ssh_channel channel;
  int rc;
  int nbytes;

  channel = ssh_channel_new(session);
  if (channel == NULL)
    return SSH_ERROR;


  rc = ssh_channel_open_session(channel);
  if (rc != SSH_OK)
  {
   	ssh_channel_free(channel);
        return rc;
  }

  rc = ssh_channel_request_exec(channel, "ls /tmp");
  if (rc != SSH_OK)
  {
      ssh_channel_close(channel);
      ssh_channel_free(channel);
      printf("Channel request exec error");

      return rc;
  }
 nbytes = ssh_channel_read(channel, buffer, sizeof(buffer), 0);
  while (nbytes > 0)
  {
      printf("READ OK, nbytes: %d \n",nbytes);
      int x = write(1, buffer, nbytes);

      printf("READ X, x: %d \n",x);

      if (x != (unsigned int) nbytes)
      {
          ssh_channel_close(channel);
          ssh_channel_free(channel);
          printf("write difference");

          return SSH_ERROR;
      }

      printf("CHANNEL CLOSED TEST: %d \n", ssh_channel_is_closed(channel));
      printf("CHANNEL IS EOF TEST: %d", ssh_channel_is_eof(channel));

      nbytes = ssh_channel_read(channel, buffer, sizeof(buffer), 0);
  }

  if (nbytes < 0)
  {
      printf("nbytes: %d \n",nbytes);

      printf("ERROR: %s \n",ssh_get_error(session));
      printf("CHANNEL CLOSED: %d \n", ssh_channel_is_closed(channel));
      printf("CHANNEL OPENED: %d \n", ssh_channel_is_open(channel));
      printf("CHANNEL IS EOF: %d", ssh_channel_is_eof(channel));


      ssh_channel_close(channel);
      ssh_channel_free(channel);

      return SSH_ERROR;
  }

 ssh_channel_send_eof(channel);
  ssh_channel_close(channel);
  ssh_channel_free(channel);

  printf("\n\n\n\n\n\n\n");
  return SSH_OK;
}

The issue does not seem to manifest itself with OpenSSH 7.
Note that changing various timeout options (in libssh itself or sshd) did not help.

This issue looks to be identical to the one reported here: http://www.libssh.org/archive/libssh/2019-05/0000003.html

Event Timeline

rimdenok created this task.May 22 2020, 3:38 PM
rimdenok updated the task description. (Show Details)
rimdenok renamed this task from ssh_channel_read() incorrectly returns SSH_ERROR to ssh_channel_read() incorrectly returns -1 (SSH_ERROR).May 22 2020, 3:42 PM
rimdenok updated the task description. (Show Details)
rimdenok updated the task description. (Show Details)May 22 2020, 5:34 PM
Jakuje added a subscriber: Jakuje.May 25 2020, 12:43 PM

Could it be a duplicate of an issue described and fixed in this commit [1]? It was also discussed in mailing list recently. Does it work with current master?

https://gitlab.com/libssh/libssh-mirror/-/commit/6417f5a3cac8537ac6f6ff7fc1642dfaa0917fb4

It is fixed in the master.

Is it correct that this issue:

  1. affects only 0.9 versions
  2. 0.8 are not affected
  3. next release (1.0 ?) will include a fix for it

?

Right. It affects only 0.9.x versions. The above commit mentions which revision introduced this issue. The commit is already backported in the stable-0.9 branch so it will be in the next 0.9.5 release

https://gitlab.com/libssh/libssh-mirror/-/commits/stable-0.9

zev333 added a subscriber: zev333.Jun 16 2020, 6:46 PM

I have tested the same code on master (245ad744b5ab0582fef7cf3905a717b791d7e08b commit). ssh_channel_read still return -1 sometimes.
I have enabled ssh debug and it looks like some timing problem. There is log part:

Before read
[2020/06/16 19:34:59.805290, 3] ssh_channel_read_timeout:  Read (1000) buffered : 0 bytes. Window: 64000
[2020/06/16 19:34:59.831316, 4] ssh_socket_pollcallback:  Poll callback on socket 3 (POLLIN ), out buffer 0
[2020/06/16 19:34:59.831415, 3] ssh_packet_socket_callback:  packet: read type 94 [len=48,padding=6,comp=41,payload=41]
[2020/06/16 19:34:59.831472, 3] ssh_packet_process:  Dispatching handler for packet type 94
[2020/06/16 19:34:59.831526, 3] channel_rcv_data:  Channel receiving 32 bytes data in 0 (local win=64000 remote win=2097152)
[2020/06/16 19:34:59.831567, 3] channel_default_bufferize:  placing 32 bytes into channel buffer (stdout)
[2020/06/16 19:34:59.831604, 3] channel_rcv_data:  Channel windows are now (local win=63968 remote win=2097152)
[2020/06/16 19:34:59.831642, 3] ssh_packet_need_rekey:  packet: [data_rekey_needed=0, out_blocks=19, in_blocks=82
[2020/06/16 19:34:59.831758, 3] ssh_socket_unbuffered_write:  Enabling POLLOUT for socket
[2020/06/16 19:34:59.831811, 3] packet_send2:  packet: wrote [type=93, len=16, padding_size=6, comp=9, payload=9]
[2020/06/16 19:34:59.831844, 2] grow_window:  growing window (channel 43:0) to 1280000 bytes
*[2020/06/16 19:34:59.831881, 3] ssh_packet_socket_callback:  Processing 36 bytes left in socket buffer
*[2020/06/16 19:34:59.831925, 3] ssh_packet_socket_callback:  packet: read type 96 [len=16,padding=10,comp=5,payload=5]
*[2020/06/16 19:34:59.831961, 3] ssh_packet_process:  Dispatching handler for packet type 96
*[2020/06/16 19:34:59.831998, 3] channel_rcv_eof:  Received eof on channel (43:0)
*[2020/06/16 19:34:59.832036, 3] ssh_packet_need_rekey:  packet: [data_rekey_needed=0, out_blocks=19, in_blocks=81
[2020/06/16 19:34:59.832075, 3] ssh_packet_need_rekey:  packet: [data_rekey_needed=0, out_blocks=19, in_blocks=81
READ OK, nbytes: 32

With * are marked lines that differ between the log with successful result and the log with failed results. When lines are present before first read is done - everything is fully read, otherwise it fails.
I can provide full logs if required.

Thank you for checking. It looks like I was too fast guessing the fix.

Can you share the complete logs (or at least related snippets with a bit more lines before and after) of good and bad results? I was not able to reproduce it locally.

The ssh_channel_read{,_timeout}() functions use the same function ssh_handle_packets_termination() as the issue I referenced before. This function can return SSH_AGAIN on timeout, but from your log it does not look like it would be hitting the timeout. But still I do not see a clear path what goes wrong there and why

Hello @zev333 ,

Could you please check if the change I proposed in https://gitlab.com/libssh/libssh-mirror/-/merge_requests/122 fixes the issue for you?

Sorry for delay in response.
Mentioned commit fixes problem.