Sunday, February 28, 2021

19c gridSetup.sh failed with [INS-06006] Passwordless SSH connectivity not set up between the following nodes

When running 19.3 gridSetup.sh to configure Grid Infrastructure, following error occurs at configuring passwordless SSH connectivity between cluster nodes
[INS-06006] Passwordless SSH connectivity not set up between the following node(s): [host02].

Cause - Either passwordless SSH connectivity is not setup between specified node(s) or they are not reachable. Refer to the logs for more details.

Action - Refer to the logs for more details or contact Oracle Support Services.

More Details
PRVF-5311 : File "/tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out" either does not exist or is not accessible on node "host02". 

Manually test passwordless SSH connectivity between nodes
[grid@host01]$ ssh host02       <=  Connect to second node host02 from first node host01
[grid@host02]$                  <=  Connected to host02 successfully without password

[grid@host02]$ ssh host02       <=  Connect to first node host01 from second node host02
[grid@host01]$                  <=  Connected to host01 successfully without password

Apparently, the passwordless SSH connectivity has been configured properly. Re-run gridSetup.sh in debug mode to figure out what happened,
grid@host01]$ /u01/app/19.0.0/grid _1/gridSetup.sh -debug

Launching Oracle Grid Infrastructure Setup Wizard...

[main] [ 2021-02-24 17:29:53.105 EST ] [Version.isPre:757]  version to be checked 19.0.0.0.0 major version to check against 10
[main] [ 2021-02-24 17:29:53.106 EST ] [Version.isPre:768]  isPre.java: Returning FALSE

  <<Message truncated>>

[Worker 2] [ 2021-02-24 17:31:26.521 EST ] [Utils.getLocalHost:487]  Hostname retrieved: host01, returned: host01
[Worker 2] [ 2021-02-24 17:31:26.521 EST ] [Utils.getLocalHost:487]  Hostname retrieved: host01, returned: host01
[Worker 2] [ 2021-02-24 17:31:26.521 EST ] [UnixSystem.remoteCopyFile:848]  UnixSystem: /usr/bin/scp -p host02:'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out' /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
[Worker 2] [ 2021-02-24 17:31:26.534 EST ] [RuntimeExec.runCommand:294]  runCommand: Waiting for the process
[Thread-442] [ 2021-02-24 17:31:26.534 EST ] [StreamReader.run:62]  In StreamReader.run 
[Thread-443] [ 2021-02-24 17:31:26.534 EST ] [StreamReader.run:62]  In StreamReader.run 
[Thread-443] [ 2021-02-24 17:31:26.855 EST ] [StreamReader.run:66]  ERROR>protocol error: filename does not match request
[Worker 2] [ 2021-02-24 17:31:26.855 EST ] [RuntimeExec.runCommand:296]  runCommand: process returns 1
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.runCommand:323]  RunTimeExec: error>
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.runCommand:326]  protocol error: filename does not match request
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:516]  Calling Runtime.exec() with the command 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  /usr/bin/scp 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  -p 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  host02:'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out' 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.runCommand:349]  Returning from RunTimeExec.runCommand
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [NativeSystem.rununixcmd:1345]  NativeSystem.rununixcmd: RetString 0|protocol error: filename does not match request :failed
[Worker 2] [ 2021-02-24 17:31:26.859 EST ] [ClusterConfig$ExecuteCommand.returnCommandToClient:3324]  returnCommandToClient; fillCount=1 is full=false
[Worker 2] [ 2021-02-24 17:31:26.859 EST ] [Semaphore.release:88]  SyncBufferFull:Release called by thread Worker 2 m_count=2

When validating/configuring passwordless SSH connectivity, it has to copy SSH key information between nodes using scp. From the trace we can find that, it uses scp with -p option to copy the files, and destination file on local server (host01) is sent to scp with file name:

  /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
  
And source file on remote server(host02) is sent to scp with file name:

  host02:'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out'

Interesting thing is that the whole file path includes single quotation marks, I guess Oracle programmer left them there by accident because there is no reason to use them. Let's verify if it is the culprit by manually running scp
[grid@host01]$ scp -p host02:"'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out'" /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
protocol error: filename does not match request
[grid@host01]$
[grid@host01]$ scp -T -p host02:"'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out'" /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
getFileInfo1638946.out                                             98    56.5KB/s   00:00
[grid@host01]$

The scp fails with same error "protocol error: filename does not match request", but succeeds if extra option -T is used.

The -T option was introduced by OpenSSH 8.0 released in April 2019. In earlier version of OpenSSH, when copying files from a remote system to a local directory, scp did not verify that the file names that the server sent matched those requested by the client. This could allow a hostile server to create or clobber unexpected local files with attacker-controlled content. OpenSSH 8.0 fixed this security issue and scp, by default, verifies the file name on client side, and also introduced -T option to provide capacity to disable the verification.

Although OpenSSH officially claims that the fix is introduced in 8.0, gridSetup.sh 19.3 also fails with same reason on AIX with OpenSSH 7.5p1 and it is where the errors used in this article happened.

Oracle gridSetup.sh 19.3 (base pubic release of 19c) sends remote file name with single quotation marks, but remote server returns file name without quotation. Techinally, they are same thing, but they are visually different. Therefore, old version scp worked because it did not verify them, but current scp fails it with "filename does not match".

Oracle Release Update 19.6 fixed this problem by removing the single quotation marks. Therefore, gridSetup.sh can be run successfully with -applyRU option to apply 19.6 or higher Realease Update before installing/configuring GI
[grid@r6-dart]$ ./gridSetup.sh -applyRU /u01/stage/30501910
Preparing the home to patch...
Applying the patch /u01/stage/grid/30501910...
Successfully applied the patch.
The log can be found at: /u01/app/oraInventory/logs/GridSetupActions2021-02-24_03-25-57PM/installerPatchActions_2021-02-24_03-25-57PM.log
...

Here, Oracle Grid Infrastructure Release Update 19.6 (Patch 30501910) is unzipped under directory /u01/stage/30501910.

If 19.3 is really needed for some reason, as a temporary workaround, we can rename scp and create a new scp
# Rename the original scp
mv /usr/bin/scp /usr/bin/scp.bak

# Create a new file scp
echo "/usr/bin/scp.orig -T $*" > /usr/bin/scp

# Make the file executable
chmod a+rx /usr/bin/scp

After successfully installing GI 19.3, remember restore original scp
# Delete interim scp
rm /usr/bin/scp

# Restore the original scp.
mv /usr/bin/scp.bak /usr/bin/scp

No comments: