Showing posts with label Upgrade and Migration. Show all posts
Showing posts with label Upgrade and Migration. Show all posts

Wednesday, June 16, 2021

Oracle 19.11 roothas.sh failed with "Out of memory" on AIX

When applying Oracle GI Release Update 19.11.0.0.210420 on AIX, command "roothas.sh -postpatch" failed with "Out of memory" as following
[root@host01]# /u01/app/oracle/product/19.0.0/grid_1/crs/install/roothas.sh -postpatch
Using configuration parameter file: /u01/app/oracle/product/19.0.0/grid_1/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/oracle/crsdata/host01/crsconfig/hapatch_2021-06-15_01-53-27PM.log
Out of memory!
Out of memory!
Out of memory!
/u01/app/oracle/product/19.0.0/grid_1/crs/install/roothas.sh[137]: 7930494 Segmentation fault(coredump)
The command '/u01/app/oracle/product/19.0.0/grid_1/perl/bin/perl -I/u01/app/oracle/product/19.0.0/grid_1/perl/lib -I/u01/app/oracle/product/19.0.0/grid_1/crs/install -I/u01/app/oracle/product/19.0.0/grid_1/xag /u01/app/oracle/product/19.0.0/grid_1/crs/install/roothas.pl -postpatch' execution failed

The error is coming from the perl process (script roothas.pl) executing the root configuration scripts, the perl process does not have enough memory(especially data segments) during the script execution.

Technically, the "Out of memory" error could be seen on AIX while running Oracle 19c perl scripts for root configuration. These perl scripts are usually executed/called by DBA with sh script root.sh, rootupgrade.sh, rootcrs.sh, roothas.sh, etc. Although the error was hard to see before, it is becoming popular since Oracle GI RU 19.11 was released.

It is caused by OS memory allocation method. On AIX, the number of data segments that a process is allowed to use also limits the process memory size. The default number of data segments is one. The size of a data segment is 256 MB. Data segments are shared for both data and stack. The maximum number of additional data segments a process can use is eight (2GB). The number of segments that a process can use for data is controlled by the LDR_CNTRL environment variable. It is defined in the parent process of the process that is to be affected.

Therefore, we are able to fix the issue by increasing process memory size through increasing value of environment variable LDR_CNTRL. For example, the following defines eight additional data segments

export LDR_CNTRL=MAXDATA=0x80000000
<root script>
unset LDR_CNTRL

Here, 
  <root script> is what you have to run as root which got "Out of memory". In my case, it is "roothas.sh -postpatch".
  unset command remove (unset) the LDR_CNTRL environment variable, so that it does not unintentionally affect other processes.

Some argumentative guy says eight additional data segments (2GB) is too large, ok, you can set to 4 as following

export LDR_CNTRL=MAXDATA=0x40000000

Tuesday, March 30, 2021

OEM 13.2 Exported Ruleset Failed with 'Error in parsing properties for RuleSet from the XML' when Importing into EM 13.x

When importing ruleset in Enterprise Manager 13.x (confirmed in 13.2 and 13.4) from xml file exported from Enterprise Manager 13.2, got following error,

  Error in parsing properties for RuleSet from the XML

If the destination EM version is 13.2, following message will be found from trace file emoms.trc under directory <OMS_INSTANCE_BASE>/em/EMGC_OMS1/sysman/log
2021-03-29 22:49:12,394 [[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'] ERROR exportimport.ImportHelper logp.251 - Error in parsing properties for RuleSet from the XML
oracle.sysman.emSDK.app.exception.EMSystemException
     at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesModelUtil.throwEMSystemException(RulesModelUtil.java:1486)
     at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesFwkEntityImpl.postChanges(RulesFwkEntityImpl.java:116)
     ...
    ... 100 more
Caused by: oracle.jbo.JboException: JBO-29000: Unexpected exception caught: oracle.jbo.DMLConstraintException, msg=JBO-26048: Constraint "EM_RULE_EXPRESSIONS_UK1" is violated during post operation "Insert" using SQL statement "INSERT INTO EM_RULE_EXPRESSIONS(SELECTION_OBJECT_ID,EXPRESSION_ID,EXPRESSION_GROUP_ID,ATTR_ID,OPERATOR_CODE,ATTR_VALUE,SEL_OBJ_TYPE,EXPRESSION_TYPE,UI_GROUP_ID,EXCLUSION_GROUP_ID) VALUES (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10)".
    at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesFwkEntityImpl.doDML(RulesFwkEntityImpl.java:201)
    at oracle.jbo.server.EntityImpl.postChanges(EntityImpl.java:7271)
    at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesFwkEntityImpl.postChanges(RulesFwkEntityImpl.java:97)
    ... 104 more
Caused by: oracle.jbo.DMLConstraintException: JBO-26048: Constraint "EM_RULE_EXPRESSIONS_UK1" is violated during post operation "Insert" using SQL statement "INSERT INTO EM_RULE_EXPRESSIONS(SELECTION_OBJECT_ID,EXPRESSION_ID,EXPRESSION_GROUP_ID,ATTR_ID,OPERATOR_CODE,ATTR_VALUE,SEL_OBJ_TYPE,EXPRESSION_TYPE,UI_GROUP_ID,EXCLUSION_GROUP_ID) VALUES (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10)".
    at oracle.jbo.server.OracleSQLBuilderImpl.doEntityDML(OracleSQLBuilderImpl.java:565)
    at oracle.jbo.server.EntityImpl.doDML(EntityImpl.java:9098)
    at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesFwkEntityImpl.doDML(RulesFwkEntityImpl.java:196)
    ... 106 more
Caused by: java.sql.SQLIntegrityConstraintViolationException: ORA-00001: unique constraint (SYSMAN.EM_RULE_EXPRESSIONS_UK1) violated

If the destination EM version is 13.4 (or 13.3), following message will be found from trace file emoms.trc
2021-03-29 22:04:29,284 [[ACTIVE] ExecuteThread: '70' for queue: 'weblogic.kernel.Default (self-tuning)'] ERROR exportimport.ImportHelper logp.251 - Error in parsing properties for RuleSet from the XML
oracle.sysman.emSDK.app.exception.EMSystemException
    at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesModelUtil.throwEMSystemException(RulesModelUtil.java:1509)
    at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesFwkEntityImpl.postChanges(RulesFwkEntityImpl.java:106)
    ...
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:420)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:360)
Caused by: oracle.jbo.RowAlreadyDeletedException: JBO-25019: Entity row with key oracle.jbo.Key[BEB7FD94E2112377E053DA570F89CC8F ] is not found in EmRuleSetsEO.
    at oracle.jbo.server.OracleSQLBuilderImpl.doEntitySelectForAltKey(OracleSQLBuilderImpl.java:811)
    at oracle.jbo.server.BaseSQLBuilderImpl.doEntitySelect(BaseSQLBuilderImpl.java:554)
    at oracle.jbo.server.EntityImpl.doSelect(EntityImpl.java:9133)
    at oracle.jbo.server.EntityImpl.lock(EntityImpl.java:6612)
    at oracle.jbo.server.EntityImpl.beforePost(EntityImpl.java:7150)
    at oracle.jbo.server.EntityImpl.postChanges(EntityImpl.java:7384)
    at oracle.sysman.core.event.rules.uimodel.rulesfwk.RulesFwkEntityImpl.postChanges(RulesFwkEntityImpl.java:97)
    ... 95 more

It is a bug of EM 13.2, it can be fixed by apply OMS one-off patch 25986453 on source EM 13.2. The patch is also needed for destination EM if the version is 13.2.

Monday, March 1, 2021

19c runcluvfy.sh faile with PRVF-7596 PRVG-2002

When upgrading Oracle Infrastructure 12.2 to 19c, runcluvfy.sh failed with following messages
Verifying OCR Integrity ...FAILED
host02: PRVF-7596 : CSS is probably working with a non-clustered, local-only
          configuration on node "host02"
Verifying resolv.conf Integrity ...FAILED
host02: PRVG-2002 : Encountered error in copying file "/etc/resolv.conf" from
          node "host02" to node "host01"
          protocol error: filename does not match request
Verifying DNS/NIS name service ...FAILED
host02: PRVG-2002 : Encountered error in copying file "/etc/netsvc.conf" from
          node "host02" to node "host01"
          protocol error: filename does not match request

Set log location and re-run runcluvfy.sh
# runcluvfy.sh will write log files to directory /u01/tmp
export CV_TRACELOC=/u01/tmp
# re-run runcluvfy.sh
./runcluvfy.sh stage -pre crsinst -upgrade -rolling -src_crshome /u01/app/12.2.0/grid_1 -dest_crshome /u01/app/19.0.0/grid_1 -dest_version 19.0.0.0.0 -fixup -verbose
# Log files created
cd /u01/tmp
ls -l
total 15200
-rw-r--r--    1 grid     oinstall     952467 Feb 24 16:52 cvuhelper.log.0
-rw-r--r--    1 grid     oinstall          0 Feb 24 16:48 cvuhelper.log.0.lck
-rw-r--r--    1 grid     oinstall    6824223 Feb 24 16:52 cvutrace.log.0

The log file cvutrace.log.0 shows messsage
[Worker 0] [ 2021-02-24 16:47:49.784 EST ] [UnixSystem.remoteCopyFile:848]  UnixSystem: /usr/bin/scp -p host02:'/tmp/CVU_19.0.0.0.0_grid/scratch/getFileInfo3605304.out' /tmp/host02.getFileInfo3605304.out
[Worker 0] [ 2021-02-24 16:47:49.798 EST ] [RuntimeExec.runCommand:294]  runCommand: Waiting for the process
[Thread-83] [ 2021-02-24 16:47:49.798 EST ] [StreamReader.run:62]  In StreamReader.run
[Thread-82] [ 2021-02-24 16:47:49.798 EST ] [StreamReader.run:62]  In StreamReader.run
[Thread-83] [ 2021-02-24 16:47:50.201 EST ] [StreamReader.run:66]  ERROR>protocol error: filename does not match request
...
[main] [ 2021-02-24 16:51:46.747 EST ] [UnixSystem.remoteCopyFile:848]  UnixSystem: /usr/bin/scp -p host02:'/etc/resolv.conf' /tmp/CVU_19.0.0.0.0_grid/scratch/resolv.conf_host02
[main] [ 2021-02-24 16:51:46.760 EST ] [RuntimeExec.runCommand:294]  runCommand: Waiting for the process
[Thread-1060] [ 2021-02-24 16:51:46.760 EST ] [StreamReader.run:62]  In StreamReader.run
[Thread-1059] [ 2021-02-24 16:51:46.760 EST ] [StreamReader.run:62]  In StreamReader.run
[main] [ 2021-02-24 16:51:47.148 EST ] [RuntimeExec.runCommand:296]  runCommand: process returns 1
[Thread-1060] [ 2021-02-24 16:51:47.149 EST ] [StreamReader.run:66]  ERROR>protocol error: filename does not match request
...
[main] [ 2021-02-24 16:51:47.548 EST ] [UnixSystem.remoteCopyFile:848]  UnixSystem: /usr/bin/scp -p host02:'/etc/netsvc.conf' /tmp/CVU_19.0.0.0.0_grid/scratch/nssw_conf_host02
[Thread-1066] [ 2021-02-24 16:51:47.559 EST ] [StreamReader.run:62]  In StreamReader.run
[main] [ 2021-02-24 16:51:47.559 EST ] [RuntimeExec.runCommand:294]  runCommand: Waiting for the process
[Thread-1065] [ 2021-02-24 16:51:47.560 EST ] [StreamReader.run:62]  In StreamReader.run
[main] [ 2021-02-24 16:51:47.965 EST ] [RuntimeExec.runCommand:296]  runCommand: process returns 1
[Thread-1066] [ 2021-02-24 16:51:47.973 EST ] [StreamReader.run:66]  ERROR>protocol error: filename does not match request

All three errors occurred when scp is executed to copy file from remote node host02 to local node.
Interesting thing is that all three commands are using single quotation marks to enclose remote file name, I guess Oracle programmer left them there by accident because there is no reason to use them. Let's verify if it is the culprit by manually running scp
[grid@host01]$ scp -p host02:"'/etc/resolv.conf'" /tmp/CVU_19.0.0.0.0_grid/scratch/resolv.conf_host02
protocol error: filename does not match request
[grid@host01]$
[grid@host01]$ scp -T -p host02:"'/etc/resolv.conf'" /tmp/CVU_19.0.0.0.0_grid/scratch/resolv.conf_host02
resolv.conf                                             96    51.3KB/s   00:00
[grid@host01]$

The scp fails with same error "protocol error: filename does not match request", but succeeds if extra option -T is used.

The -T option was introduced by OpenSSH 8.0 released in April 2019. In earlier version of OpenSSH, when copying files from a remote system to a local directory, scp did not verify that the file names that the server sent matched those requested by the client. This could allow a hostile server to create or clobber unexpected local files with attacker-controlled content. OpenSSH 8.0 fixed this security issue and scp, by default, verifies the file name on client side, and also introduced -T option to provide capacity to disable the verification.

Although OpenSSH officially claims that the fix is introduced in 8.0, runcluvfy.sh shipped with Grid Infrastructure 19.3 also fails with same reason on AIX with OpenSSH 7.5p1 and it is where the errors used in this article happened.

This version of runcluvfy.sh sends remote file name with single quotation marks, but remote server returns file name without quotation. Technically, they are same thing, but they are visually different. Therefore, old version scp worked because it did not verify them, but current scp fails it with "filename does not match".

It can be resolved by downloading new standalone version Cluster Verification Utility from My Oracle Support(Patch 30839369). Note: the executable is cluvfy instead of runcluvfy.sh. If download is not the option. As a temporary workaround, we can rename scp and create a new scp
# Rename the original scp
mv /usr/bin/scp /usr/bin/scp.bak
# Create a new file scp
echo "/usr/bin/scp.orig -T $*" > /usr/bin/scp
# Make the file executable
chmod a+rx /usr/bin/scp

Now, the errors are gone. After successfully installing GI, remember to restore original scp
# Delete interim scp
rm /usr/bin/scp
# Restore the original scp.
mv /usr/bin/scp.bak /usr/bin/scp

Sunday, February 28, 2021

19c gridSetup.sh failed with [INS-06006] Passwordless SSH connectivity not set up between the following nodes

When running 19.3 gridSetup.sh to configure Grid Infrastructure, following error occurs at configuring passwordless SSH connectivity between cluster nodes
[INS-06006] Passwordless SSH connectivity not set up between the following node(s): [host02].

Cause - Either passwordless SSH connectivity is not setup between specified node(s) or they are not reachable. Refer to the logs for more details.

Action - Refer to the logs for more details or contact Oracle Support Services.

More Details
PRVF-5311 : File "/tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out" either does not exist or is not accessible on node "host02". 

Manually test passwordless SSH connectivity between nodes
[grid@host01]$ ssh host02       <=  Connect to second node host02 from first node host01
[grid@host02]$                  <=  Connected to host02 successfully without password

[grid@host02]$ ssh host02       <=  Connect to first node host01 from second node host02
[grid@host01]$                  <=  Connected to host01 successfully without password

Apparently, the passwordless SSH connectivity has been configured properly. Re-run gridSetup.sh in debug mode to figure out what happened,
grid@host01]$ /u01/app/19.0.0/grid _1/gridSetup.sh -debug

Launching Oracle Grid Infrastructure Setup Wizard...

[main] [ 2021-02-24 17:29:53.105 EST ] [Version.isPre:757]  version to be checked 19.0.0.0.0 major version to check against 10
[main] [ 2021-02-24 17:29:53.106 EST ] [Version.isPre:768]  isPre.java: Returning FALSE

  <<Message truncated>>

[Worker 2] [ 2021-02-24 17:31:26.521 EST ] [Utils.getLocalHost:487]  Hostname retrieved: host01, returned: host01
[Worker 2] [ 2021-02-24 17:31:26.521 EST ] [Utils.getLocalHost:487]  Hostname retrieved: host01, returned: host01
[Worker 2] [ 2021-02-24 17:31:26.521 EST ] [UnixSystem.remoteCopyFile:848]  UnixSystem: /usr/bin/scp -p host02:'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out' /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
[Worker 2] [ 2021-02-24 17:31:26.534 EST ] [RuntimeExec.runCommand:294]  runCommand: Waiting for the process
[Thread-442] [ 2021-02-24 17:31:26.534 EST ] [StreamReader.run:62]  In StreamReader.run 
[Thread-443] [ 2021-02-24 17:31:26.534 EST ] [StreamReader.run:62]  In StreamReader.run 
[Thread-443] [ 2021-02-24 17:31:26.855 EST ] [StreamReader.run:66]  ERROR>protocol error: filename does not match request
[Worker 2] [ 2021-02-24 17:31:26.855 EST ] [RuntimeExec.runCommand:296]  runCommand: process returns 1
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.runCommand:323]  RunTimeExec: error>
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.runCommand:326]  protocol error: filename does not match request
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:516]  Calling Runtime.exec() with the command 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  /usr/bin/scp 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  -p 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  host02:'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out' 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.traceCmdEnv:518]  /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out 
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [RuntimeExec.runCommand:349]  Returning from RunTimeExec.runCommand
[Worker 2] [ 2021-02-24 17:31:26.856 EST ] [NativeSystem.rununixcmd:1345]  NativeSystem.rununixcmd: RetString 0|protocol error: filename does not match request :failed
[Worker 2] [ 2021-02-24 17:31:26.859 EST ] [ClusterConfig$ExecuteCommand.returnCommandToClient:3324]  returnCommandToClient; fillCount=1 is full=false
[Worker 2] [ 2021-02-24 17:31:26.859 EST ] [Semaphore.release:88]  SyncBufferFull:Release called by thread Worker 2 m_count=2

When validating/configuring passwordless SSH connectivity, it has to copy SSH key information between nodes using scp. From the trace we can find that, it uses scp with -p option to copy the files, and destination file on local server (host01) is sent to scp with file name:

  /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
  
And source file on remote server(host02) is sent to scp with file name:

  host02:'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out'

Interesting thing is that the whole file path includes single quotation marks, I guess Oracle programmer left them there by accident because there is no reason to use them. Let's verify if it is the culprit by manually running scp
[grid@host01]$ scp -p host02:"'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out'" /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
protocol error: filename does not match request
[grid@host01]$
[grid@host01]$ scp -T -p host02:"'/tmp/GridSetupActions2021-02-24_05-29-41PM/CVU_19.0.0.0.0_grid/scratch/getFileInfo1638946.out'" /tmp/GridSetupActions2021-02-24_05-29-41PM/host02.getFileInfo1638946.out
getFileInfo1638946.out                                             98    56.5KB/s   00:00
[grid@host01]$

The scp fails with same error "protocol error: filename does not match request", but succeeds if extra option -T is used.

The -T option was introduced by OpenSSH 8.0 released in April 2019. In earlier version of OpenSSH, when copying files from a remote system to a local directory, scp did not verify that the file names that the server sent matched those requested by the client. This could allow a hostile server to create or clobber unexpected local files with attacker-controlled content. OpenSSH 8.0 fixed this security issue and scp, by default, verifies the file name on client side, and also introduced -T option to provide capacity to disable the verification.

Although OpenSSH officially claims that the fix is introduced in 8.0, gridSetup.sh 19.3 also fails with same reason on AIX with OpenSSH 7.5p1 and it is where the errors used in this article happened.

Oracle gridSetup.sh 19.3 (base pubic release of 19c) sends remote file name with single quotation marks, but remote server returns file name without quotation. Techinally, they are same thing, but they are visually different. Therefore, old version scp worked because it did not verify them, but current scp fails it with "filename does not match".

Oracle Release Update 19.6 fixed this problem by removing the single quotation marks. Therefore, gridSetup.sh can be run successfully with -applyRU option to apply 19.6 or higher Realease Update before installing/configuring GI
[grid@r6-dart]$ ./gridSetup.sh -applyRU /u01/stage/30501910
Preparing the home to patch...
Applying the patch /u01/stage/grid/30501910...
Successfully applied the patch.
The log can be found at: /u01/app/oraInventory/logs/GridSetupActions2021-02-24_03-25-57PM/installerPatchActions_2021-02-24_03-25-57PM.log
...

Here, Oracle Grid Infrastructure Release Update 19.6 (Patch 30501910) is unzipped under directory /u01/stage/30501910.

If 19.3 is really needed for some reason, as a temporary workaround, we can rename scp and create a new scp
# Rename the original scp
mv /usr/bin/scp /usr/bin/scp.bak

# Create a new file scp
echo "/usr/bin/scp.orig -T $*" > /usr/bin/scp

# Make the file executable
chmod a+rx /usr/bin/scp

After successfully installing GI 19.3, remember restore original scp
# Delete interim scp
rm /usr/bin/scp

# Restore the original scp.
mv /usr/bin/scp.bak /usr/bin/scp

Saturday, February 27, 2021

19c Upgrade - Fail with ORA-07445: exception encountered: core dump [joevm_invokevirtual()+1260]

When 12.2 pluggable database is upgraded to 19c, dbupgrade fails with message
**********   Upgrading Java and non-Java   *********
Serial   Phase #:53   [PDB1] Files:2 
Unexpected error encountered in catctlMain; Error Stack Below; exiting
Died at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 9188.
 at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 9188.
main::catctlDie("\x{a}Unexpected error encountered in catconExec; exiting\x{a} No such"...) called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 5650
main::catctlExecutePhaseFiles(53, 2, "PDB1", "", undef) called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 2035
main::catctlRunPhase(53, 2, "PDB1", "", undef) called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 2197
main::catctlRunPhases(0, 108, 108, "PDB1", "", undef) called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 2816
main::catctlRunMainPhases() called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 1463
main::catctlMain() called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 1370
eval {...} called at /u01/app/oracle/product/19.0.0/dbhome_1/rdbms/admin/catctl.pl line 1368

------------------------------------------------------
CATCTL FATAL ERROR
------------------------------------------------------

 LOG FILES: (/u01/app/oracle/product/19.0.0/dbhome_1/cfgtoollogs/DB01/upgrade20210222125535/catupgrdPDB1*.log)
TRACE FILE: (/u01/app/oracle/product/19.0.0/dbhome_1/cfgtoollogs/DB01/upgrade20210222125535/catupgrdPDB1_trace.log)

In log file catupgrdPDB1*.log
13:09:24 SQL> -- Load all the Java classes
13:09:24 SQL> begin if initjvmaux.startstep('CREATE_JAVA_SYSTEM') then
13:09:24   2  initjvmaux.rollbacksetup;
13:09:24   3  commit;
13:09:24   4  initjvmaux.rollbackset;
13:09:24   5  initjvmaux.exec('create or replace java system');
13:09:24   6  commit;
13:09:24   7  initjvmaux.rollbackcleanup;
13:09:24   8  initjvmaux.endstep;
13:09:24   9  dbms_registry.update_schema_list('JAVAVM',
13:09:24  10        dbms_registry.schema_list_t('OJVMSYS'));
13:09:24  11  end if; end;
13:09:24  12  /
begin if initjvmaux.startstep('CREATE_JAVA_SYSTEM') then
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
ERROR:
ORA-03114: not connected to ORACLE

In database alert log file
2021-02-22T13:09:41.671656-05:00
PDB1(4):Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x4] [PC:0x118CE298C, joevm_invokevirtual()+1260] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/db01/DB01/trace/DB01_2_ora_4739.trc  (incident=28841) (PDBNAME=PDB1):
ORA-07445: exception encountered: core dump [joevm_invokevirtual()+1260] [SIGSEGV] [ADDR:0x4] [PC:0x118CE298C] [Address not mapped to object] []
PDB1(4):Incident details in: /u01/app/oracle/diag/rdbms/db01/DB01/incident/incdir_28841/DB01_ora_4739_i28841.trc
PDB1(4):Use ADRCI or Support Workbench to package the incident.

The culprit is that the PDB (PDB1) was plugged from NONCDB which was followed by running noncdb_to_pdb.sql.  It can be fixed as following,

1. Connect to the database as SYS and run following sql commands in root container CDB$ROOT
alter session set container=CDB$ROOT;

alter session set "_ORACLE_SCRIPT"=true;

create or replace view rootobj sharing=object as
   select obj#,o.name,u.name uname,o.type#,o.flags
     from obj$ o,user$ u where owner#=user#;

2. Connect to the database as SYS and run following sql commands in PDB container PDB1 (your PDB name may be different)
alter session set container=PDB1;

create or replace view rootobj sharing=object as
   select obj#,o.name,u.name uname,o.type#,o.flags
     from obj$ o,user$ u where owner#=user#;

update obj$ set flags=flags+65536
 where type# in (28,29,30,56)
       and bitand(flags,65536)=0
       and obj# in (select o.obj# from obj$ o,user$ u,rootobj r
                     where o.name=r.name and o.type#=r.type# and o.owner#=u.user# and u.name=r.uname and bitand(r.flags,65536)!=0
                    union
                    select obj# from obj$ where bitand(flags,4259840)=4194304);

delete from sys.idl_ub1$
 where obj# in (select obj# from sys.obj$ where bitand(flags, 65536)=65536 and type# in (28,29,30,56));

commit;

3. Re-start upgrade from failed step (Phase #:53 in my case)
$ORACLE_HOME/bin/dbupgrade -p 53 -c 'PDB1'

Tuesday, October 22, 2019

DBUA 12.2 Failed while PDBS Recompile Invalid Objects

When upgrading 12.1 database to 12.2 with DBUA, got following error,

Server errors encountered during the execution of "PDBS Recompile Invalid Objects".

Following information is found in DBUA log file,
SQL> EXECUTE dbms_registry_sys.validate_components;
...(20:04:27) Starting validate_apex for APEX_180200
ORA-20001: MISSING GRANT: grant execute on "MDSYS"."SDO_DIM_ARRAY" to APEX_180200
ORA-20001: MISSING GRANT: grant execute on "MDSYS"."SDO_DIM_ELEMENT" to APEX_180200
ORA-20001: MISSING GRANT: grant execute on "MDSYS"."SDO_ELEM_INFO_ARRAY" to APEX_180200
ORA-20001: MISSING GRANT: grant execute on "MDSYS"."SDO_GEOMETRY" to APEX_180200
ORA-20001: MISSING GRANT: grant execute on "MDSYS"."SDO_ORDINATE_ARRAY" to APEX_180200
ORA-20001: MISSING GRANT: grant execute on "MDSYS"."SDO_POINT_TYPE" to APEX_180200


These privileges were granted by SYS (SYS is grantor) in original database, and missing during upgrading.

Object privileges in original database,
SQL> select GRANTEE,GRANTOR,OWNER,TABLE_NAME,PRIVILEGE from dba_tab_privs
  2   where owner='MDSYS' and grantee='APEX_180200'
  3   order by grantor,table_name;

GRANTEE        GRANTOR    OWNER     TABLE_NAME                     PRIVILEGE
-------------- ---------- --------- ------------------------------ ----------------------
APEX_180200    MDSYS      MDSYS     MDERR                          EXECUTE
APEX_180200    MDSYS      MDSYS     SDO_GEOM_METADATA_TABLE        DELETE
APEX_180200    MDSYS      MDSYS     SDO_GEOM_METADATA_TABLE        SELECT
APEX_180200    MDSYS      MDSYS     SDO_GEOM_METADATA_TABLE        INSERT
APEX_180200    MDSYS      MDSYS     SDO_META                       EXECUTE
APEX_180200    MDSYS      MDSYS     SDO_UTIL                       EXECUTE
APEX_180200    MDSYS      MDSYS     USER_SDO_INDEX_INFO            SELECT
APEX_180200    SYS        MDSYS     SDO_DIM_ARRAY                  EXECUTE
APEX_180200    SYS        MDSYS     SDO_DIM_ELEMENT                EXECUTE
APEX_180200    SYS        MDSYS     SDO_ELEM_INFO_ARRAY            EXECUTE
APEX_180200    SYS        MDSYS     SDO_GEOMETRY                   EXECUTE
APEX_180200    SYS        MDSYS     SDO_ORDINATE_ARRAY             EXECUTE
APEX_180200    SYS        MDSYS     SDO_POINT_TYPE                 EXECUTE


Object privileges in current (partly upgraded) database,
SQL> select GRANTEE,GRANTOR,OWNER,TABLE_NAME,PRIVILEGE from dba_tab_privs
  2   where owner='MDSYS' and grantee='APEX_180200'
  3   order by grantor,table_name;

GRANTEE        GRANTOR    OWNER      TABLE_NAME                     PRIVILEGE
-------------- ---------- ---------- ------------------------------ --------------------
APEX_180200    MDSYS      MDSYS      MDERR                          EXECUTE
APEX_180200    MDSYS      MDSYS      SDO_GEOM_METADATA_TABLE        DELETE
APEX_180200    MDSYS      MDSYS      SDO_GEOM_METADATA_TABLE        INSERT
APEX_180200    MDSYS      MDSYS      SDO_GEOM_METADATA_TABLE        SELECT
APEX_180200    MDSYS      MDSYS      SDO_META                       EXECUTE
APEX_180200    MDSYS      MDSYS      SDO_UTIL                       EXECUTE
APEX_180200    MDSYS      MDSYS      USER_SDO_INDEX_INFO            SELECT


Workaround

Manually grant the missing privileges and click retry in DBUA.

Wednesday, August 21, 2019

12.2: Preupgrade Utility Fails with ORA-06512: at "SYS.DBMS_PREUP"

While running 12.2 Pre-Upgrade Utility (manually or with DBUA) on 12.1 database, received following errors,
Error in preupgrade tool execution.
ERROR -
ERROR - Unable to run sqlplus due to:
ORA-06512: at "SYS.DBMS_PREUP", line 4401
ORA-06512: at "SYS.DBMS_PREUP", line 9714
ORA-06512: at line 8

declare
*
ERROR at line 1:
ORA-01426: numeric overflow
ORA-06512: at line 56

if you are executing the preupgrade tool from a standby database make sure the preupgrade tool has been executed in the primary database If you are executing the tool from Windows, please make sure you have given the appropriate permissions to the Oracle binaries owner.


This is the bug of the Pre-Upgrade Utility installed in current 12.2 home. It can be fixed by installing new version of the utility as following,

1. Download utility ZIP file from My Oracle Support following the instruction of document "How to Download and Run Oracle's Database Pre-Upgrade Utility (Doc ID 884522.1)"
  
2. The current available ZIP file for 12.2 (target upgrade-to version) is preupgrade_12201_cbuild_22_lf.zip, it contains following files,
   dbms_registry_basic.sql         
   dbms_registry_extended.sql
   preupgrade.jar
   preupgrade_driver.sql
   preupgrade_messages.properties
   preupgrade_package.sql

3. Unzip downloaded file to directory $ORACLE_HOME/rdbms/amdin, here $ORACLE_HOME is 12.2 home.

   Note: Rename existing files before unzipping, do not override any files.

Sunday, November 18, 2018

Patch or Upgrade Single instance Database and Oracle Restart Using OEM 13c Fleet Maintenance

Oracle Enterprise Manager (OEM) 13c Database Fleet Maintenance is a centralized mechanism for patching/upgrading of Oracle homes and maintain consistency. It allows OEM Cloud Control administrators to patch database, Oracle Restart (Standalone Grid Infrastructure) or Grid Infrastructure with minimal downtime.

Friday, August 3, 2018

DBUA 12.2 failed with ORA-01157 at preupgrade tool execution

DBUA got following errors at 'Prerequisite Check' (perupgrade tool execution) while upgrading Oracle database 11.2.0.4 to 12.2.0.1
Error in preupgrade tool execution. ERROR - ERROR - Unable to run sqlplus due to: ORA-06512: at "SYS.DBMS_PREUP",
line 4386 ORA-06512: at "SYS.DBMS_PREUP", line 9108 ORA-06512: at line 8 declare * ERROR at line 1: ORA-01157:
cannot identify/lock data file 202 - see DBWR trace file ORA-01110: data file 202: '+DATA' ORA-06512: at line 56
if you are executing the preupgrade tool from a standby database make sure the preupgrade tool has been executed
in the primary database If you are executing the tool from Windows, please make sure you have given the appropriate
permissions to the Oracle binaries owner. (more details)

Monday, February 19, 2018

ORA-01017 invalid username/password after 11g database upgraded to 12.2 pluggable database

Oracle database 11.2.0.3 was upgraded to 12.2.0.1 and plugged into CDB as pluggable database with 'create pluggable database ... using ...' command. The container database (CDB) was creted with 12.2 dbca and all built-in users' passwords (sys,system,dbsnmp, etc) were set to same ones as in original 11g database.

Saturday, January 20, 2018

Upgrading Grid Infrastructure from 12.1 to 12.2 failed with [INS-20802]

Customer is upgrading two-node Oracle Grid Infrastructure from version 12.1.0.2 to version 12.2.0.1, and gridSetup.sh returned error [INS-20802] during Preparing for deploying Oracle Grid Infrastructure Management Repository.

Wednesday, January 17, 2018

ORA-01403: no data found on ctxposup.sql while upgrading from 11g to 12.2

From: Oracle Database - Enterprise Edition - Version 11.2.0.3 + PSU 11.2.0.3.5 (14727310)
To:   Oracle Database - Enterprise Edition - Version 12.2.0.1.0
Operation System -  Solaris 10 1/13 s10s_u11wos_24a SPARC


It failed to upgrade database from 11.2.3 to 12.2 with error 'ORA-01403: no data found', and following message is found from log file 'Oracle_Server.log',

REASON:
      ERRORS FOUND: During Upgrade
         FILENAME: /u01/app/oracle/cfgtoollogs/dbua/upgrade2018-01-17_11-42-08-AM/DB01/catupgrd0.log AT LINE NUMBER: 825076
------------------------------------------------------
Identifier CONTEXT 18-01-17 12:37:20
SCRIPT      = [/u01/app/oracle/product/12.2.0/dbhome_1/ctx/admin/ctxposup.sql] <===== STATEMENT 1
ERROR      = [ORA-01403: no data found ORA-06512: at line 28
ORA-06512: at line 8
]
STATEMENT = [declare
  l_owner#  number;
  l_errnum  number;
  l_count   number;

begin
  select user# into l_owner# from sys."_BASE_USER" where name='CTXSYS';
  select idx_id into l_count from dr$index where        <===== STATEMENT 2
    idx_name = 'ENT_EXT_DICT_OBJ' and          
    idx_owner# = l_owner#;                       

  /* if this object does not exist yet, create it. */  
  if (0 = l_count) then
    CTX_DDL.create_policy('CTXSYS.ENT_EXT_DICT_OBJ',    <===== STATEMENT 3
      filter        => 'CTXSYS.NULL_FILTER',
      section_group => 'CTXSYS.NULL_SECTION_GROUP',
      lexer        => 'CTXSYS.BASIC_LEXER',
      stoplist        => 'CTXSYS.EMPTY_STOPLIST',
      wordlist        => 'CTXSYS.BASIC_WORDLIST'
  );
  end if;
exception
  when others then
    l_errnum := SQLCODE;
    if (l_errnum = -20000) then
      null;
    else
      raise;
    end if;
end;]

Script ctxposup.sql (STATEMENT 1) which upgrades Oracle Text raised ORA-01403 while running PL/SQL block listed in the above. Testing individual statements in the block as following:

sys@DB01> select user#  from sys."_BASE_USER" where name='CTXSYS';

     USER#
----------
       100

sys@DB01> ALTER SESSION SET CURRENT_SCHEMA = CTXSYS;

Session altered.

sys@DB01> select idx_id from dr$index where
  2    idx_name = 'ENT_EXT_DICT_OBJ' and
  3    idx_owner# = 100;

no rows selected

The object CTXSYS.ENT_EXT_DICT_OBJ (a policy of Oracle Text) does not exist in the database. Therefore, STATEMENT 2 (select ... into ... from) returned ORA-01403. Apparently, The purpose of STATEMENT 2 is to test whether CTXSYS.ENT_EXT_DICT_OBJ exists, but listing records is not a better choice for exist-testing than counting the number of records. Let's try to replace STATEMENT 2 in script filectxposup.sql with following code

 select count(idx_id) into l_count from dr$index where

Start the upgrade again, it succeeded!

Wednesday, June 10, 2015

Failed to retrieve the password file location used by ASM asm

After Oracle Restart (Grid Infrastructure standalone) is upgraded from 11g to 12c, srvctl get errors:
$ srvctl config asm
ASM home: 
PRCA-1057 : Failed to retrieve the password file location used by ASM asm
PRCR-1097 : Resource attribute not found: PWFILE

Check environment and password file:
$ echo $ORACLE_HOME
/u01/app/grid/product/12.1.0/grid

$ ls -l /u01/app/grid/product/12.1.0/grid/dbs
total 28
-rw-rw---- 1 grid oinstall 2783 Jun  9 14:27 ab_+ASM.dat
-rw-rw---- 1 grid oinstall 1544 Jun  9 14:27 hc_+ASM.dat
-rw-r--r-- 1 grid oinstall 2992 Feb  3  2012 init.ora
-rw-r----- 1 grid oinstall 8704 Jun  9 14:10 orapw+ASM
-rw-r----- 1 grid oinstall 1536 Jun  9 14:27 spfile+ASM.ora

The password file was created and ORACLE_HOME is set correctly. Checked with DBA, the upgrade was done successfully. There was no errors/exceptions during upgrade.
Oracle Support explains that:
12c ASM instance has a attribute called PWFILE(password file), but 11g ASM does not. Looks like the upgrade utility does not create the new attribute for an ASM instance which is upgraded from 11g.
Try to manually re-create the ora.asm resource with -pwfile option:
$ crsctl status resource ora.asm
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=ONLINE on host01

$ srvctl stop asm
PRCR-1065 : Failed to stop resource ora.asm
CRS-2529: Unable to act on 'ora.asm' because that would require stopping or relocating 'ora.DATA.dg', but the force option was not specified
$ srvctl stop asm -f
$ 
$ srvctl config asm
PRCR-1001 : Resource ora.asm does not exist
$ 
$ srvctl add asm -listener LISTENER -spfile /u01/app/grid/product/12.1.0/grid/dbs/spfile+ASM.ora -pwfile /u01/app/grid/product/12.1.0/grid/dbs/orapw+ASM

Test if it is fixed:
$ srvctl sconfig asm
ASM home: 
Password file: /u01/app/grid/product/12.1.0/grid/dbs/orapw+ASM
ASM listener: LISTENER
Spfile: /u01/app/grid/product/12.1.0/grid/dbs/spfile+ASM.ora
ASM diskgroup discovery string: ++no-value-at-resource-creation--never-updated-through-ASM++

There is no error any more.

Friday, January 9, 2015

Relink Oracle Grid Infrastructure Standalone (Oracle Restart) home and Database home after OS is upgraded from AIX 6.1 to 7.0

Oracle Database 11.2.0.3
Operating System AIX 7.0
The server system is upgraded from AIX 6.1 to 7.0, generally IBM guarantees operating system binary compatibility, therefore, no relink of the Oracle
software is required after the upgrade. However, Oracle recommends performing manual relinking of Oracle Home binaries after OS upgrade.