Sunday, April 25, 2021

OEM 13c EMGC_ADMINSERVER & EMGC_OMS1 target status shows DOWN after emctl secure wls with custom certificate

 When a custom certificate is configured for OEM 13c, the WebLogic Servers installed as part of Enterprise Manager Cloud Control (Administration Server and Managed Servers) can be secured with the custom certificate using the following command,

   $OMS_HOME/bin/emctl secure wls

However, the WebLogic Servers and their deployments could show as down in the OEM console after being secured with the custom certificate, even though they are still running well.

The reason is that the CA that issued the custom certificate for OMS is not "well known"; at least, Oracle does not accept it as a default trusted CA. When the agent running on the OMS server communicates with the WebLogic Servers (WLS), WLS presents the custom certificate as its identity, but the agent cannot find the trusted certificates of the issuing CAs in its local keystore. Therefore, the agent cannot validate the WLS certificate and stops communicating with WLS.

The quick fix is to import the certificate of each CA involved in issuing the certificate into the agent's local keystore with the following command,

    $AGENT_HOME/bin/emctl secure add_trust_cert_to_jks -trust_certs_loc <ca_certificate_file> -alias <certificate_alias> [-password <keystore_pwd>]

Here, <certificate_alias> identifies the certificate saved in the keystore and must be unique for each certificate. <keystore_pwd> is the password of the keystore; the default value is welcome.

For example, I have installed a CA in my lab network, and the CA issued a certificate to my OMS server. Both of my CA certificates (root certificate & intermediate certificate) have to be imported into the agent keystore as follows,

 $AGENT_HOME/bin/emctl stop agent

 $AGENT_HOME/bin/emctl secure add_trust_cert_to_jks -password welcome -alias dbaplus-root -trust_certs_loc /home/oracle/Root_CA_Certificate.txt

 $AGENT_HOME/bin/emctl secure add_trust_cert_to_jks -password welcome -alias dbaplus-intermediate -trust_certs_loc /home/oracle/Intermediate_CA_Certificate.txt

 $AGENT_HOME/bin/emctl start agent
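
When the chain contains more CA certificates, the import commands can be generated in a loop. The following is only a minimal sketch, not part of the original procedure: the function name print_import_cmds and the rule of deriving the alias from the file name are assumptions made for illustration. It prints the commands instead of running them, so they can be reviewed first and then run between the agent stop and start shown above.

```shell
#!/bin/sh
# Print (do not run) one 'emctl secure add_trust_cert_to_jks' command
# per CA certificate file.
# Assumption: the alias is the file name without directory and extension.
print_import_cmds() {
    password=$1
    shift
    for cert in "$@"; do
        alias_name=$(basename "$cert")
        alias_name=${alias_name%.*}
        # \$AGENT_HOME is escaped so it is printed literally, not expanded here.
        printf '%s\n' "\$AGENT_HOME/bin/emctl secure add_trust_cert_to_jks -password $password -alias $alias_name -trust_certs_loc $cert"
    done
}

print_import_cmds welcome \
    /home/oracle/Root_CA_Certificate.txt \
    /home/oracle/Intermediate_CA_Certificate.txt
```

Each alias produced this way is unique as long as the certificate file names differ.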

List a certificate imported into the agent monitoring keystore,

 $AGENT_HOME/jdk/bin/keytool -list -alias <certificate_alias> -keystore   $AGENT_INSTANCE_HOME/sysman/config/montrust/AgentTrust.jks -storepass welcome -v

If needed, a certificate can be removed from the keystore as follows,

 $AGENT_HOME/jdk/bin/keytool -delete -alias <certificate_alias> -keystore   $AGENT_INSTANCE_HOME/sysman/config/montrust/AgentTrust.jks -storepass welcome -v

Saturday, April 24, 2021

OEM 13c Target "EM Jobs Service" shown as down in EM Console while all associated targets are up

"EM Jobs Service" target status shows as down in the Enterprise Manager Cloud Control (EM) console even though all associated targets are up and running. It could be an issue with the metric collection definition. It is usually seen after a blackout of the associated targets.

The status of EM Jobs Service is an aggregated target status, calculated from the status of the associated targets. The associated targets and the calculation logic are defined by default when the system is installed, and you can change them later.

The issue can be fixed by changing and then restoring the Availability Definition of the service as follows,

1. In EM Console navigate to the following menu

     Targets > Services > Click on "EM Jobs Service" target

2. In "EM Jobs Service" home page, click on the tab "Monitoring Configuration" and then click on the link "Availability Definition"

3. Take a screenshot of the "Availability Definition" configuration, change the definition to a different option and click OK to save it.

    For instance, if the Availability Definition is "All key components are up" (default definition), change it to "At least one key component is up" and save the change.

4. Now revert the "Availability Definition" of the service back to the original configuration by following the same procedure.

    For instance, change and save "Availability Definition" to "All key components are up"

The target status then shows as up, since all components are up.

Wednesday, April 14, 2021

OEM 12c/13c Agent Deployment fails with "Remote Validations: Shell Path Validation Failed"

When deploying an agent on OEM 12c/13c using the 'Add Host Targets' wizard, the deployment fails with

Remote Validations:  Shell Path validation failed

Cause:  Shell path is incorrect or not defined.:/bin/bash(SH_PATH),-c(SH_ARGS) on host <host name> 

Recommendation:  Check the property values in the following files in this order, ssPaths_<plat>.properties or sPaths.properties or Paths.properties, in "/u01/app/oracle/em13.4/middleware/oui/prov/resources" directory. If the property values are correct, then ensure the login user account is enabled for remote logins.For more details, refer to the Oracle Enterprise Manager Basic Installation Guide.

The most common reason is one of the following

1. Shell (sh, bash & ksh) location is different from OEM defined location
The OEM defined shell location can be found in the file 'ssPaths_<platform>.properties' under directory '$OMS_HOME/oui/prov/resources'. For example, if the error happens when deploying an agent to an AIX host, display the content of file 'ssPaths_aix.properties', which looks like the following
SH_PATH=/bin/bash
SH_ARGS=-c
SHELL_PATH=/bin/bash
SHELL_ARGS=-c
KSH_PATH=/usr/bin/ksh
RMDIR_ARGS=
#the date should be in the format of year:month:date:hour:minute:second
DATE_ARGS=-u +%y:%m:%d:%H:%M:%S
PING_PATH=/usr/sbin/ping
SSH_KEYGEN_PATH=/usr/bin/ssh-keygen
TAR_EXCLUDE_ARGS=X
TAR_INCLUDE_ARGS=-I
DF_COL_NAME=avail
SSH_HOST_KEY_LOC=/etc/ssh

On the host where the agent is going to be installed, check that the executables/shells exist and are located at the same place as in the OEM file 'ssPaths_<platform>.properties'. In the example file above, the executables/shells are

/bin/bash
/usr/bin/ksh
/usr/bin/ssh-keygen

If one of them does not exist, you have to install it. If it exists but is located in a different directory, edit the OEM file and replace the shell/executable path with the path where the shell/executable actually is.
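
The check above can be scripted. The following is a minimal sketch under stated assumptions: the function name check_paths is hypothetical, and it assumes a copy of the properties file has been placed on the target host, because the test for each executable has to run there. It reports every entry whose key ends in _PATH.

```shell
#!/bin/sh
# Report which *_PATH entries of an ssPaths-style properties file
# point to an executable on the current host.
check_paths() {
    props_file=$1
    # Keep only KEY=VALUE lines whose key ends in _PATH; comments and
    # *_ARGS entries are skipped by the pattern.
    grep -E '^[A-Z_]+_PATH=' "$props_file" | while IFS='=' read -r key value; do
        if [ -x "$value" ]; then
            echo "OK      $key=$value"
        else
            echo "MISSING $key=$value"
        fi
    done
}
```

For example, run check_paths ssPaths_aix.properties on the AIX host. Any line reported as MISSING is a candidate for either installing the executable or correcting the path in the OEM file.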

2. Incorrect user name or password configured in the Named Credential used to deploy the agent

The error can also happen if an incorrect user name or password is used. If you do not have the password of the user defined in the Named Credential, the issue can be confirmed by checking the following log file on the OMS server,
  
$OMS_INSTANCE_BASE/em/EMGC_OMS1/sysman/agentpush/<timestamp>/applogs/<host_name>_deploy.log

For example, the failed deployment log is

  /u01/app/oracle/em13.4/gc_inst/em/EMGC_OMS1/sysman/agentpush/2021-04-13_12-58-49-PM/applogs/host01.lab.dbaplus.ca_deploy.log

And the following messages are found in the log
2021-04-13_12-59-55:INFO:===VALIDATION===:Checking SH_PATH on target nodes
2021-04-13_12-59-55:INFO:isWrongShPath:remotePathPropertiesLoc:/u01/app/oracle/em13.4/middleware/oui/prov/resources Platform id:212
2021-04-13_12-59-55:INFO:NODES=host01.lab.dbaplus.ca
2021-04-13_12-59-55:INFO:Running cmd /bin/bash -c /bin/true on node host01.lab.dbaplus.ca
2021-04-13_12-59-55:INFO:Action description Execution of command /bin/bash -c /bin/true  on host host01.lab.dbaplus.ca
2021-04-13_12-59-55:INFO:Attempt :1 pty required false  with no inputs
2021-04-13_12-59-56:INFO:/bin/bash -c /bin/true execution failed on host host01.lab.dbaplus.ca
2021-04-13_12-59-56:INFO: OUT null
2021-04-13_12-59-56:INFO: ERR WARNING: Your password has expired.
Password change required but no TTY available.

We can see that the password has expired. Ask the system administrator to reset the password, and then update the password in the Named Credential.

The easiest way to rule out a user name or password issue is to ask the system administrator to test the login manually, outside of OEM.
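
To pull the remote error text out of a long deploy log, something like the following can help. It is only a sketch: the function name show_remote_errors is hypothetical, and it assumes, based on the excerpt above, that error output is logged on a line containing ":INFO: ERR " and that continuation lines carry no timestamp prefix.

```shell
#!/bin/sh
# Print each ":INFO: ERR " entry of an agentpush deploy log together
# with its continuation lines (lines that do not start with a timestamp).
show_remote_errors() {
    awk '
        /:INFO: ERR / { print; keep = 1; next }
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_/ { keep = 0 }
        keep { print }
    ' "$1"
}
```

Pass the path of the <host_name>_deploy.log file as the only argument.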

Sunday, April 11, 2021

OEM 13c Discovering WebLogic Domain failed to save Node Manager target with error 'This target requires a local Management Agent'

When discovering or refreshing a WebLogic Domain or Fusion Middleware Farm in Enterprise Manager (EM) 13.4 Cloud Control, the Node Manager target is not saved. The error is shown in EM:

Failed to save NM_xxx_x(Oracle WebLogic Node Manager) on host <IP/host name>. This target requires a local Management Agent, but a local Management Agent was not found.  In order to add this target, you need to install a Management Agent on the same host as the target and then perform a "Refresh WebLogic Domain" operation.

The agent has been installed on the host. The error happens because the Listen Address in the Node Manager configuration differs from the host name in the EM Agent URL. As a solution, the Listen Address of Node Manager should be changed to the host name of the EM Agent URL.

Oracle explains it as an incorrect configuration of Oracle WebLogic Node Manager. Therefore, it could happen on all releases of EM 13c. However, I could only reproduce the problem in EM 13.1 and 13.4, when the Listen Address of WebLogic Node Manager is configured with an IP address instead of the host name used by the EM Agent URL; there is no problem with EM 13.2. Anyway, having both configurations use the same host name is not a bad idea.

Find out the host name of the EM Agent URL with the command <AGENT_HOME>/bin/emctl status agent
$ /u01/app/oracle/em13.4/agent/agent_13.4.0.0.0/bin/emctl status agent
Oracle Enterprise Manager Cloud Control 13c Release 4
Copyright (c) 1996, 2020 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version          : 13.4.0.0.0
OMS Version            : 13.4.0.0.0
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/em13.4/agent/agent_inst
Agent Log Directory    : /u01/app/oracle/em13.4/agent/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/em13.4/agent/agent_13.4.0.0.0
Core JAR Location      : /u01/app/oracle/em13.4/agent/agent_13.4.0.0.0/jlib
Agent Process ID       : 76282
Parent Process ID      : 76240
Agent URL              : https://host01.lab.dbaplus.ca:3872/emd/main/
Local Agent URL in NAT : https://host01.lab.dbaplus.ca:3872/emd/main/
Repository URL         : https://oms.lab.dbaplus.ca:4903/empbs/upload
Started at             : 2021-04-07 17:53:56
Started by user        : oracle
Operating System       : Linux version 4.1.12-124.46.4.1.el7uek.x86_64 (amd64)
...
---------------------------------------------------------------
Agent is Running and Ready
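
If only the host name is needed, it can be extracted from the "Agent URL" line of that output. This is a minimal sketch; the function name agent_url_host is hypothetical, and it assumes the line format shown in the output above.

```shell
#!/bin/sh
# Read 'emctl status agent' output on stdin and print the host name
# part of the "Agent URL" line. The "Local Agent URL in NAT" line is
# ignored because the pattern is anchored at the start of the line.
agent_url_host() {
    sed -n 's|^Agent URL *: *https\{0,1\}://\([^:/]*\).*|\1|p'
}
```

For example, pipe the status output into it: <AGENT_HOME>/bin/emctl status agent | agent_url_host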

Change the Listen Address of Node Manager to the host name of the EM Agent URL in the WebLogic Admin Console as follows,
1. Go to the Node Manager configuration page

   Environment > Machines > [Machine Name] > Configuration > Node Manager

2. Click 'Lock & Edit' to enable edit mode

3. Set the value of the "Listen Address" property to the host name given by the previous command 'emctl status agent'

4. Click 'Save', then click 'Activate Changes'

Refresh or rediscover the domain, and the Node Manager will be discovered successfully.