AMIR BLOG

Tuesday, May 30, 2023

Script to check workflow mailer service components status

Script to check workflow mailer service components status and send notification if DOWN

Pre-requisites to Run the script:

Make sure mailx and sendmail is installed on your OS, also check below environements/file is created prior to execute the scripts.

This script is tested on Linux Server.

This script needs to be deployed in database node/server.

#Base location for the DBA scripts

DBA_SCRIPTS_HOME=$HOME/DBA_MON

#OS User profile where database environment file is set

$HOME/.bash_profile

Add below lines in env

Vi .sysenv

export DBA_SCRIPTS_HOME=$HOME/DBA_MON

export PATH=${PATH}:$DBA_SCRIPTS_HOME

export DBA_EMAIL_LIST=user @domain.com,

#Below parameter is used in script, whenever there is planned downtime you can set it to Y so there will be no false alert.

export DOWNTIME_MODE=N

vi WorkflowStatusCheck.sh

#!/bin/bash

###################################################################################

# Script Name : WorkflowStatusCheck.sh #

# #

# Description: #

# Script to check workflow mailer service components status and #

# to send alert notification if one or more workflow mailer components are down #

# #

# Usage : sh <script_name> <ORACLE_SID> #

# For example : sh WorkflowStatusCheck.sh ORCL #

# #

###################################################################################

# Initialize variables

INSTANCE=$1

HOST_NAME=`hostname`

PROGRAM=`basename $0 | cut -d'.' -f1`

export DBA_SCRIPTS_HOME=$HOME/DBA_MON

APPS_ID=`echo $INSTANCE | tr '[:lower:]' '[:upper:]'`

LOG_DIR=$DBA_SCRIPTS_HOME/logs/$HOST_NAME

OUT_FILE=$LOG_DIR/`echo $PROGRAM`_$APPS_ID.html.out

LOG_FILE=$LOG_DIR/`echo $PROGRAM`_$APPS_ID.log

ERR_FILE=$LOG_DIR/`echo $PROGRAM`_$APPS_ID.err

LOG_DATE=`date`

# Source the env

. $HOME/.bash_profile

. $DBA_SCRIPTS_HOME/.sysenv

if [ $? -ne 0 ]; then

echo "$LOG_DATE" > $LOG_FILE

echo "Please pass correct environment : exiting the script \n" >> $LOG_FILE

cat $LOG_FILE

exit

if [ -s $OUT_FILE ]; then

echo "$LOG_DATE" > $LOG_FILE

echo "Deleting existing output file $OUT_FILE" >> $LOG_FILE

rm -f $OUT_FILE

cat $LOG_FILE

# If there is a planned downtime then create $ORACLE_SID.down file in $DBA_SCRIPTS_HOME to silent the alerts during maintenance window.

if [ -f $DBA_SCRIPTS_HOME/`echo $ORACLE_SID`.down ]; then

echo "$LOG_DATE" >> $LOG_FILE

echo "Host: $HOST_NAME | Instance: $ORACLE_SID is under maintenance: exiting the script" >> $LOG_FILE

cat $LOG_FILE

exit

if [ $DOWNTIME_MODE = "Y" ]; then

echo "$LOG_DATE" >> $LOG_FILE

echo "Host: $HOST_NAME | Instance: $ORACLE_SID is under maintenance: exiting the script" >> $LOG_FILE

cat $LOG_FILE

exit

usage()

{

echo "$LOG_DATE" > $LOG_FILE

echo "Script To Check Workflow Mailer Components Status" >> $LOG_FILE

echo "Usage : sh <script_name> <ORACLE_SID> " >> $LOG_FILE

echo "For example : sh $PROGRAM.sh $ORACLE_SID" >> $LOG_FILE

echo

}

if [ $# -lt 1 ] || [ "$INSTANCE" != "$ORACLE_SID" ]; then

usage

echo "Error : Insufficient arguments." >> $LOG_FILE

cat $LOG_FILE

exit

get_count()

{

sqlplus -s '/as sysdba' <<!

set heading off

set feedback off

select count(1) from

(

select component_id, startup_mode,component_status,component_type,component_name

from applsys.fnd_svc_components

where component_status <> 'RUNNING'

order by component_id

);

exit;

}

count=`get_count`

#echo $count

echo "$LOG_DATE" > $ERR_FILE

get_count >> $ERR_FILE

ERR_COUNT=`grep "ORA-" $ERR_FILE |wc -l`

if [ $ERR_COUNT -gt 0 ]; then

cat $ERR_FILE | mailx -s "<ERROR> Critical : $APPS_ID - Workflow Mailer Components are DOWN on $HOST_NAME " $DBA_EMAIL_LIST

exit

if [ $count -gt 0 ];

then

sqlplus -s '/as sysdba' <<EOF

SET ECHO OFF

SET pagesize 1000

set feedback off

set lines 180

SET MARKUP HTML ON SPOOL ON -

HEAD '<title></title> -

table { background: #eee; } -

th { font:bold 10pt Arial,Helvetica,sans-serif; color:#b7ceec; background:#151b54; padding: 5px; align:center; } -

td { font:10pt Arial,Helvetica,sans-serif; color:Black; background:#f7f7e7; padding: 5px; align:center; } -

</style>' TABLE "border='1' align='left'" ENTMAP OFF

spool $OUT_FILE

PROMPT Hi Team,

PROMPT

PROMPT Logon on to OAM (Oracle Applications Manager) and make sure the following processes are running, if not restart them.

PROMPT

select component_id, startup_mode,component_status,component_type,component_name

from applsys.fnd_svc_components

where component_status <> 'RUNNING'

order by component_id;

SPOOL OFF

SET MARKUP HTML OFF

exit;

EOF

(

echo "To: $DBA_EMAIL_LIST"

echo "MIME-Version: 1.0"

echo "Content-Type: multipart/alternative; "

echo ' boundary="PAA08673.1018277622/server.xyz.com"'

echo "Subject: Critical : $APPS_ID - Workflow Mailer Components are DOWN on $HOST_NAME "

echo ""

echo "This is a MIME-encapsulated message"

echo ""

echo "--PAA08673.1018277622/server.xyz.com"

echo "Content-Type: text/html"

echo ""

cat $OUT_FILE

echo "--PAA08673.1018277622/server.xyz.com"

) | /usr/sbin/sendmail -t

echo "`date`" > $LOG_FILE

echo "Details sent through an email" >> $LOG_FILE

cat $LOG_FILE

else

echo "`date`" > $OUT_FILE

echo "Workflow Mailer Components are up and running fine" >> $OUT_FILE

Logs and Out files will be generated under $DBA_SCRIPTS_HOME/logs/$HOST_NAME

So make sure to create logs/$HOST_NAME directory under $DBA_SCRIPTS_HOME before executing

the script.

Once the script is ready, then as per the requirement please schedule it in crontab/OEM.

Execute the Script as below:

Syntax : sh <script_name> <ORACLE_SID>

$ cd $HOME/DBA_MON/bin

$ sh WorkflowStatusCheck.sh ORCL

This script will send alert notification if one or more workflow mailer components are down.

Sample Email Notification Output:

MRP Stuck and No RFS Process on Standby

Primary node No Longer Transmits Archive Log Files To the physical standby database, MRP Stuck and No RFS Process on Standby

Issue:

Recently I faced strange issue where I observed that one of the standby node was not showing the RFS process, checked in V$MANAGED_STANDBY

V$MANAGED_STANDBY - showing only ARCH and MRP processes.

Archive logs not shipping from Primary to Standby Node and there are no errors observed in primary and standby alert logs.

Tried to copy Archive logs manually from Primary and Standby but MRP recovery process not applying the further archive logs on Standby.

Cause:

This caused due to outage on the standby server for the network activity.

Observations:

On Primary:

ARCH process stuck on Primary instance

SELECT PROCESS, PID, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;

DB Mode Showing UNKNOWN

select dest_id id,database_mode db_mode,recovery_mode,

protection_mode,standby_logfile_count "SRLs",

standby_logfile_active ACTIVE,

archived_seq#

from v$archive_dest_status

where dest_id =2;

On Standby:

No RFS Process showing when query below:

SELECT PROCESS, PID, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;

Solution:

1. On Primary:

- Set log transport state to DEFER status:

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;

2. On standby server:

- Shutdown Database Listener

- Cancel Managed Recovery

SQL> alter database recover managed standby database cancel;

- Shutdown the Standby Database

SQL> shutdown immediate

3. On the Primary

Kill the ARCx Processes and the Database will respawn them automatically immediately without harming it.

ps -ef | grep -i arc

kill -9 <ospid of ARC process> <another ospid of ARC process> ...

4. On standby server

- Startup Standby Database and resume Managed Recovery

SQL> startup mount;

SQL> alter database recover managed standby database disconnect from session;

- Start Database Listener

5. Set log transport state to ENABLE status:

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;

6. Monitor the alert logs at each site and ensure log shipping and

apply is occurring again.

Also, verify the DB Mode on Primary -

select dest_id id,database_mode db_mode,recovery_mode,

protection_mode,standby_logfile_count "SRLs",

standby_logfile_active ACTIVE,

archived_seq#

from v$archive_dest_status

where dest_id =2;

For Information: In case after Terminating ARCH process on primary node,

if it doesn't restart automatically, you can do the following :

On Primary:

SQL> alter system set log_archive_max_processes=4;

(increase log_archive_max_processes accordingly)

Labels

Tuesday, May 30, 2023

Script to check workflow mailer service components status

MRP Stuck and No RFS Process on Standby

Labels

About

Blog Archive

Pages