Tuesday, May 30, 2023

Script to check workflow mailer service components status

Script to check workflow mailer service components status and send notification if DOWN

 

Pre-requisites to Run the script:

 

Make sure mailx and sendmail is installed on your OS, also check below environements/file is created prior to execute the scripts.

 

This script is tested on Linux Server.

 

This script needs to be deployed in database node/server.

 

#Base location for the DBA scripts

DBA_SCRIPTS_HOME=$HOME/DBA_MON

#OS User profile where database environment file is set

$HOME/.bash_profile

 

 

Add below lines in env

 

Vi .sysenv

 

export DBA_SCRIPTS_HOME=$HOME/DBA_MON

export PATH=${PATH}:$DBA_SCRIPTS_HOME

export DBA_EMAIL_LIST=user @domain.com,

#Below parameter is used in script, whenever there is planned downtime you can set it to Y so there will be no false alert.

export DOWNTIME_MODE=N

 

 

vi WorkflowStatusCheck.sh

 

 

#!/bin/bash

 

###################################################################################

# Script Name :  WorkflowStatusCheck.sh                                           #

#                                                                                 #

# Description:                                                                    #

# Script to check workflow mailer service components status and                   #

# to send alert notification if one or more workflow mailer components are down   #

#                                                                                 #

# Usage : sh <script_name> <ORACLE_SID>                                           #

# For example : sh WorkflowStatusCheck.sh ORCL                                    #

#                                                                                 #

#

###################################################################################

 

# Initialize variables

 

INSTANCE=$1

HOST_NAME=`hostname`

PROGRAM=`basename $0 | cut -d'.' -f1`

export DBA_SCRIPTS_HOME=$HOME/DBA_MON

APPS_ID=`echo $INSTANCE | tr '[:lower:]' '[:upper:]'`

LOG_DIR=$DBA_SCRIPTS_HOME/logs/$HOST_NAME

OUT_FILE=$LOG_DIR/`echo $PROGRAM`_$APPS_ID.html.out

LOG_FILE=$LOG_DIR/`echo $PROGRAM`_$APPS_ID.log

ERR_FILE=$LOG_DIR/`echo $PROGRAM`_$APPS_ID.err

LOG_DATE=`date`

 

 

# Source the env

. $HOME/.bash_profile

. $DBA_SCRIPTS_HOME/.sysenv

 

if [ $? -ne 0 ]; then

   echo "$LOG_DATE" > $LOG_FILE 

   echo "Please pass correct environment : exiting the script  \n" >> $LOG_FILE

   cat $LOG_FILE

   exit

fi

 

if [ -s $OUT_FILE ]; then

 echo "$LOG_DATE" > $LOG_FILE

 echo "Deleting existing output file $OUT_FILE" >> $LOG_FILE

 rm -f $OUT_FILE

 cat $LOG_FILE

fi

 

# If there is a planned downtime then create $ORACLE_SID.down file in $DBA_SCRIPTS_HOME to silent the alerts during maintenance window.

 

if [ -f $DBA_SCRIPTS_HOME/`echo $ORACLE_SID`.down ]; then

 echo "$LOG_DATE" >> $LOG_FILE

        echo "Host: $HOST_NAME | Instance: $ORACLE_SID is under maintenance: exiting the script" >> $LOG_FILE

        cat $LOG_FILE

 exit

fi

 

if [ $DOWNTIME_MODE = "Y" ]; then

 echo "$LOG_DATE" >> $LOG_FILE

 echo "Host: $HOST_NAME | Instance: $ORACLE_SID is under maintenance: exiting the script" >> $LOG_FILE

 cat $LOG_FILE

 exit

fi

 

usage()

{

 echo "$LOG_DATE" > $LOG_FILE

    echo "Script To Check Workflow Mailer Components Status"  >> $LOG_FILE

    echo "Usage   : sh <script_name> <ORACLE_SID> " >> $LOG_FILE

 echo "For example : sh $PROGRAM.sh $ORACLE_SID" >> $LOG_FILE

    echo

}

 

if [ $# -lt 1 ] || [ "$INSTANCE" != "$ORACLE_SID" ]; then

    usage

    echo "Error : Insufficient arguments." >> $LOG_FILE

    cat $LOG_FILE

    exit

fi

 

get_count()

{

 sqlplus -s '/as sysdba' <<!

 set heading off

 set feedback off

 select count(1) from

 (

  select component_id, startup_mode,component_status,component_type,component_name

  from applsys.fnd_svc_components

  where component_status <> 'RUNNING'

  order by component_id

 );

 

 exit;

!

}

 

count=`get_count`

#echo $count

 

echo "$LOG_DATE" > $ERR_FILE

get_count >> $ERR_FILE

ERR_COUNT=`grep "ORA-" $ERR_FILE |wc -l`

 

if [ $ERR_COUNT -gt 0 ]; then

 cat $ERR_FILE | mailx -s "<ERROR> Critical : $APPS_ID - Workflow Mailer Components are DOWN on $HOST_NAME " $DBA_EMAIL_LIST

 exit

fi

 

if [ $count -gt 0 ];

then

 sqlplus -s '/as sysdba' <<EOF

 

 SET ECHO OFF

 SET pagesize 1000

 set feedback off

 set lines 180

 SET MARKUP HTML ON SPOOL ON -

 HEAD '<title></title> -

 <style type="text/css"> -

 table { background: #eee; } -

 th { font:bold 10pt Arial,Helvetica,sans-serif; color:#b7ceec; background:#151b54; padding: 5px; align:center; } -

 td { font:10pt Arial,Helvetica,sans-serif; color:Black; background:#f7f7e7; padding: 5px; align:center; } -

 </style>' TABLE "border='1' align='left'" ENTMAP OFF

 

 spool $OUT_FILE

 

 

 PROMPT Hi Team,

 

 PROMPT

 PROMPT Logon on to OAM (Oracle Applications Manager) and make sure the following processes are running, if not restart them.

 PROMPT

 

 

 select component_id, startup_mode,component_status,component_type,component_name

 from applsys.fnd_svc_components

 where component_status <> 'RUNNING'

 order by component_id;

 

 SPOOL OFF

 SET MARKUP HTML OFF

 exit;

 

EOF

 

(

echo "To: $DBA_EMAIL_LIST"

echo "MIME-Version: 1.0"

echo "Content-Type: multipart/alternative; "

echo ' boundary="PAA08673.1018277622/server.xyz.com"'

echo "Subject: Critical : $APPS_ID - Workflow Mailer Components are DOWN on $HOST_NAME "

echo ""

echo "This is a MIME-encapsulated message"

echo ""

echo "--PAA08673.1018277622/server.xyz.com"

echo "Content-Type: text/html"

echo ""

cat $OUT_FILE

echo "--PAA08673.1018277622/server.xyz.com"

) | /usr/sbin/sendmail -t

 

echo "`date`" > $LOG_FILE

echo "Details sent through an email" >> $LOG_FILE

cat $LOG_FILE

 

else

    echo "`date`" > $OUT_FILE

 echo "Workflow Mailer Components are up and running fine" >> $OUT_FILE

fi

 

 

 

 

Logs and Out files will be generated under $DBA_SCRIPTS_HOME/logs/$HOST_NAME

So make sure to create logs/$HOST_NAME directory under $DBA_SCRIPTS_HOME before executing

the script.

Once the script is ready, then as per the requirement please schedule it in crontab/OEM.

 

Execute the Script as below:

Syntax :  sh <script_name> <ORACLE_SID>

 

 

$ cd $HOME/DBA_MON/bin

$ sh WorkflowStatusCheck.sh ORCL

 

This script will send alert notification if one or more workflow mailer components are down.

 

 

Sample Email Notification Output:





MRP Stuck and No RFS Process on Standby

Primary node No Longer Transmits Archive Log Files To the physical standby database, MRP Stuck and No RFS Process on Standby


Issue:


Recently I faced strange issue where I observed that one of the standby node was not showing the RFS process, checked in V$MANAGED_STANDBY

V$MANAGED_STANDBY  - showing only ARCH and MRP processes.


Archive logs not shipping from Primary to Standby Node and there are no errors observed in primary and standby alert logs.


Tried to copy Archive logs manually from Primary and Standby but MRP recovery process not applying the further archive logs on Standby.



Cause:


This caused due to outage on the standby server for the network activity.



Observations:


On Primary:


ARCH process stuck on Primary instance

SELECT PROCESS, PID, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;


 




DB Mode Showing UNKNOWN

select dest_id id,database_mode db_mode,recovery_mode,

protection_mode,standby_logfile_count "SRLs",

standby_logfile_active ACTIVE,

archived_seq#

from v$archive_dest_status

where dest_id =2;


 


On Standby:

No RFS Process showing when query below:

SELECT PROCESS, PID, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;


Solution:


1. On Primary:


- Set log transport state to DEFER status:

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;



2. On standby server:


- Shutdown Database Listener


- Cancel Managed Recovery

SQL> alter database recover managed standby database cancel;


- Shutdown the Standby Database

SQL> shutdown immediate




3. On the Primary


Kill the ARCx Processes and the Database will respawn them automatically immediately without harming it.


ps -ef | grep -i arc


kill -9 <ospid of ARC process> <another ospid of ARC process> ...




4. On standby server


- Startup Standby Database and resume Managed Recovery

SQL> startup mount;

SQL> alter database recover managed standby database disconnect from session;



- Start Database Listener



5. Set log transport state to ENABLE status:

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;



6. Monitor the alert logs at each site and ensure log shipping and

apply is occurring again.


Also, verify the DB Mode on Primary -

select dest_id id,database_mode db_mode,recovery_mode,

protection_mode,standby_logfile_count "SRLs",

standby_logfile_active ACTIVE,

archived_seq#

from v$archive_dest_status

where dest_id =2; 



For Information: In case after Terminating ARCH process on primary node,

if it doesn't restart automatically, you can do the following :


On Primary:

SQL> alter system set log_archive_max_processes=4;

(increase log_archive_max_processes accordingly)