Wednesday, April 15, 2015

WebLogic Server Monitoring Using Command-Line Options

There may be a requirement to monitor the WebLogic admin and managed servers from cron jobs rather than manually watching the Admin Console to see whether they are running. A robust shell script can check the server status, or ping the server URLs to see whether they are active, and in case of discrepancies send an email to the admin/support group.

The various command line options which can be included in the script are:

1. First and foremost, we can check at the Unix level whether the server processes are running, using the command ps -ef | grep "weblogic".
This should ideally return three processes (assuming we have services running under one domain): Node Manager, admin server, and managed server (weblogic.NodeManager, -Dweblogic.Name=AdminServer, -Dweblogic.Name=soa_server1). The grep command itself may also show up in the listing and should be ignored.
So we can grep for each of these and check whether any is down or missing.
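As a rough sketch of how this check could look inside a script, the function below reads a ps -ef style listing on standard input and prints the marker of every expected process it cannot find. The function name missing_wls_processes is made up for illustration, and the server names are simply the ones used in this post.

```shell
# Print the marker of each expected WebLogic process that is absent from a
# "ps -ef" style listing read on standard input. Prints nothing if all are found.
# missing_wls_processes is a hypothetical name; adjust the markers to your domain.
missing_wls_processes() {
    listing=$(cat)
    for marker in "weblogic.NodeManager" \
                  "-Dweblogic.Name=AdminServer" \
                  "-Dweblogic.Name=soa_server1"; do
        # -F: match a fixed string, -e: allow a pattern that starts with "-",
        # -q: no output, only the exit status matters
        if ! printf '%s\n' "$listing" | grep -qF -e "$marker"; then
            printf '%s\n' "$marker"
        fi
    done
}
```

It could be called as ps -ef | missing_wls_processes; any output means something is down. Note that a fixed-string match on soa_server1 would also match a server named soa_server10, so tighten the markers if your naming needs it.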

2. Another way to check whether the WebLogic servers are running is to use the weblogic.Admin command-line utility. First set up the environment:

cd <MW_HOME>/wlserver_10.3/server/bin
. ./setWLSEnv.sh

NOTE: There are two dots before the script name, separated by a single space. The first dot sources the script so that the environment is set in the current shell; the second dot (in ./) tells the shell to pick up the script from the current directory.

Once the environment has been set, you can run the commands below (the line after each command is its expected output):
java weblogic.Admin -username weblogic -password welcome1 GETSTATE AdminServer
Current state of "AdminServer" : RUNNING
java weblogic.Admin -username weblogic -password welcome1 GETSTATE soa_server1
Current state of "soa_server1" : RUNNING
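In a script we would want to react to the state rather than read it by eye. The sketch below assumes the output format shown above and that setWLSEnv.sh has already been sourced; wls_state and check_state are hypothetical helper names, and the credentials are the sample ones from this post.

```shell
# Extract the state word from GETSTATE output on stdin, for example:
#   Current state of "AdminServer" : RUNNING   ->   RUNNING
wls_state() {
    sed -n 's/^Current state of ".*" : \([A-Z_]*\).*$/\1/p'
}

# check_state SERVER_NAME
# Prints one alert line unless the server reports RUNNING.
# Assumes "java weblogic.Admin" resolves (setWLSEnv.sh sourced beforehand).
check_state() {
    state=$(java weblogic.Admin -username weblogic -password welcome1 \
                GETSTATE "$1" 2>/dev/null | wls_state)
    [ "$state" = "RUNNING" ] || echo "$1 is not RUNNING (state: ${state:-unknown})"
}
```

Calling check_state AdminServer and check_state soa_server1 then produces output only when something needs attention, which is convenient for a cron job.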

3. We can also ping the admin and managed servers to make sure they are alive. A server process can be present but hung, so we can use the weblogic.Admin command-line utility to do a WebLogic PING.

java weblogic.Admin -url adminserverhost:port -username weblogic -password welcome1 PING 10

Sending 10 pings of 100 bytes.
RTT = ~1020 milliseconds, or ~102 milliseconds/packet
If the server is alive, the command returns round-trip statistics like the above and no further action is needed.
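A scripted version of the PING check might look like the sketch below. It treats the ping as successful only if an "RTT =" line appears in the output, which is an assumption based on the sample output above rather than on documented behaviour. It also wraps the call in timeout (from GNU coreutils, assumed to be installed) so that a hung server cannot block the cron job. The names ping_ok and check_ping are made up for illustration.

```shell
# Succeeds only if the PING output on stdin contains an "RTT =" line.
ping_ok() {
    grep -q 'RTT = '
}

# check_ping HOST:PORT
# Prints one alert line if the WebLogic PING fails or takes longer than 60 seconds.
# "timeout" is assumed to be available (GNU coreutils); the 60 s limit is arbitrary.
check_ping() {
    if ! timeout 60 java weblogic.Admin -url "$1" \
             -username weblogic -password welcome1 PING 10 2>&1 | ping_ok; then
        echo "WebLogic PING to $1 failed or timed out"
    fi
}
```

For example, check_ping adminserverhost:port stays silent when the server answers and prints a single line otherwise.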

4. There may be cases where the WebLogic servers go down because the FMW database is down (soainfra down) or because of heap space/out-of-memory errors. In those cases the shell script can check the log file <MW_HOME>/user_projects/domains/<domain_name>/servers/soa_server1/logs/soa_server1-diagnostic.log and see whether any of the error messages below are present.

ORA-12514, TNS:listener does not currently know of service requested in connect descriptor (appears when the FMW database is down)
java.lang.OutOfMemoryError: Java heap space (appears when the server's heap space is exhausted)
java.lang.OutOfMemoryError: GC overhead limit exceeded (also related to the server's heap space)
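The log check can be sketched with a single grep. The function name scan_wls_log is hypothetical, and the pattern covers only the three messages listed here, so extend it as you find other errors worth watching.

```shell
# scan_wls_log LOGFILE
# Prints every line (prefixed with its line number) that contains one of the
# known infrastructure errors. Exit status is 0 if anything matched, 1 if not.
scan_wls_log() {
    grep -n -E 'ORA-12514|java\.lang\.OutOfMemoryError: (Java heap space|GC overhead limit exceeded)' "$1"
}
```

Pass it the path of the diagnostic log. Keep in mind that this rescans the whole file on every run, so old errors will be reported again until the log rotates; remembering the last line number seen would be one way around that.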


NOTE: This is not a complete list, but a few of the errors that show up in log files due to WLS infrastructure issues. These errors will cause the BPEL processes/composites to fault and give unexpected results, so proactively monitoring for them helps in fixing them sooner.
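For the email part of such a script, one possible approach is to collect whatever the checks print and mail it only when there is something to report. This is a minimal sketch assuming a standard mail command that accepts -s for the subject; the recipient address, the variable names ALERT_TO and MAIL_CMD, and the function name send_alert are all placeholders.

```shell
# Hypothetical settings: recipient address and mail program (both overridable).
ALERT_TO="${ALERT_TO:-wls-support@example.com}"
MAIL_CMD="${MAIL_CMD:-mail}"

# send_alert SUBJECT
# Reads the report from stdin and mails it; sends nothing if the report is empty.
send_alert() {
    subject=$1
    body=$(cat)
    [ -n "$body" ] || return 0
    printf '%s\n' "$body" | "$MAIL_CMD" -s "$subject" "$ALERT_TO"
}
```

The output of all the checks in the script can then be piped into send_alert "WebLogic alert on $(hostname)", so the support group hears from cron only when a check has failed.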
