CRS-8501 CRS-2316 Cannot initialize GPnP, CLSGPNP_INIT_FAILED (GPnP facility initialization failed)

root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
root@dbhost040102:/u01/app/12.2.0.1/grid/bin#

The CRS alert log shows a GPnP initialization failure during CRS startup:

2022-11-12 06:06:41.472 [CLSECHO(1414)]CRS-10132: Oracle High Availability Service was restarted at least 10 times within the last 60 seconds. Stop auto-restarting Oracle High Availability Service.
2022-11-12 06:06:56.664 [GPNPD(1267)]CRS-2316: Cannot initialize GPnP, CLSGPNP_INIT_FAILED (GPnP facility initialization failed).
2022-11-12 06:06:56.673 [GPNPD(1267)]CRS-8501: Oracle Clusterware GPNPD process with operating system process ID 1267 is ending with return value 3
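To pull just these failure events out of a long alert log, a small filter along the following lines may help. This is a minimal sketch, assuming a POSIX-compatible awk and the timestamp-first line layout shown above; the function name gpnp_failure_events is made up for illustration and is not part of any Oracle tooling.

```shell
#!/bin/sh
# Hypothetical helper: reads CRS alert log lines on stdin and prints
# "<date> <time> <code>" for the three message codes seen in this incident.
gpnp_failure_events() {
    awk 'match($0, /CRS-(2316|8501|10132):/) {
             # RLENGTH - 1 drops the trailing colon from the matched code
             print $1, $2, substr($0, RSTART, RLENGTH - 1)
         }'
}
```

Feed it the alert log on standard input, for example: gpnp_failure_events < alert.log (the file name here is a placeholder for wherever the CRS alert log lives on your node).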

Download the troubleshooting script, extract it in /tmp, and run it on the problematic node as follows:

$ cd /tmp/
$ tar -zxvf startUpCheck_[OS_Platform].tar.gz
$ chmod +x startUpCheck_[OS_Platform].{sh,py}
This gives you two files, startUpCheck_[OS_Platform].sh and startUpCheck_[OS_Platform].py. Run the script as the "root" user:

# ./startUpCheck_[OS_Platform].sh -n <node list> -i <private/asm interface list>

where
-n => list of nodes in the cluster
-i => list of private/asm interfaces

root@dbhost040102:/tmp/gi_scrpt# ./startUpCheck_Solaris.sh -n dbhost020102,dbhost040102 -i eth1,eth2
Logfile location : /tmp/gi_scrpt/crsstartup_dbhost040102_2022-11-12.12:25:52.log

An error occurred while executing '/u01/app/12.2.0.1/grid/bin/gpnptool get' command. Refer log for details
Verifying if script is executed by root user ...PASSED
Verifying runlevel ...PASSED
Verifying if the environment is STANDALONE or RAC ...RAC
Verifying the provided node list against information in /u01/app/12.2.0.1/grid/crs/install/crsconfig_params ...PASSED
Verifying the provided private/asm interface list against information fetched from GPNP ...FAILED
        RESULT: Warning!!! Provided private interconnect details found incorrect... Proceeding with autodetected private interface details
Verifying GI Home details ...DONE
Verifying '/u01/app/12.2.0.1/grid/dbs' is owned by 'grid' user... PASSED
Verifying ownership and permissions on /u01/app/12.2.0.1/grid/bin/oracle ...PASSED
Verifying mount options for GI Home mount point ...PASSED
Verifying OLR integrity ...FAILED
Cause: OLR is corrupted

Status of Oracle Local Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :        896
         Available space (kbytes) :     408672
         ID                       :  158964337
         Device/File Name         : /u01/app/12.2.0.1/grid/cdata/dbhost040102.olr
                                    Device/File integrity check succeeded

         Local registry integrity check succeeded

         Logical corruption check failed

The script output above shows that the OLR is corrupted on the node where the script was run. The solution is to restore the OLR from the last successful backup, as follows.

Before restoring the OLR, use the following command to make sure the GI stack is completely down on the problematic node and ohasd.bin is not running.

root@dbhost040102:/tmp/gi_scrpt# ps -ef| grep ohasd.bin
    root 45843 31954   0 12:56:51 pts/8       0:00 grep ohasd.bin
root@dbhost040102:/tmp/gi_scrpt#
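In the output above, the only match is the grep command itself, so ohasd.bin is down. If you want to script this check, something like the sketch below could work. The function name ohasd_status is hypothetical, and it simply assumes that any line mentioning ohasd.bin, other than the grep command, means the daemon is still running.

```shell
#!/bin/sh
# Hypothetical helper: reads a 'ps -ef' listing on stdin and prints
# "running" if an ohasd.bin process is listed, "down" otherwise.
ohasd_status() {
    # Keep ohasd.bin lines, discard the grep process, test if anything is left
    if grep 'ohasd\.bin' | grep -v 'grep' | grep -q .; then
        echo "running"
    else
        echo "down"
    fi
}
```

For example: ps -ef | ohasd_status. Continue with the restore only when it prints "down".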

In case a few services are still up, use the following command to forcefully stop the CRS services.

root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./crsctl stop crs -f
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Stop failed, or completed with errors.
root@dbhost040102:/u01/app/12.2.0.1/grid/bin# 
Check the available OLR backups:
root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./ocrconfig -local -showbackup

dbhost040102     2020/06/16 00:39:48     /u01/app/12.2.0.1/grid/cdata/dbhost040102/autobackup_20200616_003948.olr     2808044450

dbhost040102     2020/06/15 00:39:45     /u01/app/12.2.0.1/grid/cdata/dbhost040102/autobackup_20200615_003945.olr     2808044450

dbhost040102     2019/10/22 02:50:19     /u01/app/12.2.0.1/grid/cdata/dbhost040102/backup_20191022_025019.olr     185980871

dbhost040102     2016/06/22 20:27:05     /u01/app/12.1.0.2/grid/cdata/dbhost040102/backup_20160622_202705.olr     3118584562
root@dbhost040102:/u01/app/12.2.0.1/grid/bin# 
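With a long backup list it can be handy to pick the newest entry automatically. The following is a rough sketch, assuming a POSIX-compatible awk and the four-column layout shown above (node, date, time, path, ID); latest_olr_backup is a made-up name. It only selects the entry with the latest timestamp and does not verify that the backup is usable or that it belongs to the current GI home.

```shell
#!/bin/sh
# Hypothetical helper: reads 'ocrconfig -local -showbackup' output on stdin
# and prints the path of the backup with the latest date and time.
latest_olr_backup() {
    awk 'NF >= 4 && $2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
             # YYYY/MM/DD HH:MM:SS sorts correctly as a plain string
             stamp = $2 " " $3
             if (stamp > newest) { newest = stamp; path = $4 }
         }
         END { if (path != "") print path }'
}
```

For example: ./ocrconfig -local -showbackup | latest_olr_backup. Review the printed path before passing it to the restore command.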
Use the last successful backup to restore the OLR.

For example:

root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./ocrconfig -local -restore /u01/app/12.2.0.1/grid/cdata/dbhost040102/autobackup_20200616_003948.olr

root@dbhost040102:/u01/app/12.2.0.1/grid/bin#

The OLR is restored. Now try to bring up CRS.

root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
root@dbhost040102:/u01/app/12.2.0.1/grid/bin#


root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
root@dbhost040102:/u01/app/12.2.0.1/grid/bin# ./crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
root@dbhost040102:/u01/app/12.2.0.1/grid/bin#
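The same check can be scripted, for instance while waiting for the stack to come up after the restore. This is a minimal sketch that assumes a POSIX-compatible awk and relies only on the "is online" wording visible in the output above; crs_check_summary is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: reads 'crsctl check crs' output on stdin and prints
# "online" only if every CRS-nnnn message line ends with "is online".
crs_check_summary() {
    awk '/^CRS-[0-9]+:/ {
             total++
             if ($0 !~ /is online$/) bad++
         }
         END {
             if (total > 0 && bad == 0) print "online"
             else print "not online"
         }'
}
```

For example: ./crsctl check crs | crs_check_summary. Empty input or a message such as CRS-4639 yields "not online".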

CRS services have started successfully. The CRS alert log clearly shows the clusterware processes starting.

CRS Alertlog:
2022-11-12 13:13:35.011 [OCSSD(74199)]CRS-1601: CSSD Reconfiguration complete. Active nodes are dbhost020102 dbhost040102 .
2022-11-12 13:13:37.229 [OCSSD(74199)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2022-11-12 13:13:37.678 [OCTSSD(76610)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 76610
2022-11-12 13:13:38.581 [OCTSSD(76610)]CRS-2403: The Cluster Time Synchronization Service on host dbhost040102 is in observer mode.
2022-11-12 13:13:40.067 [OCTSSD(76610)]CRS-2407: The new Cluster Time Synchronization Service reference node is host dbhost020102.
2022-11-12 13:13:40.068 [OCTSSD(76610)]CRS-2401: The Cluster Time Synchronization Service started on host dbhost040102.
2022-11-12 13:13:50.647 [OSYSMOND(77856)]CRS-8500: Oracle Clusterware OSYSMOND process is starting with operating system process ID 77856
2022-11-12 13:13:51.945 [CRSD(77979)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 77979
2022-11-12 13:14:02.883 [CRSD(77979)]CRS-1012: The OCR service started on node dbhost040102.
2022-11-12 13:14:05.479 [CRSD(77979)]CRS-1201: CRSD started on node dbhost040102.
2022-11-12 13:14:10.836 [ORAAGENT(80013)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 80013
2022-11-12 13:14:11.035 [ORAAGENT(80058)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 80058
2022-11-12 13:14:11.422 [ORAROOTAGENT(80102)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 80102
2022-11-12 13:15:22.874 [ORAAGENT(89757)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 89757