
Showing posts with label RAC Troubleshooting. Show all posts

Sunday, May 30, 2010

Modifying the VIP in 10g Oracle Cluster Node

Hello,
I received a request to change the public, private, and VIP addresses in my two-node Oracle cluster. I was not sure whether this was possible or whether it would create other issues.


In a test instance I changed the VIP address on one node, and it is working fine. Next I want to change the public and private IP addresses using the Oracle Interface Configuration Tool (oifcfg); I will post about that later when I have some free time.


I would like to share the changes below with you. If you have any questions or suggestions, you can reach me any time. Thanks ...


How to change VIP - IP address in Oracle Cluster Node


There are a few steps to change the VIP address.


1. Check current configuration
E:\oracle\product\10.2.0\crs\BIN>srvctl config nodeapps -h
Usage: srvctl config nodeapps -n <node_name> [-a] [-g] [-o] [-s] [-l]
    -n <node>           Node name
    -a                  Display VIP configuration
    -g                  Display GSD configuration
    -s                  Display ONS daemon configuration
    -l                  Display listener configuration
    -h                  Print usage


E:\oracle\product\10.2.0\crs\BIN>srvctl config nodeapps -n babu-node1 -a
VIP exists.: /babu-node1-vip/192.168.200.34/255.255.255.0/public
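If you want to record the current VIP settings before changing them, the output line above can be split into its parts. This is only a minimal sketch in Python: the `VipConfig` type and `parse_vip_line` function are names I made up for illustration, and the pattern is based only on the single sample line shown in this post, so other releases may print a different format.

```python
import re
from typing import NamedTuple


class VipConfig(NamedTuple):
    """Fields of one 'VIP exists.' line (hypothetical helper type)."""
    name: str
    address: str
    netmask: str
    interface: str


# Assumption: the line looks like the sample in this post,
# "VIP exists.: /<vip-name>/<ip>/<netmask>/<interface>".
VIP_LINE = re.compile(r"^VIP exists\.:\s*/([^/]+)/([^/]+)/([^/]+)/(.+)$")


def parse_vip_line(line: str) -> VipConfig:
    """Split a 'VIP exists.' line into name, address, netmask, interface."""
    match = VIP_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized VIP line: {line!r}")
    return VipConfig(*match.groups())


if __name__ == "__main__":
    sample = "VIP exists.: /babu-node1-vip/192.168.200.34/255.255.255.0/public"
    print(parse_vip_line(sample))
```

Keeping the parsed values around makes it easy to compare the configuration before and after the change in step 4.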

2. Stop ASM, the database instance, and the node applications on node 1



E:\oracle\product\10.2.0\crs\BIN>srvctl stop asm -n babu-node1


E:\oracle\product\10.2.0\crs\BIN>srvctl stop instance -d devdb -i devdb1


E:\oracle\product\10.2.0\crs\BIN>srvctl stop nodeapps -n babu-node1


E:\oracle\product\10.2.0\crs\BIN>crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    OFFLINE   OFFLINE
ora....E1.lsnr application    OFFLINE   OFFLINE
ora....de1.gsd application    OFFLINE   OFFLINE
ora....de1.ons application    OFFLINE   OFFLINE
ora....de1.vip application    OFFLINE   OFFLINE
ora....SM2.asm application    ONLINE    ONLINE    babu-node2
ora....E2.lsnr application    ONLINE    ONLINE    babu-node2
ora....de2.gsd application    ONLINE    ONLINE    babu-node2
ora....de2.ons application    ONLINE    ONLINE    babu-node2
ora....de2.vip application    ONLINE    ONLINE    babu-node2
ora.devdb.db   application    ONLINE    ONLINE    babu-node2
ora....b1.inst application    OFFLINE   OFFLINE
ora....b2.inst application    ONLINE    ONLINE    babu-node2


3. Update the VIP address or VIP host name in the hosts file (/etc/hosts on Linux, C:\WINDOWS\system32\drivers\etc\hosts on Windows). Make this change on both nodes.
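Before going further it can help to confirm that the hosts file on each node really maps the VIP host name to the new address. The snippet below is a rough sketch in Python, not part of any Oracle tool: `lookup_in_hosts` is a hypothetical helper, and it only understands the plain "address name [alias ...]" layout with `#` comments.

```python
from typing import Optional


def lookup_in_hosts(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the first address that hosts_text maps to hostname, or None."""
    wanted = hostname.lower()
    for raw in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        if wanted in (name.lower() for name in names):
            return address
    return None


if __name__ == "__main__":
    # Sample content only; the host name and address are the ones used in this post.
    sample = """
    127.0.0.1        localhost
    192.168.200.36   babu-node1-vip   # new VIP
    """
    print(lookup_in_hosts(sample, "babu-node1-vip"))
```

On a real node you would pass in the text of the hosts file itself and run the same check on both nodes.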


4. Now set the new VIP address on node 1




E:\oracle\product\10.2.0\crs\BIN>srvctl config nodeapps -n babu-node1 -a
VIP exists.: /babu-node1-vip/192.168.200.34/255.255.255.0/public


E:\oracle\product\10.2.0\crs\BIN>srvctl modify nodeapps -n babu-node1 -A 192.168.200.36/255.255.255.0/"public"


E:\oracle\product\10.2.0\crs\BIN>srvctl config nodeapps -n babu-node1 -a
VIP exists.: /babu-node1-vip/192.168.200.36/255.255.255.0/public
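The VIP has to sit on the same subnet as the public interface, so it is worth checking the new address before passing it to `srvctl modify nodeapps`. Here is a small sketch using Python's standard `ipaddress` module. The function names are my own, and the `-A` value is built in the address/netmask/interface form used in the command above.

```python
import ipaddress


def same_subnet(old_ip: str, new_ip: str, netmask: str) -> bool:
    """True if new_ip falls inside the network defined by old_ip and netmask."""
    network = ipaddress.ip_network(f"{old_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(new_ip) in network


def nodeapps_vip_arg(new_ip: str, netmask: str, interface: str) -> str:
    """Build the address/netmask/interface string passed with -A."""
    ipaddress.ip_address(new_ip)  # raises ValueError on a malformed address
    return f"{new_ip}/{netmask}/{interface}"


if __name__ == "__main__":
    old_ip, new_ip, mask = "192.168.200.34", "192.168.200.36", "255.255.255.0"
    if same_subnet(old_ip, new_ip, mask):
        print(nodeapps_vip_arg(new_ip, mask, "public"))
    else:
        print("new VIP is not on the public subnet")
```

This only checks the arithmetic. It does not tell you whether the new address is already in use on the network.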


5. Confirm that the VIP was modified using the command above. If everything looks fine, start all services and check the new VIP address.


E:\oracle\product\10.2.0\crs\BIN>srvctl start nodeapps -n babu-node1


E:\oracle\product\10.2.0\crs\BIN>srvctl start asm -n babu-node1


E:\oracle\product\10.2.0\crs\BIN>srvctl start instance -d devdb -i devdb1


E:\oracle\product\10.2.0\crs\BIN>ipconfig


Windows IP Configuration




Ethernet adapter public:


   Connection-specific DNS Suffix  . :
   IP Address. . . . . . . . . . . . : 192.168.200.36
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   IP Address. . . . . . . . . . . . : 192.168.200.33
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 192.168.200.1


Ethernet adapter private:


   Connection-specific DNS Suffix  . :
   IP Address. . . . . . . . . . . . : 10.1.1.1
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :


E:\oracle\product\10.2.0\crs\BIN>crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    babu-node1
ora....E1.lsnr application    ONLINE    ONLINE    babu-node1
ora....de1.gsd application    ONLINE    ONLINE    babu-node1
ora....de1.ons application    ONLINE    ONLINE    babu-node1
ora....de1.vip application    ONLINE    ONLINE    babu-node1
ora....SM2.asm application    ONLINE    ONLINE    babu-node2
ora....E2.lsnr application    ONLINE    ONLINE    babu-node2
ora....de2.gsd application    ONLINE    ONLINE    babu-node2
ora....de2.ons application    ONLINE    ONLINE    babu-node2
ora....de2.vip application    ONLINE    ONLINE    babu-node2
ora.devdb.db   application    ONLINE    ONLINE    babu-node2
ora....b1.inst application    ONLINE    ONLINE    babu-node1
ora....b2.inst application    ONLINE    ONLINE    babu-node2
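With more resources than this, reading the `crs_stat -t` table by eye gets tedious. As a rough sketch, the table can be parsed and filtered for anything that is not ONLINE. The `CrsResource` type and both function names are hypothetical, and the parsing assumes the whitespace-separated layout shown above.

```python
from typing import List, NamedTuple, Optional


class CrsResource(NamedTuple):
    """One data row of 'crs_stat -t' output (hypothetical helper type)."""
    name: str
    rtype: str
    target: str
    state: str
    host: Optional[str]


def parse_crs_stat(text: str) -> List[CrsResource]:
    """Parse the tabular output; the header and dashed separator are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have at least name, type, target and state.
        if len(parts) < 4 or parts[0] == "Name":
            continue
        host = parts[4] if len(parts) > 4 else None
        rows.append(CrsResource(parts[0], parts[1], parts[2], parts[3], host))
    return rows


def not_online(rows: List[CrsResource]) -> List[str]:
    """Names of resources whose State column is anything other than ONLINE."""
    return [row.name for row in rows if row.state != "ONLINE"]


if __name__ == "__main__":
    sample = """\
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    OFFLINE   OFFLINE
ora....SM2.asm application    ONLINE    ONLINE    babu-node2
"""
    print(not_online(parse_crs_stat(sample)))
```

After step 5, an empty list from `not_online` would match the all-ONLINE output shown above.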

Sunday, May 2, 2010

Oracle Clusterware daemons Enable/Disable Status



Using the crsctl command we can enable or disable startup of the Oracle Clusterware daemons. Run the following command to enable startup of all Oracle Clusterware daemons:

crsctl enable crs

Run the following command to disable startup of all Oracle Clusterware daemons:

crsctl disable crs

We usually enable or disable startup of the Oracle Clusterware daemons using crsctl, but hardly anyone checks whether startup is already enabled or disabled.

I just found that we can check this: the startup status of the Oracle Clusterware daemons is stored in the scls_scr directory under /etc/oracle.

Example:

Enable:


[root@linux1 ~]# cd /u01/app/oracle/product/10.2.0/crs/bin


[root@linux1 bin]# ./crsctl enable crs
[root@linux1 bin]# cat /etc/oracle/scls_scr/linux1/root/crsstart
enable


After running the crsctl enable crs command, the crsstart file contains "enable".

Disable:


[root@linux1 bin]# ./crsctl disable crs


[root@linux1 bin]# cat /etc/oracle/scls_scr/linux1/root/crsstart


disable

After running the crsctl disable crs command, the crsstart file contains "disable". Have a great day checking your Oracle Clusterware daemon status :-) ..
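The same check can be scripted. The following is a minimal sketch in Python that reads the crsstart file shown above. The function name is my own, and the path layout (`/etc/oracle/scls_scr/<node>/root/crsstart`) is taken from the Linux example in this post, so it may differ on other platforms or releases.

```python
from pathlib import Path


def crs_autostart_status(node: str, base: str = "/etc/oracle/scls_scr") -> str:
    """Return the content of the node's crsstart file, e.g. 'enable' or 'disable'."""
    path = Path(base) / node / "root" / "crsstart"
    return path.read_text().strip()


if __name__ == "__main__":
    # "linux1" is the node name used in the example above.
    try:
        print(crs_autostart_status("linux1"))
    except OSError as exc:
        print(f"could not read crsstart: {exc}")
```

Reading the file normally needs root access, just like the cat commands above.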

Saturday, April 17, 2010

Terminating clsd session

Memorial Day and right now I am so busy at work that I haven’t had time to study. I do have a post half way complete on RAC Installation but I can finish that this weekend. So what can I blog about in the meantime?


Well, today I completed an Oracle RAC installation on Windows 2003 SP2. Next month I have a Solaris RAC environment to set up. I did 3 Linux ones last month (all Red Hat). What I will do is post through the install, covering the hiccups I ran into as well as the gotchas with each system.


So here we are at 1:20 pm, and I will be trying to get this done throughout the night. I am actually working on 3 posts in parallel: this one, VIP Modification on RAC Nodes, and a post on some more RMAN backup jobs. All good stuff. When I finish those I will try to get some more OCM 10g DBA posts up.

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2010-04-17 13:14:03.875: [ default][2120]Terminating clsd session

After a successful RAC installation on Windows, the VIP was not able to start on both nodes. As per Metalink note 5125957.8, I have done the CRS upgrade from 10.1 to 10.2.


Now I am trying to upgrade CRS from 10.2.0.1 to 10.2.0.4. Once it completes successfully I will post an update.


Thanks. Feel free to write your comments and feedback.

Tuesday, December 8, 2009

Oracle CRS failure. Rebooting for cluster integrity.


Today I faced an issue in my cluster: "Oracle CRS failure. Rebooting for cluster integrity". Because of this error, CRS was not able to start.

DB & CRS Version: 10.2.0.4
OS Version: Red Hat Linux 4 - 64 bit

Symptoms related to this issue as they were reported to Oracle Support have been identified as (but are not necessarily limited to):

- Cluster member reboots
- CLSOMON failing with status 13
- High CPU usage of ocssd.bin

Because of this, the node rebooted and CRS failed.

While troubleshooting this issue I found the following entries in the OS and CRS logs.

Operating System Log:


Dec 7 10:57:22 babuhost4 logger: Oracle clsomon failed with fatal status 137.
Dec 7 10:57:23 babuhost4 logger: Oracle CRS failure. Rebooting for cluster integrity.
Dec 7 11:02:19 babuhost4 syslogd 1.4.1: restart.
Dec 7 11:02:19 babuhost4 syslog: syslogd startup succeeded

Cluster Log:

[ CSSD]2009-12-08 11:22:35.200 [1262557536] >TRACE: clssnmRcfgMgrThread: Local Join
[ CSSD]2009-12-08 11:22:35.200 [1262557536] >WARNING: clssnmLocalJoinEvent: takeover aborted due to ALIVE node on Disk
[ CSSD]2009-12-08 11:22:35.885 [1136679264] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(3) wrtcnt(5213) LATS(86809174) Disk lastSeqNo(5213)

Operating System:

Linux babuhost4 2.6.9-67.ELsmp #1 SMP Wed Nov 7 13:56:44 EST 2007 x86_64 x86_64 x86_64 GNU/Linux

CRS Health Check Failed

Checking CRS health...

Check: Health of CRS
Node Name CRS OK?
------------------------------------ ------------------------
babuhost5 yes
babuhost4 unknown
babuhost3 yes
Result: CRS health check failed.


Refer to Document ID 731599.1. As per the document, this issue occurs in Oracle Server - Enterprise Edition versions 10.2.0.4 to 11.1.0.7.

It looks like the installed glibc package version is too low and needs to be upgraded to a higher version.

Oracle Enterprise Linux (OEL) / RHEL 4

* Problem exists with glibc-2.3.4-2.39
* Fixed in glibc-2.3.4-2.40 and above (only version -2.41 was actually released)

Current Version:

[root@babuhost4 log]# rpm -q glibc
glibc-2.3.4-2.39
glibc-2.3.4-2.39
[root@babuhost4 log]#
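To check several nodes quickly, the `rpm -q glibc` output can be compared against the first fixed build listed above. This is a simplified sketch in Python: the function names are hypothetical, and it compares purely numeric version-release strings like the ones shown here. It is not a full implementation of RPM's version comparison rules.

```python
import re
from typing import Tuple


def version_key(version_release: str) -> Tuple[int, ...]:
    """Turn '2.3.4-2.39' into (2, 3, 4, 2, 39) for simple numeric comparison."""
    return tuple(int(part) for part in re.split(r"[.-]", version_release))


def glibc_is_fixed(package: str, fixed: str = "2.3.4-2.40") -> bool:
    """True if an 'rpm -q glibc' line is at or above the fixed build.

    Assumption: the line looks like 'glibc-2.3.4-2.39', with numeric fields only.
    """
    prefix = "glibc-"
    if not package.startswith(prefix):
        raise ValueError(f"not a glibc package line: {package!r}")
    return version_key(package[len(prefix):]) >= version_key(fixed)


if __name__ == "__main__":
    for line in ("glibc-2.3.4-2.39", "glibc-2.3.4-2.41"):
        print(line, "fixed" if glibc_is_fixed(line) else "affected")
```

The default threshold of 2.3.4-2.40 comes from the list above and applies only to OEL / RHEL 4.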

Feel free to write your comments here...

Thanks

Monday, November 9, 2009

CRS installation Aborts With "The Operation Has Failed Unexpectedly"


Oracle Server Enterprise Edition: 10.2.0.1
OS Version : Windows 2003 SP2

When I tried to install Oracle CRS on Windows 2003, I got the error message below.

CRS installation Aborts With "The Operation Has Failed Unexpectedly"


nodeNames = baburac1-priv, baburac2
The operation has failed unexpectedly

SEVERE: The specified nodes are not clusterable
The operation has failed unexpectedly


Cause:

The private NIC is listed first in the network adapter binding order.

Solution:

Change the NIC order so that the public NIC comes first.

The steps for modifying this are:

  1. Open explorer
  2. Right click on my network places
  3. Select properties
  4. From Advanced menu, select Advanced settings
  5. With Adapters and bindings open, move public NIC to be first on list

After changing the NIC order it worked fine, and the cluster installation on Windows 2003 completed successfully.


Feel free to write your comments here… Thanks

Sunday, August 30, 2009

PRKN-1011 : Failed to retrieve value for "local_only" under registry key


During an Oracle RAC database installation, I got the error details below from the Oracle inventory log.

Applies to

Windows 2000 & 2003

envp[0]:path=C:\DOCUME~1\oracle\LOCALS~1\Temp\OraInstall2009-08-31_12-55-30AM\oui\lib\win32
Caught Cluster ExceptionPRKN-1011 : Failed to retrieve value for "local_only" under registry key "HKEY_LOCAL_MACHINE\Software\Oracle\Ocr" on node "BabuRAC1-priv", The system cannot read from the specified device.

[PRKN-1011 : Failed to retrieve value for "local_only" under registry key "HKEY_LOCAL_MACHINE\Software\Oracle\Ocr" on node "BabuRAC1-priv", The system cannot read from the specified device.
]

Action Plan:

This looks like an Oracle bug. For more details, refer to 737961.1.

Feel free to write your comments here...

Thanks

Saturday, July 25, 2009

ORA-07445: exception encountered: core dump [qmkmfreeUga()+27] [SIGSEGV] [Address

ORA-07445: exception encountered: core dump [qmkmfreeUga()+27] [SIGSEGV] [Address mapped to object] [0x4C] [] []

When upgrading from 10.2.0.1.0 to 10.2.0.4.0, a user may see this error when running catupgrd.sql

Cause
This is unpublished bug 6957077

Action:

The database instance needs to be shut down and restarted; then catupgrd.sql can be run and should complete successfully.

shutdown immediate

startup upgrade

Sunday, July 12, 2009

PRKH-1001 : HASContext Internal Error


Today, when I tried to add a new service in my RAC environment, I got the error message below.

[root@linux1 bin]# ./srvctl add service -d devdb -s devdb_taf -r "devdb1,devdb2" -P BASIC
PRKH-1001 : HASContext Internal Error
[root@linux1 bin]#

When I checked my RAC services, everything seemed to be running fine.

[root@linux1 bin]# ./olsnodes -n
linux1 1
linux2 2
linux3 3


[root@linux1 bin]# ./crs_stat -t | grep -i linux
ora.devdb.db application ONLINE ONLINE linux1
ora....b1.inst application ONLINE ONLINE linux1
ora....b2.inst application ONLINE ONLINE linux2
ora....SM1.asm application ONLINE ONLINE linux1
ora....X1.lsnr application ONLINE ONLINE linux1
ora.linux1.gsd application ONLINE ONLINE linux1
ora.linux1.ons application ONLINE ONLINE linux1
ora.linux1.vip application ONLINE ONLINE linux1
ora....SM2.asm application ONLINE ONLINE linux2
ora....X2.lsnr application ONLINE ONLINE linux2
ora.linux2.gsd application ONLINE ONLINE linux2
ora.linux2.ons application ONLINE ONLINE linux2
ora.linux2.vip application ONLINE ONLINE linux2
ora.linux3.gsd application ONLINE OFFLINE
ora.linux3.ons application ONLINE OFFLINE
ora.linux3.vip application ONLINE ONLINE linux1
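One detail in the output above is easy to miss: ora.linux3.vip is reported on linux1, and the gsd and ons resources for linux3 are OFFLINE. A short sketch like the one below can flag a VIP that is running away from its own node. It is only an illustration: `relocated_vips` is a hypothetical helper, it relies on the `ora.<node>.vip` naming visible in this output, and it skips names that `crs_stat -t` has truncated with dots.

```python
import re
from typing import List, Tuple

# Matches rows such as "ora.linux3.vip application ONLINE ONLINE linux1".
VIP_ROW = re.compile(r"^ora\.(\S+)\.vip\s+\S+\s+\S+\s+ONLINE\s+(\S+)\s*$")


def relocated_vips(crs_stat_text: str) -> List[Tuple[str, str]]:
    """Return (home_node, current_host) for each VIP online on another node."""
    found = []
    for line in crs_stat_text.splitlines():
        match = VIP_ROW.match(line.strip())
        if match is None:
            continue
        home, host = match.groups()
        if ".." in home:
            # Name was truncated by crs_stat -t, so the home node is unknown.
            continue
        if home != host:
            found.append((home, host))
    return found


if __name__ == "__main__":
    sample = """\
ora.linux1.vip application ONLINE ONLINE linux1
ora.linux2.vip application ONLINE ONLINE linux2
ora.linux3.gsd application ONLINE OFFLINE
ora.linux3.vip application ONLINE ONLINE linux1
"""
    print(relocated_vips(sample))
```

A non-empty result does not explain the PRKH-1001 error by itself, but it is worth knowing about before adding services.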

My Database & Operating System Version:

OS Version : Red Hat Linux 4 AS

Database Version: 10.2.0.1.0

This error is a bug in 10.2.0.1.0 (bug 4493093 - PRKH-1001 from "srvctl add nodeapps"), and it is fixed in 10.2.0.3.0 (check Metalink document 4493093.8).

You may need to upgrade your database version.

Please post your feedback & comments here. ..

Thanks