UNIX tricks and treats


Monday, March 22, 2010

OCR backup purge not working in Oracle 11gR2 clusters

In an Oracle 11gR2 cluster, the purge of the automatic OCR backups does not work (since release 10.2.0.4) if the *.ocr files located under $ORA_CRS_HOME/cdata/<cluster_name> (e.g. /u01/app/11.2.0/grid/cdata/inbdor0809-rac/) have the wrong ownership. As a result, the *.ocr files with random names are never renamed to backupXX.ocr and never purged, eventually filling up the partition.

ocrconfig -showbackup reports the expected backup date, but the backupXX.ocr files carry a much older timestamp.

A large number of XXXXXXXXX.ocr files accumulate under $ORA_CRS_HOME/cdata/<cluster_name>.


# ls -rtl $ORA_CRS_HOME/cdata/<cluster_name>

-rw------- 1 grid oinstall 7081984 jan 18 22:24 day.ocr
-rw------- 1 grid oinstall 7081984 jan 19 02:24 day_.ocr
-rw------- 1 grid oinstall 7081984 jan 19 06:24 backup02.ocr
-rw------- 1 grid oinstall 7081984 jan 19 10:24 backup01.ocr
-rw------- 1 grid oinstall 7081984 jan 19 14:24 backup00.ocr

...

-rw------- 1 root root 7413760 mar 19 22:06 40778839.ocr
-rw------- 1 root root 7413760 mar 20 02:06 26378630.ocr
-rw------- 1 root root 7413760 mar 20 06:06 11332652.ocr
-rw------- 1 root root 7413760 mar 20 10:06 35215677.ocr
-rw------- 1 root root 7413760 mar 20 14:06 41977816.ocr
-rw------- 1 root root 7413760 mar 20 18:06 18335174.ocr
-rw------- 1 root root 7413760 mar 20 22:06 90743999.ocr
-rw------- 1 root root 7413760 mar 21 02:06 20182690.ocr
-rw------- 1 root root 7413760 mar 21 06:06 28125568.ocr
-rw------- 1 root root 7413760 mar 21 10:06 20121708.ocr
-rw------- 1 root root 7413760 mar 21 14:06 34916120.ocr
-rw------- 1 root root 7413760 mar 21 18:06 24068304.ocr



Solution:

chown root:root $ORA_CRS_HOME/cdata/<cluster_name>/*.ocr

The XXXXXXXX.ocr files can then be deleted.
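
To confirm that the rotation works again, check the backup list against the directory contents (a minimal check, using the same paths as above):

# ocrconfig -showbackup
# ls -rtl $ORA_CRS_HOME/cdata/<cluster_name>/backup*.ocr

The dates reported by ocrconfig and the timestamps on the backupXX.ocr files should now match.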

Reference: Metalink Note 741271.1

Saturday, September 19, 2009

Oracle 11gR2 RAC on Red Hat Linux 5.3 install guide, part 1


Installing the grid infrastructure:

Oracle 11gR2 has been released for Linux, and the installation has changed somewhat from previous versions, including 11gR1. In this step-by-step guide, we will walk you through a real-life Oracle Real Application Clusters (RAC) installation, its novelties, its incompatibilities, and the caveats of the Oracle install documentation.

The installation process is divided into two parts: the grid infrastructure installation, which now includes not only the clusterware but also ASM (moved there from the regular database installation), and the database installation itself. This stems from the fact that ASM can now store the voting disks and OCR files, so you are no longer required (in fact, it is now discouraged) to place them on raw devices.
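
Once the grid infrastructure is up, you can verify where the voting disks and OCR ended up (both commands ship with the clusterware):

# crsctl query css votedisk
# ocrcheck

The first lists the voting disks, which can now live in an ASM disk group; the second shows the OCR location and checks its integrity.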

The grid infrastructure also installs the ACFS cluster file system for you, which allows you to share the ORACLE_HOME of your database installation between all the nodes of your RAC cluster. However, it doesn't allow you to share the grid infrastructure ORACLE_HOME between the nodes: for that, you would need to install the grid infrastructure binaries on an ocfs2 filesystem, which is neither supported by Oracle nor actually working. Last year, Oracle promised an Oracle Universal File System (UFS), and it is a bit disappointing to see that ACFS is not yet what we were expecting.

Download the necessary files:

You will need the grid infrastructure disk, as well as the two database disks, from Oracle's site. Download the deinstall disk as well: Oracle Universal Installer no longer supports deinstallation of the binaries, and everything has moved to this 300 MB-plus disk.

You will also need the three asmlib files (the kernel driver, oracleasmlib, and oracleasm-support), which are downloadable from OTN.

Run uname -rm on your platform to find out which ones are right for you.
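
For example, on a 64-bit RHEL 5.3 box you might see:

# uname -rm
2.6.18-128.el5 x86_64

Here you would pick the oracleasm kernel driver built for exactly 2.6.18-128.el5 on x86_64, plus the matching x86_64 oracleasmlib and oracleasm-support packages (the exact version numbers vary).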

The oracle-validated RPM will also be useful to ensure you have all the necessary RPMs installed on your server.
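
If you prefer to check by hand, rpm -q against the guide's package list does the trick; a partial example (see the install guide for the full list):

# for p in binutils compat-libstdc++-33 elfutils-libelf gcc gcc-c++ glibc glibc-devel libaio libaio-devel sysstat unixODBC; do rpm -q $p; done

Any package reported as "not installed" needs to be added before launching the installer.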

Setting the system parameters and prerequisites:

Nothing much new here: simply follow the install guide's instructions. The installer will check anyway whether you have set the parameters correctly, and can even generate a fixup script for most of the prerequisites.

# cat >> /etc/sysctl.conf <<EOF
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmmax = 4294967296
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
EOF

# sysctl -p

An intriguing requirement is to run ntpd with the -x option, which prevents any abrupt stepping of the system clock for offsets under 600 s (instead of the 128 ms default); the clock is slewed gradually instead. This is a workaround for a RAC bug that was supposed to have been corrected in release 10.2.0.3... Then again, it may not be a bug but a feature for any cluster that needs its clocks synchronized. The downside of -x is that if your hardware is down for a month and the system clock ends up half an hour off, it will take days for it to slowly adjust back to network time.
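
On RHEL 5 the option goes into /etc/sysconfig/ntpd; a minimal sketch, assuming the stock OPTIONS line:

# sed -i 's/^OPTIONS="/OPTIONS="-x /' /etc/sysconfig/ntpd
# service ntpd restart

After the restart, ps -ef | grep ntpd should show the daemon running with -x.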

Be sure to run chkconfig oracleasm on after setting up oracleasm on RHEL 5.3. Otherwise, you will corrupt your voting disks and OCR upon the first reboot. The install guide has simply forgotten to mention that oracleasm enable/disable have been deprecated on this platform.
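
A sketch of the sequence on one node (with oracleasm-support 2.1, configure -i runs the interactive setup; the answers depend on your setup, e.g. grid as the driver owner and asmadmin as the group, matching the users created below):

# /usr/sbin/oracleasm configure -i
# chkconfig oracleasm on
# chkconfig --list oracleasm

The last command should show the service enabled for runlevels 2 through 5.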

Don't bother setting up VIPs or passwordless ssh connectivity, contrary to what the install guide instructs: the installer won't appreciate your initiative, and you will have to set them up the way Oracle wants anyway. Simply give the same password to your grid and oracle users on both nodes.

Creating the UNIX groups, users, and directories:

Create two separate users (grid and oracle, for example), one for the grid infrastructure installation and one for the database installation, with separate ORACLE_BASE and ORACLE_HOME directories.

A new set of three groups (asmadmin, asmdba, and asmoper) has been introduced to manage ASM. The grid user should be a member of them.

A change from OFA is that the grid user's ORACLE_HOME cannot be under its ORACLE_BASE directory; it must be on a separate path.

Here, we will point the oracle user's home to a shared ACFS mount. We'll mount that filesystem later, after the grid infrastructure installation, once ACFS is available. Indeed, ACFS is built on top of ASM, which in turn is installed as part of the grid infrastructure; hence the separation of the grid infrastructure and database installations.

(As a footnote: you may change u01 to get27, for example, and still be OFA-compliant)

# /usr/sbin/groupadd oinstall
# /usr/sbin/groupadd dba
# /usr/sbin/groupadd oper
# /usr/sbin/groupadd asmadmin
# /usr/sbin/groupadd asmdba
# /usr/sbin/groupadd asmoper
# /usr/sbin/groupadd orauser

# /usr/sbin/usermod -g oinstall -G dba,asmdba oracle
# /usr/sbin/usermod -g oinstall -G dba,asmdba,oper,oinstall,asmadmin grid
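
You can double-check the memberships afterwards:

# id grid
# id oracle

grid should now list oinstall, dba, asmdba, oper, and asmadmin; oracle should list oinstall, dba, and asmdba.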

# mkdir -p /u01/app/11.2.0/grid
# mkdir -p /u01/app/grid
# mkdir -p /u01/app/acfsmounts/oracle
# chown -R grid:oinstall /u01/app
# chmod -R 775 /u01/app
# chown -R oracle:oinstall /u01/app/acfsmounts
# chmod -R 775 /u01/app/acfsmounts

# cat >> /etc/security/limits.conf <<EOF
grid soft nproc 2047
grid hard nproc 16384
grid soft nofile 1024
grid hard nofile 65536

oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
EOF
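
For these limits to actually apply at login, the install guide also has you enable pam_limits; a minimal sketch:

# cat >> /etc/pam.d/login <<EOF
session required pam_limits.so
EOF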

To be continued ... stay tuned.

Friday, September 18, 2009

Discrepancies and gotchas in the Oracle 11gR2 grid infrastructure install guide for Linux



- Oracle instructs you to create VIPs on both nodes as a preinstall task.
However, if you do so, the Oracle grid infrastructure installer will tell you the VIP addresses are already in use.

- Even though you have set up passwordless ssh connectivity between the two RAC nodes, the installer keeps telling you this is not the case. I guess it has something to do with Oracle using rsa1. I gave up, gave both my oracle users the same password, clicked on "Setup", and let the installer do it for me. Everything went fine afterwards.

- /usr/sbin/oracleasm enable and disable have been deprecated on RHEL 5.3.
You have to use chkconfig oracleasm on.
If you fail to do so, asmlib is not loaded upon reboot, and your voting disks and OCR get corrupted.

- If you use ACFS, you have to use different ORACLE_BASE directories for the Oracle grid infrastructure user (e.g. grid: /u01/app/grid/) and the Oracle database user (e.g. oracle: /u01/app/oracle/); see the environment sketch after this list.
In the install doc this is not so clear: only the ORACLE_HOME directories (e.g. /u01/app/11.2.0/grid/ for grid and /u01/app/oracle/acfsmounts/orahome1/ for oracle) have to be different, and the ORACLE_BASE seems to be a single shared one.

- Even though you can set up a shared ORACLE_HOME through ACFS for the database binaries, you still have to rely on ocfs2 if you want to have the Oracle grid infrastructure binaries on a shared filesystem.

- You absolutely have to be patient and wait for the root.sh script to finish on your first node (this can take half an hour) before you execute it on the other nodes. Otherwise, your installation will fail miserably.
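
A sketch of the two users' environments, using the example paths above (typically set in each user's .bash_profile):

export ORACLE_BASE=/u01/app/grid            # grid user
export ORACLE_HOME=/u01/app/11.2.0/grid

export ORACLE_BASE=/u01/app/oracle          # oracle user
export ORACLE_HOME=/u01/app/oracle/acfsmounts/orahome1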

A complete RAC installation guide for Oracle 11gR2 on RHEL 5.3 with multipath will follow soon.