Voting Disk and OCR in 11gR2: Some changes

Having just delivered an Oracle Database 11gR2 RAC Admin course, I’d like to point out some remarkable changes in how we now handle two important Clusterware components, the Voting Disk and the Oracle Cluster Registry (OCR): we can now store both inside an Automatic Storage Management (ASM) Disk Group, which was not possible in 10g.

The OCR is striped and mirrored (if the redundancy is other than external), just like ordinary Database Files. So we can now leverage the mirroring capabilities of ASM for the OCR as well, without having to use multiple RAW devices for that purpose alone. The Voting Disk (or Voting File, as it is now also called) is not striped but placed as a whole on ASM Disks: with normal redundancy on the Diskgroup, 3 Voting Files are placed, each on its own ASM Disk. This is a concern if our ASM Diskgroup consists of only 2 ASM Disks! Therefore, the new quorum failgroup clause was introduced:

create diskgroup data normal redundancy
 failgroup fg1 disk 'ORCL:ASMDISK1'
 failgroup fg2 disk 'ORCL:ASMDISK2'
 quorum failgroup fg3 disk 'ORCL:ASMDISK3'
 attribute 'compatible.asm' = '11.2.0.0.0';

The failgroup fg3 above needs only one small Disk (300 MB should be on the safe side here, since the Voting File is only about 280 MB in size) to keep one Mirror of the Voting File. fg1 and fg2 will each contain one Voting File as well as all the other stripes of the Database Area, but fg3 will only get that one Voting File.

[root@uhesse1 ~]#  /u01/app/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   511de6e64e354f9bbf4be318fc928c28 (ORCL:ASMDISK1) [DATA]
 2. ONLINE   2f1973ed4be84f50bffc2475949b428f (ORCL:ASMDISK2) [DATA]
 3. ONLINE   5ed44fb7e79c4f79bfaf09b402ba70df (ORCL:ASMDISK3) [DATA]
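To check such a listing from a script, the output can be parsed and tested against the rule that a node must be able to access strictly more than half of the Voting Files. The following is only a minimal Python sketch: the names parse_votedisks and has_majority_online are made up for illustration, the column layout is assumed to match the listing above, and the sample text is pasted in rather than obtained by actually calling crsctl.

```python
import re

# Sample text copied from the listing above. In practice it would be the
# captured output of "crsctl query css votedisk" (assumption: same layout).
SAMPLE = """\
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   511de6e64e354f9bbf4be318fc928c28 (ORCL:ASMDISK1) [DATA]
 2. ONLINE   2f1973ed4be84f50bffc2475949b428f (ORCL:ASMDISK2) [DATA]
 3. ONLINE   5ed44fb7e79c4f79bfaf09b402ba70df (ORCL:ASMDISK3) [DATA]
"""

# One data row: number, state, 32 hex digits, (disk name), [disk group].
ROW = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+([0-9a-f]{32})\s+\((.+?)\)\s+\[(.+?)\]\s*$"
)


def parse_votedisks(text):
    """Return one dict per Voting File row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "state": match.group(2),
                "fuid": match.group(3),
                "disk": match.group(4),
                "diskgroup": match.group(5),
            })
    return rows


def has_majority_online(rows):
    """True if strictly more than half of the listed Voting Files are ONLINE."""
    online = sum(1 for row in rows if row["state"] == "ONLINE")
    return online > len(rows) / 2


if __name__ == "__main__":
    for row in parse_votedisks(SAMPLE):
        print(row["disk"], row["diskgroup"], row["state"])
    print("majority online:", has_majority_online(parse_votedisks(SAMPLE)))
```

With the three rows above, one Voting File may go offline and the majority test still passes; with two offline it fails.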

Another important change regarding the Voting File is that taking a manual backup of it with dd is no longer supported. Instead, the Voting File gets backed up automatically into the OCR. As a New Feature, you can now take a manual backup of the OCR any time you like, without having to wait for the automatic backup, which is still done as well:

[root@uhesse1 ~]# /u01/app/11.2.0/grid/bin/ocrconfig -showbackup

uhesse1     2010/10/06 09:37:30     /u01/app/11.2.0/grid/cdata/cluhesse/backup00.ocr
uhesse1     2010/10/06 05:37:29     /u01/app/11.2.0/grid/cdata/cluhesse/backup01.ocr
uhesse1     2010/10/06 01:37:27     /u01/app/11.2.0/grid/cdata/cluhesse/backup02.ocr
uhesse1     2010/10/05 01:37:21     /u01/app/11.2.0/grid/cdata/cluhesse/day.ocr
uhesse1     2010/10/04 13:37:19     /u01/app/11.2.0/grid/cdata/cluhesse/week.ocr

Above are the automatic backups of the OCR as in earlier versions. Now the manual backup:

[root@uhesse1 ~]# /u01/app/11.2.0/grid/bin/ocrconfig -manualbackup
uhesse1     2010/10/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr

I got a manual backup in the default location on my master node. We can define another backup location for the automatic as well as the manual backups, preferably on a Shared Device that is accessible by all the nodes (which is not the case with /home/oracle, unfortunately :-) ):

[root@uhesse1 ~]# /u01/app/11.2.0/grid/bin/ocrconfig -backuploc /home/oracle
[root@uhesse1 ~]# /u01/app/11.2.0/grid/bin/ocrconfig -manualbackup
uhesse1     2010/10/06 13:10:50     /home/oracle/backup_20101006_131050.ocr
uhesse1     2010/10/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr

[root@uhesse1 ~]# /u01/app/11.2.0/grid/bin/ocrconfig -showbackup
uhesse1     2010/10/06 09:37:30     /u01/app/11.2.0/grid/cdata/cluhesse/backup00.ocr
uhesse1     2010/10/06 05:37:29     /u01/app/11.2.0/grid/cdata/cluhesse/backup01.ocr
uhesse1     2010/10/06 01:37:27     /u01/app/11.2.0/grid/cdata/cluhesse/backup02.ocr
uhesse1     2010/10/05 01:37:21     /u01/app/11.2.0/grid/cdata/cluhesse/day.ocr
uhesse1     2010/10/04 13:37:19     /u01/app/11.2.0/grid/cdata/cluhesse/week.ocr
uhesse1     2010/10/06 13:10:50     /home/oracle/backup_20101006_131050.ocr
uhesse1     2010/10/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr
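When several backup locations are in play, it can help to pick the most recent backup out of such a listing programmatically. Below is a small Python sketch under stated assumptions: the helper names parse_backups and newest are hypothetical, the four-column layout (node, date, time, path) is taken from the listing above, and the sample text is pasted in instead of being read from ocrconfig.

```python
from datetime import datetime

# Lines copied from the "ocrconfig -showbackup" listing above.
SAMPLE = """\
uhesse1     2010/10/06 09:37:30     /u01/app/11.2.0/grid/cdata/cluhesse/backup00.ocr
uhesse1     2010/10/06 05:37:29     /u01/app/11.2.0/grid/cdata/cluhesse/backup01.ocr
uhesse1     2010/10/06 01:37:27     /u01/app/11.2.0/grid/cdata/cluhesse/backup02.ocr
uhesse1     2010/10/05 01:37:21     /u01/app/11.2.0/grid/cdata/cluhesse/day.ocr
uhesse1     2010/10/04 13:37:19     /u01/app/11.2.0/grid/cdata/cluhesse/week.ocr
uhesse1     2010/10/06 13:10:50     /home/oracle/backup_20101006_131050.ocr
uhesse1     2010/10/06 13:07:03     /u01/app/11.2.0/grid/cdata/cluhesse/backup_20101006_130703.ocr
"""


def parse_backups(text):
    """Return (timestamp, node, path) tuples; lines in another shape are skipped."""
    backups = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        node, day, clock, path = parts
        try:
            taken = datetime.strptime(day + " " + clock, "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # not a backup row
        backups.append((taken, node, path))
    return backups


def newest(backups):
    """Return the backup tuple with the latest timestamp."""
    return max(backups, key=lambda backup: backup[0])


if __name__ == "__main__":
    taken, node, path = newest(parse_backups(SAMPLE))
    print(node, taken, path)
```

For the listing above, the newest entry is the manual backup written to /home/oracle, which also shows why a location visible to only one node is a poor choice.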

Conclusion: The way to handle Voting Disk and OCR has changed significantly. In particular, both can now be kept inside an ASM Diskgroup.

  1. #1 by jason arneil on October 6, 2010 - 17:17

    Hi Uwe,

    Any sign of being able to place the voting files onto multiple diskgroups if using external redundancy? And does it make any sense using the quorum feature with external redundancy?

    regards,

    jason.

  2. #2 by Uwe Hesse on October 7, 2010 - 07:49

    Hi Jason,
    there seems to be no supported way to have multiple Voting Files with external redundancy. Even the “crsctl add css votedisk” command is disabled if Voting Files are stored on ASM initially. You get this error message then: “CRS-4671: This command is not supported for ASM diskgroups.”

    Regarding your second question: I don’t see why we should do that, even though it should be possible technically.

  3. #3 by Srinivasan Krishnan on October 7, 2010 - 10:08

    For the quorum disk feature, do I have to add an extra disk? Currently I have only one voting disk.

  4. #4 by Uwe Hesse on October 7, 2010 - 10:18

    The Quorum Failgroup clause was introduced for setups with Extended RAC and/or for setups with Diskgroups that have only 2 Disks (or only 2 Failure Groups) but should use normal redundancy. If you currently have one Voting Disk and you set up a Diskgroup with external redundancy, you will stay with one Voting Disk that is placed on your Diskgroup (as a whole on one Disk) together with all the other files (Datafiles, Controlfiles, Logfiles) that make up your Database Area. If you choose external redundancy, your storage (RAID) should provide the redundancy, though.
    Short answer: No, not necessarily :-)

  5. #5 by Manuel Fuenzalida on March 18, 2011 - 04:35

    Hi uwe:

    I read your blog and find the articles very good, but this one interests me in particular because I’m now installing an Extended RAC 11g R2 with two storage systems, both at the same physical site. When the installer asks me for the storage options for OCR and voting file and I tell it to use ASM with normal redundancy, how can I have each voting file mirrored in another failure group made of disks from the second storage system? And how many disks do I need for that? The Grid Infrastructure installer doesn’t let me specify these options.

    thank you :)

    regards
    Manuel

  6. #6 by Uwe Hesse on March 18, 2011 - 09:02

    Hi Manuel,
    I suppose you are aware that it is not recommended to put 2 of 3 voting files on one site of an Extended RAC, because if this site crashes, the cluster is unavailable. You should instead put one voting file on each site and one on a third node. Look here for a whitepaper that describes that:
    http://www.oracle.com/technetwork/database/clusterware/overview/grid-infra-thirdvoteonnfs-131158.pdf

    If you insist on putting 2 voting files on one site, you could control this with the quorum failgroup clause mentioned above in the article. Pick one disk on the desired site for that quorum failgroup. But again: From an HA perspective, it is a bad idea to put 2 of 3 voting files on one site of an Extended RAC.
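The reasoning behind this warning can be spelled out with a few lines of arithmetic. The Python sketch below is purely illustrative: the function survives_site_failure and the site names are hypothetical, and it models only the majority rule (surviving nodes must still reach strictly more than half of the voting files), nothing else about Clusterware behaviour.

```python
def survives_site_failure(placement, failed_site):
    """placement maps a site name to the number of voting files stored there.

    Returns True if the voting files outside failed_site are still
    strictly more than half of all voting files.
    """
    total = sum(placement.values())
    remaining = total - placement.get(failed_site, 0)
    return remaining > total / 2


# Hypothetical layouts for an Extended RAC with 3 voting files.
two_on_one_site = {"site_a": 2, "site_b": 1}
one_per_site = {"site_a": 1, "site_b": 1, "third_site": 1}

if __name__ == "__main__":
    for name, layout in [("2+1", two_on_one_site), ("1+1+1", one_per_site)]:
        for site in layout:
            print(name, "lose", site, "->", survives_site_failure(layout, site))
```

In the 2+1 layout, losing the site with two voting files leaves only one of three reachable, so the majority is gone; in the 1+1+1 layout any single site can fail.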

  7. #7 by Manuel on March 18, 2011 - 14:04

    Hi uwe:

    Thanks for your reply. I understand that it is a bad idea to put 2 of 3 voting files on one site of an Extended RAC, but I don’t know how to do it differently. The Grid Infrastructure installer doesn’t let me configure, at install time, the failgroups for the disks containing OCR and Voting Files, so ASM puts each disk into its own failgroup. I need 1 voting file on one disk from one storage system, another voting file on one disk from the other storage system, and the third voting file I could place as you say, on NFS or something like that. But for now I can’t do it that way because the installer doesn’t let me specify failgroups.

    Thank you again.
    Regards
    Manuel

  8. #8 by DanyC on April 12, 2011 - 11:46

    Hi Uwe,

    Suppose I have 2 nodes accessing 2 storage systems, with the OCR & voting files stored in

    ORCL:CRS_ST1_DISK1
    ORCL:CRS_ST1_DISK2
    ORCL:CRS_ST2_DISK1
    ORCL:CRS_ST2_DISK2
    ORCL:CRS_ST2_DISK3

    Is that normal? I’m very confused, as one of our DBAs said the number should be odd, but I didn’t find any note where Oracle recommends that!?!
    Why should I not have 3 failgroups on both storage systems?

    Looking forward to your reply.

    Many thanks,
    Dani

  9. #9 by Uwe Hesse on April 12, 2011 - 20:24

    Dani,
    the number of Voting Files for 11gR2 is 1, 3 or 5, depending on the redundancy (external, normal, high) of the ASM Diskgroup that the Voting Files are kept in. The OCR has 1 file on that Diskgroup, which can be mirrored to another Diskgroup; then you have 2 files. These numbers do not increase even if you create more than 3 Failgroups on your Diskgroups.

  10. #10 by DanyC on April 13, 2011 - 13:48

    Thanks Uwe for your answer.
    Looking on MOS i found a note 877134.1 which says

    “An odd number of voting disks is required for proper clusterware configuration. A node must be able to access strictly more than half of the voting disks at any time. So, in order to tolerate a failure of n voting disks, you must have at least 2n+1 configured. (n=1 means 3 voting disks). Refer to Note 428681.1 for assistance with adding voting disks.”

    Thanks a lot,
    Dani
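The 2n+1 rule quoted from the MOS note, together with the 1/3/5 counts per redundancy level given in the previous answer, can be worked through as follows. This is just a Python sketch of the arithmetic; the function and constant names are invented for illustration.

```python
# Voting file counts per ASM Diskgroup redundancy, as stated in the thread.
VOTING_FILES_BY_REDUNDANCY = {"external": 1, "normal": 3, "high": 5}


def voting_disks_needed(failures_to_tolerate):
    """Minimum number of voting disks to tolerate n failures: 2n + 1."""
    return 2 * failures_to_tolerate + 1


def tolerated_failures(voting_disks):
    """Largest number of failed disks that still leaves strictly more than half."""
    return (voting_disks - 1) // 2


if __name__ == "__main__":
    for redundancy, count in VOTING_FILES_BY_REDUNDANCY.items():
        print(redundancy, count, "files, tolerates", tolerated_failures(count))
    # An even count buys nothing: 4 disks tolerate no more failures than 3.
    print(tolerated_failures(3), tolerated_failures(4))
```

This also shows why an odd number is asked for: going from 3 to 4 voting disks does not raise the number of failures that can be tolerated.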

  11. #11 by Manuel Fuenzalida on May 9, 2011 - 22:56

    Hi Uwe, I’m trying to add a third voting file on an NFS file system, but I have a problem: the other 2 voting files are stored on ASM. Is it possible to do this?

    Regards, Manuel

  12. #12 by Uwe Hesse on May 10, 2011 - 12:30

    Hi Manuel,
    yes it is. That is described in the whitepaper I linked to already in a previous answer to your questions:
    http://www.oracle.com/technetwork/database/clusterware/overview/grid-infra-thirdvoteonnfs-131158.pdf

  13. #13 by Manuel on May 10, 2011 - 15:28

    Hi Uwe, yesterday I tried to add the third voting file following the instructions in the whitepaper, but an error occurred. The error tells me that I can’t add the third voting file on an NFS file system because I have the other 2 voting files on ASM storage.

    Have you tried this ?

    Best regards, Manuel

  14. #14 by Uwe Hesse on May 11, 2011 - 16:13

    Hi Manuel,
    although I didn’t do that myself, I trust Roland Knapp & Markus Michalewicz (the authors of the whitepaper) that it can be done as described :-)

  15. #15 by Dinesh on May 25, 2011 - 12:32

    I have a partition in shared disk /dev/sdc1.

    While installing Oracle grid, when it prompted for the Diskgroup for OCR and Voting disk, I gave the name DGDATA and chose only one (external redundancy) disk “/dev/sdc1” (it does not ask for one for OCR and one for Voting as the 10g clusterware installation used to do).

    The Oracle grid installation went through successfully. When I restart the system, I find CRS not starting.

    The log says: “Error PROC:26: Error while accessing the physical storage ASM……..” ORA-01034: oracle not available Could not init OCR, code:26….Linux permission denied.

    For your information….ASM instance is up and I find the diskgroup mounted.

    Can anyone help on this.

    Regards
    Dinesh

  16. #16 by Manuel on May 25, 2011 - 15:44

    Dinesh, before you do the installation, you have to configure the ASM disks with the “oracleasm” utility. First install this utility from its rpm packages on Linux, then execute “oracleasm createdisk”….

    Best Regards
    Manuel

  17. #17 by Uwe Hesse on May 26, 2011 - 08:08

    Dinesh, apart from Manuel’s valid hint: If you have such a serious technical problem, your first contact should be Oracle Support. The second-best option (without a MOS account) would be the OTN Discussion Forum: http://forums.oracle.com/forums/forum.jspa?forumID=62
    I do this Blog in my spare time and may answer late or not at all, whereas the forum has hundreds of members who will answer much faster.

  18. #18 by Dinesh on May 29, 2011 - 17:42

    Thanks Manuel and Hesse.

    Manuel: I have applied the rpms and tried to use the ORACLEASM command to create a disk (basically labeling the disk). I had issues executing root.sh on the 2nd node while installing grid. The next time, I directly specified /dev/sdc1 for OCR and Voting disk during grid installation, and the whole grid installation worked fine. The problem only shows up when you restart the system!

    Thanks Hesse for the link, I did go through it. Will post my problem there.

  19. #19 by Ora600Tom on November 18, 2011 - 15:14

    Nice post, thank you.

    You said,

    “The failgroup fg3 above needs only one small Disk (300 MB should be on the safe side here, since the Voting File is only about 280 MB in size) to keep one Mirror of the Voting File. fg1 and fg2 will contain each one Voting File and all the other stripes of the Database Area as well, but fg3 will only get that one Voting File.”

    Is this correct? Does the Voting disk alone make up 280 MB, or do Voting disk and OCR together make up 280 MB?

    Thanks
    Thomas Saviour

  20. #20 by Uwe Hesse on November 18, 2011 - 19:05

    Thank you for the comment. I think it is indeed true (although a couple of hundred MB may be considered negligible anyway, these days):
    http://download.oracle.com/docs/cd/E11882_01/install.112/e22489/storage.htm#CWLIN288

  21. #21 by Ora600Tom on November 19, 2011 - 05:33

    Thank you. That means, OCR will be striped only on fg1 and fg2. My initial impression was both OCR and Voting disk will be in all 3 disks.

    But the same document also says

    “If you are upgrading Oracle Clusterware, and your existing cluster uses 100 MB OCR and 20 MB voting disk partitions, then you must extend these partitions to at least 300 MB. Oracle recommends that you do not use partitions, but instead place OCR and voting disks in disk groups marked as QUORUM disk groups.”

    That means QUORUM can hold OCR in certain circumstances?

    Interestingly, as per a blog post from Riyaj, each node writes 512 bytes at a node-specific offset during heartbeat checking. So I do not understand what brings the voting disk size to 280 MB. In the past, the Voting disk size was very small.

    http://orainternals.wordpress.com/2010/10/29/whats-in-a-voting-disk/

    Many Thanks
    Thomas

  22. #22 by Uwe Hesse on November 19, 2011 - 10:42

    Thank you for raising this interesting discussion! I really do appreciate that :-)
    Right now, I have no RAC at hand to check, but I doubt that quorum failgroups contain OCR stripes. They were invented to be used on the third (middle) site of an Extended RAC, where no crsd process is running to use the OCR. Will research that as soon as I find the time & hardware for it, which may take a little while because I am very busy with course delivery and assisting Oracle Certification to craft an Exadata Exam.

  23. #23 by Momo on November 22, 2011 - 16:48

    Hi guys, thank you for this interesting blog.
    I just want some clarification about the quorum failgroup. i see this syntax :

    SQL> CREATE DISKGROUP TEST NORMAL REDUNDANCY
    FAILGROUP fg1 DISK ‘
    FAILGROUP fg2 DISK ‘

    QUORUM FAILGROUP fg3 DISK ”
    ATTRIBUTE ‘compatible.asm’ = ’11.2.0.0.0′;

    If I need to place my third voting file on NFS, how do I create this file? Will it work if the file is created by a simple touch?

    Thank you

  24. #24 by Momo on November 22, 2011 - 16:50

    The correct syntax

    CREATE DISKGROUP TEST NORMAL REDUNDANCY
    FAILGROUP fg1 DISK ‘
    FAILGROUP fg2 DISK ‘

    QUORUM FAILGROUP
    fg3 DISK ”
    ATTRIBUTE ‘compatible.asm’ = ’11.2.0.0.0′;

  25. #25 by Momo on November 22, 2011 - 16:51

    CREATE DISKGROUP TEST NORMAL REDUNDANCY
    FAILGROUP fg1 DISK 'disk in SAN1'
    FAILGROUP fg2 DISK 'disk in SAN2'
    QUORUM FAILGROUP fg3 DISK 'another disk or file'
    ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';

  26. #26 by momo on November 23, 2011 - 14:24

    Finally I found the solution in this document: http://www.oracle.com/technetwork/database/clusterware/overview/grid-infra-thirdvoteonnfs-131158.pdf

    Just create the file with the dd command as described.

    Thank you for this useful blog

  27. #27 by Uwe Hesse on November 23, 2011 - 18:28

    Thank YOU for sharing this information here :-)

  28. #28 by Jay on December 5, 2011 - 04:17

    We chose external redundancy for OCR/VD in a new 11.2.0.3 RAC grid infrastructure. When adding an OCR copy to a second ASM Diskgroup with “ocrconfig -add ”, should the ASM and database compatible attributes of NewDg be 11.2.0.0?

    Is there any procedure to create a second VD in NewDG with external redundancy? If not, how do we maintain a redundant copy of the VD with external redundancy?

    Thanks
    Jay

  29. #29 by Sabine on January 10, 2012 - 19:19

    Hi Uwe,
    could you explain what the Universal File Id shown by the query “crsctl query css votedisk” is?
    We are using RAC One Node on iSCSI storage and I couldn’t identify this Universal File Id: it is neither the scsi-wwid nor the iscsi-id.
    We also noticed that after bringing an offline voting disk online again, this Universal File Id had changed.
    Best Regards
    Sabine

  30. #30 by Uwe Hesse on January 12, 2012 - 12:12

    Sabine,
    thank you for stopping by! I am sorry, but I can’t tell you the deeper meaning of the Universal File Id. Will keep an eye on that; maybe I will stumble upon it in the future.

  31. #31 by Js on February 5, 2012 - 11:50

    Hi Uwe, nice post

    I have a couple of questions regarding OCR/VOT disks on ASM. I went through some documentation, but it is still not clear to me.

    I am confused about the startup order of Clusterware and ASM. Before 11gR2, Oracle used to start Clusterware -> ASM -> DB and so on. But now that OCR/VOT are on ASM, how does Oracle manage this, and which component gets started first?

    I would appreciate if you can shed some light.

    Regards,

  32. #32 by Uwe Hesse on February 7, 2012 - 13:25

    Thank you, JS, for the question: The secret lies in the OLR (Oracle Local Registry) that points to the voting files, placed (unstriped) on single disks in an ASM diskgroup. Look here for a lot more details regarding the cluster startup sequence:

    11gR2 Clusterware and Grid Home – What You Need to Know [ID 1053147.1]
    Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]
