Thursday 31 October 2013

Recover archivelog gap between primary to standby using RMAN incremental backups

A Physical Standby database synchs with Primary by continuous apply of archive logs from a Primary Database. In case of an archive log gone missing or corrupt, We have to bring back the standby to sync with the primary.

When the logs missing or corrupt is less in number (say below 15), we can ship the logs which were missing in the standby site from the primary site (scp/sftp/ftp) and then we can register the log file in the standby so that the gap can be resolved.

Find the archives which are missing by issueing the following command.
SQL> select * from v$archive_gap;
This would give the gap sequences. Or you can use the v$managed_standby view to find where the log apply stuck.

SQL> select sequence#,process,status from v$managed_standby;
Here u can see status as wait for log for say sequence# 100 but your primary would've proceeded to sequence# 110

At primary
SQL> select max(sequence#) from v$archived_log;      ---> This would show you 110

Copy the logs to the standby site from the primary site
$ scp log_file_name_n.arc oracle@standby:/log/file/location/log_file_name_n.arc

At standby site
SQL> alter database register logfile '/log/file/location/log_file_name_n.arc';
logfile registered
Do the log file registration at the standby site until all the missing log files are registered. Now apply would take place and your standby will become sync with the primary.
This is easy process if you have missing or corrupt logs in lesser number. But when the difference is huge (say around 500 logs) this method is very time consuming and not a proper approach.

Else you have to rebuild the standby database from scratch.
As an enhancement from 10g, an incremental backup created with BACKUP INCREMENTAL... FROM SCN can be used to refresh the standby database with changes at the primary database since the last SCN at Standby and then managed recovery can resume i.e. Compensate for the missing archive logs.

Let us see the steps involved.
Step 1: On the primary:
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
144710998

On the standby:
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
130158742

Clearly there is a difference. But this by itself does not indicate a problem; since the standby is expected to lag behind the primary (this is an asynchronous non-real time apply setup). The real question is how much it is lagging in the terms of wall clock.

To know that use the scn_to_timestamp function to translate the SCN to a timestamp:
SQL> select scn_to_timestamp(144710998) from dual;
SCN_TO_TIMESTAMP(1447102)
-------------------------------
18-AUG-11 08.54.28.000000000 AM

Run the same query to know the timestamp associated with the SCN of the standby database as well
SQL> select scn_to_timestamp(130158742) from dual;
SCN_TO_TIMESTAMP(1301571)
-------------------------------
13-AUG-11 07.19.27.000000000 PM
Note: Run it on the primary database, since it will fail in the standby in a mounted mode
 This shows that the standby is four and half days lagging!

Step 2: [Standby] Stop the managed standby apply process:
SQL> alter database recover managed standby database cancel;
Database altered.

Step 3: [Standby] Shutdown the standby database
SQL> shut immediate

Step 4: [Primary] On the primary, take an incremental backup from the SCN number where the standby has been stuck:
RMAN> run {
2> allocate channel c1 type disk format '/u01/backup/%U.bkp';
3> backup incremental from scn 130158740 database;
4> }

Step 5: [Primary] On the primary, create a new standby controlfile:
SQL> alter database create standby controlfile as '/u01/backup/for_standby.ctl';
Database altered.

Step 6: [Primary] Copy these files to standby host:
oracle@dba1 /u01/backup]$ scp * oracle@dba2:/u01/backup

Step 7: [Standby] Bring up the instance in nomount mode:
SQL> startup nomount

Step 8: [Standby] Check the location of the controlfile:
SQL> show parameter control_files
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
control_files string /u01/oradata/standby_cntfile.ctl

Step 9: [Standby] Replace the controlfile with the one you just created in primary.
 $ cp /u01/backup/for_standby.ctl /u01/oradata/standby_cntfile.ctl

Step 10: [Standby] Mount the standby database:
SQL> alter database mount standby database;

Step 11: [Standby] Connect to RMAN. RMAN does not know about these files yet; so you must let it know – by a process called cataloging. Catalog these files:
$ rman target=/
RMAN> catalog start with '/u01/backup';

Step 12: Recover these files:
RMAN> recover database;

Step 13: After some time, the recovery fails with the message:
archive log filename=/u01/oradata/1_18108_697108460.dbf thread=1 sequence=18109
ORA-00310: archived log contains sequence 18108; sequence 18109 required
This happens because we have come to the last of the archived logs. The expected archived log with sequence# 18108 has not been generated yet.

Step 14: At this point exit RMAN and start managed recovery process:
SQL> alter database recover managed standby database disconnect from session;
Database altered.

Step 15: Check the SCN’s in primary and standby:
[Standby] SQL> select current_scn from v$database;
CURRENT_SCN
-----------
144747125
[Primary] SQL> select current_scn from v$database;
CURRENT_SCN
-----------
144747111
Now they are very close to each other. The standby has now caught up.

Thursday 10 October 2013

Re create control file for standby database

To create a control file, you have to do the following on the Primary server...

SQL> startup mount;
SQL> alter database create standby controlfile as ‘<file-name>’;
SQL> alter database open;

Then you have to copy the file you created to the Standby server in the corresponding directory (control_files parameter in the init.ora file) with the database in shutdown mode. Start it up and it will be Standby again.


Hope that helps!


make as many copy of this control file as much you have define in pfile or spfile ;


Second method 

sql>startup nomount;

take backup of control file by rman, copy this to standby server.
  1. RMAN> restore standby controlfile from '/tmp/control01.ctl';  
  2.   
  3. Starting restore at 08-JUN-10  
  4. using target database control file instead of recovery catalog  
  5. allocated channel: ORA_DISK_1  
  6. channel ORA_DISK_1: sid=1082 instance=biodb1 devtype=DISK  
  7.   
  8. channel ORA_DISK_1: copied control file copy  
  9. output filename=+DATADG/control01.ctl  
  10. output filename=+FRADG/control02.ctl  
  11. output filename=+DATADG/control03.ctl  
  12. Finished restore at 08-JUN-10  
  13.   
  14. RMAN> exit  
  1. SQL> alter database mount standby database;  
  2.   
  3. Database altered.  

Startup and shutdown physical standby

8.1 Starting Up and Shutting Down a Physical Standby Database

This section describes the procedures for starting up and shutting down a physical standby database.

8.1.1 Starting Up a Physical Standby Database

To start up a physical standby database, use SQL*Plus to connect to the database with administrator privileges, and then use the SQL*Plus STARTUP command with the NOMOUNT option. (You must use theNOMOUNT option with a standby database.)
If both the primary and standby databases are offline, then always (if possible) start the standby database before starting the primary database.
After the database is started, mount the database as a standby database. Once it is mounted, the database can receive archived redo data from the primary database.
You then have the option of either starting a managed recovery operation or opening the database for read-only access. Typically, you start a managed recovery operation. The following example shows how to start a standby database:
  1. Start the database:
    SQL> STARTUP NOMOUNT;
    
    
  2. Mount the standby database:
    SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
    
    
  3. Start the managed recovery operation:
    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE 
      2> DISCONNECT FROM SESSION;
    
    
Once the database is performing managed recovery, log apply services apply the archived redo logs to the standby database.
See Also:
Section 6.2.2 for information about managed recovery and Section 8.2 for information on opening a standby database for read-only access

8.1.2 Shutting Down a Physical Standby Database

To shut down a physical standby database, use the SQL*Plus SHUTDOWN command. If the database is performing managed recovery, you must cancel managed recovery operations before issuing theSHUTDOWN command. Control is not returned to the session that initiates a database shutdown until shutdown is complete.
If the primary database is up and running, defer the archive log destination on the primary database and perform a log switch operation (to make the defer operation take effect) before shutting down the standby database. Otherwise, log transport services will not be able to transmit redo data to this standby site.
The following steps show you how to shut down a standby database:

  1. Find out if the standby database is performing managed recovery. If the MRP0 or MRP process exists, then the standby database is performing managed recovery.
    SQL> SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;
    
    
  2. Cancel managed recovery operations.
    SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    
    
  3. Shut down the standby database.
    SQL> SHUTDOWN IMMEDIATE;

Tuesday 8 October 2013

About the Undo Retention Period

After a transaction is committed, undo data is no longer needed for rollback or transaction recovery purposes. However, for consistent read purposes, long-running queries may require this old undo information for producing older images of data blocks. Furthermore, the success of several Oracle Flashback features can also depend upon the availability of older undo information. For these reasons, it is desirable to retain the old undo information for as long as possible.
When automatic undo management is enabled, there is always a current undo retention period, which is the minimum amount of time that Oracle Database attempts to retain old undo information before overwriting it. Old (committed) undo information that is older than the current undo retention period is said to be expired and its space is available to be overwritten by new transactions. Old undo information with an age that is less than the current undo retention period is said to be unexpired and is retained for consistent read and Oracle Flashback operations.
Oracle Database automatically tunes the undo retention period based on undo tablespace size and system activity. You can optionally specify a minimum undo retention period (in seconds) by setting the UNDO_RETENTION initialization parameter. The exact impact this parameter on undo retention is as follows:
  • The UNDO_RETENTION parameter is ignored for a fixed size undo tablespace. The database always tunes the undo retention period for the best possible retention, based on system activity and undo tablespace size. See "Automatic Tuning of Undo Retention" for more information.
  • For an undo tablespace with the AUTOEXTEND option enabled, the database attempts to honor the minimum retention period specified byUNDO_RETENTION. When space is low, instead of overwriting unexpired undo information, the tablespace auto-extends. If the MAXSIZE clause is specified for an auto-extending undo tablespace, when the maximum size is reached, the database may begin to overwrite unexpired undo information. The UNDOTBS1 tablespace that is automatically created by DBCA is auto-extending.

    Automatic Tuning of Undo Retention

    Oracle Database automatically tunes the undo retention period based on how the undo tablespace is configured.
    • If the undo tablespace is configured with the AUTOEXTEND option, the database dynamically tunes the undo retention period to be somewhat longer than the longest-running active query on the system. However, this retention period may be insufficient to accommodate Oracle Flashback operations. Oracle Flashback operations resulting in snapshot too old errors are the indicator that you must intervene to ensure that sufficient undo data is retained to support these operations. To better accommodate Oracle Flashback features, you can either set the UNDO_RETENTION parameter to a value equal to the longest expected Oracle Flashback operation, or you can change the undo tablespace to fixed size.
    • If the undo tablespace is fixed size, the database dynamically tunes the undo retention period for the best possible retention for that tablespace size and the current system load. This best possible retention time is typically significantly greater than the duration of the longest-running active query.
      If you decide to change the undo tablespace to fixed-size, you must choose a tablespace size that is sufficiently large. If you choose an undo tablespace size that is too small, the following two errors could occur:
      • DML could fail because there is not enough space to accommodate undo for new transactions.
      • Long-running queries could fail with a snapshot too old error, which means that there was insufficient undo data for read consistency.
      See "Sizing a Fixed-Size Undo Tablespace" for more information.
    Note:
    Automatic tuning of undo retention is not supported for LOBs. This is because undo information for LOBs is stored in the segment itself and not in the undo tablespace. For LOBs, the database attempts to honor the minimum undo retention period specified by UNDO_RETENTION. However, if space becomes low, unexpired LOB undo information may be overwritten.

    Retention Guarantee

    To guarantee the success of long-running queries or Oracle Flashback operations, you can enable retention guarantee. If retention guarantee is enabled, the specified minimum undo retention is guaranteed; the database never overwrites unexpired undo data even if it means that transactions fail due to lack of space in the undo tablespace. If retention guarantee is not enabled, the database can overwrite unexpired undo when space is low, thus lowering the undo retention for the system. This option is disabled by default.
    WARNING:
    Enabling retention guarantee can cause multiple DML operations to fail. Use with caution.
    You enable retention guarantee by specifying the RETENTION GUARANTEE clause for the undo tablespace when you create it with either the CREATE DATABASEor CREATE UNDO TABLESPACE statement. Or, you can later specify this clause in an ALTER TABLESPACE statement. You disable retention guarantee with theRETENTION NOGUARANTEE clause.
    You can use the DBA_TABLESPACES view to determine the retention guarantee setting for the undo tablespace. A column named RETENTION contains a value ofGUARANTEENOGUARANTEE, or NOT APPLY, where NOT APPLY is used for tablespaces other than the undo tablespace.

    Undo Retention Tuning and Alert Thresholds

    For a fixed-size undo tablespace, the database calculates the best possible retention based on database statistics and on the size of the undo tablespace. For optimal undo management, rather than tuning based on 100% of the tablespace size, the database tunes the undo retention period based on 85% of the tablespace size, or on the warning alert threshold percentage for space used, whichever is lower. (The warning alert threshold defaults to 85%, but can be changed.) Therefore, if you set the warning alert threshold of the undo tablespace below 85%, this may reduce the tuned size of the undo retention period. For more information on tablespace alert thresholds, see "Managing Tablespace Alerts".

    Tracking the Tuned Undo Retention Period

    You can determine the current retention period by querying the TUNED_UNDORETENTION column of the V$UNDOSTAT view. This view contains one row for each 10-minute statistics collection interval over the last 4 days. (Beyond 4 days, the data is available in the DBA_HIST_UNDOSTAT view.) TUNED_UNDORETENTION is given in seconds.
    select to_char(begin_time, 'DD-MON-RR HH24:MI') begin_time,
    to_char(end_time, 'DD-MON-RR HH24:MI') end_time, tuned_undoretention
    from v$undostat order by end_time;
    
    BEGIN_TIME      END_TIME        TUNED_UNDORETENTION
    --------------- --------------- -------------------
    04-FEB-05 00:01 04-FEB-05 00:11               12100
          ...                                          
    07-FEB-05 23:21 07-FEB-05 23:31               86700
    07-FEB-05 23:31 07-FEB-05 23:41               86700
    07-FEB-05 23:41 07-FEB-05 23:51               86700
    07-FEB-05 23:51 07-FEB-05 23:52               86700
    
    576 rows selected.
    
    See Oracle Database Reference for more information about V$UNDOSTAT.