Thursday, November 24, 2011

SCP without password prompt

Many a times, we are in need of copying files from one server to another for our work to continue. SCP is a powerful unix utility to do file copy in a secure way. Each time when you do an scp, it prompts for the password of the target server's user. This is very good in terms of security, but might be annoying when you do a large set of file copy.

This password prompt would halt your work by some means. For e.g., you might had a script configured to pass files from source server to target server where scp does the file copy. Here if you want to pass the password each and everytime, the automated script is of no mean.
Let's see how to do scp without a password prompt.

This is to do scp without password in scripts.
First step is to create a key pair between the servers.

The syntax to create the key pair is:

$ ssh-keygen -t rsa

In response, you should see:

Generating public/private rsa key pair
Enter file in which to save the key ...

Press Enter to accept this.

In response, you should see:

Enter passphrase (empty for no passphrase):

You don't need a passphrase, so press Enter twice.

In response, you should see:

Your identification has been saved in ...
Your public key has been saved in ...

Note the name and location of the public key just generated. It always ends in .pub.

Copy the public key just generated to all of your remote Linux boxes. You can use scp or FTP or whatever to make the copy. Assuming you're using root--again, see my warning in step 1--the key must be contained in the file /root/.ssh/authorized_keys. Or, if you are logging in as a user, for example, selva, it would be in ~/.ssh/authorized_keys. Notice that the authorized_keys file can contain keys from other PCs. So, if the file already exists and contains text, you need to append the contents of your public key file to what already is there.

$ cat id_rsa.pub >>~/.ssh/authorized_keys
$ chmod 700 ~/.ssh/authorized_keys

With the above procedure completely done, you can copy your files without the prompt for the password.

Thursday, November 3, 2011

Re-Linking Of Oracle Binaries

Today one of our servers crashed and the server (whole server) was restored from the backup. It took some time to be restored one by one. Meanwhile when things are set, I tried to startup the database, I got a different error that I have not seen that before.

/home/oracle /> sqlplus "/ as sysdba"
/usr/lib/pa20_64/dld.sl: Unsatisfied code symbol 'gethrtime' in load module '/mnt1/app/oracle/product/9.2.0/bin/sqlplus'.
Abort(coredump)

This surely is not due to a database fault. When searched for solutions, found that this is the problem with relinking of oracle binaries.

Relinking of oracle binaries takes place automatically when -

            - An Oracle product has been installed with an Oracle provided installer.
            - An Oracle patch set has been applied via an Oracle provided installer.

Let's look how to relink oracle binaries, if it requires at times. At times! Those times are when -
            - An OS upgrade has occurred.
            - A change has been made to the OS system libraries. This can occur during the application of an OS patch.
            - A new install failed during the relinking phase.
            - Individual Oracle executables core dump during initial startup.
            - An individual Oracle patch has been applied with explicit relink instructions or the relink is integrated into the patch install script.

Step 1 : Login as Oracle user and Set environment parameters such as $ORACLE_HOME, $ORACLE_BASE, $PATH, $LD_LIBRARY_PATH

Step 2 : Shut down all databases and listeners running on the server.

Step 3 : Relink the components as given below for different version of oracle

Oracle Version 7.3.x

For executables: oracle, exp, imp, sqlldr, tkprof
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk install

For executables: svrmgrl, svrmgrm
cd $ORACLE_HOME/svrmgr/lib
make -f ins_svrmgr.mk linstall minstall
linstall is for svrmgrl, minstall is for svrmgrm

For executables: sqlplus
cd $ORACLE_HOME/sqlplus/lib
make -f ins_sqlplus.mk install

For executables: dbsnmp, oemevent, oratclsh
cd $ORACLE_HOME/network/lib
make -f ins_agent.mk install

For executables: names, namesctl
cd $ORACLE_HOME/network/lib
make -f ins_names.mk install

For executables: tnslsnr, lsnrctl, tnsping, csmnl, trceval, trcroute
cd $ORACLE_HOME/network/lib
make -f ins_network.mk install

Oracle Version 8.0.x

For executables: oracle, exp, imp, sqlldr, tkprof, mig, dbv, orapwd, rman, svrmgrl, ogms, ogmsctl
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk install

For executables: sqlplus
cd $ORACLE_HOME/sqlplus/lib
make -f ins_sqlplus.mk install

For executables: dbsnmp, oemevent, oratclsh, libosm.so
cd $ORACLE_HOME/network/lib
make -f ins_oemagent.mk install

For executables: tnslsnr, lsnrctl, namesctl, names, osslogin, trcasst, trcroute
cd $ORACLE_HOME/network/lib
make -f ins_network.mk install

Oracle version 8.1.x (8i) or higher

A "relink" script is provided in the $ORACLE_HOME/bin directory and can be run as follows.

cd $ORACLE_HOME/bin
relink

$ relink >parameter<
accepted values for parameter: all, oracle, network, client, client_sharedlib, interMedia, precomp, utilities, oemagent

Or else you can even do individually one by one as done for version 8.0.x with a few changes stated below.
Others remain same.

For executables: names, namesctl
cd $ORACLE_HOME/network/lib
make -f ins_names.mk install

For executables: osslogin, trcasst, trcroute, onrsd, tnsping
cd $ORACLE_HOME/network/lib
make -f ins_net_client.mk install

For executables: tnslsnr, lsnrctl
cd $ORACLE_HOME/network/lib
make -f ins_net_server.mk install

Note : There are no "Relink success" messages displayed at the terminal. If the relink was failure, it will terminate with errors such as 'Fatal error', 'Ld: fatal', 'Exit Code 1.

For HP-UX servers, the relink may fail even when done manually. This is a bug and can be rectified by applying the following OS patches as per Oracle support.

For 11.0: HP/UX patches PHCO_23770, PHCO_23092 must be applied.
For 11.11: HP/UX patches PHCO_29029 and PHCO_25569 must be applied.

Thursday, September 1, 2011

Recover archive gaps in standby database from primary using RMAN incremental backups

A Physical Standby database synchs with Primary by continuous apply of archive logs from a Primary Database. In case of an archive log gone missing or corrupt, We have to bring back the standby to sync with the primary.

When the logs missing or corrupt is less in number (say below 15), we can ship the logs which were missing in the standby site from the primary site (scp/sftp/ftp) and then we can register the log file in the standby so that the gap can be resolved.

Find the archives which are missing by issueing the following command.
SQL> select * from v$archive_gap;

This would give the gap sequences. Or you can use the v$managed_standby view to find where the log apply stuck.
SQL> select sequence#,process,status from v$managed_standby;

Here u can see status as wait for log for say sequence# 100 but your primary would've proceeded to sequence# 110
At primary
SQL> select max(sequence#) from v$archived_log;      ---> This would show you 110

Copy the logs to the standby site from the primary site

$ scp log_file_name_n.arc oracle@standby:/log/file/location/log_file_name_n.arc

At standby site
SQL> alter database register logfile '/log/file/location/log_file_name_n.arc';
logfile registered

Do the log file registration at the standby site until all the missing log files are registered. Now apply would take place and your standby will become sync with the primary.

This is easy process if you have missing or corrupt logs in lesser number. But when the difference is huge (say around 500 logs) this method is very time consuming and not a proper approach. Else you have to rebuild the standby database from scratch.
As an enhancement from 10g, an incremental backup created with BACKUP INCREMENTAL... FROM SCN can be used to refresh the standby database with changes at the primary database since the last SCN at Standby and then managed recovery can resume i.e. Compensate for the missing archive logs.

Let us see the steps involved.

Step 1: On the primary:
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
144710998

On the standby:
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
130158742

Clearly there is a difference. But this by itself does not indicate a problem; since the standby is expected to lag behind the primary (this is an asynchronous non-real time apply setup). The real question is how much it is lagging in the terms of wall clock.
To know that use the scn_to_timestamp function to translate the SCN to a timestamp:

SQL> select scn_to_timestamp(144710998) from dual;
SCN_TO_TIMESTAMP(1447102)
-------------------------------
18-AUG-11 08.54.28.000000000 AM

Run the same query to know the timestamp associated with the SCN of the standby database as well
SQL> select scn_to_timestamp(130158742) from dual;
SCN_TO_TIMESTAMP(1301571)
-------------------------------
13-AUG-11 07.19.27.000000000 PM

Note: Run it on the primary database, since it will fail in the standby in a mounted mode
This shows that the standby is four and half days lagging!

Step 2: [Standby] Stop the managed standby apply process:
SQL> alter database recover managed standby database cancel;
Database altered.

Step 3: [Standby] Shutdown the standby database
SQL> shut immediate

Step 4: [Primary] On the primary, take an incremental backup from the SCN number where the standby has been stuck:
RMAN> run {
2> allocate channel c1 type disk format '/u01/backup/%U.bkp';
3> backup incremental from scn 130158740 database;
4> }

Step 5: [Primary] On the primary, create a new standby controlfile:
SQL> alter database create standby controlfile as '/u01/backup/for_standby.ctl';
Database altered.

or

RMAN> backup current controlfile for standby format '/backups/tempfol/_%U';

Step 6: [Primary] Copy these files to standby host:
oracle@dba1 /u01/backup]$ scp * oracle@dba2:/u01/backup

Step 7: [Standby] Bring up the instance in nomount mode:
SQL> startup nomount

Step 8: [Standby] Check the location of the controlfile:
SQL> show parameter control_files
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
control_files string /u01/oradata/standby_cntfile.ctl

Step 9: [Standby] Replace the controlfile with the one you just created in primary.
 $ cp /u01/backup/for_standby.ctl /u01/oradata/standby_cntfile.ctl

or

RMAN> RESTORE STANDBY CONTROLFILE FROM '/u01/backup/_v8o069ee_1_1';

Step 10: [Standby] Mount the standby database:
SQL> alter database mount standby database;

Step 11: [Standby] Connect to RMAN. RMAN does not know about these files yet; so you must let it know – by a process called cataloging. Catalog these files:
$ rman target=/
RMAN> catalog start with '/u01/backup';

Step 12: Recover these files:
RMAN> recover database;

Step 13: After some time, the recovery fails with the message:

archive log filename=/u01/oradata/1_18108_697108460.dbf thread=1 sequence=18109
ORA-00310: archived log contains sequence 18108; sequence 18109 required

This happens because we have come to the last of the archived logs. The expected archived log with sequence# 18108 has not been generated yet.

Step 14: At this point exit RMAN and start managed recovery process:
SQL> alter database recover managed standby database disconnect from session;
Database altered.

Step 15: Check the SCN’s in primary and standby:
[Standby] SQL> select current_scn from v$database;
CURRENT_SCN
-----------
144747125
[Primary] SQL> select current_scn from v$database;
CURRENT_SCN
-----------
144747111

Now they are very close to each other. The standby has now caught up.

Monday, July 18, 2011

Retrieve deleted rows from table using audit.

Firstly connect as scott from any other user.

Conn scott/tiger
select * from emp;

SQL> DELETE FROM EMP WHERE EMPNO = 7499;


Now lets retrieve this row by using audits and flashback.
I have mentioned using audits as the row may be accidentally deleted or even deleted by any malicious activity (some unauthorised person performing the delete operation)
If it's due to malicious activity, we can also find the culprit who performed the act.

Check what audit is recorded in the database.

SQL> SELECT NAME, value FROM v$parameter WHERE NAME LIKE 'audit%';

NAME
----------------------------------------------------------------------
VALUE
----------------------------------------------------------------------
audit_sys_operations
FALSE
audit_file_dest
D:\ORACLE\PRODUCT\10.2.0\ADMIN\10gDB\ADUMP
audit_trail
DB

The audit is being done to the DB (Mostly DB audit would be done)
Now lets get some details from sys.aud$ table which is the table used for storing audit related things.

SQL> SELECT userid, action#, STATEMENT, OBJ$NAME, To_Char (timestamp#, 'mm/dd/yyyy hh24:mi:ss') FROM sys.aud$ ORDER BY timestamp# asc;

Output would look like this

Timeline from the database audit:

USERID
ACTION#
STATEMENT
OBJ$NAME
TIMESTAMP
SCOTT
101
1

04/30/2006 09:11:36
SCOTT
3
2
X$NLS_PARAMETERS
04/30/2006 09:29:07
SCOTT
3
2
GV$NLS_PARAMETERS
04/30/2006 09:29:07
SCOTT
3
2
V$NLS_PARAMETERS
04/30/2006 09:29:07
SCOTT
3
2
NLS_SESSION_PARAMETERS
04/30/2006 09:29:07
SCOTT
3
5
DUAL
04/30/2006 09:29:07
SCOTT
100
1

04/30/2006 09:29:41
SCOTT
3
22
OBJ$
04/30/2006 09:31:07
SCOTT
3
22
USER_OBJECTS
04/30/2006 09:31:07
SCOTT
3
28
EMP
04/30/2006 09:32:01
SCOTT
3
31
EMP
04/30/2006 09:32:20
SCOTT
7
37
EMP
04/30/2006 09:33:28
SCOTT
3
46
EMP
04/30/2006 09:35:24
SCOTT
7
52
EMP
04/30/2006 09:37:04
SCOTT
7
55
EMP
04/30/2006 09:37:13
SCOTT
3
61
EMP
04/30/2006 09:37:28


Now we have to find which action# refers to what operation. To find this issue the command below.

SQL> SELECT * FROM AUDIT_ACTIONS;

Now you can see action# 7 denotes delete operation.

Delete operation was performed around 9 37. So we want to flashback to before then so have to get the recorded timestamp. Oracle does not actually record a full timeline. Only takes the time every 5 minutes with the relevant SCN.
So now we will find the scn according to the timestamp from the below command which would be stored for maximum 5 days rolling figure.

SQL> SELECT To_Char(TIME_DP, 'dd/mm/yyyy hh24:mi:ss'), SCN_BAS FROM SYS.SMON_SCN_TIME order by scn_bas;

Now you can retrieve the deleted row by flashback the table to before the timestamp where the data is deleted.

SQL> CREATE TABLE EMPCOPY AS SELECT * FROM SCOTT.EMP AS OF TIMESTAMP (TO_TIMESTAMP('30/04/2006 09:31:10','DD-MM-YYYY:HH24:MI:SS'));

where 09:31:10 is the predecessor scn timestamp before the data deletion time of 09:37:13.

Now you can find the deleted row back in the EMPCOPY table.

Let's now find out did SCOTT do this deletion activity or any other malicious user did this on purpose to get SCOTT into trouble.

We can use the same sys.aud$ table to find this too..

SQL> SELECT userid, USERHOST, TERMINAL, SPARE1, action#, STATEMENT, OBJ$NAME, To_Char (timestamp#, 'mm/dd/yyyy hh24:mi:ss') FROM sys.aud$ ORDER BY timestamp# asc;

This additional data shows that SCOTT was coming from a different workstation from normal additionally the SPARE1 column shows that the Windows username was in fact SELVA and not SCOTT.

So take actions accordingly.

Thursday, June 9, 2011

Ora - 00600 [13013], [5001]

I experience an Ora-00600 error. Let's look what is the error and the way to resolve it.

Error received
--------------

ORA-00600: internal error code, arguments: [13013], [5001], [4722], [29361923], [9], [29361923], [17], []

Let's look into details of how to solve this issue.

Section 1> What information needs to be Collected
ORA-600 [13013] [a] [b] [c] [d] [e] [f]

This format relates to Oracle Server 8.0.3 to 10.1

Arg [a] Passcount
Arg [b] Data Object number
Arg [c] Tablespace Relative DBA of block containing the row to be updated
Arg [d] Row Slot number
Arg [e] Relative DBA of block being updated (should be same as [c])
Arg [f] Code

The Second argument would give the information about the data object id.
This would give information about the object involved.

SQL> Select object_name,object_type,owner from dba_objects where data_object_id=<value reported in argment b>;

Once the Object is identified run the following :
The below command check if the table has corruption or not .

SQL> Analzye table <owner>.<table name> validate structure online;

If this goes fine table doesnot have corruption. For next command.
If the above command fails with ORA-1498 go to Section 3

The below command check if table/index has corruption or not

SQL> Analzye table <owner>.<table name> validate structure cascade online;

If the above command errors out with ora-1499 it indicates a corruption in index.
Go to section 2 for resolution

Run dbverify on the datafile reported in the error
Arg [c] in the ora-0600[13011] would give the Relative DBA

For example
ORA-00600: internal error code, arguments: [13013], [5001], [57353], [155254965], [261], [151062605], [17], []

Arg [c] --> rdba-->155254965
Use this value and find the file and block number for this dba(data block address)

SQL> select dbms_utility.data_block_address_file(155254965) Rfile# ,dbms_utility.data_block_address_block(155254965) "Block#" from dual;

RFILE# Block#
---------- ----------
37 65717

You can run dbveirfy on datafile with rfile#=37

SQL> Select name from v$datafile where rfile#=37;

$ dbv file=<location of datafile> blocksize=<db_block_size>

Section 2 > How to resolve if an Index is corrupted.

You would need to drop and recreate the index
Ensure before dropping the Index

SQL> Spool /tmp/createindex.sql
SQL> Set long 100000000
SQL> Select dbms_metadata.get_ddl('INDEX','<Index name>',<'user name>') from dual;
SQL> Spool off

To Identify the index please do the following

Analyze table with "validate structure cascade" fails with ORA-1499 and the trace file contains message "Table/Index row count mismatch". Example:

SQL> analyze table test validate structure cascade;
analyze table test validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
 

The associated trace file contains:
Table/Index row count mismatch
table 6559 : index 10000, 0
Index root = tsn: 6 rdba: 0x01400091

It means: A table scan returned 6559 rows and an index scan returned 10000 rows.
"Index root" is the segment header information for the index:
rdba: 0x01400091 is the Index segment header relative data block address. It is decimal 20971665 which is Rfile#=5 Block#=145 :

SQL> select dbms_utility.data_block_address_file(20971665)  "Rfile#"  
2          ,dbms_utility.data_block_address_block(20971665) "Block#"  
from dual;  

Rfile#     Block#  
---------- ----------  
5          145  

Running the next query can identify the associated index:
QUERY 1: 

SQL> select owner, segment_name, segment_type 
2    from  dba_segments 
3    where header_file = 5 
4      and header_block = 145; 

OWNER    SEGMENT_NAME    SEGMENT_TYPE 
-------- --------------- ------------------ 
SCOTT    I_TEST          INDEX 


Section 3> How to resolve if table is corrupted

Option a> Backup is available

Ora-1498 would be reported on the table.

The trace file from Ora-1498 would contain following information

Example

Block Checking: DBA = 1066265208, Block Type = KTB-managed data block --->
file 254,block 911992
data header at 0xc00000010118e07c
kdbchk: avsp(816) > tosp(812)
Block header dump: 0x3f8dea78
Object id on Block? Y
seg/obj: 0x155b452 csc: 0x05.7b2b4ee6 itc: 3 flg: E typ: 1 - DATA
brn: 0 bdba: 0x3f8dde0c ver: 0x01
inc: 0 exflg: 0

Note the DBA value reported in the trace file DBA = 1066265208
Convert this to find the file number and block number having issue

Sql> Select dbms_utility.data_block_address_file('1066265208') from dual;

DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE('1066265208')
--------------------------------------------------
254

Sql> Select dbms_utility.data_block_address_block('1066265208') from dual;

DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK('1066265208')
---------------------------------------------------
911992


Run dbverify on the datafile containing the table

$ dbv file=<location of datafile> blocksize=<db_block_size>

Corruption would be reported on the block.

If you have an Rman backup do a Rman block recovery (Take the file number and block number affected from the trace file)

Rman> Blockrecover datafile <no> block <block number>;

Or

If you have a user managed backup you can do an restore and recovery from good copy of the problematic datafile
Option b> Backup is not available

If no backups are available then use event 10231 at session level to create a salvage table

SQL> Alter session set events '10231 trace name context forever, level 10';
SQL> Create table <owner>.salvage_table as select * from <Corrupted table>;

Rename the Original table to old.

Rename salvage table to Original table name

How to resolve when Smon terminates the instance due to Ora-00600[13013]

If Smon is terminating the instance then.

Set event 10513 and startup the database
event="10513 trace name context forever, level 2"

SQL> Startup mount;
SQL> Show parameter event
SQL> Alter datatabase open;

Identify the object involved using information from Section 1.

How to resolve the issue if the object involved belongs to system tablespace

System objects are very important.
Please open a Service request with Oracle support if system tables are involved.


Reference: MOS note id - 816784.1

Friday, June 3, 2011

Manual Oracle Database Creation

This document briefs about the step by step procedure for creating a database on unix environment.

Before starting, it is best to have the relevant environment variables set (ORACLE_SID, ORACLE_HOME, PATH, etc...). Best to have this setup in a .profile file.
Also create all the directory structures required.
Step 1: Make an init<SID>.ora in your $ORACLE_HOME/dbs directory. To simplify, you can copy init.ora to init<SID>.ora and modify the file.
            Path names, file names, and sizes will need to be modified
            Example parameter settings:

            db_block_size=8192
            db_cache_size=52428800
            background_dump_dest=/u01/32bit/app/oracle/product/9.0.1/bdump
            core_dump_dest=/u01/32bit/app/oracle/product/9.0.1/cdump
            user_dump_dest=/u01/32bit/app/oracle/product/9.0.1/udump
            timed_statistics=TRUE
            control_files=("/u01/rbdb1/control_01.ctl", "/u01/rbdb1/control_02.ctl")
            db_name=rbdb1
            shared_pool_size=52428800
            sort_area_size=524288
            undo_management=AUTO
            undo_tablespace=UNDOTBS

Step 2: Run the following sqlplus command to connect to the database:

            ]$ sqlplus '/ as sysdba'

Step 3: Startup up the database in NOMOUNT mode:

            SQL> startup nomount

Step 4: Create the Database by issuing the following in the SQL prompt:
            Path names, file names, and sizes will need to be modified

            CREATE DATABASE <db_name>
            MAXLOGFILES 255
            MAXINSTANCES 1
            MAXDATAFILES 256
            MAXLOGHISTORY 256
            DATAFILE '/u01/oracle/rbdb1/system_01.dbf' SIZE 400M REUSE
            SYSAUX DATAFILE '/u01/oracle/rbdb1/sysaux_01.dbf' SIZE 500M REUSE
UNDO TABLESPACE "UNDOTBS" DATAFILE '/u01/oracle/rbdb1/undotbs_01.dbf'
SIZE 200M REUSE AUTOEXTEND ON NEXT 5120K MAXSIZE UNLIMITED
            CHARACTER SET US7ASCII
            LOGFILE GROUP 1 ('/u01/oracle/rbdb1/redo1_01.dbf') SIZE 100M REUSE,
            GROUP 2 ('/u01/oracle/rbdb1/redo1_02.dbf') SIZE 100M REUSE;

Sysaux data file required for database version 10g and above.

Step 5: Create a Temporary Tablespace :
            Path names, file names, and sizes will need to be modified

CREATE TEMPORARY TABLESPACE "TEMP" TEMPFILE '/u01/oracle/rbdb1/temp_01.dbf'
            SIZE 40M REUSE AUTOEXTEND ON NEXT  640K MAXSIZE UNLIMITED
            EXTENT MANAGEMENT LOCAL;
            ALTER DATABASE DEFAULT TEMPORARY TABLESPACE "TEMP";

Step 6: Run the following scripts as sys to create data dictionary views and tables.
@?/rdbms/admin/catalog.sql;
@?/rdbms/admin/catblock.sql;
@?/rdbms/admin/catproc.sql;
@?/rdbms/admin/catoctk.sql;

Step 7: Run the following scripts as system to create product user profile and command line help.
@?/sqlplus/admin/pupbld.sql;
@?/sqlplus/admin/help/hlpbld.sql helpus.sql;

Step 8: Put the database in archivelog mode.

SQL> alter database archivelog;

Step 9: Reset the sys and system password and create password file if required.
           
SQL> ALTER USER SYS IDENTIFIED BY <password>;
            SQL> ALTER USER SYSTEM IDENTIFIED BY <password>;

Step 10: Backup created database

Step 11: Create, configure and start the listener.

Step 12: Schedule the required backup and monitoring scripts in cron