Sunday, May 3, 2009

Oracle Data Pump: the main difference from conventional export/import, from which all the other differences follow

Data Pump runs as a job inside the database, rather than as a stand-alone client application. This means that jobs are somewhat independent of the process that started the export or import.

11g new view - V$SQL_HINT

If you want to know in which Oracle version a particular SQL hint was
introduced, or in which versions it applies, you can query V$SQL_HINT,
a view introduced in 11g. It even holds historical information. The
VERSION column gives the Oracle version in which a particular hint was
introduced, and VERSION_OUTLINE probably gives the version up to which
it is applicable.



The view even has a column, INVERSE, that gives the inverse of a hint.

Note that this view is undocumented.



SQL> desc v$sql_hint

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 NAME                                               VARCHAR2(64)
 SQL_FEATURE                                        VARCHAR2(64)
 CLASS                                              VARCHAR2(64)
 INVERSE                                            VARCHAR2(64)
 TARGET_LEVEL                                       NUMBER
 PROPERTY                                           NUMBER
 VERSION                                            VARCHAR2(25)
 VERSION_OUTLINE                                    VARCHAR2(25)

Does Oracle Data Pump use direct path load?







Yes. This is one of the features that make impdp and expdp faster than conventional export and import. Data Pump uses direct path only when certain conditions are met. Alternatively, it can use the external table method, in which the data is unloaded to flat files on the file system of the database server, and those flat files can then be used as a simple data source in a SELECT statement.


EXPDP will use DIRECT_PATH mode if:



The structure of a table allows a Direct Path unload, i.e.:

     - The table does not have fine-grained access control enabled for SELECT.

     - The table is not a queue table.

     - The table does not contain one or more columns of type BFILE or opaque, or an object type containing opaque columns.

     - The table does not contain encrypted columns.

     - The table does not contain a column of an evolved type that needs upgrading.

     - If the table has a column of datatype LONG or LONG RAW, then this column is the last column.



The parameters QUERY, SAMPLE, and REMAP_DATA were not used for the specified table in the Export Data Pump job.



The table or partition is relatively small (up to 250 MB), or the table or partition is larger, but the job cannot run in parallel because the parameter PARALLEL was not specified (or was set to 1).
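The export rules above can be written down as a small decision function. This is only a sketch of the conditions as listed in this post, not Oracle's actual implementation. The class, the field names, and the function name are hypothetical, made up for the illustration; only the conditions and the 250 MB threshold come from the text above.

```python
from dataclasses import dataclass


# Hypothetical description of a table. The field names are illustrative
# and do not correspond to any Oracle dictionary view.
@dataclass
class ExportTable:
    size_mb: float
    fgac_select: bool = False              # fine-grained access control for SELECT
    is_queue_table: bool = False
    has_bfile_or_opaque: bool = False      # BFILE or opaque columns
    has_encrypted_columns: bool = False
    has_evolved_type_needing_upgrade: bool = False
    long_column_not_last: bool = False     # LONG / LONG RAW column that is not last


def export_access_method(table, parallel=1, uses_query=False,
                         uses_sample=False, uses_remap_data=False):
    """Return 'DIRECT_PATH' or 'EXTERNAL_TABLE' following the rules listed above."""
    structure_blocks_direct_path = any([
        table.fgac_select,
        table.is_queue_table,
        table.has_bfile_or_opaque,
        table.has_encrypted_columns,
        table.has_evolved_type_needing_upgrade,
        table.long_column_not_last,
    ])
    if structure_blocks_direct_path:
        return "EXTERNAL_TABLE"
    # QUERY, SAMPLE and REMAP_DATA rule out direct path for the table.
    if uses_query or uses_sample or uses_remap_data:
        return "EXTERNAL_TABLE"
    # A large table or partition moves to external tables only when the
    # job can actually run in parallel.
    if table.size_mb > 250 and parallel > 1:
        return "EXTERNAL_TABLE"
    return "DIRECT_PATH"
```

For example, a 1000 MB table exported without PARALLEL still comes out as DIRECT_PATH in this sketch, while the same table with parallel=4 comes out as EXTERNAL_TABLE.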


IMPDP will use DIRECT_PATH if:



The structure of a table allows a Direct Path load, i.e.:

     - A global index does not exist on a multipartition table during a single-partition load. This includes object tables that are partitioned.

     - A domain index does not exist for a LOB column.

     - The table is not in a cluster.

     - The table does not have BFILE columns or columns of opaque types.

     - The table does not have VARRAY columns with an embedded opaque type.

     - The table does not have encrypted columns.

     - Supplemental logging is not enabled and the table does not have a LOB column.

     - The table into which data is being imported is a pre-existing table and:

        – There is not an active trigger, and:

        – The table is partitioned and has an index, and:

        – Fine-grained access control for INSERT mode is not enabled, and:

        – A constraint other than table check does not exist, and:

        – A unique index does not exist.



The parameters QUERY and REMAP_DATA were not used for the specified table in the Import Data Pump job.



The table or partition is relatively small (up to 250 MB), or the table or partition is larger, but the job cannot run in parallel because the parameter PARALLEL was not specified (or was set to 1).


How to enforce a specific load/unload method?


In very specific situations, the undocumented parameter ACCESS_METHOD can be used to enforce a specific method to unload or load the data. Example:


%expdp system/manager ... ACCESS_METHOD=DIRECT_PATH 

%expdp system/manager ... ACCESS_METHOD=EXTERNAL_TABLE 



or:



%impdp system/manager ... ACCESS_METHOD=DIRECT_PATH 

%impdp system/manager ... ACCESS_METHOD=EXTERNAL_TABLE 


Important points to know when the parameter ACCESS_METHOD is specified for a job:



  • The parameter ACCESS_METHOD is an undocumented parameter and should only be used when requested by Oracle Support.

  • If the parameter is not specified, then Data Pump will automatically choose the best method to load or unload the data.

  • If import Data Pump cannot choose due to conflicting restrictions, an error will be reported:

    ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."EMP" using client specified AUTOMATIC method

  • The parameter can only be specified when the Data Pump job is initially started (i.e. the parameter cannot be specified when the job is restarted).

  • If the parameter is specified, the method of loading or unloading the data is enforced on all tables that need to be loaded or unloaded with the job.

  • Enforcing a specific method may result in a slower performance of the overall Data Pump job, or errors such as:


...

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."MY_TAB" using client specified DIRECT_PATH method

...



  • To determine which access method is used, a Worker trace file can be created, e.g.:


%expdp system/manager DIRECTORY=my_dir \
DUMPFILE=expdp_s.dmp LOGFILE=expdp_s.log \
TABLES=scott.my_tab TRACE=400300


The Worker trace file shows the method with which the data was unloaded (or loaded, for Import Data Pump):


...

KUPW:14:57:14.289: 1: object: TABLE_DATA:"SCOTT"."MY_TAB"

KUPW:14:57:14.289: 1: TABLE_DATA:"SCOTT"."MY_TAB" external table, parallel: 1

...
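If the trace file is long, the access method lines can be picked out with a short script. The sketch below assumes that the relevant Worker trace lines look like the excerpt above (a KUPW prefix, the quoted owner and table name, the method text, then "parallel: n"). The function name and the regular expression are my own for this illustration, and the trace format may differ between versions.

```python
import re

# Assumes lines shaped like:
# KUPW:14:57:14.289: 1: TABLE_DATA:"SCOTT"."MY_TAB" external table, parallel: 1
METHOD_LINE = re.compile(
    r'^KUPW:[\d:.]+:\s*\d+:\s*TABLE_DATA:"(?P<owner>[^"]+)"\."(?P<table>[^"]+)"\s+'
    r'(?P<method>[^,]+),\s*parallel:\s*(?P<parallel>\d+)'
)


def access_methods(trace_lines):
    """Map 'OWNER.TABLE' to (method text, parallel degree) for matching lines."""
    found = {}
    for line in trace_lines:
        match = METHOD_LINE.match(line.strip())
        if match:
            key = f'{match["owner"]}.{match["table"]}'
            found[key] = (match["method"].strip(), int(match["parallel"]))
    return found


sample = [
    'KUPW:14:57:14.289: 1: object: TABLE_DATA:"SCOTT"."MY_TAB"',
    'KUPW:14:57:14.289: 1: TABLE_DATA:"SCOTT"."MY_TAB" external table, parallel: 1',
]
print(access_methods(sample))
```

The method text is captured as it appears in the trace, so the script makes no assumption about the exact wording used for each access method.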


EXPDP will use EXTERNAL_TABLE mode if:



Data cannot be unloaded in Direct Path mode, because of the structure of the table, i.e.: 

     - Fine-grained access control for SELECT is enabled for the table.

     - The table is a queue table.

     - The table contains one or more columns of type BFILE or opaque, or an object type containing opaque columns.

     - The table contains encrypted columns.

     - The table contains a column of an evolved type that needs upgrading.

     - The table contains a column of type LONG or LONG RAW that is not last.



Data could also have been unloaded in "Direct Path" mode, but the parameters QUERY, SAMPLE, or REMAP_DATA were used for the specified table in the Export Data Pump job.



Data could also have been unloaded in "Direct Path" mode, but the table or partition is relatively large (> 250 MB) and parallel SQL can be used to speed up the unload even more.


IMPDP will use EXTERNAL_TABLE if:



Data cannot be loaded in Direct Path mode, because at least one of the following conditions exists:

     - A global index on multipartition tables exists during a single-partition load. This includes object tables that are partitioned.

     - A domain index exists for a LOB column.

     - A table is in a cluster.

     - A table has BFILE columns or columns of opaque types.

     - A table has VARRAY columns with an embedded opaque type.

     - The table has encrypted columns.

     - Supplemental logging is enabled and the table has at least one LOB column.

     - The table into which data is being imported is a pre-existing table and at least one of the following conditions exists:

        – There is an active trigger

        – The table is partitioned and does not have any indexes

        – Fine-grained access control for INSERT mode is enabled for the table.

        – An enabled constraint exists (other than table check constraints)

        – A unique index exists



Data could also have been loaded in "Direct Path" mode, but the parameters QUERY or REMAP_DATA were used for the specified table in the Import Data Pump job.



Data could also have been loaded in "Direct Path" mode, but the table or partition is relatively large (> 250 MB) and parallel SQL can be used to speed up the load even more.
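The structural conditions for import can be collected into a small checker that reports why a table would fall back to external tables. This is a sketch of the list in this post only, not of Oracle's internal logic. The class, its field names, and the function name are hypothetical, and the QUERY, REMAP_DATA and size rules are deliberately left out.

```python
from dataclasses import dataclass


# Hypothetical description of an import target. Field names are illustrative only.
@dataclass
class ImportTarget:
    preexisting: bool = False
    single_partition_load_with_global_index: bool = False
    domain_index_on_lob: bool = False
    in_cluster: bool = False
    has_bfile_or_opaque: bool = False
    varray_with_opaque: bool = False
    has_encrypted_columns: bool = False
    supplemental_logging: bool = False
    has_lob_column: bool = False
    active_trigger: bool = False
    is_partitioned: bool = False
    has_index: bool = False
    fgac_insert: bool = False
    enabled_non_check_constraint: bool = False
    unique_index: bool = False


def external_table_reasons(t):
    """Return the structural reasons (from the list above) that rule out direct path."""
    checks = [
        (t.single_partition_load_with_global_index,
         "global index during single-partition load"),
        (t.domain_index_on_lob, "domain index on a LOB column"),
        (t.in_cluster, "table is in a cluster"),
        (t.has_bfile_or_opaque, "BFILE or opaque columns"),
        (t.varray_with_opaque, "VARRAY with embedded opaque type"),
        (t.has_encrypted_columns, "encrypted columns"),
        (t.supplemental_logging and t.has_lob_column,
         "supplemental logging with a LOB column"),
    ]
    if t.preexisting:
        # These conditions only matter when the table already exists.
        checks += [
            (t.active_trigger, "active trigger on pre-existing table"),
            (t.is_partitioned and not t.has_index,
             "pre-existing partitioned table without indexes"),
            (t.fgac_insert, "fine-grained access control for INSERT"),
            (t.enabled_non_check_constraint, "enabled non-check constraint"),
            (t.unique_index, "unique index on pre-existing table"),
        ]
    return [reason for hit, reason in checks if hit]
```

An empty list means that nothing in the table structure stops direct path; any entry in the list means the load would go through external tables.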

Conventional Path Load and Direct Path Load : Simple to use in complex situations

One of my seminar participants asked me what actually happens at the Oracle data block level during a direct path load, compared with a conventional path load. I was eager to know myself when I first got this question, and I told the participant that I would get to the root of it. Here is my analysis.

A conventional path load executes SQL INSERT statements to populate tables in an Oracle database. A direct path load eliminates much of the Oracle database overhead by formatting Oracle data blocks and writing the data blocks directly to the database files.

A direct load does not compete with other users for database resources, so it can usually load data at near disk speed.

To start SQL*Loader in direct path load mode, set the DIRECT parameter to true on the command line or in the parameter file, if used, in the format:


DIRECT=true


By default, the loading method is Conventional.


The script that creates the views used by direct path is catldr.sql (in ORACLE_HOME\rdbms\admin).


(This script is run by catalog.sql by default.)

Conventional Path vs. Direct Path

Conventional Path: SQL*Loader uses the SQL INSERT statement and a bind array buffer to load data.
Direct Path: SQL*Loader passes the data to the load engine of the database, which creates a column array structure.

Conventional Path: Makes use of the database buffer cache and may increase contention for resources among other users.
Direct Path: Avoids the buffer cache and writes directly to disk. Can make use of asynchronous I/O if it is available and supported on the OS.

Conventional Path: Slower, since the SQL INSERT statements have to be generated, passed to Oracle, and executed.
Direct Path: Faster, since the load engine converts the column array structure directly into Oracle data blocks and adds them to the table's existing segment.

Conventional Path: While loading the data, searches for blocks with enough free space into which the rows can be inserted.
Direct Path: Does not search the existing blocks. New blocks are formatted and added to the existing segment.

Conventional Path: Does not lock the table being loaded into.
Direct Path: Locks the table in exclusive mode, so it should not be used if concurrent access to the table is required during the load. The 10046 trace output shows: ‘Lock table <table_name> exclusive mode nowait;’

Conventional Path: Can be used to load into clustered tables.
Direct Path: Cannot be used to load into a cluster.

Conventional Path: Check constraints are enabled during the load. Records not satisfying the constraint are rejected and written into the BAD file.
Direct Path: The constraints are disabled during the load. It explicitly executes an ‘alter table <table_name> disable constraint <constraint_name>’ statement before loading into the table.

Conventional Path: Can be used to load into VARRAYs.
Direct Path: Cannot be used to load into VARRAYs.

Conventional Path: Can be used to load into BFILE columns.
Direct Path: Cannot be used to load into BFILE columns.

Conventional Path: Can be used to load into a single partition of a table that has global indexes.
Direct Path: Cannot be used to load into a particular partition of a table if the table has a global index defined on it.

Conventional Path: Cannot be used for loading data in parallel, but you can run multiple load sessions concurrently inserting into the same table.
Direct Path: Parallel loading of data is possible.

Conventional Path: Automatically inserts default values for the columns, if any.
Direct Path: The default value specified for the column is not inserted. If it is a ‘null’ column, it inserts a null.

Indexes

Conventional Path: A unique index on the table is in a valid state after the load. The uniqueness of the data for the index column is maintained. Records violating the uniqueness are rejected and written into the BAD file.
Direct Path: The uniqueness of the data is not validated. The unique index is in an ‘UNUSABLE’ state at the end of the load.

Conventional Path: If the table has any indexes, the corresponding keys are added to the index for each new row inserted into the table.
Direct Path: After each block is formatted, the new index keys are put in a sort (temporary) segment. The old index and the new keys are merged at load finish time to create the new index.

Conventional Path: The index does not require a rebuild at the end of the load, and no extra storage is required. But since the index is updated for each new row, the processing time increases.
Direct Path: The index needs to be rebuilt at the end of the load. The old index, the new index and the sort segment all require storage space until the indexes are merged.

Loading into Objects

Conventional Path: If the type has a user-defined constructor matching the arguments of the attribute-value constructor, conventional path calls the user-defined constructor.
Direct Path: Direct path calls the attribute-value constructor.

Conventional Path: If the type has a user-defined constructor not matching the arguments of the attribute-value constructor, you can invoke the user-defined constructor using a SQL expression.
Direct Path: It is not possible to invoke the user-defined constructor in direct path loading.

Conventional Path: Can be used to load into parent and child tables at the same time.
Direct Path: Cannot be used to load into parent and child tables at the same time.
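To picture the block-level difference the participant asked about, here is a toy model in Python. It is not how Oracle is implemented: a segment is simply a list of blocks, a block is a list of rows, and BLOCK_CAPACITY is an arbitrary number chosen for the illustration. It only mirrors the two statements in the comparison above: conventional path searches existing blocks for free space, while direct path formats new blocks and adds them to the segment.

```python
BLOCK_CAPACITY = 4  # rows per block; an arbitrary number for this toy model


def conventional_load(segment, rows):
    """Insert rows one by one, reusing free space in existing blocks first."""
    segment = [list(block) for block in segment]  # work on a copy
    for row in rows:
        for block in segment:
            if len(block) < BLOCK_CAPACITY:
                block.append(row)
                break
        else:
            # No existing block had room, so a new block is allocated.
            segment.append([row])
    return segment


def direct_path_load(segment, rows):
    """Format new blocks from the incoming rows and append them to the segment."""
    new_blocks = [rows[i:i + BLOCK_CAPACITY]
                  for i in range(0, len(rows), BLOCK_CAPACITY)]
    # Existing blocks are never searched or touched.
    return [list(block) for block in segment] + new_blocks


existing = [["a", "b"], ["c"]]
incoming = ["r1", "r2", "r3", "r4", "r5"]

print(conventional_load(existing, incoming))
print(direct_path_load(existing, incoming))
```

In this toy run the conventional load fills the free space in the two existing blocks and ends with two blocks, while the direct path load leaves the half-empty blocks alone and ends with four.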