Sunday, May 3, 2009

In Oracle Data Pump, the main difference, from which all the other differences with conventional export and import follow, is this:

Data Pump runs as a job inside the database, rather than as a stand-alone client application. This means that jobs are somewhat independent of the process that started the export or import.

11g new view - V$SQL_HINT

If you want to know in which Oracle version a particular SQL hint was introduced or is applicable, you can query V$SQL_HINT, introduced in 11g. It even holds historical information: the VERSION column gives the Oracle version in which a particular hint was introduced, and VERSION_OUTLINE probably gives the version up to which it is applicable.



The view even has a column (INVERSE) that gives you the inverse of a hint.



Note that this is an undocumented view.



SQL> desc v$sql_hint

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 NAME                                               VARCHAR2(64)
 SQL_FEATURE                                        VARCHAR2(64)
 CLASS                                              VARCHAR2(64)
 INVERSE                                            VARCHAR2(64)
 TARGET_LEVEL                                       NUMBER
 PROPERTY                                           NUMBER
 VERSION                                            VARCHAR2(25)
 VERSION_OUTLINE                                    VARCHAR2(25)
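
For example, a quick illustrative query against the columns shown above (the LIKE filter is only an example) lists when the parallel-related hints appeared and what their inverses are:

SELECT name, inverse, version, version_outline
  FROM v$sql_hint
 WHERE name LIKE '%PARALLEL%'
 ORDER BY name;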

Does Oracle Data Pump use direct path load?


Yes. This is one of the features that makes expdp and impdp faster than conventional export and import. To use direct path loading through Oracle Data Pump, certain conditions must be met. Alternatively, Data Pump can use the external table method, in which the data is unloaded to files on the file system of the database server, after which those files can be used as a simple data source in SELECT statements.
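
As a rough illustration of the external table idea (not the exact statements Data Pump runs internally), the ORACLE_DATAPUMP access driver can unload a query result to a file in a directory object and read it back with plain SELECTs. The directory object dpump_dir1 and the table names are only placeholders:

-- Unload: writes emp_unload.dmp to the directory pointed to by dpump_dir1
CREATE TABLE emp_unload
  ORGANIZATION EXTERNAL (
    TYPE ORACLE_DATAPUMP
    DEFAULT DIRECTORY dpump_dir1
    LOCATION ('emp_unload.dmp'))
  AS SELECT * FROM emp;

-- The unloaded data can now be queried like any other table
SELECT * FROM emp_unload;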


EXPDP will use DIRECT_PATH mode if:



The structure of a table allows a Direct Path unload, i.e.:

     - The table does not have fine-grained access control enabled for SELECT.

     - The table is not a queue table.

     - The table does not contain one or more columns of type BFILE or opaque, or an object type containing opaque columns.

     - The table does not contain encrypted columns.

     - The table does not contain a column of an evolved type that needs upgrading.

     - If the table has a column of datatype LONG or LONG RAW, then this column is the last column.



None of the parameters QUERY, SAMPLE, or REMAP_DATA was used for the specified table in the Export Data Pump job.



The table or partition is relatively small (up to 250 Mb), or the table or partition is larger, but the job cannot run in parallel because the parameter PARALLEL was not specified (or was set to 1).


IMPDP will use DIRECT_PATH if:



The structure of a table allows a Direct Path load, i.e.:

     - A global index does not exist on a multipartition table during a single-partition load. This includes object tables that are partitioned.

     - A domain index does not exist for a LOB column.

     - The table is not in a cluster.

     - The table does not have BFILE columns or columns of opaque types.

     - The table does not have VARRAY columns with an embedded opaque type.

     - The table does not have encrypted columns.

     - Supplemental logging is not enabled and the table does not have a LOB column.

     - The table into which data is being imported is a pre-existing table and:

        – There is not an active trigger, and:

        – The table is partitioned and has an index, and:

        – Fine-grained access control for INSERT mode is not enabled, and:

        – A constraint other than table check does not exist, and:

        – A unique index does not exist.



Neither the QUERY nor the REMAP_DATA parameter was used for the specified table in the Import Data Pump job.



The table or partition is relatively small (up to 250 Mb), or the table or partition is larger, but the job cannot run in parallel because the parameter PARALLEL was not specified (or was set to 1).


How to enforce a specific load/unload method?


In very specific situations, the undocumented parameter ACCESS_METHOD can be used to enforce a specific method to unload or load the data. Example:


%expdp system/manager ... ACCESS_METHOD=DIRECT_PATH 

%expdp system/manager ... ACCESS_METHOD=EXTERNAL_TABLE 



or:



%impdp system/manager ... ACCESS_METHOD=DIRECT_PATH 

%impdp system/manager ... ACCESS_METHOD=EXTERNAL_TABLE 


Important Need-To-Know's when the parameter ACCESS_METHOD is specified for a job:



  • The parameter ACCESS_METHOD is an undocumented parameter and should only be used when requested by Oracle Support.

  • If the parameter is not specified, then Data Pump will automatically choose the best method to load or unload the data.

  • If import Data Pump cannot choose due to conflicting restrictions, an error will be reported:

    ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."EMP" using client specified AUTOMATIC method

  • The parameter can only be specified when the Data Pump job is initially started (i.e. the parameter cannot be specified when the job is restarted).

  • If the parameter is specified, the method of loading or unloading the data is enforced on all tables that need to be loaded or unloaded with the job.

  • Enforcing a specific method may result in a slower performance of the overall Data Pump job, or errors such as:


...

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA

ORA-31696: unable to export/import TABLE_DATA:"SCOTT"."MY_TAB" using client specified DIRECT_PATH method

...



  • To determine which access method is used, a Worker trace file can be created, e.g.:


%expdp system/manager DIRECTORY=my_dir \

DUMPFILE=expdp_s.dmp LOGFILE=expdp_s.log \

TABLES=scott.my_tab TRACE=400300


The Worker trace file shows the method with which the data was loaded (or unloaded for Import Data Pump):


...

KUPW:14:57:14.289: 1: object: TABLE_DATA:"SCOTT"."MY_TAB"

KUPW:14:57:14.289: 1: TABLE_DATA:"SCOTT"."MY_TAB" external table, parallel: 1

...


EXPDP will use EXTERNAL_TABLE mode if:



Data cannot be unloaded in Direct Path mode, because of the structure of the table, i.e.: 

     - Fine-grained access control for SELECT is enabled for the table.

     - The table is a queue table.

     - The table contains one or more columns of type BFILE or opaque, or an object type containing opaque columns.

     - The table contains encrypted columns.

     - The table contains a column of an evolved type that needs upgrading.

     - The table contains a column of type LONG or LONG RAW that is not last.



Data could also have been unloaded in "Direct Path" mode, but the parameters QUERY, SAMPLE, or REMAP_DATA were used for the specified table in the Export Data Pump job.



Data could also have been unloaded in "Direct Path" mode, but the table or partition is relatively large (> 250 Mb) and parallel SQL can be used to speed up the unload even more.


IMPDP will use EXTERNAL_TABLE if:



Data cannot be loaded in Direct Path mode, because at least one of the following conditions exists:

     - A global index on multipartition tables exists during a single-partition load. This includes object tables that are partitioned.

     - A domain index exists for a LOB column.

     - A table is in a cluster.

     - A table has BFILE columns or columns of opaque types.

     - A table has VARRAY columns with an embedded opaque type.

     - The table has encrypted columns.

     - Supplemental logging is enabled and the table has at least one LOB column.

     - The table into which data is being imported is a pre-existing table and at least one of the following conditions exists:

        – There is an active trigger

        – The table is partitioned and does not have any indexes

        – Fine-grained access control for INSERT mode is enabled for the table.

        – An enabled constraint exists (other than table check constraints)

        – A unique index exists



Data could also have been loaded in "Direct Path" mode, but the QUERY or REMAP_DATA parameter was used for the specified table in the Import Data Pump job.



Data could also have been loaded in "Direct Path" mode, but the table or partition is relatively large (> 250 Mb) and parallel SQL can be used to speed up the load even more.

Conventional Path Load and Direct Path Load: Simple to use in complex situations

One of my seminar participants asked me what actually happens at the Oracle data block level during a direct path load compared with a conventional path load. I was eager to know the answer myself when I first got this question, and I told the participant that I would get to the root of it. Here is my analysis.

A conventional path load executes SQL INSERT statements to populate tables in an Oracle database. A direct path load eliminates much of the Oracle database overhead by formatting Oracle data blocks and writing the data blocks directly to the database files.

A direct load does not compete with other users for database resources, so it can usually load data at near disk speed.

To start SQL*Loader in direct path load mode, set the DIRECT parameter to true on the command line or in the parameter file, if used, in the format:


DIRECT=true


By default, the loading method is Conventional.
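
For example, a typical SQL*Loader invocation in direct path mode might look like this (the control file, log file, and credentials are only placeholders):

%sqlldr scott/tiger CONTROL=emp.ctl LOG=emp.log DIRECT=true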


The script that creates the views used by the direct path load is catldr.sql (found under ORACLE_HOME/rdbms/admin).


(This script is run by catalog.sql by default.)

Conventional Path vs. Direct Path

Conventional Path: SQL*Loader uses SQL INSERT statements and a bind array buffer to load the data.
Direct Path: SQL*Loader passes the data to the load engine of the database, which builds a column array structure.

Conventional Path: Makes use of the database buffer cache and may increase contention for resources among other users.
Direct Path: Bypasses the buffer cache and writes directly to disk. Can make use of asynchronous I/O if it is available/supported on the OS.

Conventional Path: Slower, since the SQL INSERT statements have to be generated, passed to Oracle, and executed.
Direct Path: Faster, since the load engine converts the column array structure directly into Oracle data blocks and adds them to the table's existing segment.

Conventional Path: While loading the data, searches for blocks with enough free space into which the rows can be inserted.
Direct Path: Does not search the existing blocks. New blocks are formatted and added to the existing segment.

Conventional Path: Does not lock the table being loaded into.
Direct Path: Locks the table in exclusive mode, so it should not be used if concurrent access to the table is required during the load. The 10046 trace output shows: 'Lock table <table_name> exclusive mode nowait;'

Conventional Path: Can be used to load into clustered tables.
Direct Path: Cannot be used to load into a cluster.

Conventional Path: Check constraints are enabled during the load. Records not satisfying a constraint are rejected and written to the BAD file.
Direct Path: The constraints are disabled during the load. SQL*Loader explicitly executes an 'alter table <table_name> disable constraint <constraint_name>' statement before loading into the table.

Conventional Path: Can be used to load into VARRAYs.
Direct Path: Cannot be used to load into VARRAYs.

Conventional Path: Can be used to load into BFILE columns.
Direct Path: Cannot be used to load into BFILE columns.

Conventional Path: Can be used to load into a single partition of a table that has global indexes.
Direct Path: Cannot be used to load into a particular partition of a table if the table has a global index defined on it.

Conventional Path: Cannot be used to load data in parallel, but you can have multiple load sessions concurrently inserting into the same table.
Direct Path: Parallel loading of data is possible.

Conventional Path: Automatically inserts the default values for columns, if any.
Direct Path: The default value specified for a column is not inserted; if the column allows nulls, a null is inserted.

Indexes

Conventional Path: A unique index on the table is in a valid state after the load. The uniqueness of the data for the index columns is maintained; records violating uniqueness are rejected and written to the BAD file.
Direct Path: The uniqueness of the data is not validated, and the unique index is left in an UNUSABLE state at the end of the load (a quick way to check this is shown after this comparison).

Conventional Path: If the table has any indexes, the corresponding keys are added to the index for each new row inserted into the table. The index does not require a rebuild at the end of the load and no extra storage is required, but since the index is updated for each new row, the processing time increases.
Direct Path: After each block is formatted, the new index keys are put in a sort (temporary) segment. The old index and the new keys are merged at load finish time to create the new index. The index needs to be rebuilt at the end of the load, and the old index, the new index, and the sort segment all require storage space until the indexes are merged.

Loading into Objects

Conventional Path: If the type has a user-defined constructor matching the arguments of the attribute-value constructor, conventional path calls the user-defined constructor.
Direct Path: Direct path calls the attribute-value constructor.

Conventional Path: If the type has a user-defined constructor that does not match the arguments of the attribute-value constructor, you can invoke the user-defined constructor using a SQL expression.
Direct Path: It is not possible to invoke the user-defined constructor in direct path loading.

Conventional Path: Can be used to load into a parent and a child table at the same time.
Direct Path: Cannot be used to load into a parent and a child table at the same time.
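
As noted in the comparison above, a unique index can be left UNUSABLE after a direct path load. A minimal sketch of how to check and repair this, assuming illustrative table and index names:

-- Check the state of the indexes after the load
SELECT index_name, status
  FROM user_indexes
 WHERE table_name = 'EMP';

-- Rebuild any index reported as UNUSABLE
ALTER INDEX emp_uk REBUILD;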

Sunday, April 26, 2009

Datapump Export handy examples: Analytical backup of your data

Here I want to show some handy examples of how to run Data Pump export.

When we take a conventional backup of our database, either with RMAN or with hot or cold backup methods, the output (backup) files are actually block-by-block copies of the source files. When these files are used to restore the data, the following two conditions must hold:

  • The platform must not change.
  • The restore process must be able to identify the Oracle blocks in the backup files.

Interestingly, a dump export can instead be used as a logical ("analytical") backup. That means it can be imported irrespective of platform and Oracle database version.

Let us look at some examples of how to perform an export from the database:

Performing a Table-Mode Export


Issue the following Data Pump export command to perform a table export of the tables employees and jobs from the human resources (hr) schema:
expdp hr/hr TABLES=employees,jobs DUMPFILE=dpump_dir1:table.dmp NOLOGFILE=y
Because user hr is exporting tables in his own schema, it is not necessary to specify the schema name for the tables. The NOLOGFILE=y parameter indicates that an Export log file of the operation will not be generated.

Data-Only Unload of Selected Tables and Rows

The following shows the contents of a parameter file (exp.par) that you could use to perform a data-only unload of all tables in the human resources (hr) schema except for the tables countries and regions. Rows in the employees table are unloaded only if they have a department_id other than 50, and the rows are ordered by employee_id.


DIRECTORY=dpump_dir1
DUMPFILE=dataonly.dmp
CONTENT=DATA_ONLY
EXCLUDE=TABLE:"IN ('COUNTRIES', 'REGIONS')"
QUERY=employees:"WHERE department_id !=50 ORDER BY employee_id"

You can issue the following command to execute the exp.par parameter file:

> expdp hr/hr PARFILE=exp.par

A schema-mode export (the default mode) is performed, but the CONTENT parameter effectively limits the export to an unload of just the tables' data. The DBA previously created the directory object dpump_dir1, which points to the directory on the server where user hr is authorized to read and write export dump files. The dump file dataonly.dmp is created in dpump_dir1.

Estimating Disk Space Needed in a Table-Mode Export

The following shows the use of the ESTIMATE_ONLY parameter to estimate the space that would be consumed in a table-mode export, without actually performing the export operation. Issue the following command to use the BLOCKS method to estimate the number of bytes required to export the data in three tables located in the human resources (hr) schema: employees, departments, and locations.

> expdp hr/hr DIRECTORY=dpump_dir1 ESTIMATE_ONLY=y TABLES=employees,
departments, locations LOGFILE=estimate.log

The estimate is printed in the log file and displayed on the client's standard output device. The estimate is for table row data only; it does not include metadata.

Performing a Schema-Mode Export

The following command performs a schema-mode export of the hr schema (schema mode is the default mode):

> expdp hr/hr DUMPFILE=dpump_dir1:expschema.dmp LOGFILE=dpump_dir1:expschema.log

Performing a Parallel Full Database Export


> expdp hr/hr FULL=y DUMPFILE=dpump_dir1:full1%U.dmp, dpump_dir2:full2%U.dmp
FILESIZE=2G PARALLEL=3 LOGFILE=dpump_dir1:expfull.log JOB_NAME=expfull

Because this is a full database export, all data and metadata in the database will be exported. Dump files full101.dmp, full201.dmp, full102.dmp, and so on will be created in a round-robin fashion in the directories pointed to by the dpump_dir1 and dpump_dir2 directory objects. For best performance, these should be on separate I/O channels. Each file will be up to 2 gigabytes in size, as necessary. Initially, up to three files will be created. More files will be created, if needed. The job and master table will have a name of expfull. The log file will be written to expfull.log in the dpump_dir1 directory.


Using Interactive Mode to Stop and Reattach to a Job

While the export is running, press Ctrl+C. This will start the interactive-command interface of Data Pump Export. In the interactive interface, logging to the terminal stops and the Export prompt is displayed.
Stopping and Reattaching to a Job

At the Export prompt, issue the following command to stop the job:

Export> STOP_JOB=IMMEDIATE
Are you sure you wish to stop this job ([y]/n): y

The job is placed in a stopped state and exits the client.

Enter the following command to reattach to the job you just stopped:

> expdp hr/hr ATTACH=EXPFULL

After the job status is displayed, you can issue the CONTINUE_CLIENT command to resume logging mode and restart the expfull job.

Export> CONTINUE_CLIENT

A message is displayed that the job has been reopened, and processing status is output to the client.
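
If you no longer remember the name of the job you want to attach to, it can be looked up in the data dictionary (assuming you have the privileges to query DBA_DATAPUMP_JOBS):

SELECT owner_name, job_name, operation, job_mode, state
  FROM dba_datapump_jobs;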






Saturday, April 25, 2009

Mapping between the parameters of datapump export and original export

Some of the parameter names may be the same, but the functionality is slightly different.

Original Export Parameter -> Comparable Data Pump Export Parameter

BUFFER -> A parameter comparable to BUFFER is not needed.
COMPRESS -> A parameter comparable to COMPRESS is not needed.
CONSISTENT -> A parameter comparable to CONSISTENT is not needed. Use FLASHBACK_SCN and FLASHBACK_TIME for this functionality.
CONSTRAINTS -> EXCLUDE=CONSTRAINT
DIRECT -> A parameter comparable to DIRECT is not needed. Data Pump Export automatically chooses the best method (direct path mode or external tables mode).
FEEDBACK -> STATUS
FILE -> DUMPFILE
FILESIZE -> FILESIZE
FLASHBACK_SCN -> FLASHBACK_SCN
FLASHBACK_TIME -> FLASHBACK_TIME
FULL -> FULL
GRANTS -> EXCLUDE=GRANT
HELP -> HELP
INDEXES -> EXCLUDE=INDEX
LOG -> LOGFILE
OBJECT_CONSISTENT -> A parameter comparable to OBJECT_CONSISTENT is not needed.
OWNER -> SCHEMAS
PARFILE -> PARFILE
QUERY -> QUERY
RECORDLENGTH -> A parameter comparable to RECORDLENGTH is not needed because sizing is done automatically.
RESUMABLE -> A parameter comparable to RESUMABLE is not needed. This functionality is automatically provided for privileged users.
RESUMABLE_NAME -> A parameter comparable to RESUMABLE_NAME is not needed. This functionality is automatically provided for privileged users.
RESUMABLE_TIMEOUT -> A parameter comparable to RESUMABLE_TIMEOUT is not needed. This functionality is automatically provided for privileged users.
ROWS=N -> CONTENT=METADATA_ONLY
ROWS=Y -> CONTENT=ALL
STATISTICS -> A parameter comparable to STATISTICS is not needed. Statistics are always saved for tables.
TABLES -> TABLES
TABLESPACES -> TABLESPACES (same parameter; slightly different behavior)
TRANSPORT_TABLESPACE -> TRANSPORT_TABLESPACES (same parameter; slightly different behavior)
TRIGGERS -> EXCLUDE=TRIGGER
TTS_FULL_CHECK -> TRANSPORT_FULL_CHECK
USERID -> A parameter comparable to USERID is not needed. This information is supplied as the username/password when you invoke Export.
VOLSIZE -> A parameter comparable to VOLSIZE is not needed.

This table does not list all Data Pump Export command-line parameters.
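
As a simple illustration of the mapping, here is a schema export with original Export and a roughly equivalent Data Pump command (the directory object dpump_dir1 and the file names are placeholders):

exp hr/hr OWNER=hr FILE=hr.dmp LOG=hr.log

expdp hr/hr SCHEMAS=hr DIRECTORY=dpump_dir1 DUMPFILE=hr.dmp LOGFILE=hr.log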

Original Export and Import Versus Data Pump Export and Import

If you are familiar with the original Export (exp) and Import (imp) utilities, it is important to understand that many of the concepts behind them do not apply to Data Pump Export (expdp) and Data Pump Import (impdp). In particular:
  • Data Pump Export and Import operate on a group of files called a dump file set rather than on a single sequential dump file.

  • Data Pump Export and Import access files on the server rather than on the client. This results in improved performance. It also means that directory objects are required when you specify file locations.

  • The Data Pump Export and Import modes operate symmetrically, whereas original export and import did not always exhibit this behavior.

    For example, suppose you perform an export with FULL=Y, followed by an import using SCHEMAS=HR. This will produce the same results as if you performed an export with SCHEMAS=HR, followed by an import with FULL=Y.

  • Data Pump Export and Import use parallel execution rather than a single stream of execution, for improved performance. This means that the order of data within dump file sets and the information in the log files is more variable.

  • Data Pump Export and Import represent metadata in the dump file set as XML documents rather than as DDL commands. This provides improved flexibility for transforming the metadata at import time.

  • Data Pump Export and Import are self-tuning utilities. Tuning parameters that were used in original Export and Import, such as BUFFER and RECORDLENGTH, are neither required nor supported by Data Pump Export and Import.

  • At import time there is no option to perform interim commits during the restoration of a partition. This was provided by the COMMIT parameter in original Import.

  • There is no option to merge extents when you re-create tables. In original Import, this was provided by the COMPRESS parameter. Instead, extents are reallocated according to storage parameters for the target table.

  • Sequential media, such as tapes and pipes, are not supported.

  • The Data Pump method for moving data between different database versions is different from the method used by original Export/Import. With original Export, you had to run an older version of Export (exp) to produce a dump file that was compatible with an older database version. With Data Pump, you can use the current Export (expdp) version and simply use the VERSION parameter to specify the target database version (see the example after this list).

  • When you are importing data into an existing table using either APPEND or TRUNCATE, if any row violates an active constraint, the load is discontinued and no data is loaded. This is different from original Import, which logs any rows that are in violation and continues with the load.

  • Data Pump Export and Import consume more undo tablespace than original Export and Import. This is due to additional metadata queries during export and some relatively long-running master table queries during import. As a result, for databases with large amounts of metadata, you may receive an ORA-01555: snapshot too old error. To avoid this, consider adding additional undo tablespace or increasing the value of the UNDO_RETENTION initialization parameter for the database.

  • If a table has compression enabled, Data Pump Import attempts to compress the data being loaded. In contrast, the original Import utility loaded data in such a way that even if a table had compression enabled, the data was not compressed upon import.

  • Data Pump supports character set conversion for both direct path and external tables. Most of the restrictions that exist for character set conversions in the original Import utility do not apply to Data Pump. The one case in which character set conversions are not supported under the Data Pump is when using transportable tablespaces.
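
For example, a sketch of an export intended for import into an earlier release, using the VERSION parameter (the schema, directory object, file name, and the value 10.2 are only illustrative):

expdp hr/hr SCHEMAS=hr DIRECTORY=dpump_dir1 DUMPFILE=hr_v102.dmp VERSION=10.2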

DataPump Export and Import: New features

Data Pump is introduced with the following new features:

  • The ability to specify the maximum number of threads of active execution operating on behalf of the Data Pump job.
  • The ability to restart Data Pump jobs.
  • The ability to detach from and reattach to long-running jobs without affecting the job itself. This allows DBAs and other operations personnel to monitor jobs from multiple locations. The Data Pump Export and Import utilities can be attached to only one job at a time; however, you can have multiple clients or jobs running at one time.
  • Support for export and import operations over the network, in which the source of each operation is a remote instance.
  • The ability, in an import job, to change the name of the source datafile to a different name in all DDL statements where the source datafile is referenced.
  • Enhanced support for remapping tablespaces during an import operation.
  • Support for filtering the metadata that is exported and imported, based upon objects and object types.
  • Support for an interactive-command mode that allows monitoring of and interaction with ongoing jobs.
  • The ability to estimate how much space an export job would consume, without actually performing the export.
  • The ability to specify the version of database objects to be moved. In export jobs, this is controlled by the VERSION parameter.
  • Most Data Pump export and import operations occur on the Oracle database server instead of clients.

Introducing:

Oracle Database 10g introduced the new Data Pump feature to further enhance the export and import process in the database. It replaces the exp and imp utilities of older versions. Also, there is no migration support from conventional export-import to Data Pump export-import.

In this post I am going to explore the Data Pump export-import utility and compare and contrast it with conventional export-import.