Friday 24 August 2007

Oracle streams - errors and fixes

Hit a problem with the Streams capture process aborting with a duplicate value in the index I_STREAMS_APPLY_SPILL_TXN on the table sys.STREAMS$_APPLY_SPILL_TXN. A quick look on the interweb revealed the answer, which was to delete old rows from the table.
The view dba_capture tells you the process has aborted.
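
A quick status check looks something like this (capture_name, status, error_number and error_message are standard dba_capture columns):

SELECT capture_name, status, error_number, error_message
FROM dba_capture;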

To start and stop the process use:-
BEGIN
dbms_capture_adm.start_capture(capture_name => 'capture_e1strm');
END;

BEGIN
dbms_capture_adm.stop_capture(capture_name => 'capture_e1strm');
END;



A second error involved the propagation process. The error in the logs was
ORA-25307: Enqueue rate too high, flow control

The view dba_propagation showed the same error, but the process was still enabled. The answer turned out to be to restart the propagation process, i.e.

BEGIN
dbms_propagation_adm.stop_propagation(propagation_name => 'PROP_E1DB_TO_E2DB');
END;

BEGIN
dbms_propagation_adm.start_propagation(propagation_name => 'PROP_E1DB_TO_E2DB');
END;

A useful site is:


http://download.oracle.com/docs/cd/B28359_01/server.111/b28321/strms_trouble.htm




Oracle Streams - streaming two users into one

Hit and (hopefully) resolved a number of issues with Oracle Streams recently.

1) To stream two tables with the same structure, owned by two different users, into one destination user, i.e. user1.tab and user2.tab on the source instance into user1.tab on the destination instance.

The answer seems to be to set up a normal schema stream for user1 and user2 on the destination. Then modify the apply process at the destination for user2 as follows:-

CREATE OR REPLACE PROCEDURE user1.write_reh4_lcrs
(v_any IN sys.AnyData)
IS
v_lcr SYS.LCR$_ROW_RECORD;
v_rc PLS_INTEGER;
BEGIN
-- Unpack the row LCR from the AnyData payload
v_rc := v_any.GETOBJECT(v_lcr);
-- Re-point the LCR at user1 so the change is applied to user1.tab
v_lcr.SET_OBJECT_OWNER('user1');
-- Apply the modified LCR
v_lcr.EXECUTE(TRUE);
END;



and then register the procedure as a DML handler with the apply process:

BEGIN
dbms_apply_adm.set_dml_handler(
object_name => 'user2.reh4'
,object_type => 'TABLE'
,operation_name => 'INSERT'
,error_handler => FALSE
,user_procedure => 'user1.write_reh4_lcrs'
,apply_database_link => NULL
,apply_name => NULL
);
END;

BEGIN
dbms_apply_adm.set_dml_handler(
object_name => 'user2.reh4'
,object_type => 'TABLE'
,operation_name => 'UPDATE'
,error_handler => FALSE
,user_procedure => 'user1.write_reh4_lcrs'
,apply_database_link => NULL
,apply_name => NULL
);
END;

BEGIN
dbms_apply_adm.set_dml_handler(
object_name => 'user2.reh4'
,object_type => 'TABLE'
,operation_name => 'DELETE'
,error_handler => FALSE
,user_procedure => 'user1.write_reh4_lcrs'
,apply_database_link => NULL
,apply_name => NULL
);
END;

Tuesday 14 August 2007

Oracle streams - lost archive logs

We recently had a problem when we lost the archive logs from the source database before they could be applied to the destination database..... don't ask how or why OK... enough already!

I spent ages trying to figure out how to reset the SCN on the capture process to get the whole thing up and running again (having manually resynced the tables). It turns out the way to do it was to drop and recreate the capture. I used drop_unused_rule_sets => TRUE, which meant I had to recreate all the associated rules, but you may be able to get away with just dropping and recreating the capture, as in the example below.
i.e.

BEGIN
dbms_capture_adm.drop_capture(capture_name => 'capture_e1strm'
, drop_unused_rule_sets => FALSE
);
END;

followed by

BEGIN
dbms_capture_adm.create_capture(queue_name => 'capture_e1q'
,capture_name => 'capture_e1strm'
,source_database => 'ecit1'
,use_database_link => TRUE
);
END;
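
To confirm the recreated capture has picked up a fresh starting point, something like this should show the new SCNs (first_scn and start_scn are standard dba_capture columns):

SELECT capture_name, status, first_scn, start_scn
FROM dba_capture
WHERE capture_name = 'CAPTURE_E1STRM';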

Tuesday 7 August 2007

Gather Stats – the importance of granularity

I got caught out today when using dbms_stats.gather_table_stats. I specified a partition name in the statement because I just wanted to gather stats for a particular partition. I did not, however, specify the granularity. The default value for that parameter also causes dbms_stats.gather_table_stats to generate the global table stats. The partition I was trying to get stats for was only 60 million rows, so it should only have taken a few hours; the full table however contained 700 million rows, so gathering the stats for the partition plus the global table stats took 14 hours!
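
For reference, a sketch of what I should have run, restricting the gather to partition-level stats only (owner, table and partition names are illustrative):

BEGIN
dbms_stats.gather_table_stats(ownname => 'REHFACTS'
,tabname => 'REH_TEMP'
,partname => 'P200707'
,granularity => 'PARTITION'
);
END;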

For reference, the table below shows which statistics are gathered for each granularity option.

GRANULARITY    Table Global   Partition Global   Partition Statistics   Subpartition Statistics
GLOBAL         YES            NO                 NO                     NO
PARTITION      NO             YES                YES                    NO
DEFAULT        YES            YES                YES                    NO
SUBPARTITION   NO             NO                 YES                    YES
ALL            YES            YES                YES                    YES

This was obtained from the following webpage

http://www.dbazine.com/oracle/or-articles/nanda5

Moving a partitioned table’s tablespace

Moving the partitions of a table is straightforward, e.g.

alter table reh_temp move partition p200705
tablespace TSD_Q_BLB_P200705;

This does not, however, move the table 'header'; for that you need to do

alter table reh_temp modify default attributes tablespace tsd_q_blb;

Reference:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:416718680907

About half way down the page, search for 'alter table invoices modify default attributes tablespace manual;'.

The same is true of indexes, but I do not know the syntax for moving index 'headers'.
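
My hedged guess, based on the standard ALTER INDEX syntax for partitioned indexes (untested here, index name illustrative), would be:

alter index reh_temp_idx modify default attributes tablespace tsd_q_blb;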

Splitting a table partition

The speed at which this operation occurs depends on the number of rows in the partition; empty partitions split immediately. An example of the command is:-

ALTER TABLE reh_temp
SPLIT PARTITION pmax AT (TO_DATE('01-AUG-2007','DD-MON-YYYY'))
INTO (PARTITION p200707
,PARTITION pmax)
UPDATE GLOBAL INDEXES;

Exchanging a table for a table partition

An example of this command is

ALTER TABLE reh_temp
exchange partition p200609 with TABLE blb_200610
including indexes;

If the indexes on the 'donor' table are all valid etc. and the partitioned table has the same indexes AND THEY ARE LOCAL INDEXES, then the indexes should be carried across into the new partition and remain valid. Any global indexes will go invalid.
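
A quick way to confirm which indexes survived an exchange is to check the dictionary afterwards (table and index names illustrative):

SELECT index_name, status FROM user_indexes WHERE table_name = 'REH_TEMP';

SELECT index_name, partition_name, status FROM user_ind_partitions WHERE index_name LIKE 'REH_TEMP%';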

Partitioning a non-partitioned table using dbms_redefinition

DBMS_REDEFINITION is a really useful package for partitioning a table that is currently not partitioned. It does all the things you would want to do if you partitioned the table manually, but does it in a more reliable and bug-free way. The only restriction is that the table to be partitioned must have a primary key constraint. The basic steps involved in partitioning are:-

  • Create an empty copy of the table, partitioned up as you need it. This table will eventually become your partitioned table, so it is important the table and indexes etc. are all in the correct tablespaces.
  • Check the table can be redefined using dbms_redefinition, e.g.

BEGIN
sys.dbms_redefinition.can_redef_table('rehfacts'
,'reh_orig');
END;

  • Start the redefinition procedure

BEGIN
dbms_redefinition.start_redef_table(uname => 'rehfacts'
,orig_table => 'reh_orig'
,int_table => 'REH_temp'
);
END;

Among other things this turns the new partitioned copy of the table into a materialized view.

  • Given that the data population of the new partitioned table by the step above may take some time, the next step is to sync any data that may have been changed or added during this process

BEGIN
dbms_redefinition.sync_interim_table(uname => 'rehfacts'
,orig_table => 'reh_orig'
,int_table => 'REH_temp'
);
END;

  • Swap the tables and finish off

BEGIN
dbms_redefinition.finish_redef_table(uname => 'rehfacts'
,orig_table => 'reh_orig'
,int_table => 'REH_temp'
);
END;

  • Rename all the indexes and move any foreign key dependencies.

References

http://www.oracle-base.com/articles/misc/PartitioningAnExistingTable.php

http://www.dbasupport.com/forums/archive/index.php/t-27379.html

Updating columns that are used in table partitioning

If you create a table

Create table reh_temp
(col1 NUMBER(8,0))
PARTITION BY RANGE (col1)
(
partition p1 values less than (100)
,partition p2 values less than (200)
,partition PMAX values less than (MAXVALUE) );

Insert into reh_temp values(1);

Select * from reh_temp partition (p1);

Brings back one row as you would expect

Update reh_temp set col1 = 150;

ERROR at line 1:
ORA-14402: updating partition key column would cause a partition change

i.e. the row would have to move partitions, so it errors. However, after 'alter table reh_temp enable row movement;' updates are OK and the row moves from partition to partition as you update.

USER_TAB_MODIFICATIONS

This view records the approximate number of inserts, updates and deletes made against each table since its statistics were last gathered.
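
A quick look at what it holds (the counts only appear once the monitoring info has been flushed, e.g. via dbms_stats.flush_database_monitoring_info):

SELECT table_name, inserts, updates, deletes, timestamp
FROM user_tab_modifications
ORDER BY timestamp DESC;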

Monitor Indexes

Which indexes get used in the database and which do not ?

Alter index index_name monitoring usage;

I’ve written a package that turns monitoring on for each index in turn and then turns it off again once it has been used. By this means you learn which indexes are used and which are not.
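
For completeness, monitoring is switched off again with the corresponding statement:

Alter index index_name nomonitoring usage;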



The latest on this is:
http://www.oracloid.com/2006/05/vobject_usage-empty/

v$object_usage only contains rows for the current user, so use the following query instead:

SELECT
aoj.*
from
(
select u.name owner
, io.name index_name
, t.name table_name
, decode(bitand(i.flags, 65536), 0, 'NO', 'YES') monitoring
, decode(bitand(ou.flags, 1), 0, 'NO', 'YES') used
, ou.start_monitoring
, ou.end_monitoring
from
sys.user$ u
, sys.obj$ io
, sys.obj$ t
, sys.ind$ i
, sys.object_usage ou
where
i.obj# = ou.obj#
and io.obj# = ou.obj#
and t.obj# = i.bo#
and u.user# = io.owner#
) aoj
,all_indexes u
WHERE aoj.index_name = u.index_name
and aoj.owner = u.owner


Quotation marks – how many are enough?

I’ve always struggled with how many single quotes to put in some SQL statements.

e.g If you want to select from dual the string

“I’ve got a luverly bunch o’ coconuts”

The SQL is something like

Select 'I''ve got a luverly bunch o'' coconuts' from dual;

An alternative, and a far easier method to remember I think (it only works from 10g onwards), is

Select q'(I've got a luverly bunch o' coconuts)' from dual;

i.e. surround the string with q'( string )'

N.B. The bracket can be any bracket (I believe) i.e. q'[string]' or q'{string}'


Large number of Extents in a Table – The myth and the consequences

Does a table having a large number of extents in 10g degrade performance? The answer seems to be an emphatic NO.

I’ve referenced below two different explanations found on the interweb as to why this is so.

The first reference can be summarized as follows:-

The operating system I/O buffer is of a limited size (128Kb say). So a full table scan of a table with ten 128Kb extents will require ten reads. Even if the ten extents were compressed into one 1280Kb extent, it would still take ten reads to scan the whole table (10 reads each filling the 128Kb I/O buffer).

By looking at the module2 system and working backwards from the value of DB_FILE_MULTIBLOCK_READ_COUNT (which is 16) and the block size of the majority of the tablespaces (8Kb), I assume the operating system I/O buffer size is 128Kb (16 * 8Kb). Most of our tablespaces have extent sizes of 64Kb. Assuming I've understood correctly, this means that for any table which has multiple extents we do have a problem, because the I/O could read 128Kb but can only read 64Kb at a time, so a full table scan of a table spread over two extents takes two reads and not one. Is this correct, or is the OS/NAS clever enough to read 2 * 64Kb extents at a time?
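
For reference, the two values assumed above can be checked on any given database with something like:

SELECT name, value FROM v$parameter WHERE name = 'db_file_multiblock_read_count';

SELECT tablespace_name, block_size FROM dba_tablespaces;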

Reference:

http://www.devshed.com/c/a/Oracle/Developing-and-Implementing-Applications-continued/3/

http://www.dizwell.com/prod/node/66

Local Indexes - Stats on the index 'header'

From investigation of a problem I have come to the following conclusion: the standard dbms_stats gathering job used by Oracle 10g as the out-of-the-box method for keeping stats current was NOT regenerating the stats for the 'header' or global part of local indexes on partitioned tables.

i.e. for local indexes you can generate stats for each partition of the index AND also for the index itself (the header).

The effect of these out-of-date stats was that an update statement that had been working well for some time suddenly ran so slowly it effectively killed the live system. It turned out the Cost Based Optimizer had decided to use an incorrect index on a table with 130 million rows in it. I regenerated the index header stats for the incorrect index and the index it should have been using. The optimizer switched to using a (better but still incorrect) third index. Once I had regenerated the stats for this third index the optimizer started using the correct index.

Conclusion: we cannot rely on the standard stats job alone to keep the stats up to date. I have written a packaged procedure to gather the stats for N indexes a day, starting with the most out of date. In the absence of anything better, I would suggest we run this procedure on a daily basis on each database.
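
A minimal sketch of the kind of call such a procedure makes, gathering only the global ('header') level stats for one index (owner and index names are illustrative):

BEGIN
dbms_stats.gather_index_stats(ownname => 'REHFACTS'
,indname => 'REH_TEMP_IDX1'
,granularity => 'GLOBAL'
);
END;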

Reverse Indexes

All the examples on the 'interweb' that talk about situations where reverse indexes are useful centre around inserting a large number of rows, in a short period of time, into a table where the primary key is a sequence number. This results in 'hot spots' where all the inserts need the same part of the index. Using a reverse index spreads the inserts out over the entire structure more evenly.

The downside is that a reverse index cannot be used for an index range scan. It does, however, work well with exact matches e.g. b_id = 1234567 and will full scan the index, if the CBO thinks it is appropriate.
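
A minimal sketch of creating one (table and column names illustrative):

create index reh_temp_bid_rev on reh_temp (b_id) reverse;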

Global Vs Local Indexes

In this paragraph I discuss the various merits of global and local indexes. This is done partly for my own benefit, as it always helps to write things down, but I would also welcome your comments if you have an opinion on the subject or you think I've just got it plain wrong.

On a partitioned table, of whatever flavor (range, hash or list), you have a choice of local or global indexes.

Points to note: -

i) You can have a mix of both local and global indexes on the same partitioned table.

ii) Local indexes are best suited to data warehouse applications, while global indexes work better with OLTP systems.

iii) Local indexes are ideal for any index that is prefixed with the same column(s) used to partition the table.

iv) Global indexes can be range partitioned using a different ‘scheme’ to the underlying table.

Local Indexes
Pros

i) The one major advantage of local indexes is that if a table partition is exchanged (using 'alter table ... exchange/drop/add partition') a local index will remain valid.

Cons

i) The disadvantage of a local index that is not prefixed with the column(s) used to partition the table is that, in order to find a value in the index, each separate partition has to be searched for the index entry/entries. When a relatively small number of rows is required from the database (typical of an OLTP system), this additional overhead can be a significant part of the query execution time. In a data warehouse system, where typically a large number of rows is returned from a query, this overhead is a smaller proportion of the overall time taken by the query.

Global Indexes
Pros

i) Global indexes can be range partitioned using a different partition scheme from the underlying table, i.e. on a table (containing two columns id and start_date) that is hash partitioned by id, the global index on start_date can be range partitioned. This allows for partition elimination of the index entries when performing queries with a predicate including the start_date column (see the sketch after this list).

ii) They can be used to enforce uniqueness on columns not used to partition the table.

iii) All the index entries are held in one place (i.e. not spread out over ‘n’ partitions)
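
A sketch of point i) above, a range-partitioned global index on a hash-partitioned table (all names illustrative):

CREATE INDEX reh_temp_gi_start_date ON reh_temp (start_date)
GLOBAL PARTITION BY RANGE (start_date)
(PARTITION gp200707 VALUES LESS THAN (TO_DATE('01-AUG-2007','DD-MON-YYYY'))
,PARTITION gpmax VALUES LESS THAN (MAXVALUE));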

Cons

i) They go invalid when partitions are exchanged.

V$SESSION_LONGOPS

Have a look at the V$SESSION_LONGOPS view when running long-running SQL. It shows (among other things) the work the SQL has done and the amount of work it still needs to do to complete the job.

e.g.

SELECT ROUND(((totalwork-sofar)/totalwork)*100) perc, vlo.*
FROM v$session_longops vlo
WHERE totalwork-sofar > 0
AND start_time > TRUNC(SYSDATE)
ORDER BY sid, start_time ASC;

Autotrace

I have been using Autotrace under the qgames user on feltemp1. In order to get it working I had to get the DBAs to run ~oracle/sqlplus/admin/plustrace.sql as SYS and then

grant plustrace to qgames;
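
Once the role is granted, a typical SQL*Plus session looks something like this (the query itself is just an illustration):

set autotrace traceonly explain statistics
select count(*) from reh_temp;
set autotrace off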

Things to look at when investigating poorly performing SQL

i) The stats for the table and index may be out of date.

ii) Set the session variables as above.

iii) Don't get caught out by explain plan (I use Toad most of the time because it's easiest to use, but use 'explain plan for…' to confirm conclusions).

iv) Don't get caught out by the database caching results in the SGA. Use different data values for queries to hit bits of the table that are not cached.

v) Don't assume the query path is the same for all parameters in your query. The CBO may change its query path depending on the number of rows it expects to return.

vi) Toad only returns the first 500 rows by default; do a count(*) to confirm your findings.

Trust the CBO and do not use hints

I firmly believe this to be true. I have read it over and over again (but now cannot find a decent reference; the best I can do is http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:15799499535198).

If you need to hint your code, in most cases this is merely a sticking plaster over an underlying problem. You should resolve the underlying problem and allow the CBO to do its job. This will result in more robust code since the CBO will compensate for changes in the underlying data patterns.

We use hints everywhere in our code, it seems to me; this is a direct result of 'stale stats' and/or inappropriate init.paras.

Explain plan

'All explain planners occasionally lie, some lie more often than others' – Jonathan Lewis (allegedly).

Toad explain plan (the ambulance icon) lies more than ‘explain plan for …’ / ‘Select * from table(dbms_xplan.display)’.

Toad's explain plan has caught me out more than once.
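
For reference, the non-Toad route is simply (table name illustrative):

explain plan for
select * from reh_temp where col1 = 150;

select * from table(dbms_xplan.display);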

Init paras

One of the biggest challenges I encountered during this research was generating conditions under which the Cost Based Optimizer (CBO) would behave in a consistent, predictable fashion. Stats played a major role here, but once I could eliminate them from the equation, I had particular problems persuading the CBO to use an index to perform a query (without the use of hints). The CBO would full-table-scan (FTS) at every opportunity, even though this was demonstrably a slower route. The answer lay in the init.paras which cause the CBO to FTS if the stats show the number of rows returned would be 5% (yes, only 5%) or more of the table.

The default init.paras for

optimizer_index_caching = 0

optimizer_index_cost_adj = 100

optimizer_mode = all_rows

are a set of 'safe' values, adequate for most small/medium sized databases and designed for data warehouse applications, but are 'unsuitable for most OLTP applications'.

By changing these values via

alter session set optimizer_index_caching = 90;

alter session set optimizer_index_cost_adj = 25;

alter session set optimizer_mode = first_rows_100;

The propensity for the CBO to FTS so readily is lessened and the use of indexes increases without the need to resort to hints.

This proved to be one of the single most important factors in making queries against the view v_bet_union_vb perform in an acceptable manner.

ref : http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:173561100346113797

http://www.dba-oracle.com/oracle_tips_cost_adj.htm

http://www.dba-oracle.com/art_builder_sql_parm.htm

I have considered proposing the option of changing the init.paras of all of the OLTP databases to these values, but have rejected the idea (for the moment) on the basis that this would affect ALL database operations in an unpredictable way.

Ref http://www.praetoriate.com/t_optimizer_index_caching.htm

Table Stats

Table stats are generated by the default automatic process that is created by Oracle when the database is installed i.e. dbms_stats.gather_database_stats. (I believe)

This process uses the view dba_tab_modifications to establish which tables have changed by more than 10% (not configurable) since the last stats were gathered. For the vast majority of tables this works very well, but for certain tables it is not adequate. I have carried out repeatable experiments that show (I believe) that certain queries perform extremely well (2 second response times) against partitions with very recent stats and perform extremely badly (2 minute response times) against partitions with out-of-date stats, even though the data in the partition has changed by less than 10%.
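
A hedged sketch of the kind of check described above, comparing recorded changes with each table's row count against the 10% threshold (standard dictionary views, though the exact internal mechanism may differ):

SELECT m.table_owner, m.table_name
, m.inserts + m.updates + m.deletes AS changes
, t.num_rows
FROM dba_tab_modifications m
JOIN dba_tables t ON t.owner = m.table_owner AND t.table_name = m.table_name
WHERE m.partition_name IS NULL
AND t.num_rows > 0
AND (m.inserts + m.updates + m.deletes) / t.num_rows > 0.1;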

Reference : http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3058895998401