Monday 30 November 2009

Another reason not to use PCT and Materialized views

PCT (Partition Change Tracking) relies on being able to identify which partition a piece of data came from in the source table. Hence in any aggregate you have to 'group by' dbms_mview.pmarker(rowid). A side effect of this is that aggregates have to follow the same partitioning regime as the data from which they are derived.

e.g.
if a daily aggregate table is partitioned weekly and you wish to roll it up into a yearly aggregate using materialized views and PCT, then you will get 52 rows in the yearly aggregate, one for each weekly partition.
i.e. it is not a yearly aggregate
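
A minimal sketch of the problem, assuming a weekly-partitioned table called daily_agg (the table, column and view names are all made up, and the exact fast-refresh requirements vary by version):

-- Hypothetical names: daily_agg is assumed to be range partitioned by week.
-- Because the PMARKER has to appear in the GROUP BY, the 'yearly' aggregate
-- ends up with one row per weekly partition rather than one row per year.
CREATE MATERIALIZED VIEW yearly_agg_mv
REFRESH FAST ON DEMAND
AS
SELECT DBMS_MVIEW.PMARKER(da.rowid) pmarker
,      TRUNC(da.agg_date,'YYYY')    agg_year
,      SUM(da.amount)               amount
,      COUNT(da.amount)             amount_cnt
,      COUNT(*)                     row_cnt
FROM   daily_agg da
GROUP BY DBMS_MVIEW.PMARKER(da.rowid)
,        TRUNC(da.agg_date,'YYYY');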

Friday 23 October 2009

create table as - specify the column format

CREATE TABLE iab1
AS
SELECT CAST( NULL AS DATE) null_date
, CAST( NULL AS NUMBER(8) ) null_number
, CAST( NULL AS VARCHAR2(30) ) null_vchar
, CAST( dummy AS VARCHAR2(200) ) dummy_vchar
FROM dual;

N.B. dummy is varchar2(1)
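
A quick way to confirm the CASTs have dictated the column formats is to query the data dictionary:

-- Check the resulting column datatypes of iab1
SELECT column_name, data_type, data_length, data_precision
FROM   user_tab_columns
WHERE  table_name = 'IAB1'
ORDER BY column_id;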

Wednesday 14 October 2009

compression and parallel

The parallel parameter allows queries against that table to ‘go parallel’.
I.e. execute different components of the query using separate processes and processors.

As it suggests, the compress parameter allows data to be compressed.
For conventional insert/update it only works with an 11g optional (cost) extra called Advanced Compression.
That said, in 10g and above, you can compress the data after it has been inserted using
‘alter table … move … compress’
Compression saves a lot of I/O and hence makes many data warehouse queries much quicker.
There is a down side…
You have to rebuild bitmap indexes after converting a table/partition to be compressed
And (I’ve just learnt this today) compressed tables are not supported under 10g for standby databases.
It does work with 11g.

On balance compression (I believe) is a really, really good thing.
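
A minimal sketch of the 'compress after load' approach and the bitmap index rebuild mentioned above (the table, partition and index names are all made up):

-- Hypothetical names throughout.
-- Compress the rows already sitting in one partition; the move rebuilds the
-- segment, so the corresponding local index partitions are left UNUSABLE.
ALTER TABLE sales_fact MOVE PARTITION sales_2009_w42 COMPRESS;

-- Rebuild the bitmap index partition afterwards (assuming a local bitmap
-- index whose partition names follow the table partition names)
ALTER INDEX sales_fact_prod_bix REBUILD PARTITION sales_2009_w42;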

Monday 20 July 2009

Two Table Update

UPDATE
(SELECT
dcl.*
FROM table1 reh,table2 dcl
WHERE reh.seq_no = dcl.seq_no )
SET col1 = 'N';

This works fine as long as both table1 and table2 have unique indexes/constraints on the join column, in this case seq_no.
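
A minimal sketch of the setup that makes the join view updatable (constraint names are made up):

-- Unique constraints/indexes on the join column are what Oracle uses to
-- decide that the table being updated is key-preserved.
ALTER TABLE table1 ADD CONSTRAINT table1_seq_no_uk UNIQUE (seq_no);
ALTER TABLE table2 ADD CONSTRAINT table2_seq_no_uk UNIQUE (seq_no);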

Wednesday 15 July 2009

Dynamic cursor for loop

OPEN item_cursor FOR stmt USING item_type_in;


http://blog.mclaughlinsoftware.com/2009/02/24/easier-way-than-nds/
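
A fuller sketch of the pattern, keeping the names from the one-liner above (the statement, bind value and column are illustrative assumptions):

DECLARE
  item_cursor  SYS_REFCURSOR;
  stmt         VARCHAR2(200) := 'SELECT seq_no FROM table1 WHERE col1 = :b1';
  item_type_in VARCHAR2(10)  := 'N';
  v_seq_no     NUMBER;
BEGIN
  -- Bind item_type_in to :b1 at open time
  OPEN item_cursor FOR stmt USING item_type_in;
  LOOP
    FETCH item_cursor INTO v_seq_no;
    EXIT WHEN item_cursor%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(v_seq_no);
  END LOOP;
  CLOSE item_cursor;
END;
/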

Friday 17 April 2009

Flashback saved my life

To recover I used...

create table reh_temp
as
select *
from table1
AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '60' MINUTE)
where cs_date = to_date('22-mar-2009','dd-mon-yyyy')
;

I then did...

Insert into table1
Select * from reh_temp;


My original cock up was...

Delete
from table1
where cs_date = to_date('22-mar-2009','dd-mon-yyyy');
COMMIT;

Thursday 5 March 2009

Oracle ODI

The Summary
===============
ODI….hmmmmmm……………..‘nuff said ….

The Detail
========
ODI is really good at

- Gluing the batch together – better than, say, Unix shell scripts
- Scheduling the batch
- Reporting on how the batch did
- Getting data from non-Oracle sources – although even here I would use other ‘Oracle’ methods where possible – e.g. heterogeneous services

It’s not so good at
- Moving data from one Oracle source to another e.g. ‘live’ Oracle database to Oracle warehouse
- Checking that the results are correct
- Rerunning processes with different parameters (e.g. load the data from 20 days ago for this interface)
- The generation of scenarios (compiled code) means it is too easy to lose track of whether the designer version of the interface is the same as the scenario (compiled code) that is currently running in live.

Justification
=========

Hang on, moving data around is one of the major functions of ODI. True, but it is too simplistic. It will only do simple variations on ‘get data from source table into staging table and from there to target table’. If you need to do anything more complex (for, say, performance reasons) you need to resort to either Jython or PL/SQL. A good example is that a common method of adding data to a large table is to create a table and then swap it with one of the target table’s partitions, as sketched below.
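
A minimal sketch of that partition-exchange method (all names are made up):

-- Hypothetical names: sales_fact is the large partitioned target,
-- sales_stage is a plain table built with the same columns.
CREATE TABLE sales_stage
AS
SELECT *
FROM sales_fact
WHERE 1 = 0;

-- ...load and index sales_stage here...

-- Swap the loaded table in for the target partition (a data dictionary
-- operation rather than a data copy)
ALTER TABLE sales_fact
  EXCHANGE PARTITION sales_2009_w42 WITH TABLE sales_stage
  INCLUDING INDEXES WITHOUT VALIDATION;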

Why would you want to learn yet another obscure language (Jython)? And if you are going to write PL/SQL, it should be in the form of database packages and (at a pinch) database procedures anyway.

The other problem is it does not do much in the way of checking that the number of rows in the landing table is the same as the number of rows in the target/fact table. It is very easy to ‘lose’ rows when doing joins to dimension tables and ODI will not tell you. This is not necessarily a problem if the developer is aware of such issues (he can simply tack a PL/SQL check on the end of the scenario, along the lines of the sketch below), but the ‘clicky, pointy’ nature of the development interface is not conducive to thinking too hard about what you are doing.
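
A minimal sketch of that kind of reconciliation check (table and column names are made up):

-- Hypothetical tables: landing_table feeds target_fact for one load date.
DECLARE
  v_landing NUMBER;
  v_target  NUMBER;
BEGIN
  SELECT COUNT(*) INTO v_landing FROM landing_table;

  SELECT COUNT(*) INTO v_target
  FROM   target_fact
  WHERE  load_date = TRUNC(SYSDATE);

  IF v_landing <> v_target THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Row count mismatch: landing='||v_landing||' target='||v_target);
  END IF;
END;
/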

Lack of parameters – again this may be because I don’t know the tool well enough, but it seems really difficult to run a process with different parameters (as you would a pl/sql routine).
e.g. re-load the data for 20 days ago
A developer here always copies the interface, changes the variables and then runs the copy – doh!


Documentation
==============

It is argued that ODI is ‘self documenting’, a phrase that should always be viewed with suspicion. The argument goes that Oracle is going to integrate (in a future version) ODI and OBI so that it will be possible to trace from a report back to the original data on the live system through the ODI/OBI stack. I will leave you to form an opinion on that statement.
Let’s just say (and this may be ignorance on my part) I find it really difficult to work out which interface loads data into which data warehouse tables at the moment.

How’s that? Sorry it is not more positive. Perhaps I should add that the bits I have mentioned it is good at, it is really good at; but does that justify the cost? ……dunno

Monday 16 February 2009

simple xml parser

CREATE OR REPLACE FUNCTION dwfacts.parse_xml(p_string IN VARCHAR2
                                            ,p_node   IN VARCHAR2
                                            )
RETURN VARCHAR2
AS
  v_data            VARCHAR2(4000);
  v_node            VARCHAR2(4000);
  v_upper_str       VARCHAR2(4000);
  v_startnode       VARCHAR2(4000);
  v_endnode         VARCHAR2(4000);
  v_startnodelength NUMBER;
BEGIN
  -- Case-insensitive search for the node tags
  v_node            := UPPER(p_node);
  v_upper_str       := UPPER(p_string);
  v_startnode       := '<'||v_node||'>';
  v_endnode         := '</'||v_node||'>';
  v_startnodelength := LENGTH(v_startnode);

  -- Everything after the opening tag...
  v_data := SUBSTR(p_string,INSTR(v_upper_str,v_startnode)+v_startnodelength);
  -- ...up to (but not including) the closing tag
  v_data := SUBSTR(v_data,1,INSTR(UPPER(v_data),v_endnode)-1);
  RETURN(TRIM(v_data));
END;
/
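
A quick usage check (the value is just an illustration):

-- Should return 12
SELECT dwfacts.parse_xml('<bets>12</bets>', 'bets') AS bets
FROM   dual;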

Thursday 12 February 2009

External Table for XML file

Turns out that an easy way to load a simple xml file is to treat it just like a plain text file
e.g.

CREATE TABLE dwstage.reh_table
(
accountid VARCHAR2(80)
,bets NUMBER
,losses NUMBER
,loyaltyPoints NUMBER
,playedHands NUMBER
,playedTournaments NUMBER
,rakeLosses NUMBER
,tableRake NUMBER
,totalRake NUMBER
,tournamentFee NUMBER
,transferIn NUMBER
,transferOut NUMBER
,wins NUMBER
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY TMP
ACCESS PARAMETERS
(
records delimited by ""
badfile TMP:'table.bad'
logfile TMP:'table.log'
fields
(
accountid CHAR(80) enclosed by "<accountid>" and "</accountid>"
,bets CHAR(80) enclosed by "<bets>" and "</bets>"
,losses CHAR(80) enclosed by "<losses>" and "</losses>"
,loyaltyPoints CHAR(80) enclosed by "<loyaltypoints>" and "</loyaltypoints>"
,playedHands CHAR(80) enclosed by "<playedhands>" and "</playedhands>"
,playedTournaments CHAR(80) enclosed by "<playedtournaments>" and "</playedtournaments>"
,rakeLosses CHAR(80) enclosed by "<rakelosses>" and "</rakelosses>"
,tableRake CHAR(80) enclosed by "<tablerake>" and "</tablerake>"
,totalRake CHAR(80) enclosed by "<totalrake>" and "</totalrake>"
,tournamentFee CHAR(80) enclosed by "<tournamentfee>" and "</tournamentfee>"
,transferIn CHAR(80) enclosed by "<transferin>" and "</transferin>"
,transferOut CHAR(80) enclosed by "<transferout>" and "</transferout>"
,wins CHAR(80) enclosed by "<wins>" and "</wins>"
)
)
LOCATION ('table.xml')
)
REJECT LIMIT UNLIMITED
NOPARALLEL
NOMONITORING;

Friday 16 January 2009

Multiple time rows from dual

-- 36 rows, one per minute, working backwards from four minutes ago
SELECT TRUNC(SYSDATE - 4 / 1440, 'MI') - (LEVEL - 1) / 1440 starttime
FROM SYS.DUAL
CONNECT BY LEVEL <= 36;