Tuesday, 24 June 2008

Confession about Compression

I've not look at compression before. What a mistakea to makea!

For warehouse and archive apps the improvements and savings appear to be amazing .

Test1
======
I did a small experiment with weekly_aggregates_fact and monthly_aggregates_fact. Both MV's built on pre-existing tables.

I compressed the partitions and performed a count(*) on weekly_aggregates_fact a partition at the same time.
Partitions that are not compressed returned results in about 2 secs (first time that is, faster subsequently, cos of cacheing) Partitions that have been compressed return results in 150msec. Not an exhaustive test, and the may be some caching going on, but promising I think.

Weekly_aggregates_fact was reduced from 9 Gb to 2.3 Gb monthly_aggregates_fact was reduced from 8.6 Gb to 2.1 Gb

Test2
======
Weekly_aggregates_fact and monthly_aggregates_fact have a similar number of rows, 29,874,318 and 29,095,901 respectively. I had already compressed weekly_aggregates_fact so I only have 'compressed' figure for that MV. For monthly_aggregates_fact I have before and after compression figures.

With compression -
Select Count(*) from Weekly_aggregates_fact took 38 secs initially, 2 secs subsequently. Cost 284,000 (toad)

With compression
select count(*),sum(gbp_stake)
from dwfacts.weekly_aggregates_fact
took 9secs initially and 9secs subsequently. Cost 287,000 (toad)

Without compression -
select Count(*) from monthly_aggregates_fact took 3mins 19 secs initially and 2 mins 40 secs subsequently. Cost 240,000 (toad)

Without compression -
select count(*),sum(gbp_stake)
from dwfacts.monthly_aggregates_fact
took 2mins 47 secs initially and 2 mins 47 secs subsequently (yes, the same). Cost 242,000 (toad)

With compression -
select Count(*) from monthly_aggregates_fact took 29secs initially and 1 secs subsequently. Cost 240,000 (toad) i.e. unchanged....hmmm oh dear

With compression -
select count(*),sum(gbp_stake)
from dwfacts.monthly_aggregates_fact
took 10 secs initially and 8 secs subsequently (yes, the same). Cost 242,000 (toad)


These results are truly remarkable , why aren't we using it?
To be fair, it says in tut'book that MVs benefit particularly well from compression because of the amount of repeated data in the aggregates (the group by columns) but even so..


references

http://boomslaang.wordpress.com/2008/06/05/table-compression-in-oracle/
http://www.oracle.com/technology/products/bi/db/10g/pdf/twp_data_compression_10gr2_0505.pdf
http://husnusensoy.wordpress.com/2008/01/23/compressing-subpartition-segments/

No comments: