Monday 17 September 2007

sqlplus html output

To produce html output from sqlplus

set markup HTML ON entmap off

see
http://download.oracle.com/docs/cd/B12037_01/server.101/b12170/ch8.htm

Thursday 13 September 2007

When is SQLNET.ORA read?

During the investigation into SQLNET problems (see previous blog entry) we had cause to change the EXPIRE_TIME in a SQNET.ORA file shared by more than one TNS listener. I could find no documentation that explicitly said when the SQLNET.ORA file was read but all interweb documents seemed to say 'change the file and then restart the listener'. From this we inferred that the lsnrctl only reads the SQLNET.ORA at start up. THIS IS NOT TRUE.
SQLNET.ORA is read on a regular basis by the listener.

see
http://www.dbasupport.com/forums/archive/index.php/t-28267.html

SQLNET expire_time

Had a lot of problems with SQLNET over our VPN recently. The problem is that any queries that did not return a result for long periods of time would just hang. This typically occurred when doing counts from large tables. If the query took longer than about 15 mins then ,intermittently, it would hang. Sometimes it would work fine and the next time hang, no error message...nothing. This made it virtually impossible to handle or code around. Oddly, if you just logged into SQLPLUS and did nothing for 24 hours there was never a problem.

I discovered that the hanging was nothing to do with the database but had to be somewhere in the network. I concluded this after I found I could reproduce the problem simply by doing a dbms_lock.sleep(2100) (35 mins). This proved the database was not hanging but had something to do with the comms between the sqlnet client and the database server. Furthermore when the problem occurred the server was showing that it was 'waiting for client' even after the allotted sleep time had finished. I tried many different combinations of Oracle client and server versions running both Windows and Unix, all produced the same behaviour...so not a version bug.

During a long and difficult investigation we discovered no packets are sent between client and server while a long running SQL executes. When the server eventually responds, having finished the query execution, the firewall fails to sent the packet on. The details of why this should happen we have never worked out.

Someone came up with the bright idea of turning on EXPIRE_TIME (EXPIRE_TIME=5) in the sqlnet.ora on the server. This causes the server to check the client is still alive every 5 minutes. If the client has died then SQLNET tidys up it's processes on the server. A side effect of this checking is that a conversation occurs between server and client every 5 minutes, enough to convince the firewall that the session is still valid and not chop the comms between the two.

After a brief hiccup (discussed in next blog entry) it fixed the problem and SQLNET is now 100% reliable.

The whole investigation took approximately one month and many, many man hours and the solution was depressingly simple!

Monday 3 September 2007

DBMS_UTILITY.FORMAT_ERROR_BACKTRACE

What a find! The problem with multiple statements in a pl/sql block is that when an error is caught in an exception you loose the exact line that caused the original error. In 10g you can now find out which line caused the original error by using

DBMS_UTILITY.FORMAT_ERROR_BACKTRACE

see http://www.quest-pipelines.com/newsletter-v5/0904_A.htm

for more detail

More streams problems - slow apply process

Managed to hit a bug in 10.2.0.2 streams a few days ago. Bug number 6163622, two sources users into one destination user with a large number of transactions causes the apply process to go V.E..R..Y V..E..R..Y......S..L..O..W.

It's a spill issue, if the number of transactions in one commit is too high (number of rows committed > TXN_LCR_SPILL_THRESHOLD value in the table dba_apply_parameters) then spilling occurs and the apply process then takes for ever to read back from the spill table and apply the change to the destination database. Look in v_$streams_apply_coordinator and you will see the queue with the problem stuck in appyling mode.

TXN_LCR_SPILL_THRESHOLD is set via :-

BEGIN
dbms_apply_adm.set_parameter(
apply_name => 'APPLY_E1STRM',
parameter => 'TXN_LCR_SPILL_THRESHOLD',
value => 20000 );
END;