Thursday, April 19, 2007

Trouble with commits, Oracle Asynchronous Commit Solution

I'm doing some work loading SHP files into oracle using the oracle example the Oracle Java Shapefile Converter which contains the SampleShapefileToJGeomFeature.java source code

It all works quite nicely, but i wanted to give some feedback in the tool I'm writing which automates the loading of SHP file datasets into Oracle spatial, with some neat handling along the way...

Anyway, java ain't my first language, but i got working and added some support for writing progress updates back against the table which tracks the source files to be loaded...

It all worked pretty nicely until it hit some large files and started to choke on oracle waits while it wrote the transactions to disk, which was causing waits, which were compounded by my progress being updated every 10s.... everything slowed down to a crawl

The solution was to use "COMMIT WORK WRITE BATCH NOWAIT" feature in 10g, which lets oracle decide when to commit to disk and get the LGWR process to process the redo logs. Works a treat :)

You have to be careful with this because it means stuff isn't being committed to disk straight away, but you can read the result from other sessions, which was my goal.

I have a coldfusion page which polls the table containing the progress updates and calculates the estimated time, throughput per minute and shows a nice CSS progress bar.

My next step i was to use the same approach for the commits for the spatial inserts at the same time. Previously i was am doing a real commit every 10,000 rows....

The last run was done in 41 mins for 1,123,000 rows, the new approach is looking like about 28 minutes to load... so that's about 25% faster, and less flashing HDD lights :)

As a side note, i like developing on a laptop, because I have only a 5400 rpm hard disk and you get to see issues you might not see on a desktop with a faster HD...

I am loading from an external harddisk to avoid I/O contention, BTW

No comments: