Tuesday, 6 August 2013

Memory Usage of Oracle 10 & 11 JDBC Drivers - Mysterious OutOfMemoryErrors

A client of mine recently had a problem where they were seeing OutOfMemoryErrors despite having a web application that should have had a small memory footprint, and despite having allocated a heap of just over a gigabyte (which was massive considering the type of application).

Taking a look at the nature of the application and how it was doing things did not raise any particular alarm bells - the most likely culprit was a high-read/high-object-creation/high-write batch job that could be run from within the application. The rest of the code was by and large (barring business logic) a standard CRUD application.

I initially took some heap dumps and analysed them using the standard jmap/jhat tools to get a feel for what was going on. I also added additional garbage collection tracing:

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc 
This verified that a large number of objects were indeed being created and held for the duration of the job, but curiously there were two massive arrays that could not be explained: a large char[] and an equally large SoftReference[].

Investigating further I discovered that the root of both of these seemed to be the BasicDataSource (configured using Spring) - this was surprising and equally confusing.  What was the DataSource doing creating these arrays?  And more significantly, why was it maintaining what must be a strong reference to them (as they were not being garbage collected)?

Ultimately the problem came down to two things. The application had:
  • set the defaultFetchSize to 10000
  • set the write batch size to 10000
After drilling down into the created objects and much investigation, it turned out that the JDBC drivers from Oracle 10 onwards create char[] and byte[] buffers for reading and writing data. The size of these arrays depends both on the shape of the query ("select * from my_table" allocates a larger array than just "select id from my_table", for example) and on the defaultFetchSize.
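The sizing behaviour can be modelled surprisingly well. The following is my own sketch, not Oracle's actual code: it assumes the driver multiplies the bytes needed per row by the fetch size and rounds up to a power of two with a 4096 floor. With 22 bytes per row (the maximum internal size of an Oracle NUMBER), this reproduces the measured defineBytes values for the "select id from TAB" query shown below.

```java
// Hypothetical model of the observed buffer sizing. Method and class names
// are my own inventions for illustration, not Oracle driver internals.
public class BufferSizeModel {

    static final int MIN_BUFFER = 4096;

    // Round the required size up to the next power of two, floor 4096.
    static int roundUp(int needed) {
        int size = MIN_BUFFER;
        while (size < needed) {
            size <<= 1;
        }
        return size;
    }

    // Estimated defineBytes allocation for a query needing rowBytes
    // bytes per row, at the given fetch size.
    static int estimateDefineBytes(int rowBytes, int fetchSize) {
        return roundUp(rowBytes * fetchSize);
    }

    public static void main(String[] args) {
        // An Oracle NUMBER occupies at most 22 bytes internally, so
        // "select id from TAB" needs roughly 22 bytes per row.
        for (int fetchSize : new int[] {10, 100, 1000, 10000}) {
            System.out.printf("batch size: %6d, defineBytes %10d%n",
                    fetchSize, estimateDefineBytes(22, fetchSize));
        }
        // Prints 4096, 4096, 32768, 262144 - matching the measurements below.
    }
}
```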

I've put together a simple test class that illustrates this behaviour - it does two things:
  • For 3 simple queries prints the char[] and byte[] allocated by the jdbc driver (using reflection) for fetch sizes from 10 to 10000
  • Creates a connection & statement for a fetch size of 10000 and waits (it also creates 200000 Integers per second, which it then discards) - for use in watching the heap using VisualVM.
The first test, on my particular PC, yields the following results:

For the following query: select * from TAB:
batch size:     10, defineBytes       4096 long, defineChars       4096 long
batch size:    100, defineBytes       4096 long, defineChars       8192 long
batch size:   1000, defineBytes      32768 long, defineChars      65536 long
batch size:  10000, defineBytes     524288 long, defineChars     524288 long
For the following query: select id from TAB:
batch size:     10, defineBytes       4096 long, defineChars            null
batch size:    100, defineBytes       4096 long, defineChars            null
batch size:   1000, defineBytes      32768 long, defineChars            null
batch size:  10000, defineBytes     262144 long, defineChars            null
For the following query: select * from student:
batch size:     10, defineBytes       4096 long, defineChars       4096 long
batch size:    100, defineBytes       8192 long, defineChars      32768 long
batch size:   1000, defineBytes      65536 long, defineChars     262144 long
batch size:  10000, defineBytes     524288 long, defineChars    4194304 long


As you can see from the above, the array sizes depend on both the query and the fetch size.

Within the Statement (Oracle's implementation of it) I can see the following lookup table:

4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304

You can see that these values are powers of 2, from 2^12 up to 2^22. It would appear that, depending on the combination of query and fetch size, a value from the above is chosen.
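A quick check confirms the table is exactly the powers of two from 2^12 to 2^22:

```java
// Reproduce the lookup table seen in the Oracle Statement implementation:
// each entry is 2^exp for exp = 12 .. 22.
public class LookupTable {
    public static void main(String[] args) {
        for (int exp = 12; exp <= 22; exp++) {
            System.out.println(1 << exp);  // 4096, 8192, ..., 4194304
        }
    }
}
```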

If the query has no character data then the char[] will remain unallocated (as in the 2nd query above), and if the query has no numeric/binary data then the byte[] array will remain unallocated.

In other words, the Oracle JDBC driver allocates char[] and byte[] buffers when statements are created and keeps hold of them until the statement is closed, which can be a very long time in certain scenarios (as with PreparedStatements, for example).
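To see why held-open statements hurt, a rough back-of-the-envelope calculation using the worst case measured above (fetch size 10000 on "select * from student") shows how quickly this adds up. The statement-cache size of 50 below is an arbitrary illustrative figure, not from the client's application:

```java
// Rough retained-memory estimate per open statement, using the measured
// worst case: a 4,194,304-element char[] plus a 524,288-element byte[].
public class RetainedMemory {
    public static void main(String[] args) {
        long charBytes = 4_194_304L * 2;  // Java chars are 2 bytes each
        long byteBytes = 524_288L;
        long perStatement = charBytes + byteBytes;
        System.out.println(perStatement);       // 8912896 bytes (~8.5 MB)

        // A cache holding 50 such prepared statements would pin ~425 MB -
        // easily enough to exhaust a ~1 GB heap once the job's own data
        // is added on top.
        System.out.println(perStatement * 50);  // 445644800 bytes
    }
}
```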

The following screenshot from VisualVM shows the heap for a fetch size of 10:

[VisualVM heap graph - fetch size 10]
You can see a normal, healthy sawtooth here, with the expected Integer taking the largest slice of memory.

Let's compare the same code, but with a fetch size of 10000:

[VisualVM heap graph - fetch size 10000]
Here you can see a fairly flat heap where garbage collections reclaim very little, and the largest occupant is now a char[].

The solution was ultimately to weigh fetch sizes against the actual performance gains made - it turned out that a fetch size of 100 was more than sufficient, and the memory footprint dropped to less than 512m, with no more OOM errors.

The Oracle documentation around this suggests that the default fetch size of 10 is typically more than sufficient, and that fetch sizes of 100 are only to be used with great care - given the above memory considerations I can see why.

A small footnote on this - I mentioned that the size of the arrays is also dependent on the query.

This is true, but it is less important than the fetch size - it can however still be a consideration even if the fetch size is small (say the default size of 10).

If you have a column that is varchar2(4000) then at least 8000 bytes of char buffer will be allocated per row (4000 characters at 2 bytes each). If the same column were only varchar2(100) then at least 200 bytes would be needed - considerably smaller.
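Applying the same power-of-two sizing model as before (my assumption, extrapolated from the measurements, not Oracle's documented behaviour), the difference between the two column definitions is stark even at the default fetch size of 10:

```java
// Hypothetical estimate: a VARCHAR2(n) column reserves n chars per row in
// the define buffer, so roughly n * fetchSize chars are needed, rounded up
// to a power of two with a 4096 floor. This is my model, not driver code.
public class VarcharBuffer {

    static int roundUp(int needed) {
        int size = 4096;
        while (size < needed) size <<= 1;
        return size;
    }

    public static void main(String[] args) {
        int fetchSize = 10;
        // varchar2(4000): 40,000 chars needed -> 65,536-char buffer (128 KB)
        System.out.println(roundUp(4000 * fetchSize));
        // varchar2(100): 1,000 chars needed -> the 4,096 floor (8 KB)
        System.out.println(roundUp(100 * fetchSize));
    }
}
```

A sixteen-fold difference in buffer size, purely from the column definition.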

By the same token, a query of select * from table will result in larger buffers than select id from table.

The upshot of this is that if you find the JDBC memory consumption in your application to be excessive and the fetch size isn't large, then take a good look at both your table definitions and your queries - fine-tuning of both can make a difference.

The code for these tests can be found on GitHub - I used the Oracle 11.2.0.1.0 JDBC driver for these tests, but the results should be similar for any driver from 10.x onwards.
