Normally I write my SQL statements from inside Java (eg PreparedStatements).
I am investigating on creating PL/SQL Packages, supplying procedures and methods, which I can call from Java using CallableStatements. I understand the advantages of having a layer/API on the database side.
For example: retrieve some records from a table. (the PL/SQL function getCustomers returns a REFCURSOR)
- PreparedStatement (select ... from ... where ...) -> executeQuery -> loop ResultSet
- CallableStatement ("begin ? := getCustomers(); end;") -> cast getObject() to ResultSet -> loop ResultSet
What are the disadvantages of using the CallableStatement (REFCURSOR)?
The only (and that highly theoretical) disadvantage is that you're creating code that's Oracle specific.
In reality this is rarely a problem, unless you're creating something generic for external distribution to customers where you can't dictate the database engine they use, in which case you'd be limiting your possible customer base. But as you're using Oracle specific SQL already, that's not a problem for you :)
I haven't had a great deal of experience with the approach you're considering, but as a general rule, I've not noticed a difference using vendor specific code unless your your database access count gets very high (such as in the millions in a short time). If you're in doubt whether you're over the threshold, best way to tell is test, test, test.
I was thinking about row fetching differences (single row OR multiple rows per round trip to the DB) .
As I understand, I can set the Fetch Size on a (Oracle) CallableStatement, thereby getting multipe rows from the REFCURSOR per round trip to the DB.
ALL data retrieval from an Oracle DB uses cursors: no exceptions.
The JDBC layer is what includes the 'batching' functionality that you describe and that functionality is independent of the source of the cursor. If you use the Oracle extensions then you can take advantage of Oracle's automated batching.
Using a REF CURSOR is the preferred, and recommended way to provide data from the DB to an external client.
1. It minimizes the dependencies between the DB and the Client. The function/procedure that provides the REF CURSOR is the only dependency.
2. It maximizes data security. Responsibility for securing the data rests solely on the DB side via the function/procedure. It is simply not possible for the Client software to access unauthorized data unless the function/procedure provides it. The client has NO access to data except via the REF CURSOR.
3. It maximizes database security. Access to the database itself is more easily controlled. Clients only need access to an account that has extremely limited privileges: EXECUTE privilege on the function/procedure. Clients do not need any table, view or other privileges to get their job done.
4. It minimizes the chance for error by the client. It is impossible for the client to access the wrong table/view or data. Clients do not need to know how to join tables or even know what the name of the tables are or where the data resides.
5. It minimizes the knowledge and skill that the client developer needs to have to get their job done. The client developer can focus on their two main issues: 1) what data do I need for my application, 2) what do I need to do with that data.
6. The only 'performance' risk (or penalty) is if the procedure/function is poorly written or uses poorly written queries to obtain the data.
In short use of a REF CURSOR lets the DB developer deal with the best way to provide the data and the client developer deal withj the bests way to consume that data.
Use of any solution other than a REF CURSOR should be the EXCEPTION rather than the rule.
While that article may be a good intro for someone not familiar with JDBC it has some misinformation in it that can be misleading at best and cause performance and memory issues at worst.
Row-prefetching - there is little benefit to using a large value for this setting. As with most things extremes are to be avoided.
It is important to know how the data is going to be used. What is the point of bringing ALL the data from the DB in one call if you are going to write it to the file system. File I/O is generally the slowest part of an application.
It is also important that the code be scaleable. Apps should be written so that future modifications will impact existing functionality as little as possible. If you eat up the memory for one table or statement that limits the memory available for other uses. Worse, when things go wrong and you need to reduce the fetch setting you may now have problems everywhere because the testing you originally performed may have relied on that larger fetch setting.
If the data fetching is not done in its own thread then using a very large value can actually make an application appear to 'freeze' while it waits for the data fetch to complete.
A large fetch size can also waste significant amounts of memory; essentially allocating it but never using it and never allowing any other process to use it. This is because for a fetch size of 'n' Oracle has to allocate space for 'n' rows that are each of maximum size.
This wasted is most often due to the common VARCHAR2 datatype. Consider a single column defined as VARCHAR2(100). In the DB this column is stored as variable length. But Oracle will allocate 100 characters (could be multi-byte characters) in order to be able to hold a maximum length value.
If the actual data values never exceed 30 bytes then 70 bytes in each row are totally wasted. The more rows that are prefetched the more memory that is wasted simply in anticipation of a maximum length value that never occurs or only occurs rarely.
I've never used a prefetch/fetch size greater than one thousand and that was for tables with small records (no more than 10 or 20 columns).
And the Oracle docs don't even recommend a setting that high. See the JDBC Dev Guide
Oracle Row-Prefetching Limitations
There is no maximum prefetch setting, but empirical evidence suggests that 10 is effective. Oracle has never observed a performance benefit to setting prefetch higher than 50. If you do not set the default row-prefetch value for a connection, then 10 is the default.