Most processes in EDQ have some level of footprint in the database. Does the footprint keep growing and growing or does it reach a point and then stabilize? Which version are you running?
We're currently using v9.0.5.
The footprint just keeps on growing. Due to the sheer number of files under the Postgres “data” directory, a Unix “du –sk data” won’t return a result in several hours of running. So it’s difficult to get an accurate size.
In that case, I believe the best answer here is that you need to upgrade. There were some tables that were found to be growing and a number of fixes for this went into 9.0.7 (and more again into EDQ 11g.)
The latest maintenance release of 9.0 is 9.0.10 - available on My Oracle Support.
If you choose to go to 11g, the latest is 18.104.22.168.4.
It looks like we're going to have to - Postgres has just died...
WARNING: 22-May-2014 04:24:27: Unable to index table : 'DNM_819_776_1_ingr'
WARNING: 22-May-2014 04:24:27: A database error has occurred : ERROR: could not create relation base/25589/3549571: File too large. (Code: 200,302)
- com.datanomic.director.results.database.exception.sql.ResultsSQLException: A database error has occurred : ERROR: could not create relation base/25589/3549571: File too large. (Code: 200,302)
Do you recognise this as being the ultimate failure of too much data written to the Postgres DB?
It is not an error I have seen before.
Am I correct in saying you have a match processor in the service? Is real-time review of match results a requirement? Does the match processor do reference data matching?
You may find you can make some significant savings to the data written by disabling output from match that is not needed, and also by disabling real-time review if it is not required. The latter in 9.0 is only available at the server level by adding realtime.review.generate.records = false to config/match/match.properties.
I think quite possibly the growth is due to the match processor writing results for real-time review purposes.
Hi Mike, you'll like this (not a lot).
We're running in Interval Mode as this actually reduces the likelihood of a StreamMerge Explosion (I know an upgrade would cure this but that would be sensible). Obviously I didn’t want any results saved so on the Run Mode tab in the job config I unchecked the “Save all Intervals?” check box. You can’t save/close the Configuration pop-up until you’ve entered a value in the “Keep a maximum of” text box so I entered 0 (zero Intervals) and saved this configuration. We then ran the job and we eventually ran out of Postgres DB space.
This morning, after a rebuild of our EDQ environment to clear everything down, I restarted the job but with one small change. I decided to change the “Keep a maximum of” value to 1 (Interval) just on the off-chance someone had coded for a value of zero to actually mean “infinity” - even though this is surely the function of the “Save all Intervals?” check box above. Lo and behold at job startup a few results tables written to Postgres and virtually zero growth thereafter.
Upgrade would be great but so close to go-live date others are nervous.
Thanks for your help.
Well, it sounds like it works for you in any case. There could be a negative performance impact of running in interval mode in high load scenarios depending on the type of service it is - but then to get the best out of the product you would need to upgrade in any case.
Saving all intervals would definitely eat space - I am also wondering if interval mode might disable the writing of data for real-time review if you have a match processor in there (will check) - in which case using the property may be advisable (if you don't need real-time review) for belt and braces.
In the end we upgraded from 9.0.5 to 9.0.10 to stop the StreamMerge Explosions (and stopped running in Interval Mode).
Hi Mike, I thought upgrading would cure our ills, but I was wrong. The StreamMerge Explosion is no more but the Postgres DB is still growing relatively quickly (we're not using Interval Mode now). The DB grew by 3GB on the single EDQ instance running a real-time job (JMS reader) and our 4 instances of a web service caused their DBs to grow by 2GB each. Results capture is turned off on all processors in the jobs so I'm at a total loss as to what is being written, and not subsequently tidied up, to the Postgres DB.
1. Does the process contain one or more match processors?
2. If so, do you need Match Review enabled in real-time, with users making decisions on real-time matching results, or are the match results presented externally.
If the process is running matching and you don't need review (as I suspect), then you can disable the writing of data for review purposes at the server level by setting the following in config/match/match.properties:
realtime.review.generate.records = false
(In EDQ 11, this is configurable per match processor).
The other thing to check is that you are not running in interval mode and writing unnecessary results data. The job should be running in Normal mode unless you need real-time Dashboard publication, and should be configured to write no drilldown data.
If you would like a detailed check on your configuration for any other issues let me know.
Yes we do have some Match processors in use and we're not using REVIEW at all. We're no longer running in Interval Mode as v9.0.10 has saved us from the dreaded StreamMerge explosion. I'll try the property setting you advise and see what happens.
Thanks again for your prompt assistance.
Drill Down is None everywhere. Not all of our "Enable sort/Filter in Match?" are unchecked though. Will that make a difference?
As we don't have any match rules with a decision of REVIEW will the property setting "realtime.review.generate.records = false" actually make any difference?