1 2 Previous Next 16 Replies Latest reply: Jun 23, 2014 10:44 AM by Mikematthews-Oracle RSS

    Real-time process slowly filling up Postgres DB space

    Jon Coat

      I have a real-time job which takes data from a JMS queue, processes the data, and then applies inserts/updates to an Oracle DB. In the task definition for the process Results drill down is set = "None" yet the Postgres DB grows substantially as the job runs. Also the following SQL seems to find a row per JMS message processed...

       

      select count(*) from  pg_stat_user_tables where schemaname = 'results'


      Anybody any clues as to what may be causing the DB to grow?

       

      Cheers

       

      Jon

        • 1. Re: Real-time process slowly filling up Postgres DB space
          Mikematthews-Oracle

          Hi,

           

          Most processes in EDQ have some level of footprint in the database. Does the footprint keep growing and growing or does it reach a point and then stabilize? Which version are you running?

           

          Mike

          • 2. Re: Real-time process slowly filling up Postgres DB space
            Jon Coat

            Hi Mike

             

            We're currently using v9.0.5.

             

            The footprint just keeps on growing. Due to the sheer number of files under the Postgres “data” directory, a Unix “du –sk data” won’t return a result in several hours of running. So it’s difficult to get an accurate size.

            Jon

            • 3. Re: Real-time process slowly filling up Postgres DB space
              Mikematthews-Oracle

              Hi,

               

              In that case, I believe the best answer here is that you need to upgrade. There were some tables that were found to be growing and a number of fixes for this went into 9.0.7 (and more again into EDQ 11g.)

               

              The latest maintenance release of 9.0 is 9.0.10 - available on My Oracle Support.

               

              If you choose to go to 11g, the latest is 11.1.1.7.4.

               

              Regards,

               

              Mike

              • 4. Re: Real-time process slowly filling up Postgres DB space
                Jon Coat

                It looks like we're going to have to - Postgres has just died...

                 

                WARNING: 22-May-2014 04:24:27: Unable to index table : 'DNM_819_776_1_ingr'

                WARNING: 22-May-2014 04:24:27: A database error has occurred : ERROR: could not create relation base/25589/3549571: File too large. (Code: 200,302)

                1. com.datanomic.director.results.database.exception.sql.ResultsSQLException: A database error has occurred : ERROR: could not create relation base/25589/3549571: File too large. (Code: 200,302)

                at com.datanomic.director.results.database.translator.MapErrorCodes.mapException(MapErrorCodes.java:70)

                at com.datanomic.director.results.database.AbstractTableDAO.executeSQL(AbstractTableDAO.java:66)

                at com.datanomic.director.results.database.AbstractTableDAO.executeSQL(AbstractTableDAO.java:39)

                at com.datanomic.director.results.database.TableInsertDao.addIndexes(TableInsertDao.java:291)

                at com.datanomic.director.results.database.TableInsert.close(TableInsert.java:423)

                at com.datanomic.director.results.database.TableInsert.close(TableInsert.java:301)

                at com.datanomic.director.match.runtime.data.writers.AbstractDBWriter.close(AbstractDBWriter.java:173)

                at com.datanomic.director.match.runtime.data.realtime.ResultsBucket.finishRealtimeBuckets(ResultsBucket.java:63)

                at com.datanomic.director.match.runtime.RealtimeHandler.finalizeDBStore(RealtimeHandler.java:623)

                at com.datanomic.director.match.munger.MatchRealtimeExecutor.doTheStuff(MatchRealtimeExecutor.java:303)

                at com.datanomic.director.runtime.engine.RuntimeProcessMunger$MungerExecutable.execute(RuntimeProcessMunger.java:872)

                at com.datanomic.utils.execution.Parallelizer$Worker.run(Parallelizer.java:210)

                at com.datanomic.utils.execution.Parallelizer$Worker.runHere(Parallelizer.java:156)

                at com.datanomic.utils.execution.Parallelizer.run(Parallelizer.java:85)

                at com.datanomic.director.runtime.engine.RuntimeProcessMunger.execute(RuntimeProcessMunger.java:459)

                at com.datanomic.utils.execution.Parallelizer$Worker.run(Parallelizer.java:210)

                at java.lang.Thread.run(Thread.java:722)


                Do you recognise this as being the ultimate failure of too much data written to the Postgres DB?


                • 5. Re: Real-time process slowly filling up Postgres DB space
                  Mikematthews-Oracle

                  It is not an error I have seen before.

                   

                  Am I correct in saying you have a match processor in the service? Is real-time review of match results a requirement? Does the match processor do reference data matching?

                   

                  You may find you can make some significant savings to the data written by disabling output from match that is not needed, and also by disabling real-time review if it is not required. The latter in 9.0 is only available at the server level by adding realtime.review.generate.records = false to config/match/match.properties.

                   

                  I think quite possibly the growth is due to the match processor writing results for real-time review purposes.

                   

                  Mike

                  • 6. Re: Real-time process slowly filling up Postgres DB space
                    Jon Coat

                    Hi Mike, you'll like this (not a lot).

                     

                    We're running in Interval Mode as this actually reduces the likelihood of a StreamMerge Explosion (I know an upgrade would cure this but that would be sensible). Obviously I didn’t want any results saved so on the Run Mode tab in the job config I unchecked the “Save all Intervals?” check box. You can’t save/close the Configuration pop-up until you’ve entered a value in the “Keep a maximum of” text box so I entered 0 (zero Intervals) and saved this configuration. We then ran the job and we eventually ran out of Postgres DB space.

                     

                    This morning, after a rebuild of our EDQ environment to clear everything down, I restarted the job but with one small change. I decided to change the “Keep a maximum of” value to 1 (Interval) just on the off-chance someone had coded for a value of zero to actually mean “infinity” - even though this is surely the function of the “Save all Intervals?” check box above. Lo and behold at job startup a few results tables written to Postgres and virtually zero growth thereafter.

                     

                    Upgrade would be great but so close to go-live date others are nervous.

                     

                    Thanks for your help.

                     

                    Jon

                    • 7. Re: Real-time process slowly filling up Postgres DB space
                      Mikematthews-Oracle

                      Aha!

                       

                      Well, it sounds like it works for you in any case. There could be a negative performance impact of running in interval mode in high load scenarios depending on the type of service it is - but then to get the best out of the product you would need to upgrade in any case.

                       

                      Saving all intervals would definitely eat space - I am also wondering if interval mode might disable the writing of data for real-time review if you have a match processor in there (will check) - in which case using the property may be advisable (if you don't need real-time review) for belt and braces.

                       

                      Best regards,

                       

                      Mike

                      • 8. Re: Real-time process slowly filling up Postgres DB space
                        Jon Coat

                        In the end we upgraded from 9.0.5 to 9.0.10 to stop the StreamMerge Explosions (and stopped running in Interval Mode).

                         

                        Thanks Mike.

                        • 9. Re: Real-time process slowly filling up Postgres DB space
                          Jon Coat

                          Hi Mike, I thought upgrading would cure our ills, but I was wrong. The StreamMerge Explosion is no more but the Postgres DB is still growing relatively quickly (we're not using Interval Mode now). The DB grew by 3GB on the single EDQ instance running a real-time job (JMS reader) and our 4 instances of a web service caused their DBs to grow by 2GB each. Results capture is turned off on all processors in the jobs so I'm at a total loss as to what is being written, and not subsequently tidied up, to the Postgres DB.

                           

                          Any thoughts?

                          • 10. Re: Real-time process slowly filling up Postgres DB space
                            Mikematthews-Oracle

                            1. Does the process contain one or more match processors?

                             

                            2. If so, do you need Match Review enabled in real-time, with users making decisions on real-time matching results, or are the match results presented externally.

                             

                            If the process is running matching and you don't need review (as I suspect), then you can disable the writing of data for review purposes at the server level by setting the following in config/match/match.properties:

                             

                            realtime.review.generate.records = false

                             

                            (In EDQ 11, this is configurable per match processor).

                             

                            The other thing to check is that you are not running in interval mode and writing unnecessary results data.  The job should be running in Normal mode unless you need real-time Dashboard publication, and should be configured to write no drilldown data.

                             

                            If you would like a detailed check on your configuration for any other issues let me know.

                             

                            Regards,

                             

                            Mike

                            • 11. Re: Real-time process slowly filling up Postgres DB space
                              Jon Coat

                              Hi Mike

                               

                              Yes we do have some Match processors in use and we're not using REVIEW at all. We're no longer running in Interval Mode as v9.0.10 has saved us from the dreaded StreamMerge explosion. I'll try the property setting you advise and see what happens.

                               

                              Thanks again for your prompt assistance.

                               

                              Cheers

                               

                              Jon

                              • 12. Re: Real-time process slowly filling up Postgres DB space
                                Mikematthews-Oracle

                                Check also that the process tasks in your job are all configured as below:

                                 

                                taskconfignodrilld.png

                                • 13. Re: Real-time process slowly filling up Postgres DB space
                                  Jon Coat

                                  Drill Down is None everywhere. Not all of our "Enable sort/Filter in Match?" are unchecked though. Will that make a difference?

                                  • 14. Re: Real-time process slowly filling up Postgres DB space
                                    Jon Coat

                                    As we don't have any match rules with a decision of REVIEW will the property setting "realtime.review.generate.records = false" actually make any difference?

                                    1 2 Previous Next