Integrator is a tool that uses a lot of memory, but there are steps you can take to decrease its footprint. We have a document with a few notes on how to reduce its usage. Since your graph sounds pretty simple, if you do have large text fields, I would suggest changing the edge type to "fast propagate edge".
(Doc ID 1043320.1)
An LDI graph that processes a number of very large text columns is using a lot more memory than expected. What are some ways to reduce the amount of memory consumed?
Loading external jars through JDBC or JMS connections consumes a lot of memory; it is better to add them to the classpath when running the graph.
Some components use more memory than others; examples are ExtSort and FastSort, or HashJoin and MergeJoin. Review the use of these components. Place them in the graph where they will need to process the minimum number of records and, if necessary, isolate them in their own phase of the graph.
Finally, if the graph is run in verbose mode, look for the following message:
INFO [WatchDog] - Edge5 type: buffered
The default edge type is "detected", which leaves it up to LDI to determine the edge type. If you have large text data, the buffered edge will consume a lot of memory; it is better to use the fast propagate edge.
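If the verbose log is long, a small script can pick out the buffered edges for you. This is only a rough sketch in Python: it assumes every edge line follows the same format as the sample message above, and the function name and the second sample line are made up for illustration.

```python
import re

# Assumed format, copied from the sample message in this thread:
#   INFO [WatchDog] - Edge5 type: buffered
EDGE_LINE = re.compile(r"\[WatchDog\]\s*-\s*(\S+)\s+type:\s*(\w+)")


def find_buffered_edges(log_lines):
    """Return the IDs of edges that the log reports as 'buffered'."""
    buffered = []
    for line in log_lines:
        match = EDGE_LINE.search(line)
        if match and match.group(2).lower() == "buffered":
            buffered.append(match.group(1))
    return buffered


sample = [
    "INFO [WatchDog] - Edge5 type: buffered",
    "INFO [WatchDog] - Edge6 type: direct",  # hypothetical line for contrast
]
print(find_buffered_edges(sample))
```

Any edge it reports is a candidate for switching to the fast propagate edge type, especially if it carries the large text columns.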
Thanks for your help!
I followed your advice and made the following changes:
1. Changed edge type to "fast propagate edge" in the graph.
2. Added the Oracle JDBC jar file to the JRE System Library under the Integrator Preference/JRE Definition settings.
After these changes, I ran the graph. I noticed that the memory usage for the DB_INPUT_Table and Reformat components improved, but I still got the same out-of-memory error during the ExtHashJoin. Are there any further steps I need to take to resolve this issue? I can increase the memory setting, but I am just not sure how big it needs to be.
I am quoting the following from the section on ExtHashJoin in the Endeca Information Discovery Integrator Guide (available here: http://docs.oracle.com/cd/E29805_01/index.htm):
This joiner should be avoided in case of large inputs on the slave port. The reason is slave data is cached in the memory.
Tip: If you have larger data, consider using the ExtMergeJoin component. If your data sources are unsorted, use a sorting component first (ExtSort, FastSort, or SortWithinGroups).
As the documentation says, please make sure that your smaller dataset is attached to the slave port. Alternatively, you can use ExtMergeJoin.
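To illustrate why the size of the slave input matters, here is a conceptual sketch in Python of the two join strategies. It is not Integrator's actual implementation; the function names and record layout are invented for the example, and the merge version assumes both inputs are already sorted on the key and that slave keys are unique.

```python
def hash_join(master, slave, key):
    """Inner join that caches the entire slave input in memory first."""
    lookup = {}
    for rec in slave:  # memory grows with the size of the slave input
        lookup.setdefault(rec[key], []).append(rec)
    for m in master:  # master records are streamed one at a time
        for s in lookup.get(m[key], []):
            yield {**s, **m}


def merge_join(master, slave, key):
    """Inner join over inputs sorted by key; holds one slave record at a time.

    Assumes slave keys are unique, to keep the sketch short.
    """
    slave_iter = iter(slave)
    s = next(slave_iter, None)
    for m in master:
        # Skip slave records whose key is behind the current master key.
        while s is not None and s[key] < m[key]:
            s = next(slave_iter, None)
        if s is not None and s[key] == m[key]:
            yield {**s, **m}


master = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
slave = [{"id": 1, "b": "p"}, {"id": 3, "b": "q"}]
print(list(hash_join(master, slave, "id")))
print(list(merge_join(master, slave, "id")))
```

Both functions produce the same joined records, but the hash version must hold the whole slave input before it emits anything, which is why the larger dataset should go on the master port or the join should be switched to a sorted merge.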