The short answer is "it depends". There are a few quick-hit things you can watch out for, such as work done in Forge that could be pushed to a database.
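As a rough illustration of pushing work to the database, the sketch below does a join in SQL so that the extract is already flat by the time Forge reads it. The table and column names (products, categories, prices) are made up for the example, and sqlite3 only stands in for whatever database you actually extract from.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with hypothetical catalog tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
        CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, category_id INTEGER);
        CREATE TABLE prices (sku TEXT PRIMARY KEY, price REAL);
        INSERT INTO categories VALUES (1, 'Cameras'), (2, 'Lenses');
        INSERT INTO products VALUES ('A100', 'Compact camera', 1), ('B200', 'Zoom lens', 2);
        INSERT INTO prices VALUES ('A100', 199.0), ('B200', 349.5);
    """)
    return conn


def fetch_joined_records(conn):
    """Return one flat row per product, with the joins done by the database."""
    query = """
        SELECT p.sku, p.name, c.category_name, pr.price
        FROM products AS p
        JOIN categories AS c ON c.category_id = p.category_id
        JOIN prices AS pr ON pr.sku = p.sku
        ORDER BY p.sku
    """
    return conn.execute(query).fetchall()
```

If the source system can hand over rows like these, the equivalent join no longer has to be done in the Forge pipeline. Whether that is a net win depends on how expensive the join is on the database side.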
Then there are more involved efforts that may or may not apply to your case, such as moving to parallel Forge or removing Java manipulators.
Dgidx has a few other things you can tune, but its run time mostly depends on the character of your index and whether you are enabling unnecessary features on your properties and dimensions.
Hope that helps, let us know if there's anything else you need.
As Patrick says, it depends. Where is the time being spent in your current pipeline? Directionally, you should start to explore CAS. It is a Java-based system that runs multi-threaded, and its manipulators run in-process, unlike Forge, which runs Java manipulators in their own JVM. You still need Forge for joins beyond switch joins (which CAS can do), but you could see gains from CAS or from moving logic into the extraction step. I don't recommend adopting parallel Forge; it is expected to be deprecated.
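To find out where the time is going, one option is to time each stage of the baseline update separately before changing anything. Below is a minimal sketch, assuming each stage can be wrapped in a Python callable. The stage names and the helpers time_stages and slowest_stage are made up for the example and are not part of any Endeca tooling.

```python
import time


def time_stages(stages):
    """Run each (name, callable) pair in order and return elapsed seconds per stage."""
    timings = {}
    for name, run in stages:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


def slowest_stage(timings):
    """Return the name of the stage that took the longest."""
    return max(timings, key=timings.get)


if __name__ == "__main__":
    # Placeholder stages; in practice each callable would launch the real step,
    # for example through subprocess.run with your own command line.
    demo = [
        ("extract", lambda: time.sleep(0.01)),
        ("forge", lambda: time.sleep(0.05)),
        ("dgidx", lambda: time.sleep(0.02)),
    ]
    results = time_stages(demo)
    for stage, seconds in results.items():
        print(f"{stage}: {seconds:.3f}s")
    print("slowest:", slowest_stage(results))
```

If most of the time turns out to be in Forge, the CAS and extraction-step suggestions are the ones to look at. If it is in Dgidx, the property and dimension settings are the more likely place to start.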