I have a high-level requirement to display related keywords within my search results page. The spirit of this is to provide recommendations (that are likely broader in nature than the actual query itself) to users in the event they cannot find what they are looking for (likely because their query was too specific and over-constrained the results set).
In this initial launch I want to keep things as simple as possible. I've come up with some options (below) but am interested to see if anyone else has tried to tackle this problem:
Option 1: Index a set of high-volume keywords that could be recommended
In this approach I would come up with a set of high-volume keywords, likely from analytics, and have them available within the mdex. A user would do a search, and in addition to the regular content results, I'd be able to render a "related searches" box with results from the high-volume keyword record type.
Option 2: Use each record's keywords as a means of determining relatedness
Each piece of content within the mdex has one or more keywords related to it. While I have not done any analysis, I hypothesize an acceptable level of relatedness between keywords associated with a single document. Aggregating this data across the entire set of content would give me some quantitative information about how often word x appears with word y. Using this data I could render the "related searches" box probably more accurately, but with a greater level of complexity :-/
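To make the Option 2 aggregation concrete, here is a minimal sketch of the co-occurrence count. It assumes the keywords for each record have already been exported as plain lists of strings; the function names and sample data are hypothetical, and nothing here calls an Endeca API.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(records):
    """Count how often each pair of keywords is tagged on the same record."""
    pair_counts = Counter()
    for keywords in records:
        # Normalize and de-duplicate so a keyword repeated on one record counts once.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def related_keywords(pair_counts, keyword, limit=5):
    """Return the keywords that most often share a record with `keyword`."""
    keyword = keyword.strip().lower()
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a == keyword:
            scores[b] += n
        elif b == keyword:
            scores[a] += n
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [k for k, _ in ranked[:limit]]


# Hypothetical sample: one keyword list per record.
sample_records = [
    ["laptop", "notebook", "computer"],
    ["laptop", "computer", "battery"],
    ["notebook", "paper"],
]
counts = build_cooccurrence(sample_records)
related = related_keywords(counts, "laptop")
```

Raw counts favor keywords that are common everywhere, so a real version would probably need some normalization (for example, dividing by each keyword's overall frequency) before the results are good enough to show users.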
We spent the past 6 weeks working through a similar set of requirements. We chose option 1 <edited from the original posting> to support our similar request. The keywords could come from a report from an external analytics package, the Endeca logger, or the raw dgraph request logs, or, in your case, some other source. From there it's fairly trivial to integrate them into the pipeline as separate records. You'll probably need a new search interface with the appropriate relevancy modules.
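As a rough sketch of the keyword-selection step for option 1: the code below assumes the search terms have already been pulled out of whichever source you use into a plain list of strings, since the log and report formats differ. The function names, the record type name, and the property names are all hypothetical, not Endeca conventions.

```python
from collections import Counter


def top_keywords(queries, min_count=2, limit=100):
    """Rank normalized search terms by how often they were searched."""
    # Lowercase and collapse whitespace so trivial variants count together.
    normalized = (" ".join(q.lower().split()) for q in queries)
    counts = Counter(q for q in normalized if q)
    # Highest volume first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, n) for term, n in ranked if n >= min_count][:limit]


def to_records(keywords, record_type="RelatedKeyword"):
    """Shape the ranked terms as flat records (property names are made up)."""
    return [
        {"record_type": record_type, "keyword": term, "search_volume": n}
        for term, n in keywords
    ]


# Hypothetical sample of already-extracted search terms.
sample_queries = [
    "Red Shoes", "red  shoes", "blue shoes",
    "red shoes ", "blue shoes", "green hat",
]
keywords = top_keywords(sample_queries)
records = to_records(keywords)
```

The resulting records would then be written out in whatever format your pipeline ingests and kept apart from the regular content by the record type property, so the "related searches" box can query only that type.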