1 Reply Latest reply: Sep 14, 2007 11:57 AM by 589974

Basic sizing guidelines for Data Quality SSA Deduplication?

596526 Newbie
Is there a standard sizing approach towards Data Quality SSA de-duplication?
  • 1. Re: Basic sizing guidelines for Data Quality SSA Deduplication?
    589974 Newbie
    What is the volume of records to be deduplicated, and what is the desired throughput?

    SSA matching is very CPU intensive, and each matching process is effectively single-threaded, so adding CPUs helps while extra RAM has little impact. In most cases I would recommend starting with a 4-CPU box, testing the throughput, and going from there.

    Set the right expectation, though: matching is just one step in the data management process. The actual merge of the duplicates found often takes much longer, not to mention the user intervention needed to review the duplicates identified.
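
    To make that sizing advice concrete, here is a back-of-the-envelope sketch. It assumes matching parallelizes roughly linearly across CPUs (one single-threaded matcher per CPU, as described above); the throughput figure is hypothetical, not an SSA benchmark, so you would substitute the number you measure on your own 4-CPU test box.

    ```python
    def estimate_match_hours(record_count, recs_per_sec_per_cpu, cpu_count):
        """Rough wall-clock estimate for the matching step only,
        assuming near-linear scaling across CPUs (one single-threaded
        matcher process per CPU). Merge and manual review are extra."""
        total_throughput = recs_per_sec_per_cpu * cpu_count
        return record_count / total_throughput / 3600

    # Hypothetical figures: 10M records, 500 recs/sec per CPU, 4 CPUs
    hours = estimate_match_hours(10_000_000, 500, 4)
    print(f"matching alone: ~{hours:.2f} hours")
    ```

    Doubling the CPU count halves the estimate, which is why testing on a 4-CPU box first gives you the per-CPU throughput figure you need before committing to bigger hardware.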