What is the volume of records to be deduplicated, and what is the desired throughput?
SSA matching is very CPU intensive, and it runs essentially single-threaded on each CPU. Adding CPUs can therefore help, while adding RAM may not have a significant impact. In most cases, I would recommend starting with a 4-CPU box, testing the throughput, and going from there. We should also set the right expectation that matching is just one step in the data management process: the actual merge of the duplicates found often takes much longer, not to mention the user intervention needed to review the duplicates identified.
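To turn the test run on the 4-CPU box into a rough sizing number, a back-of-the-envelope calculation along these lines may help. This is only a sketch: the function name and parameters are made up for illustration, and it assumes throughput scales linearly with both CPU count and record volume, which real matching jobs may not do. It also covers the matching step only, not the merge or the manual review.

```python
def estimate_match_hours(total_records, sample_records, sample_seconds,
                         test_cpus=4, target_cpus=4):
    """Extrapolate matching time (in hours) from a timed sample run.

    Assumption (not a property of SSA itself): throughput grows linearly
    with the number of CPUs and stays constant as the volume grows.
    """
    if min(total_records, sample_records, sample_seconds,
           test_cpus, target_cpus) <= 0:
        raise ValueError("all inputs must be positive")

    # Records per second that one CPU handled during the sample run.
    per_cpu_rate = sample_records / sample_seconds / test_cpus

    # Projected records per second on the target box.
    projected_rate = per_cpu_rate * target_cpus

    return total_records / projected_rate / 3600


# Hypothetical numbers: 100,000 records matched in 100 seconds on 4 CPUs,
# and 7.2 million records to process in total.
print(estimate_match_hours(7_200_000, 100_000, 100, test_cpus=4, target_cpus=4))
print(estimate_match_hours(7_200_000, 100_000, 100, test_cpus=4, target_cpus=8))
```

If the measured throughput on a larger box falls well short of the projection, that is a sign the linear-scaling assumption does not hold for your data, and the test results should take precedence over the estimate.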