Compare row counts of source and target after an initial load
Does anyone have an existing solution to prove that the source (any database system) and the target (Hadoop Avro files) contain the same number of rows directly after an initial load?
I currently compare the two using COUNT(*), but the counts do not match, because the initial load takes many hours and the source changes while the load is running.
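To make the mismatch concrete, here is a minimal sketch of the comparison. It is not a tested solution. It uses an in-memory SQLite table as a stand-in for the source and a plain Python list as a stand-in for the rows written to Avro. The table name `orders`, the column `created_at`, and the `load_start` cutoff are all hypothetical. The bounded count assumes the source only receives inserts and that every row carries a reliable creation timestamp; it would not account for updates or deletes made during the load.

```python
import sqlite3


def source_count(conn, table, cutoff=None):
    """Count rows in the source table.

    With a cutoff, count only rows whose (hypothetical) created_at
    value is at or before the cutoff.
    """
    if cutoff is None:
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE created_at <= ?", (cutoff,)
    ).fetchone()[0]


# Stand-in source database with three rows present when the load starts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100), (2, 200), (3, 300)])

load_start = 300  # hypothetical timestamp taken when the initial load begins

# Stand-in for the rows that the initial load wrote to the Avro files.
target_rows = list(conn.execute("SELECT id, created_at FROM orders"))

# The source keeps changing while the (long-running) load is in progress.
conn.execute("INSERT INTO orders VALUES (4, 400)")

target = len(target_rows)
naive_source = source_count(conn, "orders")                # counts the late row too
bounded_source = source_count(conn, "orders", load_start)  # ignores rows after the cutoff

print("target:", target)
print("naive source count:", naive_source)
print("bounded source count:", bounded_source)
```

The plain COUNT(*) includes the row inserted during the load and so disagrees with the target, while the count restricted to the cutoff agrees. Whether such a cutoff is available depends on the source system and schema.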