Filter-based requests (including filter-based aggregation) are processed in batches on a single thread. A batch can contain any number of partitions (from 1 to all owned partitions).
Deserialized cache entries are kept in memory until the batch is finished (the filter may require deserialization, and the aggregator may require it again later).
If the whole cache were processed in one batch, it would cause too much memory pressure, so Coherence splits processing into several batches.
The first batch is usually a single partition. It is used to estimate the real memory pressure per partition for the request and to choose an optimal size for the following batches (often you will see exactly two batches: a single partition, then the rest of the partitions).
Each batch is a separate aggregate() call (or processAll() in the case of an entry processor).
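To make the batching idea concrete, here is a rough sketch in plain Java. It is not Coherence's actual heuristic and does not use the Coherence API. The class BatchPlanSketch, the method planBatches and its parameters (probeBytes, budgetBytes) are all hypothetical names made up for illustration. It only shows the shape of the logic: one probe partition first, then batch sizes derived from what the probe cost.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPlanSketch {

    /**
     * Hypothetical planner (an assumption, not Coherence code).
     * Returns the number of partitions in each batch.
     *
     * @param ownedPartitions partitions owned by this member
     * @param probeBytes      memory observed while processing the probe partition
     * @param budgetBytes     memory we are willing to spend on one batch
     */
    static List<Integer> planBatches(int ownedPartitions, long probeBytes, long budgetBytes) {
        List<Integer> batches = new ArrayList<>();
        if (ownedPartitions <= 0) {
            return batches;
        }

        // First batch: a single partition, used as a probe.
        batches.add(1);
        int remaining = ownedPartitions - 1;

        // Estimate how many partitions fit into the budget, at least one.
        long perPartition = Math.max(1L, probeBytes);
        long fit = Math.max(1L, budgetBytes / perPartition);
        int batchSize = (int) Math.min(fit, Integer.MAX_VALUE);

        // Split the remaining partitions into batches of that size.
        while (remaining > 0) {
            int size = Math.min(batchSize, remaining);
            batches.add(size);
            remaining -= size;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Cheap request: everything after the probe fits into one batch.
        System.out.println(planBatches(257, 1L << 20, 512L << 20)); // [1, 256]
        // Expensive request: the rest is split into several batches.
        System.out.println(planBatches(10, 100, 300));              // [1, 3, 3, 3]
    }
}
```

In this sketch every element of the returned list would correspond to one aggregate() (or processAll()) invocation, which is why a cheap request typically shows up as exactly two calls.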