Yep that would be what I would do - effectively a functional approach where no memory is required besides the current set and iteration.
A big part of materializing datasets for performance is finding the right grain that is both easy enough to calculate and also can do nice things like be resumeable for snapshots.
A big part of materializing datasets for performance is finding the right grain that is both easy enough to calculate and also can do nice things like be resumeable for snapshots.