The biggest fear is "AI is theft; the model just memorizes what you give it. The proof is that you can get it to reproduce its training data." When in reality, a model that is smaller than its training data literally cannot reproduce all of the images it was trained on.
The argument isn't that it can reproduce ALL images, but that it might occasionally reproduce one. And if that occasional reproduction is devastating to, e.g., the economic feasibility of a product built on the model, then that's an existential threat to that business model.
To make a really dumb example:
I can build a black box model, push 1000 bestseller novels through it, and give it a total "storage size" of one novel. The pigeonhole principle says my model can't possibly contain all 1000 novels. Ask it anything and it responds with that one novel, verbatim. It never reproduces anything else.
Does it now matter whether I trained it on 1000 novels? Does it matter that my "black box" is just that: a literal black cardboard box containing a single copy of one novel?
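The degenerate "model" above can be sketched in a few lines; this is purely illustrative (the class and names are made up for this comment), but it shows why "smaller than the training data" doesn't rule out verbatim reproduction of *some* training item:

```python
class BlackBoxModel:
    """A 'model' whose entire state is one stored novel."""

    def __init__(self):
        self.stored = None  # capacity: exactly one novel

    def train(self, novels):
        # "Trained" on all 1000 novels, but by the pigeonhole
        # principle only one can fit in the storage budget.
        self.stored = novels[0]

    def generate(self, prompt):
        # Ignores the prompt entirely; always emits the one
        # stored novel, verbatim.
        return self.stored


novels = [f"Full text of bestseller #{i}" for i in range(1000)]
model = BlackBoxModel()
model.train(novels)

# Any prompt yields the same verbatim copy of novel #0.
print(model.generate("Write me something original"))
```

Storage size of the trained model: one novel. Training set: 1000 novels. It still infringes on exactly one of them, every single time.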
Actually a really great example. Wouldn't be surprised if some enthusiastic future corporate-friendly circuit court ruling accidentally made this a legal copyright-removing machine.