It's work like this that makes me frustrated with the popular discourse around generative models (especially here). There's a lot we don't know about these models, and yet you get plenty of people arguing that they absolutely don't memorize, or that they learn like we do and so their learning should be treated like ours (legally and ethically). Then you get work like this showing that yes, they actually do some memorization and regurgitation. There's still so much we don't know here.
My fear is that when things like this come up for lawsuits, overconfident experts are going to talk out of their asses about how these models do or don't work, and that's going to determine how automation affects our society.
On a technical level, I'd love to see a patch-wise version of this investigation. This one shows whole images being regurgitated near-exactly, but only rarely. I expect that small patches of images are regurgitated far more often. But is it simple stuff like edges being regurgitated, or are larger parts regurgitated frequently too? Given the architectures generally used, I'd guess it's significant.
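For concreteness, here's a minimal sketch of the kind of patch-wise check I have in mind (Python/numpy). The patch size, stride, and raw-pixel cosine similarity are all arbitrary choices on my part, and a real study would likely compare learned embeddings instead; it also assumes you've already narrowed things down to a small candidate set of training images per generation:

    import numpy as np

    def extract_patches(img, patch=32, stride=16):
        # Slide a window over an HxWxC array; patch/stride are assumed values.
        h, w = img.shape[:2]
        return np.stack([
            img[y:y + patch, x:x + patch].ravel()
            for y in range(0, h - patch + 1, stride)
            for x in range(0, w - patch + 1, stride)
        ]).astype(np.float32)

    def max_patch_similarity(gen_img, train_img, patch=32, stride=16):
        # Highest cosine similarity between any generated/training patch pair.
        a = extract_patches(gen_img, patch, stride)
        b = extract_patches(train_img, patch, stride)
        a /= np.linalg.norm(a, axis=1, keepdims=True) + 1e-8
        b /= np.linalg.norm(b, axis=1, keepdims=True) + 1e-8
        return float((a @ b.T).max())

A histogram of these max-similarity scores over many generations would hint at whether the matches are trivial edge/texture patches or larger chunks of composition.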
The thing I gather is that a lot of these people were never experts. Just like with crypto, a lot of grifters and hype men gather around the tech because they see it as a get-rich-quick scheme, and being grifters, they're far too lazy to learn any of the actual details. Instead they just harp on the popular narratives, among them "your brain is a neural net!", which I've heard repeated ad nauseam even here on HN for almost ten years now.
I'm not sure you can do "small part-of-the-image patches" when comparing against 175,000,000 images. And I don't mean that from a scale/processing perspective; I mean you'd always get tons of small-patch false positives from any realistic-looking image.
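A quick back-of-the-envelope (with assumed but plausible numbers) shows why:

    # Rough scale of a patch-wise search against a 175M-image training set.
    # The patch counts and false-positive rate below are assumptions.
    train_images = 175_000_000
    patches_per_image = 961    # e.g. 512x512 image, 32px patches, 16px stride
    gen_patches = 961          # same tiling on one generated image

    comparisons = train_images * patches_per_image * gen_patches
    fp_rate = 1e-9             # suppose unrelated patches "match" one in a billion
    print(f"{comparisons:.2e} comparisons, ~{comparisons * fp_rate:,.0f} spurious hits")
    # -> 1.62e+14 comparisons, ~161,616 spurious hits

Even at a one-in-a-billion per-pair false-positive rate, a single generated image would rack up six figures of spurious matches, so tiny patches tell you almost nothing.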
The relevant law here isn't patent law. It is copyright and trademark law. If you memorize a famous (recent) painting and recreate that as closely as you can, is it copyright violation or a transformative derivative work? I guess it depends on your process, intent, and fidelity. Oh, and on local laws.
> is it copyright violation or a transformative derivative work?
Short answer: for any country participating in WIPO, yes it is.
Edit: And if you don't believe me, let's replace the word "painting" with the word "song":
> If you memorize a famous (recent) song and recreate that as closely as you can, is it copyright violation or a transformative derivative work?
Hopefully everyone here knows the answer is "absolutely yes!" which is why artists need permission to cover other people's music.
Now, yes, intent factors in, insofar as it affects a potential fair use defense. But that doesn't affect the status of the work; it only determines whether the copyright violation is defensible.
This is how Google gets away with returning images in their search results, and why I can't just copy an image they return and use it without violating copyright myself.
It does if it's over 20 years from the application date.
It does if you match on a design published before the application date.
It does if you change the design just enough that it doesn't read on a single claim.
Honestly, training a generative model on the patent database could be very useful for inventing and invalidating patents. I wouldn't be surprised if examiners are using it in 5+ years. Given the response counts and deadlines they're required to hit, I wouldn't be surprised if they started using it just to speed up their own work.