
The problem and solution are independent of the implications of this finding and how those implications are likely to influence the legal landscape.

To me, the implication is that these models cannot be seamlessly exchanged for a human brain when considering their impact and compatibility with current laws.



Human brains can produce near duplicates of things they have seen in the past.

Of course, like a human, an NN can be used to generate copyright-infringing images, but it doesn't follow that any generated image is infringing.


> Human brains can produce near duplicates of things they have seen in the past.

Most humans cannot, and this is important, because the fact that most humans cannot arguably played a role in the formulation of all current rules.

So can cameras. And there are laws that restrict where cameras can be pointed in places that do not restrict human sight, because the two kinds of "seeing" are distinctly different.

Layering AI into the mix doesn't suddenly remove or mitigate the technical realities of the software.

> but it doesn't follow that any generated image is infringing.

I agree, and this is not my claim.


Laws very rarely restrict where cameras can be pointed, and most of those laws focus on commercial redistribution. They almost never stop individuals from doing so.

Furthermore, the laws tend to be a complete wreck of logical paradoxes that fall apart in situations like this, hence forcing more case law to be generated.


I think the reason those restrictions exist matters more here than their rarity. The implications of unrestricted ingestion of public and private data are of a similar magnitude to the implications of unrestricted camera use, which makes it as good a candidate for legal restriction as the camera uses already restricted by law.

The reason you cannot record what's going on in a bathroom has as much to do with the implications of the recording as it does with the implications of being observed in the first place.

I don't disagree about the ensuing mess of laws, but I don't think we have a reason to believe Stable Diffusion will be spared from it.


Why?

Are you saying humans are incapable of accidentally reproducing previously seen work?


Many of the core arguments that Stable Diffusion training is fair use claim that this computer program is similar enough to a human brain to qualify for the protections described by copyright law. They base that claim of similarity on the idea that the software doesn't copy, it just learns, and that all of its output is fundamentally new.

I'm arguing that a finding like this harms such core arguments, and highlights just one of many ways these models are entirely unlike humans.

> Are you saying humans are incapable of accidentally reproducing previously seen work?

To conclude that would be a propositional fallacy, and possibly equivocation, IMO.

While I acknowledge that it's possible someone might "accidentally remember" someone else's work and then create a piece that is very similar to it, this hardly seems likely as a general case, and such a possibility was baked into the current rules as written. To take this further and claim that a human could do so with precision and in a reproducible manner seems questionable.

The Stable Diffusion equivalent of this kind of "remembering" and resulting duplication is again different in context and content from that of a human exposed to the same images.

The fact that it can be replicated systematically is the most distinctly non-human part, and is a strong hint that we're comparing very different things.
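On systematic replication: a diffusion pipeline is, at bottom, a deterministic function of its inputs once the random seed is fixed. A minimal sketch, using a hypothetical `generate()` stand-in (a hash, not a real sampler) purely to illustrate the property:

```python
import hashlib

def generate(prompt: str, seed: int) -> str:
    # Hypothetical stand-in for a diffusion sampler: a pure function of
    # (prompt, seed). Real pipelines share this property: fixing the seed
    # and prompt fixes the output exactly, every time.
    return hashlib.sha256(f"{prompt}|{seed}".encode()).hexdigest()

a = generate("a painting of a lighthouse", seed=42)
b = generate("a painting of a lighthouse", seed=42)
print(a == b)  # True: identical inputs reproduce the output exactly
```

This is what makes extraction repeatable in a way human recollection is not: anyone holding the same model, prompt, and seed gets a bit-identical result.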


> I'm arguing that a finding like this harms such core arguments

Why? If a really good artist studies a single piece long enough, they might be able to reproduce it to a degree where it takes expert analysis to determine which is the original. It's not as if there have never been forgeries of expensive artworks.

The difference between a human studying a certain piece intensely, and a model overfitting to it, is that to the model, it happens by accident. Overfitting to the training set is not a desired outcome, it's something ML techniques are trying to actively avoid.
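What overfitting looks like can be sketched with an ordinary polynomial fit rather than a diffusion model (the data and degrees here are made up for illustration): give a model enough capacity relative to its training set and it will reproduce the training points, noise and all.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny "training set": 8 noisy samples of the line y = 2x.
x_train = np.linspace(0.0, 1.0, 8)
y_train = 2.0 * x_train + rng.normal(0.0, 0.1, size=x_train.size)

# Unseen points drawn from the same underlying line.
x_test = np.linspace(0.05, 0.95, 50)
y_test = 2.0 * x_test

def errors(degree):
    """Fit a polynomial of `degree` to the training set; return (train_mse, test_mse)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

# Degree 7 has enough capacity to pass through all 8 points: it memorizes
# the training data, including its noise, essentially exactly.
mem_train, mem_test = errors(7)

# Degree 1 can only capture the overall trend, not individual points.
gen_train, gen_test = errors(1)

print(f"memorizing fit: train={mem_train:.2e}  test={mem_test:.2e}")
print(f"simple fit:     train={gen_train:.2e}  test={gen_test:.2e}")
```

The high-capacity fit drives training error to (numerically) zero by reproducing its inputs verbatim, which is exactly the behaviour regularization, dropout, and similar techniques exist to suppress.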


> Why? If a really good artist studies a single piece long enough, he could be able reproduce it

Part of the answer is in your question. A really good artist is rare, and forgery is a specialized skill.

The point is not that humans are capable of forgery, but that the model/software is capable of it with minimal effort, and enables anyone regardless of skill to achieve similar outcomes.

Setting aside the resulting privacy issues, this has major implications. Two are:

1. The risk of forgery previously accepted by a copyright framework that assumed human actors has drastically changed.

2. The claim that Stable Diffusion produces only new works and never reproduces the originals it was trained on was at the heart of the argument that SD model training is fair use, and regardless of why, we now know this is not true.

> The difference between a human studying a certain piece intensely, and a model overfitting to it, is that to the model, it happens by accident.

That is one difference in the factors surrounding the initial creation of the output.

The differences continue to stack up as you examine the process of learning, the algorithms that produce new output, the computational context (both hardware and software), the agency of the entity creating the output, etc.

The software doesn’t do anything by accident. It does exactly what it has been instructed to do.

The humans training the model are at the core of any accident. This might seem obvious, but I think it bears restating because it highlights the nature of the situation. The model is not an independent actor.

Whether or not it’s a desired outcome is not relevant. The fact that it is possible - and that this possibility has further implications about the nature of the software itself - is what I argue hurts the standard arguments claiming Stable Diffusion is not infringing by virtue of its similarity to human processes of thinking and expression.


> A really good artist is rare, and forgery is a specialized skill.

People who could machine very small cogwheels precisely enough used to be rare, which is, as far as I know, part of the reason Charles Babbage never completed the Analytical Engine (prohibitive costs for the parts). Today we mass produce tiny cogwheels and other mechanical parts due to automation.

> The claim that Stable Diffusion produces only new works and never reproduces the originals it was trained on was at the heart of the argument that SD model training is fair use, and regardless of why, we now know this is not true.

Brushes, canvas, and pigments can be used to produce non-derivative works as well. So can pencils, or photocopiers. There are thousands of tools that could be used to infringe copyright or forge other people's works. We don't blame the car when bank robbers use it as a getaway vehicle.


> People who could machine very small cogwheels precisely enough used to be rare

I don't think an analogy that compares the manufacture of machinery with the creation of artwork is a good one. I understand the similarity you're teasing at, but I think this is a category error.

> Brushes, canvas and pigments can be used to produce non-derivative works as well.

None of these tools require the original artwork of other artists to function. They are primitive tools, and I don't think it's reasonable to include them in the same category as an AI that was explicitly trained for a particular purpose.

> We don't blame the car when bank robbers use it as a getaway vehicle.

We might blame the car manufacturer though if the car was autonomous and had been trained on a dataset of all drivers and driving styles, and had "learned" to be a getaway car, replete with situational awareness of a typical robbery scene, knowledge of how to avoid cops, etc.


> I don't think an analogy that compares the manufacture of machinery with the creation of artwork is a good one.

Why not? The analogy just shows that there are many prior examples of skills that only comparatively few humans had, and/or that humans couldn't perform consistently, cheaply, quickly, or at scale, until suddenly the tasks were automated. My point is: if this happened regularly in the past, why would it suddenly be a problem now?

> None of these tools require the original artwork of other artists to function

They do if someone deliberately wants to infringe copyright and/or forge someone else's work, because without an example, how would they do that?

> an AI that was explicitly trained for a particular purpose.

That purpose being "translate an input prompt into a consistent image".

Yes, people can use them to do bad things, just as they can use cars as getaway vehicles. Neither is the fault of the tool, but of the people using them for nefarious purposes.

> We might blame the car manufacturer though if the car was autonomous and had been trained on a dataset of all drivers and driving styles, and had "learned" to be a getaway car, replete with situational awareness of a typical robbery scene, knowledge of how to avoid cops, etc.

Why would we wait to assign blame until the car was autonomous? Why not start with cars that have, e.g., strong engines? I'd guess a strong engine is a desirable property for a getaway vehicle. So why don't we blame strong engines?

For the same reason it doesn't make sense to blame generative AI for people using it for copyright infringement: strong engines have a ton of legitimate, useful properties. The fact that some people use them for bad things doesn't outweigh their usefulness to the many more people using them for good things.

The hypothetical car that was trained on tons of extreme driving techniques might also use that training to avoid a crash in an emergency situation, or to detect dangerous behaviour of other vehicles sooner.

Again, we don't blame the tools, we blame the people using them for bad things.


Most humans are not capable of exactly reproducing previously seen work even on purpose. Every human artist has a unique style that shows even when they try to imitate. That's why perfect forgery is a form of art in itself :)


Correct



