
> I'm arguing that a finding like this harms such core arguments

Why? If a really good artist studies a single piece long enough, he might be able to reproduce it to a degree where it takes expert analysis to determine which is the original. It's not as if there have never been forgeries of expensive artworks.

The difference between a human studying a certain piece intensely and a model overfitting to it is that for the model, it happens by accident. Overfitting to the training set is not a desired outcome; it's something ML techniques actively try to avoid.
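(As an aside, this is also why training pipelines watch a held-out validation set: that's how accidental memorization gets caught. A toy sketch in Python - the data, polynomial degrees, and split are purely illustrative, not from any real diffusion training run:

    import numpy as np

    # Fit polynomials of increasing degree to noisy samples of sin(3x).
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 40)
    y = np.sin(3 * x) + rng.normal(0, 0.2, 40)
    x_train, y_train = x[:30], y[:30]   # data the model learns from
    x_val, y_val = x[30:], y[30:]       # held-out data it never sees

    for degree in (1, 3, 15):
        coeffs = np.polyfit(x_train, y_train, degree)
        train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        val_err = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
        print(f"degree {degree:2d}: train MSE {train_err:.3f}, val MSE {val_err:.3f}")

The high-degree fit drives training error toward zero while validation error climbs: it has started memorizing the training points rather than learning the curve, which is exactly the failure mode practitioners regularize against.)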



> Why? If a really good artist studies a single piece long enough, he might be able to reproduce it

Part of the answer is in your question. A really good artist is rare, and forgery is a specialized skill.

The point is not that humans are capable of forgery, but that the model/software is capable of it with minimal effort, enabling anyone, regardless of skill, to achieve similar outcomes.

Setting aside the resulting privacy issues, this has major implications. Two are:

1. The risk of forgery previously accepted by a copyright framework that assumed human actors has drastically changed.

2. The claim that Stable Diffusion only produces derivative works, never reproductions of the originals, was at the heart of the argument that SD model training is fair use. Regardless of why it happens, we now know this claim is false.

> The difference between a human studying a certain piece intensely, and a model overfitting to it, is that to the model, it happens by accident.

That is one difference in the factors surrounding the initial creation of the output.

The differences continue to stack up as you examine the process of learning, the algorithms that produce new output, the computational context (both hardware and software), the agency of the entity creating the output, etc.

The software doesn’t do anything by accident. It does exactly what it has been instructed to do.

The humans training the model are at the core of any accident. This might seem obvious, but I think it bears restating because it highlights the nature of the situation. The model is not an independent actor.

Whether or not it’s a desired outcome is not relevant. The fact that it is possible - and that this possibility has further implications about the nature of the software itself - is what I argue hurts the standard arguments claiming Stable Diffusion is not infringing by virtue of its similarity to human processes of thinking and expression.


> A really good artist is rare, and forgery is a specialized skill.

People who could machine very small cogwheels precisely enough used to be rare, which is, as far as I know, part of the reason Charles Babbage never completed the Analytical Engine (prohibitive costs for the parts). Today we mass-produce tiny cogwheels and other mechanical parts thanks to automation.

> The claim that Stable Diffusion only produces derivative works, never reproductions of the originals, was at the heart of the argument that SD model training is fair use. Regardless of why it happens, we now know this claim is false.

Brushes, canvas, and pigments can be used to produce non-derivative works as well. So can pencils, or photocopiers. There are thousands of tools that could be used to infringe copyright or forge other people's works. We don't blame the car when bank robbers use it as a getaway vehicle.


> People who could machine very small cogwheels precisely enough used to be rare

I don't think an analogy that compares the manufacture of machinery with the creation of artwork is a good one. I understand the similarity you're getting at, but I think this is a category error.

> Brushes, canvas and pigments can be used to produce non-derivative works as well.

None of these tools require the original artwork of other artists to function. They are primitive tools, and I don't think it's reasonable to include them in the same category as an AI that was explicitly trained for a particular purpose.

> We don't blame the car when bank robbers use it as a getaway vehicle.

We might blame the car manufacturer though if the car was autonomous and had been trained on a dataset of all drivers and driving styles, and had "learned" to be a getaway car, replete with situational awareness of a typical robbery scene, knowledge of how to avoid cops, etc.


> I don't think an analogy that compares the manufacture of machinery with the creation of artwork is a good one.

Why not? The analogy just shows that there are many prior examples of skills that only comparatively few humans could perform, or that humans couldn't perform consistently, cheaply, quickly, or at scale, until suddenly the tasks were automated. My point is: if this happened regularly in the past, why would it suddenly be a problem now?

> None of these tools require the original artwork of other artists to function

They do if someone deliberately wants to infringe copyright and/or forge someone else's work, because without an example, how would they do that?

> an AI that was explicitly trained for a particular purpose.

That purpose being "translate an input prompt into a consistent image".
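In fact, that's essentially the whole public interface. A minimal sketch using the open-source Hugging Face diffusers library (the model id and prompt are illustrative examples, not anything prescribed by the model):

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pretrained Stable Diffusion checkpoint (illustrative model id).
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")

    # The entire user-facing "purpose": text prompt in, image out.
    image = pipe("a lighthouse at dusk, oil painting").images[0]
    image.save("output.png")

Nothing in that interface is specific to copying anyone; what comes out depends on what the user asks for.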

Yes, people can use them to do bad things, just as they can use cars as getaway vehicles. Neither is the fault of the tool; the fault lies with the people using them for nefarious purposes.

> We might blame the car manufacturer though if the car was autonomous and had been trained on a dataset of all drivers and driving styles, and had "learned" to be a getaway car, replete with situational awareness of a typical robbery scene, knowledge of how to avoid cops, etc.

Why would we wait to assign blame until the car was autonomous? Why not start with cars that have, e.g., strong engines? I'd guess a strong engine is a desirable property for a getaway vehicle. So why don't we blame strong engines?

For the same reason it doesn't make sense to blame generative AI when people use it for copyright infringement: strong engines have a ton of legitimate, useful properties. The fact that some people use them for bad things doesn't outweigh their usefulness to the many more people using them for good things.

The hypothetical car that was trained on tons of extreme driving techniques might also use that training to avoid a crash in an emergency situation, or to detect dangerous behaviour of other vehicles sooner.

Again, we don't blame the tools, we blame the people using them for bad things.



