There is also another previous study showing Stable Diffusion (SD) emitting images from its training set [0].
It is now clear that SD is treading on thin ice: training on watermarked and copyrighted images without their authors' permission, then attempting to commercialize the result even as the model emits images that bear a high similarity to the original training data, including watermarks and copyrighted works (Mickey Mouse, Getty Images watermarks, the Bloodborne cover art, etc.).
This weakens their fair use argument, especially with Getty Images also threatening to sue them for the same reason. If OpenAI was able to get permission to train on Shutterstock images [1], then SD could have done the same, but chose not to.
Perhaps SD thought they could get away with it and launch their grift (DreamStudio) built on digital images and artists. It turns out SD has since created an opt-out system after the fact, and artists can at least find out whether their images are in the training set [2].
Note that the authors of this study believe that deduplication should be effective with SD, where there is a large ratio of dataset size to model size. (See section 7.1.)
There would have to be a new law, because Stable Diffusion is a type of search-space algorithm, and previous rulings held that using watermarked images is fair use for search engines.
Hmmm, and why should an artist have to keep track of any and all companies that may be scraping their online portfolio (which any professional or even semi-professional artist is expected to have) and opt out, just in case their work has been used? The burden should be on the company profiting from their work.
[0] https://arxiv.org/pdf/2212.03860.pdf
[1] https://www.prnewswire.com/news-releases/shutterstock-partne...
[2] https://haveibeentrained.com/