>it's also not using many of the complex and more efficient extraction approaches that have been used on GANs and such in prior research
Some links?
>most of their 175 million images comes from effectively "retrying" each prompt 500 times
The prompts come from the most duplicated samples in the dataset, a really important detail if you actually want to use this method in the wild. This is also one of the reasons I said this attack seems so implausible.
>they're usually much more targetted than this
Even if you target specific images, you would still need an absurd amount of luck: if even the most duplicated samples only yield 109 extractions, we can be generous and assume the whole dataset yields something like 200 matches, and the probability of hitting a memorized image with a direct attack is still on the order of one in a million (even if you know the prompt). And that's not even a model trained on a deduplicated dataset.
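A quick sanity check of that back-of-the-envelope rate, using the 175 million generations and 109 extracted images discussed above (the ~200-match figure is my generous assumption, not a number from the paper):

```python
# Per-generation hit rate implied by the numbers above.
total_generations = 175_000_000   # images generated in the attack
extracted_images = 109            # memorized images actually recovered
generous_matches = 200            # generous assumption for the full dataset

hit_rate = extracted_images / total_generations
generous_rate = generous_matches / total_generations

print(f"observed hit rate: {hit_rate:.2e}")   # ~6.23e-07
print(f"generous hit rate: {generous_rate:.2e}")  # ~1.14e-06
```

Even under the generous assumption, a single generation has roughly a one-in-a-million chance of reproducing a training image, and that's with the attacker already knowing which prompts to try.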
>Is it implausible if they've done it in this paper?
For extracting images in the wild, yes. The authors of the paper have access to the dataset, so they can sort prompts and images by how often they appear in it, and they have an enormous amount of compute to throw at the problem: generating 175 million images with a diffusion model is an extremely resource-intensive task.
Anyway, I don't think the point was to show that people can stumble onto these incidents by accident, but rather that extraction is possible at all. It's hard to see how this won't affect the ongoing suit.
In the case of Stable Diffusion, yes, the dataset is publicly available, but these types of attacks would make much more sense if the attacker wants to extract private data.