Midjourney's Incredible Copying of Images
Scraped from the bowels of the internet packed in shiny Discord newb channels.
Hey Guys,
Midjourney is incredible; it’s maybe one of the most accessible and best-performing text-to-image generators. Part of the reason these generative models are controversial, though, is how they scrape the internet.
David Holz is the creator of Midjourney, the now very popular and also controversial AI image rendering platform. But is it really that controversial? Hey, everyone is doing it!
I respect him actually for admitting the obvious, that his company never received any consent for the use of hundreds of millions of images it used. These were fed to its AI for the sake of training it at image generation.
I’m not sure why it matters, but somehow a Tweet went viral. Holz’s revelation spread when Twitter users shared an interview (paywalled) that Forbes Magazine conducted with the entrepreneur in September of this year.
During the Q&A with the famous business magazine, the Midjourney founder is at one point asked if he sought consent from living artists or owners of work still under copyright for his AI training.
Holz bluntly answered, “No. There isn’t really a way to get a hundred million images and know where they’re coming from.” This of course should have been obvious.
Everything from ChatGPT to GitHub Copilot to DALL-E 2 just scrapes the internet. And we are talking about the infinitely incredible A.I. of 2022? It’s just funny. These companies laugh in the face of copyright under a “fair use” argument. They claim that gaining permission from the creators of so many different photos and other visuals would have been extraordinarily difficult, if not impossible.
Everyone is cutting corners. I wonder what Lensa A.I. did to get to that platinum-selling app, with its generated pictures made to “augment” your profile pictures? To be honest, it’s legitimizing deepfake culture and an anonymous internet culture that’s even more difficult to moderate. It’s obviously degrading to women. Prisma Labs was basically built by a bunch of scheming Russians. That they made $Billions on the back of Generative A.I. hype in 2022 is the sad state of A.I. today.
I have much more respect for Stability A.I. and Midjourney. The Midjourney creator further elaborated, “It would be cool if images had metadata embedded in them about the copyright owner or something. But that’s not a thing; there’s not a registry.”
In scraping our internet and transforming it with generative models, are we building something actually good or useful or engaging? There’s no regulation of internet culture, so it’s the wild, wild west as far as rule of law goes, and monetization reigns supreme. These are really poor incentives for A.I. to evolve with. The commercialization of Generative A.I. could result in major harms: foundational bias, and uses that, like algorithmic feeds, might damage people’s mental health, undermine free speech and democracy, increase civil unrest, and even create conditions in which civil wars become more probable.
And yet, it’s business as usual? That’s not a healthy version of Capitalism.
He also said that “There’s no way to find a picture on the internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.” So while You.com seeks to add a ChatGPT competitor with citations, and others play around with the idea, we’re mostly going in blind and riding a virus unleashed into the wild.
Yet people are finding novel ways to use ChatGPT and Midjourney together.
The Generative A.I. hype isn’t necessarily innovative, it’s mostly degrading rule of law, normalizing deepfake culture and stimulating a less human internet. ChatGPT hype is a wave for commercial profit, and everyone wants in on the game. Of course if you’ve studied Silicon Valley and its VC system, you’ve seen many such hype cycles before (and know how they usually turn out).
PetaPixel notes: Holz’s words have resurfaced in interviews of outraged Twitter users, partly due to an artist protest against AI images. I’ve been thinking in my “end of year” ruminations about how Anti-AI protests are getting shut down, barely covered by the mainstream media and more or less suppressed. This will tempt provocative narrative builders on Substack to build a BigTech revolt movement. I expect it to be born in 2023 or 2024. A.I. ethics is a growing concern, and every indication I’ve seen suggests 2023 will be worse for the health and well-being of the internet.
OpenAI, for not being open, has started a trend. Some even speculate this will become the norm.
This means 2023 could usher in a dark era of the commercialization of A.I. without rules, ethics, or much rule of law. I won’t speculate on the harm ChatGPT “hallucinations” could cause people, but suffice it to say it was enough to ban it on Stack Overflow.
Guys, this is not A.I. Utopia. Also in the interview, Holz says that Midjourney’s dataset was built from “a big scrape of the internet.” This is basically just A.I. learning to copy the world of people. Is that A.I. for Good at work? If GitHub can replace some of your coders with a tool that makes software engineers more productive, is that necessarily a good thing? Hey, only $19 for an enterprise account!
“What baffles me is that David Holz blatantly admits to theft and copyright infringement in this article! His attitude is, ‘yeah, we stole from you to build a platform that we make a profit from, what are you going to do about it,'” says artist Dave Lung.
Standing on the Shoulders of Monopoly Capitalists
People like David Holz and Sam Altman make these various disclaimers on Twitter, also for their own self-interest. It makes them look more transparent. It’s all just a VC game to them in the end. Their entire job is to maximize profit with the least legal liability. And if any lawsuits came, we know that in America they would be at least a decade in the future. Such is the obscene corruption of Silicon Valley as it relates to Congress and Washington, lobbying and Senators on the take.
Facebook and Google have led with mafia tactics, so why should we expect anything different from mere startups? Even if such a registry did indeed exist for so many photos, the sheer quantity of them, likely in the hundreds of millions, would make actually seeking consent from each image’s creator into a task worthy of the pyramid builders.
Creating pyramid schemes and walled gardens, from Bitcoin to Facebook or Google, is what Silicon Valley does. It’s a lawless aristocracy of white men in Venture Capital calling the shots. Didn’t you know?
We cannot expect Midjourney to behave otherwise. Stability AI announced it would allow artists to remove their work from the training dataset for an upcoming Stable Diffusion 3.0 release. The move comes as an artist advocacy group called Spawning tweeted that Stability AI would honor opt-out requests collected on its Have I Been Trained website. The details of how the plan will be implemented remain incomplete and unclear, however.
How would we even implement this at scale? It makes no sense. These startups don’t want to do any heavy lifting for citations, so it all rests on the artists? It’s practically criminal.
Generative A.I. is not all hype and app monetization. There is a dark side to Generative A.I. The internet is about to become even less trustworthy, with even more spam and even less ability for A.I. to help moderate platforms that are already letting in a lot of fake content and misinformation. The internet has become more about sales, spam, PR and online personas that have little to do with actual people. The flood of content from Generative A.I. won’t be doing it any favors.
There’s already been an exodus from legacy social media. This might actually make it worse.
Despite the implicit logistics hurdles, the artistic Twitter backlash from some corners against this admission by Holz shouldn’t have surprised anyone. It’s hurting real people, even if they are limited in their ability to advocate for their own online human rights.
There is not a great deal of poetic justice when Silicon Valley comes knocking on your craft with roses and cheap knock-offs.
So what are we going to do? Mostly nothing. 2023 will be a chaos-spawn of more infringement on humans and their creative work, thanks to A.I. put on a pedestal and pushed by rich men. Silicon Valley and the CCP are now more and more able to manipulate the internet as a whole, as evidenced by Elon Musk buying Twitter for $44 Billion. LinkedIn itself has this PayPal Mafia vibe all over it, even to this day.
And we’re supposed to trust the internet? Find entertainment in these tools and products? Is this all that humanity is capable of?
What’s more, this backlash comes right on the heels of an earlier but still ongoing protest against the art sale platform ArtStation, because the site is now allowing AI-rendered images to be sold on it. ArtStation recently decided to ban anti-AI images.
So to protect the Generative A.I. swag, we are now censoring anti-AI images and online protests? That’s super ghetto and dystopian, folks.
While the majority of the uproar in recent days has come from digital artists, it’s something that affects photographers exactly the same. It affects more professionals than we care to realize. I recently wrote a Letter about it that I’m about to publish on the A.I. Supremacy Newsletter.
While we praise the tools, let’s also admit the dangers and point out the A.I. ethics issues involved here. It’s not all for the good of Microsoft and more subscription services, you know. We are steering the internet down a dark alley, and it’s straight out of a dark dystopian Sci-fi novel I may have read in the 1990s.
So, the right thing to do to make up for all the shortcuts taken would be to establish and maintain a trust fund from which public services could be rendered. “Pay it forward” might be the way to go, don’t you think?