This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
A few followups to last week's post on the shifting political alignment of artists:
HN: Online art communities begin banning AI-generated images
The AI Unbundling
Vox: What AI Art means for human artists
FurAffinity was, predictably, not the only site to ban AI content. Digital artists online are in crisis mode, and you can hardly blame them -- their primary income source is about to disappear. A few names for anyone here still paying for commissions: PornPen, Waifu Diffusion, Unstable Diffusion.
But what I really want to focus on is the Vox video. I watched it (and its accompanying layman explanation of diffusion models) with the expectation it'd be some polemic against the dangers of amoral tech nerds bringing grievous harm to marginalised communities. Instead, what I got was this:
So, although artists are organising a reactionary/protectionist front against AI art, the media seems to be siding with the techbros for the moment. And I kind of hate this. I'm mostly an AI maximalist, and I'm fully expecting whoever sides with Team AI to gain power in the coming years. To that end, I was hoping the media would make a mistake...
Reminds me of that bodybuilding/math article someone was looking for in the Sunday thread. The premise is that there are two types of 'hard': bodybuilding hard (where we know how to do something, but you have to put in the effort) and math hard (where it's hard to figure out, but once you do, it's easy).
AI is basically just plowing through the 'math hard' stuff. And for the most part, creative arts are mostly 'math hard'.
But how much will society really suffer if art, on average, becomes higher quality (because low-quality art basically vanishes when an AI can do it well for free)? It reminds me a bit of projectionists in movie theaters. For many years you heard projectionists sounding the alarm about what would happen if they were replaced, how the theater experience would drop. But the honest truth is that most projectionists were teenagers who had no idea what they were doing, and the average experience for theatergoers was that the film would be poorly lit, framed incorrectly, that reels wouldn't be changed properly, and that sound and video would be out of sync.
Sure, in a few large cities there were great projectionists who made a film dramatically better. But for the whole of society, the death of the projectionist was a net benefit.
Anyways, the average artist (and the crappy artist) probably benefits the most from AI art, because they can increase their output and potentially their quality. For those who get stuck (like writer's block), AI is a great way to blow through that. And maybe society as a whole benefits if people who have great vision, but not the skill to put their vision on paper, have the tools to be able to do so.
Most 'great' artists will survive fine, since most 'great' artists are actually just 'good' artists who built a brand. At some point AI will figure out how to build a brand, an audience, a following, and that'll be a major turning point for humanity.
Who knows, maybe the rise of AI in digital media will lead to an increase in demand for in-person, local arts, like theater. If everyone can simply type an idea into a prompt and get a theatrical quality film, then the real treat is going to be seeing performances live (just like with concerts).
This is definitely not as bad as it could have been, but I find the reasoning here really strange. Since when is the time evolution spent creating something a good measure of difficulty? Evolution has been "perfecting" tons of chemical reactions since before there were multicellular organisms, and it's trivial for us to cause chemical reactions.
I can't speak for Ted Underwood and it's possible that he hasn't given it much thought.
But it's reasonable in this specific context, because evolution consists of a semi-random exploration of the fitness landscape, and neural net training is an attempt to discover the global minimum in the loss landscape; length of training is trivially expected to contribute to the «polish» and optimization of the feature – some old things like ribosomes may well be approaching the thermodynamic limits of efficiency, and then there's... how we do arithmetic (I've made this point to darwin somewhere in this thread). Animals have been navigating 3D space for a long time; as a result, they're pretty good at it.
Further, short-term evolution is necessarily dominated by simple changes – often just a few substitutions here and there affecting quantitative parameters like the rate of expression of a protein which upregulates the secretion of some hormone, which leads to general size change and allometric growth – in other words, unequal scaling of body parts with changed size. Or, even more commonly and to a greater extent, selection on «standing» variation, changing distributions of already polygenic traits:
Short bursts of evolution like those are simple to approximate with technology: once we achieve the very basic performance (at least using a somewhat analogous architecture, like with connectionist models), we can keep going, scaling, even if the exponent is more punishing than it was in the organic substrate.
Our higher-order cognition including symbolic thought (probably necessary for art) and speech is physically implemented on the array of almost homogenous cortical columns (with some priority for Wernicke and Broca areas in the case of speech), which has been scaled up by a factor of like, two in the last 2 million years, or something to this effect, depending on where you start assuming hominids had any semblance of speech; and Cochran argues even that was only part of the prerequisite, with real hot stuff – including cave art – starting to happen tens of thousands of years ago. So the expectation is that the change was something even simpler.
Having (presumably) discovered the general trick to learning, particularly in the domain of image recognition, and shown it with decent machine vision and other achievements, we can reasonably expect to cover the rest of the ground very quickly with scaling and scientifically trivial tweaks – which is all there is to those generative models.
We haven't yet shown equivalent mastery in tasks involving locomotion of real robots, though that's probably more an issue of iteration time.
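To make the landscape comparison above concrete, here's a minimal toy sketch (the numbers and code are purely illustrative, not from the original comment) contrasting evolution-style search with gradient descent on the same toy "landscape": the mutation loop can only stumble downhill, while gradient descent exploits the slope directly.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(x):
    # Toy "landscape": a smooth bowl standing in for fitness/loss.
    return np.sum((x - 3.0) ** 2)

# Evolution-style search: random mutations, keep the fitter variant.
x_evo = np.zeros(5)
for _ in range(2000):
    candidate = x_evo + rng.normal(scale=0.1, size=x_evo.shape)
    if loss(candidate) < loss(x_evo):
        x_evo = candidate

# Gradient descent: follow the slope of the same landscape directly.
x_gd = np.zeros(5)
for _ in range(200):
    grad = 2.0 * (x_gd - 3.0)   # analytic gradient of the toy loss
    x_gd -= 0.1 * grad

print(f"mutation+selection loss: {loss(x_evo):.4f}")   # decent, after many trials
print(f"gradient descent loss:   {loss(x_gd):.6f}")    # near the optimum in far fewer steps
```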
Without evolutionary pressure both populations would regress to the mean. If you're stuck on a desert island, what good does a 130IQ brain do? It just wants more carbohydrates and mopes. Someone who's born with IQ125 will thrive a little bit more.
Leaving probable nutritional deficiencies etc. aside: the next generation will have an IQ of like 120, and the third one, I'd bet, of 119.6. Regression to the mean has nothing to do with evolutionary pressures; it's just the issue of resetting beneficial non-hereditary effects (which we assume explain 30-50% of the deviation from median phenotype in these particular specimens). It's not some abstract global mean but just the mean of the island population's genetic value for the trait. Cochran himself explained this well to Edge, in a kinder era:
(...Does he actually hope to do this?)
In the long run, the trait may well be watered down, of course – unless they discover some fitness peak that normie island populations couldn't get to because of all the valleys; I think Scott had a short story on brilliant island eugenicists?
But this happens because of purifying selection.
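For concreteness, here's a back-of-the-envelope version of that arithmetic as a hedged sketch -- the non-hereditary share is an assumed number picked to match the "30-50%" framing above, not data:

```python
# Illustrative arithmetic for the point above (assumed numbers, not measurements).
global_mean = 100.0
founders_phenotype = 130.0
nonheritable_share = 0.35   # assumption: ~35% of the founders' deviation isn't passed on

# The founders' genotypic value: their phenotypic deviation minus the lucky
# non-hereditary boost that won't be transmitted to offspring.
founders_genotype = global_mean + (1 - nonheritable_share) * (founders_phenotype - global_mean)
print(founders_genotype)            # ~119.5

# Generation 1 regresses toward the *founders'* genotypic mean, not toward the global 100:
gen1_expected = founders_genotype   # "like 120" in the comment

# Generation 2+: absent selection, the island's own genetic mean is now the reference
# point, so the expected value barely moves again (the "119.6" in the comment).
gen2_expected = gen1_expected
print(gen1_expected, gen2_expected)
```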
This is all a very good description of how things that have long been under the thumb of evolution approach efficiency thresholds better than things that have not been, but I'm still not sure why we should expect that to be the relevant criterion for modern job replacement. Evolution spent a whole lot of time concerned with getting the absolute maximum out of a calorie while balancing many concerns, so its resulting 'design' of our ability to jump very high is limited by many of those concerns. We're able to use technology to circumvent most of these tradeoffs; a rocket is incomparably better at moving things very high.
Physical job requirements needn't be 'move through 3D space making exactly the same tradeoffs as humans make'. Artificial constructs don't need to worry about protecting a fragile cranium, keeping a supply of oxygen handy, storing all the needed energy inside themselves, reproduction, and many more things that are vitally important to humans. They're not solving the same problem set as humans, so why would we expect the optimization to be all that fit?
Well that's an easy one: observation bias on the part of the commenters. Because everything we could do with neat streamlined engineering, we've automated already or are in the middle of automating. Rockets are simple, do one very basic thing very well, and follow largely from first principles; so do cars and these boxes. In the end, what's left is tasks that genuinely require good spatial perception, mechanical understanding, free navigation in human-centric environments, articulated manipulators with many degrees of freedom, high-fidelity sensors, fast response and so on. Fundamentally, those are tasks whose complexity comes almost entirely from special context-dependent cases, the long tail of failures to apply generic solutions; like HVAC maintenance or repairing automated boxes in the warehouse. You can either leave it to humans or create something on par with them. And it turns out that for developing (software-wise, first of all) tools that solve hairy tasks like those, galaxy brain engineering doesn't work that well, compared with approaches leveraging stochastic trial and error, learning. So parallels with evolution, and inferences from evolutionary hardness of adaptation, are apt.
But again: it's more of an issue of data availability and iteration time. Training CLIP or SD is much easier, faster and cheaper than training robots.
True, but (non-fat) humans are remarkably well-built; most of the body is useful for mechanical performance. I don't know about you, but my balls are only a tiny fraction of my overall mass. You don't need to protect the cranium all that much, because the error rate is so low (if you're dropping heavy stuff or something, you're already failing at the primary task), and even if you do, a basic helmet would typically suffice. Local energy storage is handy because it simplifies logistics of the workspace. There's only so much that can be trimmed off. The humanoid body really is close to the optimum for many of our tasks, and making a machine perform comparably well is in fact a big challenge.
Our actuators are also very, very good. This is probably the best we can do with current hobbyist tech. Invincible.jpg.
It's not the balls, it's the optimization for finding mates. Evolution only optimizes for moving through 3D spaces so long as that's a means to successful reproduction. Going to a stranger's house and moving some plastic tubes around in a cramped space has at best tangentially benefitted from the primary 'goal' of evolution. If the primary 'goal' of evolution was to create the best possible plumber, I'd imagine something much more like a raccoon.
For the purposes of this discussion a raccoon is not that different from a human, it's a series of minor allometric changes really. (Raccoon body plan is also affected by reproductive needs, of course). And I suspect that making a raccoon-like plumber is about as hard as making a humanoid plumber (or even harder, because sometimes you need a ton of power in this line of work, and actuators we can produce cheaply and at scale are weak per unit of mass, compared to muscles; we could make a hydraulic raccoon with external power, but...) All creatures with such capabilities will be comparably hard to make. One additional aspect is that we have already made lots of specialized tools adapted for our hand grip and arm strength; it's probably much cheaper to make a robot who can wield them than reinvent the hand and all it holds.
(In the long run though, I agree, our infrastructure will change and so will robots who serve it. Probably a lot more cramped spaces, if nothing else).
Reproductively advantageous traits tend to also be helpful for general survival and capability, or rather, beneficial traits get reinforced by sexual selection (see koinophilia); exceptions are so striking exactly because they violate our intuitions about natural selection.
Then I think we're largely in agreement. I would however say that the primary difference is not that evolution has had more time to work on the 3D-space problem, so much as that there is a massive amount of momentum in the infrastructure that physical jobs interface with, which is much more difficult to replace or adapt than the fairly infrastructure-light world of art. It's certainly harder to prototype, test and iterate on a building designed to need no human maintenance than it is to prototype, test and iterate on art that needs no artist. And even if it weren't, it would still take a long time to get that design out in a world where most people are content living in multi-decade-old buildings that need occasional maintenance.
I do feel like the next step is going to be claiming that, yes, machines are faster, stronger, more energy-efficient and have better articulation, but they can't compete with humans in having all those things while being made out of meat. Technology has been driving human physical laborers into progressively tighter niches since the wheel.
openDogv3 is a really impressive project, but it's also optimized for low cost and weight. There are a lot of better options out there than 8308s and a 3d-printed gearbox at the enthusiast or hobbyist level; they just blow the rest of your budget out of the water and dramatically increase cost-of-entry.
That said, while the gap isn't as huge as you're suggesting, it's still pretty big. More efficient artificial approaches usually work by optimizing for entirely different purposes or environments.
I've been thinking today how good (smaller, lighter, more efficient) opendog would become if they just replaced all 3D-printed nonsense with CNC-machined or stamped metal and injection-molded polymers (and of course revamped electronics). Maybe it'd really be on par with Spot then, or (with added sensors, brains etc.) wipe the floor with Chinese knock-off dogs.
But that requires scale. I really hope somebody helps here: we need some sort of Stability for robotics.
If we don't optimize for low cost, at current costs those machines will be completely non-competitive.
What projects do you have in mind?
Yeah, there's a lot of low-hanging fruit available for improvements; even a simple drill press and 6061 aluminum could do a lot. But the toolchains for those processes are much more complicated and the processes themselves much messier, so it's not really in consideration. And, conversely, there's a lot of potential space for... more improvisational materials, where people are willing to design around them.
Scale is part of the problem, but you don't need that much scale. The FIRST FRC environment has a ton of devices being sold on scales of hundreds or low thousands that involve a lot of custom metal parts, and while they're not always good, they're definitely extant and productive. Part of that reflects the tax- and labor-advantaged nature of a situation where most customers and some sellers are non-profits or subsidiaries of non-profits, but that's ultimately a political choice: there's nothing that must favor FRC or Vex but not more productive matters.
The deeper issues... I think the big one is that there's simultaneously a big desire to build everything from 'scratch', but also to see some level of devices as indivisible, at least for this class of project. LEGO could make (arguably, does make, through Mindstorms) an injection-molded-polymer Spot knockoff, but the sort of people who want to build a LEGO kit aren't trying to put together a Spot variant. Even a lot of the Pi-and-cheap-servo posebots are largely marketed under the theory that they're an introduction to everything you'd need to learn for the project.
Some of this is just inevitable Pareto Principle stuff, but I think a lot of it's downstream of the death of manufacturing. The emphasis on and ease of access to bits makes it so easy to consider scaling and production as someone else's problem, because, for no small part, it has been. I think the extreme time constraints and very limited purchaser base have done more to keep the FIRST ecosystem around as long as it has.
There are a few interesting takes on custom motors, like the DizzyMotors, but almost all have a step one that involves taking apart a larger, expensive motor. Moteus is getting closer, but it's (AFAIK) still at a prototype level, and it's very far from anything that especially hits the limits of the medium.
Crap. I DGAF about AI art per se, but as someone who thinks neural nets are a suicide-pact technology, if the MSM goes full pro-AI and uses AI to boost itself we're essentially banking on a failed Skynet to save us.
Concerns over AI art continue to be vastly overblown. Such art is only really threatening things where the graphic design budget is next to $0, e.g. stuff like placeholder art or stock images. AI art continues to be terrible at generating pornographic images, where a lot of freelance artists' requests come from. It also has trouble maintaining a coherent style across multiple images, so needing a set of themed images becomes problematic. Some of these issues might be solved in the near term... or they might not be. Remember that people were extremely gung ho about the future of stuff like motion controls and VR in gaming, and they thought that just a little bit more time and investment would fix the major issues. These predictions have not panned out, however, and both technologies remain gimmicks. You should likewise be skeptical of claims that artists are going to be out of work en masse any time soon.
It already works well enough that I'll likely use it to generate all the profile art for characters (in a text RPG), and also various fantasy background images. It won't be perfect at representing what I want, but the tools are already there to do all of that for free. The art is likely also higher quality than a lot of the stuff you can use commercially for free. People are already experimenting with doing things like changing the character's expression and clothing without making the image distort/change undesirably.
I don't have any particular reason to believe we've just now basically hit a wall. I expect that NSFW art just needs to be finetuned on, and then it likely can do it quite well.
I'm not sure why you think VR is a gimmick? While I agree that people overhyped it, it seems to be steadily growing and has a good enough library to be more than worth playing. (I'm also not sure what you consider a 'major issue' in VR that wasn't fixed?)
I do agree that a large chunk of artists are unlikely to be out of their jobs in a couple years. However, it seems reasonable to expect that there will simply be fewer artists needed in lots of areas over time, and I don't expect it to take twenty years for AI art to replace many artists.
The controllability of Stable Diffusion and all finetunes has just shot through the roof (not everyone noticed) by means of pilfering some very rudimentary ideas from Imagen lit, such as Cross Attention Control for isolating and manipulating aspects of the prompt without manual inpainting masks; Dreambooth promises superior novel concept learning. I also expect great things from noise manipulations like here. Emad wonders aloud why nobody tries CLIP guidance, and there are increasingly capable alternatives. 1.5 checkpoint is visibly qualitatively better than 1.4 too. All this has happened in the span of weeks. And that's still the same heavily handicapped proof of concept model with <1B parameters, in the age of Parti 20B as near-SOTA. Though Scott still finds Imagen superior when concluding that he's won his 2025 bet before the end of 2022.
High-fidelity image synthesis from plain English instruction (ERNIE-ViLG shows that Chinese is also getting attention) is a conceptually solved problem, even if engineering and training can make us wait a little while. Importantly, it has always been just another benchmark on the way to next-level machine vision and language understanding, just like winning at board games, per se, has never been the point of deep reinforcement learning. We're doing audio, video and 3D next, and we need to train ourselves to stop getting mesmerized by pretty stimuli, and focus on the crux of the issue.
EDIT: fixed the link
(your first two links are the same)
You can't conclude that just because one technology didn't pan out. To me VR problems seem much more 'hard', related to hardware and biology and economies of scale, while AI continues to make rapid progress and is infinitely more valuable economically.
A year ago you would've listed a different set of problems. They got solved. Things like "maintain coherent style" sound like research problems that are being solved right now.
If anything, while the Big VR Wave hasn't exactly come, I suspect it has contributed to the Big VTuber Wave we got.
Imma be real I have no idea what VTubers are. Virtual YouTubers? Strong suspicion that it's all media invention. Have never seen literally anyone mention them irl. And people here follow streamers.
Besides, how are they related to VR?
Essentially, a "virtual" streamer or video-maker, one who uses an avatar as their medium of creative expression.
To a degree, it was a "media invention" in that the first big one (Kizuna AI) and precursor concepts (like this and this) were only really possible with massive corporate backing that could afford mocap technology and the like. Nowadays, though, it's easier for independent content creators to get into the space thanks to what I'm going to talk about next:
The development of motion-tracking technology in the past decade-plus (one example of which is LeapMotion, used by some 3D VTubers) has been tied to the new age of VR that started in the 2010s (Oculus and Valve co-developed the modern form of motion-tracking in VR, whether that is done by referencing off of generated infrared signals (Valve's Lighthouses) and/or cameras scanning your surroundings). As the hardware developed, so did the software; now you almost don't need IR-based tracking or specific hardware like LeapMotion, and you can use an iPhone camera to read your face and map its movements to a 2D or 3D model thanks to programs like VBridger or VSeeFace.
Speaking of which, models themselves have a link to modern VR. Consider VRoid, originally an initiative to bring 3D models to the masses, becoming a thing promoted within VRChat itself (allowing for people to have unique VR models at low or no cost), and often a good option for prospective VTubers who don't want or need to drop thousands on a quality model. Some VTubers use or have used Unity or Unreal Engine (both engines also being used for VR games) as programs to present their 3D models in an environment.
My dude, I listed three services that provide what I believe to be good quality AI pornography. I have personally been making use of these services and I suspect I will not be using my old collection anymore, going forwards.
This is just a prompt engineering problem, or more specifically cranking up the scale factor for whichever art style you're aping && avoiding samplers that end with _A. And I can assure you I was not one of these people. Neither was I a web3 advocate, or self-driving car optimist, or any other spell of "cool tech demo cons people into believing the impossible".
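For anyone who wants to see what that looks like in practice, here's a minimal, hedged sketch using the diffusers library -- the checkpoint name, prompt and parameter values are illustrative assumptions, and "scale factor" is read here as the classifier-free guidance scale (the CFG slider in most UIs):

```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

# Load a Stable Diffusion checkpoint (illustrative model id).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in a non-ancestral sampler (Euler rather than "Euler a"), since
# ancestral samplers inject fresh noise every step and drift more.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

image = pipe(
    "portrait of a knight, oil painting, in the style of Rembrandt",  # made-up prompt
    guidance_scale=12.0,        # the "scale" factor: higher = sticks to the prompt/style harder
    num_inference_steps=50,
).images[0]
image.save("out.png")
```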
For Stable Diffusion, there is no demo. The product is already here. You can already get your art featured / sold by putting it up on the sites that permit it. I know with 100% certainty that I am never going to pay an old-school artist* for a piece of digital art again, because any ideas I had were created by me with a few prompt rolls an hour ago.
*I might pay for a promptmancer if I get lazy. But that will be magnitudes cheaper, and most likely done by people who used to not be artists.
I am not aware of a single high-quality AI image of two people having sex. Hell, I haven’t even seen a convincing example of masturbation. To say nothing of the more obscure fetishes you find on /d/. Do such pictures already exist?
It seems to be a special case of the more general problem with existing models that, as you increase the number of objects in the scene and have the people engage in more complex actions, you increase your chances of getting incoherence and body horror.
I mean if you've got a super-obscure fetish it's not like you're commissioning Rembrandt to do the already-available art. In my meandering experience most niche stuff trends towards being low quality anyways, and the inherent... super-specialization of fetish probably goes a long way.
If I'm into pale girls wearing red high heels and a police uniform, the current situation probably means I can get police uniform or red high heels but AI art is gonna let me layer in all the levels of fetish.
Honestly it's kind of concerning seeing how much internet communities have already contributed to sexual dysfunction and people getting overly fetish-focused.
I’ve seen this response multiple times now in discussions of AI art, and it’s pretty baffling. “It doesn’t matter if the AI can’t do X because X type of art doesn’t have to be that good in the first place.” That’s not exactly a reassuring marketing pitch for the product you’re trying to sell.
Obviously determinations of quality should be left to people who appreciate the type of art in question in the first place, which you clearly do not.
The discussion is over whether the AI can satisfy the requirements in question, not the moral status of the requirements themselves.
I mean my point is that the 'competition' for obscure fetish art really isn't anything great, so the AI's got a lower barrier to entry into the marketplace.
This does exist, but you are right to point out it is exceedingly difficult to make.
Given the volume of responses affirming the failures of generated porn, I'm realising my tastes must've bubbled me from dissent. I mostly consume images with only 1 figure involved && this has evidently biased my thinking.
I checked out Pornpen's feed, and the faces are still offputtingly deformed in about half of the images.
People on 4chan have been running the output through FaceApp in order to fix this problem. (archive)
Ahem, as a degenerate myself I highly doubt it. Not until we get an ML model trained on a booru, "properly". The current stuff is just too uncanny to fap to.
I can't draw conclusions without knowing what kind of degenerate you are. If you're into hentai, the waifu diffusion model was trained on the 1.4 SD checkpoint && has much room for improvement. If you're a furry, fine-tuned models are currently a WIP and will be available soon. If you're a normal dude, I don't really understand because I honestly think it's good enough at this point.
The only thing I think is really poorly covered at the moment is obscure fetish content. A more complicated mixture of fine-tuning + textual inversion might be needed there, but I do truly believe the needs of >>50% of coomers are satisfiable by machines at this point.
Edit: I am less confident of my conclusion now.
I think it depends pretty heavily on what you're looking for. It's not too hard to get some moderately decent cheesecake or beefcake out of StableDiffusion 1.4, even using prompts that don't aim for nudity or even a specific gender. These aren't quite Hun-level, for a very good (uh, mostly androphilic) pin-up artist, but then again most furry artists aren't Hun-level. There are some problems here, but they're things like species or genital configurations that are probably outside of the training dataset or don't have a lot of variety in the training dataset. Which doesn't make that an easy problem -- it's quite possible the entire internet doesn't have sufficient training data for some things people want -- but it's at least an almost certainly solvable one.
Compositionality is harder, and relevant for more people. Scott's won a bet for some solutions for it, but a lot of people are going to be looking for something more than "two people having sex", or even "<color> <animal-person> screwing <color> <animal-person> in <orifice>". This is SFW (as much as a few Donald-Duck-esque animal-people can be), but I'm hard-pressed to summarize it in a short enough phrase for current engines to even tokenize it successfully into its attention span, and it's not like there's a shortage of (porn!) content from the same artist with similar or greater complexity.
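On the "short enough phrase to tokenize into its attention span" point: SD 1.x's text encoder is CLIP, which caps prompts at 77 tokens, so long scene descriptions get silently cut off before the diffusion model ever sees them. A small hedged sketch of checking that limit (the prompt is a made-up example):

```python
from transformers import CLIPTokenizer

# The tokenizer used by Stable Diffusion 1.x's text encoder.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = ("two donald-duck-esque animal people arguing over a board game while a third "
          "leans in the doorway, late evening light, dialogue implied by gestures, "
          "consistent character designs across a six-panel comic page")

tokens = tokenizer(prompt)["input_ids"]
print(len(tokens), "tokens; model_max_length =", tokenizer.model_max_length)  # limit is 77
# Anything past the limit is truncated in the standard SD pipelines, so the
# tail of a long scene description simply never reaches the model.
```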
And some stuff is really hard to write out for any non-human reader. Furfragged's probably an extreme case: the content isn't very complex to mimic (ie, mostly pretty vanilla, if exhibitionist, gay or straight sex), and the overarching concept of 'orientation play' is a common enough kink that several big-name gay sites focus on it (although straight4'pay' less so), but it's hard to actually develop a prompt that can even get the least-compositionality-dependent variants out. "Straight fox guy sucks a deer dude" is... not something that I'd expect to be coherent to AI. Well before that level of contradiction, even things like 'knot' and 'sheath' have a lot of space for confusion.
Beyond even that, it's not clear how the extant process will work for larger series pieces. There's a reason story-heavy pieces like those from Meesh, Nanoff, Braeburned, SigmaX, Ruiaidri, or Roanoak get a lot of attention, even if the story isn't anything exceptionally deep. It's not just that tools like SD can't write out a full comic or struggle with dialogue; even getting obviously samish characters from several different perspectives is difficult, even with textual_inversion. And the attention limit remains a problem, and even if it's a solvable one, it's something that requires significant structural changes.
I think these programs will become useful tools for some artists in combination with their normal workflow, and some non-artists may use them for some simple pieces where these constraints don't show up, but there are some hard limitations that may not be as readily solved as just throwing more parameters or stronger language models in.
I honestly don't get a lot of the concern over AI-generated images; you'd think from all the apocalyptic rhetoric that 90% of the average Mottizen's income came from fulfilling online requests for furry porn. Somehow I don't think that's actually the case.
I think it's less interesting for the economic impact -- in addition to there just not being that many full-time furry porn artists, most of their commissioners are likely to want things more specific or outside of the realm of current ML generators, and there's a lot of social stuff going on that can't really be done by a python script -- and more because it's an interesting case that should allow exploration that isn't easy to do elsewhere. In order:
Furries are identifiable by other furries, but avoid a lot of the privacy and identity concerns relevant for most living people with large numbers of photos available, and while technically as much protected by copyright as Mario, are a lot less likely to result in a takedown request.
On-topic training data is limited, and of sporadic quality, and unlikely to be used by professional environments. There's a lot of pre-tagged data on the various furry boorus! But it's probably on the order of 3-5m images, quite a lot of which won't meet the requirements used for LAION et al., and covering a pretty wide variety of topics, and there are a lot of reasons for ML researchers to not want to touch it. In retrospect, 'can AI realize your text string is a request to photoshop a wildlife shot onto a human's body' is pretty trivially obvious, but I don't think it was such a given three years ago.
On-topic enthusiast data is wildly available: randos can and already have started doing everything from exploring the limits of textual_inversion to fine-tuning the model on personally-curated data sets. So we'll get a lot of that.
There are a lot of symbols with few or marginal referents. There might be a fuchsia otter in the training set, somewhere, or ghibli-style foxes and rabbits, but it's probably pretty rare in the training data. There's a scale here from dragons to griffons to hippogryphs to khajit or tabaxi to avali, not in that the subjects are fictional, but that a specific generated image is less and less likely to be pulled from a large class of input images and instead reflects interaction at a trait level (to whatever extent these are different things) or something more complex. (As an example: StableDiffusion gives robot-like outputs for protogen, despite very clearly having no idea what they are. Which isn't surprising given that 'proto' and 'gen' have those connotations, but it's not clear people three years ago would have assumed ML could find them out.) At the very least, this points toward an upper bound for the minimum number of examples for a class; at most, it may explore important questions about how much a neural net can intuit. While I don't expect image generation solutions to these problems to generalize, the fact that they exist is reason to think they at least can exist for other realms.
Outputs need be specific. This is partly just an extrapolation of the composition problem that Scott was betting on, but there are also matters elevated beyond mere connotation, of a kind that powerful AI will need to handle in a number of real-world situations, and that most current ML can't.
Outputs are not human-specialized, in the way that even minor face abnormalities are able to trigger people, but defects are still obvious where they occur. StableDiffusion can't reliably do human faces, or even rabbit or weasel faces, but it can do tigers most of the time, and foxes and wolves often, and this kinda says something interesting about what we expect and what ML (and even non-ML applications) may be able to easily do moving forward.
Inputs need be complex. Most current AI generators struggle with this, both because it's a generally hard problem and because a lot of the techniques they've used to handle other issues make this harder. I don't think the ability for a hypothetical 2025 version of StableDiffusion to handle this will be especially destructive on its own, but it will mean a pretty significant transformation that will impact a wide variety of other fields, and be very obvious here.
Much of that is an issue of data efficiency. I wonder how it'll be improved for really big models, but I expect general few-shot learning in an SD-type system scaled even 10x. Of course a different architecture would help.
People are expecting/fearing the singularity, and everything that looks vaguely like a part of it is liable to be catastrophized. In fairness, it's a genuine win for the transhumanists; it wasn't so long ago that people could credibly claim this would never happen, and it unquestionably will affect real lives.
Over at The Dispatch, I was mildly startled to see a caption under an image that went something like (generated by Midjourney).
I see it as a herald for things to come. Perhaps you feel that furries are scum and deserve what's coming for them. That's all well and good, but the broader point to be read lies in the topic of job displacement in general.
"AI workers replace humans" used to be a prediction, not an accurate description of current reality. We now have (or are on the brink of having) a successful demonstration of just that. The reactions and policies and changes that arrive from the current ongoing chaos are going to set precedent for future battles involving first-world job replacement, and I am personally very interested in seeing what kind of slogans and parties and perhaps even extremism emerges from our first global experiment.
"Technology displaces workers" is not a new thing or a very controversial prediction that I am aware of anyone on the other side of. The contentious prediction is that AI would create structural persistent unemployment effects across the entire economy which every prior technological paradigm shift has yet failed to do. A few commission artists having to find jobs elsewhere in the service sector won't be evidence for that, nor would they really be the first to be impacted by AI in general (most translation work is now done by deep learning models, for example -- similar to AI art, a human in the loop is only necessary when the requirements are particularly complex or the quality demanded exceeds some nominal bar).
The part that you might not quite appreciate if you weren't monitoring every advance in this field is how quickly things have improved, which is to say how rapidly this disruption occurred.
We passed a point where computers became better at chess than any possible human a couple decades ago. Computers became better at Go about 6 years ago. This year they became better at producing art than 99.9% of humans, and they're certainly faster at it than any human could be. Most of the advances there occurred in the last 2 years.
And now there are models that can be applied to basically any game or task that can be effectively digitized, and can reliably train themselves to better-than-human levels in a matter of days, maybe weeks.
That's not to say that we're going to see unprecedented levels of 'hard' unemployment, but it is likely to sweep into unexpected places in very short order.
Completely true. Current advances do not guarantee the "no more jobs" dystopia many predict. My excitement is likely primarily a result of how much I've involved myself in observing this specific little burst of technological displacement.
Art and artists went through a similar crisis with the advent of photography -- what does it mean for technical skill when you can replicate a master's work with the click of a button? Art evolved, new categories developed and so on. The role of the artisan in art has been a bit contingent for ages, accordingly. It's not like Ai Weiwei welded all those bikes together himself, or that the interesting bit about Comedian was the subtle technique in its execution. Artists will come out the other side of this as they came out from photography -- much changed, and with new debates and reflexivity. (One interesting example is to compare paintings of water, ripples on streams etc., before and after photography revealed exactly how light played on and through the ever-changing surfaces.)
I, for one, am keenly anticipating the advent of the AI equivalent of photorealism -- replicating AI-generated aesthetic tells in the manual medium.
While artists will move to a different medium (or just a different job), that doesn't automatically make them safe. If an artist moves to 3d modeling, well, we're getting closer to solving various parts of modeling with AI (like NVIDIA has some cool pieces where they generate models from pictures). If an artist moves to writing stories, well, GPT-3 and Novel-AI are already relatively good. They're not as good as an actual writer at weaving a story, but they can be a lot better at prose.
I agree that people will move to new mediums. However, I expect that those mediums will be used as new areas where AI generation can advance. AI progress won't just sit still at this level of image and text generation forever. We 'just' need to throw enough compute and problem-solving skills at it.
Ding ding ding.
The notable thing here is less that the AI got good at art, but more that it got good at art way quicker than most expected, and is demonstrating the ability to improve and learn 'new' skills with surprising ease.
It is near impossible to predict right now which particular fields are safe for a while vs. those which are mere months from being disrupted. Safe to say that any field or medium could be next.
Whenever someone brings up the photography analogy, I always think they're completely missing the point. It's almost like Seeing Like a State -- artists exist now, revolution happens, artists exist after.
What you're neglecting to mention is that the artists that exist in the present will not be the artists of the future. We had photorealistic painters, and later we had photographers. The latter were not made of the former. People will suffer, perish, anguish, and all of this stuff is important for understanding how things play out in the near future.
We had photographers, and then later we had photorealistic painters. Photorealism is an artistic reaction to photography.
I feel we are talking past each other. "In terms of the historical narrative, some artists were inspired by photography and made a cool synthesis of traditional art && the new technology" -- okay. But were there more artists (adjusting for base rate) creating realistic looking hand-drawn art pieces before or after the proliferation of the camera? Do you agree that the answer is before? Do you grasp the standard concerns shared amongst artists that believe before is the obvious answer?
Given that the media will undoubtedly be making significant cost-saving use of AI-generated art, it would be embarrassing for them to condemn it now.