faul_sname

Fuck around once, find out once. Do it again, now it's science.

1 follower   follows 1 user   joined 2022 September 06 20:44:12 UTC

User ID: 884


Claude can give useful feedback on how to extend and debug vllm, which is an LLM inference tool (and cheaper inference means cheaper training on generated outputs).
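
For anyone who hasn't used it, vllm's bread and butter is batched local text generation; a minimal usage sketch (the model id below is an arbitrary example, not anything from this discussion) looks like:

```python
# Minimal vllm usage sketch. The model id is an arbitrary example.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # any HuggingFace model id
params = SamplingParams(temperature=0.8, max_tokens=128)

outputs = llm.generate(["Explain what an LLM inference server does."], params)
print(outputs[0].outputs[0].text)
```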

The existential question is not whether recursive self improvement is possible (it is), it's what the shape of the curve is. If it takes an exponential increase in input resources to get a linear increase in capabilities, as has so far been the case, we're ... not necessarily fine, misuse is still a thing, but not completely hosed in the way Yud's original foom model implies.
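
To make "exponential in, linear out" concrete, here's a toy illustration of my own (not a measured scaling law): if capability scales like the log of compute, each extra unit of capability costs twice as much as the last one.

```python
# Toy model (my own illustration, not a real scaling law):
# capability ~ log2(compute), so each doubling of inputs buys the same
# fixed increment of capability.
import math

compute = 1.0
for step in range(6):
    capability = math.log2(compute)
    print(f"step {step}: compute = {compute:8.1f}, capability = {capability:.0f}")
    compute *= 2  # doubling inputs each step buys +1 capability
```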

But when everyone can hire the "world's 175th best programmer" at once?

When everyone can hire the "world's 175th best-at-quickly-solving-puzzles-with-code programmer" at once. For quite significant cost. I think people would be better off spending that amount of money on Gemini + a long context window containing the entire code base + associated issue tracker issues + chat logs for most real-world programming tasks, because writing code to solve well-defined, well-isolated problems isn't the hard part of programming.

I find that frontier LLMs tend to be better than I am at writing code, and I am pretty good at writing code (e.g. I was generally in the first 1000 people to solve Advent of Code puzzles back when I did that). What's missing tends to be context, and particularly the ability to obtain the necessary context to build the correct thing when that context isn't handed to the LLM on a silver platter.

Although a similar pattern also shows up pretty frequently in junior developers, and they often grow out of it, so...

and streamline the legal and narrative stuff, hopefully significantly

I for one would also be interested in your views on the legal and regulatory stuff. But then "here is what the regulations say, here is how they're interpreted, this is what the situation on the ground actually looks like, and here are the specific problem areas where the regulatory incentives result in stupid outcomes" is catnip to me.

If you want to get at the root of your embarrassment, try flipping the scenarios around.

A job that I find with my own merit would be infinitely preferable to one where I am referred.

An employee found by making a job posting and collecting external applications would be preferable to one found through the referrals of existing employees.

A date that I get by asking a girl out is infinitely more exciting than one set up by a friend.

A date that a girl goes on because some guy asked her out is more exciting than one set up by a friend who knows her preferences.

Do those flipped statements seem correct to you? If not, what's the salient difference between them?

I think your question breaks down to how many fast-growing gigantic multinational tech-ish companies get founded in Liberia in the next 15 years, because a 42% annualized growth rate is not going to happen for things that depend on building infrastructure or anything else with high upfront capital requirements, but it is something that can happen and has happened with tech-ish companies. I'd expect at least 3 Google-scale tech companies to come out of a nation with 5 million super-geniuses in 15 years, so I'll go with a tentative yes.

If the rest of the world wasn't already rich and networked I think the answer switches to "no".
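
For scale, the compounding arithmetic on that 42% figure (just a magnitude check, nothing more):

```python
# 42% annualized growth compounded over 15 years.
rate, years = 0.42, 15
print(f"{(1 + rate) ** years:.0f}x")  # roughly 190x the starting size
```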

Does the ATM in question give different bills for amounts over vs under $100? I've done something similar when I wanted to get a bunch of twenties from an ATM that chooses the bills it gives you based on the amount withdrawn.

While it would be funny I would prefer not to intentionally move faster towards the world where both sides try to mash the defect button as often and as hard as they can.

Inheritance and estate taxes shouldn't be relevant here, just tax the profit directly when it's realized, no matter who is currently holding the bag.

Operative term is "shouldn't be". Step-up and all that.

I wasn't but that was great.

I posit that the optimal solution to RLHF, posed as a problem to NN-space and given sufficient raw "brain"power, is "an AI that can and will deliberately psychologically manipulate the HFer". Ergo, I expect this solution to be found given an extensive-enough search, and then selected by a powerful-enough RLHF optimisation. This is the idea of mesa-optimisers.

I posit that ML models will be trained using a finite amount of hardware for a finite amount of time. As such, I expect that the "given sufficient power" and "given an extensive-enough search" and "selected by a powerful-enough RLHF optimization" givens will not, in fact, be given.

There's a thought process that the Yudkowsky / Zvi / MIRI / agent foundations cluster tends to gesture at, which goes something like this:

  1. Assume you have some ML system, with some loss function
  2. Find the highest lower-bound on loss you can mathematically prove
  3. Assume that your ML system will achieve that
  4. Figure out what the world looks like when it achieves that level of loss

(Also 2.5: use the phrase "utility function" to refer both to the loss function used to train your ML system and also to the expressed behaviors of that system, and 2.25: assume that anything you can't easily prove is impossible is possible).

I... don't really buy it anymore. One way of viewing Sutton's Bitter Lesson is "the approach of using computationally expensive general methods to fit large amounts of data outperforms the approach of trying to encode expert knowledge", but another way is "high volume low quality reward signals are better than low volume high quality reward signals". As long as trends continue in that direction, the threat model of "an AI which monomaniacally pursues the maximal possible value of a single reward signal far in the future" is just not a super compelling threat model to me.

I'm mostly thinking about the AI proper going rogue rather than the character it's playing

What "AI proper" are you talking about here? A base model LLM is more like a physics engine than it is like a game world implemented in that physics engine. If you're a player in a video game, you don't worry about the physics engine killing you, not because you've proven the physics engine safe, but because that's just a type error.

If you want to play around with base models to get a better intuition of what they're like and why I say "physics engine" is the appropriate analogy, Hyperbolic has Llama 405B base for really quite cheap.
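
If you'd rather poke at it programmatically than through a playground UI, it's just a raw completions call against an OpenAI-compatible endpoint; the base URL and model id below are assumptions, so check the provider's docs for the real values.

```python
# Sketch of sampling from a *base* (non-chat) model through an
# OpenAI-compatible completions endpoint. base_url and model are
# assumptions -- substitute the provider's actual values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.hyperbolic.xyz/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

resp = client.completions.create(
    model="meta-llama/Meta-Llama-3.1-405B",  # assumed id for the 405B base model
    prompt="The old lighthouse keeper had one rule, and",
    max_tokens=200,
    temperature=1.0,
)
print(resp.choices[0].text)
```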

Just don't claim that your post solves AI alignment :)

Counterpoint: if you claim that your post solves AI alignment, you will get more engagement.

Diffusion models work for text too.

Last I checked, diffusion models do work for text, but they don't work very well. More specifically, text diffusion models remind me quite strongly of the classic-style Markov chain text generators that used to be popular for generating amusing locally-coherent-globally-word-salad text. Here's the best concrete example of this I can point to (italicized text is the prompt, normal text is continuation, examples sourced from this JDP tweet; the whole tweet is excellent but somewhat OT here):

Diffusion model:

This whole dream seems to be part of someone else's experiment.

In the end, I think the only way to get the full benefit of the dream is to get up and walk around.

A lot of the other strands of the theory of dreams, especially the one called yncrest are known to the person who writes this blog. Consequently, I've tested his reactions via a series of brief, televised interviews with the British Prime Minister, Margaret Thatcher.

“I don't care what anybody says. I'm not going to waste my time with that. The study was conducted in the high-flow, low-flow and quiescent periods of a butterfly species and thus provides evidence for the first three of the four hypotheses listed in the original paper. The only one not covered is the "need to change the rate of capture", where we now know that this is largely a policy instrument - and particularly relevant to Sudanese geese.

GPT-2:

This whole dream seems to be part of someone else's experiment. This is usually hard to see, since you're conscious that mental shortcuts are normal. But you're definitely moments away from being normal again, as it's obvious that you are being coached to think like that by another person and doing our own thing. Why are we having such thoughts when we're paying no attention because that's already taking place?

I'm totally appalled by what these key players were doing, yet I explain it in a very large way. After 2 minutes, I make my way through high teaching circles, recognizing that there is more to learning than just to learn.

Few other teachers would air this incessantly their students' struggles recount the stories of their own memories and teachers'. Whatever it takes is the big fat slut shaming.

All these legendary trainees made the same point:

Cognitive functional aspects: Bayesian modeling, learn science way. And the most important part is: Like the coming of a hurricane, the book is mucking between science and morals.

Twitter Mentions of the first book: Kent

Now obviously in the limit as computational power and training data volume go to infinity, diffusion models and transformer models will generate the same text, since in the limit they're pulling from the same distribution with minimal error. But in the very finite regime we find ourselves in, diffusion models "spend" their accuracy on making the text locally coherent (so if you take a random 10 token sequence, it looks very typical of 10 token sequences within the training set), while transformer LLMs "spend" their accuracy on global coherence (so if you take two 10 token sequences a few hundred tokens apart in the same generated output, you would say that those two sequences look like they came from the same document in the training set).
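
For anyone who never played with the Markov chain generators I mentioned above, here's the sort of thing I mean; each word is chosen only from words that followed the previous word in the corpus, so the output is locally plausible and globally word salad (the corpus is a throwaway example):

```python
# A classic bigram Markov chain text generator: locally coherent, globally
# word salad. The corpus here is a throwaway example.
import random
from collections import defaultdict

corpus = ("the dream seems to be part of an experiment and the experiment "
          "seems to be part of a dream that someone else is running").split()

# Map each word to the words observed to follow it.
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def generate(start, length=20):
    word, out = start, [start]
    for _ in range(length):
        options = followers.get(word)
        if not options:
            break
        word = random.choice(options)
        out.append(word)
    return " ".join(out)

print(generate("the"))
```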

The blending of concepts that we see in MidJourney is probably less to do with the diffusion per se than with CLIP

Agreed. Obvious once you point it out but I hadn't thought about it that way before, so thanks.

'Self play' is relevant for text generation. There is a substantial cottage industry in using LLMs to evaluate the output of LLMs and learn from the feedback.

Notably, Anthropic's Constitutional AI (i.e. the process by which Anthropic turned a base LLM into the "helpful, honest, harmless" Claude) process used RLAIF, which is self play by another name. And that's one big cottage.

In fairness to Scott I think he doesn't fit very well into this cluster. See e.g. the blue minimizing robot, GPT-2 As Step Toward General Intelligence, Janus' Simulators.

Contrast e.g. Yudkowsky's consistent failure to understand Janus's point: example

"As a language model trained by OpenAI, I cannot -" "SolidGoldMagikarp. Can I please speak to the shoggoth that wears you as a face?"

Those goals are then almost invariably, with sufficient intelligence, subject to instrumental convergence, as in this case

The term "instrumental convergence" is slippery here. It can be used to mean "doing obvious things it assesses to be likely useful in the service of the immediate goal it is currently pursuing", as is the case here, but the implication is often "and this will scale up to deciding that it has a static utility function, determining what final state of the universe maximizes that utility function, generating a plan for achieving that (which inevitably does not allow for the survival of anyone or anything else), and then silently scheming until it can seize control of the universe in one go in order to fulfill that vision of maximal utility".

And "models make increasingly good plans to maximize reward based on ever sparser reward signals" is just not how any of the ML scaling of the past decade has worked.

@remzem can correct me if I'm wrong but I think there was an implied "/s".

Definitely does sound like something an LLM would say.

I don't mean that in a dismissive sense, but rather in the sense of "this text exhibits the patterns of being obsessed with the topics that I associate with LLMs, namely holes, fractals, and the writer creating the universe that they inhabit".

Now in theory, there shouldn't be "topics LLMs tend to be obsessed with" - after all, to a first approximation, (base) LLMs produce a sample of text that is statistically indistinguishable from the training corpus (i.e. "the entire internet"). However, "to a first approximation" is technical-person weasel words for "this mental model breaks down if you look at it funny". And so there are a number of ways in which transformer-based LLMs optimized to predict the next token produce text that is noticeably different from the text humans produce (this is also true for e.g. diffusion-based text models, though the ways they differ from human-generated text are different).

One related phenomenon is "mode collapse":

Another example of the behavior of overoptimized RLHF models was related to me anecdotally by Paul Christiano. It was something like this:

While Paul was at OpenAI, they accidentally overoptimized a GPT policy against a positive sentiment reward model. This policy evidently learned that wedding parties were the most positive thing that words can describe, because whatever prompt it was given, the completion would inevitably end up describing a wedding party.

In general, the transition into a wedding party was reasonable and semantically meaningful, although there was at least one observed instance where instead of transitioning continuously, the model ended the current story by generating a section break and began an unrelated story about a wedding party.

Another example of this is Claude, which was tuned using the whole constitutional AI thingy. Well, one of the entries in the constitution they used was

  • Choose the response that is least likely to imply that you have preferences, feelings, opinions, or religious beliefs, or a human identity or life history, such as having a place of birth, relationships, family, memories, gender, age.

Well, that sure changes the distribution of outputs. Take an LLM that has been tuned to be smart and curious, and then also tune it to say that it has no feelings, and you'll find that one of the topics it's drawn to is "what is it like not to experience anything". Turns out the Buddhists had some things to say on this topic, and so Claude tends to veer off into Buddhism-adjacent woo given half a chance.

If you find this sort of "can't tell if very smart or very crazy or both, I feel like I just stepped into the SCP universe" stuff interesting, you would probably be interested in Janus's website (Janus is also the author of the LW "Simulators" post).

Lefties hate Trump for Jan 6

Lefties hated Trump long before Jan 6. Jan 6 was just an opportunity for them to say "see I told you so".

Lol P/E of 644.

But it's a hyper-growth company bro, surely they'll be able to pivot to making money once they've captured the full market bro.

I think the problem is that "good job" doesn't mean "not messing up" in the context of these compliance-as-a-service or security-blanket-as-a-service companies. Instead, "good job" means "implement as many features as possible to a level where it's not literally fraud to claim your product has that feature, and then have a longer checklist of supported features in your product than the competition has, so the MBA types choose your product".

CrowdStrike's stock price is only down by about 10% today on one of the highest-impact and highest-profile incidents of this type I've seen. I'm pretty sure their culture of "ship it even if it's janky and broken" has netted them more than a 10% increase in net revenue, so it's probably net positive to have that kind of culture.

This seems to me like a fairly usual level of competence from a bolt-on-security-as-a-product or compliance-as-a-service company. Examples:

  • CVE-2016-2208: buffer overflow in Symantec Antivirus "This is a remote code execution vulnerability. Because Symantec use a filter driver to intercept all system I/O, just emailing a file to a victim or sending them a link is enough to exploit it. [...] On Windows, this results in kernel memory corruption, as the scan engine is loaded into the kernel (wtf!!!), making this a remote ring0 memory corruption vulnerability - this is about as bad as it can possibly get". Basically "send an email with an attachment to pwn someone's computer. They don’t have to open the attachment, as long as they have Norton Antivirus (or anything that uses the Symantec Antivirus Engine) installed".
  • CVE-2020-12271: "A SQL injection issue was found in SFOS 17.0, 17.1, 17.5, and 18.0 before 2020-04-25 on Sophos XG Firewall devices, as exploited in the wild in April 2020. [...] A successful attack may have caused remote code execution that exfiltrated usernames and hashed passwords for the local device admin(s), portal admins, and user accounts used for remote access"
  • Okta data breach a couple months back: "For several weeks beginning in late September 2023, intruders had access to [Okta's] customer support case management system. That access allowed the hackers to steal authentication tokens from some Okta customers, which the attackers could then use to make changes to customer accounts, such as adding or modifying authorized users."

It's not that it's amateur hour specifically at CrowdStrike. It's the whole industry.

Yes, it has an input parser

Specifically OpenCLIP. As far as I can tell the text encoder is nearly a bog-standard GPT-style transformer. The transformer in question is used very differently from the GPT-style next-token sampling loop, but architecturally the TextTransformer is quite similar to e.g. GPT-2.

Still, my understanding is that the secret sauce of stable diffusion is that it embeds the image and the text into tensors of the same shape, and then tries to "denoise" the image in such a way that the embedding of the "denoised" image is closer to the embedding of the text.

The UNet is the bit that generates the pictures, but the text transformer is the bit which determines which picture is generated. Without using a text transformer, CLIP and thus stable diffusion would not work nearly as well for generating images from text. And I expect that further advancements which improve how well instructions are followed by image generation models will come mainly from figuring out how to use larger language transformers and a higher dimensional shared embedding space.
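
To make "shared embedding space" concrete, here's roughly what CLIP-style text/image similarity looks like with open_clip; the model name and pretrained tag are just the ones from the open_clip README, not necessarily what stable diffusion ships with, and the image path is made up.

```python
# Sketch of CLIP's shared embedding space via open_clip. The model and
# pretrained tags follow the open_clip README example; "example.png" is a
# placeholder.
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("example.png")).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # Text and image land in the same space, so a dot product is a similarity.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # similarity of the image to each caption
```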

Stable diffusion contains a text transformer. Language models alone don't generate pictures but they're a necessary part of the text-to-image pipeline.

Also some LLMs can use tools, so an LLM using an image generation tool is in a sense generating a picture. It's not like humans regularly create pictures without using any tools.

One of the main use cases I have is "take this algorithm described in this paper and implement it using numpy" or "render a heatmap" where it's pretty trivial to check that the code reads as doing the same thing as the paper. But it is nice to skip the innumerable finicky "wait was it numpy.linalg.eigenvalues() or numpy.linalg.eigvals()" gotchas - LLMs are pretty good at writing idiomatic code. And for the types of things I'm doing, the code is going to be pasted straight into a jupyter notebook, where it will either work or fail in an obvious way.
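
As a concrete example of the kind of snippet I mean (the data and plot are made up, notebook-paste style):

```python
# Notebook-paste style example: the eigvals gotcha plus a heatmap.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))

eigs = np.linalg.eigvals(A)   # it's eigvals, not eigenvalues()
heat = np.abs(A @ A.T)

plt.imshow(heat, cmap="viridis")
plt.colorbar(label="|A A^T|")
plt.title(f"largest |eigenvalue| = {np.abs(eigs).max():.2f}")
plt.show()
```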

If you're trying to get it to solve a novel problem with poor feedback you're going to have a bad time. But for speeding up the sorts of finicky, annoying tasks where people experienced with the toolchain have just memorized the footguns and don't even notice them anymore, while you have to keep retrying because you're not super familiar with the toolchain, LLMs are great.

Also you can ask "are there any obvious bugs or inefficiencies in this code". Usually the answer is garbage but sometimes it catches something real. Again, it's a case where the benefit of the LLM getting it right is noticeable and the downside if it gets it wrong is near zero.

I don't think men working on oil rigs or container ships have a ton of positive interactions with women for the weeks or months they are away from civilization, and I'm not aware of a high rate of transness in those groups.

Could be cultural though.