Friday Fun Thread for November 8, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that); this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

Do we have anyone running local offline LLMs here?

How are they coming along? Do you need to load them into VRAM, or can you load them into RAM or something and use either CPU or GPU from there?

If you use llama.cpp, you can load part of the model into VRAM and evaluate it on the GPU, and do the rest of the evaluation on the CPU. (The -ngl [number of layers] parameter determines how many layers it tries to push to the GPU.)
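For a rough idea of what that looks like in practice (a sketch only: the binary is llama-cli in recent builds, main in older ones, and the model path here is just a placeholder):

```bash
# Offload 20 layers to the GPU via -ngl; the remaining layers run on the CPU.
# Model path and quantization are placeholders; use whatever GGUF you have.
./llama-cli -m ./models/mistral-7b-instruct-q4_k_m.gguf -ngl 20 -c 4096 \
  -p "Explain how unified memory works in one paragraph."
```

If you run out of VRAM, lower the -ngl value; if you have headroom left, raise it until as much of the model as possible sits on the GPU.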

In general, I strongly recommend using this over the "industry standard" Python-based setup, as the overheads of 1 GB+ of random dependencies and an interpreted language tend to build up in ways that don't show up in benchmarks. (You might not lose much time per token, but you will use more RAM (easy to measure), put more strain on assorted caches and buffers (harder to attribute), and have more context switches degrading UI interactivity.)

In general, I strongly recommend using this over the "industry standard" Python-based setup, as the overheads of 1 GB+ of random dependencies and an interpreted language tend to build up in ways that don't show up in benchmarks

Very true. Playing around with Stable Diffusion has achieved the impossible: it makes my Windows install seize up from memory pressure, causes actual app crashes, and forces me to restart the system more often than the once-every-three-weeks when the inevitable update happens.

I had to download RAMMap from Sysinternals because it can clear the gigabytes of memory that A1111 leaks every time a model gets swapped. Even then, A1111 needs restarting after every ~50 MB of generated pics.

Maybe I should switch to ComfyUI.

Thanks! Do you use it?

Yes, though I haven't paid attention to it in about half a year so I couldn't answer what the capabilities of the best models are nowadays. My general sense was that performance of the "reasonably-sized" models (of the kind that you could run on a standard-architecture laptop, perhaps up to 14B?) has stagnated somewhat, as the big research budgets go into higher-spec models and the local model community has structural issues (inadequate understanding of machine learning, inadequate mental model of LLMs, inadequate benchmarks/targets). That is not to say they aren't useful for certain things; I have encountered 7B models that could compete with Google Translate performance on translating some language pairs and were pretty usable as a "soft wiki" for API documentation and geographic trivia and what-not.

Do we have anyone running local offline LLMs here?

How are they coming along?

There are some really good models available to run, but they require beastly graphics cards. Here are some llama benchmarks for a rough idea.

Do you need to load them into VRAM, or can you load them into RAM or something and use either CPU or GPU from there?

In theory, they can be run on a CPU, but GPUs are far better suited to the task.
The best places I know of to find information on local LLMs are https://old.reddit.com/r/LocalLLaMA/ and https://boards.4chan.org/g/, especially the LLM general there.

Thank you.

I can run 7B models on a MacBook M2 with 8 GB of RAM. This works because MacBooks use unified memory, so the GPU draws from system RAM rather than needing dedicated VRAM.

It's pretty slow, and 7B models aren't great for general tasks. If you can use one that's fine-tuned for a specific thing, they're worth it.
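For reference, a sketch of how that looks with llama.cpp on Apple Silicon (Metal support is compiled in by default there; the model path is a placeholder):

```bash
# The GPU shares the system's unified memory, so -ngl 99 offloads every
# layer without dedicated VRAM, as long as the quantized model fits in RAM.
./llama-cli -m ./models/llama-2-7b-chat-q4_k_m.gguf -ngl 99 \
  -p "Write a haiku about unified memory."
```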

Frankly, however, I'd just recommend using something like together(dot)AI or OpenRouter to run larger models elsewhere. Normal caveats about not pushing sensitive info out there, of course. $30-$50 worth of credits, even for monster models like Meta's 405B, will easily take you through a month of pretty heavy usage (unless you're running big automated workloads 24/7).
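If you go the OpenRouter route, the API is OpenAI-compatible, so calling it is just an HTTP request. A minimal sketch with curl (the model slug is illustrative; check their model list for the exact identifier):

```bash
# Assumes OPENROUTER_API_KEY is set in your environment.
# The model slug below is illustrative; OpenRouter lists the current names.
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/llama-3.1-405b-instruct",
        "messages": [{"role": "user", "content": "Summarize llama.cpp in one sentence."}]
      }'
```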

I think there's going to be a race between local AI-specific hardware for consumers and cloud-based hyperscaling, and I don't know which will win. Privacy definitely plays a part. I'm quite optimistic about seeing a new compute hardware paradigm emerge.

I'm using openrouter.ai daily. The credits last for a surprisingly long time. Sonnet 3.5 is my go-to model.

I'd like something offline and private for sensitive use though.