What is this place?

This website is a place for people who want to move past shady thinking and test their ideas in a court of people who don't all share the same biases. Our goal is to optimize for light, not heat; this is a group effort, and all commentators are asked to do their part.

The weekly Culture War threads host the most controversial topics and are the most visible aspect of The Motte. However, many other topics are appropriate here. We encourage people to post anything related to science, politics, or philosophy; if in doubt, post!

Check out The Vault for an archive of old quality posts. You are encouraged to crosspost these elsewhere.

Why are you called The Motte?

A motte is a stone keep on a raised earthwork common in early medieval fortifications. More pertinently, it's an element in a rhetorical move called a "Motte-and-Bailey", originally identified by philosopher Nicholas Shackel. It describes the tendency in discourse for people to move from a controversial but high value claim to a defensible but less exciting one upon any resistance to the former. He likens this to the medieval fortification, where a desirable land (the bailey) is abandoned when in danger for the more easily defended motte. In Shackel's words, "The Motte represents the defensible but undesired propositions to which one retreats when hard pressed."

On The Motte, always attempt to remain inside your defensible territory, even if you are not being pressed.

New post guidelines

If you're posting something that isn't related to the culture war, we encourage you to post a thread for it. A submission statement is highly appreciated, but isn't necessary for text posts or links to largely-text posts such as blogs or news articles; if we're unsure of the value of your post, we might remove it until you add a submission statement. A submission statement is required for non-text sources (videos, podcasts, images).

Culture war posts go in the culture war thread; all links must either include a submission statement or significant commentary. Bare links without those will be removed.

If in doubt, please post it!

Rules
Recommended Posts And Communities
Recommended Realtime Chats
- Astral Codex Ten Discord
- Quokka's Den Telegram

naraburns nihil supernum 2yr ago (text post) 4607 thread views

Small-Scale Question Sunday for November 20, 2022

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

Bernd Fighting algorithmic racism like John Henry 2yr ago

Oh, good chance to ask: how good is Acrobat's OCR? I've been using the one built into Google Drive, but it can't process files in batches.

Rov_Scam Bernd 2yr ago

It's pretty good, but it's time-consuming for larger files. To provide some context, I was doing legal work for oil and gas and I had to determine if certain assignments pertained to certain leases (an assignment is when one company conveys lease rights to another; I'll include things like mortgages and financing statements in this category). They often do this in large documents conveying several thousand interests at one time. It can be incredibly time-consuming to do this by simply reading the document, especially since most of them are ordered by some kind of internal lease number rather than alphabetically or geographically or by some other parameter that I have access to. It gets even worse when they're conveying different interests for different leases and there are several exhibits to go through.

After OCR I'd usually search by lessor name first. If I found what I was looking for, great; if not, I'd try the parcel number, and if that failed, I'd search by the recording information for the original lease. These latter two parameters were kind of dicey because the information is often laid out in a table and the OCR occasionally has trouble determining where the line breaks are. With a name you at least have the security of knowing that the first few letters will be consecutive without a line break. If I got to this point and didn't find anything, I figured I could safely assume that the document didn't apply to the lease I was concerned about, unless, of course, there was some kind of blanket language, but that's usually easy to find. It wasn't 100% accurate, though, because there were some cases where I knew that what I was looking for was in there but it wasn't coming up because of a typo, bad scanning, too-small printing, etc., at which point I'd have to search the whole document manually.

My superiors didn't like relying on OCR because of this, but in my experience mindlessly scanning page after page was more likely to lead to an error than the OCR was. The advice I'd give to the client relied pretty heavily on the applicability of certain of these documents, so I'd say that it's probably good enough for whatever you plan on using it for, assuming that it isn't an application that could get you fired or cause some other kind of serious problem.
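As a rough illustration of the fallback search described above (lessor name first, then parcel number, then recording information), here is a minimal Python sketch. It assumes the OCR text layer has already been exported to a plain string; the function names and the sample values are hypothetical and not tied to Acrobat or any other tool. The one trick it shows is letting any run of whitespace, including a stray line break inside a table cell, stand in for the spaces in the search term.

```python
import re


def flexible_pattern(term):
    """Build a regex matching `term` even when OCR put extra spaces or
    line breaks between its words. Returns None for a blank term."""
    parts = [re.escape(p) for p in term.split()]
    if not parts:
        return None
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


def find_lease(ocr_text, lessor_name, parcel_number, recording_info):
    """Try each search key in order of reliability.

    Returns (label, offset) for the first key that matches, or None
    if none of the keys appears in the OCR text.
    """
    keys = (
        ("lessor", lessor_name),
        ("parcel", parcel_number),
        ("recording", recording_info),
    )
    for label, term in keys:
        pattern = flexible_pattern(term)
        if pattern is None:
            continue  # skip keys we don't have
        match = pattern.search(ocr_text)
        if match:
            return label, match.start()
    return None


# Hypothetical sample: the recording info is split across table lines.
sample = "Lease 7\nBook 12\nPage 345"
result = find_lease(sample, "Jane Doe", "99-001", "Book 12 Page 345")
```

A `None` result only means the OCR text didn't contain any of the keys. As noted above, typos, bad scans, and small print can all hide a real entry, so a miss is not proof that the document doesn't apply.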

I never had to batch scan, so I can't comment on how well that works. One final caution I'd give is that OCR info causes file sizes to balloon considerably. The firm I worked at required us to eliminate all exhibit pages from these documents except the ones that were directly applicable, to keep the already-large size of the client's product from ballooning to unmanageable levels and taking up too much room on our cloud storage. This was followed by a prohibition on including OCR'd stuff in our final client PDFs for the same reason, as we saved copies of all our work and it was taking up entirely too much space. It wasn't uncommon for one of these large documents to take up in excess of 300 megs due to all the additional OCR data. So if you plan on saving all of these PDFs locally, it's something to be aware of.

Bernd Fighting algorithmic racism like John Henry Rov_Scam 2yr ago

Wow, thanks for the review. If you trusted it with that, it must be more than good enough for the stuff I was doing (casually browsing through old French books).
