What's up with all those equals signs anyway?

509 points - today at 9:37 AM

Comments

kstrauser today at 4:11 PM
For context, this is the Lars Ingebrigtsen who wrote the manual for Gnus[0], a common Emacs package for reading email and Usenet. It’s clever, funny, and wildly informative. Lars has probably forgotten more about email parsing than 99% of us here will ever have learned.

The manual itself says[1]:

> Often when I read the manual, I think that we should take a collection up to have Lars psycho-analysed.

0: https://www.gnu.org/software/emacs/manual/html_mono/gnus.htm...

1: https://www.gnus.org/manual.html

ruhith today at 11:56 AM
The real punchline is that this is a perfect example of "just enough knowledge to be dangerous." Whoever processed these emails knew enough to know emails aren't plain text, but not enough to know that quoted-printable decoding isn't something you hand-roll with find-and-replace. It's the same class of bug as manually parsing HTML with regex: it works right up until it doesn't, and then you get congressional evidence full of mystery equals signs.
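
To make the find-and-replace failure mode concrete, here is a minimal Python sketch. `naive_decode` is a hypothetical hand-rolled decoder invented for illustration (nobody knows what tool was actually used on these emails); it is compared against the standard library's `quopri.decodestring`.

```python
import quopri


def naive_decode(data: bytes) -> bytes:
    """Hypothetical hand-rolled decoder: a few hard-coded find-and-replace passes."""
    for needle, replacement in ((b"=\r\n", b""), (b"=3D", b"="), (b"=20", b" ")):
        data = data.replace(needle, replacement)
    return data


# Contains an escaped '=', a UTF-8 euro sign, and a soft line break.
sample = b"price =3D 5=E2=82=AC, a long sen=\r\ntence"
# An escaped '=' followed by the literal characters "20".
tricky = b"x=3D20"

proper_sample = quopri.decodestring(sample)
naive_sample = naive_decode(sample)
proper_tricky = quopri.decodestring(tricky)
naive_tricky = naive_decode(tricky)

print(proper_sample.decode("utf-8"))  # price = 5€, a long sentence
print(naive_sample)                   # b'price = 5=E2=82=AC, a long sentence'
print(proper_tricky, naive_tricky)    # b'x=20' b'x '
```

The naive version leaves every escape it has no rule for, and its second pass re-decodes the output of the first, turning a literal "=20" into a space.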
tiborsaas today at 11:09 AM
> We see that that’s a quite a long line. Mail servers don’t like that

Why do mail servers care about how long a line is? Why don't they just let the client reading the mail worry about wrapping the lines?

heikkilevanto today at 11:42 AM
I thought the article would be about the various meanings of operators like = == === .=. <== ==> <<== ==>> (==) => =~=
TazeTSchnitzel today at 3:23 PM
The most interesting thing to me wasn't the equals signs, which I knew are from quoted-printable, but the fact that when an equals sign appears, a letter that should have been preceding or following it is missing. It's as if an off-by-one error has occurred, where instead of getting rid of the equals sign, it's gotten rid of part of the actual text. Perhaps the CRLF/LF thing is part of it.
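
A rough Python sketch of how such an off-by-one could arise. `fixed_width_unfold` is a hypothetical buggy unfolder, assumed here purely for illustration: it treats every soft line break as exactly three bytes ("=" plus CR plus LF). It reproduces the missing letter after a CRLF to LF conversion, but it does not explain why the equals sign itself survives in the published text.

```python
import quopri


def fixed_width_unfold(data: bytes) -> bytes:
    """Hypothetical buggy unfolder: every soft break is assumed to be 3 bytes long."""
    out = bytearray()
    i = 0
    while i < len(data):
        # A soft break is '=' directly followed by a line-ending byte.
        if data[i:i + 1] == b"=" and data[i + 1:i + 2] in (b"\r", b"\n"):
            i += 3  # correct for "=\r\n", one byte too many for "=\n"
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


crlf = b"a quoted-printa=\r\nble line"
lf = crlf.replace(b"\r\n", b"\n")  # the line-ending conversion described in the article

ok = fixed_width_unfold(crlf)        # b'a quoted-printable line'
broken = fixed_width_unfold(lf)      # b'a quoted-printale line' (the 'b' is eaten)
reference = quopri.decodestring(lf)  # the stdlib accepts a bare LF soft break

print(ok, broken, reference)
```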
xg15 today at 12:22 PM
I'm just wondering why this problem shows up now. Why do lots of people suddenly post their old emails with a defective QP decoder?

> For some reason or other, people have been posting a lot of excerpts from old emails on Twitter over the last few days.

At the risk of having missed the latest meme or social media drama: does anyone know what this "some reason or other" is?

Edit: Question answered.

thedanbob today at 12:19 PM
I wrote my own email archiving software. The hardest part was dealing with all the weird edge cases in my 20+ year collection of .eml files. For being so simple conceptually, email is surprisingly complicated.
beejiu today at 10:44 AM
> So what’s happened here? Well, whoever collected these emails first converted from CRLF (i.e., “Windows” line ending coding) to “NL” (i.e., “Unix” line ending coding). This is pretty normal if you want to deal with email. But you then have one byte fewer:

I think there is a second possible conclusion, which is that the transformation happened historically. Everyone assumes these emails are an exact dump from Gmail, but isn't it possible that Epstein was syncing emails from Gmail to a third party mail server?

Since the Stack Overflow post details the exact situation in 2011, I think we should be open to the idea that we're seeing data collected from a secondary mail server, not Gmail directly.

Do we have anything to discount this?

(If I'm not mistaken, I think you can also see the "=" issue simply by applying the Quoted-Printable encoding twice, not just by mishandling the line-endings, which also makes me think two mail servers. It also explains why the "=" symbol is retained.)
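
The double-encoding idea can be checked with a small Python sketch using the standard `quopri` module. The sample text is made up, and this only shows that the effect is possible, not that it is what happened to these emails.

```python
import quopri

# One long line, so the encoder has to insert soft line breaks.
text = b" ".join([b"mystery"] * 20)

single = quopri.encodestring(text)    # encoded once: contains "=\n" soft breaks
double = quopri.encodestring(single)  # encoded twice: each "=" becomes "=3D"

once = quopri.decodestring(double)    # a reader that decodes only once
twice = quopri.decodestring(once)     # decoding a second time recovers the text

print(once)
print(twice == text)
```

After a single decode, the first layer's soft breaks are still in the text, so lines end in a stray "=" even though no letters are lost. That matches the retained equals signs, but not the missing characters.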

maartin0 today at 12:29 PM
Fun how the archive.today article near the top has this exact issue

https://pastes.io/correspond

https://news.ycombinator.com/item?id=46843805

JKCalhoun today at 1:44 PM
(The title of the blog reminded me of the late Bob Pease [1], who had the signature, "What's all this XXX stuff, anyhow?" [2] where XXX might be "noise gain", "capacitor leakage"…)

[1] https://en.wikipedia.org/wiki/Bob_Pease

[2] https://www.qsl.net/n9zia/pease/index.html

lordnacho today at 9:52 AM
I love how HN always floats up the answers to questions that were in my mind, without occupying my mind.

I, too, was reading about the new Epstein files, wondering what text artifact was causing things to look like that.

quibono today at 10:19 AM
CRLF vs LF strikes again. Partly at least.

I wonder why there is a max line length limit in the first place. Is this for a technical reason, or is it just display related?

ErigmolCt today at 4:39 PM
What's funny is that the failure mode here is so quietly destructive
voxelghost today at 12:07 PM
My main takeaway from this article is that I want to know what happened to the modified pigs with non-cloven hoofs.
lucb1e today at 12:07 PM

    cat title | sed 's/anyway/in email/'
would save a click for those already familiar with =20 etc.
noduerme today at 11:24 AM
Great. Can't wait for equal signs to be the next (((whatever this is))). Maybe it's a secret code. j/k

On a side note: There are actually products marketed as kosher bacon (it's usually beef or turkey). And secular Jews frequently make jokes like this about our kosher bros who aren't allowed to eat the real stuff for some dumb reason like it has too many toes.

MarginalGainz today at 1:24 PM
It’s a fascinating case of 'Abstraction Leak'.

We’ve become so accustomed to modern libraries handling encoding transparently that when raw data surfaces (like in these dumps), we often lack the 'Digital Archeology' skills to recognize basic Quoted-Printable.

These artifacts (=20, =3D) are effectively fossils of the transport layer. It’s a stark reminder that underneath our modern AI/React/JSON world, the internet is still largely held together by 7-bit ASCII constraints and protocols from the 1980s.

seydor today at 10:18 AM
TLDR "=\r\n" was converted to "=\n"
brador today at 11:14 AM
Could be worsened by inaccurate optical character recognition in some cases.

Back in those days optical scanners were still used.

zabzonk today at 12:04 PM
People posting Excel formulae?
ccppurcell today at 11:19 AM
Rock dots? You mean diacritics? Yeah someone invented them: the ancient Greeks, idiöt.