olalonde 5 days ago

As an aside, I’ve noticed that ChatGPT uses em dashes (“—”) quite frequently — much more often than I’m used to seeing on the web. It’s a bit surprising, considering it's largely trained on web-based data.

noirscape 5 days ago

It's pretty easy to explain; Apple devices turned the double dash (--) into an em dash by default. (Not sure if they still do.)

You probably won't encounter it in professional(-ish) writing very often (since that's premeditated and usually written on a computer, and most computer users are still on Windows), but in more informal situations (microblogs, text chat) most people don't care enough to correct the behavior and nobody turns it off, meaning that the frequency of em dashes is unintentionally way higher than it probably should be.

mysterypie 5 days ago

Does anybody know why? How was ChatGPT able to develop a style that's so different from its training data?

probably_wrong 5 days ago

My personal theory is that they do it for watermarking purposes, which would also correlate well with the brown tint of their AI-generated images.

kiitos 5 days ago

em dashes aren't cool—you know what's cool? en dashes, like maybe 10–15 of them?

mmooss 5 days ago

I think you misused the em dash. It shouldn't separate two independent clauses (I think).

DonHopkins 5 days ago

I love side-bangs: dot-em-dash and em-dash-dot horizontal exclamation marks •— like aroused upside-down question mark parentheticals, thrust sideways at ±90° —• to cleave and erect an exciting, turgid sub-clause from an otherwise limp, boring sentence.

lieks 5 days ago

It also uses them correctly—with no spaces. I have never seen anyone do that on the web.

dragonwriter 5 days ago

Almost everyone I’ve seen using them on the web (myself included) does that. Very few people I’ve seen set them open.

(Lots of people use en-dashes set open instead of em-dashes set closed for the uses for which they are interchangeable as a matter of stylistic preference, though.)

strken 5 days ago

I believe this is specific to the US. Writing from other English-speaking countries often uses an en dash surrounded by spaces instead of an em dash.

latentsea 5 days ago

I didn't know that was the correct way to use them. It feels incorrect in a space-delimited language. Interesting.

dragonwriter 5 days ago

English is not actually a space-delimited language; that's an approximation which is, in this case, throwing you off.

Punctuation is usually set closed on at least one, if not both, sides, though there are exceptions.

latentsea 4 days ago

Come to think of it, you're right. Hmm...

mmooss 5 days ago

I think this is misinformation, a red herring or stalking horse. It's also a bit anti-intellectual, as if people haven't been using em dashes for forever, long before LLMs existed.

olalonde 5 days ago

I'm fairly confident it's true. I just asked ChatGPT to generate a 1,000-word comment for HN[0] and it used 15 em dashes. Now scroll through HN comments and count how many em dashes you encounter[1]. You can go multiple pages without encountering a single one.

[0] https://chatgpt.com/share/6836f23d-9b58-800b-8cea-0a86a58076...

[1] https://news.ycombinator.com/newcomments?next=44113996