this post was submitted on 16 Aug 2023
901 points (97.0% liked)

xkcd

A community for a webcomic of romance, sarcasm, math, and language.

submitted 1 year ago* (last edited 3 months ago) by [email protected] to c/xkcd
 

Title text: The heartfelt tune it plays is CC licensed, and you can get it from my seed on JoinDiaspora.net whenever that project gets going.


Transcript

2003:

[Cueball approaches a bearded fellow.]

Cueball: Did you get my essay?
Bearded Fellow: Yeah, it was good! But it was a .doc; You should really use a more open-
Cueball: Give it a rest already. Maybe we just want to live our lives and use software that works, not get wrapped up in your stupid nerd turf wars.
Bearded Fellow: I just want people to care about the infrastructures we're building and who-
Cueball: No, you just want to feel smugly superior. You have no sense of perspective and are probably autistic.

2010:

Cueball: Oh my God! We handed control of our social world to Facebook and they're DOING EVIL STUFF!
Bearded Fellow: Do you see this?

[Inset, the bearded fellow rubs his index and middle fingers against his thumb.]

Bearded Fellow: It's the world's tiniest open-source violin.


[–] Potatos_are_not_friends 162 points 1 year ago (5 children)

Hot take but PDFs became the primary form of document transfer because Microsoft made .doc, docx, docm, rtf, doc 2003-2020...

All those "It won't open" just forced everyone to say "Fuck it send me the PDF"

[–] [email protected] 65 points 1 year ago (1 children)

Pretty much. PDF was specifically designed to retain the same look across any device. The goal was that if you designed a document to look a certain way, opening it on another device wouldn’t fuck your entire design. That’s also why editing PDFs is so damned frustrating, because they’re designed to not change. It largely started as a frustration with the “move an image 3 pixels to the left, and now all your text is in the wrong place” issue. But the EEE strategy by Microsoft directly contributed to PDF becoming the de facto way to share documents.

[–] Darkmuch 30 points 1 year ago* (last edited 1 year ago) (1 children)

My Dad got frustrated with docs as people saw that as an invitation to edit the document, or cut and paste stuff he would write. So he switched to using PDF whenever other people got involved.

[–] [email protected] 26 points 1 year ago

Yeah, that’s ironically what Microsoft has been moving towards. Collaborative editing is incredible when used properly. But that also means anyone with edit access can mess up your carefully crafted document. Luckily, things like Comments are becoming more commonplace, so people can suggest edits without actually being able to commit them.

It doesn’t solve the copy/pasting issue, but you can copy/paste from PDFs these days anyways. Realistically, even saving it as an image won’t solve that, since most devices can recognize text in images now.

[–] overzeetop 50 points 1 year ago (3 children)

Well, that and every time you touch a DOC/DOCX file it reformats itself to your local settings, fucking up the entire layout. PDF is a terrible, inefficient, poorly (or at least variably) implemented format which was proprietary for two decades but is now about the best option we have for a document to look the same at the recipient end as the sender and still include text, vector, bitmapped, semi-interactive, and certifiable/traceable contents.

[–] [email protected] 35 points 1 year ago (1 children)

I really, really hate that so many people still try to share ebooks as PDFs. Why that was ever a thing makes no sense to me. Yes, I absolutely wish to read a 500 page novel on portrait letter size pages with tiny font that completely ignores my screen size.

[–] [email protected] 8 points 1 year ago (1 children)

I've given up on trying to find certain books in sane formats. Thankfully Calibre is really good at converting PDFs to actual ebook formats.

There's a bit of a learning curve, and sometimes I have to do a little semi-automated cleanup -- but it works.

[–] [email protected] 4 points 1 year ago (1 children)

Really? I must have had a particularly troublesome PDF. It was almost like running it through OCR, generating hundreds of weird typos and formatting errors when I tried to convert with calibre.

[–] [email protected] 3 points 1 year ago (1 children)

The OCR struggles with some PDFs for whatever reasons: font, formatting, etc.

There are 3rd party PDF OCR websites/programs that work better. If I'm having issues I run it through one of those first.

[–] [email protected] 2 points 1 year ago (1 children)

Any suggestions? Even the good ones had error rates that might not matter for a couple of pages, but when scaled to a 500 page book, even a 1% error rate results in an annoying level of typos.
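To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes the 1% figure is a per-word error rate and that a page holds about 300 words; both are assumptions for illustration, not figures from the thread.

```python
# Rough estimate of how many OCR typos a given error rate produces.
# Assumptions (not from the thread): the rate is per word, and a page
# holds roughly 300 words.

def expected_typos(pages: int, words_per_page: int, error_rate: float) -> float:
    """Expected number of misrecognized words across the whole book."""
    return pages * words_per_page * error_rate


book_total = expected_typos(pages=500, words_per_page=300, error_rate=0.01)
per_page = book_total / 500
print(f"about {book_total:.0f} typos in total, roughly {per_page:.0f} per page")
```

Under those assumptions, a rate that sounds small still leaves a few typos on every page, which matches the "annoying level" described above.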

[–] [email protected] 1 points 1 year ago

I use gImageReader + Tesseract, but that probably doesn't meet your criteria. Unfortunately OCR is very rarely perfect unless the input is perfectly clear and with a "OCR friendly" font/formatting. There are "AI powered" OCR out there, but I can't speak to how well they work and I don't know of any free ones.

[–] FrullaPapaya 4 points 1 year ago (3 children)

What are more efficient and better-implemented formats for document sharing?

[–] MajorHavoc 6 points 1 year ago (1 children)

Markdown is gaining traction. There's lots of tools that will edit and display Markdown consistently, and without a dedicated tool, it's just a very readable text file.

And, most importantly for today, it's easy to generate a PDF file from, haha.

[–] TAG 2 points 1 year ago (1 children)

It produces a very readable text file, but not necessarily the one I meant to send. It is good for capturing text, reasonable at formatting, and has no notion of layout. For example, when I send a resume, I format it so that it is compact (to fit in 2 pages, since some people care about that) yet readable (and skimable).

[–] MajorHavoc 1 points 1 year ago

Great points.

I generate my resume from Markdown, but I use a special CSS file I created so that the final PDF has the layout I want. Which is not a trick most Markdown editors can do yet.

[–] [email protected] 3 points 1 year ago

DjVu, but its toolset is proprietary.

[–] overzeetop 3 points 1 year ago

TIFF, but the constraints are pretty severe and text must be OCR’d.

[–] [email protected] -3 points 1 year ago* (last edited 1 year ago) (3 children)

I don't get why it always must look the same. If I look at Markdown, AsciiDoc/Asciidoctor, or reStructuredText, you get content and formatting. Pack it in a tar.gz and create a directory structure for pages and media, etc. and it would imho suffice. And I would gladly see document X in my preferred font size and family instead of the creator's favorite.

I mean, I get it for typesetting etc. But not for common use.
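For anyone wondering what such a bundle might look like in practice, here is a minimal sketch using only Python's standard `tarfile` module. The `pages/` and `media/` layout, the file names, and the helper functions `build_bundle` and `list_bundle` are all hypothetical; no existing format is being described.

```python
import io
import tarfile


def build_bundle(files: dict[str, bytes]) -> bytes:
    """Pack {path: content} into an in-memory .tar.gz archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path, content in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def list_bundle(data: bytes) -> list[str]:
    """Return the member paths stored in a bundle."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()


# Hypothetical layout: one lightweight-markup file per page, media alongside.
bundle = build_bundle({
    "pages/01-intro.md": b"# Intro\n\nHello.\n",
    "pages/02-details.md": b"## Details\n\n![diagram](../media/diagram.svg)\n",
    "media/diagram.svg": b"<svg xmlns='http://www.w3.org/2000/svg'/>",
})
print(list_bundle(bundle))
```

The reader side would still need a viewer that understands the layout, which is the part nothing standardizes here.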

[–] overzeetop 14 points 1 year ago (2 children)

You don’t want to get an architectural plan, a marketing brochure, a newsletter, a corporate report, a tax form, or any type of legal contract that way.

If you’re just sending text and don’t need formatting, send it as a txt file. If you need formatting preserved - especially for someone who isn’t an expert in your field - you want it formatted properly.

[–] MajorHavoc 1 points 1 year ago* (last edited 1 year ago) (1 children)

Most environments will correctly format a Markdown document without any trouble now if sending it to a co-editor.

If it needs to be tamper resistant, it's easily converted to PDF.

What's not especially easy, today, is adding advanced styling (like a watermark) to Markdown, since Markdown itself has no provision for it. I accomplish that through a connected CSS file, but that's a bit of an advanced move.

[–] overzeetop 3 points 1 year ago (1 children)

Most environments

See, that's the issue that PDF serves due to its ubiquity. You say "environments" like my mother can pull up a markdown version of a recipe and print it out. Tons of stuff gets sent to people who have no idea what markdown is or how to open it in an appropriate reader. Windows, for example, doesn't know how to open a .md file, even if the recipient could figure out why they got a zip file with a bunch of randomly (or specifically) labeled parts. Edge will render a PDF in a default Windows installation and Safari will do the same in a default OSX install (IIRC); no zips, no extra files, all neatly packed into one.

It's usually not ideal for communication between people with experience in whatever field is being discussed. I'd rather get a plan in DWG format if it's a building design, or in Word if it's a written document I'll need to edit or reformat. With the exception of an exclusively-text document like an ebook that I'd like to re-flow to a myriad of devices, PDF is the digital form which is the most universal for anything I would previously have requested in dead tree format.

[–] MajorHavoc 1 points 1 year ago

Your mother really can open and print a Markdown file now. It has come that far.

But I totally agree with your core point. The gulf between "most environments" and "everywhere" is still a big deal.

That said, for those who hate creating PDF files, I know of a great pure text format that converts very nicely to PDF.

[–] [email protected] -2 points 1 year ago* (last edited 1 year ago)

architectural plan, a marketing brochure, a newsletter

I don't want that in PDF anyway. Give me plans in vector graphics or at least TIFF. Newsletter and co. is up to the RSS reader. Oops.

a corporate report, a tax form, or any type of legal contract that way.

Sure, why not? Is the representation legally important or the content?

If you’re just sending text and don’t need formatting, send it as a txt file. If you need formatting preserved - especially for someone who isn’t an expert in your field - you want it formatted properly.

There's something called Lightweight Markup which preserves formatting but leaves presentation up to the user/default settings. I mentioned them in my original comment.

[–] Zron 9 points 1 year ago (3 children)

Yes, let’s allow the end user to apply their custom font to their tax documents and employee contracts

[–] MajorHavoc 0 points 1 year ago

Good point. Markdown is easily turned into PDF for that use case.

[–] [email protected] -2 points 1 year ago* (last edited 1 year ago)

What I'm saying is: why save something like font family in the document?

Why are you all so stiff on legal documents? Depends maybe on your jurisdiction, but my (Swiss) employee contract was e-mailed to me as a scan. I added a scan of my signature and sent it back, informed my employer, and that was fine. Sure, a certificate to sign would be more practical, but we are not there yet.

[–] [email protected] -3 points 1 year ago

Change it back if you don't like it? If everyone gets to set the fonts locally then everyone gets to use their favourite.

[–] TAG 1 points 1 year ago

Markdown is a bit limited (the spec doesn't cover common extensions like tables of contents, internal links, and explicit page breaks). AsciiDoc is better on that issue.

The only use case I have for being picky about the formatting/layout of a document is my resume. Some people have a threshold for how long a resume is allowed to be (for example 1 additional page per 10 years of experience). Also, I have all of the dates right justified (for easy skimming) but still on the same line as the job title (to save space on the page).
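The "dates right-justified but on the same line as the job title" layout can be expressed with a small print stylesheet of the kind mentioned earlier in the thread. The sketch below is only an illustration: the class names, page size, margins, and font sizes are made up, and it emits plain HTML that would still need to be printed to PDF from a browser or similar tool.

```python
from html import escape

# Hypothetical print stylesheet: page size, margins, and class names are
# illustrative choices, not anything specified in the thread.
RESUME_CSS = """
@page { size: letter; margin: 1.5cm; }
body { font-family: serif; font-size: 10.5pt; line-height: 1.3; }
.job { display: flex; justify-content: space-between; }
.job .dates { white-space: nowrap; }
"""


def job_line(title: str, dates: str) -> str:
    """One row: job title on the left, dates pushed to the right edge."""
    return (
        '<div class="job">'
        f"<strong>{escape(title)}</strong>"
        f'<span class="dates">{escape(dates)}</span>'
        "</div>"
    )


def resume_html(jobs: list[tuple[str, str]]) -> str:
    """Wrap the job rows in a minimal HTML page carrying the stylesheet."""
    rows = "\n".join(job_line(title, dates) for title, dates in jobs)
    return (
        "<!doctype html>\n"
        f"<html><head><meta charset='utf-8'><style>{RESUME_CSS}</style></head>\n"
        f"<body>\n{rows}\n</body></html>"
    )


page = resume_html([("Senior Engineer, Example Corp", "2019 - 2023")])
print(page)
```

The flex row keeps title and dates on one line, and the `@page` rule is where the page size and margins that decide the final page count would be tuned.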

[–] geekworking 18 points 1 year ago

Yes and No.

They were really designed to show the same output on the screen and printer.

Even if you are using the same word processor software and file format, a document can look vastly different when you send it to someone else who doesn't have the same screen resolution or the same fonts installed.

PDF started as just a print preview for the postscript printer language. They should have just stopped there instead of trying to make it do all sorts of other shit that can open security holes.

The constant parade of file formats drove popularity, but it was really about being the only popular format to look the same.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

Instead of using .odt.

Maybe with more advertising? Most people don't know about the Open Document Format and that its standardization sent MS into a panicked rework of their .doc & co. into the pseudo-open OOXML (.docx etc.).

[–] [email protected] 11 points 1 year ago (1 children)

When you save an odt from Word and open it in OpenOffice, the formatting is usually all fucked. At least that used to be the case. A pdf comes out right on the other side.

[–] okamiueru 20 points 1 year ago (1 children)

It's intentionally fucked by MS. It doesn't matter that this non-MS software actually follows consistent standards. As long as it's only the minority, they get away with it looking like it's the others not being consistent.

MS has a history of doing it. It's in the company ethos of "embrace, extend and extinguish". Imagine something as simple as storing the contents of a document being at the behest of a private company. Humanity is all the worse for it.

[–] [email protected] 3 points 1 year ago

DOS isn't done until Lotus won't run.

[–] [email protected] 3 points 1 year ago (2 children)

Except for my local printing shop, which couldn't print my PDF poster for some reason so now they are asking for a PPT. WTF!

[–] themeatbridge 6 points 1 year ago

There's someone at your local print shop unqualified to be doing their job.

[–] [email protected] 3 points 1 year ago

My local print shop takes only PDF. I hand them a PNG and they say no.

It's the second most common format on the planet. WTF