this post was submitted on 26 Mar 2024
36 points (95.0% liked)

Ask Lemmy


My goal here is mostly to read articles, but sometimes I save the website for archive purposes. Which one do you think I should choose? (Sorry for my bad English, I am not a native speaker.)

all 20 comments
[–] [email protected] 33 points 8 months ago (1 children)

In my experience, I can never fully archive HTML properly. There are so many associated files that go along with it nowadays that I usually end up with a broken page or stuff missing. PDF at least gives you a self-contained snapshot.

[–] [email protected] 5 points 8 months ago (1 children)
[–] kamenlady 3 points 8 months ago

This looks nice - thanks

[–] [email protected] 18 points 8 months ago

There's an addon for Firefox called SingleFile which lets you save a page as an HTML file but also includes all images, formatting, etc. It might be available for other browsers, but I'm not sure.
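
To illustrate the general idea of a single-file save, here is a minimal Python sketch that embeds locally saved images into the HTML as base64 data URIs. It is not how SingleFile itself is implemented; the function name `inline_images` and the regex-based matching are assumptions made for illustration, and it only handles `<img src="...">` with double quotes.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches src="..." inside <img> tags. A regex is enough for a sketch;
# a real tool would use a proper HTML parser.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def inline_images(html: str, base_dir: Path) -> str:
    """Replace local <img src="..."> references with base64 data URIs."""

    def replace(match):
        src = match.group(2)
        path = base_dir / src
        # Leave remote URLs, existing data URIs, and missing files untouched.
        if src.startswith(("http:", "https:", "data:")) or not path.is_file():
            return match.group(0)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        payload = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{payload}{match.group(3)}"

    return IMG_SRC.sub(replace, html)
```

The result is one HTML file that no longer depends on the image files sitting next to it, at the cost of a larger file.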

[–] stanleytweedle 15 points 8 months ago (1 children)

Saving the website will be smaller so if you're doing a lot of archiving that'll save space. Also probably easier to find HTML renderers on weird platforms than PDF. Full disclosure I hate Adobe so I'm not unbiased, but HTML still has advantages.

[–] [email protected] 3 points 8 months ago (2 children)

Well, thank you for answering my question. If Adobe is a problem for you, I think there is a format called ODT, and you can convert your PDFs into ODT. But overall, thank you.

[–] [email protected] 4 points 8 months ago

Firefox opens PDFs now

[–] stanleytweedle 3 points 8 months ago (1 children)

Yeah, I have lots of options for my own use, but I hate Adobe because they've infected every workforce I've been anywhere near with the idea that PDFs are God's will. And I'm commonly the one that has to interpret God's will for the congregation.

[–] [email protected] 4 points 8 months ago

Valid reason: not supporting a monopoly.

[–] reddig33 8 points 8 months ago

Use “Reader” mode in the browser, then print that to PDF.

[–] tom42 7 points 8 months ago* (last edited 8 months ago)

If you just want to save the text to read later, go with HTML.

If you want to archive it with graphical elements and embedded images, PDF is the better choice.

[–] [email protected] 6 points 8 months ago

I'd say HTML. Websites don't translate well to PDF, and PDF is a hellish format that cannot be modified after the fact.

[–] j4k3 6 points 8 months ago (1 children)

If I only plan to read or view the file myself, I save a PDF. If I need formatted text extraction, I save the page.

[–] [email protected] 4 points 8 months ago

I appreciate your help.

[–] [email protected] 2 points 8 months ago

PDF would likely be more useful unless you take extra care with copying the website using a crawler.

[–] [email protected] 2 points 8 months ago

@[email protected] that is a good question! I would say HTML, because it is easier to do post-processing (e.g., text extraction), but you will probably lose the layout (libraries and CSS will go 404, etc.). If the amount is not too large, why not both?
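
As a rough illustration of that kind of post-processing, here is a minimal Python sketch that pulls the visible text out of a saved HTML page using only the standard library. The names `TextExtractor` and `extract_text` are made up for this example, and it simply skips `<script>` and `<style>` contents rather than trying to find the main article.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Doing the same with a PDF generally needs a third-party library, which is part of why HTML is the easier format to process later.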

[–] [email protected] 1 points 8 months ago

I don't use it, and I think there might be some privacy concerns, but I think Mozilla (the maker of Firefox) bought Pocket. It's useful for just this purpose. You bookmark (pocket) a webpage for later reading and it syncs it to your devices in a readable format.

However, to more directly answer your question, it will completely depend on your use case. Either should work, but PDF will be more reliable.

[–] [email protected] 1 points 8 months ago (1 children)

On desktop, all browsers should be able to save websites as HTML or PDF. Firefox on Android also offers "printing" sites to PDF.

[–] [email protected] 2 points 8 months ago (1 children)

Yes, but I am still confused about which one is the better option.