Programmer Humor

Post funny things about programming here! (Or just rant about your favourite programming language.)

Rules:

Posts must be relevant to programming, programmers, or computer science.
No NSFW content.
Jokes must be in good taste. No hate speech, bigotry, etc.

359

You can always rely on grandma (sh.itjust.works)

submitted 2 years ago by [email protected] to c/[email protected]

[–] [email protected] 28 points 2 years ago (2 children)

Fun fact: If you google those codes, you find out that they are "real" codes, but they don't actually activate Windows. I think they are used as placeholders in the upgrade from Windows 8 to 10 or something, but I don't know the specifics.

ChatGPT actually can't create new "words", just regurgitate words that it's seen somewhere before!

[–] [email protected] 19 points 2 years ago (3 children)

Sure it can create new words. It would be more correct to say that it can't create new tokens, I think. But a token is just a text fragment, and, as far as I know, tokens can range from several words down to single characters.

[–] average650 6 points 2 years ago

I got it to say vilumplox. It doesn't return any Google search results.

[–] [email protected] 3 points 2 years ago

If it was Windows 95, it could generate them:
https://www.youtube.com/watch?v=cwyH59nACzQ&t=306s

[–] [email protected] 1 points 2 years ago

Tokens are almost never multiple words. Think of them as "information units". If you have a plural word like "hats", there are two tokens: the "hat" one and the "s" that adds the information that it is plural. Combinations of words only really occur for proper names.
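
To make the "hat" + "s" idea concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary is made up for illustration; real tokenizers learn theirs from data, and their actual splits may differ.

```python
# Toy greedy longest-match tokenizer. VOCAB is a hypothetical example,
# not the vocabulary of any real model.
VOCAB = {"hat", "s", "flum", "jangle",
         "a", "e", "f", "g", "h", "j", "l", "m", "n", "t", "u"}


def tokenize(word, vocab=VOCAB):
    """Split word into the longest vocabulary pieces, scanning left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i]!r}")
    return tokens


print(tokenize("hats"))        # ['hat', 's']
print(tokenize("flumjangle"))  # ['flum', 'jangle']
```

A made-up word still splits into pieces that are all in the vocabulary, so nothing outside the token set is needed to write it.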

[–] [email protected] 10 points 2 years ago (1 children)

Yep yep, it's statistical analysis of the frequency of tokens in the training text.

Brand new, never-before-seen Windows keys have a frequency of zero occurrences per billion words of training data.

[–] average650 6 points 2 years ago (1 children)

That isn't actually what's important. It's the frequency of the token, which could be as simple as single characters. The frequency of those is certainly not zero.

LLMs absolutely can make up new words, word combinations, or sentences.

That's not to say ChatGPT can actually give you good Windows keys, but it isn't a fundamental limitation of LLMs.
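
A rough sketch of the frequency point, assuming a tiny stand-in corpus and a deliberately crude model that treats every character as independent (real LLMs condition each token on the preceding context, so this is only an illustration):

```python
from collections import Counter

# Tiny stand-in for "training text"; a real corpus is vastly larger.
corpus = "the quick brown fox jumps over the lazy dog and the vixen plays with flux"

word_counts = Counter(corpus.split())
char_counts = Counter(corpus.replace(" ", ""))
total_chars = sum(char_counts.values())


def char_level_probability(word):
    """Probability of word under an independent per-character model."""
    p = 1.0
    for ch in word:
        p *= char_counts[ch] / total_chars
    return p


new_word = "vilumplox"
print(word_counts[new_word])                 # 0: the word never appears
print(char_level_probability(new_word) > 0)  # True: every character does
```

The whole word has a count of zero, but because each of its characters has been seen, the model still assigns the word a small non-zero probability.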

[–] [email protected] 0 points 2 years ago (2 children)

Okay, I'll take your word for it.

I've never ever, in many hours of playing with ChatGPT as a toy, had it make up a word. Hallucinate wildly, yes, but not stogulate a word out of nothing.

I'd love to know more, though. How does it combine new words? Do you have any examples of words ChatGPT has made up? This is fascinating to me, as it means the model is much less chained to the training data than I thought.

[–] [email protected] 1 points 2 years ago

A lot of compound words are actually multiple tokens, so there's nothing stopping the LLM from generating the tokens in a new order, thereby creating a new word.
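
As a toy illustration of that recombination, assuming a hypothetical set of compound words that each split into exactly two tokens:

```python
from itertools import permutations

# Hypothetical two-token splits of compound words "seen in training".
seen_words = {
    "sunflower": ("sun", "flower"),
    "moonlight": ("moon", "light"),
    "sunlight": ("sun", "light"),
}

# Every distinct token that appears in any seen word.
tokens = sorted({t for parts in seen_words.values() for t in parts})

# Ordered pairs of known tokens whose concatenation was never seen.
novel = ["".join(pair) for pair in permutations(tokens, 2)
         if "".join(pair) not in seen_words]

print(novel)
```

Every entry in `novel` is built purely from known tokens, yet none of them is in `seen_words`. Whether a real model would ever pick such an ordering depends on the probabilities it has learned.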

[–] [email protected] 1 points 2 years ago (1 children)

It can create new words; I just verified this. The first word it gave me: flumjangle. Google gives me 0 results. Maybe Google is missing something and it exists in some data out there, idk.

I'm not sure what is so impressive about this, though. Language models can string words together in unique ways, so why would it be different for characters?

[–] [email protected] 4 points 2 years ago

I'm just surprised to hear that Google hasn't found out about my flumjangles yet.