this post was submitted on 18 Dec 2024

"AI is gonna take our jobs." The AI: (lemmy.world)

submitted 1 day ago by Loner to c/[email protected]

[–] cm0002 18 points 1 day ago* (last edited 1 day ago) (3 children)

Just outta curiosity:

Full o1 model

"\\id:\[^]]+\\\\[^]]+\\\"

Claude 3.5 Haiku:

Never used elisp, no idea if any of this is right lmao

[–] [email protected] 11 points 1 day ago (1 children)

Claude at least created an elisp function that looks ok

[–] cm0002 2 points 1 day ago

3.5 sonnet might do a lot better, idk I'm on the free plan with Claude lmao

[–] [email protected] 10 points 1 day ago* (last edited 1 day ago) (1 children)

o1 without Markdown misformatting:

\\id:\\[^]]+\\\\\[^]]+\\\

No idea what the rectangles are supposed to be, I just copy-pasted it

[–] marcos 4 points 1 day ago (1 children)

They are valid unicode points that your font doesn't know about.

... or at least they represent that, but I think there's a character that looks like one too.

[–] [email protected] 3 points 1 day ago

It's U+E001 from a Private Use Area. The UnicodePad app renders it as something between 鉮 and 鋁 (separate boxes stricken through; I wasn't able to find it even with Google Lens)

[–] Skullgrid -1 points 1 day ago (3 children)

I swear to god, someone must have written an intermediary language between regex and actual programming, or I'm going to eventually do it before I blow my fucking brains out.

[–] [email protected] 1 points 19 hours ago* (last edited 19 hours ago)

Any self-respecting language (that is, a functional one) has something like this. Here are some libraries: Haskell OCaml elisp

[–] BassTurd 6 points 1 day ago (4 children)

How do you think that would look? Regex isn't particularly complicated, just a bit to remember. I'm trying to picture how you would represent a regex expression in a higher level language. I think one of its biggest benefits is the ability to shove so much information into a random looking string. I suppose you could write functions like, startswith, endswith, alpha(4), or something like that, but in the end, is that better?

[–] [email protected] 1 points 17 hours ago* (last edited 17 hours ago)

Like any other set of parsing combinators, just limited to regular grammars. Most, if not all, such libraries already contain everything needed to express the regular subset, as it's actually quite useful.

It's certainly more readable once you get past trivial stuff.

Random example: Here's nom's Kleene star, many0.
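To make the combinator idea concrete, here is a minimal sketch in Python rather than Rust. It is not nom's API: the names `literal`, `one_of`, `seq`, and the Python `many0` are made up for illustration, and the sketch assumes a parser is just a function from (text, position) to the position after the match, or None on failure.

```python
import string
from typing import Callable, Optional

# Assumption for this sketch: a parser takes the input and a start position
# and returns the position after the match, or None if it does not match.
Parser = Callable[[str, int], Optional[int]]

def literal(s: str) -> Parser:
    """Match the exact string s."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def one_of(chars: str) -> Parser:
    """Match a single character drawn from chars."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + 1 if pos < len(text) and text[pos] in chars else None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Match each parser in order (concatenation)."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            nxt = p(text, pos)
            if nxt is None:
                return None
            pos = nxt
        return pos
    return parse

def many0(p: Parser) -> Parser:
    """Kleene star: apply p zero or more times, greedily, never failing."""
    def parse(text: str, pos: int) -> Optional[int]:
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:  # stop on failure or no progress
                return pos
            pos = nxt
    return parse

# Roughly the regex [A-Za-z][A-Za-z0-9]*
letters = string.ascii_letters
ident = seq(one_of(letters), many0(one_of(letters + string.digits)))
```

With these definitions, `ident("abc1 def", 0)` returns the end position of the identifier and `ident("1abc", 0)` returns None. Note the greedy `many0` never backtracks, so this is not a drop-in replacement for a regex engine.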

[–] [email protected] 6 points 1 day ago (2 children)

People have unironically done that. No, it isn't better. The fundamental mental model is the same.

[–] [email protected] -1 points 19 hours ago (1 children)

I honestly think it can be a lot more readable, especially when the regex would have been in the thousands of characters.

[–] [email protected] -1 points 18 hours ago (1 children)

There's a built-in feature that Perl has that only a few of the languages claiming PCRE have actually done, and it makes things a lot more readable. The /x modifier lets you put in whitespace and comments. That alone helps a lot if you stick to good indentation practices.

If all other code was written like an obfuscated C contest, it would be horrible. For some reason, we put up with this on regex, and we don't have to.

https://wumpus-cave.net/post/2022/06/2022-06-06-how-to-write-regexes-that-are-almost-readable/index.html
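For anyone not on Perl: Python's `re.VERBOSE` flag is one of the implementations of the same idea. A small sketch, where the date pattern and the name `DATE` are just an example picked for illustration:

```python
import re

# Hypothetical pattern for illustration: an ISO-style date such as 2024-12-18.
# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
DATE = re.compile(
    r"""
    (?P<year>  \d{4} )   # four-digit year
    -
    (?P<month> \d{2} )   # two-digit month
    -
    (?P<day>   \d{2} )   # two-digit day
    """,
    re.VERBOSE,
)

m = DATE.fullmatch("2024-12-18")
if m:
    print(m.group("year"), m.group("month"), m.group("day"))
```

The compact equivalent is `(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`. Same regex, but the verbose form leaves room for indentation and per-part comments.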

[–] [email protected] 0 points 16 hours ago (1 children)

I agree, but then there are also some other niceties that come from expression parsers in the language itself (as noted in the article): syntax highlighting, LSP, a more complete AST for editors like Helix.

[–] [email protected] 0 points 16 hours ago (1 children)

Syntax highlighting works fine as long as your language has a way to distinguish regexes from common strings. Another place where Perl did it right decades ago and the industry ignored it.

[–] [email protected] 1 points 16 hours ago (1 children)

Nah, the language itself should be as simple as possible. Bloating it with endless extensibility and features is exactly what makes Perl a write-only language in many cases and why it is becoming less and less relevant with time.

[–] [email protected] 1 points 16 hours ago

Except it has some really good ideas that should be copied. There are other languages that have a syntax for denoting regex, such as ~r'foo' in Elixir. This gets the syntax highlighting you need without a big addition to the language.

[–] Skullgrid 2 points 1 day ago (1 children)

I want to see their unironic attempts, maybe they're useful to me at least if they're not better.

The fundamental mental model is the same.

It's not the fundamental model that I have a problem with for Regex, it's the fucking brainfuck tier syntax

[–] [email protected] 5 points 1 day ago (1 children)

Here's one example

[–] Skullgrid 1 points 20 hours ago

fucking SOLD

[–] [email protected] 1 points 23 hours ago (1 children)

string.contains("something")

Just do that repeatedly

[–] BassTurd 1 points 19 hours ago

The "something" is where the regex goes. For simple cases, contains by itself does just fine, but for almost any kind of dynamic input it won't be capable of what regex does.
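A quick Python sketch of the difference. The order-ID format and the helper name `has_order_id` are invented for the example:

```python
import re

def has_order_id(text: str) -> bool:
    # Hypothetical format: "ORD-" followed by exactly six digits.
    # The negative lookahead rejects a seventh digit.
    return re.search(r"ORD-\d{6}(?!\d)", text) is not None

message = "see ORD-12 for details"

# A plain substring check only sees the fixed prefix...
print("ORD-" in message)        # True, although no complete ID is present
# ...while the regex also checks the shape of what follows.
print(has_order_id(message))    # False
```

A substring check can only test for text you know in advance; the regex describes a whole family of strings.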

[–] Skullgrid 4 points 1 day ago (1 children)

I suppose you could write functions like, startswith, endswith, alpha(4), or something like that,

yes.

but in the end, is that better?

YES.

startswith('text');
lengthMustBe(5);
onlyContain(CHARSETS.ALPHANUMERICS); 
endswith('text');

is much more legible than []],[.<{}>,]'text'[[]]][][)()(a-z,0-9){}{><}<>{}'text'{}][][
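For what it's worth, a thin layer like this is only a few lines of Python if it just assembles an ordinary regex underneath. A rough sketch, with all names (`literal`, `repeat`, `sequence`, `ALPHANUMERICS`) made up here and not taken from any existing library:

```python
import re

# Hypothetical building blocks; each one returns a regex fragment as a string.
ALPHANUMERICS = "[a-zA-Z0-9]"

def literal(text: str) -> str:
    """Match text exactly, escaping any regex metacharacters."""
    return re.escape(text)

def repeat(pattern: str, count: int) -> str:
    """Match pattern exactly count times."""
    return f"(?:{pattern}){{{count}}}"

def sequence(*parts: str) -> re.Pattern:
    """Concatenate fragments and compile them into one pattern."""
    return re.compile("".join(parts))

# Example rule: "id-", then five alphanumerics, then ".txt"
rule = sequence(literal("id-"), repeat(ALPHANUMERICS, 5), literal(".txt"))

print(bool(rule.fullmatch("id-abc12.txt")))  # True
print(bool(rule.fullmatch("id-abc1.txt")))   # False: only four characters
```

Whether that reads better than the raw pattern is exactly the argument in this thread; the sketch only shows that the two are interchangeable.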

[–] BassTurd 6 points 1 day ago (1 children)

Assuming "text" in your example is a placeholder for a 5-character alphanumeric string, it can be written like this in regex: /[a-zA-Z0-9]{5}/
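One caveat when trying this out: a pattern without anchors matches anywhere inside a longer string, so a length check needs anchors or a whole-string match. A small Python sketch, where the name `FIVE_ALNUM` is just for illustration:

```python
import re

FIVE_ALNUM = re.compile(r"[a-zA-Z0-9]{5}")

# search() finds five alphanumerics anywhere in the input
print(FIVE_ALNUM.search("abc123def"))     # matches the first five characters
# fullmatch() requires the entire input to fit the pattern
print(FIVE_ALNUM.fullmatch("abc123def"))  # None: nine characters, not five
print(FIVE_ALNUM.fullmatch("abc12"))      # matches
```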

If "text" is literal, then your statement is impossible: a 5-character string can't both start and end with the 4-character "text".

I think that when it gets to more complex expressions like a phone number with country code that accepts different formats, the verbosity of a higher level language will be more confusing, or at least more difficult to take in quickly.

[–] [email protected] 3 points 18 hours ago

Exactly. It's a lot like Java to me. Looks readable on the surface, but it's actually adding a bunch of crap you don't need and does not help anything.

They also have to implement a long list of features. These projects tend to focus on the handful of features the authors specifically use, and the rest fall by the wayside. Taking the Melody language that was mentioned in another message, it hasn't even fully implemented [^A] or [abc]. We're not even talking about somewhat obscure stuff like zero-width assertions or lookaheads. These are very basic.

[–] marcos 1 points 1 day ago

intermediary language between regex and actual programming

It's called Haskell.