Python

Welcome to the Python community on the programming.dev Lemmy instance!

Interactive TUI app with 100+ Python re(gex)? exercises (github.com)

submitted 4 months ago by [email protected] to c/[email protected]

I wrote a TUI application to help you practice Python regular expressions. There are more than 100 exercises covering both the built-in re module and the third-party regex module.

If you have pipx, use pipx install regexexercises to install the app. See the repo for source code and other details.

[–] alyth 1 points 4 months ago (2 children)

Thanks for sharing this. I took the time to read through the documentation of the re module. Here's my review of the functions.

Useful:

re.finditer returns an iterator over all Match objects
re.search returns the first Match object or None if there are no matches.
r'' use raw strings for patterns so you don't have to worry about backslashes
the optional flags argument modifies the behaviour (case insensitive, multiline)
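
A minimal sketch of the points in the list above, using only the standard re module. The sample text and variable names are made up for illustration.

```python
import re

text = "Ada: 36, bob: 41, Cy: 7"

# re.search returns the first Match, or None when nothing matches.
first = re.search(r"\d+", text)
first_number = first.group(0) if first else None

# re.finditer lazily yields one Match per non-overlapping match.
spans = [(m.group(0), m.start()) for m in re.finditer(r"\d+", text)]

# In a raw string, \b stays a regex word boundary instead of a backspace.
# re.IGNORECASE lets [a-z] match uppercase letters too.
names = [m.group(0) for m in re.finditer(r"\b[a-z]+\b", text, flags=re.IGNORECASE)]

# re.MULTILINE lets ^ match at the start of every line.
line_starts = [
    m.group(0)
    for m in re.finditer(r"^\w+", "one two\nthree four", flags=re.MULTILINE)
]
```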

Utility:

re.sub replaces each match in the string
re.split splits a string by a regular expression
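
For illustration, here is roughly how the two utility functions behave. The input strings are invented for the example.

```python
import re

# re.sub replaces every non-overlapping match.
masked = re.sub(r"\d", "#", "PIN 4921")

# In the replacement, \1 and \2 refer back to the capturing groups.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@host")

# The replacement can also be a function that receives each Match.
doubled = re.sub(r"\d+", lambda m: str(int(m.group(0)) * 2), "3 apples, 10 pears")

# re.split splits on a pattern rather than on a fixed separator.
fields = re.split(r"\s*[,;]\s*", "a, b;c ,d")
```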

The Match object:

match.group(0) returns the portion of text matched by the whole pattern
match.group(1) returns the first capturing group
match.group(2) returns the second capturing group, and so on
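
A short sketch of reading a Match object, with an invented input string. Note that group (singular) takes an index, while groups() (plural) returns all capturing groups as a tuple.

```python
import re

m = re.search(r"(\d{4})-(\d{2})", "released 2023-11 in Dublin")

whole = m.group(0)       # text matched by the whole pattern
year = m.group(1)        # first capturing group
month = m.group(2)       # second capturing group
all_groups = m.groups()  # every capturing group, as a tuple
span = m.span()          # (start, end) offsets of the whole match
```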

I don't understand why these exist:

re.match like search, but only matches at the beginning of the string. why not just use '^' or '\A' in the pattern you pass to 'search'?
re.fullmatch like 'search', but only if the full string matches. Why not just use '\A' and '\Z' in the pattern you pass to 'search'?
re.findall Returns all matches. It seems like a shitty version of 'finditer'. The function has three different return types, depending on how many capturing groups are in the pattern you pass to it. Who wants to work with that?
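
To make the three functions in question concrete, here is a small sketch of how they behave on made-up inputs, including the result shapes of re.findall.

```python
import re

s = "abc123"

# re.match only matches at the start; re.search scans the whole string.
match_hit = re.match(r"\d+", s)      # None: the string starts with letters
search_hit = re.search(r"\d+", s)    # finds "123"
anchored = re.search(r"\A\d+", s)    # None, like re.match on this input

# re.fullmatch requires the entire string to match.
full = re.fullmatch(r"[a-z]+\d+", s)
partial = re.fullmatch(r"[a-z]+", s)  # None: "123" is left over

# re.findall changes its result shape with the number of capturing groups.
no_groups = re.findall(r"\d+-\d+", "1-2 3-4")        # whole matches
one_group = re.findall(r"(\d+)-\d+", "1-2 3-4")      # just that group
two_groups = re.findall(r"(\d+)-(\d+)", "1-2 3-4")   # tuples of groups
```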

[–] [email protected] 4 points 4 months ago (1 children)

I would argue that having distinct match and search helps readability. The difference between match('((([0-9]+-[0-9]+)|([0-9]+))[,]?)+[^,]', s) and search('((([0-9]+-[0-9]+)|([0-9]+))[,]?)+[^,]', s) is clear without the need for me to parse the regular expression myself. It also helps code reuse. Consider that you have PHONE_NUMBER_REGEX defined somewhere. If you only had a method to "search" but not to "match", you would have to do something like search(rf"\A{PHONE_NUMBER_REGEX}\Z", s), which is error-prone and less readable. Most likely you would end up having at least two sets of precompiled regex objects (i.e. PHONE_NUMBER_REGEX and PHONE_NUMBER_FULLMATCH_REGEX). It is also a fairly common practice in other languages' regex libraries (cf. [1,2]). Golang, which is usually very reserved in the number of ways to express the same thing, has 16 different matching methods[3].
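
A sketch of the reuse argument. The value of PHONE_NUMBER_REGEX here is a hypothetical placeholder, not a real phone-number pattern. It contains a top-level alternation, which is one way the hand-anchored version can go wrong.

```python
import re

# Hypothetical pattern, for illustration only.
PHONE_NUMBER_REGEX = r"\d{3}-\d{4}|\d{7}"
phone = re.compile(PHONE_NUMBER_REGEX)

# One compiled object serves both "find it somewhere" and "is it exactly this".
found = phone.search("call 555-1234 now")
is_exact = phone.fullmatch("555-1234") is not None
rejects_extra = phone.fullmatch("555-1234 now") is None

# Hand-anchoring: \A binds only to the first alternative and \Z only to
# the second, so trailing text is silently accepted.
naive = re.search(rf"\A{PHONE_NUMBER_REGEX}\Z", "555-1234 now")

# Wrapping the pattern in a non-capturing group fixes the precedence.
safe = re.search(rf"\A(?:{PHONE_NUMBER_REGEX})\Z", "555-1234 now")
```

With fullmatch, the caller never has to remember the extra (?:...) wrapper.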

Regarding re.findall, I see what you mean, however I don't agree with your conclusions. I think it is a useful convenience method that improves readability in many cases. I've found these usages in my code, and I'm quite happy that this method was available[4]:

digits = [digit_map[digit] for digit in re.findall("(?=(one|two|three|four|five|six|seven|eight|nine|[0-9]))", line)]
[(minutes, seconds)] = re.findall(r"You have (?:(\d+)m )?(\d+)s left to wait", text)
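
A runnable sketch of the two usages above. The digit_map built here and the sample inputs are assumptions made for illustration, since the original definitions are not shown. Wrapping the pattern in a lookahead makes every match zero-width, so findall can report overlapping words.

```python
import re

# Hypothetical stand-in for the digit_map used in the snippet above.
words = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
digit_map = {word: value for value, word in enumerate(words, start=1)}
digit_map.update({str(d): d for d in range(10)})

pattern = "|".join(words) + "|[0-9]"
line = "eightwo3"

# Without the lookahead, "eight" consumes the "t" that "two" needs.
plain = re.findall(pattern, line)

# With the lookahead nothing is consumed, so overlapping words are found.
overlapping = re.findall(f"(?=({pattern}))", line)
digits = [digit_map[d] for d in overlapping]

# With two capturing groups, findall returns a list of tuples. A group
# that did not take part in the match comes back as an empty string.
wait = r"You have (?:(\d+)m )?(\d+)s left to wait"
[(minutes, seconds)] = re.findall(wait, "You have 5m 30s left to wait")
[(no_minutes, only_seconds)] = re.findall(wait, "You have 42s left to wait")
```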

[1] https://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html

[2] https://en.cppreference.com/w/cpp/regex

[3] https://pkg.go.dev/regexp

[4] https://github.com/search?q=repo%3Ahades%2Faoc23%20findall&type=code

[–] alyth 3 points 4 months ago

Thank you for the very thorough reply! This is the kind of high-quality stuff you love to see on Lemmy. Your use cases seem very valid.