Given an input c
, outputs the number of distinct lists of strings lst
such that:
''.join(lst) == c
- for every string
s
inlst
,s
consists of an arbitrary character followed by one or more characters from '0123456789'
Sure hope I didn't mess this up, because I think the fundamental idea is quite elegant! Should run successfully for all "reasonable" inputs (as in, the numeric output fits in a uint64 and the input isn't ridiculously long). Fundamental algorithm is O(n) if you assume all arithmetic is O(1). (If not, then I don't know what the time complexity is, and I don't feel like working it out.)
from functools import cache
from itertools import pairwise
from math import prod
@cache
def fibonacci(n: int) -> int:
if n == 0:
return 0
if n == 1:
return 1
return fibonacci(n - 1) + fibonacci(n - 2)
def main(compressed: str) -> int:
is_fragment_start = [i == 0 or c not in '0123456789' for i, c in enumerate(compressed)]
fragment_start_positions = [i for i, s in enumerate(is_fragment_start) if s]
fragment_lengths = [stop - start for start, stop in pairwise(fragment_start_positions + [len(compressed)])]
return prod(fibonacci(fragment_length - 1) for fragment_length in fragment_lengths)
if __name__ == '__main__':
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument('compressed')
print(main(parser.parse_args().compressed))
Idea: a00010 -> [a000, 10] -> [length 4, length 2] -> F(4) * F(2)
01a102b0305 -> [01, a102, b0305] -> [length 2, length 4, length 5] -> F(2) * F(4) * F(5)
where F(n) = fibonacci(n - 1) is the number of ways to partition a string of length n into a list of strings of length ≥2.
F(2) = 1 = fibonacci(1), F(3) = 1 = fibonacci(2), and F(n) = F(n - 2) + F(n - 1), so F is indeed just an offset version of the Fibonacci sequence.
To see why F(n) = F(n - 2) + F(n - 1), here are the ways to split up 'abcde': ['ab'] + (split up 'cde'), ['abc'] + (split up 'de'), and ['abcde'], corresponding to F(5) = F(3) + F(2) + 1.
And the ways to split up 'abcdef': ['ab'] + (split up 'cdef'), ['abc'] + (split up 'def'), ['abcd'] + (split up 'ef'), and ['abcdef'], corresponding to F(6) = F(4) + F(3) + F(2) + 1 = F(4) + F(5) = F(6 - 2) + F(6 - 1).
The same logic generalizes to all n >= 4.
My implementation is memoized by
functools.cache
, but that is a concern when it comes to recursive Fibonacci. That, and stack overflows, which are also a problem for my code (but, again, not for "reasonable" inputs -- fibonacci(94) already exceeds 2^64).Time complexity-wise, I was more thinking about the case where the numbers get so big that addition, multiplication, etc. can no longer be modelled as taking constant time. Especially if
math.prod
andenumerate
are implemented in ways that are less efficient for huge integers (I haven't thoroughly checked, and I'm not planning to).