Question
Nested named regex groups: how to maintain the nested structure in match result?
A small example:
import re
pattern = re.compile(
r"(?P<hello>(?P<nested>hello)?(?P<other>cat)?)?(?P<world>world)?"
)
result = pattern.match("hellocat world")
print(result.groups() if result else "NO RESULT")
print(result.groupdict() if result else "NO RESULT")
produces:
('hellocat', 'hello', 'cat', None)
{'hello': 'hellocat', 'nested': 'hello', 'other': 'cat', 'world': None}
The match result is a flat dictionary, rather than a dictionary of dictionaries that mirrors the nested structure of the regex pattern. What I would like to get is:
{'hello': {'nested': 'hello', 'other': 'cat'}, 'world': None}
Is there a "built-in" way (i.e. something that relies only on what the re module provides) to access the match result that preserves the nesting structure of the regex? By this I mean that the following are not solutions in the context of this question:
- parsing the regex pattern myself to determine nested groups
- using a data structure that represents a regex pattern as a nested structure, and then implementing logic for that data structure to match against a string as if it were a "flat" regex pattern.
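To make the target concrete, here is a minimal sketch that rebuilds the nested form using only public attributes of the match object (`Match.span()` and `Pattern.groupindex`). It is a heuristic, not something the re module offers directly: it assumes a group is nested inside the nearest earlier-numbered group whose span encloses it. The helper names `nest_by_span` and `_collapse` are hypothetical. Groups that did not participate in the match have no span, so the sketch simply reports them at the top level. It can also misplace groups that matched an empty string or that sit inside a repetition.

```python
import re


def nest_by_span(match):
    """Hypothetical helper: guess the nesting of named groups from their spans."""
    # groupindex maps name -> group number; invert it to walk groups in order.
    names = {number: name for name, number in match.re.groupindex.items()}
    root = {}   # name -> [text, children] for groups placed at the top level
    stack = []  # (start, end, children) of matched groups that may enclose later ones
    # Group numbers follow the order of opening parentheses, so a parent
    # is always visited before the groups written inside it.
    for number in sorted(names):
        start, end = match.span(number)
        if start == -1:
            # The group did not participate, so its parent cannot be inferred
            # from spans. ASSUMPTION: report it at the top level.
            root[names[number]] = [None, {}]
            continue
        # Drop candidate parents whose span does not enclose this group's span.
        while stack and not (stack[-1][0] <= start and end <= stack[-1][1]):
            stack.pop()
        parent = stack[-1][2] if stack else root
        children = {}
        parent[names[number]] = [match.group(number), children]
        stack.append((start, end, children))
    return _collapse(root)


def _collapse(nodes):
    # A group with inferred children becomes a dict; a leaf keeps its text.
    return {
        name: _collapse(children) if children else text
        for name, (text, children) in nodes.items()
    }


pattern = re.compile(
    r"(?P<hello>(?P<nested>hello)?(?P<other>cat)?)?(?P<world>world)?"
)
result = pattern.match("hellocat world")
print(nest_by_span(result))
```

For the example input this prints the nested dictionary shown above. The text matched by a containing group ('hellocat' here) is dropped in favour of its children, and nothing in the sketch inspects the pattern string itself.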