# parmancer

Parse text into structured data types with parser combinators.

Parmancer has type annotations for parsers and intermediate results. Using a type checker with Parmancer gives immediate feedback about parser result types, and gives type errors when creating invalid combinations of parsers.

## Installation

```sh
pip install parmancer
```

## Documentation

https://parmancer.com

## Introductory example

This example shows a parser which can parse text like `"Hello World! 1 + 2 + 3"` to extract the name in `Hello <name>!` and find the sum of the numbers which come after it:

```python
from parmancer import regex, digits, seq, string

# A parser which extracts a name from a greeting using a regular expression
greeting = regex(r"Hello (\w+)! ", group=1)

# A parser which takes integers separated by ` + `,
# converts them to `int`s, and sums them.
adder = digits.map(int).sep_by(string(" + ")).map(sum)

# The `greeting` and `adder` parsers are combined in sequence
parser = seq(greeting, adder)
# The type of `parser` is `Parser[tuple[str, int]]`, meaning it's a parser which
# will return a `tuple[str, int]` when it parses text.

# Now the parser can be applied to the example string, or other strings following the
# same pattern.
result = parser.parse("Hello World! 1 + 2 + 3")

# The result is a tuple containing the `greeting` result followed by the `adder` result
assert result == ("World", 6)

# Parsing different text which matches the same structure:
assert parser.parse("Hello Example! 10 + 11") == ("Example", 21)
```

Type checkers such as `mypy` and Pylance help during development by revealing type information and catching type errors.

Here the in-line types are displayed automatically with VSCode's Python extension and the 'Inlay Hints' setting:

![Type annotations for Parmancer parsers](../docs/intro_example.gif)

When the type of a parser doesn't match what's expected, such as in the following example, a type error reveals the problem as soon as the code is type checked, without having to run the code. In this example the `Parser.unpack` method is being used to unpack the result tuple of type `(str, int)` into a function which expects arguments of type `(str, str)`, which is a type incompatibility:

![Type mismatch for the unpack method](../docs/type_mismatch.png)

## Dataclass parsers

A key feature of Parmancer is the ability to create parsers which return dataclass instances using a short syntax where parsers are directly associated with each field of a dataclass.

Each dataclass field has a parser associated with it using the `take` field descriptor instead of the usual `dataclasses.field`.

The entire dataclass parser is then **combined** using the `gather` function, creating a parser which sequentially applies each field's parser, assigning each result to the dataclass field it is associated with.

```python
from dataclasses import dataclass
from parmancer import regex, string, take, gather

# Example text which a sensor might produce
sample_text = """Device: SensorA
ID: abc001
Readings (3:01 PM)
300.1, 301, 300
Readings (3:02 PM)
302, 1000, 2500
"""

numeric = regex(r"\d+(\.\d+)?").map(float)
any_text = regex(r"[^\n]+")
line_break = string("\n")


# Define parsers for the sensor readings and device information
@dataclass
class Reading:
    # Matches text like `Readings (3:01 PM)`
    timestamp: str = take(regex(r"Readings \(([^)]+)\)", group=1) << line_break)
    # Matches text like `300.1, 301, 300`
    values: list[float] = take(numeric.sep_by(string(", ")) << line_break)


@dataclass
class Device:
    # Matches text like `Device: SensorA`
    name: str = take(string("Device: ") >> any_text << line_break)
    # Matches text like `ID: abc001`
    id: str = take(string("ID: ") >> any_text << line_break)
    # Matches the entire `Reading` dataclass parser 0, 1 or many times
    readings: list[Reading] = take(gather(Reading).many())


# Gather the fields of the `Device` dataclass into a single combined parser
# Note the `Device.readings` field parser uses the `Reading` dataclass parser
parser = gather(Device)

# The result of the parser is a nicely structured `Device` dataclass instance,
# ready for use in the rest of the code with minimal boilerplate to get this far
assert parser.parse(sample_text) == Device(
    name="SensorA",
    id="abc001",
    readings=[
        Reading(timestamp="3:01 PM", values=[300.1, 301, 300]),
        Reading(timestamp="3:02 PM", values=[302, 1000, 2500]),
    ],
)
```

Dataclass parsers come with type annotations which make it easy to write them with hints from an IDE. For example, a dataclass field of type `str` cannot be associated with a parser of type `Parser[int]` - the parser has to produce a string (`Parser[str]`) for it to be compatible, and a type checker can reveal this while writing code in an IDE:

![Dataclass field parser type error](../docs/dataclass_type_mismatch.png)

## Why use Parmancer?

- **Simple construction**: Simple parsers can be defined concisely and independently, and then combined with short, understandable **combinator** functions and methods which replace the usual branching and sequencing boilerplate of parsers written in vanilla Python.
- **Modularity, testability, maintainability**: Each intermediate parser component is a complete parser in itself, which means it can be understood, tested and modified in isolation from the rest of the parser.
- **Regular Python**: Some approaches to parsing use a separate grammar definition outside of Python which goes through a compilation or generation step before it can be used in Python, which can lead to black boxes. Parmancer parsers are defined as Python code rather than a separate grammar syntax.
- **Combination features**: Parmancer comes with standard parser combinator methods and functions such as: combining parsers in sequence; matching alternative parsers until one matches; making a parser optional; repeatedly matching a parser until it no longer matches; mapping a parsing result through a function, and more.
- **Type checking**: Parmancer has a lot of type information which makes it easier to use with IDEs and type checkers.
- **Debug mode**: Built-in debug mode (`parser.parse(text, debug=True)`) provides detailed parse tree visualization including failures to help understand and fix parsing issues.

Parmancer is not for creating performant parsers; its speed is similar to other pure Python parsing libraries. Its purpose is to create understandable, testable and maintainable parsers.

Parmancer is in development, so its public API is not stable. Please leave feedback and suggestions in the GitHub issue tracker.

Parmancer is based on [Parsy](https://parsy.readthedocs.io/en/latest/overview.html) (and [typed-parsy](https://github.com/python-parsy/typed-parsy)), which is an excellent parsing library.

## Debug mode

When developing parsers, it can be helpful to understand why a parser fails on certain input. Parmancer includes a debug mode that provides detailed information about parser execution when parsing fails.

To enable debug mode, pass `debug=True` to the `parse()` method:

```python
from parmancer import string, regex, seq, ParseError

# Create a simple parser that expects a greeting followed by a number
parser = seq(string("Hello "), regex(r"\d+"))

# This will fail - let's see why
try:
    parser.parse("Hello world", debug=True)
except ParseError as e:
    print(e)
```

The debug output shows a parse tree indicating which parsers succeeded and which failed:

```
failed with '\d+'
Furthest parsing position:
Hello world
~~~~~~^

Debug information:
==================
Parse tree:
Parser
└─KeepOne
  └─sequence
    ├─'Hello ' = 'Hello '
    └─\d+ X (failed)
```

This shows that the `'Hello '` parser succeeded, but the `\d+` regex parser failed when it encountered `"world"` instead of digits.

Debug mode is useful during development but has performance overhead, so it should be disabled in production code.

## API documentation and examples

The API docs include minimal examples of each parser and combinator.

The [GitHub repository](https://github.com/parmancer/parmancer) has an `examples` folder containing larger examples which use multiple features.

188
189from parmancer.parser import (
190    FailureInfo,
191    ParseError,
192    Parser,
193    Result,
194    TextState,
195    any_char,
196    char_from,
197    end_of_text,
198    forward_parser,
199    from_enum,
200    gather,
201    gather_perm,
202    look_ahead,
203    one_of,
204    regex,
205    seq,
206    span,
207    stateful_parser,
208    string,
209    string_from,
210    success,
211    take,
212)
213from parmancer.debug import DebugTextState
214
215__all__ = [
216    "string",
217    "regex",
218    "whitespace",
219    "padding",
220    "digit",
221    "digits",
222    "letter",
223    "string_from",
224    "char_from",
225    "span",
226    "any_char",
227    "end_of_text",
228    "from_enum",
229    "seq",
230    "one_of",
231    "success",
232    "look_ahead",
233    "take",
234    "gather",
235    "gather_perm",
236    "stateful_parser",
237    "forward_parser",
238    "Parser",
239    "Result",
240    "ParseError",
241    "FailureInfo",
242    "TextState",
243    "DebugTextState",
244]
245
246
247whitespace: Parser[str] = regex(r"\s+")
248r"""1 or more spaces: `regex(r"\s+")`"""
249
250padding: Parser[str] = regex(r"\s*")
251r"""0 or more spaces: `regex(r"\s*")`"""
252
253letter: Parser[str] = any_char.gate(lambda c: c.isalpha()).with_name("Letter")
254r"""A character ``c`` for which ``c.isalpha()`` is true."""
255
256digit: Parser[str] = regex(r"[0-9]").with_name("Digit")
257"""A numeric digit."""
258
259digits: Parser[str] = regex(r"[0-9]+").with_name("Digits")
260"""Any number of numeric digits in a row."""
````python
def string(string: str) -> Parser[str]:
    """A parser which matches the value of ``string`` exactly.

    For example:

    ```python
    from parmancer import string

    assert string("ab").many().parse("abab") == ["ab", "ab"]
    ```
    """
    return String(string)
````

````python
def regex(
    pattern: PatternType,
    *,
    flags: re.RegexFlag = re.RegexFlag(0),
    group: str
    | int
    | Tuple[str | int]
    | Tuple[str | int, str | int]
    | Tuple[str | int, str | int, str | int]
    | Tuple[str | int, str | int, str | int, str | int]
    | Tuple[str | int, str | int, str | int, str | int, str | int]
    | Tuple[str | int, ...] = 0,
) -> Parser[str | Tuple[str, ...]]:
    r"""Match a regex ``pattern``.

    The optional ``group`` specifies which regex group(s) to keep as the parser result
    using the `re.match` syntax.
    The default is `0`, meaning the entire string matched by the regex is used as the
    result.

    Numbered and named capture groups are supported.

    When ``group`` contains a single value: ``int``; ``str``; ``tuple[int]``;
    ``tuple[str]``; then the result is a string: ``Parser[str]``.

    When ``group`` contains a tuple of 2 or more elements, the result is a tuple of
    those strings, for example a ``group`` of `(1, 2, 3)` produces
    a ``Parser[tuple[str, str, str]]``: the result is a tuple of 3 strings.

    Some examples:

    ```python
    from parmancer import regex

    assert regex(r".").parse(">") == ">"
    assert regex(r".(a)", group=1).parse("1a") == "a"
    assert regex(r".(?P<name>a)", group="name").parse("1a") == "a"
    assert regex(
        r"(?P<hours>\d\d):(?P<minutes>\d\d)", group=("hours", "minutes")
    ).parse("10:20") == ("10", "20")
    ```

    The optional ``flags`` is passed to ``re.compile``.
    """
    if isinstance(pattern, str):
        exp = re.compile(pattern, flags)
    else:
        if flags:
            # Need to recompile with the specified flags
            exp = re.compile(pattern.pattern, flags)
        else:
            exp = pattern

    return Regex(exp, flags, group)
````



````python
def string_from(*strings: str) -> Parser[str]:
    """Any string from a given collection of strings.

    ```python
    from parmancer import string_from

    parser = string_from("cat", "dog")

    assert parser.parse("cat") == "cat"
    ```
    """
    return reduce(
        operator.or_,
        # Sort longest first, so that overlapping options work correctly
        (string(s) for s in sorted(strings, key=len, reverse=True)),
    )
````

````python
def char_from(string: str) -> Parser[str]:
    """Any character contained in ``string``.

    For example:

    ```python
    from parmancer import char_from

    assert char_from("abc").parse("c") == "c"
    assert char_from("abc").match("d").status is False
    ```
    """
    return any_char.gate(lambda c: c in string).with_name(f"[{string}]")
````

````python
def span(length: int) -> Parser[str]:
    """A parser which matches any string span of length ``length``.

    For example, to match any strings of length 3 and then check that it matches a
    condition:

    ```python
    from parmancer import span

    # Match any 3 characters where the first character equals the last character
    parser = span(3).gate(lambda s: s[0] == s[2])

    assert parser.parse("aba") == "aba"
    # A case which doesn't match:
    assert parser.match("abc").status is False
    ```
    """
    return Span(length)
````

```python
any_char = Span(length=1)
end_of_text = EndOfText()
```
````python
def from_enum(enum: Type[E]) -> Parser[E]:
    """Match any value from an enum, producing the enum value as a result.

    For example:

    ```python
    import enum
    from parmancer import from_enum


    class Pet(enum.Enum):
        CAT = "cat"
        DOG = "dog"


    pet = from_enum(Pet)
    assert pet.parse("cat") == Pet.CAT
    assert pet.parse("dog") == Pet.DOG
    # This case doesn't match:
    assert pet.match("foo").status is False
    ```
    """
    return EnumMember(enum)
````

````python
def seq(*parsers: Parser[Any]) -> Parser[Tuple[Any, ...]]:
    r"""
    A sequence of parsers is applied in order, and their results are stored in a tuple.

    For example:

    ```python
    from parmancer import seq, regex

    word = regex(r"[a-zA-Z]+")
    number = regex(r"\d").map(int)

    parser = seq(word, number, word, number, word | number)

    assert parser.parse("a1b2a") == ("a", 1, "b", 2, "a")
    assert parser.parse("a1b23") == ("a", 1, "b", 2, 3)
    ```

    There are multiple related methods for combining parsers where the result is a
    tuple: adding another parser result to the end of the tuple; concatenating two
    tuple parsers together; unpacking the tuple result as args to a function, etc.

    Here is an example which includes more tuple-related methods. Note that type
    annotations are available throughout: a type checker can find the tuple type
    for each parser, and it can tell that the `unpack` method is correctly unpacking
    a `tuple[int, str, bool]` to a function which expects those types for its arguments.

    ```python
    from parmancer import digit, letter, seq, string


    def demo(score: int, letter: str, truth: bool) -> str:
        return str(score) if truth else letter


    score = digit.map(int)
    truth = string("T").result(True) | string("F").result(False)

    # This parser's result is a tuple[int, str, bool]
    params = seq(score, letter, truth)
    assert params.parse("1aT") == (1, "a", True)

    # That tuple can be unpacked as arguments for the demo function
    parser = params.unpack(demo)

    assert parser.parse("1aT") == "1"
    assert parser.parse("2bF") == "b"

    # Another parser which returns a tuple[int, int, int]
    triple_score = seq(score, score, score)

    assert triple_score.parse("123") == (1, 2, 3)
    assert triple_score.parse("900") == (9, 0, 0)

    # These tuple parsers can be concatenated in sequence by adding them
    combined = params + triple_score

    assert combined.parse("1aT234") == (1, "a", True, 2, 3, 4)
    ```
    """

    return Sequence(parsers)
````

````python
def one_of(parser: Parser[Any], *parsers: Parser[Any]) -> Parser[Any]:
    r"""All parsers are tried, exactly one must succeed.

    For example, this can be used to fail on ambiguous inputs by specifying that exactly
    one parser must match the input. For date formats, the date string `"01-02-03"` may
    be ambiguous in general whereas `"2001-02-03"` may be considered unambiguous:

    ```python
    from parmancer import one_of, seq, string, regex, ParseError

    two_digit = regex(r"\d{2}").map(int)
    four_digit = regex(r"\d{4}").map(int)
    sep = string("-")

    ymd = seq((four_digit | two_digit) << sep, two_digit << sep, two_digit)
    dmy = seq(two_digit << sep, two_digit << sep, four_digit | two_digit)

    # Exactly one of the formats must match: year-month-day or day-month-year
    date = one_of(ymd, dmy)

    # This unambiguous input leads to a successful parse
    assert date.parse("2001-02-03") == (2001, 2, 3)

    # This ambiguous input leads to a failure to parse
    assert date.match("01-02-03").status is False
    ```
    """
    return OneOf((parser, *parsers))
````

def success(success_value: T) -> Parser[T]:
    """
    A parser which always succeeds with a result of ``success_value`` and doesn't modify
    the input state.
    """
    return Success(success_value)


def look_ahead(parser: Parser[T]) -> Parser[T]:
    """
    Check whether a parser matches the next part of the input without changing the state
    of the parser: no input is consumed and no result is kept.
    """
    return LookAhead(parser)


def take(
    parser: Parser[T],
    *,
    init: bool = True,
    repr: bool = True,
    hash: bool | None = None,
    compare: bool = True,
    metadata: Mapping[Any, Any] | None = None,
) -> T:
    r"""
    Assign a parser to a field of a dataclass.

    Use this in a dataclass in conjunction with ``gather`` to concisely define parsers
    which return dataclass instances.

    ```python
    from dataclasses import dataclass

    from parmancer import gather, regex, take, whitespace


    @dataclass
    class Person:
        # Each field has a parser associated with it.
        name: str = take(regex(r"\w+") << whitespace)
        age: int = take(regex(r"\d+").map(int))


    # "Gather" the dataclass fields into a combined parser which returns
    # an instance of the dataclass
    person_parser = gather(Person)
    person = person_parser.parse("Bilbo 111")

    assert person == Person(name="Bilbo", age=111)
    ```
    """
    if metadata is None:
        metadata = {}
    return cast(
        T,
        field(
            init=init,
            repr=repr,
            hash=hash,
            compare=compare,
            metadata={**metadata, "parser": parser},
        ),
    )

def gather(
    model: Type[DataclassType], field_order: Optional[Iterable[str]] = None
) -> Parser[DataclassType]:
    r"""
    Gather parsers from the fields of a dataclass into a single combined parser.
    Each field parser is applied in sequence, and each value is then assigned to that
    field to create an instance of the dataclass. That dataclass is the result of the
    combined parser.

    ```python
    from dataclasses import dataclass
    from parmancer import take, string, gather, regex


    @dataclass
    class Example:
        foo: int = take(regex(r"\d+").map(int))
        bar: bool = take(string("T").result(True) | string("F").result(False))


    parser = gather(Example)
    assert parser.parse("123T") == Example(foo=123, bar=True)
    ```
    """
    field_parsers = get_parsers_from_fields(model)
    if field_order is not None:
        field_parsers = {name: field_parsers[name] for name in field_order}
    return DataclassSequence(model, field_parsers)

def gather_perm(model: Type[DataclassType]) -> Parser[DataclassType]:
    r"""
    Parse all fields of a dataclass parser in any order.

    Example:

    ```python
    from dataclasses import dataclass
    from parmancer import take, string, gather_perm, regex


    @dataclass
    class Example:
        foo: int = take(regex(r"\d+").map(int))
        bar: bool = take(string("T").result(True) | string("F").result(False))


    parser = gather_perm(Example)
    assert parser.parse("T123") == Example(foo=123, bar=True)
    ```
    """
    return DataclassPermutation(model)

def stateful_parser(parser: Callable[[TextState], Result[T]]) -> Parser[T]:
    """Wrap a function which maps a ``TextState`` to a ``Result`` in a ``Parser``."""
    return StatefulParser(parser)
def forward_parser(parser_iterator: Callable[[], Iterator[Parser[T]]]) -> Parser[T]:
    """Define a parser which refers to another parser which hasn't been defined yet.

    Wrap a generator which yields the parser to refer to.
    This makes recursive parser definitions possible, for example:

    ```python
    from parmancer import forward_parser, string, Parser
    from typing import Iterator


    @forward_parser
    def _parser() -> Iterator[Parser[str]]:
        yield parser


    # `parser` refers to itself recursively via `_parser`.
    parser = string("a") | string("(") >> _parser << string(")")

    assert parser.parse("(a)") == "a"
    assert parser.parse("(((a)))") == "a"
    ```
    """
    return ForwardParser(parser_iterator=parser_iterator)

class Parser(Generic[T_co]):
    """
    Parser base class that defines the core parsing interface.

    The generic type parameter `T_co` represents the type of value that the parser produces
    when it successfully parses input text. For example:

    - `Parser[str]`: A parser that produces string values
    - `Parser[int]`: A parser that produces integer values
    - `Parser[List[str]]`: A parser that produces lists of strings
    - `Parser[Tuple[str, int]]`: A parser that produces tuples containing a string and an integer

    The `_co` suffix indicates that the type parameter is covariant, which means that if
    `Child` is a subtype of `Parent`, then `Parser[Child]` is a subtype of `Parser[Parent]`.

    Subclasses can override the `parse_result` method to create a specific parser, see
    `String` for example.
    """

    name: str = "Parser"

    @overload
    def parse(self: Parser[T_co], text: str, *, debug: Literal[True]) -> T_co: ...

    @overload
    def parse(
        self: Parser[T_co],
        text: str,
        state_handler: Type[TextState] = TextState,
        debug: Literal[False] = False,
    ) -> T_co: ...

    @overload
    def parse(
        self: Parser[T_co],
        text: str,
        state_handler: Type[TextState] = TextState,
        debug: bool = False,
    ) -> T_co: ...

    def parse(
        self: Parser[T_co],
        text: str,
        state_handler: Type[TextState] = TextState,
        debug: bool = False,
    ) -> T_co:
        """
        Run the parser on input text, returning the parsed value or raising a
        `ParseError` on failure.

        `text` - the text to be parsed
        `state_handler` (optional) - the class to use for handling parser state
        `debug` (optional) - if True, enables debug mode with detailed error information

        Debug mode provides detailed information about parser execution when parsing fails,
        including a parse tree that shows successful parsers (marked with "= value") and
        failed parsers (marked with "X (failed)"). This is useful during development but
        has performance overhead.
        """
        if debug:
            # Import here to avoid circular imports
            from parmancer.debug import DebugTextState

            state: TextState = DebugTextState.start(text)
        else:
            state = state_handler.start(text)
        result = (self << end_of_text).parse_result(state)
        if not result.status:
            raise ParseError(result.state.failures, result.state)
        return result.value

    @overload
    def match(self, text: str, *, debug: Literal[True]) -> Result[T_co]: ...

    @overload
    def match(
        self,
        text: str,
        state_handler: Type[TextState] = TextState,
        debug: Literal[False] = False,
    ) -> Result[T_co]: ...

    @overload
    def match(
        self, text: str, state_handler: Type[TextState] = TextState, debug: bool = False
    ) -> Result[T_co]: ...

    def match(
        self, text: str, state_handler: Type[TextState] = TextState, debug: bool = False
    ) -> Result[T_co]:
        """
        Run the parser on input text, returning the parsed result.

        Unlike `Parser.parse`, this method does not raise an error if parsing fails;
        instead it returns a `Result` type wrapping the parser output or the failure
        state.

        `text` - the text to be parsed
        `state_handler` (optional) - the class to use for handling parser state
        `debug` (optional) - if True, enables debug mode with detailed error information

        Debug mode provides the same detailed parser execution information as `Parser.parse`,
        but accessible through the Result object's state rather than a raised exception.
        """
        if debug:
            # Import here to avoid circular imports
            from parmancer.debug import DebugTextState

            state: TextState = DebugTextState.start(text)
        else:
            state = state_handler.start(text)
        return (self << end_of_text).parse_result(state)

    def parse_result(self, state: TextState) -> Result[T_co]:
        """
        Given the input text and the current parsing position (state), parse and return
        a result (success with the parsed value, or failure with failure info).

        Override this method in subclasses to create a specific parser.
        """
        return NotImplemented  # type: ignore[no-any-return]

    @overload
    def result(self: Parser[Any], value: AnyLiteral) -> Parser[AnyLiteral]: ...

    @overload
    def result(self: Parser[Any], value: T) -> Parser[T]: ...

    def result(self: Parser[Any], value: T) -> Parser[T]:
        """Replace the current result with the given ``value``."""
        return self >> Success(value)

    def __or__(self: Parser[T1], other: Parser[T2]) -> Parser[T1 | T2]:
        """Match either self or other, returning the first parser which succeeds."""
        if isinstance(self, Choice):
            self_parsers = self.parsers
        else:
            self_parsers = (self,)

        if isinstance(other, Choice):
            other_parsers = other.parsers
        else:
            other_parsers = (other,)

        return Choice((*self_parsers, *other_parsers))

    def many(
        self: Parser[T_co],
        min_count: int = 0,
        max_count: int | float = float("inf"),
    ) -> Parser[List[T_co]]:
        """Repeat the parser until it doesn't match, storing all matches in a list.
        Optionally set a minimum or maximum number of times to match.

        :param min_count: Match at least this many times
        :param max_count: Match at most this many times
        :return: A new parser which will repeatedly apply the previous parser
        """
        return Range(self, min_count=min_count, max_count=max_count)

    def times(self: Parser[T_co], count: int) -> Parser[List[T_co]]:
        """Repeat the parser a fixed number of times, storing all matches in a list.

        :param count: Number of times to apply the parser
        :return: A new parser which will repeat the previous parser ``count`` times
        """
        return self.many(min_count=count, max_count=count).with_name(f"times({count})")

    def at_most(self: Parser[T_co], count: int) -> Parser[List[T_co]]:
        """Repeat the parser at most ``count`` times.

        :param count: Maximum number of repeats
        :return: A new parser which will repeat the previous parser up to ``count`` times
        """
        return self.many(0, count).with_name(f"at_most({count})")

    def at_least(self: Parser[T_co], count: int) -> Parser[List[T_co]]:
        """Repeat the parser at least ``count`` times.

        :param count: Minimum number of repeats
        :return: A new parser which will repeat the previous parser at least ``count`` times
        """
        return self.many(min_count=count, max_count=float("inf")).with_name(
            f"at_least({count})"
        )

    def until(
        self: Parser[T_co],
        until_parser: Parser[Any],
        min_count: int = 0,
        max_count: int | float = float("inf"),
    ) -> Parser[List[T_co]]:
        """Repeatedly apply the parser until the ``until_parser`` matches, optionally
        setting a minimum or maximum number of times to repeat.

        :param until_parser: Repeats will stop when this parser matches
        :param min_count: Optional minimum number of repeats required to succeed
        :param max_count: Optional maximum number of repeats before the ``until_parser``
            must succeed
        :return: A new parser which will repeat the previous parser until ``until_parser``
        """
        return Until(self, until_parser, min_count, max_count)

    def sep_by(
        self: Parser[T_co],
        sep: Parser[Any],
        *,
        min_count: int = 0,
        max_count: int | float = float("inf"),
    ) -> Parser[List[T_co]]:
        r"""
        Alternately apply this parser and the ``sep`` parser, keeping a list of results
        from this parser.

        For example, to match a comma-separated list of values, keeping only the values
        and discarding the commas:

        ```python
        from parmancer import regex, string

        value = regex(r"\d+")
        sep = string(", ")
        parser = value.sep_by(sep)
        assert parser.parse("1, 2, 30") == ["1", "2", "30"]
        ```

        :param sep: The parser acting as a separator
        :param min_count: Optional minimum number of repeats
        :param max_count: Optional maximum number of repeats
        :return: A new parser which will apply this parser multiple times, with ``sep``
            applied between each repeat.
        """
        return Range(
            self, separator_parser=sep, min_count=min_count, max_count=max_count
        )

    def bind(
        self: Parser[T1],
        bind_fn: Callable[[T1], Parser[T2]],
    ) -> Parser[T2]:
        """
        Bind the result of the current parser to a function which returns another
        parser.

        :param bind_fn: A function which will take the result of the current parser as
            input and return another parser which may depend on the result.
        :return: The bound parser created by ``bind_fn``
        """
        return Bind(self, bind_fn)

    def map(
        self: Parser[T1],
        map_fn: Callable[[T1], T2],
        map_name: Optional[str] = None,
    ) -> Parser[T2]:
        """Convert the current result to a new result by passing its value through
        ``map_fn``

        :param map_fn: The current parser result value will be passed through this
            function, creating a new result.
        :param map_name: A name to give to the map function
        :return: A new parser which will convert the previous parser's result to a new
            value using ``map_fn``
        """
        if map_name is None:
            map_name = "map"
            if hasattr(map_fn, "__name__"):
                map_name = map_fn.__name__

        return Map(parser=self, map_callable=map_fn, map_name=map_name)

    def map_failure(
        self, failure_transform: Callable[[FailureInfo], FailureInfo]
    ) -> Parser[T_co]:
        """Transform a failure state using a transform function, used for example to add
        additional context to a parser failure.

        :param failure_transform: A function which converts a ``FailureInfo`` into
            another ``FailureInfo``
        :return: A parser which will map its failure info using ``failure_transform``
        """
        return MapFailure(self, failure_transform)

    def unpack(
        self: Parser[Tuple[Unpack[Ts]]],
        transform_fn: Callable[[Unpack[Ts]], T2],
    ) -> Parser[T2]:
        """When the result is a tuple, it can be unpacked and passed as *args to
        ``transform_fn``, creating a new result containing the function's output.

        :param transform_fn: Function to unpack the current result tuple into as args
        :return: An updated parser which will unpack its result into ``transform_fn``
            to produce a new result
        """
        return self.bind(lambda value: Success(transform_fn(*value))).with_name(
            "unpack"
        )

    def tuple(self: Parser[T]) -> Parser[Tuple[T]]:
        """Wrap the result in a tuple of length 1."""
        return self.map(lambda value: (value,), "Wrap tuple")

    def append(
        self: Parser[Tuple[Unpack[Ts]]], other: Parser[T2]
    ) -> Parser[Tuple[Unpack[Ts], T2]]:
        """
        Append the result of another parser to the end of the current parser's result tuple

        ```python
        from parmancer import string

        initial = string("First").tuple()
        appended = initial.append(string("Second"))

        assert appended.parse("FirstSecond") == ("First", "Second")
        ```
        """
        return self.bind(
            lambda self_value: other.bind(
                lambda other_value: Success((*self_value, other_value))
            )
        )

    def list(self: Parser[T]) -> Parser[List[T]]:
        """Wrap the result in a list."""
        return self.map(lambda value: [value], map_name="Wrap list")

    # Unpack first arg
    @overload
    def __add__(
        self: Parser[Tuple[Unpack[Ts]]],
        other: Parser[Tuple[T1]],
    ) -> Parser[Tuple[Unpack[Ts], T1]]: ...

    @overload
    def __add__(
        self: Parser[Tuple[Unpack[Ts]]],
        other: Parser[Tuple[T1, T2]],
    ) -> Parser[Tuple[Unpack[Ts], T1, T2]]: ...

    @overload
    def __add__(
        self: Parser[Tuple[Unpack[Ts]]],
        other: Parser[Tuple[T1, T2, T3]],
    ) -> Parser[Tuple[Unpack[Ts], T1, T2, T3]]: ...

    @overload
    def __add__(
        self: Parser[Tuple[Unpack[Ts]]],
        other: Parser[Tuple[T1, T2, T3, T4]],
    ) -> Parser[Tuple[Unpack[Ts], T1, T2, T3, T4]]: ...

    @overload
    def __add__(
        self: Parser[Tuple[Unpack[Ts]]],
        other: Parser[Tuple[T1, T2, T3, T4, T5]],
    ) -> Parser[Tuple[Unpack[Ts], T1, T2, T3, T4, T5]]: ...

    # Cover the rest of cases which can't return a homogeneous tuple
    @overload
    def __add__(
        self: Parser[Tuple[T1, ...]], other: Parser[Tuple[T2, ...]]
    ) -> Parser[Tuple[T1 | T2, ...]]: ...

    @overload
    def __add__(
        self: Parser[Tuple[Any, ...]], other: Parser[Tuple[Any, ...]]
    ) -> Parser[Tuple[Any, ...]]: ...

    # Literal strings are not caught by the other cases
    @overload
    def __add__(self: Parser[LiteralString], other: Parser[str]) -> Parser[str]: ...

    # Mypy calls this unreachable; pyright calls it reachable
    @overload
    def __add__(  # type: ignore[overload-cannot-match]
        self: Parser[str], other: Parser[LiteralString]
    ) -> Parser[str]: ...

    # SupportsAdd compatible
    @overload
    def __add__(
        self: Parser[SupportsAdd[Addable, AddResult]], other: Parser[Addable]
    ) -> Parser[AddResult]: ...

    @overload
    def __add__(
        self: Parser[Addable], other: Parser[SupportsRAdd[Addable, AddResult]]
    ) -> Parser[AddResult]: ...

    def __add__(self: Parser[Any], other: Parser[Any]) -> Parser[Any]:
        """Run this parser followed by ``other``, and add the result values together."""
        if isinstance(self, Sequence) and isinstance(other, Sequence):
            # Merge two sequences into one
            return Sequence((*self.parsers, *other.parsers))

        return seq(self, other).map(lambda x: x[0] + x[1], "Add")

    def concat(
        self: Parser[Iterable[SupportsAdd[T, T1]]],
    ) -> Parser[T1]:
        """
        Add all the elements of an iterable result together.

        For an iterable of strings, this concatenates the strings:

        ```python
        from parmancer import digits, string

        delimited = digits.sep_by(string("-"))

        assert delimited.parse("0800-12-3") == ["0800", "12", "3"]

        assert delimited.concat().parse("0800-12-3") == "0800123"
        ```
        """

        return self.map(partial(reduce, operator.add), "Concat")

    # >>
    def __rshift__(self, other: Parser[T]) -> Parser[T]:
        """Run this parser followed by ``other``, keeping only ``other``'s result."""
        return KeepOne(left=(self,), keep=other)

    def keep_right(self, other: Parser[T]) -> Parser[T]:
        """
        This parser is run, followed by the other parser, but only the result of the
        other parser is kept.

        Another way to use this is with the `>>` operator:

        ```python
        from parmancer import string

        parser = string("a") >> string("b")
        # The "a" is matched but not kept as part of the result
        assert parser.parse("ab") == "b"
        ```
        """
        return KeepOne(left=(self,), keep=other)

    # <<
    def __lshift__(self: Parser[T], other: Parser[Any]) -> Parser[T]:
        """Run this parser followed by ``other``, keeping only this parser's result."""
        return KeepOne(keep=self, right=(other,))

    def keep_left(self: Parser[T], other: Parser[Any]) -> Parser[T]:
        """
        This parser is run, followed by the other parser, but only the result of this
        parser is kept.

        Another way to use this is with the `<<` operator:

        ```python
        from parmancer import string

        parser = string("a") << string("b")
        # The "b" is matched but not kept as part of the result
        assert parser.parse("ab") == "a"
        ```
        """
        return KeepOne(keep=self, right=(other,))

    def gate(self: Parser[T], gate_function: Callable[[T], bool]) -> Parser[T]:
        """
        Fail the parser if ``gate_function`` returns False when called on the result,
        otherwise succeed without changing the result.
        """
        return Gate(self, gate_function)

    @overload
    def optional(
        self: Parser[T1], default: Literal[None] = None
    ) -> Parser[T1 | None]: ...

    @overload
    def optional(self: Parser[T1], default: AnyLiteral) -> Parser[T1 | AnyLiteral]: ...

    @overload
    def optional(self: Parser[T1], default: T2) -> Parser[T1 | T2]: ...

    def optional(
        self: Parser[T1], default: Optional[T2] = None
    ) -> Parser[T1 | Optional[T2]]:
        """
        Make the previous parser optional by returning a result with a value of
        ``default`` if the parser failed.
        """
        return Choice((self, success(default)))

    def with_name(self, name: str) -> Parser[T_co]:
        """Set the name of the parser."""
        return NamedParser(name=name, parser=self)

    def breakpoint(self) -> Parser[T_co]:
        """Insert a breakpoint before the current parser runs, for debugging."""

        @stateful_parser
        def parser(state: TextState) -> Result[T_co]:
            breakpoint()
            result = self.parse_result(state)
            return result

        return parser

def parse(self: Parser[T_co], text: str, state_handler: Type[TextState] = TextState, debug: bool = False) -> T_co:

Run the parser on input text, returning the parsed value or raising a ParseError on failure.

text - the text to be parsed
state_handler (optional) - the class to use for handling parser state
debug (optional) - if True, enables debug mode with detailed error information

Debug mode provides detailed information about parser execution when parsing fails, including a parse tree that shows successful parsers (marked with "= value") and failed parsers (marked with "X (failed)"). This is useful during development but has performance overhead.

def match(self, text: str, state_handler: Type[TextState] = TextState, debug: bool = False) -> Result[T_co]:

Run the parser on input text, returning the parsed result.

Unlike Parser.parse, this method does not raise an error if parsing fails; instead, it returns a Result type wrapping the parser output or the failure state.

  • text - the text to be parsed
  • state_handler (optional) - the class to use for handling parser state
  • debug (optional) - if True, enables debug mode with detailed error information

Debug mode provides the same detailed parser execution information as Parser.parse, but accessible through the Result object's state rather than a raised exception.

def parse_result( self, state: TextState) -> Result[+T_co]:
460    def parse_result(self, state: TextState) -> Result[T_co]:
461        """
462        Given the input text and the current parsing position (state), parse and return
463        a result (success with the parsed value, or failure with failure info).
464
465        Override this method in subclasses to create a specific parser.
466        """
467        return NotImplemented  # type: ignore[no-any-return]

Given the input text and the current parsing position (state), parse and return a result (success with the parsed value, or failure with failure info).

Override this method in subclasses to create a specific parser.

def result( self: Parser[typing.Any], value: ~T) -> Parser[~T]:
475    def result(self: Parser[Any], value: T) -> Parser[T]:
476        """Replace the current result with the given ``value``."""
477        return self >> Success(value)

Replace the current result with the given value.

def many( self: Parser[+T_co], min_count: int = 0, max_count: int | float = inf) -> Parser[typing.List[+T_co]]:
493    def many(
494        self: Parser[T_co],
495        min_count: int = 0,
496        max_count: int | float = float("inf"),
497    ) -> Parser[List[T_co]]:
498        """Repeat the parser until it doesn't match, storing all matches in a list.
499        Optionally set a minimum or maximum number of times to match.
500
501        :param min_count: Match at least this many times
502        :param max_count: Match at most this many times
503        :return: A new parser which will repeatedly apply the previous parser
504        """
505        return Range(self, min_count=min_count, max_count=max_count)

Repeat the parser until it doesn't match, storing all matches in a list. Optionally set a minimum or maximum number of times to match.

Parameters
  • min_count: Match at least this many times
  • max_count: Match at most this many times
Returns

A new parser which will repeatedly apply the previous parser

def times( self: Parser[+T_co], count: int) -> Parser[typing.List[+T_co]]:
507    def times(self: Parser[T_co], count: int) -> Parser[List[T_co]]:
508        """Repeat the parser a fixed number of times, storing all matches in a list.
509
510        :param count: Number of times to apply the parser
511        :return: A new parser which will repeat the previous parser ``count`` times
512        """
513        return self.many(min_count=count, max_count=count).with_name(f"times({count})")

Repeat the parser a fixed number of times, storing all matches in a list.

Parameters
  • count: Number of times to apply the parser
Returns

A new parser which will repeat the previous parser count times

def at_most( self: Parser[+T_co], count: int) -> Parser[typing.List[+T_co]]:
515    def at_most(self: Parser[T_co], count: int) -> Parser[List[T_co]]:
516        """Repeat the parser at most ``count`` times.
517
518        :param count: Maximum number of repeats
519        :return: A new parser which will repeat the previous parser up to ``count`` times
520        """
521        return self.many(0, count).with_name(f"at_most({count})")

Repeat the parser at most count times.

Parameters
  • count: Maximum number of repeats
Returns

A new parser which will repeat the previous parser up to count times

def at_least( self: Parser[+T_co], count: int) -> Parser[typing.List[+T_co]]:
523    def at_least(self: Parser[T_co], count: int) -> Parser[List[T_co]]:
524        """Repeat the parser at least ``count`` times.
525
526        :param count: Minimum number of repeats
527        :return: A new parser which will repeat the previous parser at least ``count`` times
528        """
529        return self.many(min_count=count, max_count=float("inf")).with_name(
530            f"at_least({count})"
531        )

Repeat the parser at least count times.

Parameters
  • count: Minimum number of repeats
Returns

A new parser which will repeat the previous parser at least count times

def until( self: Parser[+T_co], until_parser: Parser[typing.Any], min_count: int = 0, max_count: int | float = inf) -> Parser[typing.List[+T_co]]:
533    def until(
534        self: Parser[T_co],
535        until_parser: Parser[Any],
536        min_count: int = 0,
537        max_count: int | float = float("inf"),
538    ) -> Parser[List[T_co]]:
539        """Repeatedly apply the parser until the ``until_parser`` matches, optionally
540        setting a minimum or maximum number of times to repeat.
541
542        :param until_parser: Repeats will stop when this parser matches
543        :param min_count: Optional minimum number of repeats required to succeed
544        :param max_count: Optional maximum number of repeats before the ``until_parser``
545            must succeed
546        :return: A new parser which will repeat the previous parser until ``until_parser``
547        """
548        return Until(self, until_parser, min_count, max_count)

Repeatedly apply the parser until the until_parser matches, optionally setting a minimum or maximum number of times to repeat.

Parameters
  • until_parser: Repeats will stop when this parser matches
  • min_count: Optional minimum number of repeats required to succeed
  • max_count: Optional maximum number of repeats before the until_parser must succeed
Returns

A new parser which will repeat the previous parser until until_parser matches

def sep_by( self: Parser[+T_co], sep: Parser[typing.Any], *, min_count: int = 0, max_count: int | float = inf) -> Parser[typing.List[+T_co]]:
550    def sep_by(
551        self: Parser[T_co],
552        sep: Parser[Any],
553        *,
554        min_count: int = 0,
555        max_count: int | float = float("inf"),
556    ) -> Parser[List[T_co]]:
557        r"""
558        Alternately apply this parser and the ``sep`` parser, keeping a list of results
559        from this parser.
560
561        For example, to match a comma-separated list of values, keeping only the values
562        and discarding the commas:
563
564        ```python
565        from parmancer import regex, string
566
567        value = regex(r"\d+")
568        sep = string(", ")
569        parser = value.sep_by(sep)
570        assert parser.parse("1, 2, 30") == ["1", "2", "30"]
571        ```
572
573        :param sep: The parser acting as a separator
574        :param min_count: Optional minimum number of repeats
575        :param max_count: Optional maximum number of repeats
576        :return: A new parser which will apply this parser multiple times, with ``sep``
577            applied between each repeat.
578        """
579        return Range(
580            self, separator_parser=sep, min_count=min_count, max_count=max_count
581        )

Alternately apply this parser and the sep parser, keeping a list of results from this parser.

For example, to match a comma-separated list of values, keeping only the values and discarding the commas:

from parmancer import regex, string

value = regex(r"\d+")
sep = string(", ")
parser = value.sep_by(sep)
assert parser.parse("1, 2, 30") == ["1", "2", "30"]
Parameters
  • sep: The parser acting as a separator
  • min_count: Optional minimum number of repeats
  • max_count: Optional maximum number of repeats
Returns

A new parser which will apply this parser multiple times, with sep applied between each repeat.

def bind( self: Parser[~T1], bind_fn: Callable[[~T1], Parser[~T2]]) -> Parser[~T2]:
583    def bind(
584        self: Parser[T1],
585        bind_fn: Callable[[T1], Parser[T2]],
586    ) -> Parser[T2]:
587        """
588        Bind the result of the current parser to a function which returns another
589        parser.
590
591        :param bind_fn: A function which will take the result of the current parser as
592            input and return another parser which may depend on the result.
593        :return: The bound parser created by ``bind_fn``
594        """
595        return Bind(self, bind_fn)

Bind the result of the current parser to a function which returns another parser.

Parameters
  • bind_fn: A function which will take the result of the current parser as input and return another parser which may depend on the result.
Returns

The bound parser created by bind_fn

def map( self: Parser[~T1], map_fn: Callable[[~T1], ~T2], map_name: Optional[str] = None) -> Parser[~T2]:
597    def map(
598        self: Parser[T1],
599        map_fn: Callable[[T1], T2],
600        map_name: Optional[str] = None,
601    ) -> Parser[T2]:
602        """Convert the current result to a new result by passing its value through
603        ``map_fn``
604
605        :param map_fn: The current parser result value will be passed through this
606            function, creating a new result.
607        :param map_name: A name to give to the map function
608        :return: A new parser which will convert the previous parser's result to a new
609            value using ``map_fn``
610        """
611        if map_name is None:
612            map_name = "map"
613            if hasattr(map_fn, "__name__"):
614                map_name = map_fn.__name__
615
616        return Map(parser=self, map_callable=map_fn, map_name=map_name)

Convert the current result to a new result by passing its value through map_fn

Parameters
  • map_fn: The current parser result value will be passed through this function, creating a new result.
  • map_name: A name to give to the map function
Returns

A new parser which will convert the previous parser's result to a new value using map_fn

def map_failure( self, failure_transform: Callable[[FailureInfo], FailureInfo]) -> Parser[+T_co]:
618    def map_failure(
619        self, failure_transform: Callable[[FailureInfo], FailureInfo]
620    ) -> Parser[T_co]:
621        """Transform a failure state using a transform function, used for example to add
622        additional context to a parser failure.
623
624        :param failure_transform: A function which converts a ``FailureInfo`` into
625            another ``FailureInfo``
626        :return: A parser which will map its failure info using ``failure_transform``
627        """
628        return MapFailure(self, failure_transform)

Transform a failure state using a transform function, used for example to add additional context to a parser failure.

Parameters
  • failure_transform: A function which converts a FailureInfo into another FailureInfo
Returns

A parser which will map its failure info using failure_transform

def unpack( self: Parser[typing.Tuple[typing.Unpack[Ts]]], transform_fn: Callable[[Unpack[Ts]], ~T2]) -> Parser[~T2]:
630    def unpack(
631        self: Parser[Tuple[Unpack[Ts]]],
632        transform_fn: Callable[[Unpack[Ts]], T2],
633    ) -> Parser[T2]:
634        """When the result is a tuple, it can be unpacked and passed as *args to
635        ``transform_fn``, creating a new result containing the function's output.
636
637        :param transform_fn: Function to unpack the current result tuple into as args
638        :return: An updated parser which will unpack its result into ``transform_fn``
639            to produce a new result
640        """
641        return self.bind(lambda value: Success(transform_fn(*value))).with_name(
642            "unpack"
643        )

When the result is a tuple, it can be unpacked and passed as *args to transform_fn, creating a new result containing the function's output.

Parameters
  • transform_fn: Function to unpack the current result tuple into as args
Returns

An updated parser which will unpack its result into transform_fn to produce a new result

def tuple( self: Parser[~T]) -> Parser[typing.Tuple[~T]]:
645    def tuple(self: Parser[T]) -> Parser[Tuple[T]]:
646        """Wrap the result in a tuple of length 1."""
647        return self.map(lambda value: (value,), "Wrap tuple")

Wrap the result in a tuple of length 1.

def append( self: Parser[typing.Tuple[typing.Unpack[Ts]]], other: Parser[~T2]) -> Parser[typing.Tuple[typing.Unpack[Ts], ~T2]]:
649    def append(
650        self: Parser[Tuple[Unpack[Ts]]], other: Parser[T2]
651    ) -> Parser[Tuple[Unpack[Ts], T2]]:
652        """
653        Append the result of another parser to the end of the current parser's result tuple
654
655        ```python
656        from parmancer import string
657
658        initial = string("First").tuple()
659        appended = initial.append(string("Second"))
660
661        assert appended.parse("FirstSecond") == ("First", "Second")
662        ```
663        """
664        return self.bind(
665            lambda self_value: other.bind(
666                lambda other_value: Success((*self_value, other_value))
667            )
668        )

Append the result of another parser to the end of the current parser's result tuple

from parmancer import string

initial = string("First").tuple()
appended = initial.append(string("Second"))

assert appended.parse("FirstSecond") == ("First", "Second")

def list( self: Parser[~T]) -> Parser[typing.List[~T]]:
670    def list(self: Parser[T]) -> Parser[List[T]]:
671        """Wrap the result in a list."""
672        return self.map(lambda value: [value], map_name="Wrap list")

Wrap the result in a list.

def concat( self: Parser[typing.Iterable[parmancer.parser.SupportsAdd[~T, ~T1]]]) -> Parser[~T1]:
762    def concat(
763        self: Parser[Iterable[SupportsAdd[T, T1]]],
764    ) -> Parser[T1]:
765        """
766        Add all the elements of an iterable result together.
767
768        For an iterable of strings, this concatenates the strings:
769
770        ```python
771        from parmancer import digits, string
772
773        delimited = digits.sep_by(string("-"))
774
775        assert delimited.parse("0800-12-3") == ["0800", "12", "3"]
776
777        assert delimited.concat().parse("0800-12-3") == "0800123"
778        ```
779        """
780
781        return self.map(partial(reduce, operator.add), "Concat")

Add all the elements of an iterable result together.

For an iterable of strings, this concatenates the strings:

from parmancer import digits, string

delimited = digits.sep_by(string("-"))

assert delimited.parse("0800-12-3") == ["0800", "12", "3"]

assert delimited.concat().parse("0800-12-3") == "0800123"

def keep_right(self, other: Parser[~T]) -> Parser[~T]:
788    def keep_right(self, other: Parser[T]) -> Parser[T]:
789        """
790        This parser is run, followed by the other parser, but only the result of the
791        other parser is kept.
792
793        Another way to use this is with the `>>` operator:
794
795        ```python
796        from parmancer import string
797
798        parser = string("a") >> string("b")
799        # The "a" is matched but not kept as part of the result
800        assert parser.parse("ab") == "b"
801        ```
802        """
803        return KeepOne(left=(self,), keep=other)

This parser is run, followed by the other parser, but only the result of the other parser is kept.

Another way to use this is with the >> operator:

from parmancer import string

parser = string("a") >> string("b")
# The "a" is matched but not kept as part of the result
assert parser.parse("ab") == "b"

def keep_left( self: Parser[~T], other: Parser[typing.Any]) -> Parser[~T]:
810    def keep_left(self: Parser[T], other: Parser[Any]) -> Parser[T]:
811        """
812        This parser is run, followed by the other parser, but only the result of this
813        parser is kept.
814
815        Another way to use this is with the `<<` operator:
816
817        ```python
818        from parmancer import string
819
820        parser = string("a") << string("b")
821        # The "b" is matched but not kept as part of the result
822        assert parser.parse("ab") == "a"
823        ```
824        """
825        return KeepOne(keep=self, right=(other,))

This parser is run, followed by the other parser, but only the result of this parser is kept.

Another way to use this is with the << operator:

from parmancer import string

parser = string("a") << string("b")
# The "b" is matched but not kept as part of the result
assert parser.parse("ab") == "a"

def gate( self: Parser[~T], gate_function: Callable[[~T], bool]) -> Parser[~T]:
827    def gate(self: Parser[T], gate_function: Callable[[T], bool]) -> Parser[T]:
828        """
829        Fail the parser if ``gate_function`` returns False when called on the result,
830        otherwise succeed without changing the result.
831        """
832        return Gate(self, gate_function)

Fail the parser if gate_function returns False when called on the result, otherwise succeed without changing the result.

def optional( self: Parser[~T1], default: Optional[~T2] = None) -> Parser[typing.Union[~T1, ~T2, NoneType]]:
845    def optional(
846        self: Parser[T1], default: Optional[T2] = None
847    ) -> Parser[T1 | Optional[T2]]:
848        """
849        Make the previous parser optional by returning a result with a value of
850        ``default`` if the parser failed.
851        """
852        return Choice((self, success(default)))

Make the previous parser optional by returning a result with a value of default if the parser failed.

def with_name(self, name: str) -> Parser[+T_co]:
854    def with_name(self, name: str) -> Parser[T_co]:
855        """Set the name of the parser."""
856        return NamedParser(name=name, parser=self)

Set the name of the parser.

def breakpoint(self) -> Parser[+T_co]:
858    def breakpoint(self) -> Parser[T_co]:
859        """Insert a breakpoint before the current parser runs, for debugging."""
860
861        @stateful_parser
862        def parser(state: TextState) -> Result[T_co]:
863            breakpoint()
864            result = self.parse_result(state)
865            return result
866
867        return parser

Insert a breakpoint before the current parser runs, for debugging.

@dataclass(**_slots)
class Result(typing.Generic[+T_co]):
284@dataclass(**_slots)
285class Result(Generic[T_co]):
286    """
287    A result of running a parser, including whether it failed or succeeded, the parsed
288    value if it succeeded, the text state after parsing, and any failure information
289    about the furthest position in the text which has been parsed so far.
290
291    The generic type parameter `T_co` represents the type of the parsed value when the
292    parsing operation succeeds. This type corresponds to the return type of the parser
293    that produced this result. For example:
294
295    - `Result[str]`: Result of a parser that produces string values
296    - `Result[int]`: Result of a parser that produces integer values
297    - `Result[List[T]]`: Result of a parser that produces lists of values of type T
298
299    The `_co` suffix indicates that the type parameter is covariant, which means that if
300    `Child` is a subtype of `Parent`, then `Result[Child]` is a subtype of `Result[Parent]`.
301    """
302
303    status: bool
304    state: TextState
305    failure_info: FailureInfo
306    value: T_co
307
308    def expect(self: Self) -> Self:
309        """
310        Raise `ResultAsException` if parsing failed, otherwise return the result.
311
312        This is useful in stateful parsers as a way to exit parsing part way through
313        a function; the `ResultAsException` will then be caught and turned into a
314        failure `Result` by the `StatefulParser`.
315        """
316        if not self.status:
317            raise ResultAsException(self)
318        return self
319
320    def map_failure(
321        self, failure_transform: Callable[[FailureInfo], FailureInfo]
322    ) -> Result[T_co]:
323        """
324        If the result is a failure, map the failure to a new failure value by applying
325        `failure_transform`.
326        """
327        if self.status:
328            return self
329        mapped_info = failure_transform(self.failure_info)
330        failures = self.state.failures
331        if self.failure_info in self.state.failures:
332            # Need to update the failures state
333            failures = tuple(
334                mapped_info if info is self.failure_info else info for info in failures
335            )
336        return Result(
337            self.status, self.state.replace_failures(failures), mapped_info, self.value
338        )

A result of running a parser, including whether it failed or succeeded, the parsed value if it succeeded, the text state after parsing, and any failure information about the furthest position in the text which has been parsed so far.

The generic type parameter T_co represents the type of the parsed value when the parsing operation succeeds. This type corresponds to the return type of the parser that produced this result. For example:

  • Result[str]: Result of a parser that produces string values
  • Result[int]: Result of a parser that produces integer values
  • Result[List[T]]: Result of a parser that produces lists of values of type T

The _co suffix indicates that the type parameter is covariant, which means that if Child is a subtype of Parent, then Result[Child] is a subtype of Result[Parent].

Result( status: bool, state: TextState, failure_info: FailureInfo, value: +T_co)
status: bool
state: TextState
failure_info: FailureInfo
value: +T_co
def expect(self: Self) -> Self:
308    def expect(self: Self) -> Self:
309        """
310        Raise `ResultAsException` if parsing failed, otherwise return the result.
311
312        This is useful in stateful parsers as a way to exit parsing part way through
313        a function; the `ResultAsException` will then be caught and turned into a
314        failure `Result` by the `StatefulParser`.
315        """
316        if not self.status:
317            raise ResultAsException(self)
318        return self

Raise ResultAsException if parsing failed, otherwise return the result.

This is useful in stateful parsers as a way to exit parsing part way through a function; the ResultAsException will then be caught and turned into a failure Result by the StatefulParser.

def map_failure( self, failure_transform: Callable[[FailureInfo], FailureInfo]) -> Result[+T_co]:
320    def map_failure(
321        self, failure_transform: Callable[[FailureInfo], FailureInfo]
322    ) -> Result[T_co]:
323        """
324        If the result is a failure, map the failure to a new failure value by applying
325        `failure_transform`.
326        """
327        if self.status:
328            return self
329        mapped_info = failure_transform(self.failure_info)
330        failures = self.state.failures
331        if self.failure_info in self.state.failures:
332            # Need to update the failures state
333            failures = tuple(
334                mapped_info if info is self.failure_info else info for info in failures
335            )
336        return Result(
337            self.status, self.state.replace_failures(failures), mapped_info, self.value
338        )

If the result is a failure, map the failure to a new failure value by applying failure_transform.

class ParseError(builtins.ValueError):
246class ParseError(ValueError):
247    """A parsing error."""
248
249    def __init__(self, failures: Tuple[FailureInfo, ...], state: TextState) -> None:
250        """Create a parsing error with specific failures for a given parser state."""
251        self.failures: Tuple[FailureInfo, ...] = failures
252        self.state: TextState = state
253
254    def __str__(self) -> str:
255        """
256        Error text to display, including information about whichever parser(s) consumed
257        the most text, along with a small window of context showing where parsing
258        failed. If debug mode was used, includes detailed parser state information.
259        """
260        furthest_state = self.state.at(max(failure.index for failure in self.failures))
261        messages = sorted(f"'{info.message}'" for info in self.failures)
262
263        # Build the basic error message
264        if len(messages) == 1:
265            basic_error = f"failed with {messages[0]}\nFurthest parsing position:\n{furthest_state.context_display()}"
266        else:
267            basic_error = f"failed with {', '.join(messages)}\nFurthest parsing position:\n{furthest_state.context_display()}"
268
269        # Check if this is a debug state and add debug information
270        try:
271            # Import here to avoid circular imports
272            from parmancer.debug import DebugTextState
273
274            if isinstance(self.state, DebugTextState):
275                debug_info = self.state.get_debug_info()
276                return f"{basic_error}\n\n{debug_info}"
277        except ImportError:
278            # If debug module isn't available, just return basic error
279            pass
280
281        return basic_error

A parsing error.

ParseError( failures: Tuple[FailureInfo, ...], state: TextState)
249    def __init__(self, failures: Tuple[FailureInfo, ...], state: TextState) -> None:
250        """Create a parsing error with specific failures for a given parser state."""
251        self.failures: Tuple[FailureInfo, ...] = failures
252        self.state: TextState = state

Create a parsing error with specific failures for a given parser state.

failures: Tuple[FailureInfo, ...]
state: TextState
Inherited Members
builtins.BaseException
with_traceback
add_note
args
@dataclass(frozen=True, eq=True)
class FailureInfo:
116@dataclass(frozen=True, eq=True)
117class FailureInfo:
118    """Information about a parsing failure: the text index and a message."""
119
120    index: int
121    message: str

Information about a parsing failure: the text index and a message.

FailureInfo(index: int, message: str)
index: int
message: str
@dataclass(frozen=True, **_slots)
class TextState:
124@dataclass(frozen=True, **_slots)
125class TextState:
126    """
127    Parsing state: the input text, the current index of parsing, failures from previous
128    parser branches for error reporting.
129
130    Note that many `TextState` objects are created during parsing and they all contain
131    the original input `text`, but these are all references to the same original string
132    rather than copies.
133    """
134
135    text: str
136    """The full text being parsed."""
137    index: int
138    """Index at start of the remaining unparsed text."""
139    failures: Tuple[FailureInfo, ...] = tuple()
140    """Previously encountered parsing failures, used for reporting parser failures."""
141
142    @classmethod
143    def start(cls: Type[Self], text: str) -> Self:
144        """Initialize TextState for the given text with the index at the start."""
145        return cls(text, 0)
146
147    def progress(
148        self: Self, index: int, failures: Tuple[FailureInfo, ...] = tuple()
149    ) -> Self:
150        """
151        Create a new state from the current state, maintaining any extra information
152
153        Every time a new state is made from an existing state, it should pass through
154        this function to keep any values other than the basic TextState fields.
155        This is similar to making a shallow copy but doesn't require mutation after
156        the copy is made.
157        """
158        return type(self)(
159            self.text,
160            index,
161            failures,
162            **{
163                field.name: getattr(self, field.name)
164                for field in fields(self)
165                if field.name not in ("text", "index", "failures")
166            },
167        )
168
169    def at(self: Self, index: int) -> Self:
170        """Move `index` to the given value, returning a new state."""
171        return self.progress(index, self.failures)
172
173    def apply(
174        self: Self, parser: Parser[T_co], raise_failure: bool = True
175    ) -> Result[T_co]:
176        """
177        Apply a parser to the current state, returning the parsing `Result` which may
178        be a success or failure.
179        """
180        result = parser.parse_result(self)
181        if not result.status and raise_failure:
182            raise ResultAsException(result)
183        return result
184
185    def success(self: Self, value: T) -> Result[T]:
186        """Produce a success Result with the given value."""
187        return Result(True, self, FailureInfo(-1, ""), value)
188
189    def failure(self: Self, message: str) -> Result[Any]:
190        """Create a failure Result with the given failure message."""
191        info = FailureInfo(index=self.index, message=message)
192
193        new_state = self.merge_failures((info,))
194        return Result(
195            False,
196            new_state,
197            info,
198            None,
199        )
200
201    def merge_state_failures(self: Self, state: TextState) -> Self:
202        return self.merge_failures(state.failures)
203
204    def merge_failures(self: Self, other: Tuple[FailureInfo, ...]) -> Self:
205        furthest_failure = (
206            max(info.index for info in self.failures) if self.failures else -1
207        )
208        result_failures: Tuple[FailureInfo, ...] = self.failures
209        for failure in other:
210            if furthest_failure < failure.index:
211                furthest_failure = failure.index
212                result_failures = (failure,)
213            elif furthest_failure == failure.index:
214                result_failures = (*result_failures, failure)
215
216        return self.progress(self.index, result_failures)
217
218    def replace_failures(self: Self, failures: Tuple[FailureInfo, ...]) -> Self:
219        """Replace any current failures with new failures."""
220        return self.progress(self.index, failures)
221
222    def line_col(self: Self) -> LineColumn:
223        """The line and column at the current parser index in the text."""
224        return LineColumn.from_index(self.text, self.index)
225
226    def context_display(self) -> str:
227        """
228        Text which displays a context window around the current parser position, with
229        an indicator pointing to the character at the current index.
230        """
231        window, cursor = context_window(self.text, self.index, width=40)
232        context: List[str] = []
233        for i, line in enumerate(window):
234            if i == cursor.line:
235                context.append(line.rstrip("\n") + "\n")
236                context.append("~" * cursor.column + "^\n")
237            else:
238                context.append(line)
239        return "".join(context)
240
241    def remaining(self: Self) -> str:
242        """All of the text remaining to be parsed, from the current index onward."""
243        return self.text[self.index :]

Parsing state: the input text, the current index of parsing, failures from previous parser branches for error reporting.

Note that many TextState objects are created during parsing and they all contain the original input text, but these are all references to the same original string rather than copies.

TextState( text: str, index: int, failures: Tuple[FailureInfo, ...] = ())
text: str

The full text being parsed.

index: int

Index at start of the remaining unparsed text.

failures: Tuple[FailureInfo, ...]

Previously encountered parsing failures, used for reporting parser failures.

@classmethod
def start(cls: Type[Self], text: str) -> Self:
142    @classmethod
143    def start(cls: Type[Self], text: str) -> Self:
144        """Initialize TextState for the given text with the index at the start."""
145        return cls(text, 0)

Initialize TextState for the given text with the index at the start.

def progress(self: Self, index: int, failures: Tuple[FailureInfo, ...] = ()) -> Self:
147    def progress(
148        self: Self, index: int, failures: Tuple[FailureInfo, ...] = tuple()
149    ) -> Self:
150        """
151        Create a new state from the current state, maintaining any extra information
152
153        Every time a new state is made from an existing state, it should pass through
154        this function to keep any values other than the basic TextState fields.
155        This is similar to making a shallow copy but doesn't require mutation after
156        the copy is made.
157        """
158        return type(self)(
159            self.text,
160            index,
161            failures,
162            **{
163                field.name: getattr(self, field.name)
164                for field in fields(self)
165                if field.name not in ("text", "index", "failures")
166            },
167        )

Create a new state from the current state, maintaining any extra information.

Every time a new state is made from an existing state, it should pass through this function to keep any values other than the basic TextState fields. This is similar to making a shallow copy but doesn't require mutation after the copy is made.
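
The field-copying mechanism can be sketched with a simplified pair of dataclasses (the names here are illustrative, not from the library):

```python
from dataclasses import dataclass, field, fields

@dataclass(frozen=True)
class BaseState:
    text: str
    index: int

    def progress(self, index: int):
        # Rebuild with the same concrete type, carrying over any extra
        # fields defined by subclasses without mutating anything.
        return type(self)(
            self.text,
            index,
            **{
                f.name: getattr(self, f.name)
                for f in fields(self)
                if f.name not in ("text", "index")
            },
        )

@dataclass(frozen=True)
class TracedState(BaseState):
    trace: list = field(default_factory=list)

s = TracedState("abc", 0, trace=["start"])
s2 = s.progress(2)
assert isinstance(s2, TracedState)
assert s2.trace == ["start"]  # the extra field survives the transition
```

Because `progress` uses `type(self)` and `dataclasses.fields`, subclasses that add fields (as `DebugTextState` does below) keep those fields across every state transition without overriding the copy logic.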

def at(self: Self, index: int) -> Self:
169    def at(self: Self, index: int) -> Self:
170        """Move `index` to the given value, returning a new state."""
171        return self.progress(index, self.failures)

Move index to the given value, returning a new state.

def apply(self: Self, parser: Parser[T_co], raise_failure: bool = True) -> Result[T_co]:
173    def apply(
174        self: Self, parser: Parser[T_co], raise_failure: bool = True
175    ) -> Result[T_co]:
176        """
177        Apply a parser to the current state, returning the parsing `Result` which may
178        be a success or failure.
179        """
180        result = parser.parse_result(self)
181        if not result.status and raise_failure:
182            raise ResultAsException(result)
183        return result

Apply a parser to the current state, returning the parsing Result which may be a success or failure.

def success(self: Self, value: T) -> Result[T]:
185    def success(self: Self, value: T) -> Result[T]:
186        """Produce a success Result with the given value."""
187        return Result(True, self, FailureInfo(-1, ""), value)

Produce a success Result with the given value.

def failure(self: Self, message: str) -> Result[typing.Any]:
189    def failure(self: Self, message: str) -> Result[Any]:
190        """Create a failure Result with the given failure message."""
191        info = FailureInfo(index=self.index, message=message)
192
193        new_state = self.merge_failures((info,))
194        return Result(
195            False,
196            new_state,
197            info,
198            None,
199        )

Create a failure Result with the given failure message.

def merge_state_failures(self: Self, state: TextState) -> Self:
201    def merge_state_failures(self: Self, state: TextState) -> Self:
202        return self.merge_failures(state.failures)

Merge the failures recorded on another state into this state.

def merge_failures(self: Self, other: Tuple[FailureInfo, ...]) -> Self:
204    def merge_failures(self: Self, other: Tuple[FailureInfo, ...]) -> Self:
205        furthest_failure = (
206            max(info.index for info in self.failures) if self.failures else -1
207        )
208        result_failures: Tuple[FailureInfo, ...] = self.failures
209        for failure in other:
210            if furthest_failure < failure.index:
211                furthest_failure = failure.index
212                result_failures = (failure,)
213            elif furthest_failure == failure.index:
214                result_failures = (*result_failures, failure)
215
216        return self.progress(self.index, result_failures)

Merge another set of failures into this state's failures, keeping only the failures at the furthest index reached, since those are the most relevant for error reporting.

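
The merge rule (keep only failures at the furthest index) can be sketched independently of the library types, with failures represented as plain `(index, message)` pairs:

```python
from typing import List, Tuple

def merge_failures(
    current: List[Tuple[int, str]], other: List[Tuple[int, str]]
) -> List[Tuple[int, str]]:
    # Keep only the failures at the furthest index seen, since the
    # parser branch that got furthest gives the most useful message.
    furthest = max((i for i, _ in current), default=-1)
    result = list(current)
    for index, message in other:
        if index > furthest:
            furthest = index
            result = [(index, message)]
        elif index == furthest:
            result.append((index, message))
    return result

merged = merge_failures([(3, "expected digit")], [(5, "expected '+'")])
assert merged == [(5, "expected '+'")]  # the further failure wins
```
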
def replace_failures(self: Self, failures: Tuple[FailureInfo, ...]) -> Self:
218    def replace_failures(self: Self, failures: Tuple[FailureInfo, ...]) -> Self:
219        """Replace any current failures with new failures."""
220        return self.progress(self.index, failures)

Replace any current failures with new failures.

def line_col(self: Self) -> parmancer.text_display.LineColumn:
222    def line_col(self: Self) -> LineColumn:
223        """The line and column at the current parser index in the text."""
224        return LineColumn.from_index(self.text, self.index)

The line and column at the current parser index in the text.
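
The line/column lookup amounts to counting newlines before the index. A minimal version (not the library's `LineColumn`; whether the library reports 0- or 1-based positions is not shown here) looks like:

```python
def line_col(text: str, index: int):
    # 0-based line and column of the character at `index`.
    line = text.count("\n", 0, index)
    last_newline = text.rfind("\n", 0, index)
    column = index - last_newline - 1
    return (line, column)

text = "Hello World!\n1 + 2 + 3"
assert line_col(text, 0) == (0, 0)
assert line_col(text, 13) == (1, 0)  # first character of the second line
```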

def context_display(self) -> str:
226    def context_display(self) -> str:
227        """
228        Text which displays a context window around the current parser position, with
229        an indicator pointing to the character at the current index.
230        """
231        window, cursor = context_window(self.text, self.index, width=40)
232        context: List[str] = []
233        for i, line in enumerate(window):
234            if i == cursor.line:
235                context.append(line.rstrip("\n") + "\n")
236                context.append("~" * cursor.column + "^\n")
237            else:
238                context.append(line)
239        return "".join(context)

Text which displays a context window around the current parser position, with an indicator pointing to the character at the current index.
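
A simplified single-line version of this indicator (the library version additionally limits the display to a context window around the cursor) might look like:

```python
def point_at(text: str, index: int) -> str:
    # Show the line containing `index` with a caret under the character.
    start = text.rfind("\n", 0, index) + 1
    end = text.find("\n", index)
    if end == -1:
        end = len(text)
    line = text[start:end]
    column = index - start
    return line + "\n" + "~" * column + "^"

print(point_at("1 + x + 3", 4))
# 1 + x + 3
# ~~~~^
```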

def remaining(self: Self) -> str:
241    def remaining(self: Self) -> str:
242        """All of the text remaining to be parsed, from the current index onward."""
243        return self.text[self.index :]

All of the text remaining to be parsed, from the current index onward.

@dataclass(frozen=True)
class DebugTextState(parmancer.TextState):
195@dataclass(frozen=True)
196class DebugTextState(TextState):
197    """
198    A TextState subclass that captures parser execution information for debug display.
199    When a parser fails, this state can provide detailed information about what
200    parsers were attempted and where the failure occurred.
201    """
202
203    tree: Node = field(default_factory=Node.default)
204
205    def progress(
206        self: Self, index: int, failures: Tuple[FailureInfo, ...] = tuple()
207    ) -> Self:
208        """
209        Override progress to maintain the tree state across state transitions.
210        """
211        # Use the parent's progress method but ensure we keep our tree
212        new_state = super().progress(index, failures)
213        # The tree should already be preserved by the parent's progress method
214        # since it uses **{field.name: getattr(self, field.name) for field in fields(self)}
215        return new_state
216
217    def success(self: Self, value: Any) -> Any:
218        # Capture successful parser results in the tree
219        stack = ParseStack.get_from_stack()
220        if stack.path:  # Only add to tree if we have a valid path
221            node = Node(stack.path[-1], [], result=value)
222            append_tree(self.tree, stack.path, node)
223
224        return super().success(value)
225
226    def failure(self: Self, message: str) -> Any:
227        # Capture failure in the tree when it actually occurs
228        stack = ParseStack.get_from_stack()
229        if stack.path:  # Only add to tree if we have a valid path
230            node = Node(stack.path[-1], [], result=Failure)
231            append_tree(self.tree, stack.path, node)
232
233        return super().failure(message)
234
235    def get_debug_info(self: Self) -> str:
236        """Get formatted debug information for this state."""
237        return format_debug_info(self, self.tree)

A TextState subclass that captures parser execution information for debug display. When a parser fails, this state can provide detailed information about what parsers were attempted and where the failure occurred.
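
The tree capture amounts to appending a node at the path of the currently executing parsers. A hypothetical sketch of a path-addressed tree append (the library's `Node` and `append_tree` may differ in detail):

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    result: Any = None

def append_tree(root: Node, path: List[str], node: Node) -> None:
    # Walk down existing children matching the path, creating
    # intermediate nodes as needed, then append the new node.
    current = root
    for name in path[:-1]:
        for child in current.children:
            if child.name == name:
                current = child
                break
        else:
            child = Node(name)
            current.children.append(child)
            current = child
    current.children.append(node)

root = Node("root")
append_tree(root, ["seq", "digits"], Node("digits", result="1"))
assert root.children[0].name == "seq"
assert root.children[0].children[0].result == "1"
```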

DebugTextState(text: str, index: int, failures: Tuple[FailureInfo, ...] = (), tree: parmancer.debug.Node = <factory>)

tree: parmancer.debug.Node

The tree of parser execution information captured for debug display.

def progress(self: Self, index: int, failures: Tuple[FailureInfo, ...] = ()) -> Self:
205    def progress(
206        self: Self, index: int, failures: Tuple[FailureInfo, ...] = tuple()
207    ) -> Self:
208        """
209        Override progress to maintain the tree state across state transitions.
210        """
211        # Use the parent's progress method but ensure we keep our tree
212        new_state = super().progress(index, failures)
213        # The tree should already be preserved by the parent's progress method
214        # since it uses **{field.name: getattr(self, field.name) for field in fields(self)}
215        return new_state

Override progress to maintain the tree state across state transitions.

def success(self: Self, value: Any) -> Any:
217    def success(self: Self, value: Any) -> Any:
218        # Capture successful parser results in the tree
219        stack = ParseStack.get_from_stack()
220        if stack.path:  # Only add to tree if we have a valid path
221            node = Node(stack.path[-1], [], result=value)
222            append_tree(self.tree, stack.path, node)
223
224        return super().success(value)

Record the successful parser result in the debug tree, then produce a success Result with the given value.

def failure(self: Self, message: str) -> Any:
226    def failure(self: Self, message: str) -> Any:
227        # Capture failure in the tree when it actually occurs
228        stack = ParseStack.get_from_stack()
229        if stack.path:  # Only add to tree if we have a valid path
230            node = Node(stack.path[-1], [], result=Failure)
231            append_tree(self.tree, stack.path, node)
232
233        return super().failure(message)

Record the failure in the debug tree, then create a failure Result with the given failure message.

def get_debug_info(self: Self) -> str:
235    def get_debug_info(self: Self) -> str:
236        """Get formatted debug information for this state."""
237        return format_debug_info(self, self.tree)

Get formatted debug information for this state.