regex - IGNORECASE errors in Python's re.Scanner?
There is a hidden, little-known piece of functionality in the re module, re.Scanner:
    import re

    def s_ident(scanner, token): return token
    def s_operator(scanner, token): return "op%s" % token
    def s_float(scanner, token): return float(token)
    def s_int(scanner, token): return int(token)

    scanner = re.Scanner([
        (r"[a-zA-Z]\w*", s_ident),
        (r"\d+\.\d*", s_float),
        (r"\d+", s_int),
        (r"=|\+|-|\*|/", s_operator),
        (r"\s+", None),
    ])

    print(scanner.scan("sum = 3*foo + 312.50 + bar"))
    # (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')
I want to use the IGNORECASE flag here, but it does not seem to work:
    import re

    def s_ident(scanner, token): return token
    def s_operator(scanner, token): return "op%s" % token
    def s_float(scanner, token): return float(token)
    def s_int(scanner, token): return int(token)

    scanner = re.Scanner([
        (r"(?i)[a-z]\w*", s_ident),
        (r"\d+\.\d*", s_float),
        (r"\d+", s_int),
        (r"=|\+|-|\*|/", s_operator),
        (r"\s+", None),
    ])

    print(scanner.scan("sum = 3*foo + 312.50 + bar"))
    # ([], 'sum = 3*foo + 312.50 + bar')
Is this an issue with Scanner or an error in my code? Is it possible to implement case-insensitive matching using Scanner?
The issue is reproducible on Python 2.7.9.
Expected value: (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')
Actual value: ([], 'sum = 3*foo + 312.50 + bar')
You can pass the flags parameter to the constructor:
    scanner = re.Scanner([
        (r"[a-z]\w*", s_ident),
        (r"\d+\.\d*", s_float),
        (r"\d+", s_int),
        (r"=|\+|-|\*|/", s_operator),
        (r"\s+", None),
    ], flags=re.IGNORECASE)
Source of Scanner: https://github.com/python/cpython/blob/master/Lib/re.py#L345
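For completeness, here is a minimal self-contained sketch of the flags approach, combining the handler functions from the question with the constructor call above. The mixed-case input string is my own illustration rather than something from the question, and the example uses Python 3 print syntax. Keep in mind that re.Scanner is undocumented, so its behaviour may differ between Python versions.

```python
import re

# Handler functions from the question: each receives the scanner and the matched text.
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

# The flags are passed to the Scanner constructor, not embedded in a single
# pattern, so they apply to the whole lexicon.
scanner = re.Scanner([
    (r"[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),  # None means "skip this match"
], flags=re.IGNORECASE)

# Illustrative input with upper-case identifiers (an assumption, not from the question).
tokens, remainder = scanner.scan("Sum = 3*FOO + 312.50 + Bar")
print(tokens, repr(remainder))
# ['Sum', 'op=', 3, 'op*', 'FOO', 'op+', 312.5, 'op+', 'Bar'] ''
```

An empty remainder means the whole input was tokenized. If scanning stops early, the unmatched tail is returned as the second element instead.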