Introduction for future readers
I posted the code below in the hope that it would be a good example to discuss the topic "Alternative to nonlocal for local variables in a function with nested functions in Python".
As it turns out, the answer to that question depends a lot on whether the hypothetical function with nested functions returns a value/object.
For the case when it does not, you can find a concise and clear discussion there. For people who don't know about "nonlocal", that discussion also concisely clarifies the topic and what I generally mean by alternatives to nonlocal.
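As a minimal sketch of what "nonlocal" and an alternative to it look like, consider the two functions below. The names `count_digits_nonlocal`, `count_digits_namespace`, `visit` and `state` are hypothetical and chosen only for this illustration; they do not come from the linked discussion.

```python
def count_digits_nonlocal(text):
    count = 0

    def visit(c):
        # Without this declaration, "count += 1" would raise UnboundLocalError,
        # because assignment makes "count" local to visit().
        nonlocal count
        if c.isdecimal():
            count += 1

    for c in text:
        visit(c)
    return count


def count_digits_namespace(text):
    class state:  # never instantiated; used only as a mutable namespace
        count = 0

    def visit(c):
        # Attribute assignment mutates the class object, so no nonlocal is needed.
        if c.isdecimal():
            state.count += 1

    for c in text:
        visit(c)
    return state.count
```

Both functions compute the same result; the second one trades the `nonlocal` declaration for attribute access on a throwaway class, which is the idea behind alternative one below.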
For the case with a return value/object, the next question is whether that return value/object needs to provide a class interface, typically methods to access its contents. If it does, which is likely if the problem at hand is complex enough to justify a function with nested functions, the hypothetical function should not be implemented as a function but as a class, and the accessor methods of the hypothetical return value would simply be methods of that class. In other words, the problem becomes a classical class design problem.
This is the case for the code below, as shown in 200_success's answer and in mine.
The example itself is a tokenizer for a simple arithmetic expression. For the sake of simplicity, the code includes no error handling.
Please note that the alternatives provided in this question are not good solutions. 200_success's answer and mine below provide good solutions, although error handling would need to be added.
Alternatives
Alternative one: class self
OPERATORS = '+', '-', '*', '/'

def tokenize(expression):
    def state_none(c):
        if c.isdecimal():
            self.token = c
            self.state = state_number
        elif c in OPERATORS:
            self.token = 'operator', c
            self.token_ready = True

    def state_number(c):
        if c.isdecimal():
            self.token += c
        else:
            self.char_consumed = False
            self.token = 'number', self.token
            self.token_ready = True
            self.state = state_none

    def interpret_character(c):
        self.token_ready = False
        self.char_consumed = True
        self.state(c)

    class self:
        token_ready = False
        token = None
        char_consumed = True
        state = state_none

    for c in expression:
        self.char_consumed = False
        while not self.char_consumed:
            interpret_character(c)
            if self.token_ready:
                yield self.token
    if self.state == state_number:
        yield 'number', self.token

def main():
    for x in tokenize('15+ 2 * 378 / 5'):
        print(x)
    # ('number', '15')
    # ('operator', '+')
    # ('number', '2')
    # ('operator', '*')
    # ('number', '378')
    # ('operator', '/')
    # ('number', '5')

if __name__ == '__main__':
    main()
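For comparison, here is a sketch of how the same tokenizer might look if it used nonlocal directly instead of the `class self` namespace. This version is not part of the original question; the name `tokenize_nonlocal` is hypothetical, and the logic is a line-by-line transcription of alternative one. It mainly shows how many `nonlocal` declarations the nested functions need.

```python
OPERATORS = '+', '-', '*', '/'


def tokenize_nonlocal(expression):
    # Shared state lives in plain local variables of the enclosing function.
    token = None
    token_ready = False
    char_consumed = True

    def state_none(c):
        nonlocal token, token_ready, state
        if c.isdecimal():
            token = c
            state = state_number
        elif c in OPERATORS:
            token = 'operator', c
            token_ready = True

    def state_number(c):
        nonlocal token, token_ready, char_consumed, state
        if c.isdecimal():
            token += c
        else:
            # The character ends the number; it must be re-read in state_none.
            char_consumed = False
            token = 'number', token
            token_ready = True
            state = state_none

    def interpret_character(c):
        nonlocal token_ready, char_consumed
        token_ready = False
        char_consumed = True
        state(c)

    state = state_none

    for c in expression:
        char_consumed = False
        while not char_consumed:
            interpret_character(c)
            if token_ready:
                yield token
    # Flush a number that runs to the end of the expression.
    if state == state_number:
        yield 'number', token
```

Every nested function that rebinds a shared variable has to list it in a `nonlocal` statement, and forgetting one silently creates a new local variable instead. That bookkeeping is what the `class self` trick avoids.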
Alternative two: callable class
I do not like this alternative because one first has to instantiate the class and then call the resulting object, but it is the clear inspiration for alternative one.
OPERATORS = '+', '-', '*', '/'

class Tokenizer:
    def __init__(self, expression):
        self.expression = expression
        self.token_ready = False
        self.token = None
        self.char_consumed = True
        self.state = self.state_none

    def state_none(self, c):
        if c.isdecimal():
            self.token = c
            self.state = self.state_number
        elif c in OPERATORS:
            self.token = 'operator', c
            self.token_ready = True

    def state_number(self, c):
        if c.isdecimal():
            self.token += c
        else:
            self.char_consumed = False
            self.token = 'number', self.token
            self.token_ready = True
            self.state = self.state_none

    def interpret_character(self, c):
        self.token_ready = False
        self.char_consumed = True
        self.state(c)

    def __call__(self):
        for c in self.expression:
            self.char_consumed = False
            while not self.char_consumed:
                self.interpret_character(c)
                if self.token_ready:
                    yield self.token
        if self.state == self.state_number:
            yield 'number', self.token

def main():
    for x in Tokenizer('15+ 2 * 378 / 5')():
        print(x)
    # ('number', '15')
    # ('operator', '+')
    # ('number', '2')
    # ('operator', '*')
    # ('number', '378')
    # ('operator', '/')
    # ('number', '5')

if __name__ == '__main__':
    main()
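The objection to alternative two (instantiate first, then call) could be softened by making the class iterable rather than callable. The following is only a sketch under that assumption, not necessarily the design chosen in the answers; the class name `IterableTokenizer` and the helper `_reset` are hypothetical. The state machine is the same as above, but iteration goes through `__iter__`, and the state is reset at the start of each pass so that the object can be iterated more than once.

```python
OPERATORS = '+', '-', '*', '/'


class IterableTokenizer:
    """Hypothetical variant of Tokenizer that supports `for x in obj` directly."""

    def __init__(self, expression):
        self.expression = expression
        self._reset()

    def _reset(self):
        # Put the state machine back into its initial state.
        self.token_ready = False
        self.token = None
        self.char_consumed = True
        self.state = self.state_none

    def state_none(self, c):
        if c.isdecimal():
            self.token = c
            self.state = self.state_number
        elif c in OPERATORS:
            self.token = 'operator', c
            self.token_ready = True

    def state_number(self, c):
        if c.isdecimal():
            self.token += c
        else:
            self.char_consumed = False
            self.token = 'number', self.token
            self.token_ready = True
            self.state = self.state_none

    def interpret_character(self, c):
        self.token_ready = False
        self.char_consumed = True
        self.state(c)

    def __iter__(self):
        # A generator method: each call returns a fresh iterator over the tokens.
        self._reset()
        for c in self.expression:
            self.char_consumed = False
            while not self.char_consumed:
                self.interpret_character(c)
                if self.token_ready:
                    yield self.token
        if self.state == self.state_number:
            yield 'number', self.token


if __name__ == '__main__':
    for x in IterableTokenizer('15+ 2 * 378 / 5'):
        print(x)
```

With this shape the caller writes `for x in IterableTokenizer(expression)`, without the trailing `()`. Because the state is stored on the instance, two interleaved iterations over the same object would still interfere with each other; the reset only makes sequential passes safe.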