Home  >  Article  >  Backend Development  >  Using Python to implement a simple four arithmetic interpreter

Using Python to implement a simple four arithmetic interpreter

王林
王林forward
2023-04-21 11:46:091649browse

Calculation function demonstration

Here we first show the help information of the program, and then a few simple four arithmetic operations tests. It seems that there is no problem (I cannot guarantee that the program has no bugs!).

Using Python to implement a simple four arithmetic interpreter

Output tokens

Using Python to implement a simple four arithmetic interpreter

Output AST

This formatted JSON message is too long. Not conducive to direct viewing. We render it to see the final generated tree diagram (see the previous two blogs for methods). Save the following JSON in a file, here I call it demo.json, and then execute the following command: pytm-cli -d LR -i demo.json -o demo.html, and then open the generated file in the browser html file.

Using Python to implement a simple four arithmetic interpreter

Using Python to implement a simple four arithmetic interpreter

##Code

All the code is here, only one file is needed

my_eval.py, if you want to run it, copy and paste, and then follow the steps of the demonstration.

Node, BinOp, and Constan are classes used to represent nodes.

The lexizer method in Calculator is used for word segmentation. Originally, I planned to use regularization. If you have read my previous blog, you can I found that I used regular expressions to segment words (because there is a simple word segmentation program in the regular expressions of Python's official documentation). But I saw that other people were writing participles by hand, so I did the same, but it didn’t feel very good, it was very tedious and error-prone.
The parse method is for parsing, mainly to analyze the structure of the expression, determine whether it conforms to the grammar of the four arithmetic operations, and finally generate the expression tree (its AST).

"""
Grammar

G -> E
E -> T E'
E' -> '+' T E' | '-' T E' | ɛ
T -> F T'
T' -> '*' F T' | '/' F T' | ɛ
F -> '(' E ')' | num | name

"""

import json
import argparse


class Node:
    """
    简单的抽象语法树节点,定义一些需要使用到的具有层次结构的节点
    """

    def eval(self) -> float: ...   # 节点的计算方法
    def visit(self): ...           # 节点的访问方法


class BinOp(Node):
    """
    BinOp Node
    """

    def __init__(self, left, op, right) -> None:
        self.left = left
        self.op = op
        self.right = right

    def eval(self) -> float:
        if self.op == "+":
            return self.left.eval() + self.right.eval()
        if self.op == "-":
            return self.left.eval() - self.right.eval()
        if self.op == "*":
            return self.left.eval() * self.right.eval()
        if self.op == "/":
            return self.left.eval() / self.right.eval()
        return 0

    def visit(self):
        """
        遍历树的各个节点,并生成 JSON 表示
        """

        return {
            "name": "BinOp",
            "children": [
                self.left.visit(),
                {
                    "name": "OP",
                    "children": [
                        {
                            "name": self.op
                        }
                    ]
                },
                self.right.visit()
            ]
        }


class Constant(Node):
    """
    Constant Node
    """

    def __init__(self, value) -> None:
        self.value = value

    def eval(self) -> float:
        return self.value

    def visit(self):
        return {
            "name": "NUMBER",
            "children": [
                {
                    "name": str(self.value)  # 转成字符是因为渲染成图像时,需要该字段为 str
                }
            ]
        }


class Calculator:
    """
    Simple Expression Parser
    """

    def __init__(self, expr) -> None:
        self.expr = expr           # 输入的表达式
        self.parse_end = False     # 解析是否结束,默认未结束
        self.toks = []             # 解析的 tokens
        self.index = 0             # 解析的下标

    def lexizer(self):
        """
        分词
        """
        index = 0
        while index < len(self.expr):
            ch = self.expr[index]
            if ch in [" ", "\r", "\n"]:
                index += 1
                continue
            if &#39;0&#39; <= ch <= &#39;9&#39;:
                num_str = ch
                index += 1
                while index < len(self.expr):
                    n = self.expr[index]
                    if &#39;0&#39; <= n <= &#39;9&#39;:
                        if ch == &#39;0&#39;:
                            raise Exception("Invalid number!")
                        num_str = n
                        index += 1
                        continue
                    break
                self.toks.append({
                    "kind": "INT",
                    "value": int(num_str)
                })
            elif ch in [&#39;+&#39;, &#39;-&#39;, &#39;*&#39;, &#39;/&#39;, &#39;(&#39;, &#39;)&#39;]:
                self.toks.append({
                    "kind": ch,
                    "value": ch
                })
                index += 1
            else:
                raise Exception("Unkonwn character!")

    def get_token(self):
        """
        获取当前位置的 token
        """
        if 0 <= self.index < len(self.toks):
            tok = self.toks[self.index]
            return tok
        if self.index == len(self.toks):  # token解析结束
            return {
                "kind": "EOF",
                "value": "EOF"
            }
        raise Exception("Encounter Error, invalid index = ", self.index)

    def move_token(self):
        """
        下标向后移动一位
        """
        self.index += 1

    def parse(self) -> Node:
        """
        G -> E
        """
        # 分词
        self.lexizer()
        # 解析
        expr_tree = self.parse_expr()
        if self.parse_end:
            return expr_tree
        else:
            raise Exception("Invalid expression!")

    def parse_expr(self):
        """
        E -> T E&#39;
        E&#39; -> + T E&#39; | - T E&#39; | ɛ
        """
        # E -> E E&#39;
        left = self.parse_term()
        # E&#39; -> + T E&#39; | - T E&#39; | ɛ
        while True:
            tok = self.get_token()
            kind = tok["kind"]
            value = tok["value"]

            if tok["kind"] == "EOF":
                # 解析结束的标志
                self.parse_end = True
                break
            if kind in ["+", "-"]:
                self.move_token()
                left = BinOp(left, value, self.parse_term())
            else:
                break

        return left

    def parse_term(self):
        """
        T -> F T&#39;
        T&#39; -> * F T&#39; | / F T&#39; | ɛ
        """
        # T -> F T&#39;
        left = self.parse_factor()
        # T&#39; -> * F T&#39; | / F T&#39; | ɛ
        while True:
            tok = self.get_token()
            kind = tok["kind"]
            value = tok["value"]

            if kind in ["*", "/"]:
                self.move_token()
                right = self.parse_factor()
                left = BinOp(left, value, right)
            else:
                break

        return left

    def parse_factor(self):
        """
        F -> &#39;(&#39; E &#39;)&#39; | num | name
        """
        tok = self.get_token()
        kind = tok["kind"]
        value = tok["value"]
        if kind == &#39;(&#39;:
            self.move_token()
            expr_node = self.parse_expr()
            if self.get_token()["kind"] != ")":
                raise Exception("Encounter Error, expected )!")
            self.move_token()
            return expr_node
        if kind == "INT":
            self.move_token()
            return Constant(value=value)

        raise Exception("Encounter Error, unknown factor: ", kind)


if __name__ == "__main__":
    # 添加命令行参数解析器
    cmd_parser = argparse.ArgumentParser(
        description="Simple Expression Interpreter!")
    group = cmd_parser.add_mutually_exclusive_group()
    group.add_argument("--tokens", help="print tokens", action="store_true")
    group.add_argument("--ast", help="print ast in JSON", action="store_true")
    cmd_parser.add_argument(
        "expr", help="expression, contains [&#39;+&#39;, &#39;-&#39;, &#39;*&#39;, &#39;/&#39;, &#39;(&#39;, &#39;)&#39;, &#39;num&#39;]")
    args = cmd_parser.parse_args()

    calculator = Calculator(expr=args.expr)
    tree = calculator.parse()
    if args.tokens:   # 输出 tokens
        for t in calculator.toks:
            print(f"{t[&#39;kind&#39;]:3s} ==> {t[&#39;value&#39;]}")
    elif args.ast:    # 输出 JSON 表示的 AST
        print(json.dumps(tree.visit(), indent=4))
    else:             # 计算结果
        print(tree.eval())

Summary

I originally wanted to talk about why it is called

my_eval.py, but I feel that there are not many people behind it, so I will say it here. If you write a complex expression, how do you verify whether it is correct? Here we can just use Python, the most perfect interpreter, haha. Python's eval function is used here. Of course, you don't need to call this function, just copy the calculated expression directly. I use the eval function just to express why my program is called my_eval.

Using Python to implement a simple four arithmetic interpreter

After this is implemented, it can be regarded as a simple four arithmetic interpreter. However, if you do it again, you will probably feel that the whole process is very cumbersome like me. Because there are ready-made tools for word segmentation and grammatical analysis, and they are not prone to errors, the workload can be greatly reduced. However, it is necessary to do it yourself. Before using the tool, you must at least understand the function of the tool.

The above is the detailed content of Using Python to implement a simple four arithmetic interpreter. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete