search
HomeBackend DevelopmentPython Tutorialpython的即时标记项目练习笔记

这是《python基础教程》后面的实践,照着写写,一方面是来熟悉python的代码方式,另一方面是练习使用python中的基本的以及非基本的语法,做到熟能生巧。

这个项目一开始比较简单,不过重构之后就有些复杂了,但是更灵活了。

按照书上所说,重构之后的程序,分为四个模块:处理程序模块,过滤器模块,规则(其实应该是处理规则),语法分析器。

先来说处理程序模块,这个模块的作用有两个,一个是提供那些固定的html标记的输出(每一个标记都有start和end),另一个是对这个标记输出的开始和结束提供了一个友好的访问接口。来看下程序handlers.py:

代码如下:


class Handler:
    '''
    '''
    def callback(self, prefix, name, *args):
        method = getattr(self,prefix+name,None)
        if callable(method): return method(*args)
    def start(self, name):
        self.callback('start_', name)
    def end(self, name):
        self.callback('end_', name)
    def sub(self, name):
        def substitution(match):
            result = self.callback('sub_', name, match)
            if result is None: match.group(0)
            return result
        return substitution

class HTMLRenderer(Handler):
    '''

    '''
    def start_document(self):
        print '

...'
    def end_document(self):
        print ''
    def start_paragraph(self):
        print '

'
    def end_paragraph(self):
        print '

'
    def start_heading(self):
        print '

'
    def end_heading(self):
        print '

'
    def start_list(self):
        print '
    '
        def end_list(self):
            print '
'
    def start_listitem(self):
        print '
  • '
        def end_listitem(self):
            print '
  • '
        def start_title(self):
            print '

    '
        def end_title(self):
            print '

    '
        def sub_emphasis(self, match):
            return '%s' % match.group(1)
        def sub_url(self,  match):
            return '%s' % (match.group(1),match.group(1))
        def sub_mail(self,  match):
            return '%s' % (match.group(1),match.group(1))
        def feed(self, data):
            print data

    这个程序堪称是整个“项目”的基石所在:提供了标签的输出,以及字符串的替换。理解起来也比较简单。

    再来看第二个模块“过滤器”,这个模块更为简单,其实就是一个正则表达式的字符串。相关代码如下:

    代码如下:


    self.addFilter(r'\*(.+?)\*', 'emphasis')
    self.addFilter(r'(http://[\.a-z0-9A-Z/]+)', 'url')
    self.addFilter(r'([\.a-zA-Z]+@[\.a-zA-Z]+[a-zA-Z]+)','mail')

    这就是三个过滤器了,分别是:强调牌过滤器(用×号标出的),url牌过滤器,email牌过滤器。熟悉正则表达式的同学理解起来是没有压力的。

    再来看第三个模块“规则”,这个模块,抛开那祖父类不说,其他类应该有的两个方法是condition和action,前者是用来判断读进来的字符串是不是符合自家规则,后者是用来执行操作的,所谓的执行操作就是指调用“处理程序模块”,输出前标签、内容、后标签。 来看下这个模块的代码,其实这个里面几个类的关系,画到类图里面看会比较清晰。 rules.py:

    代码如下:


    class Rule:
        def action(self, block, handler):
            handler.start(self.type)
            handler.feed(block)
            handler.end(self.type)
            return True

    class HeadingRule(Rule):
        type = 'heading'
        def condition(self, block):
            return not '\n' in block and len(block)

    class TitleRule(HeadingRule):
        type = 'title'
        first = True

        def condition(self, block):
            if not self.first: return False
            self.first = False
            return HeadingRule.condition(self, block)

    class ListItemRule(Rule):
        type = 'listitem'
        def condition(self, block):
            return block[0] == '-'
        def action(self,block,handler):
            handler.start(self.type)
            handler.feed(block[1:].strip())
            handler.end(self.type)
            return True

    class ListRule(ListItemRule):
        type = 'list'
        inside = False
        def condition(self, block):
            return True
        def action(self,block, handler):
            if not self.inside and ListItemRule.condition(self,block):
                handler.start(self.type)
                self.inside = True
            elif self.inside and not ListItemRule.condition(self,block):
                handler.end(self.type)
                self.inside = False
            return False

    class ParagraphRule(Rule):
        type = 'paragraph'
        def condition(self, block):
            return True

    补充utils.py:

    代码如下:


    def line(file):
        for line in file:yield line
        yield '\n'

    def blocks(file):
        block = []
        for line in lines(file):
            if line.strip():
                block.append(line)
            elif block:
                yield ''.join(block).strip()
                block = []

    最后隆重的来看下“语法分析器模块”,这个模块的作用其实就是协调读入的文本和其他模块的关系。在往重点说就是,提供了两个存放“规则”和“过滤器”的列表,这么做的好处就是使得整个程序的灵活性得到了极大的提高,使得规则和过滤器变成的热插拔的方式,当然这个也归功于前面在写规则和过滤器时每一种类型的规则(过滤器)都单独的写成了一个类,而不是用if..else来区分。 看代码:

    代码如下:


    import sys, re
    from handlers import *
    from util import *
    from rules import *

    class Parser:
        def __init__(self,handler):
            self.handler = handler
            self.rules = []
            self.filters = []

        def addRule(self, rule):
            self.rules.append(rule)

        def addFilter(self,pattern,name):
            def filter(block, handler):
                return re.sub(pattern, handler.sub(name),block)
            self.filters.append(filter)

        def parse(self, file):
            self.handler.start('document')
            for block in blocks(file):
                for filter in self.filters:
                    block = filter(block, self.handler)
                for rule in self.rules:
                    if rule.condition(block):
                        last = rule.action(block, self.handler)
                        if last:break
            self.handler.end('document')

    class BasicTextParser(Parser):
        def __init__(self,handler):
            Parser.__init__(self,handler)
            self.addRule(ListRule())
            self.addRule(ListItemRule())
            self.addRule(TitleRule())
            self.addRule(HeadingRule())
            self.addRule(ParagraphRule())

            self.addFilter(r'\*(.+?)\*', 'emphasis')
            self.addFilter(r'(http://[\.a-z0-9A-Z/]+)', 'url')
            self.addFilter(r'([\.a-zA-Z]+@[\.a-zA-Z]+[a-zA-Z]+)','mail')

    handler = HTMLRenderer()
    parser = BasicTextParser(handler)

    parser.parse(sys.stdin)

    这个模块里面的处理思路是,遍历客户端(也就是程序执行的入口)给插进去的所有的规则和过滤器,来处理读进来的文本。

    有一个细节的地方也要说一下,其实是和前面写的呼应一下,就是在遍历规则的时候通过调用condition这个东西来判断是否符合当前规则。

    我觉得这个程序很像是命令行模式,有空可以复习一下该模式,以保持记忆网节点的牢固性。

    最后说一下我以为的这个程序的用途:

    1、用来做代码高亮分析,如果改写成js版的话,可以做一个在线代码编辑器。
    2、可以用来学习,供我写博文用。

    还有其他的思路,可以留下您的真知灼见。
    补充一个类图,很简陋,但是应该能说明之间的关系。另外我还是建议如果看代码捋不清关系最好自己画图,自己画图才能熟悉整个结构。

    Statement
    The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
    Python vs. C  : Learning Curves and Ease of UsePython vs. C : Learning Curves and Ease of UseApr 19, 2025 am 12:20 AM

    Python is easier to learn and use, while C is more powerful but complex. 1. Python syntax is concise and suitable for beginners. Dynamic typing and automatic memory management make it easy to use, but may cause runtime errors. 2.C provides low-level control and advanced features, suitable for high-performance applications, but has a high learning threshold and requires manual memory and type safety management.

    Python vs. C  : Memory Management and ControlPython vs. C : Memory Management and ControlApr 19, 2025 am 12:17 AM

    Python and C have significant differences in memory management and control. 1. Python uses automatic memory management, based on reference counting and garbage collection, simplifying the work of programmers. 2.C requires manual management of memory, providing more control but increasing complexity and error risk. Which language to choose should be based on project requirements and team technology stack.

    Python for Scientific Computing: A Detailed LookPython for Scientific Computing: A Detailed LookApr 19, 2025 am 12:15 AM

    Python's applications in scientific computing include data analysis, machine learning, numerical simulation and visualization. 1.Numpy provides efficient multi-dimensional arrays and mathematical functions. 2. SciPy extends Numpy functionality and provides optimization and linear algebra tools. 3. Pandas is used for data processing and analysis. 4.Matplotlib is used to generate various graphs and visual results.

    Python and C  : Finding the Right ToolPython and C : Finding the Right ToolApr 19, 2025 am 12:04 AM

    Whether to choose Python or C depends on project requirements: 1) Python is suitable for rapid development, data science, and scripting because of its concise syntax and rich libraries; 2) C is suitable for scenarios that require high performance and underlying control, such as system programming and game development, because of its compilation and manual memory management.

    Python for Data Science and Machine LearningPython for Data Science and Machine LearningApr 19, 2025 am 12:02 AM

    Python is widely used in data science and machine learning, mainly relying on its simplicity and a powerful library ecosystem. 1) Pandas is used for data processing and analysis, 2) Numpy provides efficient numerical calculations, and 3) Scikit-learn is used for machine learning model construction and optimization, these libraries make Python an ideal tool for data science and machine learning.

    Learning Python: Is 2 Hours of Daily Study Sufficient?Learning Python: Is 2 Hours of Daily Study Sufficient?Apr 18, 2025 am 12:22 AM

    Is it enough to learn Python for two hours a day? It depends on your goals and learning methods. 1) Develop a clear learning plan, 2) Select appropriate learning resources and methods, 3) Practice and review and consolidate hands-on practice and review and consolidate, and you can gradually master the basic knowledge and advanced functions of Python during this period.

    Python for Web Development: Key ApplicationsPython for Web Development: Key ApplicationsApr 18, 2025 am 12:20 AM

    Key applications of Python in web development include the use of Django and Flask frameworks, API development, data analysis and visualization, machine learning and AI, and performance optimization. 1. Django and Flask framework: Django is suitable for rapid development of complex applications, and Flask is suitable for small or highly customized projects. 2. API development: Use Flask or DjangoRESTFramework to build RESTfulAPI. 3. Data analysis and visualization: Use Python to process data and display it through the web interface. 4. Machine Learning and AI: Python is used to build intelligent web applications. 5. Performance optimization: optimized through asynchronous programming, caching and code

    Python vs. C  : Exploring Performance and EfficiencyPython vs. C : Exploring Performance and EfficiencyApr 18, 2025 am 12:20 AM

    Python is better than C in development efficiency, but C is higher in execution performance. 1. Python's concise syntax and rich libraries improve development efficiency. 2.C's compilation-type characteristics and hardware control improve execution performance. When making a choice, you need to weigh the development speed and execution efficiency based on project needs.

    See all articles

    Hot AI Tools

    Undresser.AI Undress

    Undresser.AI Undress

    AI-powered app for creating realistic nude photos

    AI Clothes Remover

    AI Clothes Remover

    Online AI tool for removing clothes from photos.

    Undress AI Tool

    Undress AI Tool

    Undress images for free

    Clothoff.io

    Clothoff.io

    AI clothes remover

    Video Face Swap

    Video Face Swap

    Swap faces in any video effortlessly with our completely free AI face swap tool!

    Hot Tools

    Atom editor mac version download

    Atom editor mac version download

    The most popular open source editor

    SublimeText3 Linux new version

    SublimeText3 Linux new version

    SublimeText3 Linux latest version

    SublimeText3 Mac version

    SublimeText3 Mac version

    God-level code editing software (SublimeText3)

    SublimeText3 English version

    SublimeText3 English version

    Recommended: Win version, supports code prompts!

    SAP NetWeaver Server Adapter for Eclipse

    SAP NetWeaver Server Adapter for Eclipse

    Integrate Eclipse with SAP NetWeaver application server.