search
HomeBackend DevelopmentPython TutorialPython script for reading different message patterns

用于读取不同消息模式的 Python 脚本

Question content

I'm trying to make a flexible python script that reads and extracts some weather variables from a synop code.

This is the code:

import re

def extract_data_12_utc(message):
    # pattern message

    pattern = r'(\d{5}),(\d{4}),(\d{2}),(\d{2}),(\d{2}),(\d{2}),aaxx (\d{5}) (\d{5}) (\d{5}) (\d{5}) (1\d{4}) (2\d{4}) (3\d{4})? (4\d{4}) (6\d{4})? (7\d{4})? (8\d{4})? (\{3}) (2\d{4}) (5\d{4}) (7\d{4})'


    matches = re.search(pattern, message)

    # check if the match is successsful
    if matches:
        
        station = matches.group(1)
        year = matches.group(2)
        month = matches.group(3)
        day = matches.group(4)
        hour = matches.group(5)
        min = matches.group(6)

        # extracting variables
        temp_air = float(matches.group(11)[2:]) / 10.0
        temp_dew = float(matches.group(12)[2:]) / 10.0
        pres_station = float(matches.group(13)[1:]) / 10.0 + 1000  
        pres_sealv = float(matches.group(14)[1:]) / 10.0 + 1000
        prec_6h = float(matches.group(15)[2:4]) if matches.group(15) else none
        wx = str(matches.group(16)[1:]) if matches.group(16) else none
        cld = str(matches.group(17)[1:]) if matches.group(17) else none
        temp_min = float(matches.group(19)[2:]) / 10.0 if matches.group(19) else none
        pres_chg = float(matches.group(20)[2:]) / 10.0 if matches.group(20) else none
        prec_24h = float(matches.group(21)[1:]) / 10.0 if matches.group(21) else none

        # formatting results
        formatted_data = [
            station, year, month, day, hour, min,
            f"{int(temp_air):02d}.{int((temp_air % 1) * 10):01d}",
            f"{int(temp_dew):02d}.{int((temp_dew % 1) * 10):01d}",
            f"{int(pres_station):04d}.{int((pres_station % 1) * 10):01d}",
            f"{int(pres_sealv):04d}.{int((pres_sealv % 1) * 10):01d}",
            f"{int(prec_6h):1d}"  if prec_6h is not none else "none",
            f"{int(wx):1d}"  if wx is not none else "none",
            f"{int(cld):1d}"  if cld is not none else "none",
            f"{int(temp_min):02d}.{int((temp_min % 1) * 10):01d}",
            f"{int(pres_chg):1d}"  if pres_chg is not none else "none",
            f"{prec_24h:.1f}" if prec_24h is not none else "none"
        ]

        # returns formatted data
        return formatted_data
    else:
        # returns list if fails
        return ["none"] * 16

# reading file
file_name = r"synop.txt"
with open(file_name, 'r') as file:
    lines = file.readlines()

# list to store results
data_12_utc = []

# from 17th line
for line in lines:
    data = extract_data_12_utc(line)
    data_12_utc.append(data)

# show formatted data
for data in data_12_utc:
    print(data)

The input data is:

82145,2024,01,24,12,00,aaxx 24124 82145 32598 30502 10292 20250 30082 40124 83200 333 20231 58004=
82181,2024,01,24,12,00,aaxx 24124 82181 21498 73603 10257 20242 30008 40149 70262 84520 333 20246 59014 60084=
82184,2024,01,24,12,00,aaxx 24124 82184 21498 60502 10272 20252 30116 40124 70362 85520 333 20243 59014 69944=
82188,2024,01,24,12,00,aaxx 24124 82188 11560 53602 10264 20248 30128 40146 60214 72162 83260 333 58002 70210==
82191,2024,01,24,12,00,aaxx 24124 82191 12570 60501 10290 20262 30108 40114 60184 84250 333 20238 59014 70180==
82193,2024,01,24,12,00,aaxx 24124 82193 22470 30409 10289 20254 30106 40124 83100 333 20254 59016 60054=
82244,2024,01,24,12,00,aaxx 24124 82244 11470 70503 10269 20248 30061 40130 60024 70296 84220 333 20256 59002 70020==
82246,2024,01,24,12,00,aaxx 24124 82246 21596 83202 10252 20242 3//// 4//// 7036/ 887// 333 2//// 5//// 60254=
82263,2024,01,24,12,00,aaxx 24124 82263 11470 8//// 30118 69934 70352 887// 333 59013 70003==
82353,2024,01,24,12,00,aaxx 24124 82353 22497 63602 10264 20246 30002 40086 86400 333 20215 59014 60024=
82361,2024,01,24,12,00,aaxx 24124 82361 21497 63602 10276 20258 30088 40125 70265 86700 333 20269 59018 60024=
82444,2024,01,24,12,00,aaxx 24124 82444 12470 72703 10269 20252 30091 60624 85000 333 20270 58000 70620==
82445,2024,01,24,12,00,aaxx 24124 82445 22497 83202 10266 20254 30102 40154 8472/ 333 20243 58000 60314=
82562,2024,01,24,12,00,aaxx 24124 82562 32597 836// 1//// 2//// 3//// 4//// 8869/ 333 2//// 5////=
82861,2024,01,24,12,00,aaxx 24124 82861 21596 73202 1//// 2//// 39917 4//// 70360 8572/ 333 2//// 59027 60054=

However, it returns the following:

['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']

When I limit certain variables (i.e. until group 15) it returns:

['82145', '2024', '01', '24', '12', '00', '29.1', '25.0', '1008.2', '1012.3', 'None']
['82181', '2024', '01', '24', '12', '00', '25.6', '24.1', '1000.7', '1014.8', 'None']
['82184', '2024', '01', '24', '12', '00', '27.1', '25.1', '1011.6', '1012.3', 'None']
['82188', '2024', '01', '24', '12', '00', '26.3', '24.8', '1012.7', '1014.6', '21']
['82191', '2024', '01', '24', '12', '00', '29.0', '26.1', '1010.7', '1011.3', '18']
['82193', '2024', '01', '24', '12', '00', '28.8', '25.3', '1010.6', '1012.3', 'None']
['82244', '2024', '01', '24', '12', '00', '26.8', '24.8', '1006.1', '1013.0', '2']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['82353', '2024', '01', '24', '12', '00', '26.3', '24.6', '1000.2', '1008.6', 'None']
['82361', '2024', '01', '24', '12', '00', '27.6', '25.8', '1008.7', '1012.5', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['82445', '2024', '01', '24', '12', '00', '26.6', '25.3', '1010.2', '1015.3', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']

How do I have a script that contains all types of schema messages?


Correct Answer


Even if only one variable is malformed, there may be reasons to reject the entire line (or replace it with a None string).

However, if you want to extract every correctly formatted variable, even if some variables in the line are malformed, you should split the line using re.split(', ', line) for a list of variables and convert/check each variable individually. Unfortunately, re matches the entire expression instead of each group

If you must use a flexible regular expression, you should consider potential formatting errors like (?:(4\d{4})|\d*[/] ) group. Unfortunately, it increases the number of groups, so I use the non-capturing group operator :? to keep the group numbers the same. If you find it too unwieldy, another option is to use the more universal group expression (4[/\d]{4}), which allows missing values, but you will test the presence later Missing numeric symbol "/" or just catching an exception during conversion.

The above is the detailed content of Python script for reading different message patterns. For more information, please follow other related articles on the PHP Chinese website!

Statement
This article is reproduced at:stackoverflow. If there is any infringement, please contact admin@php.cn delete
使用pyjokes创建随机笑话的Python脚本使用pyjokes创建随机笑话的Python脚本Sep 13, 2023 pm 08:25 PM

您想为您的Python脚本或应用程序添加一些幽默吗?无论您是构建聊天机器人、开发命令行工具,还是只是想用随机笑话自娱自乐,pyjokes库都可以为您提供帮助。借助pyjokes,您可以轻松生成各种类别的笑话,并根据您的喜好进行自定义。在这篇博文中,我们将探讨如何使用pyjokes库在Python中创建随机笑话。我们将介绍安装过程、生成不同类别的笑话、自定义笑话、在控制台应用程序或网页中显示它们,以及处理可能发生的任何潜在错误。安装pyjokes在我们开始使用pyjokes创建随机笑话之前,我们需

用于监控网站变化的Python脚本用于监控网站变化的Python脚本Aug 29, 2023 pm 12:25 PM

在当今的数字时代,了解网站上的最新变化对于各种目的都至关重要,例如跟踪竞争对手网站上的更新、监控产品可用性或随时了解重要信息。手动检查网站是否有更改可能既耗时又低效。这就是自动化发挥作用的地方。在这篇博文中,我们将探讨如何创建Python脚本来监控网站更改。通过利用Python的强大功能和一些方便的库,我们可以自动化检索网站内容、与以前的版本进行比较并通知我们任何更改的过程。这使我们能够保持主动并及时对我们监控的网站上的更新或修改做出反应。设置环境在开始编写脚本来监控网站更改之前,我们需要设置P

PyCharm高级教程:利用PyInstaller将代码打包为EXE格式PyCharm高级教程:利用PyInstaller将代码打包为EXE格式Feb 20, 2024 am 09:34 AM

PyCharm是一款功能强大的Python集成开发环境,提供了丰富的功能和工具来帮助开发者提高效率。其中,PyInstaller是一个常用的工具,可以将Python代码打包为可执行文件(EXE格式),方便在没有Python环境的机器上运行。在本篇文章中,我们将介绍如何在PyCharm中使用PyInstaller将Python代码打包为EXE格式,并提供具体的

Python脚本自动刷新Excel电子表格Python脚本自动刷新Excel电子表格Sep 09, 2023 pm 06:21 PM

Python和Excel是两个强大的工具,结合起来可以开启自动化世界。Python具有多功能的库和用户友好的语法,使我们能够编写脚本来有效地执行各种任务。另一方面,Excel是一种广泛使用的电子表格程序,它为数据分析和操作提供了熟悉的界面。在本教程中,我们将探索如何利用Python来自动化刷新Excel电子表格的过程,从而节省我们的时间和精力。您是否发现自己花费了宝贵的时间使用更新的数据手动刷新Excel电子表格?这是一项重复且耗时的任务,可能会真正降低生产力。在本文中,我们将指导您完成使用Py

Flask安装配置教程:轻松搭建PythonWeb应用的利器Flask安装配置教程:轻松搭建PythonWeb应用的利器Feb 20, 2024 pm 11:12 PM

Flask安装配置教程:轻松搭建PythonWeb应用的利器,需要具体代码示例引言:随着Python的日益流行,Web开发也成为了Python程序员的必备技能之一。而要进行Python的Web开发,我们需要选择合适的Web框架。在众多的PythonWeb框架中,Flask是一款简洁、易上手且灵活的框架,备受开发者们的青睐。本文将介绍Flask框架的安装、

如何在Linux系统中运行Python脚本如何在Linux系统中运行Python脚本Oct 05, 2023 am 08:05 AM

如何在Linux系统中运行Python脚本作为一种强大的脚本语言,Python在Linux系统中广泛应用。在本文中,我将为你介绍如何在Linux系统中运行Python脚本,并提供具体的代码示例。安装Python首先,确保你的Linux系统上已经安装了Python。在终端中输入以下命令来检查系统是否已安装Python:python--version如果显示了

完全指南:确保准确查看Django版本完全指南:确保准确查看Django版本Feb 19, 2024 pm 06:33 PM

专业指南:如何准确查看Django版本,需要具体代码示例引言:Django是一个高度受欢迎的PythonWeb框架,其不断更新的版本对于开发者来说非常重要。查看Django版本对于确保使用最新功能和修复了的漏洞至关重要。本文将介绍如何准确查看Django版本,并提供具体的代码示例。一、使用命令行查看Django版本使用命令行是最简单快捷的方式来查看Djan

如何利用Python脚本在Linux系统中实现并行计算如何利用Python脚本在Linux系统中实现并行计算Oct 05, 2023 am 09:09 AM

如何利用Python脚本在Linux系统中实现并行计算,需要具体代码示例在现代计算机领域,对于大规模数据处理和复杂计算任务,使用并行计算可以显著提高计算效率。Linux作为一个强大的操作系统,提供了丰富的工具和功能,可以方便地实现并行计算。而Python作为一种简单易用且功能强大的编程语言,也有许多库和模块可以用于编写并行计算任务。本文将介绍如何利用Pyth

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Tools

mPDF

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!