I'm trying to write a flexible Python script that reads SYNOP messages and extracts some weather variables from them.
This is the code:
import re

def extract_data_12_utc(message):
    # pattern message
    pattern = r'(\d{5}),(\d{4}),(\d{2}),(\d{2}),(\d{2}),(\d{2}),aaxx (\d{5}) (\d{5}) (\d{5}) (\d{5}) (1\d{4}) (2\d{4}) (3\d{4})? (4\d{4}) (6\d{4})? (7\d{4})? (8\d{4})? (\{3}) (2\d{4}) (5\d{4}) (7\d{4})'
    matches = re.search(pattern, message)

    # check if the match is successful
    if matches:
        station = matches.group(1)
        year = matches.group(2)
        month = matches.group(3)
        day = matches.group(4)
        hour = matches.group(5)
        min = matches.group(6)

        # extracting variables
        temp_air = float(matches.group(11)[2:]) / 10.0
        temp_dew = float(matches.group(12)[2:]) / 10.0
        pres_station = float(matches.group(13)[1:]) / 10.0 + 1000
        pres_sealv = float(matches.group(14)[1:]) / 10.0 + 1000
        prec_6h = float(matches.group(15)[2:4]) if matches.group(15) else None
        wx = str(matches.group(16)[1:]) if matches.group(16) else None
        cld = str(matches.group(17)[1:]) if matches.group(17) else None
        temp_min = float(matches.group(19)[2:]) / 10.0 if matches.group(19) else None
        pres_chg = float(matches.group(20)[2:]) / 10.0 if matches.group(20) else None
        prec_24h = float(matches.group(21)[1:]) / 10.0 if matches.group(21) else None

        # formatting results
        formatted_data = [
            station, year, month, day, hour, min,
            f"{int(temp_air):02d}.{int((temp_air % 1) * 10):01d}",
            f"{int(temp_dew):02d}.{int((temp_dew % 1) * 10):01d}",
            f"{int(pres_station):04d}.{int((pres_station % 1) * 10):01d}",
            f"{int(pres_sealv):04d}.{int((pres_sealv % 1) * 10):01d}",
            f"{int(prec_6h):1d}" if prec_6h is not None else "none",
            f"{int(wx):1d}" if wx is not None else "none",
            f"{int(cld):1d}" if cld is not None else "none",
            f"{int(temp_min):02d}.{int((temp_min % 1) * 10):01d}",
            f"{int(pres_chg):1d}" if pres_chg is not None else "none",
            f"{prec_24h:.1f}" if prec_24h is not None else "none"
        ]

        # returns formatted data
        return formatted_data
    else:
        # returns a list of placeholders if the match fails
        return ["none"] * 16

# reading file
file_name = r"synop.txt"
with open(file_name, 'r') as file:
    lines = file.readlines()

# list to store results
data_12_utc = []

# from 17th line
for line in lines:
    data = extract_data_12_utc(line)
    data_12_utc.append(data)

# show formatted data
for data in data_12_utc:
    print(data)
The input data is:
82145,2024,01,24,12,00,aaxx 24124 82145 32598 30502 10292 20250 30082 40124 83200 333 20231 58004=
82181,2024,01,24,12,00,aaxx 24124 82181 21498 73603 10257 20242 30008 40149 70262 84520 333 20246 59014 60084=
82184,2024,01,24,12,00,aaxx 24124 82184 21498 60502 10272 20252 30116 40124 70362 85520 333 20243 59014 69944=
82188,2024,01,24,12,00,aaxx 24124 82188 11560 53602 10264 20248 30128 40146 60214 72162 83260 333 58002 70210==
82191,2024,01,24,12,00,aaxx 24124 82191 12570 60501 10290 20262 30108 40114 60184 84250 333 20238 59014 70180==
82193,2024,01,24,12,00,aaxx 24124 82193 22470 30409 10289 20254 30106 40124 83100 333 20254 59016 60054=
82244,2024,01,24,12,00,aaxx 24124 82244 11470 70503 10269 20248 30061 40130 60024 70296 84220 333 20256 59002 70020==
82246,2024,01,24,12,00,aaxx 24124 82246 21596 83202 10252 20242 3//// 4//// 7036/ 887// 333 2//// 5//// 60254=
82263,2024,01,24,12,00,aaxx 24124 82263 11470 8//// 30118 69934 70352 887// 333 59013 70003==
82353,2024,01,24,12,00,aaxx 24124 82353 22497 63602 10264 20246 30002 40086 86400 333 20215 59014 60024=
82361,2024,01,24,12,00,aaxx 24124 82361 21497 63602 10276 20258 30088 40125 70265 86700 333 20269 59018 60024=
82444,2024,01,24,12,00,aaxx 24124 82444 12470 72703 10269 20252 30091 60624 85000 333 20270 58000 70620==
82445,2024,01,24,12,00,aaxx 24124 82445 22497 83202 10266 20254 30102 40154 8472/ 333 20243 58000 60314=
82562,2024,01,24,12,00,aaxx 24124 82562 32597 836// 1//// 2//// 3//// 4//// 8869/ 333 2//// 5////=
82861,2024,01,24,12,00,aaxx 24124 82861 21596 73202 1//// 2//// 39917 4//// 70360 8572/ 333 2//// 59027 60054=
However, it returns the following:
['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 
'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none'] ['none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none', 'none']
When I limit the pattern to fewer variables (i.e. up to group 15), it returns:
['82145', '2024', '01', '24', '12', '00', '29.1', '25.0', '1008.2', '1012.3', 'None']
['82181', '2024', '01', '24', '12', '00', '25.6', '24.1', '1000.7', '1014.8', 'None']
['82184', '2024', '01', '24', '12', '00', '27.1', '25.1', '1011.6', '1012.3', 'None']
['82188', '2024', '01', '24', '12', '00', '26.3', '24.8', '1012.7', '1014.6', '21']
['82191', '2024', '01', '24', '12', '00', '29.0', '26.1', '1010.7', '1011.3', '18']
['82193', '2024', '01', '24', '12', '00', '28.8', '25.3', '1010.6', '1012.3', 'None']
['82244', '2024', '01', '24', '12', '00', '26.8', '24.8', '1006.1', '1013.0', '2']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['82353', '2024', '01', '24', '12', '00', '26.3', '24.6', '1000.2', '1008.6', 'None']
['82361', '2024', '01', '24', '12', '00', '27.6', '25.8', '1008.7', '1012.5', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['82445', '2024', '01', '24', '12', '00', '26.6', '25.3', '1010.2', '1015.3', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
['None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'None']
How can I write a script that handles all of these message layouts?
Correct Answer
There may be good reasons to reject the entire line (or replace it with a "None" string) even if only one variable is malformed.
However, if you want to extract every correctly formatted variable even when other variables in the line are malformed, split the line with re.split(', ', line) to get a list of variables, then convert and check each variable individually. Unfortunately, re matches the expression as a whole rather than each group separately, so a single failing group makes the entire match fail.
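The split-and-check idea can be sketched roughly as follows. This is a minimal sketch under several assumptions that are not in the original answer: the helper names (to_number, index_groups, extract_fields) are hypothetical, the separator pattern [,\s]+ is chosen to fit the sample data rather than taken from the answer, groups are identified by their leading digit, and the slicing mirrors the question's code (so the temperature sign digit is ignored). Only a few variables are shown.

```python
import re

def to_number(token, start, scale=10.0):
    """Convert token[start:] to a scaled float, or None if missing/malformed."""
    if token is None:
        return None
    digits = token[start:]
    return int(digits) / scale if digits.isdigit() else None

def index_groups(tokens):
    """Map each group's leading digit to the group (first occurrence wins)."""
    groups = {}
    for tok in tokens:
        groups.setdefault(tok[0], tok)
    return groups

def extract_fields(line):
    """Hypothetical helper: parse one line, checking every group on its own."""
    # Split on commas and whitespace; drop the trailing '=' terminator(s).
    tokens = [t for t in re.split(r"[,\s]+", line.strip().rstrip("=")) if t]
    if len(tokens) < 11 or tokens[6].lower() != "aaxx":
        return None  # not a message in the expected layout
    rest = tokens[11:]  # skip 6 header fields, 'aaxx' and the 4 fixed groups
    cut = rest.index("333") if "333" in rest else len(rest)
    sec1 = index_groups(rest[:cut])      # groups before the 333 marker
    sec3 = index_groups(rest[cut + 1:])  # groups after the 333 marker
    return {
        "station": tokens[0],
        "temp_air": to_number(sec1.get("1"), 2),
        "temp_dew": to_number(sec1.get("2"), 2),
        "temp_min": to_number(sec3.get("2"), 2),
        "prec_24h": to_number(sec3.get("7"), 1),
    }

line = ("82263,2024,01,24,12,00,aaxx 24124 82263 11470 8//// "
        "30118 69934 70352 887// 333 59013 70003==")
print(extract_fields(line))
```

Because each group is converted separately, a missing or slashed-out group only turns that one value into None instead of discarding the whole line.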
If you must use a single flexible regular expression, allow for potential formatting errors inside each group, for example (?:(4\d{4})|\d*[/] ). Unfortunately, the alternation adds a group, so I use the non-capturing group operator ?: to keep the group numbers the same. If you find that too unwieldy, another option is the more permissive group expression (4[/\d]{4}), which also accepts missing values; you then test for the missing-value symbol "/" later, or simply catch the exception raised during conversion.
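The permissive-group variant might look something like the sketch below. It is an illustration under assumptions, not the answer's exact pattern: the names SECTION1, to_tenths and read_temperatures are hypothetical, only the first four variable groups are covered, and the separating space is moved inside each optional non-capturing group so that an absent group does not leave a dangling space to match.

```python
import re

# Each optional group carries its own leading space and accepts '/' as well
# as digits, so both absent groups and slashed-out groups still match.
SECTION1 = re.compile(
    r"aaxx \S{5} \S{5} \S{5} \S{5}"
    r"(?: (1[/\d]{4}))?"   # air temperature
    r"(?: (2[/\d]{4}))?"   # dew point
    r"(?: (3[/\d]{4}))?"   # station pressure
    r"(?: (4[/\d]{4}))?"   # sea-level pressure
)

def to_tenths(group):
    """Convert the last three characters of a group to a float in tenths."""
    try:
        return int(group[2:]) / 10.0
    except (TypeError, ValueError):
        # TypeError: the group was absent (None); ValueError: it contained '/'.
        return None

def read_temperatures(line):
    """Hypothetical helper: return (temp_air, temp_dew) or None if no match."""
    m = SECTION1.search(line)
    if m is None:
        return None
    return to_tenths(m.group(1)), to_tenths(m.group(2))

print(read_temperatures(
    "82562,2024,01,24,12,00,aaxx 24124 82562 32597 836// "
    "1//// 2//// 3//// 4//// 8869/ 333 2//// 5////="
))
```

Here the regular expression decides only whether a group is present, while the conversion step decides whether its content is usable, which is the division of labour the answer suggests.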