
In Python, how to split a string containing multiple delimiters?

WBOY (reposted), 2023-05-09 19:25:06

To split a string using multiple delimiters:

Use the re.split() method, for example re.split(r',|-', my_str).

The re.split() method splits the string at every occurrence of any of the delimiters.

import re
# Split a string using 2 delimiters
my_str = 'fql,jiyik-dot,com'
my_list = re.split(r',|-', my_str)  # split on a comma or a hyphen
print(my_list)  # ['fql', 'jiyik', 'dot', 'com']

The re.split() method accepts a pattern and a string and splits the string each time the pattern occurs.

The pipe | character means "or": the pattern A|B matches either A or B.

The first example splits the string using 2 delimiters (comma and hyphen).

# Split a string using 3 delimiters
my_str = 'fql,jiyik-dot:com'
my_list = re.split(r',|-|:', my_str)  # comma, hyphen or colon
print(my_list)  # ['fql', 'jiyik', 'dot', 'com']

The second example splits the string using 3 delimiters (comma, hyphen and colon).

We can use as many | characters as necessary in the regular expression.
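One caveat when building such alternations: characters like ., | and + have a special meaning in regular expressions, so a delimiter of that kind has to be escaped. A minimal sketch, using a made-up sample string that is not part of the examples above:

```python
import re

# Hypothetical sample string; '.', '|' and '+' are regex metacharacters.
my_str = 'fql.jiyik|dot+com'

# Escape each metacharacter with a backslash so it is matched literally.
my_list = re.split(r'\.|\||\+', my_str)
print(my_list)  # ['fql', 'jiyik', 'dot', 'com']

# re.escape() produces the escaped form automatically.
pattern = '|'.join(re.escape(d) for d in ['.', '|', '+'])
print(re.split(pattern, my_str))  # ['fql', 'jiyik', 'dot', 'com']
```

Without the escaping, an unescaped . would match any character and the split would return mostly empty strings.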

Use square brackets [] to split a string based on multiple delimiters

Alternatively, we can use square brackets [] to indicate a group of characters.

import re
my_str = 'fql,jiyik-dot,com'
my_list = re.split(r'[,-]', my_str)
print(my_list)  # ['fql', 'jiyik', 'dot', 'com']


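Make sure to add all delimiters between the square brackets, and place the hyphen last (or escape it as \-) so that it is treated literally rather than as a range.

import re
# Split a string using 3 delimiters
my_str = 'fql,jiyik-dot:com'
my_list = re.split(r'[,:-]', my_str)  # split on a comma, colon or hyphen
print(my_list)  # ['fql', 'jiyik', 'dot', 'com']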
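To see why the hyphen's position matters, here is a small sketch with a made-up sample string that contains digits and a dot. Between two characters, a hyphen inside brackets spans every character in that range:

```python
import re

# Hypothetical sample string containing digits and a dot.
my_str = 'v1.2,beta-3:final'

# ',-:' inside brackets is a range from ',' to ':' in ASCII order,
# which also covers '.', '/' and the digits 0-9.
as_range = re.split(r'[,-:]', my_str)

# With the hyphen last, only the three intended delimiters match.
as_literals = re.split(r'[,:-]', my_str)

print(as_range)     # 'v1.2' is broken apart because digits and '.' also match
print(as_literals)  # ['v1.2', 'beta', '3', 'final']
```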

If the string starts or ends with one of these delimiters, we may get empty strings in the output list.

Handling leading or trailing delimiters

We can use a list comprehension to remove any empty strings from the list.

import re

# Split a string using 3 delimiters
my_str = ',fql,jiyik-dot:com:'

my_list = [
    item for item in re.split(r'[,:-]', my_str)
    if item  # keep only non-empty strings
]

print(my_list)  # ['fql', 'jiyik', 'dot', 'com']

The list comprehension is responsible for removing the empty strings from the list.

List comprehensions are used to perform an operation on each element or to select a subset of elements that satisfy a condition.
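To make the effect of the filtering visible, the following sketch prints the unfiltered result first. The sample string is a variation of the one above with a doubled comma added, and filter(None, ...) is shown as an equivalent alternative to the comprehension:

```python
import re

my_str = ',fql,,jiyik-dot:com:'

# Without filtering, leading, trailing and doubled delimiters
# each produce an empty string.
raw = re.split(r'[,:-]', my_str)
print(raw)  # ['', 'fql', '', 'jiyik', 'dot', 'com', '']

# filter(None, ...) drops falsy items such as empty strings.
cleaned = list(filter(None, raw))
print(cleaned)  # ['fql', 'jiyik', 'dot', 'com']
```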

Another way is to use the str.replace() method.

Use str.replace() to split a string with multiple delimiters

To split a string with multiple delimiters:

  • Use the str.replace() method to replace the first delimiter with the second delimiter.

  • Split the string by the second delimiter using the str.split() method.

my_str = 'fql_jiyik!dot_com'
my_list = my_str.replace('_', '!').split('!')
print(my_list)  # ['fql', 'jiyik', 'dot', 'com']

This approach is only convenient when there are few delimiters to split on, such as 2.

First, we replace every occurrence of the first delimiter with the second delimiter, and then we split on the second delimiter.

The str.replace() method returns a copy of the string in which all occurrences of a substring are replaced by the provided replacement.

This method takes the following parameters:

  • old: the substring we want to replace in the string

  • new: the replacement for each occurrence of old

  • count: only the first count occurrences are replaced (optional)

Please note that this method does not change the original string. Strings are immutable in Python.
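A short sketch of the count parameter and of the fact that the original string is left untouched (the sample string is made up for illustration):

```python
my_str = 'fql_jiyik_dot_com'

# Passing 1 as count replaces only the first occurrence.
first_only = my_str.replace('_', '!', 1)
print(first_only)  # fql!jiyik_dot_com

# str.replace() returned a new string; the original is unchanged.
print(my_str)  # fql_jiyik_dot_com
```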

Here is another example.

my_str = 'fql jiyik, dot # com. abc'

my_list = my_str.replace(
    ',', '').replace(
    '#', '').replace('.', '').split()

print(my_list)  # ['fql', 'jiyik', 'dot', 'com', 'abc']

We use the str.replace() method to remove punctuation before splitting the string on whitespace characters.

We use an empty string for replacement because we want to delete the specified characters.

We can chain as many calls to the str.replace() method as needed.

The final step is to split the string into a list of words using the str.split() method.
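If the chain of str.replace() calls grows long, the same idea can be written as a loop. The helper below is a hypothetical sketch (the name split_by_replace is not from the original article): it rewrites every delimiter to the first one and then splits once.

```python
def split_by_replace(text, delimiters):
    # Hypothetical helper: normalize all delimiters to the first one,
    # then split a single time on that delimiter.
    if not delimiters:
        return [text]
    target = delimiters[0]
    for delimiter in delimiters[1:]:
        text = text.replace(delimiter, target)
    return text.split(target)


print(split_by_replace('fql_jiyik!dot_com', ['!', '_']))
# ['fql', 'jiyik', 'dot', 'com']
```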

The str.split() method splits a string into a list of substrings using a delimiter.

This method takes the following 2 parameters:

  • separator: the delimiter at which the string is split into substrings

  • maxsplit: perform at most maxsplit splits (optional)

When no separator is passed to the str.split() method, it splits the input string on runs of one or more whitespace characters.

my_str = 'fql jiyik com'
print(my_str.split())  # ['fql', 'jiyik', 'com']

If the delimiter is not found in the string, then a list containing only 1 element is returned.
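The behaviours described above can be checked with a few calls. This is a small sketch with made-up sample strings:

```python
my_str = 'fql,jiyik,dot,com'

# A maxsplit of 1 performs at most one split and leaves the rest intact.
print(my_str.split(',', 1))  # ['fql', 'jiyik,dot,com']

# A separator that does not occur yields a one-element list.
print(my_str.split('!'))  # ['fql,jiyik,dot,com']

# With no separator, a run of whitespace counts as one delimiter
# and leading or trailing whitespace is ignored.
print('  fql   jiyik\tcom '.split())  # ['fql', 'jiyik', 'com']
```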

Use a reusable function to split a string based on multiple delimiters

If we need to split strings based on multiple delimiters frequently, we can define a reusable function.

import re

def split_multiple(string, delimiters):
    pattern = '|'.join(map(re.escape, delimiters))

    return re.split(pattern, string)

my_str = 'fql,jiyik-dot:com'

print(split_multiple(my_str, [',', '-', ':']))

The split_multiple function accepts a string and a list of delimiters and splits the string on each of them.

The str.join() method joins the delimiters with the pipe | separator, and re.escape() ensures that any special regex characters in a delimiter are matched literally.

print('|'.join([',', '-', ':']))  # ,|-|:

This will create a regular expression pattern that we can use to split a string based on the specified delimiter.
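One thing to watch for with multi-character delimiters: the alternatives in a regular expression are tried from left to right, so a short delimiter listed before a longer one that starts with it wins. The variant below is a hypothetical sketch (the name split_multiple_longest_first and the sample string are assumptions) that sorts the delimiters by length first:

```python
import re


def split_multiple_longest_first(string, delimiters):
    # Hypothetical variant: try longer delimiters before their prefixes.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = '|'.join(map(re.escape, ordered))
    return re.split(pattern, string)


my_str = 'fql->jiyik-dot'

# With '-' listed first, it matches inside '->' and leaves a stray '>'.
naive = re.split('|'.join(map(re.escape, ['-', '->'])), my_str)
print(naive)  # ['fql', '>jiyik', 'dot']

print(split_multiple_longest_first(my_str, ['-', '->']))
# ['fql', 'jiyik', 'dot']
```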

If we need to split a string into a word list with multiple delimiters, we can also use the re.findall() method.

Use re.findall() to split a string into a list of words

Use the re.findall() method to split a string with multiple delimiters into a list of words.

Rather than splitting on delimiters, re.findall() matches every word in the string and returns the matches as a list.

import re

# ✅ Split a string with multiple delimiters into a list of words
my_str = 'fql jiyik, dot # com. abc'

my_list = re.findall(r'[\w]+', my_str)
print(my_list)  # ['fql', 'jiyik', 'dot', 'com', 'abc']

The re.findall() method takes a pattern and a string as arguments and returns a list of strings containing all non-overlapping matches of the pattern in the string.

The first argument we passed to the re.findall() method is a regular expression.

import re

my_str = 'fql jiyik, dot # com. abc'

my_list = re.findall(r'[\w]+', my_str)
print(my_list)  # ['fql', 'jiyik', 'dot', 'com', 'abc']

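Square brackets [] are used to indicate a set of characters.

The \w character matches Unicode word characters, which include most characters that can be part of a word in any language, as well as digits and the underscore.

The plus + causes the regular expression to match 1 or more repetitions of the preceding character (here, a Unicode word character).

The re.findall() method returns a list containing the words in the string.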
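For comparison, here is a sketch of how matching words with re.findall() differs from splitting on non-word characters with re.split(). The sample strings are made up, and \w+ is used as a shorter equivalent of [\w]+:

```python
import re

my_str = 'fql jiyik, dot # com. abc.'

# Splitting on runs of non-word characters leaves an empty string
# after the trailing '.'.
split_result = re.split(r'\W+', my_str)
print(split_result)  # ['fql', 'jiyik', 'dot', 'com', 'abc', '']

# Matching the words directly avoids that edge case.
found = re.findall(r'\w+', my_str)
print(found)  # ['fql', 'jiyik', 'dot', 'com', 'abc']

# \w also matches digits and the underscore, but not the hyphen.
print(re.findall(r'\w+', 'user_1 logged-in'))  # ['user_1', 'logged', 'in']
```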

