How to Replace Non-ASCII Characters with a Single Space in Python?

Susan Sarandon
2024-11-01 14:11:02

Replacing Non-ASCII Characters with a Single Space

In Python, replacing non-ASCII characters with a space takes a little more care than simply removing them. Many solutions exist for removing non-ASCII characters, but replacing them with exactly one space is a less common requirement.

The question refers to two functions that are not reproduced here. remove_non_ascii_1 removes all non-ASCII characters. remove_non_ascii_2 replaces non-ASCII characters with spaces, but it inserts more than one space per character: the number of spaces tracks the size of the character's encoded form instead of being fixed at one.
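Since remove_non_ascii_2 is not shown, the following is only a guess at how such multi-space output can arise. It assumes the substitution runs over UTF-8 encoded bytes instead of over characters, so each byte of a multi-byte character gets its own space. The helper name spaces_per_byte is hypothetical and is not taken from the article.

```python
import re

def spaces_per_byte(text):
    # Hypothetical helper, not the article's remove_non_ascii_2.
    # Encode to UTF-8, then replace every non-ASCII byte with a space.
    data = text.encode('utf-8')
    return re.sub(rb'[^\x00-\x7F]', b' ', data).decode('ascii')

# The en dash (U+2013) occupies three bytes in UTF-8, so it becomes three spaces.
print(repr(spaces_per_byte('a\u2013b')))  # 'a   b'
```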

Now, let's address the central question:

How can we replace all non-ASCII characters with a single space?

Solution 1:

<code class="python">def replace_with_space(text):
    # Keep ASCII characters (code point < 128); replace each other character with a space.
    return ''.join([ch if ord(ch) < 128 else ' ' for ch in text])</code>

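This approach uses a conditional expression inside a list comprehension and passes the result to ''.join(). Characters whose code point is below 128 are kept unchanged, and every other character is replaced with a space. Because the replacement happens one character at a time, a run of consecutive non-ASCII characters produces a run of spaces, not a single space.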
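A short check of the per-character behaviour may help here. The sketch below repeats the logic of Solution 1 under a different, hypothetical name (replace_each_with_space) and uses made-up sample strings.

```python
def replace_each_with_space(text):
    # Same logic as Solution 1: each non-ASCII character becomes one space.
    return ''.join(ch if ord(ch) < 128 else ' ' for ch in text)

# A single non-ASCII character (e with acute accent) turns into one space.
print(repr(replace_each_with_space('caf\u00e9')))       # 'caf '

# Two adjacent en dashes turn into two spaces, one per character.
print(repr(replace_each_with_space('a\u2013\u2013b')))  # 'a  b'
```

If one space per character is what you want, this version is sufficient and needs no imports.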

Solution 2:

<code class="python">import re

def replace_with_space(text):
    # [^\x00-\x7F]+ matches a run of one or more non-ASCII characters.
    return re.sub(r'[^\x00-\x7F]+', ' ', text)</code>

In this solution, the + quantifier in the regular expression matches one or more consecutive non-ASCII characters, so each run is replaced with a single space. This avoids the issue in remove_non_ascii_2, where multiple spaces were inserted.
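To see the effect of the quantifier, the sketch below applies the same character class with and without + to a made-up sample string. The names NON_ASCII_RUN and NON_ASCII_ONE are illustrative only.

```python
import re

NON_ASCII_RUN = re.compile(r'[^\x00-\x7F]+')  # one or more non-ASCII characters
NON_ASCII_ONE = re.compile(r'[^\x00-\x7F]')   # exactly one non-ASCII character

text = 'a\u2013\u2013b'  # two adjacent en dashes between 'a' and 'b'

print(repr(NON_ASCII_RUN.sub(' ', text)))  # 'a b'  (the run collapses to one space)
print(repr(NON_ASCII_ONE.sub(' ', text)))  # 'a  b' (one space per character)
```

Note that spaces already present in the text are left untouched, so an en dash that is itself surrounded by spaces still yields three spaces in a row after substitution.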
