
How to Selectively Remove Non-ASCII Characters Preserving Spaces and Periods?

Linda Hamilton
2024-10-19 20:32:02

Selective Removal of Non-ASCII Characters

Working with text often means removing non-ASCII characters while keeping certain symbols, such as spaces and periods. A naive filter based on a range of code points can discard more than intended.

Let's consider the following code:

<code class="python">def onlyascii(char):
    if ord(char) < 48 or ord(char) > 127: return ''
    else: return char</code>

This function drops every character whose code point is below 48 or above 127. That strips non-ASCII characters, but it also removes spaces (code point 32), periods (code point 46), and every other ASCII character below 48, while keeping the DEL control character (code point 127).
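To see the problem concretely, the following minimal sketch runs the function above over a sample string. The sample text and the variable names `sample` and `stripped` are illustrative assumptions, not part of the original snippet.

```python
def onlyascii(char):
    # Same logic as the snippet above: drop anything outside code points 48..127.
    if ord(char) < 48 or ord(char) > 127:
        return ''
    return char

# Illustrative input containing one non-ASCII character (e with acute accent).
sample = "Caf\u00e9 no. 5 is open."
stripped = ''.join(onlyascii(c) for c in sample)
print(stripped)  # Cafno5isopen
```

The accented character is gone as intended, but so are all the spaces and periods.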

To remove non-ASCII characters while preserving spaces and periods, we can filter against the string.printable constant from Python's string module:

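<code class="python">import string
printable = set(string.printable)
# data is the input string; in Python 3, filter() returns an iterator,
# so join the surviving characters back into a str
filtered_data = ''.join(filter(lambda x: x in printable, data))</code>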

string.printable is a fixed string of ASCII characters: digits, letters, punctuation (including the period), and whitespace (including the space). Filtering against a set built from it removes everything else, which covers all non-ASCII characters as well as ASCII control characters such as \x00.
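As a quick check of what the set actually contains, the sketch below compares string.printable with the other constants in the string module and tests a few characters for membership. The names `parts`, `same_as_parts`, and `checks` are illustrative.

```python
import string

# string.printable is the concatenation of four other string-module constants.
parts = string.digits + string.ascii_letters + string.punctuation + string.whitespace
same_as_parts = (string.printable == parts)

printable = set(string.printable)
checks = {
    "space": " " in printable,        # kept
    "period": "." in printable,       # kept
    "e-acute": "\u00e9" in printable, # non-ASCII, filtered out
    "NUL": "\x00" in printable,       # control character, filtered out
}
print(same_as_parts, checks)
```

Note that string.whitespace also includes tab, newline, carriage return, vertical tab, and form feed, so those characters pass the filter too.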

For example, if we have the string "some\x00string. with\x15 funny characters":

<code class="python">s = "some\x00string. with\x15 funny characters"
''.join(filter(lambda x: x in printable, s))</code>

The result will be:

'somestring. with funny characters'

This method removes non-ASCII characters, along with ASCII control characters such as \x00 and \x15, while preserving spaces and periods, providing a clean string for further processing.
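If the filter is needed in several places, it can be wrapped in a small function. The sketch below is one possible shape, not a standard API: the name `strip_unprintable` and its `keep_control_whitespace` flag are hypothetical. The flag exists because string.printable also admits tab, newline, carriage return, vertical tab, and form feed, which some callers may want to drop while still keeping the plain space.

```python
import string

def strip_unprintable(text, keep_control_whitespace=True):
    """Hypothetical helper: keep only the characters found in string.printable."""
    allowed = set(string.printable)
    if not keep_control_whitespace:
        # Drop tab, newline, carriage return, vertical tab and form feed,
        # but keep the ordinary space character.
        allowed -= set(string.whitespace) - {' '}
    return ''.join(ch for ch in text if ch in allowed)

# Illustrative input: a non-ASCII letter, a tab, and a control character.
raw = "na\u00efve\ttext. done\x15"
with_tabs = strip_unprintable(raw)
without_tabs = strip_unprintable(raw, keep_control_whitespace=False)
print(repr(with_tabs))     # 'nave\ttext. done'
print(repr(without_tabs))  # 'navetext. done'
```

A generator expression is used here instead of filter() with a lambda; both produce the same result, and the choice is a matter of style.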
