Removing Non-ASCII Characters While Preserving Spaces and Periods
In Python, you may need to filter non-ASCII characters out of a string while keeping spaces and periods intact. The onlyascii() function used for this purpose tests each character's code point against a numeric range, so it removes not only non-ASCII characters but also spaces and periods, whose code points (32 and 46) fall below the range's lower bound of 48.
To address this issue, modify the onlyascii() function so that it handles spaces and periods explicitly. One approach is to use Python's string.printable, a string of the ASCII characters considered printable: digits, letters, punctuation, and whitespace. It therefore includes both the space and the period.
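As a quick check of the point above, the following minimal sketch confirms that the space and the period are members of string.printable even though their code points are below 48. The variable names and the sample character 'é' are illustrative assumptions, not part of the original code.

```python
import string

# Code points of the two characters we want to keep.
space_code = ord(' ')
period_code = ord('.')

# Both fall below 48, so a bare "ord(char) < 48" test would discard them,
# yet both are members of string.printable.
keeps_space = ' ' in string.printable
keeps_period = '.' in string.printable

# A non-ASCII character such as 'é' (an illustrative sample) is not a member.
keeps_accented = 'é' in string.printable

print(space_code, period_code, keeps_space, keeps_period, keeps_accented)
# prints: 32 46 True True False
```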
Within the onlyascii() function, you can filter out non-ASCII characters while allowing spaces and periods to pass through by checking whether the character is in the string.printable set. Here's how you can do it:
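    import string

    def onlyascii(char):
        # Drop the character only if its code point is outside 48-127
        # AND it is not one of the printable ASCII characters.
        if (ord(char) < 48 or ord(char) > 127) and char not in string.printable:
            return ''
        else:
            return char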
By adding the char not in string.printable condition to the if statement, you ensure that spaces and periods are retained even though their code points fall below 48. Note that the other characters in string.printable, such as punctuation, tabs, and newlines, are retained as well. Incorporating this modification into the get_my_string() function, you can now filter out non-ASCII characters while preserving spaces and periods:
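    def get_my_string(file_path):
        # Read the whole file as text; the with block closes it automatically.
        with open(file_path, 'r') as f:
            data = f.read()
        # In Python 3, filter() returns an iterator, so join the kept
        # characters back into a string before calling lower().
        filtered_data = ''.join(filter(onlyascii, data))
        return filtered_data.lower()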
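Because string.printable also contains punctuation, tabs, and newlines, the approach above keeps more than spaces and periods. If a stricter result is wanted, one possible variation is an explicit whitelist. The sketch below is an assumption-laden alternative rather than the article's method: the names ALLOWED and keep_ascii_spaces_periods and the sample string are hypothetical.

```python
import string

# Hypothetical whitelist: ASCII letters, digits, the space, and the period.
ALLOWED = set(string.ascii_letters + string.digits + ' .')

def keep_ascii_spaces_periods(text):
    """Return text lowercased, with every character outside ALLOWED removed."""
    return ''.join(ch for ch in text if ch in ALLOWED).lower()

sample = 'Héllo, Wörld. 42!'
print(keep_ascii_spaces_periods(sample))
# prints: hllo wrld. 42
```

A set is used for the whitelist so that each membership test is a constant-time lookup, which matters mainly when the input file is large.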