How to Remove Emojis from Strings in Python?
Filtering emojis with str.startswith("\xf") fails with an invalid character error because "\xf" is an incomplete escape sequence: \x must be followed by exactly two hex digits, so Python rejects the string literal before startswith() ever runs. There are more reliable ways to remove emojis from strings in Python.
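The escape problem can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and assuming the failing call was written with the literal "\xf"; `literal_is_valid` is a hypothetical helper introduced only for this illustration:

```python
def literal_is_valid(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<demo>", "eval")
        return True
    except (SyntaxError, ValueError):
        return False


# Raw strings keep the backslash so the escape is parsed by compile(), not here.
print(literal_is_valid(r'"\xf"'))   # False: \x needs exactly two hex digits
print(literal_is_valid(r'"\xf0"'))  # True: complete escape

# startswith() itself works fine once the literal is a complete escape.
print("\U0001F602 dog".startswith("\U0001F602"))  # True
```

Even with a valid literal, a prefix check only inspects the first character of a word, which is why the regular-expression approaches are a better fit for stripping emojis anywhere in a string.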
Using Unicode Strings and re.UNICODE Flag
On Python 2, to handle emojis, you need to create Unicode strings using u'' literals. Additionally, pass the re.UNICODE flag during compilation to enable Unicode support:
<code class="python">import re

# One character class covering several emoji blocks
emoji_pattern = re.compile(
    u"["
    u"\U0001F600-\U0001F64F"  # emoticons
    u"\U0001F300-\U0001F5FF"  # symbols & pictographs
    u"\U0001F680-\U0001F6FF"  # transport & map symbols
    u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
    u"]+",
    flags=re.UNICODE)

text = u'This dog \U0001F602'
print(text)                          # with emoji
print(emoji_pattern.sub(r'', text))  # without emoji</code>
Output:
This dog 😂
This dog
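On Python 3 every str is already Unicode, so the u'' prefix and the re.UNICODE flag can be dropped. The following is a minimal sketch of the same idea wrapped in a reusable function; the names `EMOJI_PATTERN` and `remove_emojis` are hypothetical and chosen only for this example:

```python
import re

# Same four ranges as in the article's pattern, written as Python 3 str literals.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F1E0-\U0001F1FF"  # flags
    "]+"
)


def remove_emojis(text):
    """Return `text` with every run of matched emoji characters removed."""
    return EMOJI_PATTERN.sub("", text)


cleaned = remove_emojis("This dog \U0001F602")
print(repr(cleaned))  # 'This dog ' (the trailing space is kept)
```

Compiling the pattern once at module level avoids rebuilding it on every call. If the leftover whitespace matters, calling `.strip()` on the result is one option.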
Using Compiled Regular Expression
Another approach is to keep the individual patterns in a list, join them into a single alternation, and compile the result once:
<code class="python">import re

emoji_patterns = [
    u"[\U0001F600-\U0001F64F]",  # emoticons
    u"[\U0001F300-\U0001F5FF]",  # symbols & pictographs
    u"[\U0001F680-\U0001F6FF]",  # transport & map symbols
    u"[\U0001F1E0-\U0001F1FF]",  # flags (iOS)
]

# Join the alternatives with "|" and compile the combined pattern once
emoji_pat = u"|".join(emoji_patterns)
emoji_pattern = re.compile(emoji_pat, flags=re.UNICODE)</code>
Keep in mind that these four ranges do not match every emoji. For a more comprehensive list, refer to the Unicode Emoji List.
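To illustrate the gap, the sketch below extends the article's four ranges with three more Unicode blocks. The extra blocks are this example's own assumption about which additions are useful, not a complete emoji definition, and the names `emoji_ranges` and `extended_pattern` are hypothetical:

```python
import re

emoji_ranges = [
    "\U0001F600-\U0001F64F",  # emoticons
    "\U0001F300-\U0001F5FF",  # symbols & pictographs
    "\U0001F680-\U0001F6FF",  # transport & map symbols
    "\U0001F1E0-\U0001F1FF",  # flags (regional indicators)
    "\U0001F900-\U0001F9FF",  # supplemental symbols & pictographs (added here)
    "\u2600-\u26FF",          # miscellaneous symbols (added here)
    "\u2700-\u27BF",          # dingbats (added here)
]

# Build one character class from all ranges and compile it once.
extended_pattern = re.compile("[" + "".join(emoji_ranges) + "]+")

# U+2600 (sun) and U+1F929 (star-struck face) fall outside the original four ranges.
sample = "Sunny \u2600 and happy \U0001F929"
print(repr(extended_pattern.sub("", sample)))  # 'Sunny  and happy '
```

The added blocks also contain symbols that are not emoji, so widening the ranges trades precision for coverage.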