What's the Most Efficient Way to Filter Special Characters from a String?

Linda Hamilton
2025-01-01 04:31:12

Efficient Character Filtering in Strings

This article addresses the task of efficiently removing special characters from a string, ensuring it contains only alphanumeric characters, underscores, and dots.

The original approach validates each character in a loop, which works but is not the fastest option. Two small optimizations help: iterating with an enumerator (foreach) instead of indexing into the string reduces array accesses, and initializing the StringBuilder with the expected capacity avoids reallocations as the result grows.
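The loop-based variant described above might look roughly like the following. This is a minimal sketch, not the original code under review: the class name LoopFilterSketch and the exact range checks are assumptions chosen to match the allowed set (letters, digits, underscore, dot).

```csharp
using System;
using System.Text;

public static class LoopFilterSketch
{
    // Keeps only ASCII letters, digits, '.' and '_'.
    public static string RemoveSpecialCharacters(string str)
    {
        // Pre-size the builder: the result can never be longer than the input.
        var sb = new StringBuilder(str.Length);

        // foreach enumerates the characters instead of indexing str[i] repeatedly.
        foreach (char c in str)
        {
            if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                (c >= 'a' && c <= 'z') || c == '.' || c == '_')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(RemoveSpecialCharacters("Hello, World! v1.0_beta"));
        // prints "HelloWorldv1.0_beta"
    }
}
```

Using str.Length as the capacity over-allocates slightly when many characters are removed, but it guarantees the builder never has to grow.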

A regular expression is a more concise alternative, but it may perform worse, particularly on short strings. The following regular expression matches runs of allowed characters:

[0-9A-Za-z._]+
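To remove characters rather than match the allowed ones, one option is to negate the character class and replace every match with an empty string. The sketch below assumes that approach; the negated pattern and the class name RegexFilterSketch are illustrative and not taken from the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexFilterSketch
{
    // Negated form of the allowed set: matches runs of characters to remove.
    private static readonly Regex Disallowed =
        new Regex("[^0-9A-Za-z._]+", RegexOptions.Compiled);

    public static string RemoveSpecialCharacters(string str)
    {
        // Replace every run of disallowed characters with nothing.
        return Disallowed.Replace(str, string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveSpecialCharacters("user@example.com #42"));
        // prints "userexample.com42"
    }
}
```

Keeping the Regex in a static field means the pattern is parsed once rather than on every call, which matters when the method is invoked repeatedly on short strings.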

However, a lookup table outperforms both string manipulation and regular expressions in this scenario. The lookup table stores Boolean values indicating whether each character is allowed, significantly speeding up the filtering process.

The complete solution incorporating a lookup table:

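class Program {
   // One flag per UTF-16 code unit: true means the character is kept.
   private static bool[] _lookup;

   // Static constructor: fills the table once, before first use.
   static Program() {
      _lookup = new bool[65536];
      for (char c = '0'; c <= '9'; c++) _lookup[c] = true;
      for (char c = 'A'; c <= 'Z'; c++) _lookup[c] = true;
      for (char c = 'a'; c <= 'z'; c++) _lookup[c] = true;
      _lookup['.'] = true;
      _lookup['_'] = true;
   }

   public static string RemoveSpecialCharacters(string str) {
      // The result can never be longer than the input.
      char[] buffer = new char[str.Length];
      int index = 0;
      foreach (char c in str) {
         // A single array read replaces several range comparisons.
         if (_lookup[c]) {
            buffer[index] = c;
            index++;
         }
      }
      // Copy only the portion of the buffer that was filled.
      return new string(buffer, 0, index);
   }
}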

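Performance tests show that the lookup table approach is significantly faster than both the loop method and the regular expression, with a reported time of approximately 13 milliseconds for a 24-character string. A single call on so short an input would take far less than that, so the figure most likely reflects many repeated calls in a benchmark loop.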
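A timing of this kind can be gathered with a simple Stopwatch loop. The following is a rough sketch only: the class name TimingSketch, the sample input, and the iteration count of 1,000,000 are assumptions, not details of the original tests, and the numbers it prints will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    private static readonly bool[] Lookup = BuildLookup();

    // Builds the same allowed-character table as the lookup solution.
    private static bool[] BuildLookup()
    {
        var table = new bool[65536];
        for (char c = '0'; c <= '9'; c++) table[c] = true;
        for (char c = 'A'; c <= 'Z'; c++) table[c] = true;
        for (char c = 'a'; c <= 'z'; c++) table[c] = true;
        table['.'] = true;
        table['_'] = true;
        return table;
    }

    public static string RemoveSpecialCharacters(string str)
    {
        char[] buffer = new char[str.Length];
        int index = 0;
        foreach (char c in str)
        {
            if (Lookup[c]) buffer[index++] = c;
        }
        return new string(buffer, 0, index);
    }

    public static void Main()
    {
        const string input = "ab*c_12.3?de/fg:hi;jk+lm"; // hypothetical 24-character sample
        const int iterations = 1000000;                   // assumed repetition count

        RemoveSpecialCharacters(input); // warm-up call so JIT compilation is not timed

        long totalLength = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Accumulate a value so the call cannot be optimized away.
            totalLength += RemoveSpecialCharacters(input).Length;
        }
        watch.Stop();

        Console.WriteLine("Lookup table: " + watch.ElapsedMilliseconds +
                          " ms for " + iterations + " calls (checksum " + totalLength + ")");
    }
}
```

For more reliable measurements, a dedicated benchmarking harness is generally preferable to a hand-written loop, since it accounts for warm-up and run-to-run variance.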
