


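This article presents an efficient bad-word (sensitive keyword) filtering algorithm for .NET, shared here for reference. For every character, the filter precomputes a bitmask of the positions at which that character appears in any keyword, so most positions in the input text can be rejected with a single array lookup before the hash set is consulted. The details are as follows:
The BadWordsFilter.cs class: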
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;

namespace WNF
{
    public class BadWordsFilter
    {
        // Keywords of two or more characters.
        private HashSet<string> hash = new HashSet<string>();
        // Bit i (0-6) is set when some keyword has this character at index i;
        // bit 7 covers index 7 and beyond.
        // Sized char.MaxValue + 1 so that '\uffff' is a valid index.
        private byte[] fastCheck = new byte[char.MaxValue + 1];
        // Indexed by a keyword's first character: bit (length - 2), capped at 7,
        // records which keyword lengths start with that character.
        private byte[] fastLength = new byte[char.MaxValue + 1];
        // Characters that are single-character keywords.
        private BitArray charCheck = new BitArray(char.MaxValue + 1);
        // Characters that end some multi-character keyword.
        private BitArray endCheck = new BitArray(char.MaxValue + 1);
        private int maxWordLength = 0;
        private int minWordLength = int.MaxValue;

        public BadWordsFilter() { }

        // Initialize the keyword tables from the first column of the table.
        public void Init(DataTable badwords)
        {
            for (int j = 0; j < badwords.Rows.Count; j++)
            {
                string word = badwords.Rows[j][0].ToString();
                if (word.Length == 0)
                {
                    continue; // An empty row would otherwise fail on word[0].
                }

                maxWordLength = Math.Max(maxWordLength, word.Length);
                minWordLength = Math.Min(minWordLength, word.Length);

                for (int i = 0; i < 7 && i < word.Length; i++)
                {
                    fastCheck[word[i]] |= (byte)(1 << i);
                }
                for (int i = 7; i < word.Length; i++)
                {
                    fastCheck[word[i]] |= 0x80;
                }

                if (word.Length == 1)
                {
                    charCheck[word[0]] = true;
                }
                else
                {
                    fastLength[word[0]] |= (byte)(1 << Math.Min(7, word.Length - 2));
                    endCheck[word[word.Length - 1]] = true;
                    hash.Add(word);
                }
            }
        }

        // Left unimplemented in the original.
        public string Filter(string text, string mask)
        {
            throw new NotImplementedException();
        }

        // Check whether the text contains any keyword.
        // Every position is tried as a possible start; the original skip logic
        // (index += count) could step over a valid start and miss a match,
        // for example the keyword "ab" in the text "aab".
        public bool HasBadWord(string text)
        {
            if (string.IsNullOrEmpty(text))
            {
                return false;
            }

            for (int index = 0; index < text.Length; index++)
            {
                char begin = text[index];

                // Skip characters that never start a keyword.
                if ((fastCheck[begin] & 1) == 0)
                {
                    continue;
                }

                // Single-character keyword.
                if (charCheck[begin])
                {
                    return true;
                }

                for (int j = 1; j < maxWordLength && index + j < text.Length; j++)
                {
                    char current = text[index + j];

                    // No keyword has this character at position j.
                    if ((fastCheck[current] & (1 << Math.Min(j, 7))) == 0)
                    {
                        break;
                    }

                    // Only hit the hash set when the length, the first character
                    // and the last character are all plausible.
                    if (j + 1 >= minWordLength
                        && (fastLength[begin] & (1 << Math.Min(j - 1, 7))) != 0
                        && endCheck[current]
                        && hash.Contains(text.Substring(index, j + 1)))
                    {
                        return true;
                    }
                }
            }

            return false;
        }
    }
}
Usage:
string sql = "select keywords from tb_keyword";
BadWordsFilter badwordfilter = new BadWordsFilter();

// Initialize the keywords
badwordfilter.Init(oEtb.GetDataSet(sql).Tables[0]);

// Check whether the text contains any keyword
bool hasBadWord = badwordfilter.HasBadWord(TextBox1.Text);
if (hasBadWord)
{
    Page.RegisterClientScriptBlock("a", "<script>alert('This comment contains prohibited words!')</script>");
}
else
{
    PingLun(); // Write to the comments table
}
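The Filter method of the class is left unimplemented in the source. As a rough sketch of how masking could be built on the same position-bitmask idea, the stand-alone helper below replaces each keyword occurrence with a mask character. The names BadWordMasker and Mask are hypothetical and not part of the original code, and the sketch omits the fastLength and endCheck pre-checks for brevity, so it is an illustration under those assumptions rather than the author's method.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of the original BadWordsFilter.
public static class BadWordMasker
{
    // Returns text with every keyword occurrence overwritten by the mask character.
    public static string Mask(string text, IEnumerable<string> words, char mask)
    {
        if (string.IsNullOrEmpty(text)) return text;

        // positionMask[c] has bit i set when some keyword has c at index i;
        // bit 7 stands for "index 7 or later", mirroring fastCheck.
        var positionMask = new byte[char.MaxValue + 1];
        var set = new HashSet<string>();
        int maxLength = 0;
        foreach (string word in words)
        {
            if (string.IsNullOrEmpty(word)) continue;
            set.Add(word);
            maxLength = Math.Max(maxLength, word.Length);
            for (int i = 0; i < word.Length; i++)
            {
                positionMask[word[i]] |= (byte)(1 << Math.Min(i, 7));
            }
        }

        var result = new StringBuilder(text);
        for (int index = 0; index < text.Length; index++)
        {
            // Cheap rejection: this character never starts a keyword.
            if ((positionMask[text[index]] & 1) == 0) continue;

            int matched = 0; // Length of the longest keyword starting here.
            for (int j = 0; j < maxLength && index + j < text.Length; j++)
            {
                // Stop once no keyword has this character at position j.
                if ((positionMask[text[index + j]] & (1 << Math.Min(j, 7))) == 0) break;
                if (set.Contains(text.Substring(index, j + 1))) matched = j + 1;
            }

            for (int k = 0; k < matched; k++)
            {
                result[index + k] = mask;
            }
        }
        return result.ToString();
    }
}
```

For example, BadWordMasker.Mask("this is bad text", new[] { "bad" }, '*') returns "this is *** text". Matching is done against the original text while the mask is written to a separate buffer, so overlapping keywords are all covered.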
I hope this article is helpful for your ASP.NET programming.


