Remove unusual characters from SQL Server VARCHAR columns
Background:
Certain non-standard characters, especially characters with diacritics (such as â, an a with a circumflex), have ended up in SQL Server varchar columns. The problem stems from limited control over how the .csv data source is imported.
Solution:
Option 1: Use .NET regular expressions
In C#, you can remove these characters with a regular expression. The Regex.Replace call below deletes every character outside the ASCII range U+0000 to U+007F:
<code class="language-csharp">Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);</code>
Option 2: Create SQL CLR function
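Since SQL Server does not natively support regular expressions, you can create a SQL CLR function. This requires a .NET method marked with the SqlFunction attribute and a T-SQL CREATE FUNCTION statement that binds to it through EXTERNAL NAME.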
Implementation:
Option 1:
<code class="language-csharp">Regex.Replace(inputString, @"[^\u0000-\u007F]", string.Empty);</code>
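To see what the pattern does to real input, here is a minimal console sketch. The class name AsciiFilterDemo, the helper StripNonAscii, and the sample string are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the Regex.Replace call comes from the article.
public static class AsciiFilterDemo
{
    // Removes every character outside the 7-bit ASCII range (U+0000 to U+007F).
    public static string StripNonAscii(string input)
    {
        return Regex.Replace(input, @"[^\u0000-\u007F]", string.Empty);
    }

    public static void Main()
    {
        // \u00E2 is the precomposed "a with circumflex"; it is deleted, not converted to "a".
        Console.WriteLine(StripNonAscii("H\u00E2llo Kitty")); // prints "Hllo Kitty"
    }
}
```

Note that the character is dropped entirely rather than replaced by its unaccented base letter, so check a sample of the data before running this against a whole column.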
Option 2:
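<code class="language-csharp">using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

// The class name must match the middle part of EXTERNAL NAME in the CREATE FUNCTION statement.
public class StackOverflow
{
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, Name = "RegexReplace")]
    public static SqlString Replace(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
    {
        // NULL arguments are treated as empty strings.
        string input = (sqlInput.IsNull) ? string.Empty : sqlInput.Value;
        string pattern = (sqlPattern.IsNull) ? string.Empty : sqlPattern.Value;
        string replacement = (sqlReplacement.IsNull) ? string.Empty : sqlReplacement.Value;
        return new SqlString(Regex.Replace(input, pattern, replacement));
    }
}</code>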
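The Replace function above maps a NULL input to an empty string, so a NULL column value would come back as '' rather than NULL. If that matters for your data, one possible variant returns NULL instead. The sketch below is an assumption, not the original author's code: the names NullAwareRegexDemo and ReplacePreservingNull are hypothetical, and the SqlFunction attribute is left out so the logic can be tried outside SQL Server.

```csharp
using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

// Hypothetical variant for local experimentation; not deployed to SQL Server as written.
public static class NullAwareRegexDemo
{
    // Returns SqlString.Null when the input or the pattern is NULL,
    // instead of substituting an empty string.
    public static SqlString ReplacePreservingNull(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
    {
        if (sqlInput.IsNull || sqlPattern.IsNull)
        {
            return SqlString.Null;
        }

        // A NULL replacement is still treated as "delete the match".
        string replacement = sqlReplacement.IsNull ? string.Empty : sqlReplacement.Value;
        return new SqlString(Regex.Replace(sqlInput.Value, sqlPattern.Value, replacement));
    }

    public static void Main()
    {
        SqlString pattern = new SqlString(@"[^\u0000-\u007F]");
        SqlString empty = new SqlString(string.Empty);

        // NULL in, NULL out.
        Console.WriteLine(ReplacePreservingNull(SqlString.Null, pattern, empty).IsNull);

        // Non-ASCII characters are removed from a regular value.
        Console.WriteLine(ReplacePreservingNull(new SqlString("H\u00E2llo"), pattern, empty).Value);
    }
}
```

Whether NULL should survive the cleanup depends on how the column is used downstream, so treat this as one option rather than a required change.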
<code class="language-sql">CREATE FUNCTION [dbo].[StackOverflowRegexReplace] (@input NVARCHAR(MAX), @pattern NVARCHAR(MAX), @replacement NVARCHAR(MAX))
RETURNS NVARCHAR(4000)
AS EXTERNAL NAME [StackOverflow].[StackOverflow].[Replace]
GO</code>
<code class="language-sql">SELECT [dbo].[StackOverflowRegexReplace] ('Hello Kitty Essential Accessory Kit', '[^\u0000-\u007F]', '')</code>