Encoding and Decoding Unicode Characters in C#
In C#, the <code>Encoding</code> class is used for character encoding and decoding. However, it has limitations when processing Unicode characters outside the ASCII range. To preserve non-ASCII characters, such as the Greek letter pi (π), in an ASCII string, we need a specific technique.
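To make the limitation concrete, here is a minimal sketch of what happens when a string containing π passes through `Encoding.ASCII`, which by default substitutes `?` for any character it cannot represent. The class name `AsciiLossDemo` and the helper `RoundTripThroughAscii` are hypothetical names chosen for illustration; they do not come from the original article.

```csharp
using System;
using System.Text;

static class AsciiLossDemo
{
    // Hypothetical helper: converts a string to ASCII bytes and back.
    // Characters above U+007F cannot be represented and are replaced with '?'.
    public static string RoundTripThroughAscii(string value)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(value);
        return Encoding.ASCII.GetString(bytes);
    }

    static void Main()
    {
        Console.WriteLine(RoundTripThroughAscii("pi = π")); // prints "pi = ?"
    }
}
```

The information about which character was originally present is gone after the conversion, which is why an explicit escaping scheme is needed.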
Unicode character encoding
To encode Unicode characters into an escaped ASCII string, we use the following method:
<code class="language-csharp">// Requires: using System.Text;
static string EncodeNonAsciiCharacters(string value)
{
    StringBuilder sb = new StringBuilder();
    foreach (char c in value)
    {
        if (c > 127)
        {
            // Non-ASCII: emit a \uXXXX escape (four lowercase hex digits).
            string encodedValue = "\\u" + ((int)c).ToString("x4");
            sb.Append(encodedValue);
        }
        else
        {
            sb.Append(c);
        }
    }
    return sb.ToString();
}</code>
This method replaces each non-ASCII character with its escaped ASCII form, so "π" becomes "\u03c0".
Decoding escaped ASCII characters
To decode the escaped ASCII string back to Unicode, we use a regular expression:
<code class="language-csharp">// Requires: using System.Globalization; using System.Text.RegularExpressions;
static string DecodeEncodedNonAsciiCharacters(string value)
{
    return Regex.Replace(
        value,
        @"\\u(?<Value>[0-9a-fA-F]{4})", // literal \u followed by four hex digits
        m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());
}</code>
This regular expression replaces every escaped Unicode sequence (\uXXXX) with its corresponding Unicode character, so "\u03c0" becomes "π".
Example usage
Passing a string through both methods in sequence illustrates the encoding and decoding process: the non-ASCII character π is preserved across the whole round trip.
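A minimal sketch of such a round trip is shown here. It repeats the two methods so that it compiles on its own, with the backslashes in the `"\\u"` literal and the regex pattern written out, the capture group named consistently, and the pattern limited to hex digits. The wrapper class `UnicodeEscapeDemo`, the `Main` entry point, and the sample string are assumptions made for illustration and are not taken from the original article.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

static class UnicodeEscapeDemo
{
    // Replaces every char above 127 with a \uXXXX escape (lowercase hex).
    public static string EncodeNonAsciiCharacters(string value)
    {
        var sb = new StringBuilder();
        foreach (char c in value)
        {
            if (c > 127)
                sb.Append("\\u").Append(((int)c).ToString("x4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Replaces every \uXXXX escape with the char it denotes.
    public static string DecodeEncodedNonAsciiCharacters(string value)
    {
        return Regex.Replace(
            value,
            @"\\u(?<Value>[0-9a-fA-F]{4})",
            m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());
    }

    static void Main()
    {
        string original = "The Greek letter pi: π"; // illustrative sample text
        string encoded = EncodeNonAsciiCharacters(original);
        string decoded = DecodeEncodedNonAsciiCharacters(encoded);

        Console.WriteLine(encoded);             // The Greek letter pi: \u03c0
        Console.WriteLine(decoded == original); // True
    }
}
```

One caveat of this scheme: the encoder does not escape backslashes, so an input that already contains a literal `\u` followed by four hex digits would be turned into a character by the decoder and would not survive the round trip unchanged.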