Regular expressions - Garbled Chinese text when matching with C++ regex

Question

{code...} When matching Chinese text in C++, some of the characters come out garbled. Has anyone else run into this?

ringa_lee · Answer

\u4e00-\u9fa5 is the Unicode range commonly used to match Chinese characters.
C++ does not support Unicode very well. If the program is compiled with VS on Windows, ordinary string literals are ANSI encoded after compilation (GBK on a Chinese-language system), and L"" strings are UTF-16 LE. Since C++11 you can try u8"" (UTF-8), u"" (UTF-16), and U"" (UTF-32) to specify the encoding of a Unicode string literal.
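
As a rough illustration of the prefixes above, the sketch below counts code units for the same character under each UTF prefix. The helper `code_units` and the constant names are made up for this example; the characters are written as \u escapes so the result does not depend on how the source file is saved. Narrow "" and L"" literals are left out because their size depends on the compiler and platform.

```cpp
#include <cstddef>

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// U+4E2D lies in the Basic Multilingual Plane.
constexpr std::size_t utf8_len  = code_units(u8"\u4e2d");  // three UTF-8 bytes: E4 B8 AD
constexpr std::size_t utf16_len = code_units(u"\u4e2d");   // one 16-bit unit
constexpr std::size_t utf32_len = code_units(U"\u4e2d");   // one 32-bit unit

// U+20BB7 lies outside the BMP, so UTF-16 needs a surrogate pair.
constexpr std::size_t utf16_supplementary = code_units(u"\U00020BB7");  // two 16-bit units
constexpr std::size_t utf32_supplementary = code_units(U"\U00020BB7");  // one 32-bit unit
```

The point is that "one character" is only one array element in UTF-32; in UTF-8 and sometimes in UTF-16 it spans several elements.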

Judging from the source code, the regex in use is the one from the C++ standard library. Searching Stack Overflow, the general consensus is that the standard library's regex does not support Unicode well.
http://stackoverflow.com/questions/11254232/do-c11-regular-expressions...
http://stackoverflow.com/questions/15882991/range-of-utf-8-characters-...
http://stackoverflow.com/questions/17103925/how-well-is-unicode-supplor...
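
One likely source of the garbling, shown here as a small sketch rather than a diagnosis of the original code: std::regex works on individual char values, so with UTF-8 input a single "." consumes one byte, not one character. The function names below are invented for the example, and the bytes are written as hex escapes to avoid depending on the source file encoding.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// UTF-8 encoding of U+4E2D, spelled out byte by byte.
inline std::string utf8_zhong() { return "\xE4\xB8\xAD"; }

// True if the whole input matches exactly n "any character" dots.
inline bool matches_n_dots(const std::string& s, std::size_t n) {
    const std::string pattern(n, '.');  // e.g. n == 3 gives "..."
    return std::regex_match(s, std::regex(pattern));
}
```

With this helper, `matches_n_dots(utf8_zhong(), 1)` is false and `matches_n_dots(utf8_zhong(), 3)` is true: the regex sees three separate units, so a match boundary can fall in the middle of a Chinese character and leave broken bytes behind.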

I don’t know whether using UTF-32 or UTF-16 would solve the problem. The generally recommended approach is boost::regex + ICU.
This example looks like it could be solved by using u"".
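
A minimal sketch of matching the \u4e00-\u9fa5 range with the standard library alone. It uses std::wregex and L"" literals instead of u"", because the standard library only provides regex traits for char and wchar_t; this is a substitute for the u"" idea, not a confirmation of it. The function name `first_chinese_run` is hypothetical, and the sketch assumes the input is already a wide string in which each of these characters occupies one wchar_t.

```cpp
#include <regex>
#include <string>

// Return the first run of characters in U+4E00..U+9FA5 found in `text`,
// or an empty string if there is none.
inline std::wstring first_chinese_run(const std::wstring& text) {
    // The \u escapes are resolved by the compiler, so the regex engine
    // receives the actual range endpoints as wide characters.
    static const std::wregex cjk(L"[\u4e00-\u9fa5]+");
    std::wsmatch m;
    return std::regex_search(text, m, cjk) ? m.str() : std::wstring();
}
```

For example, `first_chinese_run(L"abc\u4e2d\u6587def")` yields the two-character wide string L"\u4e2d\u6587". Input that arrives as GBK or UTF-8 bytes would have to be converted to wide characters first, which this sketch does not cover.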
