Reading Unicode UTF-8 Files into Wstrings on Windows
Reading Unicode (UTF-8) files into std::wstring objects on Windows can be done with C++11's std::codecvt_utf8 facet. Note that the <codecvt> header was deprecated in C++17, so this approach fits C++11 and C++14 code bases best.
std::codecvt_utf8 converts between UTF-8 byte strings and UCS-2 or UCS-4 character strings; which one you get depends on the width of the character type. On Windows, where wchar_t is 16 bits, std::codecvt_utf8<wchar_t> produces UCS-2. The facet can be used to read and write UTF-8 files, in both text and binary mode.
To use the facet, create a std::locale object that carries it, then imbue the file stream with that locale so that its stream buffer performs the conversion.
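The conversion the facet performs can also be seen without any file involved. The following is a minimal sketch, assuming the C++11 std::wstring_convert helper from the same <codecvt>/<locale> family (also deprecated in C++17); the function names utf8ToWide and wideToUtf8 are hypothetical and chosen only for illustration.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into a wide string.
// The result is UCS-2 where wchar_t is 16 bits (Windows) and UCS-4 where it is 32 bits.
std::wstring utf8ToWide(const std::string& bytes)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.from_bytes(bytes);  // throws std::range_error on malformed input
}

// Hypothetical helper: encode a wide string back into UTF-8 bytes.
std::string wideToUtf8(const std::wstring& text)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t>> converter;
    return converter.to_bytes(text);
}
```

A two-byte UTF-8 sequence such as 0xC3 0xA9 (é) becomes a single wchar_t, and encoding the wide string again should reproduce the original bytes.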
Here's an implementation using imbuing:
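#include <codecvt>
#include <fstream>
#include <locale>
#include <sstream>
#include <string>

std::wstring readFile(const char* filename)
{
    std::wifstream wif(filename);
    // std::locale::empty() is not standard C++ (MSVC accepts it as an extension);
    // std::locale() is the portable way to supply the base locale.
    wif.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
    std::wstringstream wss;
    wss << wif.rdbuf();  // copy the decoded contents in one go
    return wss.str();
}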
With this function in place, reading a file into a wstring is a single call:
std::wstring wstr = readFile("a.txt");
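To check the behavior end to end, it can help to write a file whose bytes are known exactly and read it back. The sketch below is one way to do that; writeBytes and readUtf8FileChecked are hypothetical names, and the open-failure check is an addition that the readFile function above does not have.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: write raw bytes so the file's encoding is known exactly.
void writeBytes(const std::string& path, const std::string& bytes)
{
    std::ofstream out(path, std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Variant of readFile that reports an unopenable file instead of
// silently returning an empty string (an assumption, not the original design).
std::wstring readUtf8FileChecked(const std::string& path)
{
    std::wifstream wif(path);
    if (!wif)
        throw std::runtime_error("cannot open " + path);
    // Imbue before the first read so every byte goes through the UTF-8 facet.
    wif.imbue(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
    std::wstringstream wss;
    wss << wif.rdbuf();
    return wss.str();
}
```

Writing the six bytes of "h\xC3\xA9llo" and reading them back should yield a five-character wide string, since the two-byte sequence decodes to a single character.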
Alternatively, set the global C++ locale before creating any streams; streams constructed afterwards no longer need an explicit imbue:
std::locale::global(std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));
Every stream constructed after this call copies the global locale at construction and therefore uses the UTF-8 facet. Streams that already exist, including std::wcin and std::wcout, keep the locale they were created with.
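Because the global locale affects the whole program, one option is to install it only for as long as it is needed. The following sketch assumes that scoped approach; readWithGlobalLocale is a hypothetical name, and it relies on std::locale::global returning the previously installed locale.

```cpp
#include <codecvt>
#include <fstream>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: read a UTF-8 file by relying on the global locale
// instead of calling imbue() on the stream.
std::wstring readWithGlobalLocale(const std::string& path)
{
    // Install the UTF-8 facet globally; global() returns the previous locale.
    const std::locale previous = std::locale::global(
        std::locale(std::locale(), new std::codecvt_utf8<wchar_t>));

    std::wifstream wif(path);  // constructed after global(), so it picks up the facet
    std::wstringstream wss;
    wss << wif.rdbuf();

    std::locale::global(previous);  // restore the locale that was active before
    return wss.str();
}
```

This version is not exception-safe and is not suitable when other threads create streams at the same time; it is only meant to show that the order of the global() call and the stream construction is what matters.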