UTF-8 Text Retrieval from MySQL in R: Decoding "?????"
Retrieving UTF-8 encoded text from a MySQL database into R can return "?????" in place of the expected characters. This usually means that some layer between the table and R is not using UTF-8. To resolve the issue, check the following:
1. Verify Database Encoding
Ensure that the database table is defined with the appropriate charset and collation. For example, in MySQL:
CREATE TABLE test (str VARCHAR(10)) ENGINE=InnoDB DEFAULT CHARSET=utf8;
2. Set Connection Encoding
When establishing a database connection in R, specify the correct character encoding.
RODBC:
# CharSet is a MySQL ODBC connection-string option, not an R argument
con <- odbcDriverConnect('DRIVER=mysql;user=root;CharSet=utf8')
RMySQL:
Connect first and then run:
con <- dbConnect(MySQL(), user='root')
dbSendQuery(con, 'SET NAMES utf8')
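To confirm that the connection setting took effect, the session variables can be inspected from R. The following is a minimal sketch, not part of the original answer: `non_utf8_vars` is a hypothetical helper, and the example data frame is hand-written to mimic what `dbGetQuery(con, "SHOW VARIABLES LIKE 'character_set_%'")` returns for a latin1 session, so the block runs without a database.

```r
# Hypothetical helper: given the result of
#   dbGetQuery(con, "SHOW VARIABLES LIKE 'character_set_%'")
# return the session variables that SET NAMES controls and that are
# not set to a UTF-8 charset (utf8, utf8mb3, utf8mb4).
non_utf8_vars <- function(vars) {
  session <- c("character_set_client",
               "character_set_connection",
               "character_set_results")
  rows <- vars[vars$Variable_name %in% session, , drop = FALSE]
  rows$Variable_name[!grepl("^utf8", rows$Value)]
}

# Illustrative input only: shaped like the SHOW VARIABLES output
# of a session that is still using latin1.
example <- data.frame(
  Variable_name = c("character_set_client", "character_set_connection",
                    "character_set_database", "character_set_results"),
  Value = c("latin1", "latin1", "utf8", "latin1"),
  stringsAsFactors = FALSE
)

non_utf8_vars(example)
```

An empty result after running SET NAMES utf8 (or connecting with CharSet=utf8) suggests the session is configured as intended.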
3. Convert Character Encoding
After retrieving the results, convert the character encoding of the string column to UTF-8.
RODBC:
res <- sqlQuery(con, 'SELECT * FROM rtest.test')
# "UTF-8-Mac" is the source encoding; it is an iconv name available on macOS
res$str <- iconv(res$str, from = "UTF-8-Mac", to = "UTF-8")
RMySQL:
res <- dbGetQuery(con, 'SELECT * FROM rtest.test')
# as.character() has no encoding argument; mark the strings as UTF-8 instead
Encoding(res$str) <- 'UTF-8'
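To see what marking an encoding does, here is a minimal base-R sketch that needs no database. It builds the UTF-8 bytes of "café" by hand to stand in for a value returned by a driver; the variable names `x` and `y` are illustrative only.

```r
# Build a string from raw UTF-8 bytes, as a driver might hand it to R:
# "caf" followed by U+00E9 (bytes c3 a9).
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xc3, 0xa9)))
Encoding(x)                 # inspect the mark before declaring it

# Declare that the bytes are UTF-8. The bytes themselves are not changed.
Encoding(x) <- "UTF-8"
Encoding(x)
nchar(x, type = "chars")    # counts characters
nchar(x, type = "bytes")    # counts bytes

# Literal question marks are plain ASCII bytes (0x3f). If the query
# result already contains them, the original characters were lost
# before the data reached R and cannot be recovered with iconv() or
# Encoding(); the connection charset has to be fixed instead.
y <- "?????"
charToRaw(y)
```

This is why the connection encoding in step 2 should be checked before any conversion in R: conversion only helps when the original bytes are still intact.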