I'm having trouble loading data into a MySQL database.
The data looks like this:
review_id, user, text
A typical line looks like this:
12345,SomeCoolName,"this is my "awsome" comment. some more text, and dome more. and some "more""
This should be a row in my table.
I'm having trouble uploading this content due to multiple lines in the text field and the use of commas and brackets. Any suggestions on how to deal with this issue?
Thanks!
I tried using some manuals I found about uploading csv files to a database, but with no success.
P粉165522886 2023-09-17 00:57:56
Source CSV content that must be imported:
review_id,user,text
123,John,This is
multiline 1,
which contains a comma.
456,Jim,This is
miltiline 2, which contains
commas, 'quote' chars and "double quote" chars.
The table this data must be imported into:
CREATE TABLE test (review_id INT, user VARCHAR(255), review_text TEXT);
Query to load data into table:
LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/test.csv'
INTO TABLE test
FIELDS TERMINATED BY '' ENCLOSED BY '' ESCAPED BY ''
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(@line)
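-- A line that starts with a digit begins a new record; any other line continues the previous one.
SET review_id = (@review_id := CASE WHEN @line REGEXP '^[0-9]'
                                    THEN SUBSTRING_INDEX(@line, ',', 1)
                                    ELSE @review_id
                                    END
                ),
    user = (@user := CASE WHEN @line REGEXP '^[0-9]'
                          THEN SUBSTRING_INDEX(SUBSTRING_INDEX(@line, ',', 2), ',', -1)
                          ELSE @user
                          END
           ),
    -- New record: take everything after the second comma. Continuation: append the line to the text so far.
    review_text = (@review_text := CASE WHEN @line REGEXP '^[0-9]'
                                        THEN SUBSTRING(@line FROM 2 + CHAR_LENGTH(SUBSTRING_INDEX(@line, ',', 2)))
                                        ELSE CONCAT_WS(' ', @review_text, @line)
                                        END
                  );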
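To see how the three expressions in the SET clause split the first line of a record, they can be run on their own. This is only a sketch: the value assigned to @line is a hypothetical sample taken from the CSV above, and CHAR_LENGTH is used so that the offset is counted in characters rather than bytes.

```sql
-- Hypothetical sample line, used only to illustrate the splitting expressions.
SET @line = '123,John,This is';

SELECT
    -- everything before the first comma
    SUBSTRING_INDEX(@line, ',', 1) AS review_id,
    -- the piece between the first and the second comma
    SUBSTRING_INDEX(SUBSTRING_INDEX(@line, ',', 2), ',', -1) AS user,
    -- everything after the second comma, later commas included
    SUBSTRING(@line FROM 2 + CHAR_LENGTH(SUBSTRING_INDEX(@line, ',', 2))) AS review_text;
```

For this sample line the query returns 123, John and This is.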
Table contents after loading:
review_id | user | review_text |
---|---|---|
123 | John | This is |
123 | John | This is multiline 1, |
123 | John | This is multiline 1, which contains a comma. |
456 | Jim | This is |
456 | Jim | This is miltiline 2, which contains |
456 | Jim | This is miltiline 2, which contains commas, 'quote' chars and "double quote" chars. |
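Cleanup (each partial row is a prefix of the complete one, so it compares as smaller and gets deleted):
DELETE t1
FROM test t1
JOIN test t2 USING (user)
WHERE t1.review_text < t2.review_text;
Final table contents: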
review_id | user | review_text |
---|---|---|
123 | John | This is multiline 1, which contains a comma. |
456 | Jim | This is miltiline 2, which contains commas, 'quote' chars and "double quote" chars. |
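If the same user can have more than one review, joining on user would also delete that user's other reviews. A possible variant, assuming review_id identifies exactly one review (the sample data does not confirm this), joins on review_id and keeps only the longest text per review. Treat it as a sketch rather than a drop-in replacement.

```sql
-- Assumption: review_id is unique per review.
-- Every continuation line makes the accumulated text longer,
-- so the complete row is the longest one for its review_id.
DELETE t1
FROM test t1
JOIN test t2 USING (review_id)
WHERE CHAR_LENGTH(t1.review_text) < CHAR_LENGTH(t2.review_text);
```

On the sample data above this leaves the same two rows as the original cleanup query.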