使用SQL 從全名字段解析名字、中間名和姓氏
處理資料時,通常需要將名字分成各自的名字更容易操作的組成部分。在這種情況下,我們需要從「全名」欄位中提取名字、中間名和姓氏,同時考慮常見的資料變更。
準確率達 90% 的高效解決方案
提供的範例提供了一個實用的解決方案,可以高度處理大多數情況準確性:
SELECT FIRST_NAME.ORIGINAL_INPUT_DATA, FIRST_NAME.TITLE, FIRST_NAME.FIRST_NAME, CASE WHEN 0 = CHARINDEX(' ', FIRST_NAME.REST_OF_NAME) THEN NULL -- No more spaces? Assume rest is last name ELSE SUBSTRING(FIRST_NAME.REST_OF_NAME, 1, CHARINDEX(' ', FIRST_NAME.REST_OF_NAME) - 1) END AS MIDDLE_NAME, SUBSTRING(FIRST_NAME.REST_OF_NAME, 1 + CHARINDEX(' ', FIRST_NAME.REST_OF_NAME), LEN(FIRST_NAME.REST_OF_NAME)) AS LAST_NAME FROM ( SELECT TITLE.TITLE, CASE WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME) THEN TITLE.REST_OF_NAME -- No space? Return the whole thing ELSE SUBSTRING(TITLE.REST_OF_NAME, 1, CHARINDEX(' ', TITLE.REST_OF_NAME) - 1) END AS FIRST_NAME, CASE WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME) THEN NULL -- No spaces at all? Then 1st name is all we have ELSE SUBSTRING(TITLE.REST_OF_NAME, CHARINDEX(' ', TITLE.REST_OF_NAME) + 1, LEN(TITLE.REST_OF_NAME)) END AS REST_OF_NAME, TITLE.ORIGINAL_INPUT_DATA FROM ( SELECT -- If the first three characters are in this list, -- then pull it as a "title". Otherwise return NULL for title. CASE WHEN SUBSTRING(TEST_DATA.FULL_NAME, 1, 3) IN ('MR ', 'MS ', 'DR ', 'MRS') THEN LTRIM(RTRIM(SUBSTRING(TEST_DATA.FULL_NAME, 1, 3))) ELSE NULL END AS TITLE, -- If you change the list, don't forget to change it here, too. CASE WHEN SUBSTRING(TEST_DATA.FULL_NAME, 1, 3) IN ('MR ', 'MS ', 'DR ', 'MRS') THEN LTRIM(RTRIM(SUBSTRING(TEST_DATA.FULL_NAME, 4, LEN(TEST_DATA.FULL_NAME)))) ELSE LTRIM(RTRIM(TEST_DATA.FULL_NAME)) END AS REST_OF_NAME, TEST_DATA.ORIGINAL_INPUT_DATA FROM ( SELECT -- Trim leading & trailing spaces before trying to process -- Disallow extra spaces *within* the name REPLACE(REPLACE(LTRIM(RTRIM(FULL_NAME)), ' ', ' '), ' ', ' ') AS FULL_NAME, FULL_NAME AS ORIGINAL_INPUT_DATA FROM ( -- Replace this block with your actual table SELECT 'GEORGE W BUSH' AS FULL_NAME UNION SELECT 'SUSAN B ANTHONY' AS FULL_NAME UNION SELECT 'ALEXANDER HAMILTON' AS FULL_NAME UNION SELECT 'OSAMA BIN LADEN JR' AS FULL_NAME UNION SELECT 'MARTIN J VAN BUREN SENIOR III' AS FULL_NAME UNION SELECT 'TOMMY' AS FULL_NAME UNION SELECT 'BILLY' AS FULL_NAME ) RAW_DATA ) TEST_DATA ) TITLE ) FIRST_NAME;
此查詢將“MR”、“MS”、“DR”和“MRS”等前綴會以單獨的「TITLE」欄位進行識別和刪除,處理缺少的名稱、多個空格姓名,以及單部分「全名」(僅名字)。
特殊處理案例
此解決方案還包括針對特定特殊情況的修改,例如空的「全名」欄位、尾隨/前導空格、多個連續空格以及僅包含名字的「全名」 :
-- Handle the following special cases: -- 1 - The NAME field is NULL -- 2 - The NAME field contains leading / trailing spaces -- 3 - The NAME field has > 1 consecutive space within the name -- 4 - The NAME field contains ONLY the first name -- 5 - Include the original full name in the final output as a separate column, for readability -- 6 - Handle a specific list of prefixes as a separate "title" column SELECT FIRST_NAME.ORIGINAL_INPUT_DATA, FIRST_NAME.TITLE, FIRST_NAME.FIRST_NAME, CASE WHEN 0 = CHARINDEX(' ', FIRST_NAME.REST_OF_NAME) THEN NULL -- No more spaces? Assume rest is last name ELSE SUBSTRING(FIRST_NAME.REST_OF_NAME, 1, CHARINDEX(' ', FIRST_NAME.REST_OF_NAME) - 1) END AS MIDDLE_NAME, SUBSTRING(FIRST_NAME.REST_OF_NAME, 1 + CHARINDEX(' ', FIRST_NAME.REST_OF_NAME), LEN(FIRST_NAME.REST_OF_NAME)) AS LAST_NAME FROM ( SELECT TITLE.TITLE, CASE WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME) THEN TITLE.REST_OF_NAME -- No space? Return the whole thing ELSE SUBSTRING(TITLE.REST_OF_NAME, 1, CHARINDEX(' ', TITLE.REST_OF_NAME) - 1) END AS FIRST_NAME, CASE WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME) THEN NULL -- No spaces at all? Then 1st name is all we have ELSE SUBSTRING(TITLE.REST_OF_NAME, CHARINDEX(' ', TITLE.REST_OF_NAME) + 1, LEN(TITLE.REST_OF_NAME)) END AS REST_OF_NAME, TITLE.ORIGINAL_INPUT_DATA FROM ( SELECT -- If the first three characters are in this list, -- then pull it as a "title". Otherwise return NULL for title. CASE WHEN SUBSTRING(TEST_DATA.FULL_NAME, 1, 3) IN ('MR ', 'MS ', 'DR ', 'MRS') THEN LTRIM(RTRIM(SUBSTRING(TEST_DATA.FULL_NAME, 1, 3))) ELSE NULL END AS TITLE, -- If you change the list, don't forget to change it here, too. CASE WHEN SUBSTRING(TEST_DATA.FULL_NAME, 1, 3) IN ('MR ', 'MS ', 'DR ', 'MRS') THEN LTRIM(RTRIM(SUBSTRING(TEST_DATA.FULL_NAME, 4, LEN(TEST_DATA.FULL_NAME)))) ELSE LTRIM(RTRIM(TEST_DATA.FULL_NAME)) END AS REST_OF_NAME, TEST_DATA.ORIGINAL_INPUT_DATA FROM ( SELECT -- Trim leading & trailing spaces before trying to process -- Disallow extra spaces *within* the name REPLACE(REPLACE(LTRIM(RTRIM(FULL_NAME)), ' ', ' '), ' ', ' ') AS FULL_NAME, FULL_NAME AS ORIGINAL_INPUT_DATA FROM ( -- Replace this block with your actual table SELECT 'GEORGE W BUSH' AS FULL_NAME UNION SELECT 'SUSAN B ANTHONY' AS FULL_NAME UNION SELECT 'ALEXANDER HAMILTON' AS FULL_NAME UNION SELECT 'OSAMA BIN LADEN JR' AS FULL_NAME UNION SELECT 'MARTIN J VAN BUREN SENIOR III' AS FULL_NAME UNION SELECT 'TOMMY' AS FULL_NAME UNION SELECT 'BILLY' AS FULL_NAME UNION SELECT NULL AS FULL_NAME UNION SELECT ' ' AS FULL_NAME UNION SELECT ' JOHN JACOB SMITH' AS FULL_NAME UNION SELECT ' DR SANJAY GUPTA' AS FULL_NAME UNION SELECT 'DR JOHN S HOPKINS' AS FULL_NAME UNION SELECT ' MRS SUSAN ADAMS' AS FULL_NAME UNION SELECT ' MS AUGUSTA ADA KING ' AS FULL_NAME ) RAW_DATA ) TEST_DATA ) TITLE ) FIRST_NAME;
以上是如何從 SQL 中的單一「全名」欄位有效解析名字、中間名和姓氏,處理各種資料不一致和特殊情況?的詳細內容。更多資訊請關注PHP中文網其他相關文章!