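How can you efficiently parse first, middle, and last names from a single "full name" column in SQL while handling data inconsistencies and special cases?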

Barbara Streisand
2024-12-30 09:49:11

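Parsing First, Middle, and Last Names from a Full-Name Column with SQL

When working with data, it is often necessary to split a name into components that are easier to work with. Here the task is to extract the first, middle, and last name from a single "full name" column while allowing for common variations in the data.

An Efficient Solution That Is About 90% Accurate

The following example is a practical solution that handles most cases accurately. It is written for SQL Server: CHARINDEX and LEN are T-SQL functions, so other dialects need their own equivalents.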

SELECT
  FIRST_NAME.ORIGINAL_INPUT_DATA,
  FIRST_NAME.TITLE,
  FIRST_NAME.FIRST_NAME,
  CASE
    WHEN 0 = CHARINDEX(' ', FIRST_NAME.REST_OF_NAME)
    THEN NULL  -- No more spaces? Assume rest is last name
    ELSE SUBSTRING(FIRST_NAME.REST_OF_NAME, 1, CHARINDEX(' ', FIRST_NAME.REST_OF_NAME) - 1)
  END AS MIDDLE_NAME,
  SUBSTRING(FIRST_NAME.REST_OF_NAME, 1 + CHARINDEX(' ', FIRST_NAME.REST_OF_NAME), LEN(FIRST_NAME.REST_OF_NAME)) AS LAST_NAME
FROM
  (
    SELECT
      TITLE.TITLE,
      CASE
        WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME)
        THEN TITLE.REST_OF_NAME -- No space? Return the whole thing
        ELSE SUBSTRING(TITLE.REST_OF_NAME, 1, CHARINDEX(' ', TITLE.REST_OF_NAME) - 1)
      END AS FIRST_NAME,
      CASE
        WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME)
        THEN NULL  -- No spaces at all? Then 1st name is all we have
        ELSE SUBSTRING(TITLE.REST_OF_NAME, CHARINDEX(' ', TITLE.REST_OF_NAME) + 1, LEN(TITLE.REST_OF_NAME))
      END AS REST_OF_NAME,
      TITLE.ORIGINAL_INPUT_DATA
    FROM
      (
        SELECT
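          -- If the name starts with one of these prefixes followed by a space,
          -- pull the prefix out as a "title". Otherwise return NULL for title.
          -- Requiring the space keeps a value that merely begins with the
          -- letters MRS from being read as a title.
          CASE
            WHEN TEST_DATA.FULL_NAME LIKE 'MR %'
              OR TEST_DATA.FULL_NAME LIKE 'MS %'
              OR TEST_DATA.FULL_NAME LIKE 'DR %'
              OR TEST_DATA.FULL_NAME LIKE 'MRS %'
            THEN SUBSTRING(TEST_DATA.FULL_NAME, 1, CHARINDEX(' ', TEST_DATA.FULL_NAME) - 1)
            ELSE NULL
          END AS TITLE,
          -- If you change the list, don't forget to change it here, too.
          CASE
            WHEN TEST_DATA.FULL_NAME LIKE 'MR %'
              OR TEST_DATA.FULL_NAME LIKE 'MS %'
              OR TEST_DATA.FULL_NAME LIKE 'DR %'
              OR TEST_DATA.FULL_NAME LIKE 'MRS %'
            THEN LTRIM(RTRIM(SUBSTRING(TEST_DATA.FULL_NAME, CHARINDEX(' ', TEST_DATA.FULL_NAME) + 1, LEN(TEST_DATA.FULL_NAME))))
            ELSE LTRIM(RTRIM(TEST_DATA.FULL_NAME))
          END AS REST_OF_NAME,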
          TEST_DATA.ORIGINAL_INPUT_DATA
        FROM
          (
            SELECT
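              -- Trim leading & trailing spaces before trying to process
              -- Collapse runs of spaces *within* the name to a single space:
              -- ' ' -> '<>', delete every '><', then '<>' -> ' '
              -- (assumes the data never contains '<' or '>')
              REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(FULL_NAME)), ' ', '<>'), '><', ''), '<>', ' ') AS FULL_NAME,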
              FULL_NAME AS ORIGINAL_INPUT_DATA
            FROM
              (
                -- Replace this block with your actual table
                SELECT 'GEORGE W BUSH' AS FULL_NAME
                UNION SELECT 'SUSAN B ANTHONY' AS FULL_NAME
                UNION SELECT 'ALEXANDER HAMILTON' AS FULL_NAME
                UNION SELECT 'OSAMA BIN LADEN JR' AS FULL_NAME
                UNION SELECT 'MARTIN J VAN BUREN SENIOR III' AS FULL_NAME
                UNION SELECT 'TOMMY' AS FULL_NAME
                UNION SELECT 'BILLY' AS FULL_NAME
              ) RAW_DATA
          ) TEST_DATA
      ) TITLE
  ) FIRST_NAME;

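This query recognizes prefixes such as "MR", "MS", "DR", and "MRS", strips them from the name, and returns them in a separate TITLE column. It also handles missing name parts, names with multiple spaces, and single-part full names (first name only). Everything after the second word is returned as the last name, so a suffix such as JR stays attached to LAST_NAME.

Handling Special Cases

The solution also covers several special cases: a NULL or blank full name, leading or trailing spaces, multiple consecutive spaces, and a full name that contains only a first name. The version below adds test rows for each: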
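The building block that the query repeats at each nesting level is "split at the first space". As a minimal sketch of that step on its own (the variable @name and its sample value are assumptions for illustration, not part of the original query):

```sql
-- Minimal sketch of the "split at the first space" step (T-SQL).
-- @name is a hypothetical sample value used only for this illustration.
DECLARE @name VARCHAR(100) = 'SUSAN B ANTHONY';

SELECT
  -- Everything before the first space, or the whole string if there is none
  CASE
    WHEN CHARINDEX(' ', @name) = 0 THEN @name
    ELSE SUBSTRING(@name, 1, CHARINDEX(' ', @name) - 1)
  END AS FIRST_TOKEN,
  -- Everything after the first space, or NULL if there is none
  CASE
    WHEN CHARINDEX(' ', @name) = 0 THEN NULL
    ELSE SUBSTRING(@name, CHARINDEX(' ', @name) + 1, LEN(@name))
  END AS REMAINDER;
-- FIRST_TOKEN = 'SUSAN', REMAINDER = 'B ANTHONY'
```

Applying the same split to REMAINDER yields the middle name and the last name, which is why the full query is written as nested derived tables.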

-- Handle the following special cases:
-- 1 - The NAME field is NULL
-- 2 - The NAME field contains leading / trailing spaces
-- 3 - The NAME field has > 1 consecutive space within the name
-- 4 - The NAME field contains ONLY the first name
-- 5 - Include the original full name in the final output as a separate column, for readability
-- 6 - Handle a specific list of prefixes as a separate "title" column

SELECT
  FIRST_NAME.ORIGINAL_INPUT_DATA,
  FIRST_NAME.TITLE,
  FIRST_NAME.FIRST_NAME,
  CASE
    WHEN 0 = CHARINDEX(' ', FIRST_NAME.REST_OF_NAME)
    THEN NULL  -- No more spaces? Assume rest is last name
    ELSE SUBSTRING(FIRST_NAME.REST_OF_NAME, 1, CHARINDEX(' ', FIRST_NAME.REST_OF_NAME) - 1)
  END AS MIDDLE_NAME,
  SUBSTRING(FIRST_NAME.REST_OF_NAME, 1 + CHARINDEX(' ', FIRST_NAME.REST_OF_NAME), LEN(FIRST_NAME.REST_OF_NAME)) AS LAST_NAME
FROM
  (
    SELECT
      TITLE.TITLE,
      CASE
        WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME)
        THEN TITLE.REST_OF_NAME -- No space? Return the whole thing
        ELSE SUBSTRING(TITLE.REST_OF_NAME, 1, CHARINDEX(' ', TITLE.REST_OF_NAME) - 1)
      END AS FIRST_NAME,
      CASE
        WHEN 0 = CHARINDEX(' ', TITLE.REST_OF_NAME)
        THEN NULL  -- No spaces at all? Then 1st name is all we have
        ELSE SUBSTRING(TITLE.REST_OF_NAME, CHARINDEX(' ', TITLE.REST_OF_NAME) + 1, LEN(TITLE.REST_OF_NAME))
      END AS REST_OF_NAME,
      TITLE.ORIGINAL_INPUT_DATA
    FROM
      (
        SELECT
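          -- If the name starts with one of these prefixes followed by a space,
          -- pull the prefix out as a "title". Otherwise return NULL for title.
          -- Requiring the space keeps a value that merely begins with the
          -- letters MRS from being read as a title.
          CASE
            WHEN TEST_DATA.FULL_NAME LIKE 'MR %'
              OR TEST_DATA.FULL_NAME LIKE 'MS %'
              OR TEST_DATA.FULL_NAME LIKE 'DR %'
              OR TEST_DATA.FULL_NAME LIKE 'MRS %'
            THEN SUBSTRING(TEST_DATA.FULL_NAME, 1, CHARINDEX(' ', TEST_DATA.FULL_NAME) - 1)
            ELSE NULL
          END AS TITLE,
          -- If you change the list, don't forget to change it here, too.
          CASE
            WHEN TEST_DATA.FULL_NAME LIKE 'MR %'
              OR TEST_DATA.FULL_NAME LIKE 'MS %'
              OR TEST_DATA.FULL_NAME LIKE 'DR %'
              OR TEST_DATA.FULL_NAME LIKE 'MRS %'
            THEN LTRIM(RTRIM(SUBSTRING(TEST_DATA.FULL_NAME, CHARINDEX(' ', TEST_DATA.FULL_NAME) + 1, LEN(TEST_DATA.FULL_NAME))))
            ELSE LTRIM(RTRIM(TEST_DATA.FULL_NAME))
          END AS REST_OF_NAME,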
          TEST_DATA.ORIGINAL_INPUT_DATA
        FROM
          (
            SELECT
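              -- Trim leading & trailing spaces before trying to process
              -- Collapse runs of spaces *within* the name to a single space:
              -- ' ' -> '<>', delete every '><', then '<>' -> ' '
              -- (assumes the data never contains '<' or '>')
              REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(FULL_NAME)), ' ', '<>'), '><', ''), '<>', ' ') AS FULL_NAME,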
              FULL_NAME AS ORIGINAL_INPUT_DATA
            FROM
              (
                -- Replace this block with your actual table
                SELECT 'GEORGE W BUSH' AS FULL_NAME
                UNION SELECT 'SUSAN B ANTHONY' AS FULL_NAME
                UNION SELECT 'ALEXANDER HAMILTON' AS FULL_NAME
                UNION SELECT 'OSAMA BIN LADEN JR' AS FULL_NAME
                UNION SELECT 'MARTIN J VAN BUREN SENIOR III' AS FULL_NAME
                UNION SELECT 'TOMMY' AS FULL_NAME
                UNION SELECT 'BILLY' AS FULL_NAME
                UNION SELECT NULL AS FULL_NAME
                UNION SELECT ' ' AS FULL_NAME
                UNION SELECT '    JOHN  JACOB     SMITH' AS FULL_NAME
                UNION SELECT ' DR  SANJAY       GUPTA' AS FULL_NAME
                UNION SELECT 'DR JOHN S HOPKINS' AS FULL_NAME
                UNION SELECT ' MRS  SUSAN ADAMS' AS FULL_NAME
                UNION SELECT ' MS AUGUSTA  ADA   KING ' AS FULL_NAME      
              ) RAW_DATA
          ) TEST_DATA
      ) TITLE
  ) FIRST_NAME;
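
Collapsing runs of spaces is the step most sensitive to messy input, because any leftover double space shifts every later split by one position. As a rough sketch, the comparison below contrasts two nested double-space replacements with a marker-based variant. The variable @raw is a hypothetical sample, and the marker approach assumes the characters '<' and '>' never occur in the data.

```sql
-- Sketch: collapse any run of spaces to a single space (T-SQL).
-- @raw is a hypothetical sample value; '<' and '>' are assumed absent from the data.
DECLARE @raw VARCHAR(100) = '    JOHN  JACOB     SMITH';

SELECT
  -- Two passes of "two spaces -> one space": reliable only for runs of up to four spaces
  REPLACE(REPLACE(LTRIM(RTRIM(@raw)), '  ', ' '), '  ', ' ') AS TWO_PASS,
  -- Marker-based: ' ' -> '<>', delete every '><', then '<>' -> ' '
  REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@raw)), ' ', '<>'), '><', ''), '<>', ' ') AS MARKER_BASED;
-- TWO_PASS     = 'JOHN JACOB  SMITH'  (two spaces remain before SMITH)
-- MARKER_BASED = 'JOHN JACOB SMITH'
```

The marker-based form works for runs of any length because every space becomes the pair '<>' and adjacent pairs cancel where they meet, leaving exactly one pair per run.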
