search

Home  >  Q&A  >  body text

Is storing delimited lists in database columns really that bad?

<p>Imagine a web form with a set of checkboxes (any or all of which can be selected). I chose to save them in a comma separated list of values ​​stored in a column in the database table. </p> <p>Now, I know that the correct solution is to create a second table and normalize the database correctly. It's faster to implement a simple solution and I want to get a proof of concept of the application quickly without having to spend too much time on it. </p> <p>I think the time savings and simpler code are worth it in my case, is this a reasonable design choice or should I just standardize this from the start? </p> <p>For more context, this is a small internal application that essentially replaces an Excel file stored in a shared folder. I'm also asking this question because I'm thinking of cleaning up the program and making it easier to maintain. There are some things about it that I'm not too happy with, one of which is the subject of this question. </p>
P粉020556231P粉020556231481 days ago498

reply all(2)I'll reply

  • P粉948258958

    P粉9482589582023-08-28 11:49:53

    "One of the reasons is laziness."

    This is a wake-up call. The only reason you should do something like this is if you know how to do it "the right way", but you conclude there is a solid reason not to do it.

    Having said that: If the data you choose to store this way is data you will never need to query, then there may be a case for storing it the way you choose.

    (Some users will take issue with my statement in the previous paragraph, saying "you never know what requirements may be added in the future". These users are either misinformed or state religious beliefs. Sometimes, hard work is advantageous in front of you Require.)

    reply
    0
  • P粉545956597

    P粉5459565972023-08-28 09:02:45

    In addition to violating First Normal Form, there are many other more practical problems with duplicate group value columns, comma separated lists stored in a single value:

    • Unable to ensure that each value is the correct data type: unable to prevent 1,2,3,banana,5
    • Cannot use foreign key constraints to link values ​​to a lookup table; cannot enforce referential integrity.
    • Unable to enforce uniqueness: Unable to block1,2,3,3,3,5
    • You cannot remove a value from a list without getting the entire list.
    • The length of the stored list cannot exceed the length of the string column.
    • It's difficult to search for all entities in a list with a given value; you have to use an inefficient table scan. You may have to resort to regular expressions, for example in MySQL:
      idlist REGEXP '[[:<:]]2[[:>:]]' Or in MySQL 8.0: idlist REGEXP '\\b2\\b'
    • It is difficult to count elements in a list, or perform other aggregation queries.
    • It is difficult to connect values ​​to the lookup tables they reference.
    • It is difficult to get the list in sorted order.
    • It is difficult to choose a separator that is guaranteed not to appear in the value

    To solve these problems, you have to write a lot of application code and reinvent the more efficient features that RDBMS already provides.

    The comma-separated list is wrong, and I made it the first chapter in my book: SQL Anti-Patterns, Volume 1: Avoiding the Pitfalls of Database Programming.

    Sometimes you need to denormalize, but as @OMG Ponies mentioned, these are exceptions. Any non-relational "optimization" will benefit one type of query at the expense of other uses of the data, so make sure you know which queries need special handling so that they are worthy of denormalization.

    reply
    0
  • Cancelreply