True Alphanumeric / natural sorting in MySQL - why is the answer always recursion?

Yesterday I attempted to solve alphanumeric sorting in MySQL and failed. (read that article here)

I did get close though and had the right concept, just wrong execution.

Today, I woke up and had an epiphany...recursion.

The problem with recursion is that you have to understand recursion to be able to do recursion...and I don't understand recursion enough to do recursion in MySQL.

However with a bit of Chat Gippity back and forth (by which I mean getting it to write what I asked for, getting back about 25% of what I asked for, fixing it and feeding it into a new chat so it doesn't keep repeating itself for about 2 hours) I got a working answer!

To the point

May I present to you my swan song, my masterpiece, the answer to life itself (ok fine, the only working solution to true alphanumeric sorting in MySQL I have seen).

WITH RECURSIVE process_numbers AS (
    SELECT 
        data_value,
        data_value AS remaining_data,
        CAST('' AS CHAR(20000)) AS processed_data,
        1 AS iteration
    FROM test_data

    UNION ALL

    SELECT
        data_value,
        CASE 
            WHEN LOCATE(REGEXP_SUBSTR(remaining_data, '[0-9]+'), remaining_data) > 0 THEN
                SUBSTRING(
                    remaining_data,
                    LOCATE(REGEXP_SUBSTR(remaining_data, '[0-9]+'), remaining_data)
                    + LENGTH(REGEXP_SUBSTR(remaining_data, '[0-9]+'))
                )
            ELSE '' 
        END AS remaining_data,

        CONCAT(
            processed_data,
            CASE 
                WHEN LOCATE(REGEXP_SUBSTR(remaining_data, '[0-9]+'), remaining_data) > 0 THEN
                    LEFT(remaining_data, LOCATE(REGEXP_SUBSTR(remaining_data, '[0-9]+'), remaining_data) - 1)
                ELSE remaining_data
            END,
            CASE
                WHEN REGEXP_SUBSTR(remaining_data, '[0-9]+') IS NOT NULL THEN
                    RIGHT(CONCAT('0000000000', REGEXP_SUBSTR(remaining_data, '[0-9]+')), 10)
                ELSE ''
            END
        ) AS processed_data,

        iteration + 1
    FROM process_numbers
    WHERE LENGTH(remaining_data) > 0
          AND iteration 
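
The query above is cut off after `AND iteration` in this copy, so the cap value and the final SELECT are missing. As a rough sketch only, and not the original ending, here is one way the whole thing could be completed. The cap of 100, the sample rows, the switch to LPAD and CHAR_LENGTH, and the final SELECT are all assumptions; it needs MySQL 8.0 or later for WITH RECURSIVE and REGEXP_SUBSTR.

```sql
-- Sample data: table and column names follow the query above; the rows are made up.
CREATE TABLE test_data (data_value VARCHAR(255) NOT NULL);
INSERT INTO test_data (data_value) VALUES
    ('item11'), ('item2'), ('item1'), ('a10b3'), ('a10b12'), ('a9b1'), ('plain');

WITH RECURSIVE process_numbers AS (
    -- Anchor row: nothing processed yet, the whole string is still remaining.
    SELECT
        data_value,
        data_value AS remaining_data,
        CAST('' AS CHAR(20000)) AS processed_data,
        1 AS iteration
    FROM test_data

    UNION ALL

    -- Recursive step: consume the text up to and including the next run of digits.
    SELECT
        data_value,
        CASE
            WHEN REGEXP_SUBSTR(remaining_data, '[0-9]+') IS NOT NULL THEN
                SUBSTRING(
                    remaining_data,
                    LOCATE(REGEXP_SUBSTR(remaining_data, '[0-9]+'), remaining_data)
                    + CHAR_LENGTH(REGEXP_SUBSTR(remaining_data, '[0-9]+'))
                )
            ELSE ''
        END,
        CONCAT(
            processed_data,
            CASE
                WHEN REGEXP_SUBSTR(remaining_data, '[0-9]+') IS NOT NULL THEN
                    CONCAT(
                        -- text in front of the digits, copied as-is
                        LEFT(remaining_data,
                             LOCATE(REGEXP_SUBSTR(remaining_data, '[0-9]+'), remaining_data) - 1),
                        -- the digits, left-padded with zeros to 10 characters
                        LPAD(REGEXP_SUBSTR(remaining_data, '[0-9]+'), 10, '0')
                    )
                ELSE remaining_data  -- no digits left: append the rest unchanged
            END
        ),
        iteration + 1
    FROM process_numbers
    WHERE CHAR_LENGTH(remaining_data) > 0
      AND iteration < 100  -- assumed cap; the original value is lost in this copy
)
SELECT
    data_value,
    CONCAT(processed_data, remaining_data) AS sort_key
FROM process_numbers
-- Keep only the last row of each chain: fully processed, or stopped by the cap.
WHERE CHAR_LENGTH(remaining_data) = 0 OR iteration = 100
ORDER BY sort_key;
```

With these sample rows the sort keys put a9b1 before a10b3 and a10b12, and item1, item2, item11 in that order, with plain last.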



And if you want to try it out (and try to break it) you can play with this DB fiddle.

So how does this work?

It does what I originally wanted to do: it takes every group of digits and pads it out to 10 digits in total.

So obviously if you feed this a couple of strings with 11 consecutive numeric digits it won't work without adjustment, but other than that it works fine!

You see, MySQL can sort digits correctly even when it is comparing strings lexicographically, but it has one flaw.

It counts "11" as smaller than "2" because it compares strings one character at a time (effectively). The first character "1" is smaller than "2", so "11" comes first. It never gets to compare the numbers as whole values, so the sorting is incorrect (for numbers at least).

To understand this better, imagine if 1 was actually the letter "b" and 2 was the letter "c".

That is how MySQL "sees" numbers, they are just another character.

So if I had "bb" and "c" you would expect "bb" to come before "c". Now swap the numbers back in and you can see why "11" comes before "2".
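
To see the character-by-character comparison in action, this small query sorts the same three values once as strings and once as numbers. The inline values are made up purely for illustration.

```sql
-- Sorted as strings: compared one character at a time.
SELECT v
FROM (SELECT '2' AS v UNION ALL SELECT '11' UNION ALL SELECT '1') AS t
ORDER BY v;
-- returns '1', '11', '2'

-- Sorted as numbers: compared by value.
SELECT v
FROM (SELECT '2' AS v UNION ALL SELECT '11' UNION ALL SELECT '1') AS t
ORDER BY CAST(v AS UNSIGNED);
-- returns '1', '2', '11'
```

The CAST trick only helps when the whole value is a number, which is why mixed strings such as "item11" need the padding approach instead.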

So this is a hack?

Yes, we remove the issue by moving the numbers "back" through padding.

Going back to our example, if we padded "11" and "2" to a length of 3 and used "a" as 0, this is what happens:

011 = abb
002 = aac

Notice how the sorting would now go:

  • character 1: is "a" bigger than "a"? No, they are the same.
  • character 2: is "b" bigger than "a"? Yes, so put the "a" before the "b".
  • character 3: is now irrelevant, as we have already found an earlier character that was different and larger.

So by that logic we now have:

002 = aac (the second "a" comes before the second "b" in the next row)
011 = abb

And that is how it works!
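
The same idea with real digits, as a tiny illustration: LPAD pads each value with zeros to a width of 3 (an arbitrary width chosen to match the example), and sorting on the padded string gives the numeric order.

```sql
SELECT v, LPAD(v, 3, '0') AS padded
FROM (SELECT '11' AS v UNION ALL SELECT '2') AS t
ORDER BY padded;
-- returns ('2', '002') first, then ('11', '011')
```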

Are you going to explain the recursion thing?

Kind of. I have been "round the houses" with this one and my knowledge is surface level, but I will give it a go.

The problem came with how RegEx works in MySQL. REGEXP_SUBSTR only ever returns one match per call (the first one, unless you tell it otherwise), so it kept handing back the same number for every match in the string. That was why my solution from yesterday was not working correctly.

But REGEXP_REPLACE has its own issue: it doesn't seem to expose the string length of a match correctly (so we can't LPAD with it correctly).

That is why I thought about recursion as the answer.

I can use REGEXP_SUBSTR to get the correct padding behaviour, and as each loop through the RegEx is essentially a new function call it doesn't "remember" the previous match, so it solves that problem.
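
A quick way to see this behaviour for yourself, using a made-up string: with its default arguments REGEXP_SUBSTR returns the first match every time, and it only moves on once the part already handled has been cut off the front of the string. The last column uses the optional position and occurrence arguments, which exist in MySQL 8.0 but need the occurrence number to be known up front.

```sql
SELECT
    REGEXP_SUBSTR('a1b22c333', '[0-9]+')       AS first_call,        -- '1'
    REGEXP_SUBSTR('a1b22c333', '[0-9]+')       AS second_call,       -- '1' again
    REGEXP_SUBSTR('b22c333',   '[0-9]+')       AS after_trimming,    -- '22'
    REGEXP_SUBSTR('a1b22c333', '[0-9]+', 1, 2) AS second_occurrence; -- '22'
```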

And if you want a brief step-through of the logic, it is actually not as scary as it looks!

  • We loop over a given string, looking for any numbers (the entire number, not just a single character).
  • We then remove that number from remaining_data so we don't match it again.
  • We take the number we just matched and pad it out to be 10 digits long in total.
  • We then search for the next numeric part in the string and repeat the process, building up processed_data as our final string.
  • Finally, once we have no more numbers to process, we add any remaining letters to the end of processed_data to complete the transformation, and we return this as sort_key.
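
The steps above can be traced for a single pass with plain expressions. This is only an illustration on a made-up input ('a10b3'); the column aliases are hypothetical, and LPAD and CHAR_LENGTH stand in for the RIGHT(CONCAT(...)) and LENGTH calls in the query.

```sql
SELECT
    -- the next whole number in the string
    REGEXP_SUBSTR(remaining, '[0-9]+') AS matched_number,          -- '10'
    -- the text in front of it, copied across unchanged
    LEFT(remaining,
         LOCATE(REGEXP_SUBSTR(remaining, '[0-9]+'), remaining) - 1) AS text_before,  -- 'a'
    -- the number padded to 10 digits
    LPAD(REGEXP_SUBSTR(remaining, '[0-9]+'), 10, '0') AS padded_number,  -- '0000000010'
    -- what is left for the next pass
    SUBSTRING(remaining,
              LOCATE(REGEXP_SUBSTR(remaining, '[0-9]+'), remaining)
              + CHAR_LENGTH(REGEXP_SUBSTR(remaining, '[0-9]+'))) AS new_remaining  -- 'b3'
FROM (SELECT 'a10b3' AS remaining) AS t;
```

The next pass would run the same expressions on 'b3', and the pass after that finds no digits and stops.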

Then we can use this sort_key in our query to order the column correctly.

And the iteration part is purely a safeguarding tool, to make sure it doesn't completely run the MySQL server out of memory or crash the query if a sufficiently complex string is processed (or there is an error in the logic that means it would recurse forever).
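
Separately from the iteration column, MySQL has its own limit on recursive CTEs: the cte_max_recursion_depth system variable, which defaults to 1000. Hitting that limit makes the query fail with an error, whereas a WHERE condition on an iteration counter just stops producing rows. A minimal sketch of checking and tightening it for the current session; the value 200 is an arbitrary example.

```sql
-- Show the current limit for recursive CTEs.
SHOW VARIABLES LIKE 'cte_max_recursion_depth';

-- Lower it for this session only (200 is an arbitrary example value).
SET SESSION cte_max_recursion_depth = 200;
```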

That's a wrap!

Isn't it funny how sleeping on things brings new perspective?

Perhaps I should try out polyphasic sleep so I can sleep on problems 2-3 times more often each day and become a 10x developer? haha.

Anyway there you have it, a reasonably robust true alphanumeric sort.

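Oh, and in reality you should probably store the sort_key in its own column in your database, filled in by a stored procedure or function (a GENERATED column sounds tempting, but MySQL does not allow subqueries or stored functions in generated column expressions). Sadly the playground I use doesn't seem to support that and it is a Sunday so I will leave that to you, dear viewer!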
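
For anyone who wants a starting point, here is one possible sketch of a stored sort key. It uses an ordinary column kept in sync by triggers rather than a generated column, and a loop in a stored function instead of the recursive CTE. The names natural_sort_key, sort_key and the trigger names are hypothetical, the column sizes are assumptions, and DELIMITER is a mysql client command that some tools and playgrounds do not accept.

```sql
DELIMITER //

-- Hypothetical helper: the same padding logic as the CTE, written as a loop.
CREATE FUNCTION natural_sort_key(src_value VARCHAR(255))
RETURNS VARCHAR(2000)
DETERMINISTIC
NO SQL
BEGIN
    DECLARE remaining VARCHAR(255) DEFAULT src_value;
    DECLARE result VARCHAR(2000) DEFAULT '';
    DECLARE num VARCHAR(255);
    DECLARE pos INT;

    digit_loop: LOOP
        SET num = REGEXP_SUBSTR(remaining, '[0-9]+');
        IF num IS NULL THEN
            LEAVE digit_loop;  -- no digits left
        END IF;
        SET pos = LOCATE(num, remaining);
        -- copy the text before the number, then the number padded to 10 digits
        SET result = CONCAT(result, LEFT(remaining, pos - 1), LPAD(num, 10, '0'));
        SET remaining = SUBSTRING(remaining, pos + CHAR_LENGTH(num));
    END LOOP;

    RETURN CONCAT(result, remaining);  -- append any trailing text
END//

DELIMITER ;

-- Plain column to hold the key.
ALTER TABLE test_data ADD COLUMN sort_key VARCHAR(2000);

-- Keep the key up to date on every insert and update.
CREATE TRIGGER test_data_sort_key_insert
BEFORE INSERT ON test_data
FOR EACH ROW
    SET NEW.sort_key = natural_sort_key(NEW.data_value);

CREATE TRIGGER test_data_sort_key_update
BEFORE UPDATE ON test_data
FOR EACH ROW
    SET NEW.sort_key = natural_sort_key(NEW.data_value);

-- Backfill the rows that already exist.
UPDATE test_data SET sort_key = natural_sort_key(data_value);
```

After that, `SELECT data_value FROM test_data ORDER BY sort_key;` can sort without recomputing the key on every query.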

Have a great rest of your weekend and a great week.
