Home >Database >Mysql Tutorial >How to Improve MySQL's ORDER BY RAND() Performance for Large Datasets?

How to Improve MySQL's ORDER BY RAND() Performance for Large Datasets?

Mary-Kate Olsen
Mary-Kate OlsenOriginal
2024-12-18 08:11:14384browse

How to Improve MySQL's ORDER BY RAND() Performance for Large Datasets?

How to Enhance MySQL's ORDER BY RAND() Performance

The ORDER BY RAND() function in MySQL can prove inefficient for large datasets, leading to slow query execution. To address this issue, it is crucial to dive into MySQL's slow query log for insights.

Inefficiency with ORDER BY RAND()

Queries containing ORDER BY RAND() often dominate slow query logs. The proposed solution provided by MySQLPerformanceBlog may suffice under specific conditions. However, poorly optimized or user-managed tables present challenges that require more effective measures.

Solution: Avoiding Sorting

The most optimal solution lies in avoiding sorting altogether. We can achieve this by employing a technique that calculates a row's probability of selection. Here's a query that harnesses this approach:

SELECT  *
FROM    (
        SELECT  @cnt := COUNT(*) + 1,
                @lim := 10
        FROM    t_random
        ) vars
STRAIGHT_JOIN
        (
        SELECT  r.*,
                @lim := @lim - 1
        FROM    t_random r
        WHERE   (@cnt := @cnt - 1)
                AND RAND(20090301) < @lim / @cnt
        ) i

This technique is highly efficient in MyISAM databases and provides significant performance improvements in InnoDB as well.

Selecting a Single Random Record

For scenarios involving the selection of a single random record, consider the following query:

SELECT  aco.*
FROM    (
        SELECT  minid + FLOOR((maxid - minid) * RAND()) AS randid
        FROM    (
                SELECT  MAX(ac_id) AS maxid, MIN(ac_id) AS minid
                FROM    accomodation
                ) q
        ) q2
JOIN    accomodation aco
ON      aco.ac_id =
        COALESCE
        (
        (
        SELECT  accomodation.ac_id
        FROM    accomodation
        WHERE   ac_id > randid
                AND ac_status != 'draft'
                AND ac_images != 'b:0;'
                AND NOT EXISTS
                (
                SELECT  NULL
                FROM    accomodation_category
                WHERE   acat_id = ac_category
                        AND acat_slug = 'vendeglatohely'
                )
        ORDER BY
                ac_id
        LIMIT   1
        ),
        (
        SELECT  accomodation.ac_id
        FROM    accomodation
        WHERE   ac_status != 'draft'
                AND ac_images != 'b:0;'
                AND NOT EXISTS
                (
                SELECT  NULL
                FROM    accomodation_category
                WHERE   acat_id = ac_category
                        AND acat_slug = 'vendeglatohely'
                )
        ORDER BY
                ac_id
        LIMIT   1
        )
        )

This query assumes an even distribution of ac_id values.

The above is the detailed content of How to Improve MySQL's ORDER BY RAND() Performance for Large Datasets?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn