
How to Efficiently Perform Simple Random Sampling in MySQL?

Patricia Arquette
2025-01-05 16:03:43

Efficient Simple Random Sampling in MySQL

Many applications need to extract a simple random sample from a large database table. The intuitive query, SELECT * FROM table ORDER BY RAND() LIMIT 10000, can be prohibitively slow on tables with millions of rows, because MySQL must generate a random value for every row and then sort the entire table before applying the limit.

Faster Solution

A more efficient approach is to draw a random number for each row with RAND() and keep only the rows whose number falls at or below a threshold:

-- "table" is a placeholder; substitute your own table name
SELECT * FROM table WHERE RAND() <= 0.3;

How It Works

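RAND() is evaluated once per row and returns a value in the range [0, 1). A row is kept when its value is less than or equal to 0.3, so each row has a 30% chance of being selected, independently of the others. The sample size is therefore approximately, not exactly, 30% of the table; lower or raise the threshold to change the expected size.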
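The per-row filter can be illustrated outside the database. The following is a minimal Python sketch, not MySQL's implementation: the function name bernoulli_sample and the list of integers standing in for table rows are assumptions made for illustration only.

```python
import random


def bernoulli_sample(rows, fraction, rng):
    """Keep each row independently with probability `fraction`.

    Mirrors the idea behind WHERE RAND() <= fraction:
    one random draw per row, a single pass, and no sorting.
    """
    return [row for row in rows if rng.random() <= fraction]


rng = random.Random(42)        # fixed seed only so the run is repeatable
rows = list(range(100_000))    # hypothetical stand-in for table rows
sample = bernoulli_sample(rows, 0.3, rng)

# The size lands near 30,000 but varies from run to run with the seed.
print(len(sample))
```

Running the sketch with different seeds shows the sample size fluctuating around 30% of the input, which is the trade-off this method makes in exchange for avoiding a sort.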

Advantages

  • O(n) complexity: a single pass over the table, with no sorting
  • Uses MySQL's built-in RAND() function, so it requires no schema changes

Improved Version

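For even greater efficiency, first narrow the table to roughly 2-5x your desired sample size, then sort only those rows randomly and trim the result to the desired size. The query below assumes the table has an indexed column, here named frozen_rand, that stores a precomputed random number between 0 and 1 for each row; rand_low and rand_high bound a slice of that column wide enough to hold the oversampled rows: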

SELECT COUNT(*) FROM table; -- Use the row count to choose rand_low and rand_high

SELECT *
FROM table
WHERE frozen_rand BETWEEN %(rand_low)s AND %(rand_high)s
ORDER BY RAND()
LIMIT 1000;

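This method uses an index range scan on frozen_rand to shrink the candidate set before sorting, so ORDER BY RAND() only has to sort a few thousand rows instead of the whole table. That makes it suitable for large tables.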
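The article does not say how to turn the row count into rand_low and rand_high. One possible way is sketched below in Python, assuming frozen_rand values are spread uniformly over [0, 1). The function name frozen_rand_window and its oversample parameter are hypothetical, and the window start is chosen at random so that repeated calls do not always read the same slice of the table.

```python
import random


def frozen_rand_window(total_rows, sample_size, oversample=3.0, rng=None):
    """Return (rand_low, rand_high) covering about oversample * sample_size rows.

    Assumes frozen_rand is uniform on [0, 1), so a window of width w
    matches roughly w * total_rows rows.
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    rng = rng or random.Random()
    # Fraction of the table to scan, capped at the whole table.
    width = min(1.0, oversample * sample_size / total_rows)
    # Random start position so the window still fits inside [0, 1].
    rand_low = rng.uniform(0.0, 1.0 - width)
    return rand_low, rand_low + width


# Example: 1,000,000 rows (from SELECT COUNT(*)), 1000 rows wanted.
low, high = frozen_rand_window(1_000_000, 1000, oversample=3.0)

# Dictionary keys match the %(rand_low)s / %(rand_high)s placeholders.
params = {"rand_low": low, "rand_high": high}
print(params)
```

The params dictionary matches the named placeholders in the query above, which is the style accepted by Python database drivers that use the "pyformat" parameter convention. An oversample value between 2 and 5 follows the article's suggestion; a larger value lowers the risk of the window holding fewer rows than the LIMIT, at the cost of sorting more rows.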
