Efficiently select random samples from SQL Server tables
Retrieving a random sample of rows from a large SQL Server table is a common need. One approach is to add a "random number" column in a temporary table, populate it with random values, and then select the rows whose random number falls within the desired range. However, this takes several steps and an extra write pass over the data.
A more direct method is to use the NEWID() function. NEWID() returns a new uniqueidentifier (GUID) value each time it is called, and in a query it is evaluated once per row. By sorting the table by the output of NEWID(), you effectively randomize the row order.
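To see the per-row behavior, a quick check along these lines may help. It is only an illustrative sketch: [yourtable] is the article's placeholder table name, and the column alias sort_key is a hypothetical name chosen for this example.

<code class="language-sql">-- Each returned row carries its own freshly generated GUID.
-- sort_key is a hypothetical alias used only for illustration.
SELECT TOP (5) *, NEWID() AS sort_key
FROM [yourtable];</code>

Running the statement twice should show different sort_key values each time, which is why ordering by NEWID() produces a different row order on every execution.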
To select a specific percentage of rows, you can use the TOP clause. For example, to select 10% of the rows from a table named [yourtable], you would use the following query:
<code class="language-sql">SELECT TOP 10 PERCENT * FROM [yourtable] ORDER BY NEWID();</code>
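If a fixed number of rows is wanted rather than a percentage, the same pattern should work without the PERCENT keyword. The sketch below assumes a sample size of 100 and a variable named @pct, both arbitrary values picked for illustration.

<code class="language-sql">-- Fixed-size sample: 100 rows in random order (100 is an arbitrary example value).
SELECT TOP (100) * FROM [yourtable] ORDER BY NEWID();

-- Percentage supplied through a variable; @pct is a hypothetical name.
DECLARE @pct float = 10;
SELECT TOP (@pct) PERCENT * FROM [yourtable] ORDER BY NEWID();</code>

Wrapping the TOP argument in parentheses is what allows a variable or expression to be used there.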
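This method is simple, but SQL Server has to generate a value for every row and sort the whole table before TOP can take its share, so it can be slow on particularly large tables. One way to reduce that cost is to sample only the primary keys in a subquery and then fetch the matching rows: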
<code class="language-sql">SELECT * FROM [yourtable] WHERE [yourPk] IN ( SELECT TOP 10 PERCENT [yourPk] FROM [yourtable] ORDER BY NEWID() );</code>
Here [yourPk] stands for the table's primary key column. The inner query picks a random 10% of the primary keys, and the outer query uses those keys to fetch the full rows. The inner query still reads every key, but the sort now handles only the narrow key column instead of entire rows, which can make it noticeably cheaper on wide tables.
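The same key-first idea can also be written with a join against a derived table instead of IN. This is a sketch of an equivalent form, not a guaranteed improvement: the aliases t and s are hypothetical, and which version performs better depends on the table and its indexes, so it is worth comparing execution plans.

<code class="language-sql">-- Sample 10% of the keys first, then join back to fetch the full rows.
-- t and s are hypothetical aliases used only for illustration.
SELECT t.*
FROM [yourtable] AS t
JOIN (
    SELECT TOP 10 PERCENT [yourPk]
    FROM [yourtable]
    ORDER BY NEWID()
) AS s ON s.[yourPk] = t.[yourPk];</code>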