Efficient random row selection method for PostgreSQL
PostgreSQL provides a variety of methods for efficiently selecting random rows.
Method 1: Use Random() and Limit clause
This method uses the random()
function and the LIMIT
clause:
SELECT * FROM table ORDER BY random() LIMIT 1000;
However, for large tables, this method may be slower as it requires a full table scan.
Method 2: Index-based method
This method uses the primary key index to optimize the query:
WITH params AS ( SELECT 1 AS min_id, -- 最小ID (大于等于当前最小ID) , 5100000 AS id_span -- 四舍五入 (max_id - min_id + 缓冲) ) SELECT * FROM ( SELECT p.min_id + trunc(random() * p.id_span)::integer AS id FROM params p , generate_series(1, 1100) g -- 1000 + 缓冲 GROUP BY 1 -- 去除重复项 ) r JOIN table USING (id) LIMIT 1000; -- 去除多余项
This method is faster than method one because it uses an index scan instead of a full table scan.
Method 3: Use recursive CTE
This method uses a recursive common table expression (CTE) to handle missing values in the ID column:
WITH RECURSIVE random_pick AS ( SELECT * FROM ( SELECT 1 + trunc(random() * 5100000)::int AS id FROM generate_series(1, 1030) -- 1000 + 百分几 - 根据需要调整 LIMIT 1030 -- 查询规划器提示 ) r JOIN table b USING (id) -- 去除缺失值 UNION -- 去除重复项 SELECT b.* FROM ( SELECT 1 + trunc(random() * 5100000)::int AS id FROM random_pick r -- 加上百分几 - 根据需要调整 LIMIT 999 -- 小于1000,查询规划器提示 ) r JOIN table b USING (id) -- 去除缺失值 ) TABLE random_pick LIMIT 1000; -- 实际限制
Method 4: Use TABLESAMPLE SYSTEM (n)
PostgreSQL 9.5 introduced the TABLESAMPLE SYSTEM (n)
syntax, where n is a percentage between 0 and 100:
SELECT * FROM big TABLESAMPLE SYSTEM ((1000 * 100) / 5100000.0);
This method is fast, but may not return truly random samples due to clustering effects.
Comparison and suggestions
If the table has few missing values for the ID column and the primary key index is in place, Method two (index-based method) is the best choice as it provides the best speed and accuracy sex.
For tables with many missing values, please consider Method 3 (recursive CTE), which can effectively handle missing values.
Method one (random()
and limit
) has lower performance and should be used with smaller tables.
Method 4(TABLESAMPLE SYSTEM
) is fast, but not as accurate as other methods. It can be used to make quick estimates on large tables.
The above is the detailed content of How can I efficiently select random rows in PostgreSQL?. For more information, please follow other related articles on the PHP Chinese website!

TograntpermissionstonewMySQLusers,followthesesteps:1)AccessMySQLasauserwithsufficientprivileges,2)CreateanewuserwiththeCREATEUSERcommand,3)UsetheGRANTcommandtospecifypermissionslikeSELECT,INSERT,UPDATE,orALLPRIVILEGESonspecificdatabasesortables,and4)

ToaddusersinMySQLeffectivelyandsecurely,followthesesteps:1)UsetheCREATEUSERstatementtoaddanewuser,specifyingthehostandastrongpassword.2)GrantnecessaryprivilegesusingtheGRANTstatement,adheringtotheprincipleofleastprivilege.3)Implementsecuritymeasuresl

ToaddanewuserwithcomplexpermissionsinMySQL,followthesesteps:1)CreatetheuserwithCREATEUSER'newuser'@'localhost'IDENTIFIEDBY'password';.2)Grantreadaccesstoalltablesin'mydatabase'withGRANTSELECTONmydatabase.TO'newuser'@'localhost';.3)Grantwriteaccessto'

The string data types in MySQL include CHAR, VARCHAR, BINARY, VARBINARY, BLOB, and TEXT. The collations determine the comparison and sorting of strings. 1.CHAR is suitable for fixed-length strings, VARCHAR is suitable for variable-length strings. 2.BINARY and VARBINARY are used for binary data, and BLOB and TEXT are used for large object data. 3. Sorting rules such as utf8mb4_unicode_ci ignores upper and lower case and is suitable for user names; utf8mb4_bin is case sensitive and is suitable for fields that require precise comparison.

The best MySQLVARCHAR column length selection should be based on data analysis, consider future growth, evaluate performance impacts, and character set requirements. 1) Analyze the data to determine typical lengths; 2) Reserve future expansion space; 3) Pay attention to the impact of large lengths on performance; 4) Consider the impact of character sets on storage. Through these steps, the efficiency and scalability of the database can be optimized.

MySQLBLOBshavelimits:TINYBLOB(255bytes),BLOB(65,535bytes),MEDIUMBLOB(16,777,215bytes),andLONGBLOB(4,294,967,295bytes).TouseBLOBseffectively:1)ConsiderperformanceimpactsandstorelargeBLOBsexternally;2)Managebackupsandreplicationcarefully;3)Usepathsinst

The best tools and technologies for automating the creation of users in MySQL include: 1. MySQLWorkbench, suitable for small to medium-sized environments, easy to use but high resource consumption; 2. Ansible, suitable for multi-server environments, simple but steep learning curve; 3. Custom Python scripts, flexible but need to ensure script security; 4. Puppet and Chef, suitable for large-scale environments, complex but scalable. Scale, learning curve and integration needs should be considered when choosing.

Yes,youcansearchinsideaBLOBinMySQLusingspecifictechniques.1)ConverttheBLOBtoaUTF-8stringwithCONVERTfunctionandsearchusingLIKE.2)ForcompressedBLOBs,useUNCOMPRESSbeforeconversion.3)Considerperformanceimpactsanddataencoding.4)Forcomplexdata,externalproc


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

MantisBT
Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

Atom editor mac version download
The most popular open source editor

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

Dreamweaver Mac version
Visual web development tools

Zend Studio 13.0.1
Powerful PHP integrated development environment
