本文内容遵从CC版权协议, 可以随意转载, 但必须以超链接形式标明文章原始出处和作者信息及版权声明网址: http://www.penglixun.com/tech/database/mysql-vs-postgresql-part-1-table-organization.html 翻译自:http://blogs.enterprisedb.com/2010/11/29/my
本文内容遵从CC版权协议, 可以随意转载, 但必须以超链接形式标明文章原始出处和作者信息及版权声明网址: http://www.penglixun.com/tech/database/mysql-vs-postgresql-part-1-table-organization.html
翻译自:http://blogs.enterprisedb.com/2010/11/29/mysql-vs-postgresql-part-1-table-organization/
翻译不正确之处请指正。
I’m going to be starting an occasional series of blog postings comparing MySQL’s architecture to PostgreSQL’s architecture. Regular readers of this blog will already be aware that I know PostgreSQL far better than MySQL, having last used MySQL a very long time ago when both products were far less mature than they are today. So, my discussion of how PostgreSQL works will be based on first-hand knowledge, but discussion of how MySQL works will be based on research and – insofar as I’m can make it happen – discussion with people who know it better than I do. (Note: If you’re a person who knows MySQL better than I do and would like to help me avoid making stupid mistakes, drop me an email.)
我将要开始一个比较MySQL和PostgreSQL架构系列的博客。本博客的长期读者都已经知道,我最后一次使用MySQL是在很久很久以前两款产品都远不如今天的时候,所以我认为PostgreSQL远好于MySQL。因此,我讨论PostgreSQL如何工作是基于第一手资料,而对于MySQL则是基于很久以前的情况,看博客的同学有很多比我更了解MySQL。如果你是一个比我更了解MySQL的人,发现了我愚蠢的错误请给我一个邮件。
In writing these posts, I’m going to try to avoid making value judgments about which system is “better”, and instead focus on describing how the architecture differs, and maybe a bit about the advantages of each architecture. I can’t promise that it will be entirely unbiased (after all, I am a PostgreSQL committer, not a MySQL committer!) but I’m going to try to make it as unbiased as I can. Also, bearing in mind what I’ve recently been told by Baron Schwartz and Rob Wultsch, I’m going to focus completely on InnoDB and ignore MyISAM and all other storage engines. Finally, I’m going to focus on architectural differences. People might choose to use PostgreSQL because they hate Oracle, or MySQL because it’s easier to find hosting, or either product because they know it better, and that’s totally legitimate and perhaps worth talking about, but – partly in the interests of harmony among communities that ought to be allies – it’s not what I’m going to talk about here.
写这些文章,我要尽量避免作出哪个系统更好的判断,而是侧重于介绍他们架构的不同,也许是一些各种架构的优势。我不能保证这些观点是完全不带偏见的(毕竟,我是一个PostgreSQL代码的提交者,而不是MySQL的提交者),但是我会尽量做到不偏重某一个。此外,考虑到我最近已经跟Baron Schwartz和Rob Wultsch说的内容,我将完全忽略MyISAM和所有其他存储引擎,而重点关注InnoDB。最后,我将专注于架构的差异。人们有时选择使用PostgreSQL是因为他们恨甲骨文,或者选择MySQL因为它更容易找到托管服务,或其他一些产品因为他们知道它更好,并且这是完全符合授权的。这些但这不是我想要谈的。(译者注:最后一段话太绕口,翻译不了,只翻译大意)
So, all that having been said, what I’d like to talk about in this post is the way that MySQL and PostgreSQL store tables and indexes on disk. In PostgreSQL, table data and index data are stored in completely separate structures. When a new row is inserted, or when an existing row is updated, the new row is stored in any convenient place in the table. In the case of an update, we try to store the new row on the same page as the old row if there’s room; if there isn’t room or if it’s an insert, we pick a page that has adequate free space and use that, or failing all else extend the table by one page and add the new row there. Once the table row is added, we cycle through all the indexes defined for the table and add an index entry to each one pointing at the physical position of the table row. One index may happen to be the primary key, but that’s a fairly nominal distinction – all indexes are basically the same.
因此,我将说的内容是,MySQL和PostgreSQL的表和索引存储在磁盘上的方式。在 PostgreSQL,表数据和索引数据是完全分开存储的。当新行插入,或现有的行被更新,新行是表中的任何方便保存的地方保存。在更新的场景下,我们尝试在页内还有空间的情况下存储新行与旧行在同一个页上。如果没有空间,或者如果它是一个插入操作,我们将选择一个有足够空闲空间的页,使用它,或者扩展一个新页把新行放入。我们轮训表上定义的所有索引,并添加一个索引项指针指向表中新行的物理位置。这个索引也许是主键,也许是一般的索引,但是所有的所有索引都是基于一样的操作。
Under MySQL’s InnoDB, the table data and the primary key index are stored in the same data structure. As I understand it, this is what Oracle calls an index-organized table. Any additional (”secondary”) indexes refer to the primary key value of the tuple to which they point, not the physical position, which can change as leaf pages in the primary key index are split. Since this architecture requires every table to have a primary key, an internal row ID field is used as the primary key if no explicit primary key is specified.
在InnoDB中,表数据和主键索引是存在同样的数据结构中(译者注:主键聚集索引)。据我的理解,这就像Oracle的索引组织表(译者注:还是有一些区别,索引组织表完全按索引排序,但是InnoDB只按主键排序)。任何非主键索引指向主键索引的位置,而不是物理位置,所以主键索引页的页节点分裂不会导致数据改变。由于这种架构要求每个表都有一个主键,所以如果没有定义主键内部将隐含定义一个主键(译者注,内部定义的主键为6字节)。
Since Oracle supports both options, they are probably both useful. An index-organized table seems particularly likely to be useful when most lookups are by primary key, and most of the data in each row is part of the primary key anyway, either because the primary key columns are long compared with the remaining columns, or because the rows, overall, are short. Storing the whole row in the index avoids storing the same data twice (once in the index and once in the table), and the gain will be larger when the primary key is a substantial percentage of the total data. Furthermore, in this situation, the index page still holds as many, or almost as many, keys as it would if only a pointer were stored in lieu of the whole row, so one fewer random I/Os will be needed to access a given row.
由于Oracle支持两种选择(索引组织表和堆表),他们可能都非常有用。一个索引组织表似乎在多数SQL是通过主键查找,以及每行的大部分数据是主键的一部分的时候非常有用。要么因为主键列比其余的列长,或因为行总体而言是比较短的。存储整行数据在索引上避免了同样的数据存两分(一份在索引,一份在表中),但是如果主键占数据行的比例较大时,数据增益(译者注:数据+表的重复数据量)将更大。此外,在这种情况下,索引页将保存很多或几乎一样多的数据,访问数据时在索引页中就可能得到整行需要的列,所以这可以减少随机IO(译者注:覆盖索引扫描,Index Scan)。
When accessing an index-organized table via a secondary index, it may be necessary to traverse both the B-tree in the secondary-index, and the B-tree in the primary index. As a result, queries involving secondary indexes might be slower. However, since MySQL has index-only scans ( PostgreSQL does not ), it can sometimes avoid traversing the secondary index. So in MySQL, adding additional columns to an index might very well make it run faster, if it causes the index to function as a covering index for the query being executed. But in PostgreSQL, we frequently find ourselves telling users to pare down the columns in the index to the minimum set that is absolutely necessary, often resulting in dramatic performance gains. This is an interesting example of how the tuning that is right for one database may be completely wrong for another database.
当通过非主键索引访问一个索引组织表,可能需要遍历非主键索引的B树和主键索引的B树。因此,查询涉及非主键索引可能会变慢。然而,由于MySQL有Index-Scan方式(译者注:访问索引即可获得数据) 而PostgreSQL没有,它有时访问非主键索引就能拿到数据。因此,在MySQL中,添加额外的列索引如果带来覆盖索引的查询计划,则很可能使SQL运行得更快(译者注:这个不完全对,索引多的话索引页分裂时的物理IO操作还是比较多的,推荐满足需求的情况下减少索引,除非你能保证覆盖索引经常被用到)。但是在PostgreSQL里,我们经常发现自己告诉用户减少索引到满足要求的最低限度时往往能带来巨大的性能提升。这是一个有趣的例子,如何调整数据库在不同的数据库中是完全相反的方法。
I’ve recently learned that neither InnoDB nor PostgreSQL supports traversing an index in physical order, only in key order. For InnoDB, this means that ALL scans are performed in key order, since the table itself is, in essence, also an index. As I understand it, this can make a large sequential scan quite slow, by defeating the operating system’s prefetch logic. In PostgreSQL, however, because tables are not index-organized, sequential scans are always performed in physical order, and don’t require looking at the indexes at all; this also means we can skip any I/O or CPU cost associated with examining non-leaf index pages. Traversing in physical order is apparently difficult from a locking perspective, although it must be possible, because Oracle supports it. It would be very useful to see this support in MySQL, and once PostgreSQL has index-only scans, it would be a useful improvement for PostgreSQL, too.
我最近获悉,PostgreSQL跟InnoDB一样也支持通过主键索引顺序遍历(译者注:InnoDB访问全表返回数据按主键顺序排列)。对于 InnoDB,这意味着所有的全表扫描是在扫描主键索引,主键索引本身就是表。据我了解,这可能导致大的顺序扫描慢很多(译者注:这个比较扯淡,在数据静止的情况下,PostgreSQL一样要通过block的指针访问下一个block,InnoDB通过页的指针访问下一个页)。在PostgreSQL,因为表不是按(主键)索引组织,顺序扫描总是按物理顺序进行,并且完全不需要访问索引,这也意味着我们可以跳过任何访问索引非叶子节点的IO或CPU开销(译者注:这位兄台应该忘记了什么是B+树)。显然按物理顺序访问是很困难的,但是肯定可以实现,因为Oracle支持。这是MySQL一个非常有用的功能,PostgreSQL一旦有了覆盖索引扫描功能,对PostgreSQL也将是非常有用的提升。
One final difficulty with an index-organized table is that you can’t add, drop, or change the primary key definition without a full-table rewrite. In PostgreSQL, on the other hand, this can be done – even while allow concurrent read and write activity. This is a fairly nominal advantage for most use cases since the primary key of a table rarely changes – I think it’s happened to me only once or twice in the last ten years – but it is useful when it does comes up.
使用索引组织表的最后一个问题是不能在不重建全表的情况下添加,删除或变更主键索引定义。反而在PostgreSQL里,这是可以做到的——即使当允许并发读写活动时。在大多数情况下(InnoDB)具有优势,因为在大多数场景下一旦定义主键不太可能更改 。在我最近十年内这只碰到一次或两次——但是它真的发生时,(PostgreSQL)还是很有用的。
I hope that the above is a fair and accurate summary of the topic, but I’m sure I’ve missed a few things and covered others incompletely or in less detail than might be helpful. Please feel free to respond with a comment below or a blog post of your own if I’ve missed something.
我希望以上是这个专题比较公正和准确的总结,但我敢肯定,我已经错过了一些东西,或者覆盖一些内容不完全,缺少一些可能会有所帮助的细节。请随时反馈在下面的评论中评论您对我遗漏的一些内容的看法。

MySQL在Web應用中的主要作用是存儲和管理數據。 1.MySQL高效處理用戶信息、產品目錄和交易記錄等數據。 2.通過SQL查詢,開發者能從數據庫提取信息生成動態內容。 3.MySQL基於客戶端-服務器模型工作,確保查詢速度可接受。

構建MySQL數據庫的步驟包括:1.創建數據庫和表,2.插入數據,3.進行查詢。首先,使用CREATEDATABASE和CREATETABLE語句創建數據庫和表,然後用INSERTINTO語句插入數據,最後用SELECT語句查詢數據。

MySQL適合初學者,因為它易用且功能強大。 1.MySQL是關係型數據庫,使用SQL進行CRUD操作。 2.安裝簡單,需配置root用戶密碼。 3.使用INSERT、UPDATE、DELETE、SELECT進行數據操作。 4.複雜查詢可使用ORDERBY、WHERE和JOIN。 5.調試需檢查語法,使用EXPLAIN分析查詢。 6.優化建議包括使用索引、選擇合適數據類型和良好編程習慣。

MySQL適合初學者,因為:1)易於安裝和配置,2)有豐富的學習資源,3)SQL語法直觀,4)工具支持強大。儘管如此,初學者需克服數據庫設計、查詢優化、安全管理和數據備份等挑戰。

是的,sqlisaprogramminglanguges pecialized fordatamanage.1)它具有焦點,focusingonwhattoachieveratherthanhow.2)sqlisessential forquerying forquerying,插入,更新,更新,和detletingdatainrelationalDatabases.3)

ACID屬性包括原子性、一致性、隔離性和持久性,是數據庫設計的基石。 1.原子性確保事務要么完全成功,要么完全失敗。 2.一致性保證數據庫在事務前後保持一致狀態。 3.隔離性確保事務之間互不干擾。 4.持久性確保事務提交後數據永久保存。

MySQL既是數據庫管理系統(DBMS),也與編程語言緊密相關。 1)作為DBMS,MySQL用於存儲、組織和檢索數據,優化索引可提高查詢性能。 2)通過SQL與編程語言結合,嵌入在如Python中,使用ORM工具如SQLAlchemy可簡化操作。 3)性能優化包括索引、查詢、緩存、分庫分錶和事務管理。

MySQL使用SQL命令管理數據。 1.基本命令包括SELECT、INSERT、UPDATE和DELETE。 2.高級用法涉及JOIN、子查詢和聚合函數。 3.常見錯誤有語法、邏輯和性能問題。 4.優化技巧包括使用索引、避免SELECT*和使用LIMIT。


熱AI工具

Undresser.AI Undress
人工智慧驅動的應用程序,用於創建逼真的裸體照片

AI Clothes Remover
用於從照片中去除衣服的線上人工智慧工具。

Undress AI Tool
免費脫衣圖片

Clothoff.io
AI脫衣器

AI Hentai Generator
免費產生 AI 無盡。

熱門文章

熱工具

SublimeText3 Mac版
神級程式碼編輯軟體(SublimeText3)

PhpStorm Mac 版本
最新(2018.2.1 )專業的PHP整合開發工具

Safe Exam Browser
Safe Exam Browser是一個安全的瀏覽器環境,安全地進行線上考試。該軟體將任何電腦變成一個安全的工作站。它控制對任何實用工具的訪問,並防止學生使用未經授權的資源。

SublimeText3 Linux新版
SublimeText3 Linux最新版

MinGW - Minimalist GNU for Windows
這個專案正在遷移到osdn.net/projects/mingw的過程中,你可以繼續在那裡關注我們。 MinGW:GNU編譯器集合(GCC)的本機Windows移植版本,可自由分發的導入函式庫和用於建置本機Windows應用程式的頭檔;包括對MSVC執行時間的擴展,以支援C99功能。 MinGW的所有軟體都可以在64位元Windows平台上運作。