
[Leveldb] Implementation Document Translation

Jun 07, 2016 pm 05:37 PM


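Files

The implementation of leveldb is modeled on the representation of a single Bigtable tablet, but the files that make up that representation are organized somewhat differently, as described below.
Each database is represented by a set of files stored in a directory. There are several different types of files: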

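  • Log files
  • A log file (*.log) stores a sequence of recent updates. Each update is appended to the current log file. When the log file reaches a predetermined size (4MB by default), it is converted to a sorted table and a new log file is created for future updates.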

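  • Sorted tables
  • A sorted table (*.sst) stores a sequence of entries sorted by key. Each entry is either a value for the key or a deletion marker for the key.
    The set of sorted tables is organized into a sequence of levels. The sorted table generated from a log file is placed in a special young level (also called level-0). When the number of young files exceeds a threshold (currently four), all of the young files are merged with the overlapping level-1 files to produce a sequence of new level-1 files (a new level-1 file is created for every 2MB of data).
    Files in the young level may contain overlapping keys, whereas files in the other levels have distinct, non-overlapping key ranges. Consider level L, where L >= 1. When the combined size of the files in level L exceeds 10^L MB (10MB for level-1, 100MB for level-2, and so on), one file in level L and all of the overlapping files in level L+1 are merged to form a set of new files for level L+1. These merges gradually migrate new updates from level-0 toward the last level.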
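The size rule above can be written as a small check. This is only a sketch of the arithmetic in the text, not leveldb's actual code; the names `level_size_limit` and `needs_compaction` are made up for illustration.

```python
MB = 1024 * 1024

def level_size_limit(level: int) -> int:
    """Size limit in bytes for level L >= 1, following the 10^L MB rule."""
    if level < 1:
        raise ValueError("level-0 is limited by file count, not by size")
    return (10 ** level) * MB

def needs_compaction(level: int, file_sizes: list[int],
                     level0_file_limit: int = 4) -> bool:
    """Return True when a level has outgrown its limit.

    Level-0 is triggered by the number of files (more than four in the text);
    every other level is triggered by the combined size of its files.
    """
    if level == 0:
        return len(file_sizes) > level0_file_limit
    return sum(file_sizes) > level_size_limit(level)
```

For example, `needs_compaction(1, [2 * MB] * 6)` is true because 12MB exceeds the 10MB limit of level-1.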

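  • Manifest
  • A MANIFEST file lists the set of sorted tables that make up each level, the corresponding key ranges, and other important metadata. A new MANIFEST file is created whenever the database is reopened. The MANIFEST file is formatted as a log, and changes made to the serving state are appended to this log.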

  • Current
  • CURRENT is a simple text file that contains the name of the latest MANIFEST file.

  • Info logs
  • Informational messages are printed to files named LOG and LOG.old.

  • Others
  • Other files are used for miscellaneous purposes, for example LOCK and *.dbtmp.
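The file types listed above can be told apart by name alone. The sketch below assumes only the name patterns mentioned in the text (*.log, *.sst, MANIFEST, CURRENT, LOG, LOG.old); the function `classify` and its return labels are hypothetical and not part of leveldb.

```python
def classify(name: str) -> str:
    """Map a file name in the database directory to one of the types above."""
    if name == "CURRENT":
        return "current"
    if name in ("LOG", "LOG.old"):
        return "info_log"
    if name.startswith("MANIFEST"):
        return "manifest"
    if name.endswith(".log"):
        return "log"
    if name.endswith(".sst"):
        return "sorted_table"
    # LOCK, *.dbtmp and anything else fall under "Others".
    return "other"
```

The comparison is case-sensitive, so the info log LOG is not mistaken for a *.log update log.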

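    Level 0

    When the log file grows above a certain size (1MB by default):
    Create a new memtable and log file and direct future updates to them.
    In the background:
    Write the contents of the previous memtable to an sstable.
    Discard the memtable.
    Delete the old log file and the old memtable.
    Add the new sstable to level-0.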
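The steps above can be sketched with a toy in-memory model. It is a deliberately simplified illustration: the class `TinyDB` and its fields are invented, the log file is represented only by a byte counter, and the flush runs synchronously rather than in the background.

```python
class TinyDB:
    """Toy model of the memtable switch; not leveldb's real data structures."""

    def __init__(self, log_limit_bytes: int = 1024 * 1024):
        self.log_limit = log_limit_bytes
        self.memtable: dict[str, str] = {}
        self.log_bytes = 0  # stands in for the size of the current log file
        self.level0: list[list[tuple[str, str]]] = []  # oldest sstable first

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        self.log_bytes += len(key) + len(value)  # "append" the update to the log
        if self.log_bytes > self.log_limit:
            self._switch_memtable()

    def _switch_memtable(self) -> None:
        # Create a new memtable and log for future updates.
        previous = self.memtable
        self.memtable = {}
        self.log_bytes = 0
        # Write the previous memtable out as a sorted table and add it to
        # level-0; the old memtable is then simply dropped.
        self.level0.append(sorted(previous.items()))
```

With a 10-byte limit, three small writes are enough to trigger one switch and leave a single sorted table in `level0`.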

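  • Compactions
  • When the size of level L exceeds its limit, we compact it in a background thread. The compaction picks one file from level L and all overlapping files from level L+1. Note that if a level-L file overlaps only part of a level-(L+1) file, the entire level-(L+1) file is used as an input to the compaction and is discarded afterwards. Aside: because level-0 is special (its files may overlap each other), compactions from level-0 to level-1 are treated specially: a level-0 compaction may pick more than one level-0 file when some of these files overlap each other.
    A compaction merges the contents of the picked files to produce a sequence of level-(L+1) files. We switch to a new level-(L+1) output file once the current output file reaches the target file size (2MB). We also switch to a new output file when the key range of the current output file has grown enough to overlap more than ten level-(L+2) files. This last rule ensures that a later compaction of a level-(L+1) file will not pick up too much data from level L+2.
    The old files are discarded and the new files are added to the serving state.
    Compactions for a particular level rotate through the key space. In more detail, for each level L we remember the ending key of the last compaction at that level. The next compaction for level L picks the first file that starts after this key.
    Compactions drop overwritten values. They also drop deletion markers if no higher-numbered level contains a file whose range covers the current key.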
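A rough sketch of two of the rules above, input selection with key-space rotation and dropping of overwritten values and deletion markers, might look as follows. All names (`TableFile`, `pick_inputs`, `merge_entries`) are hypothetical; the wrap-around to the first file when no file starts after the remembered key is an assumption of this sketch, and a deletion marker is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class TableFile:
    smallest: str  # smallest key in the file
    largest: str   # largest key in the file

def overlaps(a: TableFile, b: TableFile) -> bool:
    return a.smallest <= b.largest and b.smallest <= a.largest

def pick_inputs(level_files: list[TableFile], next_level_files: list[TableFile],
                last_key: Optional[str]) -> tuple[TableFile, list[TableFile]]:
    """Pick one file from level L (L >= 1) and every overlapping L+1 file.

    level_files must be sorted by smallest key. The picked file is the first
    one starting after last_key; if there is none, start over from the first.
    """
    after = [f for f in level_files if last_key is None or f.smallest > last_key]
    picked = after[0] if after else level_files[0]
    return picked, [f for f in next_level_files if overlaps(picked, f)]

def merge_entries(newer: dict, older: dict,
                  key_in_higher_level: Callable[[str], bool]) -> list[tuple]:
    """Merge two key->value maps; None is a deletion marker.

    Newer values overwrite older ones. A deletion marker is dropped only
    when no higher-numbered level could still hold the key.
    """
    merged = {**older, **newer}
    out = []
    for key in sorted(merged):
        value = merged[key]
        if value is None and not key_in_higher_level(key):
            continue
        out.append((key, value))
    return out
```

Keeping the deletion marker whenever a higher-numbered level may still contain the key is what prevents an obsolete value from reappearing after the compaction.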

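  • Timing
  • A level-0 compaction reads up to four 1MB files from level-0 and, in the worst case, all of the level-1 files (10MB). That is, we read 14MB and write 14MB.
    Other than the special level-0 compactions, we pick one 2MB file from level L. In the worst case, this overlaps about 12 files from level L+1. The compaction therefore reads 26MB and writes 26MB. Assuming a disk IO rate of 100MB/s, the worst-case compaction takes approximately 0.5 seconds.
    If we throttle the background writing to something small, say 10% of the full 100MB/s, a compaction may take up to 5 seconds. If the user is writing at 10MB/s, we might build up many level-0 files, which increases the cost of merging them together on every read.
    Solution 1: To reduce this problem, we might increase the log switching threshold when the number of level-0 files is large. The downside is that the larger this threshold, the more memory the corresponding memtable needs.
    Solution 2: We might artificially decrease the write rate when the number of level-0 files goes up.
    Solution 3: We reduce the cost of very wide merges. Perhaps most of the level-0 files will have their blocks sitting uncompressed in the cache, so that we only need to worry about the O(N) complexity of the merge.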
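The figures in the timing discussion can be reproduced with a few lines of arithmetic. The sketch uses only numbers stated in the text (1MB level-0 files, 2MB files elsewhere, 100MB/s disk rate, 10% throttle, 10MB/s user writes); the function and variable names are illustrative.

```python
def compaction_seconds(read_mb: float, write_mb: float, rate_mb_per_s: float) -> float:
    """Time to read and then write the given volumes at one IO rate."""
    return (read_mb + write_mb) / rate_mb_per_s

# Level-0 compaction: four 1MB files plus, at worst, all of level-1 (10MB).
level0_mb = 4 * 1 + 10

# Other levels: one 2MB file plus about twelve overlapping 2MB files.
level_l_mb = 2 + 12 * 2

full_speed = compaction_seconds(level_l_mb, level_l_mb, 100.0)  # about 0.5 s
throttled = compaction_seconds(level_l_mb, level_l_mb, 10.0)    # about 5 s

# While a 5 second compaction runs, a user writing 10MB/s produces about
# 50MB of new data, which is about fifty 1MB level-0 files.
pending_level0_files = round(5 * 10 / 1)
```

The read and write volumes are equal here because the sketch, like the text, assumes the compaction writes out as much data as it reads.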

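  • Number of files
  • Instead of always making 2MB files, we could make larger files for larger levels to reduce the total file count, though at the expense of more bursty compactions. Alternatively, we could shard the set of files across multiple directories.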
