Snowflake algorithm implemented by mysql-Mysql Tutorial-php.cn


coldplay.xixi

Aug 20, 2020 pm 04:15 PM


1. Why use the snowflake algorithm

1. Background of the problem

Nowadays, more and more companies are adopting distributed architectures and microservices, so databases are split up per service. As the amount of data grows, tables are split as well, and once a table is split, generating IDs becomes a problem.

For example, in a traditional monolithic project, the primary key id of a table was auto-incremented: MySQL used AUTO_INCREMENT, while Oracle used sequences. However, when the amount of data in a single table grows, horizontal table splitting becomes necessary. Alibaba's Java development guidelines recommend splitting a table once it exceeds 5 million rows, but the specifics depend on the business; with proper indexing, a single table holding tens of millions of rows is also workable. Horizontal splitting divides the data of one table across multiple tables. Then the problem arises: if each table still generates its primary key by auto-increment as before, duplicate IDs will occur. At this point you need a solution for generating distributed IDs.

2. Solution

2.1. Database table

You can maintain a dedicated table in one of the databases. Each time any table needs a new id, query its record in this table, lock it with SELECT ... FOR UPDATE, add one to the value obtained, return it, and write the new value back to the table. However, because the lock must be taken on every request, this method is only suitable for projects with relatively low concurrency.
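The steps above can be sketched with plain JDBC. This is only an illustration under stated assumptions: the table `id_sequence` with columns `table_name` and `current_id`, and the class name `TableIdAllocator`, are hypothetical and do not come from the original article. Running it requires a database connection and a JDBC driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public final class TableIdAllocator {
    // Hypothetical schema: id_sequence(table_name VARCHAR PRIMARY KEY, current_id BIGINT)
    static final String SELECT_SQL =
            "SELECT current_id FROM id_sequence WHERE table_name = ? FOR UPDATE";
    static final String UPDATE_SQL =
            "UPDATE id_sequence SET current_id = ? WHERE table_name = ?";

    private TableIdAllocator() {
    }

    /** Reads the counter under a lock, adds one, writes it back and returns the new value. */
    public static long nextId(Connection conn, String tableName) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // the lock is only held inside a transaction
        try {
            long next;
            try (PreparedStatement select = conn.prepareStatement(SELECT_SQL)) {
                select.setString(1, tableName);
                try (ResultSet rs = select.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("No counter row for table " + tableName);
                    }
                    next = rs.getLong(1) + 1;
                }
            }
            try (PreparedStatement update = conn.prepareStatement(UPDATE_SQL)) {
                update.setLong(1, next);
                update.setString(2, tableName);
                update.executeUpdate();
            }
            conn.commit(); // committing releases the lock
            return next;
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Every caller waits for the lock held by the previous one, which is why throughput stays low under concurrency.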

2.2. redis

Because redis executes commands on a single thread, you can maintain a key-value pair in redis as a counter; whichever table needs an id goes to redis, gets the value, and adds one. However, as with the database table above, every request passes through that single thread, so this approach is also only suitable for projects with low concurrency.

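2.3. uuid

You can use a uuid as a unique primary key id, but the problem with a uuid is that it is an unordered string. If a uuid is used as the primary key, the primary key index becomes inefficient, because new values are inserted at random positions in the index instead of being appended at the end.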

2.4. Snowflake algorithm

The snowflake algorithm is an efficient solution for generating distributed IDs. Most Internet companies use it, although some implement other solutions of their own.

2. Snowflake algorithm

1. Principle

The snowflake algorithm stores the ID in a 64-bit long. The highest bit is the sign bit (0 for a positive number, 1 for a negative number); it is normally 0 and stays unchanged. The next 41 bits store a millisecond-level timestamp, 10 bits store the machine code (a 5-bit datacenterId and a 5-bit workerId), and 12 bits store the sequence number. This allows at most 2^10 = 1024 machines, each generating at most 2^12 = 4096 IDs per millisecond. (A code implementation is given below.)
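To make the bit layout concrete, here is a minimal sketch that packs the four fields into one long and splits them apart again, using the 41/5/5/12 split described above. The class name `SnowflakeLayout` and its method names are illustrative assumptions, not part of the original article.

```java
import java.util.Arrays;

public class SnowflakeLayout {
    // Field widths from the layout described above.
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;

    static final int WORKER_SHIFT = SEQUENCE_BITS;                         // 12
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;       // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    /** Packs the fields into one long; timestampDelta is milliseconds since a chosen epoch. */
    static long pack(long timestampDelta, long datacenterId, long workerId, long sequence) {
        return (timestampDelta << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    /** Splits an ID back into {timestampDelta, datacenterId, workerId, sequence}. */
    static long[] unpack(long id) {
        long sequence = id & ((1L << SEQUENCE_BITS) - 1);
        long workerId = (id >> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1);
        long datacenterId = (id >> DATACENTER_SHIFT) & ((1L << DATACENTER_BITS) - 1);
        long timestampDelta = id >>> TIMESTAMP_SHIFT;
        return new long[] {timestampDelta, datacenterId, workerId, sequence};
    }

    public static void main(String[] args) {
        long id = pack(1000L, 3L, 7L, 42L);
        System.out.println(id + " -> " + Arrays.toString(unpack(id)));
        System.out.println("machines: " + (1 << (WORKER_BITS + DATACENTER_BITS)));
        System.out.println("ids per millisecond per machine: " + (1 << SEQUENCE_BITS));
    }
}
```

Because the timestamp occupies the highest bits, IDs generated later compare as larger numbers, which is what keeps them roughly ordered by time.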

But generally we don't have that many machines, so we can also use 53 bits to store the id. Why 53 bits?

Because most of us work with web pages, IDs have to pass through js. js can represent integers exactly only up to 53 bits (2^53 - 1); beyond that, precision is lost. An ID within 53 bits can be read directly by js, whereas a longer one must be converted to a string for js to handle it correctly. With a 53-bit ID, 32 bits store a second-level timestamp, 5 bits store the machine code, and 16 bits store the sequence number, so each machine can produce 65536 unique IDs per second.
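The 53-bit variant can be sketched as follows, using the 32/5/16 split described above. This is an illustrative sketch, not the author's implementation: the class name `Snowflake53` and the start epoch (in seconds) are assumptions made for the example.

```java
public class Snowflake53 {
    // Assumed start epoch in seconds; any fixed past instant works as long as it never changes.
    private static final long EPOCH_SECONDS = 1577808000L;
    private static final int SEQUENCE_BITS = 16;
    private static final int MACHINE_BITS = 5;
    private static final long MAX_MACHINE_ID = (1L << MACHINE_BITS) - 1;     // 31
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;     // 65535
    private static final int MACHINE_SHIFT = SEQUENCE_BITS;                  // 16
    private static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS; // 21

    private final long machineId;
    private long sequence = 0L;
    private long lastSecond = -1L;

    public Snowflake53(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be between 0 and " + MAX_MACHINE_ID);
        }
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long second = currentSecond();
        if (second < lastSecond) {
            throw new IllegalStateException("Clock moved backwards");
        }
        if (second == lastSecond) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // All 65536 sequence values of this second are used: wait for the next second.
                while (second <= lastSecond) {
                    second = currentSecond();
                }
            }
        } else {
            sequence = 0L;
        }
        lastSecond = second;
        return ((second - EPOCH_SECONDS) << TIMESTAMP_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    protected long currentSecond() {
        return System.currentTimeMillis() / 1000L;
    }

    public static void main(String[] args) {
        Snowflake53 generator = new Snowflake53(1L);
        long maxSafeInteger = (1L << 53) - 1; // largest integer js represents exactly
        for (int i = 0; i < 5; i++) {
            long id = generator.nextId();
            System.out.println(id + " fits in 53 bits: " + (id <= maxSafeInteger));
        }
    }
}
```

A 32-bit second counter covers roughly 136 years from the chosen epoch, so the trade-off compared with the 64-bit layout is fewer machines (32 instead of 1024) and second-level rather than millisecond-level granularity.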

2. Disadvantages

Because the snowflake algorithm relies heavily on time, duplicate IDs may be generated if the server clock is rolled back. Of course, almost no company changes the server time, since doing so causes all kinds of problems; companies would rather add a new server than modify the time on an existing one. Still, special circumstances cannot be ruled out.

How can the clock rollback problem be solved? You can apply a step to the initial value of the sequence: each time a clock rollback is detected, the initial value is increased by 10,000 (1w). This can be done at line 85 of the code below, setting the initial value of the sequence to 10000. Note that 10000 does not fit in the 12-bit sequence of the 64-bit layout (maximum 4095), so this requires a wider sequence field, such as the 16 bits of the 53-bit layout.
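One possible reading of this step idea is sketched below. It is an interpretation under stated assumptions, not the author's code: it uses the 53-bit layout with a 16-bit sequence, gives each rollback its own band of 10,000 sequence values, and limits each band to 10,000 IDs per second so that bands never overlap. The class name `RollbackTolerantIdWorker` and the injectable clock are hypothetical and exist only to make the behavior easy to try out.

```java
import java.util.function.LongSupplier;

public class RollbackTolerantIdWorker {
    private static final int SEQUENCE_BITS = 16;
    private static final int MACHINE_BITS = 5;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1; // 65535
    private static final long STEP = 10_000L; // the "1w" step from the text

    private final long machineId;
    private final LongSupplier clockSeconds; // hypothetical clock: seconds since the chosen epoch
    private long sequenceBase = 0L;          // initial sequence value, raised on every rollback
    private long sequence = 0L;
    private long lastSecond = -1L;

    public RollbackTolerantIdWorker(long machineId, LongSupplier clockSeconds) {
        if (machineId < 0 || machineId >= (1L << MACHINE_BITS)) {
            throw new IllegalArgumentException("machineId must be between 0 and 31");
        }
        this.machineId = machineId;
        this.clockSeconds = clockSeconds;
    }

    public synchronized long nextId() {
        long second = clockSeconds.getAsLong();
        if (second < lastSecond) {
            // Clock rollback detected: move to the next band of sequence values.
            if (sequenceBase + 2 * STEP - 1 > SEQUENCE_MASK) {
                throw new IllegalStateException("Too many clock rollbacks");
            }
            sequenceBase += STEP;
            sequence = sequenceBase;
        } else if (second == lastSecond) {
            if (sequence + 1 >= sequenceBase + STEP) {
                // Band exhausted for this second: wait for the next second.
                while (second <= lastSecond) {
                    second = clockSeconds.getAsLong();
                }
                sequence = sequenceBase;
            } else {
                sequence++;
            }
        } else {
            sequence = sequenceBase;
        }
        lastSecond = second;
        return (second << (SEQUENCE_BITS + MACHINE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        long[] fakeNow = {100L}; // fake clock so that a rollback can be simulated
        RollbackTolerantIdWorker worker = new RollbackTolerantIdWorker(1L, () -> fakeNow[0]);
        long first = worker.nextId();
        fakeNow[0] = 99L;  // simulate a clock rollback
        long afterRollback = worker.nextId();
        fakeNow[0] = 100L; // the clock reaches the already used second again
        long replayed = worker.nextId();
        System.out.println(first + " " + afterRollback + " " + replayed);
    }
}
```

With a 16-bit sequence this tolerates five rollbacks before the bands run out, and it caps throughput at 10,000 IDs per second per machine; both limits are consequences of the band size chosen here.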

3. Code implementation

64-bit code implementation:

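package com.yl.common;
/**
 * Twitter_Snowflake<br>
 * The structure of a SnowFlake ID is as follows (parts separated by -):<br>
 * 0 - 0000000000 0000000000 0000000000 0000000000 0 - 00000 - 00000 - 000000000000 <br>
 * 1 sign bit: long is a signed primitive type in Java and its highest bit is the sign bit (0 for positive, 1 for negative); IDs are normally positive, so the highest bit is 0<br>
 * 41-bit timestamp (milliseconds). Note that these 41 bits do not store the current timestamp itself but the difference (current timestamp - start timestamp).
 * The start timestamp is usually the time the ID generator was first put into use and is specified by the program (see the twepoch field of the SnowflakeIdWorker class below). A 41-bit timestamp lasts 69 years: T = (1L << 41) / (1000L * 60 * 60 * 24 * 365) = 69<br>
 * 10 machine bits, allowing deployment on 1024 nodes, made up of a 5-bit datacenterId and a 5-bit workerId<br>
 * 12-bit sequence, a counter within the millisecond; it lets each node generate 4096 IDs per millisecond (same machine, same timestamp)<br>
 * Together these add up to exactly 64 bits, one long.<br>
 * The advantages of SnowFlake are that IDs are sorted by time overall, no ID collisions occur within the whole distributed system (the datacenter ID and machine ID tell nodes apart), and it is efficient: in tests, SnowFlake generated about 260,000 IDs per second.
 */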
public class SnowflakeIdWorker {

    // ==============================Fields===========================================
    /** Start epoch (2020-01-01 00:00:00 UTC+8) */
    private final long twepoch = 1577808000000L;

    /** Number of bits used by the worker ID */
    private final long workerIdBits = 5L;

    /** Number of bits used by the datacenter ID */
    private final long datacenterIdBits = 5L;

    /** Maximum supported worker ID, which is 31 (this shift trick quickly computes the largest decimal number that a given number of binary bits can represent) */
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);

    /** Maximum supported datacenter ID, which is 31 */
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);

    /** Number of bits used by the sequence */
    private final long sequenceBits = 12L;

    /** The worker ID is shifted left by 12 bits */
    private final long workerIdShift = sequenceBits;

    /** The datacenter ID is shifted left by 17 bits (12+5) */
    private final long datacenterIdShift = sequenceBits + workerIdBits;

    /** The timestamp is shifted left by 22 bits (5+5+12) */
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    /** Mask for the sequence, here 4095 (0b111111111111=0xfff=4095) */
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    /** Worker ID (0~31) */
    private long workerId;

    /** Datacenter ID (0~31) */
    private long datacenterId;

    /** Sequence within the millisecond (0~4095) */
    private long sequence = 0L;

    /** Timestamp of the last generated ID */
    private long lastTimestamp = -1L;

    //==============================Constructors=====================================
    /**
     * Constructor
     * @param workerId worker ID (0~31)
     * @param datacenterId datacenter ID (0~31)
     */
    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    // ==============================Methods==========================================
    /**
     * Get the next ID (this method is thread-safe)
     * @return SnowflakeId
     */
    public synchronized long nextId() {
        long timestamp = timeGen();

        // If the current time is earlier than the timestamp of the last generated ID, the system clock has been rolled back and an exception should be thrown
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(
                    String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }

        // Generated within the same millisecond: advance the sequence
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            // Sequence overflow within the millisecond
            if (sequence == 0) {
                // Block until the next millisecond and get a new timestamp
                timestamp = tilNextMillis(lastTimestamp);
            }
        }
        // The timestamp changed: reset the sequence
        else {
            sequence = 0L;
        }

        // Remember the timestamp of the last generated ID
        lastTimestamp = timestamp;

        // Shift the parts and combine them with bitwise OR into a 64-bit ID
        return ((timestamp - twepoch) << timestampLeftShift) //
                | (datacenterId << datacenterIdShift) //
                | (workerId << workerIdShift) //
                | sequence;
    }

    /**
     * Block until the next millisecond, until a new timestamp is obtained
     * @param lastTimestamp timestamp of the last generated ID
     * @return current timestamp
     */
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    /**
     * Return the current time in milliseconds
     * @return current time (milliseconds)
     */
    protected long timeGen() {
        return System.currentTimeMillis();
    }

    //==============================Test=============================================
    /** Test */
    public static void main(String[] args) {
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0, 0);

        for (int i = 0; i < 100; i++) {
            long id = idWorker.nextId();
            System.out.println(id);
        }
    }
}

Supplementary knowledge: implementing distributed auto-increment IDs with the snowflake algorithm

Without further ado, here is the code.

import java.lang.management.ManagementFactory;
import java.net.InetAddress;
import java.net.NetworkInterface;

/**
 * Name: IdWorker.java
 * Description: distributed auto-increment ID
 *
 * A Java implementation of Twitter's Snowflake.
 *
 * The core code is the IdWorker class. Its structure is as follows, with each 0 standing for one bit and --- separating the parts:
 * 1||0---0000000000 0000000000 0000000000 0000000000 0 --- 00000 ---00000 ---000000000000
 * In the string above, the first bit is unused (it can in fact serve as the sign bit of the long), the next 41 bits are the millisecond timestamp,
 * followed by 5 datacenter bits and 5 machine ID bits (not really an identifier; in practice it identifies the thread),
 * and then a 12-bit counter within the current millisecond, adding up to exactly 64 bits, one long.
 * The benefit is that IDs are sorted by time overall, no ID collisions occur within the whole distributed system (the datacenter and machine ID tell nodes apart),
 * and it is efficient: in tests, snowflake generated about 260,000 IDs per second, which fully meets the need.
 *
 * 64-bit ID (42 (milliseconds) + 5 (machine ID) + 5 (business code) + 12 (repeated accumulation))
 *
 * @author Polim
 */
public class IdWorker {
    // Start marker used as the time base; usually a recent system time (must not change once chosen)
    private final static long twepoch = 1288834974657L;
    // Number of bits for the worker ID
    private final static long workerIdBits = 5L;
    // Number of bits for the datacenter ID
    private final static long datacenterIdBits = 5L;
    // Maximum worker ID
    private final static long maxWorkerId = -1L ^ (-1L << workerIdBits);

    // [The source text is truncated here: only fragments of the workerId and datacenterId
    // range checks survive, and the rest of the fields and methods is missing.]

    /**
     * Get maxWorkerId
     */
    protected static long getMaxWorkerId(long datacenterId, long maxWorkerId) {
        StringBuffer mpid = new StringBuffer();
        mpid.append(datacenterId);
        String name = ManagementFactory.getRuntimeMXBean().getName();
        if (!name.isEmpty()) {
            /*
             * GET jvmPid
             */
            mpid.append(name.split("@")[0]);
        }
        /*
         * Hashcode of MAC + PID, keeping the 16 low bits
         */
        return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
    }

    /**
     * Datacenter ID part
     */
    protected static long getDatacenterId(long maxDatacenterId) {
        long id = 0L;
        try {
            InetAddress ip = InetAddress.getLocalHost();
            NetworkInterface network = NetworkInterface.getByInetAddress(ip);
            if (network == null) {
                id = 1L;
            } else {
                byte[] mac = network.getHardwareAddress();
                id = ((0x000000FF & (long) mac[mac.length - 1])
                        | (0x0000FF00 & (((long) mac[mac.length - 2]) << 8))) >> 6;
                id = id % (maxDatacenterId + 1);
            }
        } catch (Exception e) {
            System.out.println(" getDatacenterId: " + e.getMessage());
        }
        return id;
    }
}


Statement

This article is reproduced from jb51. If there is any infringement, please contact admin@php.cn to have it deleted.
