Generating unique IDs is a problem we often encounter, and struggle with, when designing a system. There are many ways to generate IDs, each suited to different scenarios, requirements, and performance needs, so more complex systems often use several ID generation strategies. Here are some common ones.
1. Database auto-increment sequence or field
This is the most common approach. The database generates the ID, which is unique across the entire database.
Advantages:
Simple, convenient code, and acceptable performance.
Numeric IDs are naturally sorted, which is helpful for paging or results that need to be sorted.
Disadvantages:
Syntax and implementation differ between databases, so extra work is needed when migrating databases or supporting multiple database versions.
With a single database, read-write separation, or one master with multiple slaves, only the one master can generate IDs, which creates a risk of a single point of failure.
It is difficult to scale when performance no longer meets requirements.
Merging multiple systems or migrating data is quite painful.
Splitting tables and databases (sharding) causes trouble.
Optimization plan:
To address the single master, use multiple Master databases. Each Master is given a different starting number but the same step size, which can be the number of Masters. For example: Master1 generates 1, 4, 7, 10; Master2 generates 2, 5, 8, 11; Master3 generates 3, 6, 9, 12. This produces unique IDs across the cluster and also greatly reduces the ID generation load on each database.
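The start-plus-step scheme above can be sketched in a few lines of Java. This is only an in-memory illustration: `SteppedSequence` is a hypothetical class standing in for one Master's counter, not a database API. In MySQL, for instance, the same effect is usually configured with the `auto_increment_offset` and `auto_increment_increment` system variables.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory stand-in for one Master's auto-increment counter. */
final class SteppedSequence {
    private final AtomicLong next;
    private final long step;

    /**
     * @param start first ID this Master hands out (1..step)
     * @param step  shared step size, here the total number of Masters
     */
    SteppedSequence(long start, long step) {
        if (step < 1 || start < 1 || start > step) {
            throw new IllegalArgumentException("require 1 <= start <= step");
        }
        this.next = new AtomicLong(start);
        this.step = step;
    }

    /** Returns the current value and advances by the shared step size. */
    long nextId() {
        return next.getAndAdd(step);
    }

    public static void main(String[] args) {
        int masters = 3;
        for (int m = 1; m <= masters; m++) {
            SteppedSequence seq = new SteppedSequence(m, masters);
            StringBuilder line = new StringBuilder("Master" + m + ":");
            for (int i = 0; i < 4; i++) {
                line.append(' ').append(seq.nextId());
            }
            System.out.println(line);
        }
    }
}
```

Because every Master starts at a different offset and all advance by the same step, the sequences never overlap. The cost is that adding a Master later means changing the step on all of them.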
2. UUID
A common approach. UUIDs can be generated by the database or by the program, and are generally globally unique.
Advantages:
Simple and convenient code.
ID generation is very fast and basically never becomes a performance problem.
Globally unique. Data migration, merging of system data, and database changes can all be handled with ease.
Disadvantages:
Not sortable; the IDs are not guaranteed to trend upward.
UUIDs are often stored as strings, which makes queries relatively inefficient.
They take up relatively more storage space. For a massive database, the storage volume needs to be considered.
More data has to be transmitted.
Not human-readable.
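The storage cost mentioned above depends on how the UUID is stored. The following sketch, using only the JDK's `java.util.UUID`, generates a random UUID and compares its 36-character string form with a 16-byte binary form. The helper class `UuidStorage` and its method names are made up for this illustration, and whether a binary column suits your database is a separate decision.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Illustrative helper (hypothetical name) for packing a UUID into 16 bytes. */
final class UuidStorage {

    /** Packs the two 64-bit halves of the UUID into a 16-byte array. */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Rebuilds the UUID from the 16-byte form produced by toBytes. */
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long most = buf.getLong();
        long least = buf.getLong();
        return new UUID(most, least);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println("string form: " + id + " (" + id.toString().length() + " chars)");
        System.out.println("binary form: " + toBytes(id).length + " bytes");
    }
}
```

Storing the binary form reduces the size, but it does nothing for sort order or readability.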
3. Generating IDs with Redis
When the database cannot generate IDs fast enough, we can try Redis instead. Because Redis executes commands on a single thread, its atomic INCR and INCRBY operations can be used to generate globally unique IDs.
A Redis cluster can be used to obtain higher throughput. Suppose the cluster has 5 Redis instances: initialize their values to 1, 2, 3, 4, 5 respectively and use a step size of 5 for all of them. The IDs generated by each instance are:
A: 1, 6, 11, 16, 21
B: 2, 7, 12, 17, 22
C: 3, 8, 13, 18, 23
D: 4, 9, 14, 19, 24
E: 5, 10, 15, 20, 25
Which instance a request is routed to does not matter, because each one yields different IDs. The step size and initial values must be fixed in advance, and the setup is difficult to change later, but 3 to 5 servers are usually enough to meet demand. Using a Redis cluster also solves the single point of failure problem.
Redis is also well suited to generating serial numbers that restart from 0 every day. For example, order number = date + that day's auto-incrementing number. You can create a new key in Redis each day and use INCR to increment it.
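The daily order number idea can be sketched as follows. To keep the example runnable without a Redis server, a `ConcurrentHashMap` stands in for Redis and the `incr` method mimics the behavior of the Redis INCR command (a missing key is treated as 0, then incremented). The class name, the `order:` key prefix, and the six-digit padding are all assumptions made for this illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one counter key per day, order number = date + counter. */
final class DailyOrderNumbers {
    // In-memory stand-in for Redis; a real system would call INCR on a Redis client.
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Mimics Redis INCR: a missing key starts at 0, and the incremented value is returned. */
    long incr(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    /** Builds an order number such as yyyyMMdd followed by a zero-padded daily sequence. */
    String nextOrderNumber(LocalDate day) {
        String date = day.format(DateTimeFormatter.BASIC_ISO_DATE); // yyyyMMdd
        long seq = incr("order:" + date);                           // assumed key naming
        return String.format("%s%06d", date, seq);
    }

    public static void main(String[] args) {
        DailyOrderNumbers generator = new DailyOrderNumbers();
        LocalDate today = LocalDate.now();
        System.out.println(generator.nextOrderNumber(today));
        System.out.println(generator.nextOrderNumber(today));
    }
}
```

Note that INCR returns 1 on the first call for a new key, so subtract 1 if the numbering really must begin at 0. With real Redis, old daily keys would typically be given an expiry so they do not accumulate.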
Advantages:
4. Snowflake algorithm
```java
public class IdWorker {

    // ============================== Fields ==============================
    /** Start epoch in milliseconds (2015-01-01 00:00 in UTC+8) */
    private final long twepoch = 1420041600000L;

    /** Number of bits used for the worker ID */
    private final long workerIdBits = 5L;

    /** Number of bits used for the datacenter ID */
    private final long datacenterIdBits = 5L;

    /** Maximum supported worker ID, which is 31 (this shift trick quickly computes the largest value that fits in a given number of bits) */
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);

    /** Maximum supported datacenter ID, which is 31 */
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);

    /** Number of bits used for the sequence within the ID */
    private final long sequenceBits = 12L;

    /** The worker ID is shifted left by 12 bits */
    private final long workerIdShift = sequenceBits;

    /** The datacenter ID is shifted left by 17 bits (12 + 5) */
    private final long datacenterIdShift = sequenceBits + workerIdBits;

    /** The timestamp is shifted left by 22 bits (5 + 5 + 12) */
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    /** Mask for the sequence, here 4095 (0b111111111111 = 0xfff = 4095) */
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    /** Worker ID (0~31) */
    private long workerId;

    /** Datacenter ID (0~31) */
    private long datacenterId;

    /** Sequence within the current millisecond (0~4095) */
    private long sequence = 0L;

    /** Timestamp of the last generated ID */
    private long lastTimestamp = -1L;

    // ============================== Constructors ==============================
    /**
     * Constructor
     * @param workerId worker ID (0~31)
     * @param datacenterId datacenter ID (0~31)
     */
    public IdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    // ============================== Methods ==============================
    /**
     * Gets the next ID (this method is thread-safe)
     * @return SnowflakeId
     */
    public synchronized long nextId() {
        long timestamp = timeGen();

        // If the current time is earlier than the timestamp of the last ID,
        // the system clock has moved backwards, so throw an exception
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(
                    String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }

        // Same millisecond as the last ID: advance the in-millisecond sequence
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            // Sequence overflowed within this millisecond
            if (sequence == 0) {
                // Block until the next millisecond and take the new timestamp
                timestamp = tilNextMillis(lastTimestamp);
            }
        }
        // Timestamp changed: reset the in-millisecond sequence
        else {
            sequence = 0L;
        }

        // Remember the timestamp of this ID
        lastTimestamp = timestamp;

        // Shift each part into place and OR them together into a 64-bit ID
        return ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
    }

    /**
     * Blocks until the next millisecond, that is, until a new timestamp is obtained
     * @param lastTimestamp timestamp of the last generated ID
     * @return the current timestamp
     */
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    /**
     * Returns the current time in milliseconds
     * @return current time (milliseconds)
     */
    protected long timeGen() {
        return System.currentTimeMillis();
    }

    // ============================== Test ==============================
    /** Test */
    public static void main(String[] args) {
        IdWorker idWorker = new IdWorker(0, 0);
        for (int i = 0; i < 1000; i++) {
            long id = idWorker.nextId();
            System.out.println(Long.toBinaryString(id));
            System.out.println(id);
        }
    }
}
```
The snowflake algorithm can be modified to suit your own project. For example, estimate the future number of data centers, the number of machines per data center, and the peak number of IDs needed within a single millisecond, then adjust the number of bits the algorithm assigns to each field.
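When adjusting the bit allocation, it helps to check what a given split actually buys. The sketch below is a hypothetical helper, not part of the original `IdWorker`. It assumes the 64-bit layout of the code above: one sign bit is left unused, and whatever the datacenter, worker, and sequence fields do not consume goes to the millisecond timestamp.

```java
/** Hypothetical calculator for the capacity of a snowflake-style bit layout. */
final class SnowflakeLayout {
    final int datacenterBits;
    final int workerBits;
    final int sequenceBits;
    final int timestampBits;

    SnowflakeLayout(int datacenterBits, int workerBits, int sequenceBits) {
        int used = datacenterBits + workerBits + sequenceBits;
        if (datacenterBits < 0 || workerBits < 0 || sequenceBits < 0 || used >= 63) {
            throw new IllegalArgumentException("fields must be non-negative and leave room for the timestamp");
        }
        this.datacenterBits = datacenterBits;
        this.workerBits = workerBits;
        this.sequenceBits = sequenceBits;
        // 63 usable bits in a positive long; the remainder holds the timestamp.
        this.timestampBits = 63 - used;
    }

    long maxDatacenters() { return 1L << datacenterBits; }

    long maxWorkersPerDatacenter() { return 1L << workerBits; }

    long idsPerMillisecondPerWorker() { return 1L << sequenceBits; }

    /** Approximate years until the timestamp field overflows, counted from the chosen epoch. */
    double yearsOfTimestamps() {
        double millisPerYear = 1000.0 * 60 * 60 * 24 * 365;
        return Math.pow(2, timestampBits) / millisPerYear;
    }

    public static void main(String[] args) {
        SnowflakeLayout layout = new SnowflakeLayout(5, 5, 12); // the split used by IdWorker
        System.out.printf("datacenters=%d workers=%d ids/ms=%d timestampBits=%d years=%.1f%n",
                layout.maxDatacenters(), layout.maxWorkersPerDatacenter(),
                layout.idsPerMillisecondPerWorker(), layout.timestampBits,
                layout.yearsOfTimestamps());
    }
}
```

Every bit moved into the worker or sequence fields halves the lifetime of the timestamp field, so the estimate is worth making before the first ID is issued.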
Advantages: