Efficient way to fill missing values in unordered map from SQL table (SQL/C++)

Question

Currently there is a system that reads unique identifiers (ids) and their attached datasets from a SQL table into an unordered map. The datasets originally started at id 1, but datasets are added and removed roughly every 10 milliseconds. Note that not all datasets are always loaded into RAM. When the program starts, it reads SELECT MAX(id) from the database and then keeps adding 1 to a counter variable, which is used as the id for each added dataset. The ids of deleted datasets are never reused. This inevitably leads to gaps in the id sequence, and there are

P粉252423906 · Answer

If there is a database change every 10 milliseconds, that is 100 changes per second. A signed 32-bit int can hold approximately 2,147,483,648 positive values, which lasts about 21,474,836 seconds, or approximately 8 months. After that, no new ID is available.

The first solution is to use a 64-bit type instead of int. At the same rate of 100 IDs per second, a signed 64-bit counter lasts about 2.9 billion years, which seems enough :)
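
The lifetime estimate can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a constant rate of new IDs and a counter that is never reset; the function name `years_until_exhausted` is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Years until a counter of integer type T runs out of positive values,
// assuming IDs are handed out at a constant rate and never reused.
template <typename T>
constexpr double years_until_exhausted(double ids_per_second) {
    constexpr double seconds_per_year = 365.25 * 24 * 60 * 60;
    return static_cast<double>(std::numeric_limits<T>::max())
           / ids_per_second / seconds_per_year;
}
```

At 100 IDs per second, `years_until_exhausted<std::int32_t>(100.0)` comes out at roughly 0.68 years (about 8 months), while the `std::int64_t` version is in the billions of years.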

The second solution is to keep a vector covering all possible IDs. The vector stores one bool per ID (used/unused). A new ID is requested by scanning the vector for the first position marked as unused.
This vector uses a lot of RAM, although std::vector has a specialization for bool that requires less RAM.
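
A minimal sketch of this idea, assuming IDs start at 1 as in the question. The class name `BitmapIdAllocator` and its member functions are hypothetical, and the vector here grows on demand instead of being sized for all possible IDs up front.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: one bit per ID, true = used, false = unused.
// Index i of the vector corresponds to ID i + 1.
class BitmapIdAllocator {
public:
    // Returns the lowest unused ID and marks it as used.
    std::uint64_t acquire() {
        auto it = std::find(used_.begin(), used_.end(), false);
        if (it == used_.end()) {
            used_.push_back(true);   // no gap left: extend the range
            return used_.size();
        }
        *it = true;
        return static_cast<std::uint64_t>(it - used_.begin()) + 1;
    }

    // Marks an ID as unused again; IDs outside the range are ignored.
    void release(std::uint64_t id) {
        if (id >= 1 && id <= used_.size()) used_[id - 1] = false;
    }

    // Marks an ID read from the table as used (for example at startup).
    void mark_used(std::uint64_t id) {
        if (id == 0) return;
        if (id > used_.size()) used_.resize(id, false);
        used_[id - 1] = true;
    }

private:
    std::vector<bool> used_;
};
```

The scan in `acquire` is linear in the number of IDs, so remembering the position of the last hit would be a natural refinement if the range gets large.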

The third solution is to store a linked list (possibly doubly linked) of deleted (read: reusable) IDs.

When a new ID is requested, the list provides its head, or the size of the table if the list is empty.
When a dataset is deleted, its ID is inserted at the correct position, so the list is always sorted.
When an ID is reused, it is removed from the list.
Deleting the last record in the table may also allow the last nodes of the list to be deleted, because they become useless (the case ID > table size). That's why I recommend a doubly linked list, so that the last node can be removed quickly.

So the list frequently calls "new" and "delete" on its nodes, and is also traversed up and down (when doubly linked) to insert new nodes.
This is a bit slow, but hopefully the list does not grow too large, so the time required stays acceptable.

Also note that this list gives you the array of gaps you need.
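
The list-based approach could be sketched as follows. This is only an illustration under stated assumptions: the class `FreeListIdAllocator` and its members are hypothetical, `std::list` stands in for the doubly linked list, and the highest ID in use (as returned by SELECT MAX(id)) is used in place of the table size.

```cpp
#include <algorithm>
#include <cstdint>
#include <list>

// Hypothetical helper: keeps deleted IDs in a sorted doubly linked list.
class FreeListIdAllocator {
public:
    // max_id_in_table: assumed to come from SELECT MAX(id) at startup.
    explicit FreeListIdAllocator(std::uint64_t max_id_in_table)
        : max_id_(max_id_in_table) {}

    // Reuses the smallest free ID, or extends the range if there is no gap.
    std::uint64_t acquire() {
        if (free_.empty()) return ++max_id_;
        std::uint64_t id = free_.front();
        free_.pop_front();
        return id;
    }

    // Called when a dataset is deleted.
    void release(std::uint64_t id) {
        if (id == 0 || id > max_id_) return;   // not an ID in the range
        if (id == max_id_) {
            // The last record was deleted: shrink the range and drop
            // trailing list nodes that are now beyond it.
            --max_id_;
            while (!free_.empty() && free_.back() == max_id_) {
                free_.pop_back();
                --max_id_;
            }
            return;
        }
        // Sorted insert; linear walk, as described above. Duplicates are skipped.
        auto pos = std::lower_bound(free_.begin(), free_.end(), id);
        if (pos == free_.end() || *pos != id) free_.insert(pos, id);
    }

    // The current gaps in the ID sequence, in ascending order.
    const std::list<std::uint64_t>& gaps() const { return free_; }

private:
    std::uint64_t max_id_;
    std::list<std::uint64_t> free_;
};
```

Gaps that already exist in the table at startup would still have to be fed in once, for example by calling `release` for each missing ID.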
