Home >Backend Development >Golang >Demystifying OTPs: the logic behind the offline generation of tokens
Hello there! Another evening, on my way back home, I decided to check the mailbox. I don't mean my email inbox, but the old-school actual box where the postman puts the physical letters. And to my great surprise, I found an envelope there with something inside! While opening it, I spent a few moments hoping that it's the decades delayed letter from Hogwarts. But then I had to get back down to Earth, once I noticed that it's a boring "grown-up" letter from the bank. I skimmed through the text and realized that my "digital-only" bank for cool kids had been acquired by the biggest player on the local market. And as a token of the new beginning, they added this to the envelope:
Alongside the instructions on how to use it.
If you are like me, and have never come across such a piece of the tech innovation, let me share what I learned from the letter: new owners decided to enforce the security policies from their company, which means that all the user accounts from now on will have the MFA enabled (kudos for that, btw). And the device you could see above generates one-time tokens 6-digits long that are used as a second factor while logging in to your bank account. Basically, the same way as apps like Authy, Google Authenticator or 2FAS work, but in a physical shape.
So, I gave it a try, and the login process went smoothly: the device showed me a 6-digits code, I entered it in my banking app, and this got me in. Hooray! But then something struck me: how does this thing work? There is no way it is connected to the internet somehow, but it manages to generate correct codes that are accepted by my bank server. Hm... Could it have a SIM-card or something similar inside? No way!
Realizing that my life will never be the same, I began wondering about apps I've mentioned above (Authy and friends)? My inner researcher was awakened, so I switched my phone into the airplane mode and, to my big surprise, realized that they work perfectly fine offline: they keep generating codes that are accepted by the servers of the apps. Interesting!
Not sure about you, but I've always taken the one-time token flow for granted and have never actually given it a proper thought (especially due to the fact that these days it is rare that my phone has no internet unless I'm doing some outdoor adventures), so that was the root cause of my surprise. Otherwise, it makes a perfect sense from the security point of view to work this way, as the generation process is purely local, so safe-ish from the external actors. But how does it work?
Well, modern technologies like Google or ChatGPT make it easy to find the answer easily. But this technical problem seemed fun to me, so decided to give it a try and solve it on my own first.
Let's start with what we have:
The server validation part hints that the server must be able to generate the same code as the offline device to compare them. Hm..that can be helpful.
My further observations of my new "toy" brought even more discoveries:
The only logic explanation I could come up with is that these codes have a certain lifetime. I'd like to tell a story of me trying to count the duration of it in "1-2-3-...-N" fashion, but it won't be true: I got a big hint from the apps like Authy and Co, where I saw the 30 seconds TTL. Good find, let's add this to the list of the known facts.
Let's summarize the requirements we have so far:
All right, but the main question remains unanswered: how come the offline app can generate the value that will match the one from the other app? What do they have in common?
If you are into the Lord of the Rings universe, you might remember how Bilbo played a riddle game with Gollum, and got this one to solve:
This thing all things devours:
Birds, beasts, trees, flowers;
Gnaws iron, bites steel;
Grinds hard stones to meal;
Slays king, ruins town,
And beats high mountain down.
Spoiler alert, but Mr. Baggins got lucky and came up with the correct answer accidentally - "Time!". Believe it or not, but this is exactly the answer to our riddle as well: any 2 (or more) apps have access to the same time as long as they have an embedded clock within. The latter is not a problem these days, and the device in question is big enough to fit it. Look around, and the chances are that the time on your hand watch, mobile phone, TV, oven, and clock on the wall is the same. We are into something here, it seems like we have found the base for the OTP (one-time password) computing!
Relying on time has its own set of challenges:
Let's address them one by one:
Ok, this is settled, so let's try to implement the very first version of our algorithm using time as the basis. Since we are interested in a 6 digits result, it sounds like a smart choice to rely on a timestamp rather than the human-readable date. Let's start from there:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current)
According to the Go docs, the .Unix() returns
the number of seconds elapsed since January 1, 1970 UTC.
This is what printed to the terminal:
Current timestamp: 1733691162
That's a good start, but if we rerun that code, the timestamp value will change, while we'd like to keep it stable for 30 seconds. Well, piece of cake, let's divide it by 30 and use that value as a base:
// gets current timestamp current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds base := current / 30 fmt.Println("Base: ", base)
Let's run it:
Current timestamp: 1733691545 Base: 57789718
And again:
Current timestamp: 1733691552 Base: 57789718
The base value remains the same. Let's wait for a bit, and run it again:
Current timestamp: 1733691571 Base: 57789719
The base value has changed, as the 30 seconds window has passed - good job!
If the "divide by 30" logic doesn't make sense, let me explain it with a simple example:
I hope it makes more sense now.
However, not all the requirements are satisfied yet, as we need a 6 digits result, while current base has 8 digits as of today, but at some point of time in the future it might reach 9 digits point, and so on. Well, let's use another simple math trick: let divide the base by 1 000 000, and get the remainder, which will always have exactly 6 digits, as the reminder can be any number from 0 to 999 999, but never larger:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current)
The fmt.Sprintf("d", code) part appends leading zeros in case if our code value has less than 6 digits. For example, 1234 will be converted into 001234.
The entire code for this post can be found here.
Let's run this code:
Current timestamp: 1733691162
All right, we get our 6-digits code, hooray! But something doesn't feel right here, does it? If I give you this code, and you will run it at the same time as I do, you'll get the same code, as I. This doesn't make it a secure one-time password, right? Here comes a new requirement:
Of course, some collisions are inevitable, if we have beyond 1 million users, as this is the max possible unique values per 6 digits. But these are rare and technically unavoidable collisions, not the algorithm design flaw like we have now.
I don't think any clever mathematical tricks will help us here on their own: if we need separate results per user, we need a user-specific state to make it happen. As engineers and, at the same time, users of many services, we know that to grant access to their APIs, services rely on private keys, which are unique per user. Let's introduce a private key for our use case as well to distinguish between the users.
A simple logic for generating private keys as integers between 1 000 000 and 999 999 999:
// gets current timestamp current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds base := current / 30 fmt.Println("Base: ", base)
We are using pkDb map as a way to prevent duplicates among private keys, and if the duplicate has been detected, we run the generation logic once more until we get a unique result.
Let's run this code to get the private key sample:
Current timestamp: 1733691545 Base: 57789718
Let's make a use of this private key within our code generation logic to make sure we are getting different results per private key. Since our private key is of integer type, the simplest thing we can do is to add it to the base value, and keep the remaining algorithm as it is:
Current timestamp: 1733691552 Base: 57789718
Let's make sure it produces different results for different private keys:
Current timestamp: 1733691571 Base: 57789719
The outcome looks as we wanted and expected:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds: base := current / 30 fmt.Println("Base: ", base) // makes sure it has only 6 digits: code := base % 1_000_000 // adds leading zeros if necessary: formattedCode := fmt.Sprintf("%06d", code) fmt.Println("Code: ", formattedCode)
Works like charm! This means that the private key should be injected into the device that generates codes before it is sent to the users like me: that should not be a problem at all for the banks.
Are we done now? Well, only if we are satisfied with the artificial scenario we used. If you have ever enabled the MFA for any services / websites you have account on, you might have noticed that the web resource asks you to scan the QR code with your second-factor app of choice (Authy, Google Authenticator, 2FAS, etc.) that will input the secret code into your app and start generating 6-digits code from that moment on. Alternatively, it is possible to enter the code manually.
I'm bringing this up to mention that it is possible to take a peek at the format of the real private keys used in the industry. They are usually 16-32 characters long Base32-encoded strings that look like this:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current)
As you can see, this is quite different from the integer private keys we used, and the current implementation of our algorithm won't work if we are to switch to this format. How can we adjust our logic?
Let's start with a simple approach: our code won't compile, because of this line:
Current timestamp: 1733691162
as pk is of type string from now on. So why don't we convert it into integer? While there are way more elegant and performant ways to do that, here is the simplest thing I came up with:
// gets current timestamp current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds base := current / 30 fmt.Println("Base: ", base)
This is heavily inspired by the Java hashCode() implementation for the String data type, which makes it good-enough for our scenario.
Here is the adjusted logic:
Current timestamp: 1733691545 Base: 57789718
Here is the terminal output:
Current timestamp: 1733691552 Base: 57789718
Nice, 6-digits code is there, good job. Let's wait to get to the next time window and run it again:
Current timestamp: 1733691571 Base: 57789719
Hm...it works, but the code is, basically the increment of the previous value, which is not good, as this way OTPs are predictable, and having its value and knowing what time it belongs to, it is very simple to start generating the same values without the need of knowing the private key. Here is the simple pseudocode for this hack:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds: base := current / 30 fmt.Println("Base: ", base) // makes sure it has only 6 digits: code := base % 1_000_000 // adds leading zeros if necessary: formattedCode := fmt.Sprintf("%06d", code) fmt.Println("Code: ", formattedCode)
where keepWithinSixDigits will make sure that after the 999 999 the next value is 000 000 and so on to keep the value within the 6-digits limit possibilities.
As you can see, it is a serious security flaw. Why does it happen? If we take a look at the base calculation logic, we'll see that it relies on 2 factors:
The hash produces the same value for the same key, so its value is constant. As for the current / 30 , it has the same value for 30 seconds, but once the window has passed, the next value will be the increment of the previous one. Then base % 1_000_000 behaves the way we see. Our previous implementation (with private keys as integers) had the same vulnerability, but we didn't notice that - lack of testing to blame.
We need to transform current / 30 into something to make the change of its value more noticeable.
There are multiple ways to achieve that, and some cool math tricks exist out there, but for the educational purposes let's prioritize the readability of the solution we'll go with: let's extract current / 30 into a separate variable base and include it into the hash computation logic:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current)
This way, even though the base will change with 1 every 30 seconds, after being used within the hash() function logic, the weight of the diff will increase due to the series of performed multiplications.
Here is the updated code example:
Current timestamp: 1733691162
Let's run it:
// gets current timestamp current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds base := current / 30 fmt.Println("Base: ", base)
Boom! How come did we get the minus value here? Well, it seems like we ran out of the int64 ranges, so it capped the values to the minus and started over - my Java fellows are familiar with this from the hashCode() behavior. The fix is simple: let's take the absolute value from the result, so then the minus sign is ignored:
Current timestamp: 1733691545 Base: 57789718
Here is entire code sample with the fix:
Current timestamp: 1733691552 Base: 57789718
Let's run it:
Current timestamp: 1733691571 Base: 57789719
Let's run it again to make sure that the OTP values are distributed now:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds: base := current / 30 fmt.Println("Base: ", base) // makes sure it has only 6 digits: code := base % 1_000_000 // adds leading zeros if necessary: formattedCode := fmt.Sprintf("%06d", code) fmt.Println("Code: ", formattedCode)
Nice, finally a decent solution!
Actually, that was the moment when I stopped my manual implementation process, as I had my share of fun and learned something new. However, it's neither the best solution nor the one I'd go live with. Among other things, it has a big flaw: as you can see, our logic always deals with big numbers due to the hashing logic and timestamp values, which means that it's highly unlikely for us to be able to generate results that start with 1 or more zeros: e.g., 012345 , 001234, etc., even though they are completely valid. Due to that, we are 100 000 possible values short, which is 10% from the possible number of outcomes of the algorithm - the chances of collisions are higher this way. Not cool!
I won't go deep into the implementation that is used in the real applications, but for those who are curious, I'll share two RFCs worth taking a look at:
And here is the pseudocode implementation that will work the intended way based on the RFCs above:
Current timestamp: 1733692423 Base: 57789747 Code: 789747
As you can see, we got very close to that, but the original algorithm uses more advanced hashing (HMAC-SHA1 in this example), and performs some bitwise operations to normalize the output.
However, there is one more thing I'd like to cover before we call it a day: security. I'd strongly encourage you against implementing the logic of generating OTPs on your own, as there are plenty of libraries out there that have it done for us. The room for error is huge, and it's a short distance to the vulnerability that will be discovered and exploited by the bad actors out there.
Even if you get the generation logic right and will cover it with tests, there are other things that can go wrong. For example, how much do you think it will take to brute-force the 6-digit code? Let's experiment:
// gets current timestamp: current := time.Now().Unix() fmt.Println("Current timestamp: ", current)
Let's run this code:
Current timestamp: 1733691162
And once more:
// gets current timestamp current := time.Now().Unix() fmt.Println("Current timestamp: ", current) // gets a number that is stable within 30 seconds base := current / 30 fmt.Println("Base: ", base)
As you can see, it takes around 70 ms to guess the code via simple brute-forcing for-loop. That's 400 times faster than the lifetime of the OTP! The server of the app/website using OTP mechanism needs to prevent that by, for example, not accepting new codes for the next 5 or 10 seconds after 3 failed attempts. This way the attacker gets only 18 or 9 attempts correspondingly within the 30 seconds window, which is not enough for the pool of 1 million possible values.
And there are other things like this that are easy to overlook. So, let me repeat: don't build this from scratch, but rely on the existing solutions.
Anyway, I hope you learned something new today, and the OTP logic won't be a mystery for you from this point on. Also, if at some point in life, you'll need to make your offline device generate some values using a reproducible algorithm, you have a good idea where to start.
Thanks for the time you spent reading this post, and have fun! =)
P.S. Receive an email once I publish a new post - subscribe here
P.P.S. Like the rest of the cool kids, I have created a Bluesky account lately, so, please, help me to make my feed more fun =)
The above is the detailed content of Demystifying OTPs: the logic behind the offline generation of tokens. For more information, please follow other related articles on the PHP Chinese website!