Reversing verification code algorithm

Hi there

Not sure if this is the right place to post this, but it might ring a bell with someone.

I have a large sample of product serial numbers and their resulting verification codes. I’ve reversed the binary, but it turns out the codes are not generated at runtime: the serial and code are burnt in at the factory 🙁

Serial numbers are 9 digits.

The verification code for each serial is 6 letters (a-z).

After doing some analysis, I found some collisions where the verification codes for very different serial numbers are the same. Not surprising, since 26^6 (about 309 million) < 10^9, so some serials must share a code.
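
For reference, the counting argument behind that can be checked in a couple of lines. This is only a sketch, and it assumes every 9-digit string (leading zeros included) is a valid serial, which the data may not bear out:

```python
# Size of the code space: six letters, each one of a-z.
CODE_SPACE = 26 ** 6

# Size of the serial space: nine decimal digits.
# Assumption: leading zeros are allowed, so all 10^9 values are possible.
SERIAL_SPACE = 10 ** 9

# If codes were spread evenly, this many serials would share each code.
avg_serials_per_code = SERIAL_SPACE / CODE_SPACE

print(CODE_SPACE, SERIAL_SPACE, round(avg_serials_per_code, 2))
```

So with a perfectly even spread, each code would be shared by a little over three serials on average.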

In these cases, the difference between the 2 numbers is a prime, or in one case, the product of 2 primes.

e.g. (not real data)

639271233 and 640018040 both share code pzqlln

The difference between them is the prime 746807 (there’s an unrelated serial pair with this same prime difference, though its code is different)

Other collisions have a different prime difference (+/- prime), or in 1 case a product of two primes (+/- prime1 * prime2)
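
A rough sketch of how the collision analysis could be repeated over the whole dataset: group serials by code, take the difference of every colliding pair, and factor it by trial division. The function names `factorize` and `collision_differences` are made up for illustration, and the sample rows are not real data (the first two are the example pair above, the third is invented).

```python
from collections import defaultdict
from itertools import combinations


def factorize(n):
    """Return the prime factors of n (with repetition) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def collision_differences(pairs):
    """Group serials by code and factor the difference of each colliding pair.

    `pairs` is an iterable of (serial, code) tuples.
    Returns a list of (code, low_serial, high_serial, difference, factors).
    """
    by_code = defaultdict(list)
    for serial, code in pairs:
        by_code[code].append(serial)

    results = []
    for code, serials in by_code.items():
        # Every unordered pair of distinct serials sharing this code.
        for a, b in combinations(sorted(set(serials)), 2):
            diff = b - a
            results.append((code, a, b, diff, factorize(diff)))
    return results


# Illustrative rows only, not real data.
sample = [
    (639271233, "pzqlln"),
    (640018040, "pzqlln"),
    (123456789, "abcdef"),
]

results = collision_differences(sample)
for row in results:
    print(row)
```

Differences are below 10^9, so trial division only has to go up to roughly 31,623 and is fast enough here. A factor list with a single entry means the difference is prime; two entries means a product of two primes.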

Strangely, there are also instances where a range of 150 consecutive numbers all have the same 6-letter code. But most of the time, in ranges like that, the code is different for each number. I assume there is a weakness in the algorithm. Maybe XOR is involved.
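
The constant ranges could be located automatically with something like the sketch below. It scans the (serial, code) pairs in order and reports runs of consecutive serials that share one code. `constant_runs` is a hypothetical helper name and the sample rows are invented.

```python
def constant_runs(pairs, min_length=2):
    """Find runs of consecutive serials that all map to the same code.

    `pairs` is an iterable of (serial, code) tuples.
    Returns a list of (first_serial, last_serial, code, run_length).
    """
    runs = []
    start = prev = code = None
    for serial, current in sorted(set(pairs)):
        # Extend the current run if this serial follows directly with the same code.
        if prev is not None and serial == prev + 1 and current == code:
            prev = serial
            continue
        # Otherwise close the previous run (if long enough) and start a new one.
        if start is not None and prev - start + 1 >= min_length:
            runs.append((start, prev, code, prev - start + 1))
        start = prev = serial
        code = current
    # Close the final run.
    if start is not None and prev - start + 1 >= min_length:
        runs.append((start, prev, code, prev - start + 1))
    return runs


# Invented rows, just to show the output shape.
sample_rows = [
    (100, "aaaaaa"), (101, "aaaaaa"), (102, "aaaaaa"),
    (103, "bbbbbb"),
    (105, "bbbbbb"), (106, "bbbbbb"),
]

runs = constant_runs(sample_rows)
for run in runs:
    print(run)
```

If the real runs all turn out to have the same length and start at serials with the same remainder modulo that length, that might hint that the low part of the serial is being discarded before the code is computed. That is only a guess; gaps in the sample could also cut runs short.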

Does this sound similar to a known check code generation algorithm?

Or is there any solver/analysis tool I can feed the dataset into?

I feel like there are enough clues to have a stab at this, but sadly my math class was a long time ago and I can only remember Pythagoras.

submitted by /u/incomingone

June 4, 2023