By Norman Yue, Chief Technical Officer
Originally published: http://wordswithcomputers.wordpress.com/2014/07/02/breaking-lcg_value/
One of the things I do, under the guise of OWASP Sydney Chapter Lead, is run a weekly workshop – every week, a small group of people get together to work on some security topics ranging from reverse engineering to web-based wargames, followed by security chit-chat over dinner.
Recently, at one of these get togethers, a (very smart) friend pointed me to PHP’s lcg_value function.
First looked at by samy in 2010, lcg_value is a PHP pseudo-random number generator, which generates a random 64-bit floating point. To cut a long story short, this function works as follows (variable names taken from samy’s lcg_state_forward.c):
Within lcg_value, “s1” and “s2” are internal state variables – PHP keeps track of them, and for each iteration of lcg_value, it updates them. If an attacker were to know the values of “s1” and “s2”, the attacker can easily predict future values of lcg_value.
The problem arises because “s1” and “s2” are related. If you know the output of lcg_value, and one of these variables, you can calculate the other (it’s a constant-time operation). Therefore, if an attacker knows at least two outputs for lcg_value (z1 and z2), he/she can brute force all the possible “s1” values for a given lcg_value output (z1), calculate the “s2” for each, run the lcg_value state variable permutation once, and see if the new “s1” and “s2” values match up with the second known lcg_value output (z2).
Armed with the correct “s1” and “s2” internal state values, an attacker can predict future values of lcg_output, as follows:
As an attacker only has to brute force one 32-bit state variable instead of 2 (as calculating the other state variable is a constant time operation), this reduces the attack time to about 5 minutes on a modern computer. I’ll leave the implementation of this particular attack to the reader, a good place to start is lcg_state_forward.c .
The implications of this alone are somewhat limited – it’s rare that lcg_value’s output is directly output to an attacker, such that it’s immediately useful to guess lcg_value’s internal state and next outputs.
The greater implications arise when we are able to guess lcg_value’s internal state based on a partial output – for example, when part of lcg_value’s output is used to generate a session cookie, or request a CAPTCHA image and corresponding answer – but this is a tale for another time.
On a hunch, I also had a look at some source code belonging to other open source programming languages – it’s surprising how much source code from a decade ago is still in use today, sitting at the heart of widely-used software, forgotten and unmaintained.
Acknowledgements: Blair Strang and Scott Contini from Covata , who first came up with this idea.