(Secure) Random string?
In Lua, one would usually generate random value开发者_StackOverflow社区s, and/or strings by using math.random
& math.randomseed
, where os.time
is used for math.randomseed
.
This method however has one major weakness; The returned number is always just as random as the current time, AND the interval for each random number is one second, which is way too long if one needs many random values in a very short time.
This issue is even pointed out by the Lua Users wiki: http://lua-users.org/wiki/MathLibraryTutorial, and the corresponding RandomStringS receipe: http://lua-users.org/wiki/RandomStrings.
So I've sat down and wrote a different algorithm (if it even can be called that), that generates random numbers by (mis-)using the memory addresses of tables:
math.randomseed(os.time())
function realrandom(maxlen)
local tbl = {}
local num = tonumber(string.sub(tostring(tbl), 8))
if maxlen ~= nil then
num = num % maxlen
end
return num
end
function string.random(length,pattern)
local length = length or 11
local pattern = pattern or '%a%d'
local rand = ""
local allchars = ""
for loop=0, 255 do
allchars = allchars .. string.char(loop)
end
local str=string.gsub(allchars, '[^'..pattern..']','')
while string.len(rand) ~= length do
local randidx = realrandom(string.len(str))
local randbyte = string.byte(str, randidx)
rand = rand .. string.char(randbyte)
end
return rand
end
At first, everything seems perfectly random, and I'm sure they are... at least for the current program.
So my question is, how random are these numbers returned by realrandom
really?
Or is there an even better way to generate random numbers in a shorter interval than one second (which kind of implies that os.time
shouldn't be used, as explaind above), without relying on external libraries, AND, if possible, in an entirely crossplatform manner?
EDIT:
There seems to be a major misunderstanding regarding the way the RNG is seeded; In production code, the call tomath.randomseed()
happens just once, this was just a badly chosen example here.
What I mean by the random value is only random once per second, is easily demonstrated by this paste: http://codepad.org/4cDsTpcD
As this question will get downvoted regardless my edits, I also cancelled my previously accepted answer - In hope for a better one, even if just better opinions. I understand that issues regarding random values/numbers has been discussed many times before, but I have not found such a question that could be relevant to Lua - Please keep that in mind!
You should not call seed each time you call random, you ought to call it only once, on the program initialization (unless you get the seed from somewhere, for example, to replicate some previous "random" behaviour).
Standard Lua random generator is of poor quality in the statistical sense (as it is, in fact, standard C random generator), do not use it if you care for that. Use, for example,
lrandom
module (available in LuaRocks).If you need more secure random, read from
/dev/random
on Linux. (I think that Windows should have something along the same lines — but you may need to code something in C to use it.)Relying on table pointer values is a bad idea. Think about alternate Lua implementations, in Java, for example — there is no telling what they would return. (Also, the pointer values may be predictable, and they may be, under certain circumstances the same each time the program is invoked.)
If you want finer precision for the seed (and you will want this only if you're launching the program more often than once per second), you should use a timer with better resolution. For example,
socket.gettime()
from LuaSocket. Multiply it by some value, sincemath.randomseed
is working with integer part only, andsocket.gettime()
returns time in (floating point) seconds.require 'socket' math.randomseed(socket.gettime() * 1e6) for i = 1, 1e3 do print(math.random()) end
This method however has one major weakness; The returned number is always just as random as the current time, AND the interval for each random number is one second, which is way too long if one needs many random values in a very short time.
It has those weaknesses only if you implement it incorrectly.
math.randomseed
is supposed to be called sparingly - usually just once at the beginning of your program, and it usually seeds using os.time
. Once the seed is set, you can use math.random
many times, and it will yield random values.
See what happens on this sample:
> math.randomseed(1)
> return math.random(), math.random(), math.random()
0.84018771715471 0.39438292681909 0.78309922375861
> math.randomseed(2)
> return math.random(), math.random(), math.random()
0.70097636929759 0.80967634907443 0.088795455214007
> math.randomseed(1)
> return math.random(), math.random(), math.random()
0.84018771715471 0.39438292681909 0.78309922375861
When I change the seed from 1 to 2, I get different random results. But when I go back to 1, the "random sequence" is reset. I obtain the same values as before.
os.time()
returns an ever-increasing number. Using it as a seed is appropriate; then you can invoke math.random
forever and have different random numbers every time you invoke it.
The only scenario you have to be a bit worried about non-randomness is when your program is supposed to be executed more than once per second. In that case, as the others are saying, the simplest solution is using a clock with higher definition.
In other words:
- Invoke math.randomseed with an appropiate seed (
os.time()
is ok 99% of the cases) at the beginning of your program - Invoke
math.random
every time you need a random number.
Regards!
Some thoughts on the first part of your question:
So my question is, how random are these numbers returned by
realrandom
really?
Your function is attempting to discover the address of a table by using a quirk of its default implementation of tostring()
. I don't believe that the string returned by tostring{}
has a specified format, or that the value included in that string has any documented meaning. In practice, it is derived from the address of something related to the specific table, and so distinct tables convert to distinct strings. However, the next version of Lua is free to change that to anything that is convenient. Worse, the format it takes will be highly platform dependent because it appears to use the %p
format specifier to sprintf()
which is only specified as being a sensible representation of a pointer.
There's also a much bigger issue. While the address of the nth table created in a process might seem random on your platform, tt might not be random at all. Or it might vary in only a few bits. For example, on my win7 box only a few bits vary, and not very randomly:
C:...>for /L %i in (1,1,20) do @ lua -e "print{}" table: 0042E5D8 table: 0061E5D8 table: 0024E5D8 table: 0049E5D8 table: 0042E5D8 table: 0042E5D8 table: 0042E5D8 table: 0064E5D8 table: 0042E5D8 table: 002FE5D8 table: 0042E5D8 table: 0049E5D8 table: 0042E5D8 table: 0042E5D8 table: 0042E5D8 table: 0024E5D8 table: 0042E5D8 table: 0042E5D8 table: 0061E5D8 table: 0042E5D8
Other platforms will vary, of course. I'd even expect there to be platforms where the address of the first allocated table is completely deterministic, and hence identical on every run of the program.
In short, the address of an arbitrary object in your process image is not a very good source of randomness.
Edit: For completeness, I'd like to add a couple of other thoughts that came to mind over night.
The stock tostring()
function is supplied by the base library and implemented by the function luaB_tostring()
. The relevant bit is this fragment:
switch (lua_type(L, 1)) {
...
default:
lua_pushfstring(L, "%s: %p", luaL_typename(L, 1), lua_topointer(L, 1));
break;
If you really are calling this function, then the end of the string will be an address, represented by standard C sprintf()
format %p
, strongly related to the specific table. One observation is that I've seen several distinct implementations for %p
. Windows MSVCR80.DLL (the version of the C library used by the current release of Lua for Windows) makes it equivalent to %08X
. My Ubuntu Karmic Koala box appears to make it equivalent to %#x
which notably drops leading zeros. If you are going to parse out that part of the string, then you should do it in a way that is more flexible in the face of variation of the meaning of %p
.
Note, also, that doing anything like this in library code may expose you to a couple of surprises.
First, if the table passed to tostring()
has a metatable that provides the function __tostring()
, then that function will be called, and the fragment quoted above will never be executed at all. In your case, that issue cannot arise because tables have individual metatables, and you didn't accidentally apply a metatable to your local table.
Second, by the time your module loads, some other module or user-supplied code might have replaced the stock tostring()
with something else. If the replacement is benign, (such as a memoization wrapper) then it likely doesn't matter to the code as written. However, this would be a source of attack, and is entirely outside the control of your module. That doesn't strike me as a good idea if the goal is some kind of improved security for your random seed material.
Third, you might not be loaded in a stock Lua interpreter at all, and the larger application (Lightroom, WoW, Wireshark, ...) may choose to replace the base library functions with their own implementations. This is a much less likely issue for tostring()
, but note that the base library's print()
is a frequent target for replacement or removal in alternate implementations and there are modules (Lua Lanes, for one) that break if print
is not the implementation in the base library.
A few important things come to mind:
- In most other languages you typically only call the random 'seed' function once at the beginning of the program or perhaps at limited times throughout its execution. You generally do not want to call it each time you generate a random number/sequence. If you call it once when the program starts you get around the "once per second" limitation. By calling it each time you may actually end up with less randomness in your results.
- Your realrandom() function seems to rely on a private implementation detail of Lua. What happens in the next major release if this detail changes to always return the same number, or only even numbers, etc.... Just because it works for now is not a strong enough guarantee, especially in the case of wanting a secure RNG.
- When you say "everything seems perfectly random" how are you measuring this performance? We humans are terrible at determining if a sequence is random or not and just looking at a sequence of numbers would be virtually impossible to truly tell if they were random or not. There are many ways to quantify the "randomness" of a series including frequency distribution, autocorrelation, compression, and many more far beyond my understanding.
- If you are writing a true "secure PRNG" for production do not write your own! Investigate and use a library or algorithm by experts who has spent years/decades studying, designing and trying to break it. True secure random number generation is hard.
If you need more info start on the PRNG article on Wikipedia and use the references/links there as needed.
精彩评论