How to numerically sample from a joint, discrete probability distribution function
I have a 2D "heat map" or PDF that I need to recreate by random sampling, i.e. I have a 2D probability density map showing starting locations, and I need to randomly choose starting locations with the same probability as the original PDF.
To do this, I think I need to first find the joint CDF (cumulative distribution function), then choose random uniform numbers to sample the CDF. That's where I get stuck.
How do I numerically find the joint CDF of my PDF? I tried doing a cumulative sum along both dimensions, but that didn't yield the correct result. My knowledge of statistics is failing me.
EDIT: The heatmap/PDF is in the form of [x,y,z], where z is the intensity or probability at each x,y point.
You could first go over the 2D density map and, for each (x,y) pair in it, find z by a lookup from the PDF. This gives you a starting point (x,y) with probability z, so each starting point has its own probability from the PDF. What you can do now is order the starting points (any fixed order works), pick a random number, and map it to a starting point.
For example, let's say you have n starting points P1 .. Pn with probabilities p1 .. pn (normalized, so they sum to 100%). Pick a uniform random value p between 0 and 1: choose P1 if p < p1, P2 if p1 <= p < p1+p2, P3 if p1+p2 <= p < p1+p2+p3, etc. You can look at it as a histogram over the points P1 to Pn; the running sum of that histogram is the cumulative distribution function.
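As a rough illustration of the interval lookup described above, here is a minimal Python sketch using only the standard library. The function name `sample_points` and its arguments are made up for this example; the weights do not need to be normalized because the uniform draw is scaled by their total.

```python
import bisect
import itertools
import random

def sample_points(points, weights, n, rng=None):
    """Draw n points with probability proportional to their weights."""
    rng = rng or random.Random()
    # Running totals p1, p1+p2, p1+p2+p3, ... (an unnormalized CDF).
    cdf = list(itertools.accumulate(weights))
    total = cdf[-1]
    samples = []
    for _ in range(n):
        u = rng.random() * total                 # uniform in [0, total)
        i = bisect.bisect_right(cdf, u)          # first index with cdf[i] > u
        i = min(i, len(cdf) - 1)                 # guard against rounding at the top end
        samples.append(points[i])
    return samples
```

Something like `sample_points([(0, 0), (0, 1), (1, 0)], [0.2, 0.0, 0.8], 5)` would then return five starting points, with `(0, 1)` never chosen because its weight is zero.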
Gibbs Sampling should give you what you want
http://en.wikipedia.org/wiki/Gibbs_sampling
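For a discrete grid, a Gibbs sampler alternates between drawing a row given the current column and a column given the current row. The following is only a sketch under assumptions: `z` is a list of rows of non-negative weights, and `gibbs_grid` and `burn_in` are names invented for this example. Note that successive samples are correlated, and if zeros in the grid split it into regions that share no row or column, the chain cannot move between them, so direct sampling from the cumulative sum is usually simpler for a small grid.

```python
import random

def gibbs_grid(z, n, burn_in=100, rng=None):
    """Return n (row, col) samples from weight grid z via Gibbs sampling."""
    rng = rng or random.Random()
    n_rows, n_cols = len(z), len(z[0])
    # Start at any cell with positive weight.
    i, j = next((r, c) for r in range(n_rows)
                for c in range(n_cols) if z[r][c] > 0)
    out = []
    for step in range(burn_in + n):
        # Draw a row conditional on the current column.
        col_weights = [z[r][j] for r in range(n_rows)]
        i = rng.choices(range(n_rows), weights=col_weights)[0]
        # Draw a column conditional on the new row.
        j = rng.choices(range(n_cols), weights=z[i])[0]
        if step >= burn_in:
            out.append((i, j))
    return out
```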
Well, as observed in this answer, for my case it doesn't necessarily matter that my distribution is bivariate. Since I can normalize the whole thing so that it's a true pdf (total surface integrates to 1), I can then rearrange the MxN matrix into a 1xM*N vector. Once I have that, I can do a cumulative integral (cumtrapz in MATLAB), and then sample from that (use a uniform random number to find the corresponding index value).
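The flatten-and-sample idea can be sketched in Python roughly as follows. This is an illustration, not the exact MATLAB code: it treats each cell as a discrete probability and uses a plain cumulative sum instead of `cumtrapz`, flattens row by row (MATLAB flattens column by column), and the name `sample_grid` is hypothetical.

```python
import bisect
import itertools
import random

def sample_grid(z, n, rng=None):
    """Draw n (row, col) index pairs with probability proportional to z[row][col]."""
    rng = rng or random.Random()
    n_cols = len(z[0])
    flat = [w for row in z for w in row]         # M x N grid -> vector of length M*N
    cdf = list(itertools.accumulate(flat))       # cumulative sum over the flat vector
    total = cdf[-1]                              # scaling by total replaces normalization
    out = []
    for _ in range(n):
        u = rng.random() * total                 # uniform random number in [0, total)
        k = bisect.bisect_right(cdf, u)          # flat index whose CDF step contains u
        k = min(k, len(cdf) - 1)
        out.append(divmod(k, n_cols))            # flat index -> (row, col)
    return out
```

The returned index pairs can then be mapped back to the x and y coordinates of the original grid.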
This is what I want to do as well!!
I have a joint density function for two independent variables X and Y, and I now want to sample new x,y from this distribution.
What I believe I have to do is to find the joint cumulative distribution and then somehow sample from it, which is exactly what you seem to have done.
Could you perhaps be more specific when you say you use "uniform random numbers to find the corresponding index values"?
Just for reference: X is size of ask orders and Y is size of bid orders in the stock market.
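If X and Y really are independent, the joint density is the product of the two marginals, so one option is to sample each variable separately from its own 1D cumulative sum. The sketch below assumes the marginals are available as lists of values and weights; `sample_1d` and `sample_independent` are names invented for this example. If the independence assumption does not hold, the flattened 2D approach is the safer choice.

```python
import bisect
import itertools
import random

def sample_1d(values, weights, rng):
    """Pick one value with probability proportional to its weight."""
    cdf = list(itertools.accumulate(weights))
    u = rng.random() * cdf[-1]                   # uniform in [0, total weight)
    k = min(bisect.bisect_right(cdf, u), len(cdf) - 1)
    return values[k]

def sample_independent(x_vals, x_w, y_vals, y_w, n, rng=None):
    """Draw n (x, y) pairs, sampling x and y independently from their marginals."""
    rng = rng or random.Random()
    return [(sample_1d(x_vals, x_w, rng), sample_1d(y_vals, y_w, rng))
            for _ in range(n)]
```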