SET game odds simulation (MATLAB)

2022-12-29 08:03 问答作者：

I have recently found the great card came - SET. Briefly, there are 81 cards with the four features: symbol (oval, squiggle or diamond), color (red, purple or green), number (one, t开发者_JAVA技巧wo or three) and shading (solid, striped or open). The task is to locate (from selected 12 cards) a SET of 3 cards, in which each of the four features is either all the same on each card or all different on each card (no 2+1 combination).

I've coded it in MATLAB to find a solution and to estimate odds of having a set in randomly selected cards.

Here is my code to estimate odds:

%% initialization
K = 12; % cards to draw
NF = 4; % number of features (usually 3 or 4)
setallcards = unique(nchoosek(repmat(1:3,1,NF),NF),'rows'); % all cards: rows - cards, columns - features
setallcomb = nchoosek(1:K,3); % index of all combinations of K cards by 3

%% test
tic
NIter=1e2; % number of test iterations
setexists = 0; % test results holder
% C = progress('init'); % if you have progress function from FileExchange
for d = 1:NIter
% C = progress(C,d/NIter);    

% cards for current test
setdrawncardidx = randi(size(setallcards,1),K,1);
setdrawncards = setallcards(setdrawncardidx,:);

% find all sets in current test iteration
for setcombidx = 1:size(setallcomb,1)
    setcomb = setdrawncards(setallcomb(setcombidx,:),:);
    if all(arrayfun(@(x) numel(unique(setcomb(:,x))), 1:NF)~=2) % test one combination
        setexists = setexists + 1;
        break % to find only the first set
    end
end
end
fprintf('Set:NoSet = %g:%g = %g:1\n', setexists, NIter-setexists, setexists/(NIter-setexists))
toc

100-1000 iterations are fast, but be careful with more. One million iterations takes about 15 hours on my home computer. Anyway, with 12 cards and 4 features I've got around 13:1 of having a set. This is actually a problem. The instruction book said this number should be 33:1. And it was recently confirmed by Peter Norvig. He provides the Python code, but I didn't test it yet.

So can you find an error? Any comments on performance improvement are welcome.

I tackled the problem writing my own implementation before looking at your code. My first attempt was very similar to what you already had :)

%# some parameters
NUM_ITER = 100000;  %# number of simulations to run
DRAW_SZ = 12;       %# number of cards we are dealing
SET_SZ = 3;         %# number of cards in a set
FEAT_NUM = 4;       %# number of features (symbol,color,number,shading)
FEAT_SZ = 3;        %# number of values per feature (eg: red/purple/green, ...)

%# cards features
features = {
    'oval' 'squiggle' 'diamond' ;    %# symbol
    'red' 'purple' 'green' ;         %# color
    'one' 'two' 'three' ;            %# number
    'solid' 'striped' 'open'         %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);

%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];

%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);

%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);

    %# check for valid sets: features are all the same or all different
    for s=1:size(setsInd,1)
        %# set of 3 cards
        set = cardsDrawn(setsInd(s,:),:);

        %# check if set is valid
        count = arrayfun(@(k) numel(unique(set(:,k))), 1:FEAT_NUM);
        isValid = (count==1|count==3);

        %# increment counter
        if isValid
            counterValidSet = counterValidSet + 1;
            break           %# break early if found valid set among candidates
        end
    end
end

%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
    DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
    counterValidSet/(NUM_ITER-counterValidSet))

After using the Profiler to discover hot spots, some improvement can be made mainly by early-break'ing out of loops when possible. The main bottleneck is the call to the UNIQUE function. Those two lines above where we check for valid sets can be rewritten as:

%# check if set is valid
isValid = true;
for k=1:FEAT_NUM
    count = numel(unique(set(:,k)));
    if count~=1 && count~=3
        isValid = false;
        break   %# break early if one of the features doesnt meet conditions
    end
end

Unfortunately, the simulation is still slow for larger simulation. Thus my next solution is a vectorized version, where for each iteration, we build a single matrix of all possible sets of 3 cards from the hand of 12 drawn cards. For all these candidate sets, we use logical vectors to indicate what feature is present, thus avoiding the calls to UNIQUE/NUMEL (we want features all the same or all different on each card of the set).

I admit that the code is now less readable and harder to follow (thus I posted both versions for comparison). The reason being that I tried to optimize the code as much as possible, so that each iteration-loop is fully vectorized. Here is the final code:

%# some parameters
NUM_ITER = 100000;  %# number of simulations to run
DRAW_SZ = 12;       %# number of cards we are dealing
SET_SZ = 3;         %# number of cards in a set
FEAT_NUM = 4;       %# number of features (symbol,color,number,shading)
FEAT_SZ = 3;        %# number of values per feature (eg: red/purple/green, ...)

%# cards features
features = {
    'oval' 'squiggle' 'diamond' ;    %# symbol
    'red' 'purple' 'green' ;         %# color
    'one' 'two' 'three' ;            %# number
    'solid' 'striped' 'open'         %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);

%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];

%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);

%# optimizations: some calculations taken out of the loop
ss = setsInd(:);
set_sz2 = numel(ss)*FEAT_NUM/SET_SZ;
col = repmat(1:set_sz2,SET_SZ,1);
col = FEAT_SZ.*(col(:)-1);
M = false(FEAT_SZ,set_sz2);

%# progress indication
%#hWait = waitbar(0./NUM_ITER, 'Simulation...');

%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# update progress
    %#waitbar(i./NUM_ITER, hWait);

    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);

    %# put all possible sets of 3 cards next to each other
    set = reshape(cardsDrawn(ss,:)',[],SET_SZ)';
    set = set(:);

    %# check for valid sets: features are all the same or all different
    M(:) = false;            %# if using PARFOR, it will complain about this
    M(set+col) = true;
    isValid = all(reshape(sum(M)~=2,FEAT_NUM,[]));

    %# increment counter if there is at least one valid set in all candidates
    if any(isValid)
        counterValidSet = counterValidSet + 1;
    end
end

%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
    DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
    counterValidSet/(NUM_ITER-counterValidSet))

%# close progress bar
%#close(hWait)

If you have the Parallel Processing Toolbox, you can easily replace the plain FOR-loop with a parallel PARFOR (you might want to move the initialization of the matrix M inside the loop again: replace M(:) = false; with M = false(FEAT_SZ,set_sz2);)

Here are some sample outputs of 50000 simulations (PARFOR used with a pool of 2 local instances):

» tic, SET_game2, toc
Size=12, Set=48376, NoSet=1624, Set:NoSet=29.7882
Elapsed time is 5.653933 seconds.

» tic, SET_game2, toc
Size=15, Set=49981, NoSet=19, Set:NoSet=2630.58
Elapsed time is 9.414917 seconds.

And with a million iterations (PARFOR for 12, no-PARFOR for 15):

» tic, SET_game2, toc
Size=12, Set=967516, NoSet=32484, Set:NoSet=29.7844
Elapsed time is 110.719903 seconds.

» tic, SET_game2, toc
Size=15, Set=999630, NoSet=370, Set:NoSet=2701.7
Elapsed time is 372.110412 seconds.

The odds ratio agree with the results reported by Peter Norvig.

Here's a vectorized version, where 1M hands can be calculated in about a minute. I got about 28:1 with it, so there might still be something a little off with finding 'all different' sets. My guess is that this is what your solution has trouble with, as well.

%# initialization
K = 12; %# cards to draw
NF = 4; %# number of features (this is hard-coded to 4)
nIter = 100000; %# number of iterations

%# each card has four features. This means that a card can be represented
%# by a coordinate in 4D space. A set is a full row, column, etc in 4D
%# space. We can even parallelize the iterations, at least as long as we
%# have RAM (each hand costs 81 bytes)
%# make card space - one dimension per feature, plus one for the iterations
cardSpace = false(3,3,3,3,nIter);

%# To draw cards, we put K trues into each cardSpace. I can't think of a
%# good, fast way to draw exactly K cards that doesn't involve calling
%# unique
for i=1:nIter
    shuffle = randperm(81) + (i-1) * 81;
    cardSpace(shuffle(1:K)) = true;
end

%# to test, all we have to do is check whether there is any row, column,
%# with all 1's
isEqual = squeeze(any(any(any(all(cardSpace,1),2),3),4) | ...
    any(any(any(all(cardSpace,2),1),3),4) | ...
    any(any(any(all(cardSpace,3),2),1),4) | ...
    any(any(any(all(cardSpace,4),2),3),1));
%# to get a set of 3 cards where all symbols are different, we require that
%# no 'sub-volume' is completely empty - there may be something wrong with this
%# but since my test looked ok, I'm not going to investigate on Friday night
isDifferent = squeeze(~any(all(all(all(~cardSpace,1),2),3),4) & ...
    ~any(all(all(all(~cardSpace,1),2),4),3) & ...
    ~any(all(all(all(~cardSpace,1),3),4),2) & ...
    ~any(all(all(all(~cardSpace,4),2),3),1));

isSet = isEqual | isDifferent;

%# find the odds
fprintf('odds are %5.2f:1\n',sum(isSet)/(nIter-sum(isSet)))

I found my error. Thanks Jonas for the hint with RANDPERM.

I used RANDI to randomly drawn K cards, but there is about 50% chance to get repeats even in 12 cards. When I substituted this line with randperm, I've got 33.8:1 with 10000 iterations, very close to the number in instruction book.

setdrawncardidx = randperm(81);
setdrawncardidx = setdrawncardidx(1:K);

Anyway, it would be interesting to see other approaches to the problem.

I'm sure there's something wrong with my calculation of these odds, since several others have confirmed with simulations that it's close to 33:1 as in the instructions, but what's wrong with the following logic?

For 12 random cards, there are 220 possible combinations of three cards (12!/(9!3!) = 220). Each combination of three cards has a 1/79 chance of being a set, so there's a 78/79 chance of three arbitrary cards not being a set. So if you examined all 220 combinations and there were a 78/79 chance that each one weren't a set, then your chance of not finding a set examining all possible combinations would be 78/79 raised to the 220th power, or 0.0606, which is approx. 17:1 odds.

I must be missing something...?

Christopher

继续阅读：probability simulation

SET game odds simulation (MATLAB)

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？