Matching data between multiple data sets as much as possible, in Matlab

I have four sets of data, each of which is stored as an array of structs. The data that is common in the four data sets can be found by performing logical indexing with different combinations of the arrays.

The problem: If I cannot find any intersected data matches in all four data sets (= a perfect match), I would like to find the data that can be found in any three data sets. If I cannot find any intersected data matches in any combination of three data sets, I would like to find intersected data in any two data sets.

Of course, since I only have four data sets, I could write if-clauses to handle each combination separately; it would take a while, but it should be possible. But since I am using Matlab, I am curious whether a smart, Matlab-ish way of handling this exists. What would be a feasible way of doing this if I had ten data sets to process?

Maybe it is possible to generate an array with all the combinations and transform the array entries into logical indexing expressions?

The output of the program will be something like this:

The following items matched all criteria: ...
The following items matched only criteria A, B, C: ...
The following items matched only A and D: ...

%Four sets of data. A number of arrays of structs matching items in ArrayOfStructs
matchingA = strcmpi({ArrayOfStructs.A},iA);
matchingB = strcmpi({ArrayOfStructs.B},iB);
matchingC = strcmpi({ArrayOfStructs.C},iC);
matchingD = strcmpi({ArrayOfStructs.D},iD);

%Find as many posts as possible where the posts are identical between the data sets.

%Try to find posts matching in all four data sets
if (sum(matchingA)>0 && sum(matchingB)>0 && sum(matchingC)>0 && sum(matchingD)>0)
    mayMatchAll = ArrayOfStructs(matchingA & matchingB & matchingC & matchingD);
    %etc...
end
%Try to find posts matching in any three of the data sets
%Try to find in A,B,C
if (sum(matchingA)>0 && sum(matchingB)>0 && sum(matchingC)>0)
    mayMatch3_1st = ArrayOfStructs(matchingA & matchingB & matchingC);
    %etc...
end
%Try to find in A,B,D
if (sum(matchingA)>0 && sum(matchingB)>0 && sum(matchingD)>0)
    mayMatch3_2nd = ArrayOfStructs(matchingA & matchingB & matchingD);
    %etc...
end
%Try to find in A,C,D
if (sum(matchingA)>0 && sum(matchingC)>0 && sum(matchingD)>0)
    mayMatch3_3rd = ArrayOfStructs(matchingA & matchingC & matchingD);
    %etc...
end
%...

%Try to find posts matching in any two of the data sets
%etc...


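Yes, you can use logical indexing here. Instead of keeping separate vectors matchingA, matchingB, and so on, stack them as the rows of a single logical matrix matching, so that matching(1,:) replaces matchingA, matching(2,:) replaces matchingB, etc. For a matrix with more than one row, all and sum work down each column by default, so every column summarizes one post. Then you can use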

%#Try to find posts matching in all four data sets
if any(all(matching))
            mayMatchAll = ArrayOfStructs(all(matching));
end
%#Try to find posts matching in any three of the data sets
if any(sum(matching)==3)
            mayMatch3 = ArrayOfStructs(sum(matching)==3);
end
%#Try to find posts matching in any two of the data sets
if any(sum(matching)==2)
            mayMatch2 = ArrayOfStructs(sum(matching)==2);
end
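
The snippets above assume that matching already exists as a logical matrix with one row per data set and one column per post. A minimal sketch of how it might be built is shown here; the sample contents of ArrayOfStructs and the search values iA to iD are made up purely for illustration, and nMatched is a helper name that does not appear in the question.

```matlab
% Hypothetical sample data: one struct per post, one field per criterion.
ArrayOfStructs = struct( ...
    'A', {'x', 'x', 'q', 'x'}, ...
    'B', {'y', 'y', 'y', 'q'}, ...
    'C', {'z', 'z', 'q', 'q'}, ...
    'D', {'w', 'q', 'q', 'w'});
iA = 'x'; iB = 'y'; iC = 'z'; iD = 'w';   % assumed search values

% Stack the four logical row vectors: one row per criterion, one column per post.
matching = [strcmpi({ArrayOfStructs.A}, iA); ...
            strcmpi({ArrayOfStructs.B}, iB); ...
            strcmpi({ArrayOfStructs.C}, iC); ...
            strcmpi({ArrayOfStructs.D}, iD)];

% Number of criteria each post satisfies. Passing the dimension explicitly
% keeps the result correct even if there is only one post (a single column).
nMatched = sum(matching, 1);
```

With this layout, sum(matching, 1) == 3 picks out the posts that satisfy exactly three of the four criteria, whichever three they are.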

Of course you could replace mayMatchN by a cell array mayMatch{N} and loop over N to make it even more compact.
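
As a rough sketch of that loop, and of the "matched only A, B, C" style report asked for in the question, something like the following should work. The matrix contents, the id field, and the names names, nMatched, keep, combos and group are all assumptions made up for this example; grouping identical columns with unique(..., 'rows') is just one possible way to recover which criteria were matched.

```matlab
% Hypothetical data: 4 criteria (rows) x 5 posts (columns).
names = {'A', 'B', 'C', 'D'};
matching = logical([1 1 0 1 1; ...
                    1 1 1 0 1; ...
                    1 1 0 0 1; ...
                    1 0 0 1 0]);
ArrayOfStructs = struct('id', {1, 2, 3, 4, 5});   % stand-in for the real posts

nSets    = size(matching, 1);
nMatched = sum(matching, 1);      % number of criteria satisfied per post

% mayMatch{N} holds the posts that satisfy exactly N criteria.
mayMatch = cell(1, nSets);
for N = nSets:-1:2
    mayMatch{N} = ArrayOfStructs(nMatched == N);
end

% Group posts by the exact combination of criteria they satisfy.
keep = find(nMatched >= 2);                           % ignore posts with < 2 matches
[combos, ~, group] = unique(double(matching(:, keep)'), 'rows');
[~, order] = sort(sum(combos, 2), 'descend');         % best matches first

for k = order'
    ids = [ArrayOfStructs(keep(group == k)).id];
    fprintf('Matched exactly %s: posts %s\n', ...
        strjoin(names(logical(combos(k, :))), ', '), mat2str(ids));
end
```

Nothing here depends on there being four data sets: with ten, matching simply has ten rows and names has ten entries, and no combination has to be listed by hand.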
