Is there an easy way to drop all variables from one of the datasets when merging in SAS?

2023-01-06 04:56 问答作者：

Say I've already sorted set1 and set2 by the variables 'sticks', 'stones', and 'bones' and then i do this:

data merged;
    merge set1(in=a) set2(in=b);
    by sticks stones bones;
    if a and b then output;
    *else we don't want to do anything;
run;

Is there an easy way to drop all the variables from set2 in the merged dataset without having to type them all? I keep running into this problem where I have two datasets - both with quite a few variables - and I only want to merge them by a few variables and then only keep the variables from one of the sets.

I usually just use proc sql for something like this, but there are a few situations (more complex than above) wher开发者_开发问答e where I think merge is better.

Also, I find it annoying that SAS requires you to "manually" sort datasets before merging them. If it will not let you merge datasets unless they are sorted correctly, why doesn't it just do it for you when you use merge? Thoughts? Maybe there is a way around this I don't know about.

The sorted requirements is there for the way the merge statement and the PDV works in it. There is really no way around it.

However here basically you're doing a lookup of set2 to make sure you have a match of the key variables (sticks stones bones) through the equivalent of an inner join, which you could likely do more efficiently through an hash table or set with keys (if you have an index of course).

The easiest and more convenient way for what you want here is having a keep statement in the set2 so you load into the PDV only the by variables. Something like this:

data merged;
merge set1(in=a) set2(in=b keep=sticks stones bones);
by sticks stones bones;
if a and b then output;
*else we don't want to do anything;

run;

In case hash tables don't scare you and want to learn more on how to implement them in this case feel free to contact me for more help.

EDIT:

Here is a good paper about using hash tables http://www.nesug.org/proceedings/nesug06/dm/da07.pdf Bear in mind that using hashes you should know what you're doing and they may yield unexpected results if you don't know whats happening under the hood. Regardless here is the problem solved using a very simple and basic hash table

data merged2;
set set1;
if _N_ = 1 then do;
  declare hash h(dataset:"set2");
  h.defineKey('sticks','stones','bones');
  h.defineData('sticks','stones','bones');
  h.defineDone();
end;
rc = h.find();
if rc=0;
drop rc;
run;

This code has the main benefit of not requiring the sorting of the datasets which in case set2 is particularly big is a great time-saver.

继续阅读：sas

Is there an easy way to drop all variables from one of the datasets when merging in SAS?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？