开发者

How to extract a distinct list of records from SQL or Excel

I have an excel spreadsheet with 15 columns, one of which is EmailAddress and then 100,000+ records..

In my data i know that there are many duplicate email addresses.

Can someone tell me how can i extract a distinct list where each reco开发者_如何学编程rd is represented only once by emailaddress?

Alternately, if i import the data into SQL, how can i remove records that are duplicated by emailaddress but leave one record if 4 are found...

Thx..


In simpler tasks I would suggest the use of openrowset. Of course you could combine that with other queries following that, to do the filtering, but in this case it would be great to use DTS for MSSQL2000 and SSIS for later versions.

How?
the question is complex, and has 2 sub-questions that must have been answered previously.

  1. How to remove duplicates
  2. Excel and SQL

and you could read more about SSIS https://stackoverflow.com/search?q=SSIS+Excel


See the link on How to remove duplicates in Alexander's answer, for dealing with it in SQL. (Note that the linked answer is specific to SQLServer - the syntax is likely to be slightly different if you are using other versions of SQL.)

For Excel, I suggest either:

  1. Using a pivot table, or
  2. Sort on EmailAddress, insert an additional column populated with formulas similar to:

    =IF(A1=A2,"","X")
    

    [- row 2, assuming that EmailAddress is in column A; copy and paste for the rest of the column] and use autofilter to select calculated values of X.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜