开发者

Excel formula to Cross reference 2 sheets, remove duplicates from one sheet

This is related to

Excel / VBA Remove duplica开发者_高级运维te rows by cross referencing 2 different sheets then deleting 1 row

I can't seem to get any VBA to work well or fast enough for a couple 100 rows.

Does Excel have a formula to remove duplicates from one sheet, by cross referencing another sheet?

Thanks for all your help.


Here is a much faster VBA solution, utilizing a dictionary object. As you can see, it loops only once through sheet A and sheet B, while your original solution has a running time proportional to "number of rows in sheet A" * "number of rows in sheet B".

Option Explicit
Sub CleanDupes()
    Dim wsA As Worksheet
    Dim wsB As Worksheet
    Dim keyColA As String
    Dim keyColB As String
    Dim rngA As Range
    Dim rngB As Range
    Dim intRowCounterA As Integer
    Dim intRowCounterB As Integer

    keyColA = "A"
    keyColB = "B"

    intRowCounterA = 1
    intRowCounterB = 1

    Set wsA = Worksheets("Sheet A")
    Set wsB = Worksheets("Sheet B")

    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    Do While Not IsEmpty(wsA.Range(keyColA & intRowCounterA).Value)
        Set rngA = wsA.Range(keyColA & intRowCounterA)
        If Not dict.Exists(rngA.Value) Then
            dict.Add rngA.Value, 1
        End If
        intRowCounterA = intRowCounterA + 1
    Loop

    intRowCounterB = 1
    Do While Not IsEmpty(wsB.Range(keyColB & intRowCounterB).Value)
        Set rngB = wsB.Range(keyColB & intRowCounterB)
        If dict.Exists(rngB.Value) Then
             wsB.Rows(intRowCounterB).Delete
             intRowCounterB = intRowCounterB - 1
        End If
        intRowCounterB = intRowCounterB + 1
    Loop
End Sub


You can do a lot with ADO and Excel.

Dim cn As Object
Dim rs As Object
Dim wb As Workbook
Dim sSQL As String
Dim sFile As String
Dim sCon As String
Dim sXLFileToProcess As String
Dim i

sXLFileToProcess = "Book1z.xls"

sFile = Workbooks(sXLFileToProcess).FullName

''Note that if HDR=No, F1,F2 etc are used for column names,
''if HDR=Yes, the names in the first row of the range
''can be used.
''This is the Jet 4 connection string, you can get more
''here : http://www.connectionstrings.com/excel

sCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & sFile _
    & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

''Late binding, so no reference is needed

Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")

cn.Open sCon

'' In this example, the column header for column F is F, see notes
'' above on field (column) names. It also assumes that the sheets to
'' be merged have the same column headers in the same order
'' It would be safer to list the column heards rather than use *.

sSQL = sSQL & "SELECT b.Key,b.b,b.c,b.d,b.e FROM [SheetB$] As B " _
            & "LEFT JOIN [SheetA$] As A " _
            & "ON B.Key=A.Key " _
            & "WHERE A.Key Is Null"

rs.Open sSQL, cn, 3, 3

Set wb = Workbooks.Add

With wb.Worksheets("Sheet1")
    For i = 1 To rs.Fields.Count
        .Cells(1, i) = rs.Fields(i - 1).Name
    Next

    .Cells(2, 1).CopyFromRecordset rs
End With

''Tidy up
rs.Close
Set rs = Nothing
cn.Close
Set cn = Nothing
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜