Excel Find Speed vs. VBA binary Search?
How good/fast is Excel VBA's Find vs. binary search? My platform is Office 11|2003 and I'll be searching for strings against Column A on three sheets of values. Total number of rows ~140,000
If worth it which Library & functions should I reference to do the sorting and then the binary search? Binary searching strings/text reportedly has potential problems.
... one thing must be noted. Using binary search formulas with sortedtextrequires caution. Aladin A., Excel MVP
Excel Find:
Worksheets(开发者_如何学Go1).Range("A:A").Find("PN-String-K9", LookIn:=xlValues, LookAt:=xlWhole)
Much against my intuition a VBA binary search strongly outperforms an Excel Find. At least with the scenario below where 120,000 6 character strings are distributed evenly over 3 worksheets.
Excel Find takes 1 minute 58 seconds,
VBA binary search takes 36 seconds on my particular machine.
The advantage of knowing that the text is in order obviously outweighs Excel's natural advantage. Note Aladin A's warning about sort order.
Option Explicit
' Call Search to look for a thousand random strings
' in 3 worksheets of a workbook
' requires a workbook with 3 sheets and
' column A populated with values between "00001" to "120000"
' split evenly 40,000 to a worksheet in ascending order.
' They must be text, not numbers.
Private Const NUM_ROWS As Long = 120000
Private Const SHEET_1 As String = "Sheet1"
Private Const SHEET_2 As String = "Sheet2"
Private Const SHEET_3 As String = "Sheet3"
' This uses VBA Binary Search
Public Sub Search()
Worksheets(SHEET_1).Range("B:B").ClearContents
Worksheets(SHEET_2).Range("B:B").ClearContents
Worksheets(SHEET_3).Range("B:B").ClearContents
DoSearch True ' change to False to test Excel search
End Sub
' Searches for a thousand values using binary or excel search depending on
' value of bBinarySearch
Public Sub DoSearch(ByVal bBinarySearch As Boolean)
Debug.Print Now
Dim ii As Long
For ii = 1 To 1000
Dim rr As Long
rr = Int((NUM_ROWS) * Rnd + 1)
If bBinarySearch Then
Dim strSheetName As String
Dim nRow As Long
If BinarySearch(MakeSearchArg(rr), strSheetName, nRow) Then
Worksheets(strSheetName).Activate
Cells(nRow, 1).Activate
End If
Else
If Not ExcelSearch(SHEET_1, MakeSearchArg(rr)) Then
If Not ExcelSearch(SHEET_2, MakeSearchArg(rr)) Then
ExcelSearch SHEET_3, MakeSearchArg(rr)
End If
End If
End If
ActiveCell.Offset(0, 1).Value = "FOUND"
Next
Debug.Print Now
End Sub
' look for one cell value using Excel Find
Private Function ExcelSearch(ByVal strWorksheet As String _
, ByVal strSearchArg As String) As Boolean
On Error GoTo Err_Exit
Worksheets(strWorksheet).Activate
Worksheets(strWorksheet).Range("A:A").Find(What:=strSearchArg, LookIn:=xlValues, LookAt:=
xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=True
, SearchFormat:=False).Activate
ExcelSearch = True
Exit Function
Err_Exit:
ExcelSearch = False
End Function
' Look for value using a vba based binary search
' returns true if the search argument is found in the workbook
' strSheetName contains the name of the worksheet on exit and nRow gives the row
Private Function BinarySearch(ByVal strSearchArg As String _
, ByRef strSheetName As String, ByRef nRow As Long) As Boolean
Dim nFirst As Long, nLast As Long
nFirst = 1
nLast = NUM_ROWS
Do While True
Dim nMiddle As Long
Dim strValue As String
If nFirst > nLast Then
Exit Do ' Failed to find search arg
End If
nMiddle = Round((nLast - nFirst) / 2 + nFirst)
SheetNameAndRowFromIdx nMiddle, strSheetName, nRow
strValue = Worksheets(strSheetName).Cells(nRow, 1)
If strSearchArg < strValue Then
nLast = nMiddle - 1
ElseIf strSearchArg > strValue Then
nFirst = nMiddle + 1
Else
BinarySearch = True
Exit Do
End If
Loop
End Function
' convert 1 -> "000001", 120000 -> "120000", etc
Private Function MakeSearchArg(ByVal nArg As Long) As String
MakeSearchArg = Right(CStr(nArg + 1000000), 6)
End Function
' converts some number to a worksheet name and a row number
' This is depenent on the worksheets being named sheet1, sheet2, sheet3
' and containing an equal number of vlaues in each sheet where
' the total number of values is NUM_ROWS
Private Sub SheetNameAndRowFromIdx(ByVal nIdx As Long _
, ByRef strSheetName As String, ByRef nRow As Long)
If nIdx <= NUM_ROWS / 3 Then
strSheetName = SHEET_1
nRow = nIdx
ElseIf nIdx > (NUM_ROWS / 3) * 2 Then
strSheetName = SHEET_3
nRow = nIdx - (NUM_ROWS / 3) * 2
Else
strSheetName = SHEET_2
nRow = nIdx - (NUM_ROWS / 3)
End If
End Sub
I find using the AutoFilter works a lot faster than manually searching the records with any method.
I filter, check to see if there are any results, then move on. If any are found (by checking the count of results), I can search the small portion that are filtered manually or return them all.
I used it on around 44,000 records, searching for a list of 100+ parts against it.
Binary searches can easily get stuck in infinite loops if you're not careful.
If you use vlookup with the sorted option, it will likely be faster than your vba.
I have become interested in this because I was using the .Find function, and on one PC it failed to work on some lookups, but on another it was OK! So I've done some testing on timings - I've got a sheet with 985 names sorted into order, and I've written a small subroutine to run through them, and look each one up in the same list using a different method (the times are in milliseconds):
- Brute force: 2000
- .Find: 750
- Application.VLookup: 265
- Binary Search: 234
The problem with VLookup is that it can't return the row number unless you include it in your table.
Here is my code for the Binary search, I've assumed that the sheet has a header row, but you could easily modify the header and code to pass that information. The optional Col parameter is used to indicated if you want the row number, or a cell's value. The function returns 0 (zero) if the find fails.
Function Find(Sheet As Worksheet, What As String, Optional Col As Long = 0) As Variant
Dim Top As Long
Dim Mid As Long
Dim Bot As Long 'Bottom
Dim S As String
Dim T As String
With Sheet
Top = 2 'Sheet has a header row
Bot = .UsedRange.Rows.Count
S = LCase(What)
Do
Mid = (Top + Bot) / 2
T = LCase(.Cells(Mid, 1))
Select Case True
Case T > S
Bot = Mid - 1
Case T < S
Top = Mid + 1
Case Else 'T = S
If Col = 0 Then
Find = Mid 'Return the row
Else
Find = .Cells(Mid, Col).Value2 'Return the cell's value
End If
Exit Function
End Select
Loop Until Bot < Top
End With
Find = 0
End Function
精彩评论