开发者

Parsing and comparing a complicated string

I am hoping someone could help me out with a VBA Excel macro.

I have received a worksheet in Excel 2007 which contains product names in one column, and I need to sort this into a logical format so I can use it. However, the list itself is not in any kind of logical order, is 10 000 rows long and I am going to have to do this every month!!

Basically, what I would like to do is search for certain keywords which are common to most of the entries and move them into separate cells in different columns (but in the same row as the original entry).

Regarding keywords: There are 3 different types, two of which I have a complete list of.

Example of keywords: some are measures such as cm (centimetre), mm (millimetre), m (metre) etc.). Then there are other keywords such as % and finally a last set of keywords which is wood, plastic, glass etc.

If this was not complicated enough, the measures (cm for example) are duplicated in some instances and are important details so I cant just separate them but would ideally like them in two adjacent cells.

Fortunately, there is a space after each measure, % sign and item material.

Working from right to left is the easiest way I can think of achieving this as the first description in the string varies wildly between entries and that can stay as is.

So, below is an example string, lets say this is in Cell A1. (Inverted commas are not included in the string and the word "by" appears in only about 100 cases. Usually it is missing...)

"Chair Leg Wood 100% 1m by 20cm"

I would ideally like for the string to be split up into cells as follows

Cell B1 - Chair Leg  
Cell C1 - Wood  
Cell D1 - 1m  
Cell E1 - 2cm  
Cell F1 - 100%  

Having the % measures in the same column would be extremely helpful

Can anyone please help me wit开发者_Go百科h this or the beginnings of a macro which does this and then moves down the list - I have tried using some basic "find" and "len" formulas but really am at my wits end on how to deal with this!


The task boils down to defining a robust definition of the structure of the input data.

Form the info provided a candidate definition might be

<Description, one or more words> <Material, one word> <A value followd by %> <Dimension A> <optional "by">  <Dimension B>

The following macro will process data that conforms this this spec. The definition may need expanding, eg two word materials (eg Mild Steel)

You will need to add error handling in case any rows don't conform, eg no % in the string, or % character elsewhere in string

Option Explicit

Dim dat As Variant

Sub ProcessData()
    Dim r As Range
    Dim i As Long

    Set r = Intersect(ActiveSheet.UsedRange, ActiveSheet.Columns(1)).Resize(, 5)
    dat = r
    For i = 1 To UBound(dat, 1)
        ParseRow i, CStr(dat(i, 1))
    Next
    r = dat
    ActiveSheet.Columns(5).Style = "Percent"

End Sub


Sub ParseRow(rw As Long, s As String)
    'Chair Leg Wood 100% 1m by 20cm

    Dim i As Long
    Dim sDim As String, sPCnt As String, sMat As String, sDesc As String
    Dim sA As String, sB As String

    i = InStr(s, "% ")
    sDim = Trim(Replace(Mid(s, i + 2), " by ", " "))  ' text to right of %, remove "by"
    sA = Trim(Left(sDim, InStr(sDim, " ")))           ' split dimension string in two
    sB = Trim(Mid(sDim, InStr(sDim, " ")))
    s = Left(s, i)

    i = InStrRev(s, " ")
    sPCnt = Mid(s, i + 1)        ' text back to first space before %
    s = Trim(Left(s, i))  

    i = InStrRev(s, " ")         ' last word in string
    sMat = Mid(s, i + 1)
    sDesc = Trim(Left(s, i))     ' whats left


    dat(rw, 1) = sDesc
    dat(rw, 2) = sMat
    dat(rw, 3) = sA
    dat(rw, 4) = sB
    dat(rw, 5) = sPCnt

End Sub


First, I'd use the Split function to separate the parts into an array, this will avoid most of the string functions and string math:

Dim parts As Variant
parts = Split(A1)

Then, I'd do my comparisons to each part.
Finally, I'd concatenate the parts I didn't breakout, and place all parts on the sheet.

This is based on your example which has spaces inbetween every part, though something similar could work otherwise, you just have to do more work with each part.


Here's my stab at it. We could use about 10 more examples, but this should be a start. To use, select a one column range with your descriptions and run SplitProduct. It will split it out to the right of each cell.

Sub SplitProducts()

    Dim rCell As Range
    Dim vaSplit As Variant
    Dim i As Long
    Dim aOutput() As Variant
    Dim lCnt As Long

    Const lCOLDESC As Long = 1
    Const lCOLMAT As Long = 2
    Const lCOLPCT As Long = 3
    Const lCOLREM As Long = 4

    If TypeName(Selection) = "Range" Then
        If Selection.Columns.Count = 1 Then
            For Each rCell In Selection.Cells
                'split into words
                vaSplit = Split(rCell.Value, Space(1))
                ReDim aOutput(1 To 1, 1 To 1)

                'loop through the words
                For i = LBound(vaSplit) To UBound(vaSplit)
                    Select Case True
                        Case IsPercent(vaSplit(i))
                            'percents always go in the same column
                            lCnt = lCOLPCT
                            If UBound(aOutput, 2) < lCnt Then
                                ReDim Preserve aOutput(1 To 1, 1 To lCnt)
                            End If
                            aOutput(1, lCnt) = vaSplit(i)
                        Case IsInList(vaSplit(i))
                            'list items always go in the same column
                            lCnt = lCOLMAT
                            ReDim Preserve aOutput(1 To 1, 1 To lCnt)
                            If UBound(aOutput, 2) < lCnt Then
                                ReDim Preserve aOutput(1 To 1, 1 To lCnt)
                            End If
                            aOutput(1, lCnt) = vaSplit(i)
                        Case IsMeasure(vaSplit(i))
                            'measurements go in the last column(s)
                            If UBound(aOutput, 2) < lCOLREM Then
                                lCnt = lCOLREM
                            Else
                                lCnt = UBound(aOutput, 2) + 1
                            End If
                            ReDim Preserve aOutput(1 To 1, 1 To lCnt)
                            aOutput(1, lCnt) = vaSplit(i)
                        Case Else
                            'everything else gets concatentated in the desc column
                            aOutput(1, lCOLDESC) = aOutput(1, lCOLDESC) & " " & vaSplit(i)
                    End Select
                Next i

                'remove any extraneous spaces
                aOutput(1, lCOLDESC) = Trim(aOutput(1, lCOLDESC))

                'write the values to the left of the input range
                rCell.Offset(0, 1).Resize(1, UBound(aOutput, 2)).Value = aOutput

            Next rCell
        Else
            MsgBox "Select a one column range"
        End If
    End If

End Sub

Function IsPercent(ByVal sInput As String) As Boolean

    IsPercent = Right$(sInput, 1) = "%"

End Function

Function IsInList(ByVal sInput As String) As Boolean

    Dim vaList As Variant
    Dim vaTest As Variant

    'add list items as needed
    vaList = Array("Wood", "Glass", "Plastic")
    vaTest = Filter(vaList, sInput)

    IsInList = UBound(vaTest) > -1

End Function

Function IsMeasure(ByVal sInput As String) As Boolean

    Dim vaMeas As Variant
    Dim i As Long

    'add measurements as needed
    vaMeas = Array("mm", "cm", "m")

    For i = LBound(vaMeas) To UBound(vaMeas)
        'any number of characters that end in a number and a measurement
        If sInput Like "*#" & vaMeas(i) Then
            IsMeasure = True
            Exit For
        End If
    Next i

End Function

No guarantees that this will be speedy on 10k rows.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜