开发者

To compare 2 excel files using python

I have a excel file with following fields

Software_name , Version and Count.

This file is a inventory of all softwares installed in the network of an organization which was generated using LANdesk.

I have another excel file which is an purchase inventory of these softwares which is generated manually.

I need to compare these sheets and create a report stating whether the organization is compliant or not.

Hence how do i compare these two files.

there are instances like Microsoft Office is mentioned as just office and 'server' is spelt as 'svr'

How 开发者_JS百科do go about with it?


The first step as SeyZ mentions is to determine how you want to read these Excel files. I don't have experience with the libraries he refers to. Instead I use COM programming to read and write Excel files, which of course requires that you have Excel installed. This capability comes from PyWin32 which is installed by default if you use the ActiveState Python installer, or can be installed separately if you got Python from Python.org.

The next step would be to convert things into a common format for comparing, or searching for elements from one file within the other. My first thought here would be to load the contents of the LANdesk software inventory into a database table using something quick and easy like SQLite.

Then for each item of the manual purchase list, normalize the product name and search for it within the inventory table.

Normalizing the values would be a process of splitting a name into parts and replacing partial words and phrases with their full versions. For example, you could create a lookup table of conversions:

partial               full
--------------------  --------------------
svr                   server
srv                   server
SRV                   Stevie Ray Vaughan
office                Microsoft Office
etc                   et cetera

You would want to run your manual list data through the normalizing process and add partial values and their full version to this table until it handles all of the cases that you need. Then run the comparison. Here is some Pythonish pseudocode:

for each row of manual inventory excel worksheet:
    product = sh.Cells(row, 1) # get contents of row n, column 1
                               # adjust based on the structure of this sheet

    parts = product.split(" ") # split on spaces into a list

    for n, part in enumerate(parts):
        parts[n] = Normalize(part) # look up part in conversion table
    normalProduct = " ".join(parts)

    if LookupProduct(normalProduct): # look up normalized name in LANdesk list
        add to compliant list
    else:
        add to non-compliant list

if len(non-compliant list) > 0:
    TimeForShopping(non-compliant list)

If you have experience with using SQLite or any other database with Python, then creating the LANdesk product table, and the normalize and lookup routines should be fairly straightforward, but if not then more pseudocode and examples would be in order. Let me know if you need those.


There are several libraries to manipulate .xls files.

XLRD allows you to extract data from Excel spreadsheet files. So you can compare two files easily. (read)

XLWT allows you to create some Excel files. (write)

XLUtils requires both of the xlrd and xlwt packages. So, you can read & write easily thanks to this library.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜