开发者

Python xlrd data extraction

I am using python xlrd http://scienceoss.com/read-excel-files-from-python/ to read data from an excel sheet

My question is if i read a row with first cell as "Employee name" in the excel sheet

And there is another row named whose first cell is "Employee name"

How can we read the last column starting with the last row which has "Employee name" in the first cell.Ignoring the previous

  wb = xlrd.open_workbook(file,encoding_override="cp1252") 
  wb.sheet_names()
  sh =  wb.sh开发者_开发问答eet_by_index(0)
  num_of_rows = sh.nrows
  num_of_cols = sh.ncols
  valid_xl_format = 0
  invalid_xl_format = 0

  if(num_of_rows != 0):
     for i in range(num_of_rows):
        questions_dict = {}
        for j in range(num_of_cols):
              xl_data=sh.cell(i,j).value
              if ((xl_data == "Employee name")):
                  # Regardless of how many "Employee name" found in rows first cell,Read only the last "Employee name"


I am using python xlrd http://scienceoss.com/read-excel-files-from-python/ to read data from an excel sheet

You need to think about what you are doing, instead of grabbing some blog code and leaving in totally irrelevant stuff like wb.sheet_names() and omitting parts very relevant to your requirement like first_column = sh.col_values(0).

Here's how to find the row_index of the last "whatever" in column A (the first column) -- untested:

import xlrd
wb = xlrd.open_workbook(file_name)
# Why do you think that you need to use encoding_overide?
sheet0 = wb.sheet_by_index(0)
tag = u"Employee name" # or u"Emp name" or ...
column_0_values = sheet0.col_values(colx=0)
try:
    max_tag_row_index = column_0_values.rindex(tag)
    print "last tag %r found at row_index %d" % (
        tag, max_tag_row_index)
except IndexError:
    print "tag %r not found" % tag

Now we need to interpret "How can we read the last column starting with the last row which has "Employee name" in the first cell"

Assuming that "the last column" means the one with column_index == sheet0.ncols - 1, then:

last_colx = sheet0.ncols - 1
required_values = sheet0.col_values(colx=last_colx, start_rowx=max_tag_row_index)
required_cells = sheet0.col_slice(colx=last_colx, start_rowx=max_tag_row_index)
# choose one of the above 2 lines, depending on what you need to do

If that's not what you mean (which is quite possible as it is ignoring a whole bunch of data (why do you want to read only the last column?), please try to explain with examples what you do mean.

Possibly you want to iterate over the remaining cells:

for rowx in xrange(max_tag_row_index, sheet0.nrows): # or max_tag_row_index + 1
    for colx in xrange(0, sheet0.ncols):
        do_something_with_cell_object(sheet0.cell(rowx, colx))


It's difficult to understand exactly what you're asking.
Posting sample data might help make your intent more clear.

Have you tried iterating over the dataset in reverse?, e.g.:

for i in reversed(range(num_of_rows)):
    ...
    if xl_data == "Employee name":
        # do something 
        # then break since you've found the final "Employee Name"
        break
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜