Python xlrd data extraction
I am using python xlrd http://scienceoss.com/read-excel-files-from-python/ to read data from an excel sheet
My question is if i read a row with first cell as "Employee name" in the excel sheet
And there is another row named whose first cell is "Employee name"
How can we read the last column starting with the last row which has "Employee name" in the first cell.Ignoring the previous
wb = xlrd.open_workbook(file,encoding_override="cp1252")
wb.sheet_names()
sh = wb.sh开发者_开发问答eet_by_index(0)
num_of_rows = sh.nrows
num_of_cols = sh.ncols
valid_xl_format = 0
invalid_xl_format = 0
if(num_of_rows != 0):
for i in range(num_of_rows):
questions_dict = {}
for j in range(num_of_cols):
xl_data=sh.cell(i,j).value
if ((xl_data == "Employee name")):
# Regardless of how many "Employee name" found in rows first cell,Read only the last "Employee name"
I am using python xlrd http://scienceoss.com/read-excel-files-from-python/ to read data from an excel sheet
You need to think about what you are doing, instead of grabbing some blog code and leaving in totally irrelevant stuff like wb.sheet_names()
and omitting parts very relevant to your requirement like first_column = sh.col_values(0)
.
Here's how to find the row_index of the last "whatever" in column A (the first column) -- untested:
import xlrd
wb = xlrd.open_workbook(file_name)
# Why do you think that you need to use encoding_overide?
sheet0 = wb.sheet_by_index(0)
tag = u"Employee name" # or u"Emp name" or ...
column_0_values = sheet0.col_values(colx=0)
try:
max_tag_row_index = column_0_values.rindex(tag)
print "last tag %r found at row_index %d" % (
tag, max_tag_row_index)
except IndexError:
print "tag %r not found" % tag
Now we need to interpret "How can we read the last column starting with the last row which has "Employee name" in the first cell"
Assuming that "the last column" means the one with column_index == sheet0.ncols - 1, then:
last_colx = sheet0.ncols - 1
required_values = sheet0.col_values(colx=last_colx, start_rowx=max_tag_row_index)
required_cells = sheet0.col_slice(colx=last_colx, start_rowx=max_tag_row_index)
# choose one of the above 2 lines, depending on what you need to do
If that's not what you mean (which is quite possible as it is ignoring a whole bunch of data (why do you want to read only the last column?), please try to explain with examples what you do mean.
Possibly you want to iterate over the remaining cells:
for rowx in xrange(max_tag_row_index, sheet0.nrows): # or max_tag_row_index + 1
for colx in xrange(0, sheet0.ncols):
do_something_with_cell_object(sheet0.cell(rowx, colx))
It's difficult to understand exactly what you're asking.
Posting sample data might help make your intent more clear.
Have you tried iterating over the dataset in reverse?, e.g.:
for i in reversed(range(num_of_rows)):
...
if xl_data == "Employee name":
# do something
# then break since you've found the final "Employee Name"
break
精彩评论