how to extract data from one excel workbook and output to another using python xlrd/xlwt?
I am trying to write a script which will automate a copy/paste of employee time sheets from several files to one compiled file. Since they are time sheets with project codes some cells are left blank where an employee worked on a different project that day. Also the files have been converted from xlsx(2007) to .csv.xls which xlrd seems to open just fine.
开发者_JS百科I do know how to open and create a book object but my knowledge of this module is very limited so I thought maybe a general algorithm would be helpful:
import xlrd, xlwt
put all following in for or while loop to iterate through files:
book = xlrd.open_workbook('mybook.csv.xls')
extract data; store data for ouput
use for loop to iterate over data, output to final sheet
open next file, repeat process storing each output below the previous
I am looking for anything that will help me find the answers, not just code. Any help would be appreciated. Thanks.
This might help ... it reproduces your data as closely as possible (dates remain as dates, empty cells don't become text cells with 0-length contents, booleans and error cells don't become number cells).
from xlrd import XL_CELL_EMPTY, XL_CELL_TEXT, XL_CELL_NUMBER,
XL_CELL_DATE, XL_CELL_BOOLEAN, XL_CELL_ERROR, open_workbook
from xlwt import Row, easyxf, Workbook
method_for_type = {
XL_CELL_TEXT: Row.set_cell_text,
XL_CELL_NUMBER: Row.set_cell_number,
XL_CELL_DATE: Row.set_cell_number,
XL_CELL_ERROR: Row.set_cell_error,
XL_CELL_BOOLEAN: Row.set_cell_boolean,
}
date_style = easyxf(num_format_str='yyyy-mm-dd')
other_style = easyxf(num_format_str='General')
def append_sheet(rsheet, wsheet, wrowx=0):
for rrowx in xrange(rsheet.nrows):
rrowvalues = rsheet.row_values(rrowx)
wrow = wsheet.row(wrowx)
for rcolx, rtype in enumerate(rsheet.row_types(rrowx)):
if rtype == XL_CELL_EMPTY: continue
wcolx = rcolx
wmethod = method_for_type[rtype]
wstyle = date_style if rtype == XL_CELL_DATE else other_style
wmethod(wrow, wcolx, rrowvalues[rcolx], wstyle)
wrowx += 1
return wrowx
if __name__ == '__main__':
import sys, xlrd, xlwt, glob
rdpattern, wtfname = sys.argv[1:3]
wtbook = Workbook()
wtsheet = wtbook.add_sheet('guff')
outrowx = 0
for rdfname in glob.glob(rdpattern):
rdbook = open_workbook(rdfname)
rdsheet = rdbook.sheet_by_index(0)
outrowx = append_sheet(rdsheet, wtsheet, outrowx)
print outrowx
wtbook.save(wtfname)
I am creating a class called excel functions for xlutils, xlrd, and xlwt that I might ultimately make a library. If you are interested in helping I am trying to make a delete sheet function.
You might want to move towards openpyxl and/or pyexcel because they are easier and have functions for this.
Here is how to copy using open pyxl: Copy whole worksheet with openpyxl
Here is the documentation for pyexcel which is a wrapper for xlwt, xlrd, and xlutils: https://pyexcel.readthedocs.io/en/latest/
If you want to extract data from one excel workbook and output to another you would want to use createCopy(original workbook, other workbook, original filename, new filename)
import xlwt
import xlrd
import xlutils.copy
import xlutils class excelFunctions():
def getSheetNumber(self, fileName, sheetName):
# opens existing workbook
workbook = xlrd.open_workbook(fileName, on_demand=True)
#turns sheet name into sheet number
for index, sheet in enumerate(workbook.sheet_names()):
if sheet == sheetName:
return index
def createSheet(self, fileName, sheetName):
# open existing workbook
rb = xlrd.open_workbook(fileName, formatting_info=True, on_demand=True)
# make a copy of it
wb = xl_copy(rb)
# creates a variable called sheets which stores all the sheet names
sheets = rb.sheet_names()
# creates a string which is equal to the sheetName user input
str1 = sheetName
# checks to see if the given sheetName is a current sheet
if (str1 not in sheets):
# add sheet to workbook with existing sheets
Sheet = wb.add_sheet(sheetName)
# save the sheet with the same file name as before
wb.save(fileName)
else:
# this declares the sheet variable to be equal to the sheet name the user gives
sheet = wb.get_sheet(self.getSheetNumber(fileName, sheetName))
# save the sheet with the same file name as before
wb.save(fileName)
def createCopy(self, fileName, fileName2, sheetName, sheetName2):
# open existing workbook
rb = xlrd.open_workbook(fileName, formatting_info=True)
# defines sheet as the name of the sheet given
sheet = rb.sheet_by_name(sheetName)
# makes a copy of the original sheet
wb = xl_copy(rb)
# creates an int called column_count which is equal to the sheets maximum columns
column_count = sheet.ncols - 1
# creates a blank array called stuff
Stuff = []
# this loops through adding columns from the given sheet name
for i in range (0, column_count):
Stuff.append([sheet.cell_value(row, i) for row in range(sheet.nrows)])
# create a sheet if there is not already a sheet
self.createSheet(fileName, sheetName2)
# defines sheet as the new sheet
sheet = wb.get_sheet(self.getSheetNumber(fileName, sheetName2))
# this writes to the sheet
for colidx, col in enumerate(Stuff):
for rowidx, row in enumerate(col):
sheet.write(rowidx, colidx, row)
# this saves the file
wb.save(fileName2)
精彩评论