How do I split an Excel file using Java?
I have an Excel sheet with 200,000 rows, and I want to split it into separate files of 50,000 records each. I am using the Apache POI API to read and write the Excel file. Is it possible to split the file once the number of rows reaches a defined record size? Please help me find a solution to this problem.
Code:
public String[][] getSheetData(int sheetIndex)
{
    int noOfColumns = 0;
    int noOfRows = 0;
    int i = 0, j = 0;
    String[][] data = null;
    XSSFSheet sheet = null;
    XSSFRow row = null;
    XSSFCell cell = null;
    try {
        loadFile(); // load the given Excel file
        if (validateIndex(sheetIndex)) {
            sheet = workbook.getSheetAt(sheetIndex);
            noOfColumns = getNumberOfColumns(sheetIndex);
            noOfRows = getNumberOfRows(sheetIndex) + 1;
            data = new String[noOfRows][noOfColumns];
            Iterator rowIter = sheet.rowIterator();
            while (rowIter.hasNext()) {
                row = (XSSFRow) rowIter.next();
                Iterator cellIter = row.cellIterator();
                j = 0;
                while (cellIter.hasNext()) {
                    cell = (XSSFCell) cellIter.next();
                    if (cell.getCellType() == XSSFCell.CELL_TYPE_STRING) {
                        data[i][j] = cell.getStringCellValue();
                    } else if (cell.getCellType() == XSSFCell.CELL_TYPE_NUMERIC) {
                        data[i][j] = Double.toString(cell.getNumericCellValue());
                    }
                    j++;
                }
                i++;
            } // outer while
        } else {
            throw new InvalidSheetIndexException("Invalid sheet index.");
        }
    } catch (Exception ex) {
        logger.error(ex);
    }
    return data;
}
Exception that occurs:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.apache.xmlbeans.impl.store.Locale$ScrubBuffer.<init>(Locale.java:1885)
at org.apache.xmlbeans.impl.store.Locale.getScrubBuffer(Locale.java:1904)
at org.apache.xmlbeans.impl.store.Xobj.getValueAsString(Xobj.java:1205)
at org.apache.xmlbeans.impl.store.Xobj.fetch_text(Xobj.java:1796)
at org.apache.xmlbeans.impl.values.XmlObjectBase.get_wscanon_text(XmlObjectBase.java:1332)
at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1269)
at org.apache.xmlbeans.impl.values.JavaLongHolder.longValue(JavaLongHolder.java:53)
at org.apache.xmlbeans.impl.values.XmlObjectBase.getLongValue(XmlObjectBase.java:1502)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTRowImpl.getR(Unknown Source)
at org.apache.poi.xssf.usermodel.XSSFRow.getRowNum(XSSFRow.java:321)
at org.apache.poi.xssf.usermodel.XSSFSheet.initRows(XSSFSheet.java:180)
at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:147)
at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:134)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:235)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:190)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:189)
at com.arosys.excelReading.ReadXLSX.loadFile(ReadXLSX.java:49)
at com.arosys.excelReading.ReadXLSX.getNumberOfSheet(ReadXLSX.java:121)
at com.arosys.excelReading.ReadXLSX.main(ReadXLSX.java:324)
Java Result: 1
Thanks
If you cannot read this document with POI but the table is simple, I'd suggest opening it in Excel, saving it as a CSV file, and then splitting that file into fragments. This can even be done with a shell script or batch file.
This recommendation applies if you have one huge file and have to split it. If, however, you are implementing a service that receives such Excel files and splits them, look for another solution.
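Once the data is in CSV form, the splitting step needs no Excel library at all. Here is a minimal sketch in plain Java (the class name `CsvSplitter`, the `.partN.csv` naming scheme, and the assumption that the first line is a header repeated in every part are all mine for illustration):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CsvSplitter {
    // Splits a CSV file into parts of at most `chunkSize` data rows each,
    // repeating the header line at the top of every part.
    // Returns the number of part files written.
    public static int split(Path input, int chunkSize) throws IOException {
        int part = 0;
        try (BufferedReader in = Files.newBufferedReader(input)) {
            String header = in.readLine();
            if (header == null) return 0; // empty file, nothing to split
            String line;
            BufferedWriter out = null;
            int rowsInPart = 0;
            while ((line = in.readLine()) != null) {
                if (out == null || rowsInPart == chunkSize) {
                    // current part is full (or not started yet): open the next one
                    if (out != null) out.close();
                    part++;
                    out = Files.newBufferedWriter(
                            Paths.get(input + ".part" + part + ".csv"));
                    out.write(header);
                    out.newLine();
                    rowsInPart = 0;
                }
                out.write(line);
                out.newLine();
                rowsInPart++;
            }
            if (out != null) out.close();
        }
        return part;
    }
}
```

With a chunk size of 50,000 this streams the file line by line, so memory use stays constant regardless of how many rows the source has.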
The default Java heap size is very small, especially when working with large XML files (which is what .xlsx is internally).
You just need to increase your memory size and you'll be fine: set a decent value for the -Xmx option when running Java.
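For example, something along these lines (2 GB here is an arbitrary illustration; pick a value your machine can afford, and the class name is taken from the stack trace above):

```
java -Xmx2048m -cp <your-classpath> com.arosys.excelReading.ReadXLSX
```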
First of all, follow Gagravarr's recommendation and maximize your memory footprint. If your platform allows, switch to a 64-bit JVM. This brute-force approach alone may solve your problem.
If possible, avoid .xlsx files: POI consumes much more memory on .xlsx than on .xls.
POI recommends using the event model for reading (reading is where most of the memory is consumed), and you can also try their new SXSSF API, which has a small memory footprint for writing (still in beta).
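To illustrate the writing side: SXSSF keeps only a sliding window of rows in memory and flushes the rest to temporary files on disk. A minimal sketch of writing the data in 50,000-row chunks with `SXSSFWorkbook` (the method name, the `partN.xlsx` file names, and the `String[][] data` shape are my assumptions; the POI calls themselves are the real SXSSF API):

```java
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class ChunkedXlsxWriter {
    // Writes `data` into several .xlsx files of at most `chunkSize` rows each.
    public static void writeInChunks(String[][] data, int chunkSize) throws Exception {
        for (int start = 0; start < data.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, data.length);
            // keep only 100 rows in memory; older rows are flushed to disk
            SXSSFWorkbook wb = new SXSSFWorkbook(100);
            Sheet sheet = wb.createSheet();
            for (int r = start; r < end; r++) {
                Row row = sheet.createRow(r - start);
                for (int c = 0; c < data[r].length; c++) {
                    row.createCell(c).setCellValue(data[r][c]);
                }
            }
            try (FileOutputStream out =
                     new FileOutputStream("part" + (start / chunkSize + 1) + ".xlsx")) {
                wb.write(out);
            }
            wb.dispose(); // delete the temporary files SXSSF wrote to disk
        }
    }
}
```

Note that this only fixes the writing half; to read 200,000 rows without blowing the heap you would still need the XSSF event model (SAX-based reading) instead of loading the whole `XSSFWorkbook` as your current code does.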