The basic class for reading xml-spreadsheets of type spreadsheetML is {@link nl.fountain.xelem.lex.ExcelReader}. It can deliver the contents of an xml-file or an xml-InputSource as a fully populated {@link nl.fountain.xelem.excel.Workbook}. This package also offers an event-based model with the {@link nl.fountain.xelem.lex.ExcelReaderListener ExcelReaderListener-interface} which can be fine-tuned with an {@link nl.fountain.xelem.lex.ExcelReaderFilter} that acts as a wrapper for listeners.
ExcelReader reader = new ExcelReader(); Workbook xlWorkbook = reader.getWorkbook("foo/bar/myFile.xml");Now what you have is a fully populated {@link nl.fountain.xelem.excel.ss.XLWorkbook}, that is, the (almost) complete representation of the Excel xml-file
myFile.xml
in Java class-instances. Now you can
inspect it's structure, process it's contents, alter it or send it to the president.
OOoops! You got an
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
, did ya? The thing is that the above code works fine as long as
your workbooks aren't too big. Every cell, row, worksheet and so on in the file
that you're reading is stuffed as an object into the one encompassing object
XLWorkbook. How much will go in there depends -among others- on your system
and VM-settings. So what if you want to read large files and huge input sources?
ExcelReader reader = new ExcelReader(); reader.setReadArea(new Area("E11:M16")); Workbook xlWorkbook = reader.getWorkbook("foo/bar/myFile.xml");The XLWorkbook that is returned will contain {@link nl.fountain.xelem.excel.ss.SSWorksheet SSWorksheets} and {@link nl.fountain.xelem.excel.ss.SSTable SSTables} containing only those cells, rows and columns that were found in the specified area of the succesive worksheets. Apart from cells, rows, columns, tables and worksheets, all other {@link nl.fountain.xelem.excel.XLElement XLElements} will be present and fully populated. Mark that the above code will treat all the worksheets in the workbook in the same way. If we want to set different read areas or clear the read area for individual worksheets we need other code. We'll come to that. First let's take a look at the event-based API.
ExcelReader reader = new ExcelReader(); reader.addExcelReaderListener(new DefaultExcelReaderListener() { public void setRow(int sheetIndex, String sheetName, Row row) { // process row and/or it's cells // ... } }); reader.read("foo/bar/myFile.xml");Of course we could do the same with a non-anonymous listener. And we could be listening to events other then the setRow-event as well.
When we're only interested in particalur parts of the worksheets we might set the read area:
ExcelReader reader = new ExcelReader(); reader.setReadArea(new Area("E11:M16")); reader.addExcelReaderListener(new DefaultExcelReaderListener() { public void setRow(int sheetIndex, String sheetName, Row row) { // process row and/or it's cells // ... } }); reader.read("foo/bar/myFile.xml");Now we will only be notified when the row is within the specified area and the row only contains cells that were found within that area.
class SwappingAreaListener extends DefaultExcelReaderListener { private ExcelReader reader; public SwappingAreaListener(ExcelReader reader) { this.reader = reader; } // override method in DefaultExcelReaderListener public void startWorksheet(int sheetIndex, Worksheet sheet) { switch (sheetIndex) { case 1: reader.setReadArea(new Area("A1:C6")); break; case 2: reader.setReadArea(new Area("G11:G11")); break; default: reader.clearReadArea(); } } }The SwappingAreaListener tells ExcelReader to only read particular areas on sheets 1 and 2 and to read the rest of the sheets unrestricted. Mind that the sheetIndex is 0-based. Here's how SwappingAreaListener might be fit in in code that tells ExcelReader to get a workbook:
ExcelReader reader = new ExcelReader(); SwappingAreaListener sal = new SwappingAreaListener(reader); reader.addExcelReaderListener(sal); Workbook xlWorkbook = reader.getWorkbook("foo/bar/myFile.xml");The returned XLWorkbook will have it's sheets populated according to the restrictions imposed by SwappingAreaListener. Internally ExcelReader uses a {@link nl.fountain.xelem.lex.WorkbookListener} to build the workbook and events are being dispatched to both listeners. WorkbookListener uses these events to build and populate the workbook, SwappingAreaListener at the same time imposes restrictions. At the end of the read that is performed by the getWorkbook-method ExcelReader removes the workbooklistener and then returns the completed XLWorkbook. If you want to use the same ExcelReader to perform more reads that must not be restricted, do not forget to remove the SwappingAreaListener.
Filters and listeners can be configured in a variety of ways. For instance, we use an EmptyRowFilter to count the number of empty rows on a worksheet and abandon listening for further events on that sheet at a certain emptyRowCount. I'm sure your imagination, inventiveness and creativity go beyond that.