Apr 20, 2011 at 8:24 PM
Edited Apr 20, 2011 at 8:33 PM
Well, I have proof that it doesn't always happen. The script executes and terminates
OK! you have proof that it can happen, but that doesn't help identify the cause of the problem, or provide a solution. I've explained how PHP is supposed to work, firing the destructor for each worksheet object as it is flushed from memory, which should
delete the cache file; and I've confirmed that an out of memory error will leave residual files, and explained why that is. I can't think of any other circumstances that will leave residual files on normal termination of PHP code.
Or should I unset the worksheet objects, too?
Because the PHPExcel object contains references to each child worksheet, and each worksheet object contains a reference to the parent PHPExcel object, PHPExcel has cyclic references. Because of these cyclic references, using a straight unset() on either
will not actually unset these objects... it's something that PHP has real problems with. The latest versions of PHP 5.3 do provide a mechanism to handle unsetting in this circumstance; but that is not available in PHP 5.2, so we provide the disconnectWorksheets()
method to break the cyclic reference so that unsetting the workbook will unset it correctly, and unset all its worksheets as well (which should trigger the worksheet destructor to delete the cache files. On normal script termination, this should be unnecessary,
as PHP should be able to do this itself, despite the cyclic references.
If I'm calling getActiveSheet(), why is there more than one cache file if the cache files are per worksheet?
getActiveSheet() simply identifies the sheet that you are currently working with. Unless you tell PHPExcel only to load selected worksheets, it will read
all worksheets in the workbook, and each of those worksheets will have its own cache file. Using getActiveSheet() is simply identifying which of those worksheets you want to manipulate. It selects the currently active worksheet, and
it's collection of cells.
Is there a way to load only the first sheet and ignore the others?
You can load a selected subset of worksheets by name, as described in section 4.2 of the User Documentation entitled "Reading Only Named WorkSheets from a File". If you don't know the names of the worksheets,
you can retrieve a list of these without reading the entire workbook as described in section 4.3 of the User Documentation entitled "Reading the List of Worksheets from the Spreadsheet". To read the first worksheet,
even without knowing its name:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Read the list of Worksheet Names from the Workbook file **/
$worksheetNames = $objReader->listWorksheetNames($inputFileName);
/** Advise the Reader of which WorkSheets we want to load **/
/** Load $inputFileName to a PHPExcel Object **/
$objPHPExcel = $objReader->load($inputFileName);
I still don't understand the chunk methodology that I read about in some other posts, could you explain it?
Chunking works by using a Read Filter to tell the Reader which cells to actually read. The loader will then read only those cells which match the readCell() rule defined in the Read Filter, ignoring all other cells. So, if you have a worksheet with values
in the range A1:IV65535, but use a read filter that tells PHPExcel to load only cells A1:B2, then only the four cells in that range will be loaded when you execute the load()method. Instead of needing over 16GB of memory (assuming no caching) to load every
cell, that worksheet will only need about 4kB to load just 4 cells.
Section 4.4 of the User Documentation entitled "Reading Only Specific Columns and Rows from a File (Read Filters)" describes this process in more detail.