How to handle large files

Jan 4, 2011 at 7:52 AM

Hi

 

I am new to PHPExcel...

I have run tests on files up to 2MB and have had no troubles...

I now have a very large file - 10 MB, and growing

I have managed to load bits and bobs by $objReader->setLoadSheetsOnly($sheetname);

This works really well

 

However, when I need to set a cell value, then read the calculated value, I need to load most (probably all) sheets. PHP runs out of memory and the script fails.

I am wondering if adding LOADS of RAm would resolve the issue. If so, how much for a 10-15 MB file ?

Or is ther a coding solution ?

 

Thanks in advance to anyone who can provide info

Best

Jason

Coordinator
Jan 5, 2011 at 10:08 PM
Edited Jan 5, 2011 at 10:09 PM

Filesize doesn't have any real meaning: what matters more is the number of cells in the worksheets. My rule of thumb estimate is an average of 1k/cell.

A coding solution is to use cell caching. See the documentation for details. With cell caching, you can reduce that 1k/cell to about 300 bytes/cell; albeit at a cost in speed.

Jan 6, 2011 at 5:45 AM

Hi Mark

Thanks for the reply.

I have opted to exclude the heavy sheets when I can.

However, two worksheets with about 60,000 cells, come in at 1.5 MB each. They contain cells with ormulae like this

=IF(AND(AG1312<45),R1312,IF(AND(AG1312>=45,AG1312<100),S1312,IF(AND(AG1312>=100,AG1312<300),T1312,IF(AND(AG1312>=300,AG1312<500),U1312,IF(AND(AG1312>=500,AG1312<1000),V1312,W1312)))))

Two questions - am aussiming that sort of forumla adds loads of overhead to loading / calculating ?

When in Excel 2010, the cells return calculations fine, but in PHP Excel, I am getting a Fatal error: Call to a member function cellExists() on a non-object in /usr/lib/php/PHPExcel/PHPExcel/Calculation.php on line 3150

I have seen on this board that can come from faulty formulae (not the case since it works in Excel) or non-standard characters in cell references (shouldn't be the case as the only non-standard character used in sheet naming are spaces). ANy other reasons possible ?

 

Thanks

 

Jason

Coordinator
Jan 6, 2011 at 7:37 AM
clipclicmusic wrote:

Two questions - am aussiming that sort of forumla adds loads of overhead to loading / calculating ?

PHPExcel performs no calculation on loading a workbook.

It will calculate the results of formulae when explicitly requested (eg. a call to getCalculatedValue(), or functions that themselves call it such as toArray()). It will also execute all formulae calculations on a save (the result is saved as well as the formula) unless you instruct the writer to setPreCalculateFormulas(false). Note that cells containing formulae are always calculated on a save if they are in (or nested from) an autofit column.

clipclicmusic wrote:

Two questions - am aussiming that sort of forumla adds loads of overhead to loading / calculating ?

When in Excel 2010, the cells return calculations fine, but in PHP Excel, I am getting a Fatal error: Call to a member function cellExists() on a non-object in /usr/lib/php/PHPExcel/PHPExcel/Calculation.php on line 3150

I have seen on this board that can come from faulty formulae (not the case since it works in Excel) or non-standard characters in cell references (shouldn't be the case as the only non-standard character used in sheet naming are spaces). ANy other reasons possible ?

You may also have noticed a number of issues and threads regarding problems with formulae that include references to other worksheets, or nested function calls. The latest version of Calculation.php in the SVN repository fixes these.

Jan 6, 2011 at 8:45 AM

Hi Mark

Thanks for the extra info.

Am 'starting' to see how this works. LONG way to go !

Am downloading Wednesday's version...

I actually installed using PEAR. Is there a simple way to upgrade to this latest version using PEAR

 

Thanks

 

Jason

Jan 7, 2011 at 4:02 PM

@clipclipmusic

To enable the cache Mark is talking about add theese lines after require/include the PHPExcel library:

// Initiate cache
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '32MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

Regards,

Kim Steinhaug, kim@steinhaug.com, www.easywebshop.no

Coordinator
Jan 7, 2011 at 8:57 PM
clipclicmusic wrote:

I actually installed using PEAR. Is there a simple way to upgrade to this latest version using PEAR

 There is no simple way to upgrade to the latest SVN code, or to update individual files using PEAR. The PEAR build is only created as part of the automated process when we do a full production release.

Jan 8, 2011 at 3:52 AM
Thanks !

On 07/01/2011 21:02, kimss wrote:

From: kimss

@clipclipmusic

To enable the cache Mark is talking about add theese lines after require/include the PHPExcel library:

// Initiate cache
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '32MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

Regards,

Kim Steinhaug, kim@steinhaug.com, www.easywebshop.no

Jan 8, 2011 at 2:22 PM
Hi Mark

OK. I installed a second copy. I should really have done that from the off.
Good advice for novices " install the stable version using PEAR (easy as pie)
THen download the unstable versions and simply install them in a directory on the webserver.

Switch the script from one to the other by changing the include path
Also, make sure the relative path to the 'webserver' version is NOT 'PHPExcel/PHPExcel' !


I just ran a test on a tiny file with 5 cells on one sheet and ran this code:

<?php
require_once 'php/PHPExcelSVN/PHPExcel/IOFactory.php';
$objPHPExcel = PHPExcel_IOFactory::load("php/tinytest.xlsx");
$objPHPExcel->setActiveSheetIndex(0);
$objPHPExcel->getActiveSheet()->setCellValue('D2', 50);
echo $objPHPExcel->getActiveSheet()->getCell('D8')->getCalculatedValue() . "
";
echo date('H:i:s') . " Peak memory usage: " . (memory_get_peak_usage(true) / 1024 / 1024) . " MB\r\n";
?>

I get 6MB of memory usage.

12 cells
Peak memory usage: 6144 KB

So there is a 6MB 'cost' to load PHPExcel at the start... and from thereon, for any list readers who want some figures to show mem usage :

1000 cells
01:07:54 Peak memory usage: 9216 KB
3000 cells
01:08:34 Peak memory usage: 12544 KB
10000 cells
Peak memory usage: 26368 KB
20000 cells
Peak memory usage: 46848 KB
40000 cells
Peak memory usage: 87296 KB
80000 cells
Peak memory usage: 167936 KB

Also, thanks to the list for the caching suggestion.
Slows things down, but allows for dealing with the larger files better

Still not at the end of my 'travails' with this, but getting a better understanding

Jjust curious : I've seen discussions about it, but are there any improvements in the memory usage likely to be released in the next month or two ?

On 08/01/2011 01:57, MarkBaker wrote:

From: MarkBaker

clipclicmusic wrote:

I actually installed using PEAR. Is there a simple way to upgrade to this latest version using PEAR

There is no simple way to upgrade to the latest SVN code, or to update individual files using PEAR. The PEAR build is only created as part of the automated process when we do a full production release.

Coordinator
Jan 8, 2011 at 5:14 PM
Edited Jan 8, 2011 at 5:16 PM
clipclicmusic wrote:

Just curious : I've seen discussions about it, but are there any improvements in the memory usage likely to be released in the next month or two

 No plans. I spent most of the last few months working on performance and memory improvements; and with the lazy loader and cell caching, we're probably close to the limits of what can reasonably be done to improve memory any further. The only way I can think of to make further significant reductions, is to break with coding in PHP... and at that point PHPExcel ceases to exist.

Focus for the next couple of months is integrating the code for Charts (for the Excel2007 Reader and Writer); basic writers for OOCalc, Gnumeric and Excel2003XML, an initial HTML Reader; and a complete rewrite of the PHPOLE code (so that it can handle workbook properties and some of the other features not yet supported by the Excel5 Writer).

Jan 8, 2011 at 5:27 PM
Hi Mark

Thanks again for the clear and complete reply

I'll work on making the the Excel file easier for PHPExcel to run with

best

Jason

On 08/01/2011 22:14, MarkBaker wrote:

From: MarkBaker

clipclicmusic wrote:

Just curious : I've seen discussions about it, but are there any improvements in the memory usage likely to be released in the next month or two

No plans. I spent most of the last few months working on performance and memory improvements; and with the lazy loader and cell caching, we're probably close to the limits of what can reasonably be done to improve memory any further. The only way I can think of to make further significant reductions, is to break with coding in PHP... and at that point PHPExcel ceases to exist.

Focus for the next couple of months is integrating the code for Charts (for the Excel2007 Reader and Writer); basic writers for OOCalc, Gnumeric and Excel2003XML, an initial HTML Reader; and a complete rewrite of the OLE code for the Excel5 Writer.