1500M for writing a 30k rows file?

Topics: User Forum
Apr 14, 2008 at 9:13 AM
Using a script to export db table rows into a workbook with 7 sheets and few columns for each, i've noticed that it uses more than 1500 M of memory for a file that should be maximum 10M(using excel 2007 writer)

How is this possible? How can i limit the memory usage?
Apr 16, 2008 at 6:16 PM
Any ideas?

It actually takes 1 Gb of ram for a 10000 rows workbook.
Coordinator
Apr 17, 2008 at 6:20 AM
We know about this problem, but not how to solve it...
PHP uses reference counting to free up memory, which is part of the problem. This can be fixed, but does not improve that much.
Major problem is we are using too much objects under the hood.
Apr 17, 2008 at 10:36 PM
Oh , is there a way to save the file in more steps then? I've read about an user suggesting an add method beside the save one..
Apr 25, 2008 at 2:54 PM
Edited Apr 25, 2008 at 2:54 PM
Actually, I'd need such functionality too.
I'm not sure if the excel format supports simple merging of two files, but maybe it'd be a good idea to support some kind of disk caching? I think a little bit is already implemented in the PHPExcel_Writer..
E.g. my script works on 1000 rows, and when I'm finished I send these rows to the disk cache and calculate the next 1000? Something like that'd be great. :)
Coordinator
Apr 28, 2008 at 6:19 AM
Saving in more steps is not the main issue: the writer actually does not consume that much memory. The base library (PHPExcel_*) is a little too heavy on objects. Still looking into a nice way to make it lighter without breaking any API...
Apr 28, 2008 at 9:18 AM
Well, I didn't mean the Writer, I meant the storage of the excel data - if there is some way to send parts of it to the disk cache.
Anyway, I hope you guys can figure something out. :)
May 5, 2008 at 11:53 AM
Argh, now I reached the limits. Generating a worksheet with 10,000+ rows requires over 1.5 GB memory usage. :(

What about using cache systems like Cache_Lite or memcache?

http://pear.php.net/package/Cache_Lite
http://www.php.net/manual/en/book.memcache.php
Jan 5, 2011 at 5:11 PM
Edited Jan 5, 2011 at 5:19 PM

Just installed v1.7.5 and run straight into memory problems trying to export 6000 rows from a MYSQL table into an Excel2007 spreadsheet. Seems running 15 cols I get to some 5000 rows before 128MB of RAM is exhausted. Been trying to solve this however seems that there are no way of doing this without bumping memory to the roof.

I know this is an old thread but I cant see to find any sollutions. I was thinking of breaking the export into several Excel files, however how could this solve anything as I would need to stitch them together which would bring me back to the initial problem...

Has anyone found a sollution to free up memory? Anyone found a sollution to trim down the system to not enable objects that I will not use?

Regards,

Kim Steinhaug - kim@steinhaug.com - www.steinhaug.no 

 

Coordinator
Jan 5, 2011 at 5:35 PM

Objects that you don't use aren't loaded... that's what the "Lazy Loader" manages.

Have you tried enabling cell caching? Using cell caching can reduce memory requirements to about 1/3.

 

Jan 7, 2011 at 9:36 AM

Thanks Mark!

I added this to my script:

$cacheMethod = PHPExcel_CachedObjectStorageFactory:: cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '32MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

Then the script completed iteration of 15 cols and 6800 records with 66MB footprint. Without that cache above the script would choke after 5000 rows and exhausted 120MB of RAM. Is the above code what you was referring to?

 

Regards,
Kim Steinhaug
kim@steinhaug.comwww.easywebshop.no 

Coordinator
Jan 7, 2011 at 9:51 AM
kimss wrote:

Then the script completed iteration of 15 cols and 6800 records with 66MB footprint. Without that cache above the script would choke after 5000 rows and exhausted 120MB of RAM. Is the above code what you was referring to?

 That's precisely what I was referring to. The default behaviour for PHPExcel is to maintain the entire workbook "in memory" (a default cacheMethod of PHPExcel_CachedObjectStorageFactory::cache_in_memory), which is fast access to cell data, but uses PHP's memory allocation resulting in "memory exceeded" errors with larger workbooks. The alternative cacheMethods store cell data outside of PHP memory (or in a compressed form in PHP memory) giving a trade off between reduced memory usage and slower execution speed; allowing you to work with much larger workbooks within the same PHP memory, but at a cost in speed.

Jan 7, 2011 at 10:04 AM

Im working on benchmarking now, and the workbook is created in 11s with the cached method while the Excel2007 writer takes 25s to create the file. This is a total of 36s to complete the export, which is no problem as I can expand the execution time.

At the top of your head, what would be the fastest writer to create a Excel file? Is there any ways to speed it up?

Thanks Mark!

Coordinator
Jan 7, 2011 at 10:31 AM
Edited Jan 7, 2011 at 10:32 AM
kimss wrote:

At the top of your head, what would be the fastest writer to create a Excel file? Is there any ways to speed it up?

 It should really depend what output format you want, rather than which is the fastest. Not all of the writers support the same feature: CSV is the fastest, for example,and uses the least memory; but has no formatting. Excel5 doesn't support formulae containing functions from the Analysis Toolpack, or autofilter, or document properties (yet).

You can find some comparisons between the performance of the different Writers in this thread http://phpexcel.codeplex.com/Thread/View.aspx?ThreadId=234150 and I've started to put together a chart showing which features are supported by which Readers and Writers in the Documentation (though it will take a long time before I've completed that).

 

If your worksheets include formulae, then you can improve writer speed by setting setPreCalculateFormulas(false). This tells the writer not to calculate the resulting value of any formula, and only the formula itself is written to the generated Excel file. The cost of this is an overhead when opening the file in MS Excel itself as it forces recalculation then (possibly with a prompt asking if you want to recalculate, depending on the version of Excel).

You can also set styles (including number format masks, bold, colours, etc) against rows or columns, rather than each individual cell, while you are building the worksheet. This will also help boost writer speed.

Jan 7, 2011 at 12:17 PM

Thank you for your reply, I really appretiate it!

Regards,
Kim Steinhaug