How to set up Disk (Full) cell storage (using a temporary file for each cell)

Topics: Developer Forum
Jul 30, 2010 at 12:30 PM

I've tested on a small range, and the various cache engines are accepted and do change the memory use.

So I can at least confirm it is no Jedi mind trick.

cache_to_discISAM            22 M
cache_in_memory_serialized   21 M
cache_in_memory              32 M
cache_to_phpTemp             22 M
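
For reference, each figure came from a run shaped roughly like the sketch below; 'test-range.xlsx' stands in for the actual test file, and reading the peak with memory_get_peak_usage() is just one way to get the number.

    <?php
    // One run per cache engine (a fresh request keeps the peak figure clean);
    // the constant below is swapped between runs.
    require_once 'PHPExcel.php';

    PHPExcel_Settings::setCacheStorageMethod(
        PHPExcel_CachedObjectStorageFactory::cache_to_discISAM
    );

    $workbook = PHPExcel_IOFactory::load('test-range.xlsx');        // placeholder test file
    echo round(memory_get_peak_usage(true) / 1048576) . " M peak\n";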

This thread http://phpexcel.codeplex.com/Thread/View.aspx?ThreadId=203384 has a table with an extra cache strategy:

Disk (Full) cell storage (using a temporary file for each cell)

How do I set this up? I'm currently failing against a memory_limit of 256M.

Coordinator
Jul 30, 2010 at 12:59 PM

Disk (Full) Cell Storage was never released because it was so slow as to be unusable, as can be seen from the results table:

Cell storage method                          | Memory (MB) | Create (s) | Retrieve (s) | Modify (s)
---------------------------------------------+-------------+------------+--------------+-----------
Current cell storage (at 1.7.2 release)      |       12.25 |     0.2293 |       0.0674 |     0.2117
Serialized cell storage                      |        9    |     0.4141 |       0.3384 |     1.5574
APC cell storage                             |        0    |     0.9113 |       0.5387 |     2.8961
Memcache cell storage                        |        0    |    11.6727 |       5.6859 |    18.4403
Disk (Partial) cell storage                  |        4.75 |     1.0742 |       0.7211 |     2.9781
  (using an ISAM file for all cells)         |             |            |              |
Disk (Full) cell storage                     |        0    |   45.04405 |      10.4741 |    66.5842
  (using a temporary file for each cell)     |             |            |              |

(All timings are for 16,384 cells.)

Cell caching is a trade-off between reduced memory usage and increased execution time, and I made an arbitrary decision that taking over a minute to read/modify/rewrite just 16,384 cells (compared with under 3 seconds for most of the other caching methods) was too much speed overhead for the memory gain it provided... especially when compared with cache_to_discISAM or cache_to_phpTemp.
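
For anyone landing on this thread looking for the released alternatives, here is a minimal sketch of enabling one of them; the '8MB' figure is only an example value, not a recommendation:

    <?php
    require_once 'PHPExcel.php';

    // phpTemp keeps roughly this much cell data in memory before spilling
    // the rest to the php://temp stream ('8MB' is only an example value)
    $cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    $cacheSettings = array('memoryCacheSize' => '8MB');

    if (!PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings)) {
        die($cacheMethod . ' is not available on this server');
    }

    // the cache method must be chosen before the first worksheet is created
    $objPHPExcel = new PHPExcel();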


In practice, none of the cell caching mechanisms reduces cell memory usage to zero. As a further compromise, the cell caching engine still retains an array of cell IDs in memory, which keeps down the cost of checking whether a cell exists and of listing the cells in a row or column. Caching 100% of the cell data is just too slow; this trade-off gives a useful reduction in memory usage (worksheets more than double the size can be held in the same memory as previous versions) while retaining an acceptable level of execution time.
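
To make the shape of that compromise concrete, here is a toy illustration (emphatically not the PHPExcel internals): cell values are pushed out to a temp stream, while the IDs stay in an ordinary PHP array so that existence checks and row/column listings never touch the slow store.

    <?php
    class ToyCellCache
    {
        private $index = array();               // cell IDs (and offsets) kept in memory
        private $stream;                        // serialized cell data, spilled out of memory

        public function __construct()
        {
            $this->stream = fopen('php://temp', 'r+');
        }

        public function store($cellID, $cellData)
        {
            fseek($this->stream, 0, SEEK_END);
            $payload = serialize($cellData);
            $offset  = ftell($this->stream);
            fwrite($this->stream, $payload);
            $this->index[$cellID] = array($offset, strlen($payload));
        }

        public function exists($cellID)
        {
            return isset($this->index[$cellID]);   // cheap: never hits the stream
        }

        public function fetch($cellID)
        {
            if (!$this->exists($cellID)) {
                return null;
            }
            list($offset, $length) = $this->index[$cellID];
            fseek($this->stream, $offset);
            return unserialize(fread($this->stream, $length));
        }
    }

Dropping the in-memory index as well would turn exists() into a stream scan, which is exactly the kind of slowdown the Disk (Full) row above illustrates.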

I'd love to be able to do full caching, including that array, but (for the moment at least) it simply isn't practical.


We are still actively trying to reduce the memory footprint further. I'm still running performance tests on other potential improvements, particularly in the writers, to see where we can make additional savings, but haven't found any significant gains yet. I've also been experimenting with a method for dynamically increasing and reducing the PHP memory limit as needed (not always an option in shared hosting environments), while keeping it to a minimum so that concurrent server requests aren't adversely affected; this may appear in the next release.
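
As a rough illustration of the dynamic-memory idea (only a sketch, not the code under test; ini_set('memory_limit', ...) is frequently disabled on shared hosts):

    <?php
    // Raise memory_limit only when the current headroom is too small,
    // so concurrent requests aren't penalised by a permanently high limit.
    function raiseMemoryLimitIfNeeded($headroomBytes)
    {
        $limit = ini_get('memory_limit');
        if ($limit == -1) {
            return true;                                    // already unlimited
        }

        // Convert shorthand such as "256M" into bytes
        $bytes = (int) $limit;
        switch (strtoupper(substr($limit, -1))) {
            case 'G': $bytes *= 1024;   // fall through
            case 'M': $bytes *= 1024;   // fall through
            case 'K': $bytes *= 1024;
        }

        if (memory_get_usage(true) + $headroomBytes < $bytes) {
            return true;                                    // enough room already
        }

        $newLimit = (int) ceil(($bytes + $headroomBytes) / 1048576) . 'M';
        return ini_set('memory_limit', $newLimit) !== false;
    }

    // e.g. ask for another 64MB of headroom before writing a large sheet
    raiseMemoryLimitIfNeeded(64 * 1048576);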

Jul 30, 2010 at 2:30 PM

Hi Mark

    $cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    $cacheSettings = array('memoryCacheSize' => '256MB');
    PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

    peak memory usage of 46M for 498 products
    peak memory usage of 52M for 598 products
    peak memory usage of 78M for 998 products
    peak memory usage for 2000 products:
    Fatal error: Call to a member function attach() on a non-object in
    /home/potncom/public_html/supplyant/PHPExcel/Classes/PHPExcel/CachedObjectStorage/PHPTemp.php on line 98

I wasn't expecting that.

Are there any guidelines for cache size settings and the trade-offs?