Writing Large Dataset to Excel XLSX Failing

Topics: Developer Forum
Aug 24, 2012 at 9:20 PM
Edited Aug 24, 2012 at 9:22 PM

I am having a problem creating an XLSX file that contains a very large dataset. The data consists of 24 columns and 23,422 rows.

Originally I was writing data to each cell individually using a loop. After doing research, I found that fromArray() should be used as it's more efficient. I modified my code to use fromArray() and it's still failing. I then found some information about caching and implemented that. Same results. Based on different changes to the code, I'm getting either a maximum execution time exceeded error (script was initially set to 90 seconds) or (if I set max execution time to 0) an allowed memory exhausted error, and I've set allowed memory to 384MB.

Here's a code snippet of what I'm doing. Note, when I run this code with a smaller dataset, it functions just fine. It seems to be dying with the larger dataset.

 

// Initiate cache
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '32MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

// set up excel workbook
$excel = new PHPExcel();
$excel->getProperties()
		->setCreator('creator')
		->setLastModifiedBy('username')
		->setTitle('Title of Report')
		->setSubject('Title of Report')
		->setDescription('Title of Report, run on ' . date('m/d/Y H:i:s'));
$excel->createSheet(0);
$excel->setActiveSheetIndex(0);
$excel->getActiveSheet()->fromArray($source);

// save file
$writer = new PHPExcel_Writer_Excel2007($excel);
$filename = 'report-'.time().'.xlsx';
$writer->save('/path/to/'.$filename);

 

I've been researching the web and this forum for an hour or two trying to find related information but I'm now stuck. Any ideas?

Aug 24, 2012 at 9:25 PM

Hello CreedFeed,

the below code could help you.

/** Set Memory Limit */
    ini_set("memory_limit","500M"); // set your memory limit in the case of memory problem

if necessary you can try go up to 1000M

BR
Frank

Aug 24, 2012 at 9:31 PM

Unfortunately the server is configured to override ini_set with the value in php.ini and I am unable to change that.

Besides the memory limit, it is taking forever to run. In my code snippet above, I removed the $excel->getProperties section of code and am rerunning it now. It's been running for about 5 minutes now and just finally died.

I should add, the error I'm getting is:

PHP Fatal error:  Allowed memory size of 402653184 bytes exhausted (tried to allocate 48 bytes) in /path/to/PHPExcel/CachedObjectStorage/CacheBase.php on line 155

It seems weird the cache code is the one eating up the memory, shouldn't it be using phpTemp to store the data? (I have no idea how the cache code works lol)

Aug 24, 2012 at 9:44 PM
Edited Aug 24, 2012 at 9:45 PM

Try this one to use discISAM see manual this had help me some time but now I've so large files so I need myself help from the team see my Discussion point http://phpexcel.codeplex.com/discussions/392997

/** Caching to discISAM 1.0*/
$cacheMethod = PHPExcel_CachedObjectStorageFactory:: cache_to_discISAM;
$cacheSettings = array( 'dir'  => '/usr/local/tmp' // If you have a large file you can cache it optional
                      );
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

Coordinator
Aug 25, 2012 at 12:00 AM
CreedFeed wrote:

Besides the memory limit, it is taking forever to run. In my code snippet above, I removed the $excel->getProperties section of code and am rerunning it now. It's been running for about 5 minutes now and just finally died.

Removing the getProperties() section of the code is a minimal saving, it will make no perceptible difference in time or memory usage.

Using fromArray() is more efficient than writing to each cell individually, but if you have 24 columns and 23,422 rows in your $source array, then $source is taking a lot of memory. An easier compromise (especially if you're building your $source one row at a time, perhaps from a database query) is to populate each row in turn in $source, then use fromArray() for that row, then repopulate $source with the next row, and so on.

 

CreedFeed wrote:

It seems weird the cache code is the one eating up the memory, shouldn't it be using phpTemp to store the data? (I have no idea how the cache code works lol)

 The cache code isn't eating the memory... the cell data is. All cells are stored in a "cache", but the default cache is fully in memory. The cache handler logic serves as a wrapper for different storage media such as memory, php://temp, sqlite, disk, APC, etc as described in the manual, each of which then has its own individual code. It will only use php://temp if you've told it to use php://temp. Even then, it maintains an in-memory index, allowing it to access the cell more quickly in the cache.

All caching is a trade-off between speed and memory. The lower the memory usage, the more execution time overhead in its use. "Memory" is the fastest, but uses the largest amount of memory, SQLite is the most efficient use of memory, but among the slowest methods. The caching wrapper

Aug 27, 2012 at 2:37 PM

Mark > I took your suggestion and rewrote my script to use fromArray() once for each row, because as you assumed, I was pulling data from a single query. I have good news and bad news. Good news is, it worked! 24 columns and 23,422 rows of data was written. The bad news: it took about 15 minutes to finish. I used the following code:

ini_set("memory_limit", "384M"); // set in php.ini, I cannot change this
set_time_limit(0);

// Initiate cache
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array( 'memoryCacheSize' => '32MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

// set up excel workbook
$excel = new PHPExcel();
$excel->getProperties()
		->setCreator('creator')
		->setLastModifiedBy('username')
		->setTitle('Title of Report')
		->setSubject('Title of Report')
		->setDescription('Title of Report, run on ' . date('m/d/Y H:i:s'));
$excel->createSheet(0);
$excel->setActiveSheetIndex(0);

// set up query here
$query = "MY QUERY";
$obj = new Object();
$obj->query($query);
while ($obj->fetch()) {

	$source = array();
	$source[0] = array( my data here );

	$excel->getActiveSheet()->fromArray($source);

	unset($source);
}
	
// save file
$writer = new PHPExcel_Writer_Excel2007($excel);
$filename = 'report-'.time().'.xlsx';
$writer->save('/path/to/'.$filename);

Anything else you can recommend to speed this up some? Is there a different cache method that would increase time? Anything?

Thanks for your feedback up to this point! I really appreciate it, and appreciate your work on this project! It's a great library!