Processing Time for large file

Nov 4, 2014 at 9:58 PM
Looking for any advice to speed up processing time.

I've got a 23 MB .xls file with 33 columns and ~49,000 rows.

I used the following snippet to read it in chunks, so I'm not getting a timeout error anymore, but processing time is still extraordinarily slow. For chunks of 1024 rows, it's taking ~20 seconds per chunk.
/**  Create a new Reader for the Excel5 (.xls) format  **/
$objReader = PHPExcel_IOFactory::createReader('Excel5');

/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize = 1024;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new ChunkReadFilter();

/**  Tell the Reader that we want to use the Read Filter  **/
$objReader->setReadFilter($chunkFilter);

/**  Print the start time  **/
$x = new \DateTime();
echo $x->format('H:i:s')."\n";
/**  Loop to read our worksheet in "chunk size" blocks  **/
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
    /**  Tell the Read Filter which rows we want this iteration  **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /**  Load only the rows that match our filter  **/
    $objPHPExcel = $objReader->load($this->importDir.$fileName);
    //    Do some processing here
    var_dump($chunkSize);
    /**  Print the time after each chunk  **/
    $x = new \DateTime();
    echo $x->format('H:i:s')."\n";
}

Is there anything that can be done to speed up the process?

a) Does PHPExcel support OpCache?
b) Would switching to it help at all?
Coordinator
Nov 4, 2014 at 10:25 PM
If you're using OpCache, it doesn't care one way or another what PHP code you're running; it's simply a bytecode cache. It saves the time spent parsing the code, and it works with any PHP code. Nothing special is required in that code, so PHPExcel doesn't need to "support" OpCache: OpCache will work with it in the same way that it works with any other PHP script.
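
To see whether OpCache is actually in effect for a given script, a check along the following lines may help. This is only a sketch: `opcacheIsActive()` is a hypothetical helper name, not part of PHPExcel or PHP. Note that `opcache.enable_cli` defaults to off, so if the import runs from the command line the cache is normally not used at all unless that setting is changed.

```php
<?php
/**
 * Hypothetical helper: report whether OPcache is loaded and enabled
 * for the current request or CLI run.
 */
function opcacheIsActive()
{
    // The function does not exist when the OPcache extension is not loaded.
    if (!function_exists('opcache_get_status')) {
        return false;
    }
    // opcache_get_status() returns false when the cache is disabled here;
    // @ hides the warning raised if opcache.restrict_api blocks the call.
    $status = @opcache_get_status(false);
    return is_array($status) && !empty($status['opcache_enabled']);
}

echo opcacheIsActive() ? "OPcache is active\n" : "OPcache is not active\n";
```

Either way, the cache only removes the cost of compiling the PHP source, so it should not be expected to change the time spent reading the spreadsheet itself.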

Using read chunking will slow down your code. Read chunking is a way to reduce PHPExcel's memory consumption by loading only part of the spreadsheet into memory rather than all of it in one go, but it comes at a cost in speed. If you can read the entire workbook into memory in one go, it will be faster. If you don't have enough memory to load it all in one go, then work with the largest chunk size that you can hold in memory, so that the chunking loop needs fewer iterations.
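
As a rough illustration of the trade-off, the sketch below counts how many `load()` calls a chunked loop makes and shows what a single-pass load might look like. `countChunkLoads()` and `loadWholeWorkbook()` are hypothetical helper names, and the sketch assumes the PHPExcel classes are already available through an include or autoloader. The `setReadDataOnly(true)` call is an optional extra that was not part of the question: it skips styles and number formats, which usually lowers memory use, but dates then come back as raw serial numbers.

```php
<?php
/**
 * Hypothetical helper: number of $objReader->load() calls made by a loop
 * of the form
 *   for ($startRow = $firstRow; $startRow <= $lastRow; $startRow += $chunkSize)
 */
function countChunkLoads($firstRow, $lastRow, $chunkSize)
{
    if ($chunkSize < 1 || $lastRow < $firstRow) {
        return 0;
    }
    return (int) floor(($lastRow - $firstRow) / $chunkSize) + 1;
}

/**
 * Hypothetical helper: load the whole .xls workbook with one load() call.
 * Assumes PHPExcel is already loaded; not called in this sketch.
 */
function loadWholeWorkbook($path)
{
    $objReader = PHPExcel_IOFactory::createReader('Excel5');
    // Optional assumption: read cell values only, without formatting.
    $objReader->setReadDataOnly(true);
    return $objReader->load($path);
}

echo countChunkLoads(2, 65536, 1024), "\n";  // loop as posted: 64
echo countChunkLoads(2, 49000, 1024), "\n";  // stopping at ~49,000 rows: 48
echo countChunkLoads(2, 49000, 16384), "\n"; // larger chunks: 3
```

At roughly 20 seconds per chunk, the loop as posted would run for about 21 minutes, and 16 of its 64 iterations would be reading rows beyond the ~49,000 that hold data. If memory allows, `loadWholeWorkbook($this->importDir.$fileName)` would replace the loop entirely.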