Problem loading large xlsx file even with chunks

Topics: Developer Forum, User Forum
Apr 8, 2014 at 8:38 AM
Hi,

I'm having problems loading a large xlsx file (300.000 rows with 5 columns), about 15 mega in size even if i use chunks.
Bellow is my code, and some debugging:
try {
        $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
        $objReader = PHPExcel_IOFactory::createReader($inputFileType);
        $workSheetInfo = $objReader->listWorksheetInfo($inputFileName);
} catch(Exception $e) {
        return $this->__set_error("Parsing file error");
}

$totalRows = $workSheetInfo[0]['totalRows'];
$MemoryLimit = getMemorySizeAsBytes();
$BufferFactor = 1;
$RowMemory = (1024 * 1) * $workSheetInfo[0]['totalColumns'];
$chunkSize = max(intval(($MemoryLimit / $BufferFactor) / $RowMemory), 100);
$chunkSize = min($totalRows, $chunkSize);
$chunkSize = 1000; // THIS IS FOR TESTING PURPOSE ONLY, EVEN IF I LOWER THE CHUNK SIZE NO EFFECT AT ALL

for ($startRow = 1; $startRow <= $totalRows; $startRow += $chunkSize) {
    $chunkFilter = new chunkReadFilter($startRow, $chunkSize);
    $objReader->setReadFilter($chunkFilter);
    $objPHPExcel = $objReader->load($inputFileName);
    
    $last = ($startRow - 1 ) + $chunkSize;
    for ($r=$startRow; $r <= $last; $r++) {
        $rowData = array();

        for ($c='A'; $c <= $workSheetInfo[0]['lastColumnLetter']; $c++) {
            $rowData[] = $objPHPExcel->getActiveSheet()->getCell($c.$r)->getValue();
        }
        // create an sql query to insert into database, using transactions, after each chunk phase
    }
}
The class for chunk reading:
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;

    private $_endRow = 0;

    /**  We expect a list of the rows that we want to read to be passed into the constructor  */
    public function __construct($startRow, $chunkSize) {
        $this->_startRow    = $startRow;
        $this->_endRow      = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that were configured in the constructor
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}
I'm getting the error: Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 79794656 bytes) in /var/www/xxxx/htdocs/Excel/Classes/PHPExcel/Reader/Excel2007.php on line 327

I've added some trace for the memory usage and to see where the script stops and here are the results:

2014-04-08 11:12:08 Filename: /tmp/phpPQ3BPI, File size: 15533695 - 2895984
2014-04-08 11:12:08 Identify: Excel2007 - 3765312
2014-04-08 11:12:08 Create reader - 3765948
2014-04-08 11:12:20 ChunkSize: 1000, Total rows: 329031, Total cols: 5 - 3769888
2014-04-08 11:12:20 Before loading file: 1 chunk - 3770052

The values after "-" are the result of memory usage function emory_get_usage() and the unit measure are bytes. So, the files size of the xlsx is around 15 mega.

Any ideas what can it be?

Any help will be greatly appreciated

Thanks.
May 1, 2015 at 3:15 PM
@mohorab where you able to find a solution to your issue? I have a similar one, in my case I don't have the memory problem but when I start reading the huge sheet (more than 750K rows with 5 columns) my entire system is useless, I can't even move the mouse.

Thanks in advance.