How to read large xlsx file using PHPExcel

Topics: Developer Forum, Project Management Forum, User Forum
Nov 21, 2013 at 11:37 AM
Edited Nov 21, 2013 at 11:43 AM
Hello Everyone

I want to load large file that is 90mb. when i am loading small file that is 4mb it work fine. but when large file is loaded it giving me Apache http server has stopped working

my code is:
<?php

require_once 'PHPExcel/Classes/PHPExcel.php';
$inputFileType = PHPExcel_IOFactory::identify('mylarge.xlsx');

                $objReader = PHPExcel_IOFactory::createReader($inputFileType);  

                $objReader->setReadDataOnly(true);

                /**  Load $inputFileName to a PHPExcel Object  **/  
                $objPHPExcel = $objReader->load('mylarge.xlsx');

                $total_sheets=$objPHPExcel->getSheetCount(); 

                $allSheetName=$objPHPExcel->getSheetNames(); 
                $objWorksheet = $objPHPExcel->setActiveSheetIndex(0); 
                $highestRow = $objWorksheet->getHighestRow(); 
                $highestColumn = $objWorksheet->getHighestColumn();  
                $highestColumnIndex = PHPExcel_Cell::columnIndexFromString($highestColumn);  
                for ($row = 1; $row <= $highestRow;++$row) 
                {  
                    for ($col = 0; $col <$highestColumnIndex;++$col)
                    {  
                        $value=$objWorksheet->getCellByColumnAndRow($col, $row)->getValue();  

                              $arraydata[$row-1][$col]=$value; 


                    }  

                }



        print_r($arraydata);

?>
i have changed my php.ini file memory_limit = -1 and also changed maximum execution time.

can any one tell me what was wrong.

thank you
Nov 21, 2013 at 1:41 PM
And no message from php about impossibility of having memory?
Try to adjust the memory limit on the basis of material resources, added spies to see the evolution, during course of the file, the occupation of memory (Mark considers that it takes about a KB per cell).
Try to implement cache techniques offered by PHPExcel (SQLite3 is one that saves the most)
Nov 22, 2013 at 4:33 AM
Hello LWol,

Thank you for reply. i tried PHPExcel SQLite3 as following in my code:

<?php

require_once 'PHPExcel/Classes/PHPExcel.php';

                 $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_sqlite3;  /* here i added */
                $cacheEnabled = PHPExcel_Settings::setCacheStorageMethod($cacheMethod);
                if (!$cacheEnabled)
                {
                    echo "### WARNING - Sqlite3 not enabled ###" . PHP_EOL;
                }
                $objPHPExcel = new PHPExcel();
                     $inputFileType = PHPExcel_IOFactory::identify('mylarge.xlsx');

                $objReader = PHPExcel_IOFactory::createReader($inputFileType);  

                $objReader->setReadDataOnly(true);

                /**  Load $inputFileName to a PHPExcel Object  **/  
                $objPHPExcel = $objReader->load('mylarge.xlsx');

                $total_sheets=$objPHPExcel->getSheetCount(); 

                $allSheetName=$objPHPExcel->getSheetNames(); 
                $objWorksheet = $objPHPExcel->setActiveSheetIndex(0); 
                $highestRow = $objWorksheet->getHighestRow(); 
                $highestColumn = $objWorksheet->getHighestColumn();  
                $highestColumnIndex = PHPExcel_Cell::columnIndexFromString($highestColumn);  
                for ($row = 1; $row <= $highestRow;++$row) 
                {  
                    for ($col = 0; $col <$highestColumnIndex;++$col)
                    {  
                        $value=$objWorksheet->getCellByColumnAndRow($col, $row)->getValue();  

                              $arraydata[$row-1][$col]=$value; 


                    }  

                }



        print_r($arraydata);

?>


But i am getting same problem. can you please help me on this. where i am going wrong.

thank you.
Nov 22, 2013 at 8:40 AM
Your file contains multiple worksheets? If Yes, do you need all at the same time? One possibility would be to say to the reader the sheet to load, to do multiple passes if necessary.
For a given worksheet, you can perhaps recover data in several times, through a readFilter, storing them in a temporary table (and not in memory).

You will find examples of implementation in document "PHPExcel User Documentation - Reading Spreadsheet Files", chapters 4.2 and 4.3.
To combine with the use of the cache as you did.

Add debugging information during the various stages for memory used
Nov 22, 2013 at 10:13 AM
Hello LWol,

In my large file there is only one sheet. and tried all cache method which are here but i am faceing same problem.

refereed this and tried all.

i did cache technique but still not got.

Here is code:
<?php

require_once 'PHPExcel/Classes/PHPExcel.php';

$inputFileName = 'largefile.xlsx';

$inputFileType = PHPExcel_IOFactory::identify($inputFileName);

/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */

class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;

    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow    = $startRow;
        $this->_endRow        = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}


echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';
/**  Create a new Reader of the type defined in $inputFileType  **/

$objReader = PHPExcel_IOFactory::createReader($inputFileType);



echo '<hr />';


/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize = 20;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new chunkReadFilter();

/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/
$objReader->setReadFilter($chunkFilter);

/**  Loop to read our worksheet in "chunk size" blocks  **/
/**  $startRow is set to 2 initially because we always read the headings in row #1  **/

for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
    echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ',$startRow,' to ',($startRow+$chunkSize-1),'<br />';
    /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/
    $objPHPExcel = $objReader->load($inputFileName);

    //    Do some processing here

    $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);
    var_dump($sheetData);
    echo '<br /><br />';
}
?>
can you please tell me what is the problem.


and i also tried to convert this xlsx file to CSV but that is also coming same error

Code to convert xlsx to csv:
<?php
require_once 'PHPExcel/PHPExcel/IOFactory.php';
$excel = PHPExcel_IOFactory::load("large.xlsx");
$writer = PHPExcel_IOFactory::createWriter($excel, 'CSV');
$writer->setDelimiter(";");
$writer->setEnclosure("");
$writer->save("test123.csv");
?>

is any other way to read xlsx large file?

Thank you.
Nov 25, 2013 at 12:02 PM
OK, there is the readfilter, but where is the cache?
And it is necessary to add logs to the evolution of the memory required.
Prior to that, how many cells are present in your workbook?
How much memory can ask php? PHP version x86 or x64?
If you beyond the limits of the 'possible'...
Nov 26, 2013 at 8:34 AM
Hi LWol,

Thank you for your reply .

I have added cache. even those it is not working.

my large file has 9260442 cells.

and PHP version x86.

Thank you.
Nov 26, 2013 at 2:15 PM
Edited Nov 27, 2013 at 6:01 AM
Using the direct method (full load, no cache), you would need 9GB, that you can't get in x 86.
The method of filtering can reduce the need, but after test, the script that you use does not free memory.
But if you add
$objPHPExcel-&gt; disconnectWorksheets();
unset ($objPHPExcel);
After having recovered data (you can also add some echo of memory_get_usage to give you an idea of what it uses as memory), you recover memory.
Remake a test as that and be patient: the file must be read as many times as it passes through the load().
Then, see if you can adjust the chunksize and if the final startrow is correct.
You can try with and without cache to find the best compromise (time / memory).
(Note: test conducted with a workbook containing a sheet with 30 columns and ~ 30 k lines, representing more than 900 k cells - 1/10 of what you have, but on my small test machine, I can hardly go further - direct loading fails - it exceeds 128 MB allowed - but pieces loading is successful, memory_get_usage reports ~ 25 MB used, memory_get_peak_usage reported 45 MB. I could therefore double the size of the chunk. Edit : With SQLite3 cache, get_usage : ~5MB, peak : ~30MB
The loading takes of course much time)
Jul 1, 2014 at 2:55 AM
Dec 29, 2014 at 5:43 AM
Recently i am doing my work with barcode,I want read generated code128 in Excel file.So i find the following method of reading xlsx file using PHPExcel.Now i'd like to share with you guys.
First, we need to create a reader for our file. PHP Excel makes this easy for us, as we can use PHPExcel_IOFactory::createReaderForFile() method. It will try to detect filetype by analyzing the extension of the provided file. It can read XLS, XLSX, CSV, PDF and probably even more file formats. So to create a reader, use this code:
1   require('PHPExcel.php'); // found when you download the PHPExcel
2   $Reader = PHPExcel_IOFactory::createReaderForFile($file);
3   $Reader->setReadDataOnly(true); // set this, to not read all excel properties, just data
Reader is ready and I’ve also specified setReadDataOnly() to read only data, not all the properties around it. Usually we don’t need all the properties and this API is quote memory expensive, so try to read as little as possible to get the job done.
Next step is to load the file – you do that by calling load() method:
1   $objXLS = $Reader->load($file);
To read a value from cell A1 in sheet 0 (first sheet) use this code:
1   $value = $objXLS->getSheet(0)->getCell('A1')->getValue();
2   // or to get calculated value, if there is a formula, etc
3   $value = $objXLS->getSheet(0)->getCell('A1')->getCalculatedValue();
API is huge, so please check their not-so-well written documentation about it on their web page.

As I’ve written before, this API is using quite some memory to read in the Excel files, so if you have a long running script, it’s a good practice to unload it as soon as possible.
You can do that by calling disconnectWorksheets() method.
1   $objXLS->disconnectWorksheets();
2   unset($objXLS);