Reading is slow

Sep 29, 2010 at 2:20 PM
Edited Sep 29, 2010 at 2:21 PM

Hi,

I'm using PHPExcel 1.7.4 to read an Excel 2003 file.

I'm experiencing major slowness: a 2 MB file takes 30 seconds to read.

The code I use is extremely basic:

require_once 'PHPExcel/PHPExcel.php';
$objPHPExcel = new PHPExcel();
$reader = PHPExcel_IOFactory::createReader('Excel5');
$objPHPExcel = $reader->load('C:\xampplite\htdocs\ca.xls');
$objWorksheet = $objPHPExcel->getActiveSheet("stats");
$cell = $objWorksheet->getCell('A1');
echo $cell->getCalculatedValue();



I ran the same test with fopen/fgetcsv on the same 2 MB file converted to CSV, and it took 0.2 s to open and read through the entire file.

Does anyone have an idea?

Thank you.

 

Coordinator
Sep 29, 2010 at 2:31 PM

There is quite a difference between reading a simple text-only CSV file using PHP's built-in functions and reading a proprietary binary file, complete with datatyping and formatting information, using code written in PHP. Comparing reading CSV with fopen/fgetcsv against reading XLS with PHPExcel is like comparing a Ferrari with a giraffe.

If you don't need the formatting information, then do

$reader->setReadDataOnly(true);

between creating the reader and loading the file.

Also, you don't need to instantiate a PHPExcel object when reading a file: the reader does that for you.
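Taking both suggestions together, a minimal sketch of the opening post's example, adjusted accordingly (same file path and Excel5 reader as above), might look like this:

require_once 'PHPExcel/PHPExcel.php';

// Create the reader and tell it to skip formatting information
$reader = PHPExcel_IOFactory::createReader('Excel5');
$reader->setReadDataOnly(true);

// load() returns a PHPExcel object, so there's no need for "new PHPExcel()"
$objPHPExcel = $reader->load('C:\xampplite\htdocs\ca.xls');

$objWorksheet = $objPHPExcel->getActiveSheet();
echo $objWorksheet->getCell('A1')->getCalculatedValue();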

Sep 29, 2010 at 9:38 PM

Well, the CSV test was just there to verify that there was no general performance problem reading the file itself.

I previously used PHPExcel to create files and was very satisfied with it. If PHPExcel can't read files bigger than 10 MB (which is my situation, whatever memory or time I give PHP), that's no big deal; I just want to know.

Coordinator
Sep 29, 2010 at 10:34 PM
Edited Sep 29, 2010 at 10:38 PM
sabotagex wrote:

If PHPExcel can't read files bigger than 10 MB (which is my situation, whatever memory or time I give PHP), that's no big deal; I just want to know.

Can you please advise me where this limit is imposed by PHPExcel? This is the second time somebody has mentioned a 10 MB limit in threads here, and I'm not aware of any such limit. If I know where the limit is set, then I can remove it. The only limits imposed are those of Excel (e.g. 65,536 rows and 256 columns per worksheet when using Excel5) or of PHP (memory_limit).

Sep 30, 2010 at 6:36 AM

My initial file is 30 MB and contains nothing more than data, with no calculations or formatting.

When I hit this issue, I removed two thirds of the rows, leaving a 10 MB file.

I first exceeded the 256 MB PHP memory limit on my server, so I removed the memory limit.

I then exceeded the 60 s PHP execution time limit for a script, so I removed the time limit as well.

The PHP process then used around 300 MB of memory and between 50 and 90% of the CPU, and I chose to stop it after 5 minutes.
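(For reference, a minimal sketch of lifting those limits from the script itself; whether it was actually done this way or in php.ini isn't stated above:)

// Remove the PHP memory and execution time limits described above
ini_set('memory_limit', '-1');   // no memory limit
set_time_limit(0);               // no execution time limit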


Coordinator
Sep 30, 2010 at 9:57 AM

Yes, PHPExcel is memory-hungry; and because it's written in PHP rather than (for example) C or C++ like many PHP extensions, it's slower than compiled code would be.

File size isn't a good measure for workbook files when working with PHPExcel. The number of rows and columns (i.e. cells) is more important. You might also say that your worksheet has no formatting, but that isn't strictly true: Excel applies a default format to every cell, so formatting is still there.

The PHPExcel code itself has a footprint of between 10 and 25MB, depending on which components are being accessed.

At present, each cell in a workbook takes on average 1 KB of memory without any caching (or 1.6 KB on 64-bit PHP; I'll assume 32-bit PHP for the moment), so (for example) a worksheet of 8,000 rows by 31 columns (248,000 cells) will need about 242 MB. With cell caching (such as php://temp or DiskISAM), that can be reduced to about a third, so the 8,000 rows by 31 columns will require about 80 MB.

As a typical rule of thumb, I'd double that value if you need to write the workbook as well, so that 80 MB would become 160 MB, plus the code footprint of 10-25 MB... giving an expected memory requirement of about 185 MB.
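To make the arithmetic concrete, here is a rough sketch of that estimate using the figures quoted above (1 KB per cell on 32-bit PHP, roughly a third of that with cell caching, doubled for writing, plus the code footprint):

// Rough memory estimate for an 8,000-row by 31-column worksheet (32-bit PHP)
$cells       = 8000 * 31;                       // 248,000 cells
$readMB      = $cells * 1024 / (1024 * 1024);   // ~242 MB without caching
$cachedMB    = $readMB / 3;                     // ~80 MB with cell caching
$readWriteMB = $cachedMB * 2;                   // ~160 MB if writing as well
$totalMB     = $readWriteMB + 25;               // plus a 10-25 MB code footprint: ~185 MB
printf("%d cells: ~%.0f MB cached read, ~%.0f MB read/write total\n",
       $cells, $cachedMB, $totalMB);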

There are a number of options available to help you reduce the memory usage:

Cell caching reduces memory usage, but at a cost in speed:

require_once './Classes/PHPExcel.php';

$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objPHPExcel = $objReader->load("test.xlsx");

Additionally, a host of other techniques for reducing memory usage are described in this thread.

 

A lot of factors can affect speed of execution. I've created a simple Excel5 workbook with a single worksheet comprising 32,769 rows and 17 columns (557,073 cells in total, and a file size of 28 MB) and run some tests:

- No caching, full workbook read: 491 seconds, peak memory 731.75 MB
- No caching, ReadDataOnly: 469 seconds, peak memory 718 MB
- phpTemp caching (32 MB memory cache), full workbook read: 802 seconds, peak memory 319.75 MB
- phpTemp caching (32 MB memory cache), ReadDataOnly: 840 seconds, peak memory 290.25 MB

Admittedly, this is on a heavily loaded server, with a slower processor and only 2GB memory. However, it gives a general feel for the effects of using cell caching and/or ReadDataOnly.
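(For anyone who wants to repeat this kind of test, a minimal sketch of a timing/memory harness follows; it is not the exact script used for the figures above, and the file name is illustrative:)

require_once './Classes/PHPExcel.php';

$startTime = microtime(true);

$objReader = PHPExcel_IOFactory::createReader('Excel5');
// $objReader->setReadDataOnly(true);   // uncomment for the ReadDataOnly runs
$objPHPExcel = $objReader->load('test.xls');

echo 'Total time to read the workbook was ',
    round(microtime(true) - $startTime), ' seconds', PHP_EOL;
echo 'Peak memory usage was ',
    memory_get_peak_usage(true) / (1024 * 1024), ' MB', PHP_EOL;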

Believe me: if I could figure out a way of loading workbooks more quickly, and of using less memory, it would already be implemented.