Problem of Speed

Topics: Developer Forum
Jan 6, 2011 at 11:49 AM

Hello,

I follow your project for 2 years now. Unfortunately, whereas phpexcel has great functionnality, I found it's really slow compared to basic PEAR-excel module.

Here are some tests I've done :

PHPExcel, generates excel 2007 file :
with standard cache (memory) : 2min15s and it uses 220Mo RAM
with gzipped cache : 3min59s and it uses 141Mo RAM

PEAR generatig excel5 file :
it takes 1min21s and 79Mo

My file contains 25 columns and 4573 rows on only one worksheet.

Do you have any clue in order to improve performance please?

With PEAR, I'm used to regenerate big file (20Mo-40Mo), and I'm not able to used phpexcel as it's slower than PEAR.

Thank you in advance for your help!

Coordinator
Jan 6, 2011 at 12:31 PM
Edited Jan 6, 2011 at 1:07 PM
MarcParis wrote:

Hello,

I follow your project for 2 years now. Unfortunately, whereas phpexcel has great functionnality, I found it's really slow compared to basic PEAR-excel module.

Yes, PHPExcel is slower than the old PEAR Spreadsheet Writer, as I'm constantly being reminded.

  1. Things to remember: PHPExcel both reads and writes, while you need different (and incompatible) PEAR libraries if you want to do both.
  2. PHPExcel supports a wider range of spreadsheet formats than PEAR.
  3. PHPExcel supports a wider range of Excel features than PEAR.
  4. PHPExcel is still supported, and still actively being developed to add new features; PEAR is not.

 

MarcParis wrote:

PHPExcel, generates excel 2007 file :
with standard cache (memory) : 2min15s and it uses 220Mo RAM
with gzipped cache : 3min59s and it uses 141Mo RAM

PEAR generatig excel5 file :
it takes 1min21s and 79Mo

Well you are comparing writing xlsx with one library, and writing xls with another. Why not compare writing xls with both? Especially as PHPExcel's Excel5 Writer is faster than the Excel2007 Writer. PHPExcel will still be slower than PEAR, but at least you're comparing like with like.

 

MarcParis wrote:

Do you have any clue in order to improve performance please?

Yes, but I'm not going to rewrite PHPExcel in C. You'll get far faster performance (even than PEAR) if you use Ilia Alshanetsky's Excel extension (xls only, and requires commercial component) or PHP's .COM extension (requires a COM enabled spreadsheet program such as MS Excel or OpenOffice Calc running on the server).

 

I have spent most of the last 4-months doing nothing but improve memory usage and performance, at the expense of a great deal of my sanity. If I had a simple fix (that didn't entail scrapping PHPExcel in its entirety and restarting from scratch), it would have been implemented already.

All I can promise is that I will spend endless hours testing every single new feature that I add to ensure that it uses as little memory as possible, and executes as quickly as possible. At the same time, if I can improve execution speed, and/or reduce memory usage still further, I will do so; but not at the expense of functionality, or backward compatibility.

MarcParis wrote:

Do you have any clue in order to improve performance please?

If your worksheets include formulae, then you can improve writer speed by setting setPreCalculateFormulas(false). This tells the writer not to calculate the resulting value of any formula, and only the formula itself is written to the generated Excel file. The cost of this is an overhead when opening the file in MS Excel itself as it forces recalculation then (possibly with a prompt asking if you want to recalculate, depending on the version of Excel)

You can also set styles (including number format masks, bold, colours, etc) against rows or columns, rather than each individual cell, while you are building the worksheet.

Coordinator
Jan 6, 2011 at 2:06 PM

Might I add: PHPExcel is not *only* about writing spreadsheets. It can convert between formats, it supports operations in a worksheet (like calculating a formula, which I still think is an insane cool feature!). My suggestion is that if you only need writing capabilities and only XLS, go with PEAR's writer. If you need anything else with more features and that is future proof, go with PHPExcel.

Jan 6, 2011 at 3:37 PM

Thank you for your answer. I understand.

I've made one extra test with excel5 writer on phphexcel

PHPExcel, generates excel5 (based on pear)  file :
with standard cache (memory) : 2min09s and it uses 213Mo RAM


PHPExcel, generates excel 2007 file :
with standard cache (memory) : 2min15s and it uses 220Mo RAM
with gzipped cache : 3min59s and it uses 141Mo RAM

PEAR generatig excel5 file :
it takes 1min21s and 79Mo

My file contains 25 columns and 4573 rows on only one worksheet.


Thus, for my point of view, writersof phpexcel (at least excel 2007 and excel5) are not the roadblock. I would say that the issue is directly phpexcel class.

@Mark Baker : I understand now that already compiled program are faster than php. Thank you for the links you gave me, but I won't use them.
Is there any way to load or compile in advance phpexcel?

Anyway as far as I remember, phpexcel 1.7.5 is far more faster than version I've tested in past (1.5.5)

Right now my application is using PEAR, but as my company has switched to excel 2007, I'm truly willing to apply phpexcel..:D Maybe I can help..:D

@Maartenba : I want it both : functions of phpexcel and speed of PEAR..:)

Coordinator
Jan 6, 2011 at 5:07 PM
MarcParis wrote:

@Mark Baker : I understand now that already compiled program are faster than php. Thank you for the links you gave me, but I won't use them.
Is there any way to load or compile in advance phpexcel?

Anyway as far as I remember, phpexcel 1.7.5 is far more faster than version I've tested in past (1.5.5)

Right now my application is using PEAR, but as my company has switched to excel 2007, I'm truly willing to apply phpexcel..:D Maybe I can help..:D

@Maartenba : I want it both : functions of phpexcel and speed of PEAR..:)

There is a table showing comparisons between execution speed and memory usage for different versions of PHPExcel here based on reading, updating and writing a fairly generic workbook. While this only goes back as far as version 1.7.0 (not as far back as 1.5.5) - mainly because some of the features of my test workbook (while fairly basic) simply weren't supported in earlier releases (and aren't supported in the PEAR Reader and Writer either) - it does show the improvements in the 1.7.5 quite markedly. Believe me, if we could give you the functionality of PHPExcel with the speed of PEAR (or Ilia Excel extension we would).

There is one alternative to PHPExcel for creating/writing xlsx files that I'm aware of: http://sourceforge.net/projects/excelwriterxml/

Native-compiled programs are generally significantly faster than scripted languages (even those that use JIT and compile to bytecode like PHP), but PHP is still a scripted language, so programs in PHP are generally not native-compiled. There are some compilers available (ranging from Facebook's HiPHoP to phc or RoadsEnd) but these generate full applications rather than library code for modules or extensions, and it is also necessary to create a native binary for every operating platform that you need to support... Linux, Windows, Mac, 32-bit, 64-bit, thread safe, and all combinations thereof) The best alternative to a compiler is to install an opcode cache such as APC, ionCube, eAccelerator or WinCache... this will benefit all of your PHP applications (not just libraries like PHPExcel).

I have looked at the possibilities of using phc to convert the PHPExcel source to a PHP extension (it's the only PHP compiler that doesn't only build complete PHP applications) and (theoretically) this should more than double execution speed (in optimized mode) but it has some problems with dependecies on other PHP modules like GD (which PHPExcel uses); and only created extensions for Linux. The gcc code that it produces would need to be built manually for other platforms such as Windows and Mac).

 

We're always glad for any offers of help, and I'm more than happy to have another set of eyes looking over the code. There's every possibility that a fresh perspective will notice things that I might miss.

Currently I'm looking at the Excel5 OLE writer (it needs completely rewriting to handle certain additional features such as workbook properties that aren't currently supported); the cell caching (I believe storing the cellCollection as a 2-D array might be faster, and would help resolve some problems with the iterators); and the calculation engine (looking at how I might boost speed there)... so those blocks of code are subject to frequent change in SVN.

Jan 6, 2011 at 7:17 PM

indeed you are right, if you compile php directly, you will lose compatibility between platform. As I'm developping on windows, and my main server is on Linux, I'll will need to stay on php files..:)

I'll have a look to code, maybe I can find some tips. I was thinking to declare writer type when initialyze phpexcel class, in order to load only necessary module. I guess this can help speeding the script.
For example, you can also precise complexity of excel2007 file : low, medium, high.

What do you think about that?

ps : I've tried pdf writer...and it was really long to generate...:)

Coordinator
Jan 6, 2011 at 8:18 PM
MarcParis wrote:

I'll have a look to code, maybe I can find some tips. I was thinking to declare writer type when initialyze phpexcel class, in order to load only necessary module. I guess this can help speeding the script.
For example, you can also precise complexity of excel2007 file : low, medium, high.

The idea has a few drawbacks... for example, if you're loading the PHPExcel object from a file simply to read, or because you want to convert it to a different format, or you want to generate several different formats. And the different classes/modules are only loaded when actually needed, courtesy of the "Lazy Loader", so pre-declaring the writer

There's also not much overhead in the way the readers generate a PHPExcel object, or the way the writers read from the PHPExcel object to extract the data they need. The slowest single process for the writers is the garbage collection, within the PHPExcel_Worksheet class; and then the myriad calls to each (sub-)object's getHash() method.

 

MarcParis wrote:

ps : I've tried pdf writer...and it was really long to generate...:)

 Yes, the latest version of tcPDF is very slow, and uses a lot of memory. Unfortunately, the upgrade fixed a number of issues people had with the PDF generation, but at a high cost. Somewhere down the line, I'll need to look at alternatives, or at forking the tcPDF code to optimize it.

Jan 7, 2011 at 1:54 PM

I've done others right now.

On same file generation, pear do the job in 7seconds and phpexcel in 60seconds..:( (without talking about memory)

the gap is really big...:(

I'll test xml writer...as writer and phpexcel as reader of excel 2007 file..:)

First step : I need to understand why phpexcel is so greedy on cpu and ram...and now I don't really why..:)

Jan 7, 2011 at 2:44 PM

well I've tried xmlwriter quickly...and...and it's worse than phpexcel, why?

Because it's using as much as memory as phpexcel but with no tools to compressed it.

Thus I forget it.

Jan 7, 2011 at 3:54 PM

Just for reference, as this was of hight importance for me.

This iteration:

<snipp>

$objPHPExcel->getActiveSheet()    
 ->setCellValue("A1", 'Value')    
);

$objPHPExcel->getActiveSheet()    
 ->setCellValue("B1", 'Value')    
);

</snipp>

Consumes pretty much twice as much RAM as this one:

<snipp>

$objPHPExcel->getActiveSheet()
  ->setCellValue("A1", 'Value')      
  ->setCellValue("B1", 'Value') 
);

</snipp>

Given an iteration where you are dumping out MySQL rows with say 50 cols, setting it up on one instance (second example) saves you pretty much 50x RAM.

Regards,

Kim Steinhaug, kim@steinhaug.com, www.steinhaug.no

 

Jan 8, 2011 at 12:19 AM

thank you for this tip.

unfortunately my script is full of "for" loops

do you have any tricks to avoid "for" loops? (for example : loop on row included in a loop on column)

thanks in advance

Jan 9, 2011 at 1:18 PM

Well, my script also have for loops looping through the data from mysql and such. What I ended up doing was hardcoding alot. As an example I use this one in one of the dumps:

$objPHPExcel->getActiveSheet()

->setCellValue($excel_cols[1] . ($row), $item[0])
->setCellValue($excel_cols[2] . ($row), $item[1])
->setCellValue($excel_cols[3] . ($row), $item[2])
->setCellValue($excel_cols[4] . ($row), $item[3])
->setCellValue($excel_cols[5] . ($row), $item[4])
->setCellValue($excel_cols[6] . ($row), $item[5])
->setCellValue($excel_cols[7] . ($row), $item[6])
->setCellValue($excel_cols[8] . ($row), $item[7])
->setCellValue($excel_cols[9] . ($row), $item[8])
->setCellValue($excel_cols[10] . ($row), $item[9])
->setCellValue($excel_cols[11] . ($row), $item[10])
->setCellValue($excel_cols[12] . ($row), $item[11])
->setCellValue($excel_cols[13] . ($row), $item[12])
->setCellValue($excel_cols[14] . ($row), $item[13])
->setCellValue($excel_cols[15] . ($row), $item[14])
->setCellValue($excel_cols[16] . ($row), $item[15])
->setCellValue($excel_cols[17] . ($row), $item[16])
->setCellValue($excel_cols[18] . ($row), $item[17])
->setCellValue($excel_cols[19] . ($row), $item[18])
->setCellValue($excel_cols[20] . ($row), $item[19])
->setCellValue($excel_cols[21] . ($row), $item[20])
->setCellValue($excel_cols[22] . ($row), $item[21])
->setCellValue($excel_cols[23] . ($row), $item[22])
->setCellValue($excel_cols[24] . ($row), $item[23])
->setCellValue($excel_cols[25] . ($row), $item[24])
->setCellValue($excel_cols[26] . ($row), $item[25])
->setCellValue($excel_cols[27] . ($row), $item[26])
;

Instead of the usual:

$i = 1;
foreach($item AS $k=>$v){

  $objPHPExcel->getActiveSheet()->setCellValue($excel_cols[$i] . ($row), $v);
  $i++; 

}

Which basically means I end up with different export scripts for different coloumn layouts. 

Jan 10, 2011 at 8:29 AM

Thank you for the tip.

I'll see how to use it.

Jan 10, 2011 at 10:14 AM

@kimms: did you try this one?

$sheet = $objPHPExcel->getActiveSheet();
$i = 1; foreach($item AS $k => $v) { $sheet->setCellValue($excel_cols[$i] . ($row), $v); $i++; }

Coordinator
Jan 10, 2011 at 2:38 PM

@Marc

Take a look at the fromArray() method referenced in the latest few messages of this thread.

Jan 10, 2011 at 3:04 PM

good idea, I can create a specific array in order to load them in excel file with FromArray() function.

I'll give it a try by wednesday (I'm not sure I will be able to test that before..:))

Jan 10, 2011 at 5:25 PM

@SlowFox

Thanks for this tip, I'll do some benchmarking on it, and no - I didn't try that one.

- - -

Other than that I cannot use the fromArray() function as I am exporting strings as "0123" which Excel turns into the number "123" which is wrong for me. I ended up using the setCellValueExplicit() function which lets me set the dataformat directly as I set the field.

I am getting better at this as we speak, its a huge beast this library (and a sweet one to!), if there only was a way to say that col H should be numeric, and col B should be date...

Coordinator
Jan 10, 2011 at 8:09 PM
kimss wrote:

Other than that I cannot use the fromArray() function as I am exporting strings as "0123" which Excel turns into the number "123" which is wrong for me. I ended up using the setCellValueExplicit() function which lets me set the dataformat directly as I set the field.

You could always write a dedicated Value Binder (similar to the Advanced Value Binder) that would recognise leading zeroes on a string, and set it explicit (or set a number format mask to ###0).

 

kimss wrote:

I am getting better at this as we speak, its a huge beast this library (and a sweet one to!), if there only was a way to say that col H should be numeric, and col B should be date...

 Alternatively, a Value Binder could be written to identify the column and set implicit or an appropriate format mask based on that.

Something like:

class PHPExcel_Cell_ColumnValueBinder extends PHPExcel_Cell_DefaultValueBinder implements PHPExcel_Cell_IValueBinder {
   public function bindValue(PHPExcel_Cell $cell, $value = null)
   {
      if (is_string($value)) {
         $value = PHPExcel_Shared_String::SanitizeUTF8($value);
      }
      switch ($cell->getColumn()) {
         case 'B' :
            $cell->setValueExplicit($value, PHPExcel_Cell_DataType::TYPE_NUMERIC);
            $cell->getParent()->getStyle( $cell->getCoordinate() )->getNumberFormat()->setFormatCode('yyyy-mm-dd');
            return true;
            break;
        case 'H' :  // Date
            $cell->setValueExplicit($value, PHPExcel_Cell_DataType::TYPE_NUMERIC);
            return true;
            break;
      }
      return parent::bindValue($cell, $value);
   }
}
Jan 10, 2011 at 8:22 PM

Not sure you got the private email I sendt you during the weekend, I would rather pay you for looking over my export/import scripts. You seem to have the better knowledge of this library and your input will obviously be worth my while instead of diving into the documentation. Basically it's a simple export and populate an excel file, then reading the same file back so I can populate/update the same database again. (Bare in mind, my export and import system is already made, however I belive you looking over the code could - or not - result in some changes regarding my implementation of PHPExcel).

I would be interested in your input as you are the one that could easily see what would be the most efficient way to do the code for PHPExcel. I am kinda grasping for straws here, I am getting there though, but must admit this library is extremely heavy and at a object level I am not quite used to, :) (also, if I could save some time here and be sure I have it perfect, I would sleep better, than using to much time since all I really need here is creating an excel file and reading it back again).

Thanks for your kind efforts in sharing your help here, and bare in mind if you have time say in a week or so to spare some hours I would be happy to buy your time. You can reach me at my email, kim@steinhaug.com if interested.

Regards,
Kim Steinhaug, kim@steinhaug.com, www.steinhaug.no

Jan 11, 2011 at 7:46 AM

Ok I've tried also fromArray() function. Indeed it's a clear saving of memory, more than 10%, and also in time ( about 5%).

Thus, I'll continue to use fromArray..:)

 

Thanks again for this tip.

Jan 17, 2011 at 11:52 AM
Edited Jan 17, 2011 at 12:33 PM

sorry for delay.

My version of php was bugged (zip module), and it takes me some times in order to install a complete fresh of apache/php/mysql manually..:)

I'll deepen my experiment with fromarray and large sheet (currently 29000 lines with 26 columns).

Currently, Pear is doing the job in about 30 seconds, whereas phpexcel in excel 2007 puts 7minutes (using 600Mo of RAM. serialized cache is really effective for my needs).

Sure phpexcel is longer than pear, but I see others advantages by using also phpexcel..:)

Coordinator
Jan 17, 2011 at 12:56 PM
MarcParis wrote:

My version of php was bugged (zip module), and it takes me some times in order to install a complete fresh of apache/php/mysql manually..:)

The latest version of the SVN code allows the use of the PCLZip library as an alternative to ZipArchive (within the Excel2007 Writer at present) for precisely the situation when your PHP has the buggy php_zip extension (or if you don't have php_zip enabled.

PHPExcel_Settings::setZipClass(PHPExcel_Settings::PCLZIP);

can be used to enable this.

I'm hoping to get this implemented for the Readers (Excel2007 and OOCalc) as well, sometime before next weekend.

Jan 18, 2011 at 7:38 AM

good news..:)

Otherwise, I found FromArray() function difficult to apply due to empty cells.

I've tried this formula :

$objPHPExcel->getActiveSheet()->fromArray($temp_display_phpexcel,"empty",'A1',true);

In fact empty cells (cell filled with "empty" are ignored by fromarray.

How can I keep empty cell, empty?..:)

Thank you in advance

Jan 19, 2011 at 11:43 PM

As an extreme sidenote, which will render the Excel file in "compability mode", however it still works. Just picked it out from some files I am moving over to phpexcel.

http://www.phpclasses.org/browse/file/2195.html

This class can be used like this:

include './class.excelgen.php';
$excel = new ExcelGen("ExcelGen");
$excel->WriteText($row,$col,$value);
$excel->WriteNumber($row,$col,$data[$key]);
$excel->SendFile("write",$filepath . 'excel.xls');

The memory used for the class is like nothing. It would be cool (Im wishing!) if phpexcel would have a little "turbomode" with no features just to create a spreadsheet for numbers and strings.

Having said that, I have tweaked all server installs to be able to use 1GB of RAM if needed so I do not have problems with phpexcel anymore, and phpexcel do create a perfect excel file which is what we want in the first place. I just find it a little "cool" that the above class, which is micro-small compared, accually creates an excel file (thour flawed, so it opens in compability mode, which is kind off good enough - it opens).

For a reader for Excel2003 files http://code.google.com/p/php-excel-reader/ has a very small memory footprint and is very fast!

Again - this was a sidenote, just wanted to mention it.

 

Jan 19, 2011 at 11:51 PM
Edited Jan 19, 2011 at 11:57 PM

Actually, the correct reader is this one: http://code.google.com/p/php-excel-reader2/ which is much faster.

I also had to google that "compability mode", and thats actually just excel2003 files. I learn something new all the time, :)

Coordinator
Jan 29, 2011 at 5:12 PM
MarcParis wrote:

Otherwise, I found FromArray() function difficult to apply due to empty cells.

I've tried this formula :

$objPHPExcel->getActiveSheet()->fromArray($temp_display_phpexcel,"empty",'A1',true);

In fact empty cells (cell filled with "empty" are ignored by fromarray.

How can I keep empty cell, empty?..:)

 Ensure that you have genuine PHP NULLs in your array, and

$objPHPExcel->getActiveSheet()->fromArray($temp_display_phpexcel,NULL,'A1',true);

should work.

PHPExcel should not touch any cells where the array value matches the NULL value. However, if those cells already exist in the worksheet, they won't be removed.

Feb 1, 2011 at 1:13 PM

I dont complaint about phpexcel... I have been working in my company for 5 months more or less, we have to send more than 100 reports everyday to clients around the world, and the first thing that i saw here was that all those reports were running manually with macros.... Can you imagine that???... go running report by report and after that sending them to the clients.... Sorry but thats not my style..., now im working on an application that run and send the reports automaticaly... right now i have like 30 reports running with phpexcel and still migrating... imagine the speed that i get with phpexcel vs macros... its like changing from a corolla to a ferrari..... I really like this Class... the only thing than im waiting its the Chart support.... i know there is another library call pChart that i can use... but i prefer to wait until we can have that running here... and I think for the time that i have my application finished we going to have the chart support in the Class..