large spreadsheet memory / time

Topics: Developer Forum, User Forum
Jun 17, 2009 at 1:14 PM

Hi, the spreadsheet I am creating has 12 columns and around 11,000 rows, the different time / memory useage is displayed below;

Original Spreadsheet_Writer;

Duration 125s

Memory 35.25MB

PHPExcel Build 27870 (Excel 2007)

Duration 570s

Memory 423MB

PHPExcel Build 27960 (Excel 2007)

Duration 540s

Memory 423.25MB

Will the speed / memory of the class be improved with the next stable build??

Developer
Jun 17, 2009 at 2:03 PM

With large workbooks, there are some techniques for reducing execution time.

For example, set cell values explicitly:

$cell->setValueExplicit(...)

$objWorksheet->setCellValueExplicit(...)

The above is always faster than using the below methods because PHPExcel can then skip data type detection

$cell->setValue(...)

$objWorksheet->setCellValue(...)

Whenever you can, always set cell values explicitly. For a start, you may try these kind of optimizations and post back the results.

The PEAR::Spreadsheet_Excel_Writer using disk caching which explains the low memory usage. Next release of PHPExcel (1.7.0) is only a couple of week away and we can not implement caching in such short time. Once caching is implemented, memory usage of PEAR::Spreadsheet_Excel_Writer and PHPExcel will probably be similar.

We are regularly improving speed in PHPExcel. There are a few percent to save here and there, and in the end these may add up to many seconds saved. This is an ongoing process.

Jun 17, 2009 at 3:30 PM

I changed the SetCellValue to use explicit, this drastically improved the speed, from around 500s to under 60s.

The problem I now have is with formulas, the formulas are appearing as text, if I use the PHPExcel_Cell_DataType::TYPE_FORMULA on a couple of columns it seems to extend the time to around 900s.

Also numbers are appearing as text.

Developer
Jun 17, 2009 at 4:18 PM

You need to use the correct data type for each cell.

Formulas:
    PHPExcel_Cell_DataType::TYPE_FORMULA

Strings:
    PHPExcel_Cell_DataType::TYPE_STRING

Numbers:
    PHPExcel_Cell_DataType::TYPE_NUMERIC


It sounds like most of the execution time has to do with formulas being calcualted.

You may consider disabling precalculation of formulas:

$writer = PHPExcel_IOFactory::createWriter($excel, 'Excel2007');
$writer->setPreCalculateFormulas(false);
$writer->save('write.xlsx');

Jun 18, 2009 at 8:37 AM

Thanks for your help so far, I am getting used to this class more and more and is proving very good.

What are the restrictions of the TYPE_NUMERIC?

Does this allow decimal places, if so how many? I ask because most of my columns have numbers in them, when i set them to be TYPE_NUMERIC i get a conversion error when trying to open the file in 2003.

Developer
Jun 18, 2009 at 1:23 PM

>> What are the restrictions of the TYPE_NUMERIC? Does this allow
>> decimalplaces, if so how many?

Generally you should use PHPExcel_Cell_DataType::TYPE_NUMERIC whenever you are storing plain numbers. This has really nothing to with decimal places. Number formatting is controlled via the style.

>> when i set them to be TYPE_NUMERIC i get a conversion error when trying to
>> open the file in 2003.

This should really not be happening. Do you have a samle line of code where you are using TYPE_NUMERIC ?

Jun 18, 2009 at 2:44 PM
Edited Jun 18, 2009 at 2:46 PM

Thanks for your help so far, it was a problem with the data i was outputting with a TYPE_NUMERIC, it now works great.

I have a problem with the reader reading cells incorrectly now!

In a cell i have a figure 80.76, when i get the cells value using ;

                $col_data = $sheet->getCellByColumnAndRow($col, $row)->getCalculatedValue();

when i echo the value to the screen, the value is 80.76000000000001, in another instance, 77.21 is returned as 77.20999999999999

What could be causing this?


EDIT: $col_data = $sheet->getCellByColumnAndRow($col, $row)->getValue(); does the same.

Coordinator
Jun 18, 2009 at 6:18 PM
Edited Jun 18, 2009 at 6:19 PM

In part, this is down to the way computers store real (floating point numbers), which can lead to a degree of inaccuracy when using them in mathematical formulae. Typically, errors of this nature are likely to give identical results in both Excel and PHPExcel, although this isn't guaranteed.

In a very few functions (particularly some of the Engineering or Statistical functions like ERF, ERFC or GAMMALN), PHPExcel actually uses more accurate methods of calculation (albeit slightly less efficient) and so can give different results. Typically, I would expect accuracy to 14 significant digits for most functions in the PHPExcel calculation library - I run my own tests to this precision, but these few exceptions will normally only match Excel's more efficient but less accurate calculation to 8 or 9 significant digits.

It could also be down to the degree of accuracy of the equations being used. In PHPExcel, you can modify this to a degree by setting the precision value at the top of Functions.php. PHP's default is (I believe) 14 decimal places. PHPExcel increases this to 16. I believe there's a similar setting in Excel, though I can't recall where it's set: I think it's Tools/Options/Calculation/Precision as displayed.

You can mask the cell to indicate how many decimal places should be displayed.This is controlled by the number format mask. You may well find that if you increase the number of decimals displayed in Excel for these cells, then you will get identical values at the same masking.

Jun 19, 2009 at 12:30 PM

Hi Mark

I am retrieving these figures from a database then using PHPExcel to write them. The figure in the database is as its displaying in Excel e.g. 77.21.

The problem is when i retrieve the cell value using the PHPExcel reader, this value is float(77.209999999999), this is only happening on a few cells not all in the column.

I could round this to 2 decimal places using PHP but I would rather it return the correct figure as is displayed in Excel.

Developer
Jun 23, 2009 at 11:26 PM

If I understand Mark's answer correctly, you could try to comment out this line in Classes/PHPExcel/Calculation/Functions.php to reduce precision

ini_set('precision',16);

I tried it, and it seemed to remove the problem. Would this have any serious side effects, Mark?

Coordinator
Jun 24, 2009 at 8:30 AM

>> Would this have any serious side effects, Mark?

"precision" doesn't affect the accuracy of the mathematics in any way, simply the number of significant digits displayed in floating point numbers when they are echoed without any additional formatting. It's the display precision rather than the calculation precision, so it should have no adverse effects unless you're using number masks with a large number of decimal places .

Jun 24, 2009 at 4:42 PM

Thanks guys, commenting out ini_set('precision', 16); has worked, the figures are now correct as they are in the spreadsheet.