I have some entry-level questions...

Topics: User Forum
Jun 19, 2009 at 5:34 AM
Edited Jun 19, 2009 at 5:34 AM

I am in the process of migrating a malfunctioning system over from the now-unsupported Spreadsheet Excel Writer renderer to PHPExcel. I am overall quite impressed with the available features.

The first question I have is regarding getting PHPExcel set up as a PEAR package. I've installed per instructions I located (pear install [path-to-TGZ]) and as a result have the package seemingly installed, on a channel called "__uri", but I cannot seem to access its functionality innately. I've been working via simple includes, but integrating it into the system via PEAR is something I'd really like to do. Have I missed something? I admit that PEAR is not my forte.

The second question is: how would I go about rendering and returning the spreadsheet file via a HTTP request, rather than saving it to disk? The examples I've located all seem to focus on saving the end result of the render to a file on the local file system, but the implementation I'm seeking would deliver the results of the render directly to users of a web-based system.

I have searched the discussion boards, but was unable to find answers to these questions, so answers and advice would be much appreciated. Thanks!

Developer
Jun 19, 2009 at 7:12 PM

>> The first question I have is regarding getting PHPExcel set up as a PEAR package.

I admit that I haven't been using PEAR that much. I tested your method locally on Windows XP running XAMPP, but I had no problems. Here is what I did.

1. Download the PEAR PHPExcel from "Downloads" here on this site. This gave me the file PHPExcel-1.6.7.tgz on the desktop.

2. Run from the command prompt:

pear install PHPExcel-1.6.7.tgz

http://img221.imageshack.us/img221/4085/pear.png

3. Check that the PHPExcel.php and PHPExcel folder are in place in the PEAR directory.

http://img197.imageshack.us/img197/4085/pear.png

After this, I could do like this:

$objPHPExcel = new PHPExcel();

without having to include any classes. I could have achieved the same by just copying those files manually to the PEAR directory. Not sure what the big difference is?

>> The second question is: how would I go about rendering and returning the spreadsheet
>> file via a HTTP request


Be sure to check the documentation. There is a section called "Redirect output to a client's web browser" showing which headers to use for both xls and xlsx.

Jun 20, 2009 at 7:15 AM

The documentation was really helpful in setting me on my way - you should really download it - it's along with the main package.

Jun 20, 2009 at 7:57 AM

Thank you for the help!  I found all the information I needed to get the system to output to client browsers right where you said it would be.

On the PEAR front, it seems that I am able to use PHPExcel via calls to include() as if it were local, even though it is not, so the installation must have worked to some degree.  Having to be sure I call include() before accessing the features is a minor inconvenience I can overlook.

I was wondering if you might direct me towards any useful information that could help me with optimization practices.  The system in question will often generate spreadsheets with thousands of rows, so any tips that you can offer to help cut overhead would be very useful.

Jun 21, 2009 at 10:18 AM

From what I've gathered from koyama's posts, you would do well to specify the datatype of the cell explicitly - you can find the syntax for this in the documentation again. He says it improves speed up to 30% and I trust his judgment implicitly! If you are using formulas, you could disable the pre-calculation of formulas to improve speed - just run a search on the forum, there is a discussion regarding this that's about 2-3 days old.

Would love to hear your benchmarks and experience - do let us know.

Jul 2, 2009 at 11:20 AM
Edited Jul 2, 2009 at 11:20 AM

I've implemented much of the features I need.  I'm noticing a decent amount of memory overhead, especially when particularly large (10,000+ row) sheets are generated.  Could you recommend some methods of perhaps decreasing the memory usage?

Also, while I am able to set cell formatting individually, I could find no reference to functions that might enable me to set formatting (numeric value display, justification) by column rather than by cell.  Each column always obeys specific formatting, and I'm concerned that specifying formatting per-cell might also be causing unneeded overhead.

Developer
Jul 2, 2009 at 11:25 AM

@kbarsotti: What version of PHPExcel are you using?

Jul 2, 2009 at 11:40 AM

I am using version 1.6.7.

Developer
Jul 2, 2009 at 11:54 AM

>> I am using version 1.6.7.

Then I recommend that you grab the latest source code:
http://phpexcel.codeplex.com/SourceControl/ListDownloadableCommits.aspx

Out of the box, latest source code will use less memory than PHPExcel 1.6.7, especially if you are using styling.

The preferred way of styling multiple cells is with latest source code:

$objWorksheet->getStyle(A1:A10000)->...

which is much faster than looping through cells and styling them one at a time. So I recommend that you update your code to make use of this.

You are correct that column styling is not yet implemented. I posted a feature request last year, but for some reason it didn't get many votes. You may vote for this:
http://phpexcel.codeplex.com/WorkItem/View.aspx?WorkItemId=7333

Jul 2, 2009 at 12:09 PM

Thanks, that's a very handy feature.  I'll definitely vote for the column-styling as well, it's important to have complete interoperability above and beyond the convenience such a feature would offer.

Just to make sure, to do a PEAR install of the contents of the archive you linked, I'll want to put the package.xml from /Build into /Classes?

 

Developer
Jul 2, 2009 at 12:31 PM

>> Just to make sure, to do a PEAR install of the contents of the archive you linked,
>> I'll want to put the package.xml from /Build into /Classes?

Hmm... I think that package.xml file is just a template, so it may not be good to do that.

I am not a PEAR expert, but somewhere in you PEAR directory you should find PHPExcel.php and the PHPExcel directory with the remaining classes. Manually replacing them with the new versions found in Classes should work, I think. Can you try that?

Jul 2, 2009 at 1:07 PM

Will do.  I've a few things to do today before I'll be sitting down to code, but I'll let you know how it all turns out.

Jul 24, 2009 at 8:43 PM

I apologize for the delay, a number of things came up that prevented my response.

The system works very well, but even with the latest source code I am experiencing a large efficiency loss compared to the original Spreadsheet_Excel_Writer solution.  I'll list my findings and methods.

Using an example sheet which has 48 columns and 2472 rows (a large spreadsheet, but the system is capable of producing result sets in the tens of thousands at times):

PHPExcel (No Formatting): 145MB peak memory.

PHPExcel (Full Formatting): 310MB peak memory.

Spreadsheet_Excel_Writer (Full Formatting): 31.5MB peak memory.

As you can see, the new system utilizes between five and ten times as much memory to produce the same output.  The computation time (and subsequent load time on the application's web front-end) is similarly longer.

Methods:

I was unable to use cell ranges as parameters for the getStyle() method, ala:

$objWorkSheet->getStyle(A1:A2000)->getAlignment()->setHorizontal(PHPExcel_Style_Alignment::HORIZONTAL_CENTER);

So I used a single cell assignment followed by a cell range duplicateStyle().

If you can offer me any advice as to how I might improve the efficiency of this system, I would be most grateful.  Thank you again for all your help!

Developer
Jul 24, 2009 at 9:08 PM

>> If you can offer me any advice as to how I might improve the efficiency of this system,
>> I would be most grateful.


Are you using setValue() or setValueExplicit()? Like karpar mentioned, the latter is faster than the first and makes a notable difference when you have many cells. Also, if you are generating xlsx, you may consider disabling formula precalculation.

Jul 24, 2009 at 9:35 PM

I am actually using setCellValueByColumnAndRow iteratively, ala:

for($i=0;$i<count(rows);$i++){
  for($j=0;$j<count(columns);$j++){
    $o->getActiveSheet()->setCellValueByColumnAndRow($j,$i,$val);
  }
}

What would be an analogous way to perform the same sort of operation but with the setValueExplicit() function?

Developer
Jul 24, 2009 at 9:53 PM

Instead of this

$sheet->setCellValueByColumnAndRow(0, 1, 'hello');
$sheet->setCellValueByColumnAndRow(0, 2, 1234567);

Do like this:
$sheet->setCellValueExplicitByColumnAndRow(0, 1, 'hello', PHPExcel_Cell_DataType::TYPE_STRING);
$sheet->setCellValueExplicitByColumnAndRow(0, 2, 1234567, PHPExcel_Cell_DataType::TYPE_NUMERIC);

Jul 26, 2009 at 8:06 PM

I've made the revision, along with another that I'll outline after the benchmark, with the following results:

Test Case: 48 Columns, 2472 Rows:

Peak Memory (Explicit Assignment): 310.75MB (an increase in memory usage, but negligible)

Peak Memory (Explicit Assignment, Conditional*): 292.5MB

Peak Memory (Explicit Assignment, Conditional, No Formatting): 110MB.

* I realized that the existing logic, which I based mine upon, never actually checks for null values from the database before it creates and fills a cell.  By adding a conditional to every cell assignment, I eliminated all the Excel logic associated with null/empty cells, which improved efficiency somewhat, though it'll vary depending on how many of the records are incomplete.

The end result is that using explicit assignments does not seem to have improved memory use (and actually may have increased it to a negligible degreee).  The conditional fix helped a bit more, but again not profoundly.  The memory usage for this particular test case is now oscillating between 290MB and 300MB.  Can you think of some other ways I could enhance the efficiency of the operation?  It seems that the formatting action does produce the lion's share (~60%) of the memory usage.

My formatting method involves setting the formatting for the first cell in a column (or the title row) then using the duplicateStyle() function to duplicate that style information to all subsequent cells in that column or row.  Is there a more efficient way to accomplish this?

Perhaps disk caching could be utilized to lighten the memory load?

Sidenote

I'm also trying to get currency-value columns formatted like "$1000".  The USD_SIMPLE currency format is "$1000.00".  Do you know of some good examples for creating a custom number formatting to accomplish the desired format, with a '$' prefix but no decimal precision?

Developer
Jul 26, 2009 at 8:52 PM

>> I was unable to use cell ranges as parameters for the getStyle() method, ala:

>> $objWorkSheet->getStyle(A1:A2000)->getAlignment()->setHorizontal(PHPExcel_Style_Alignment::HORIZONTAL_CENTER);

Ups, I discover missing quotes

getStyle(A1:A2000)


getStyle('A1:A2000')

Maybe that is why it is not working. Can you test again? I can't understand why memory usage is so high with latest source code. I just ran some test with 100,000 cells and styling didn't really add anything to memory usage. It stayed constant at around 140MB.

The explicit assignment does not lower memory usage, but increases speed a bit. I should have been more clear about that.

>>Do you know of some good examples for creating a custom number formatting to accomplish the desired format, with a '$' prefix but no decimal precision?

$sheet->getCell('B1')->setValueExplicit(199, PHPExcel_Cell_DataType::TYPE_NUMERIC);
$sheet->getStyle('B1')->getNumberFormat()->setFormatcode('[$$-409]#,##0');