Partial loss of formulas and formatting during load/save of template file

Topics: User Forum
Jan 12, 2011 at 9:29 PM

I have recently begun working with/learning to use PHPExcel and I am encountering some issues with the process of loading a "template" file and then saving it.

The issue I'm experiencing is that some, but not all, of the formulas and cell formatting are being lost when I load the file and then save it (without making any changes to the content) using the Excel5 reader/writer.

I ran the "30template.php" test file (I also ran other test files but this is the one that is pertinent to my intended usage) and almost everything worked fine.  The only difference between the input and output files is that in the output file; the green formatted text ("payable within 30 days after the end of the month") comes out as unformatted black text (like the rest of the text in that cell).  Everything else looks exactly like it does in the original templates/30template.xls source file.

I then copied the 30template.php file to a new test file and modified it so that it would load my sample (XLS) file.  I also removed the code that adds new content to the file but, other than that, no other code was changed.  This new test file simply loads the file and saves it back "as-is".  Eventually I plan to use this in a load-modify-save scenario but I'm currently limiting it to the load-save parts to reduce the number of things to troubleshoot.

 When I do this some, but not all, of the cells that contain formulas come out completely empty (i.e. no formulas or content of any kind) and lose all formatting (those cells have currency formatting set in the source file).  Other cells which also contain (similar) formulas and currency formatting remain intact though (i.e. the formulas and formatting are still in the output file).  The cells that come out empty are B6:M6, B8:M8, D9:M9, B14:M14, B16:M16, and B27:M27.

I then tried the same code with an XLSX format file.  The XLSX file was created by opening the XLS file in Excel and saving it as an XLSX file.  The test code was updated to use the Excel2007 reader/writer but otherwise unchanged from the initial test.  When running this; the output file comes out with all of the formulas intact but the currency formatting is then lost in any cell that contains a formula (i.e. all of the formatting is set to "general").

In both test cases the formatting for any cell that does not contain a formula is retained perfectly.

After reading the documentation, discussions, and bug tracker and several days of experimenting; I am stumped as to what is happening to cause this behavior.  The only formulas in the workbook are SUM formulas (a few have some additional multiplication or division as well).  I'm not sure if this is an issue with the server, test code, Excel file, or a bug.  No error messages are produced during the test run(s).

The environment is:

Server: Redhat Enterprise 5
PHP version: 5.2.12  (also tested with 5.2.13 on a completely different server running CentOS with the same results)
PHPExcel version: 1.7.5 and snapshot 66707 (tried 66707 after encountering the issues using 1.7.5)

The test file I'm using will be uploaded shortly.

Coordinator
Jan 13, 2011 at 3:22 PM

Rich Text isn't supported in the Excel5 Writer (Work Item 8916)

I've tried to load your file, but am getting a fatal error 'Workbook does not contain sheet:Costs'

Coordinator
Jan 13, 2011 at 3:40 PM
Edited Jan 13, 2011 at 4:05 PM

mea culpa on the "Costs" worksheet. I'll try and identify the problem with the formulae tonight.

The formulae load correctly; and the problem appears to be in the Excel5 Writer and directly related to the *-1.

If =SUM(C5*0.25)*-1 is changed to =-SUM(C5*0.25) the file will save correctly. My suspicion is that the Excel5 parser is misinterpreting this formula, and so generating an invalid formula value that Excel cannot subsequently read.

Jan 13, 2011 at 7:04 PM

Thank you for the quick response Mark.
I have been running a few more tests of my own using a much more simplified workbook and trying various load/save formats.  I started noticing issues relating to the '* -1' in the formulas as well.
I also have noticed a few oddities with cells formatted as Currency and using the parens.

Here is the sample data I set up and the results of the load/save operations ( I tried formulas with and without the SUM function in case that plays a role in any way .. but it doesn't seem to):

Test Data (Cell = Value/Formula [Formatting])
A1   = 10 [ General ]
A2   = A1 * 5 [ General ]
A3   = A1 / 5 [ General ]
A4   = SUM(A1 * 5) / 5 [ General ]
A5   = A1 * 5 [ Currency format ]
A6   = A1 * 5 [ Currency format red ]
A7   = A1 * 5 [ Currency format with parens ]
A8   = A1 * 5 [ Currency format with parens red ]
A9   = (A1 * 5) * -1 [ Currency format ]
A10 = (A1 * 5) * -1 [ Currency format red ]
A11 = (A1 * 5) * -1 [ Currency format with parens ]
A12 = (A1 * 5) * -1 [ Currency format with parens red ]
A13 = SUM(A1 / 5) * -1 [ Currency format ]
A14 = SUM(A1 / 5) * -1 [ Currency format red ]
A15 = SUM(A1 / 5) * -1 [ Currency format with parens ]
A16 = SUM(A1 / 5) * -1 [ Currency format with parens red ]
A17 = A1 * 5 [ Number format ]
A18 = SUM(A1 / 5) * -1 [ Number format ]

 

Results
Load as Excel5 - Save as Excel5
A1:A8 = retained values/formulas/formatting
A9:A16 = formulas and formatting lost
A17 = retained formula and formatting
A18 = formula and formatting lost

Load as Excel5 - Save as Excel2007
All values/formulas/formatting retained

Load as Excel2007 - Save as Excel5
A1:A6 = retained values/formulas/formatting
A7:A8 = retained formulas but lost formatting
A9:A16 = formulas and formatting lost
A17 = retained formula and formatting
A18 = formula and formatting lost

Load as Excel2007 - Save as Excel2007
A1:A6 = retained values/formulas/formatting
A7:A8 = retained formulas but lost formatting
A9:A10 = retained formulas and formatting
A11:A12 = retained formulas but lost formatting
A13:A14 = retained formulas and formatting
A15:A16 = retained formulas but lost formatting
A17:A18 = retained formulas and formatting

 Where it says "lost formatting" that means that the formatting is set to "General" for those cells in the output file.

Coordinator
Jan 13, 2011 at 8:26 PM

Formatting should always be retained; but I'll try running some tests with the problem formats that you've identified over the next week. Certainly for some of the Excel5 saves, it could be the problem with the formula that's affecting the number format as well. I'm particularly concerned about any number formats that are lost in the Excel2007 Writer.

 

Work Item 7895 is the problem with the unary minus in the Excel5 Writer Parser: the Excel5 Reader has no problem with this. Formulas that contain those unsupported features are returned as an empty string by the parser, when it converts the stored formula to its BIFF representation. Unary minus has been on the hit list for a while as the next feature to address in the Work Item 7895 list. I'll see if I can do something about it over the weekend.

Coordinator
Jan 14, 2011 at 1:22 PM

Having had a quick look through the Excel5 Writer Parser, there's no quick and easy fix for the unary minus operator. In some instances, it works correctly (eg. =-SUM(C5*0.25)), in others not. I can patch it so that =SUM(C5*0.25)*-1 will work, but not =SUM(21*-1). I'd far rather spend the time and get a full solution, than provide a quick fix that sometimes works and sometimes doesn't (without any clear reason why).

Jan 14, 2011 at 3:48 PM

I totally agree that the time would be better spent working on a solution that is both complete and well understood.  Additionally, it sounds like I may be able to work around the issue by changing the formulas and/or using the Excel2007 writer for the output (which is fine by me since the Excel5 format isn't a requirement).

I'll give it a try with a different formula format (i.e. using the "=-SUM()" style formula rather than the unary minus) and the Excel2007 writer and will post a follow-up with additional info if I run into any further issues using that combination (especially if there is a recurrence of number formats being lost using the Excel2007 writer).

Thank you again for the quick responses and your hard work on this project.