Can't reproduce 14excel5.xls

Topics: User Forum
Oct 9, 2009 at 10:45 PM

In the Tests/ directory, there is a sample output file 14excel5.xls, presumably generated by the corresponding 14excel5.php. We indeed are able to produce an .xls with 14Excel5.php, but we can't produce the same one. We can produce something that *looks* exactly the same as the supplied one, but only when you open it in Excel 2007, OO.o 3.0, and a recent Gnumeric.

Our problem is Excel 97. The 14Excel5.xls that we produce by running the corresponding script does not display images when opened under Excel 97. We suspect that there is a problem with how the routines in Escher.php write the image metadata to the .xls

We appreciate any and all help. We'll be looking at the .xls spec and Escher.php to see if we can fix this ourselves.

Developer
Oct 9, 2009 at 11:11 PM

Thanks for investigating this. I must admit I haven't tested in Excel 97. As far as I remember Excel 2003 is ok, right?

It is a while ago I looked at this, but I remember there were great difficulties getting images to work in PHPExcel_Writer_Excel5.

Information was found these places:

http://www.microsoft.com/interop/docs/OfficeBinaryFormats.mspx (Office Drawing)

http://msdn.microsoft.com/en-us/library/cc441433.aspx

Unfortunately, I found those documents quite messy. Several records were not documented. Let me know if I can help. Right now I need to find a way to test in Excel 97.

Oct 16, 2009 at 8:32 PM

Perl's Spreadsheet::ExcelWriter can write images into BIFF 8 files in a way that Excel 97 seems to be able to understand. I have spent most of this week trying to understand what Spreadsheet::ExcelWriter is doing and figuring out BIFF. The biggest differences with PHPExcel seem to be that Spreadsheet::ExcelWriter writes a checksum for the image (an MD5 sum) in the Dgg block and uses it as an RgbUid, whereas PHPExcel leaves it zeroed out. Spreadsheet::ExcelWriter also writes two more Escher records: Opt and SplitMenuColors. Spreadsheet::ExcelWriter has other differences outside of the MSODRAWINGGROUP record, but I'm not sure if any of this is relevant. There are other minor differences such as the SpidMax variable differing in the Perl and PHP engines, but I'm not sure if this is relevant.

If you have further hints on how to insert images into BIFF 8 documents such that Excel 97 can read it, they would be most welcome.

Developer
Oct 16, 2009 at 8:59 PM
jordigh wrote:

Perl's Spreadsheet::ExcelWriter can write images into BIFF 8 files in a way that Excel 97 seems to be able to understand. I have spent most of this week trying to understand what Spreadsheet::ExcelWriter is doing and figuring out BIFF. The biggest differences with PHPExcel seem to be that Spreadsheet::ExcelWriter writes a checksum for the image (an MD5 sum) in the Dgg block and uses it as an RgbUid, whereas PHPExcel leaves it zeroed out. Spreadsheet::ExcelWriter also writes two more Escher records: Opt and SplitMenuColors. Spreadsheet::ExcelWriter has other differences outside of the MSODRAWINGGROUP record, but I'm not sure if any of this is relevant. There are other minor differences such as the SpidMax variable differing in the Perl and PHP engines, but I'm not sure if this is relevant.

Very interesting information. Thanks for testing.

 

jordigh wrote:

If you have further hints on how to insert images into BIFF 8 documents such that Excel 97 can read it, they would be most welcome.

May I suggest that you start out with the an xls file generated by Perl's Spreadsheet::ExcelWriter and try to blank out those pieces of information which PHPExcel is not writing to find out which one is necessary for Excel 97.

You can do like this:

For the MD5 sum, blank it out in a HEX editor replacing with zeros. If it still opens fine in Excel 97, then we can consider it unimportant.

For the OPT and SplitMenuColors records, blank them out in a HEX editor by replacing the record type entry with some "garbage" data. (Then it will just be ignored by Excel 97)

That way we should be able to find out which piece of information is necessary for Excel 97. I will gladly help doing the HEX edits if you can provide the Perl Excel file with an image. Then maybe you can help trying to open in Excel 97 and say whether the image is there or not.

You are very welcome to upload test files here: http://phpexcel.codeplex.com/WorkItem/View.aspx?WorkItemId=10749

 

Oct 16, 2009 at 9:32 PM

Thanks for your help.

I have uploaded the test case I'm working with. I'll try to follow your recommendations and remove little by little the data that Spreadsheet::ExcelWriter generates and see if it still works with Excel 97.

I am starting to suspect the problem is not within the MSODRAWINGGROUP record, but best to be sure.

Developer
Oct 16, 2009 at 11:00 PM
Edited Oct 16, 2009 at 11:04 PM

Based on your perl.xls I have uploaded some files that should help locate the error.

excel97-images.zip

Here is what I did.

1.
In perl.xls extract the binary record contents of these records:

MSODRAWINGGROUP
MSODRAWING
OBJ


Saved in files:

MsoDrawingGroup.bin
MsoDrawing.bin
Obj.bin


2.
index.php writes a blank xls workbook using PHPExcel_Writer_Excel5

3.
Hack these two files in PHPExcel

PHPExcel\Writer\Excel5\Workbook.php
PHPExcel\Writer\Excel5\Worksheet.php

by editing them in a way so they inject the binary stuff which was extracted from perl.xls

Modified was simply this in PHPExcel\Writer\Excel5\Workbook.php:

private function _writeMsoDrawingGroup()
{
$record = 0x00EB;
$data = file_get_contents('C:/xampp/htdocs/_phpexcel/MsoDrawingGroup.bin');
$length = strlen($data);
$header = pack("vv", $record, $length);

return $this->writeData($header . $data);
}



and this in PHPExcel\Writer\Excel5\Worksheet.php

private function _storeMsoDrawing()
{
// chunk of Escher stream for one shape
$record = 0x00EC; // Record identifier
$data = file_get_contents('C:/xampp/htdocs/_phpexcel/MsoDrawing.bin');
$length = strlen($data);
$header = pack("vv", $record, $length);
$this->_append($header . $data);

// OBJ record
$record = 0x005D; // record identifier
$objData = file_get_contents('C:/xampp/htdocs/_phpexcel/Obj.bin');
$length = strlen($objData);
$header = pack('vv', $record, $length);
$this->_append($header . $objData);
return;
}


See modified files in attachment.

4.
Run the index.php script to produce write.xls (attached)

5.
The result of write.xls is a workbook created by PHPExcel, but using binary data from Perl's Spreadsheet::ExcelWriter

6.
When I open the file write.xls it looks ok (one image) in MS Office Excel 2007. Now how does it look in Excel 97? Can you tell us?

7.
If it opens ok in Excel 97, then we know that the error must be in one of the records MSODRAWINGGROUP, MSODRWAING, or OBJ. It should then be possible to find the problem by editing MsoDrawingGroup.bin, MsoDrawing.bin, and Obj.bin in a HEX editor and run the test again until the error is isolated.

If it opens not ok in Excel 97, then the error must be some entirely different place.

Oct 19, 2009 at 2:58 PM
koyama wrote:

Based on your perl.xls I have uploaded some files that should help locate the error.

excel97-images.zip

7.
If it opens ok in Excel 97, then we know that the error must be in one of the records MSODRAWINGGROUP, MSODRWAING, or OBJ. It should then be possible to find the problem by editing MsoDrawingGroup.bin, MsoDrawing.bin, and Obj.bin in a HEX editor and run the test again until the error is isolated.

Okay, this is definite progress. The generated file works in Excel 97. I think I neglected to look at the other records, MSDRAWING and OBJ. I focussed all of my efforts on MSODRAWINGGROUP, and I wasn't able to find the source of error there. I'm going to try looking at the other two records to see if I can pinpoint the source of the problem.

Oct 19, 2009 at 4:44 PM

I diffed Worksheet.php with revision 1.7.0 from 2009-08-10, and I discovered several more changes than the ones you described above. Is there a place to checkout the latest source tree to make sure that none of those extra changes are the source of the fix either?

Coordinator
Oct 19, 2009 at 5:35 PM

http://phpexcel.codeplex.com/SourceControl/ListDownloadableCommits.aspx

Oct 19, 2009 at 5:37 PM

Is there something I can feed to an svn checkout command? It's nicer to be able to look at revisions with my local tools than a web client.

Developer
Oct 20, 2009 at 2:02 AM

You should be able to check out SVN with this URL together with your username and password here at CodePlex.

https://phpexcel.svn.codeplex.com/svn/

 

Oct 20, 2009 at 10:01 PM

Thanks for your help. I checked out the code, and it's much more convenient. I hope that I can soon submit a patch to fix this.

I scrutinised the OBJ and MSODRAWING records. Between Perl and PHP, I found that Perl only adds

    # Add ftCf (Clipboard format) subobject
    $sub_record     = 0x0007;   # ftCf
    $sub_length     = 0x0002;
    $sub_data       = pack 'v',      0xFFFF;
    $data          .= pack 'vv',     $sub_record, $sub_length;
    $data          .= $sub_data;

    # Add ftPioGrbit (Picture option flags) subobject
    $sub_record     = 0x0008;   # ftPioGrbit
    $sub_length     = 0x0002;
    $sub_data       = pack 'v',      0x0001;
    $data          .= pack 'vv',     $sub_record, $sub_length;
    $data          .= $sub_data;

in the OBJ record. In the MSODRAWING record, the only difference is that one of the Sp (shapes) gets more options in the OPT record the Perl code. It gets two more: Fill Style -> fNoFillHitTest and  Group Shape -> fPrint.

I'm going to try next if changing just one of the three records at a time (between OBJ, MSODRAWING, and MSODRAWING group) will make this work in Excel 97. I hope the issue isn't a complex interplay between all three records. :-/

Developer
Oct 20, 2009 at 10:06 PM

Thanks for the update, it sounds like you are on the right track. Please keep us updated, it would be great if this can be fixed!

Oct 21, 2009 at 5:40 PM

I found the bug. Someone typoed in the PHP code. Patch attached. Where it should have been 0x0A00 was instead 0xA000 in PHPExcel in a Sp bitfield. I think it's a bitfield. As far as I can tell, it's two magic bytes attached at the end records of type 0xF00A which according to Microsoft's published standard SHOULD NOT EXIST. Probably undocumented magic numbers that were reverse engineered. Tested on Excel 97, and it works.

Thank you for your help finding this. If you wish to credit my minor patch, my full name is Jordi Gutiérrez Hermoso.

Developer
Oct 22, 2009 at 7:00 PM

How fantastic! I know how hard it can be to track down errors like this.

I applied your patch and you have been credited in the changelog.

http://phpexcel.codeplex.com/WorkItem/View.aspx?WorkItemId=10817

Thank you for your very hard work debugging this.