A PHPExcel generated file doubles in size once saved in Excel2007

Topics: Developer Forum
Mar 6, 2012 at 5:13 PM
Edited Mar 6, 2012 at 5:46 PM

Hi all,

I'm creating an Excel97/2003 file using PHPExcel and I add some fields and values. The idea is that this Excel file is treated using SQLserver 2005 data is extracted from it with a DTS scheme.

When trying to do this with the PHPExcel generated file, the DTS transformation doesn't succeed as the first line and first column appear to be missing.

When I open this PHPExcel generated file in Excel2007 (compatibilty mode is showing) and save the file directly (without touching any of the data) the file DOUBLES in file size, AND is perfectly accepted by the DTS transformation scheme in SQL Server 2005.

What is Excel2007 adding to this file to make it double in size AND work in SQL Server DTS?


I have tested this with an older version of Excel2003 which I have here. The generated file is about 5kb, the saved version by Excel2007 and 2003 produce a file of about 15kb


I have tested with Libreoffice, opening a file, saving: filesize doesn't double, but is accepted by the SQL DTS scheme.... 


$objPHPExcel = new PHPExcel();
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, "Excel5");

$objSheet = $objPHPExcel->getActiveSheet();
$objPHPExcel->getProperties()->setTitle("ApplyNow application");
$objPHPExcel->getProperties()->setSubject("ApplyNow application");
$objPHPExcel->getProperties()->setDescription("ApplyNow application");
$objSheet->SetCellValue('A1', 'jobid');
$objSheet->SetCellValue('B1', 'jobboardid');
$objSheet->SetCellValue('C1', 'candidatefirstname');
$objSheet->SetCellValue('D1', 'candidatelastname');
$objSheet->SetCellValue('E1', 'candidatenationality');
$objSheet->SetCellValue('F1', 'candidatesex');
$objSheet->SetCellValue('G1', 'candidatedateofbirth');
$objSheet->SetCellValue('H1', 'candidatestreet');
$objSheet->SetCellValue('I1', 'candidatehousenumber');
$objSheet->SetCellValue('J1', 'candidatepostalcode');
$objSheet->SetCellValue('K1', 'candidatecity');
$objSheet->SetCellValue('L1', 'candidatecountry');
$objSheet->SetCellValue('M1', 'cv');
$objSheet->SetCellValue('N1', 'candidatenationalnumber');				
$objSheet->SetCellValue('O1', 'candidatephone');
$objSheet->SetCellValue('P1', 'candidatemobile');
$objSheet->SetCellValue('Q1', 'candidateemail');
$objSheet->SetCellValue('R1', 'subscriptiondatetime');
$objSheet->SetCellValue('S1', 'candidatelanguage');

$objSheet->SetCellValue('A2', $formvars['jobid']);
$objSheet->SetCellValue('B2', $formvars['jobboardid']);
$objSheet->SetCellValue('C2', $formvars['firstname']);
$objSheet->SetCellValue('D2', $formvars['lastname']);
$objSheet->SetCellValue('E2', $formvars['nationality']);
$objSheet->SetCellValue('F2', $formvars['sex']);
$objSheet->SetCellValue('G2', formatDate($formvars['dateofbirth']));
$objSheet->SetCellValue('H2', $formvars['street']);
$objSheet->SetCellValue('I2', $formvars['number']);
$objSheet->SetCellValue('J2', $formvars['postalcode']);
$objSheet->SetCellValue('K2', $formvars['city']);
$objSheet->SetCellValue('L2', $formvars['country']);
$objSheet->SetCellValue('M2', $cvfilename);
$objSheet->SetCellValue('N2', remove_non_numeric($formvars['nationalnumber']));
$objSheet->SetCellValue('O2', remove_non_numeric($formvars['phone']));
$objSheet->SetCellValue('P2', remove_non_numeric($formvars['mobile']));
$objSheet->SetCellValue('Q2', $formvars['email']);
$objSheet->SetCellValue('R2', $formvars['submitdate']);
$objSheet->SetCellValue('S2', $formvars['language']);


Windows XP

PHP 5.3.5

SQL Server 2005

PHPEXcel 1.7.6

Mar 6, 2012 at 11:01 PM

Excel has a number of different ways of storing the same data/formatting/etc (For example, Excel may have a similar formula that appears in 100 cells, so it may store that formula just once as a shared formula and simply have a pointer to that shared formula against each cell where it appears). We don't use the whole host of options, but the easiest to write, which may not be as compact as Excel itself would choose (in the above case of the shared formula, we store the formula against every cell, which makes the file larger, but is faster to write).

Mar 7, 2012 at 7:42 AM

It's not the filesize which bothers me that much, that was purely an observation. What bothers me most is that the SQL DTS scheme messes up the generated file and that the opening the generated file (using Excel 2003, 2007, and Libreoffice) and saving it directly works perfeclty in the DTS Scheme.

It seems like some header data or properties are missing....

Mar 7, 2012 at 11:49 AM

That I don't know, and can't easily test either. The files generated by the Excel2007 writer are valid according to the Open XML SDK for MS Office, and I don't use SQL Server