XLS to CSV conversion problem

Topics: User Forum
May 16, 2013 at 8:30 AM
Edited May 16, 2013 at 8:34 AM
Hi,

I open an XLS in order to convert it to CSV.
    $objReader = PHPExcel_IOFactory::createReader("Excel5");
    $objReader->setLoadSheetsOnly("Description");
    $objPHPExcel = $objReader->load('file.xls');

    $objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'CSV')->setDelimiter(';');
    $objWriter->save('out.csv');
However, I have encoding problem in the csv, accented character are lost.
But more importantly, some date formats have been converted to strange patterns, on some row it's m-d-y, on others it's m/d/y. All the line in the XLS are in format d/m/y, and I need to keep them like that.

I am using the lastest code downloaded on GitHub.

Any idea how to resolve this ?
Coordinator
May 16, 2013 at 8:42 AM
Accented characters should not be lost, check the CSV file in a text editor that will show UTF-8 characters cleanly.

Check the number format masking of your date cells in MS Excel (or check whether they are simply string values). PHPExcel should use the appropriate format masking for those cells
May 16, 2013 at 8:57 AM
Edited May 16, 2013 at 9:01 AM
Accented characters are actually present when opened with Notepad++, so it's just Excel who don't read the CSV correctly.

The dates looks like strings in Excel.
EDIT : Cells format menu shows it's in date type.

When I directly read the date cells with PHPExcel, getValue() returns a float.
EDIT : On Notepad++, looks like all dates are in m-d-y format, so the m/d/y is another Excel joke. Can I change the output format of dates in PHPExcel ?
Coordinator
May 16, 2013 at 9:12 AM
Edited May 16, 2013 at 9:14 AM
Re. the UTF-8 characters in the CSV, try setting a BOM marker in the CSV file using
    $objWriter->setUseBOM(TRUE);
before saving

Re. the dates: Excel dates are a numeric timestamp value, and it's only the number format mask that renders them in a date-type format. When writing to CSV, PHPExcel will use that format mask to render the cell as a formatted date string. When MS Excel reads such a date string, it will convert it back to a float internally, and apply its default format mask, so you can get the format changing. You can try to change the Excel default mask (not sure how, but it's locale specific as well, or use the Excel import wizard which (IIRC) allows you to control how date columns are formatted... but otherwise MS Excel is kinda limited when reading CSV files. You can apply a new format mask to those columns in PHPExcel, but you can't control how MS Excel will reformat those dates subsequently when it's loading a CSV, only when it's loading an actual spreadsheet format (e.g. xls, xlsx).
May 16, 2013 at 9:17 AM
The CSV will be imported into PostgresSQL database, so the render in Excel don't actually matters.

The CSV date format is not accepted by our actual server, so I will look to change the format mask in PHPExcel.

Thanks for the fast support. :)
May 16, 2013 at 1:11 PM
How can I change the PHPExcel output date format, to save my CSV with wanted date format ?

Thanks.
Coordinator
May 16, 2013 at 2:00 PM
$objPHPExcel->getActiveSheet()
    ->getStyle('E1:E99')
    ->getNumberFormat()
    ->setFormatCode(
        PHPExcel_Style_NumberFormat::FORMAT_DATE_YYYYMMDD2
    );