leading zeros in CSV file... tried the setFormatCode() to no effect?

Topics: Developer Forum, User Forum
Sep 4, 2014 at 11:29 PM
Edited Sep 4, 2014 at 11:31 PM
Hey all,
I am uploading some files that could be csv, xls, xlsx... so im using the PHPExcel_IOFactory::identify($filepath) to create the file object.

But when i upload csv files I can't seem to pull leading zeroes. The other formats (xsls,xlsx) I can retrieve the column with the leading zeroes, since I guess the column has meta data set to Text formatting in Excel.
But with simple csv files i was hoping to force the column to be interpreted as Text it via:
$objWorksheet->getStyle('A1:A'.$maxRow)->getNumberFormat()->setFormatCode( PHPExcel_Style_NumberFormat::FORMAT_TEXT );
But it has had no effect... Thought I could upload a simple php file and sample csv.

For example, It should output: 'Cell #A2 is 0101' but it outputs currently 'Cell #A2 is 101'
Here is the php and csv file: https://dl.dropboxusercontent.com/u/36600570/phpexcel_example/php-excel-example.zip

Here is the code if afraid to test/download above file:
Looking for any feedback,
Thanks!
Arian

<?php
require '../../libs/PHPExcel/Classes/PHPExcel.php';
require '../../libs/PHPExcel/Classes/PHPExcel/IOFactory.php';


$filepath = 'test.csv'; //for now simple script to 
try {
  $fileType = PHPExcel_IOFactory::identify($filepath);
  $objReader = PHPExcel_IOFactory::createReader($fileType);
} catch ( Exception $e ){
  echo 'Please specific the correct file location in script';
  exit;
}
$objPHPExcel = $objReader->load($filepath);
$objWorksheet = $objPHPExcel->getActiveSheet();

//!!!make the 1st column A1:A4 Formatted as Text so leading zeroes come through.
$maxRow = $objWorksheet->getHighestDataRow();
$objWorksheet->getStyle('A1:A'.$maxRow)->getNumberFormat()->setFormatCode( PHPExcel_Style_NumberFormat::FORMAT_TEXT );



foreach ($objWorksheet->getRowIterator() as $row) {
  $row_num = $row->getRowIndex();
  
  $cellIterator = $row->getCellIterator();
  $cellIterator->setIterateOnlyExistingCells(false); // This loops all cells, even if it is not set. By default, only cells that are set will be iterated.

  $row_num = $row->getRowIndex();
  foreach ($cellIterator as $cell_num => $cell) { //Note: $cell_num index is 0 based!, so add a 1 to make more sense when viewing spreadsheet
    $cell_num = $cell_num + 1;
    $val = trim($cell->getValue());
    $formatted_val = trim($cell->getFormattedValue());
    
    echo 'Cell #'. chr($cell_num-1 +65) . $row_num .' is '. $val.'<br/>'; //should output 0101<br/>0101<br/>00102<br/>
    
  }

  echo '<br/>';
}          
Coordinator
Sep 5, 2014 at 7:07 AM
There is already an issue raised for this
Sep 5, 2014 at 7:35 AM
Hey Mark,

I just read that issue... Sorry if misinterpreting, but seems like you just said numbers with leading zeroes should be trimmed off and still converted by PHPExcel.
Wasn't sure if his case is slightly different, seems like he's getting values from database with leading zeros and wants to insert them into a csv with them intact.
When I am just trying to read from a csv already created with another application.

Seems like there wasn't a code-fix suggestion and treating issue like a 'wont fix' or 'works as designed' at least for his issue?
But I thought there was a solution that you posted for this that should work here?: http://stackoverflow.com/a/12457699
I didn't look into the Cell Binder solution yet, but thought the setFormatCode set on a range was a current solution (but not working in code above)?
Like forcing the Cell's format to be 'pure' text seems like it should work and should never do any sort of pre-processing work to trim/remove string values being read or written (in my case I'm just reading values from that cell).
Does that mean its a bug on setFormatCode() or Im just using that function incorrectly?


Thanks again for a great library,
Looking forward to some suggestions,
Arian
Coordinator
Sep 5, 2014 at 8:33 AM
The cell binder rules are applied whenever a cell is populated unless you set the value using the "explicit" methods, and (unless you've told PHPExcel to use a custom binder) the default value binder will always be used.
That applies whether you're populating the cell value in your own script, or if it's being populated by a Reader.... most of the Readers use "explicit" methods because the datatype to use is part of the file format, but a csv file has no such information, so it can't use "explicit" and the binding rules apply.

The SO response is about setting your own binding rules to override the default binder behaviour, while issue GH-394 is a request that the existing default binder should be less aggressive in always converting numeric values to float, and should treat numeric values with leading zeroes as string.

I'm going to implement the change to the default value binder so it is less aggressive, for the 1.8.1 release later this month..... it's not quite as straightforward as treating everything with a leading 0 as a string (as shown in the SO answer) because 0.123 has a leading zero, but should still be converted to a numeric.

If you need to resolve this more urgently, then the easiest solution is to use a custom binder.
Marked as answer by armyofda12mnkeys on 9/6/2014 at 2:46 AM
Sep 5, 2014 at 3:02 PM
Cool, The custom binder you recommended worked out...

Here is what I did if anyone else needs to do it:
I threw in LeadingZeroBinder.php into PHPExcel/Cell:
<?php

class LeadingZeroBinder extends PHPExcel_Cell_DefaultValueBinder
  implements PHPExcel_Cell_IValueBinder 
{ 
    public function bindValue(PHPExcel_Cell $cell, $value = null) 
    { 
        // sanitize UTF-8 strings 
        if (is_string($value)) { 
            $value = PHPExcel_Shared_String::SanitizeUTF8($value); 
        } 

        // Implement your own override logic 
        if (is_string($value) && $value[0] == '0') { //looks like if its a string, and starts with a leading 0, then force String
            $cell->setValueExplicit($value, PHPExcel_Cell_DataType::TYPE_STRING); 
            return true; 
        } 

        // Not bound yet? Use default value parent... 
        return parent::bindValue($cell, $value); 
    } 
} 
?>
If just want everything text, guess this could work too:
<?php

class TextValueBinder 
  implements PHPExcel_Cell_IValueBinder
{
    public function bindValue(PHPExcel_Cell $cell, $value = null) 
    {
        $cell->setValueExplicit($value, PHPExcel_Cell_DataType::TYPE_STRING);
        return true;
    }
}
?>
Then in my main code, something simple like this:
try {
    $fileType = PHPExcel_IOFactory::identify($filepath);
    $objReader = PHPExcel_IOFactory::createReader($fileType);
    if( $fileType==='CSV') ) { //common values I would see: Excel2007, Excel5, CSV
        require 'PHPExcel/Cell/LeadingZeroBinder.php';//this one specifically for Leading Zeros
        //require 'PHPExcel/Cell/TextValueBinder.php'; //backup incase above doesn't work
        PHPExcel_Cell::setValueBinder( new LeadingZeroBinder() );
    }

} ....