What value from the PHPExcel object should I pass into my custom detectDelim function?

Topics: Developer Forum
Jul 11, 2012 at 3:20 PM
Edited Jul 11, 2012 at 8:34 PM

PHP version: 5.3.5, PHPExcel version: 1.7.7

I'd like to extend the CSV class to include an auto-detector for the delimiter.

Is there an element in the object that holds the (CSV) file as an array of lines (or even an element that just holds the file itself)?

I'm becoming stronger in OOP, but I still get a bit lost with the more advanced (to me) stuff ... so I could use a hand plugging my function into the class.

It's a pretty simple little thing.

function detectDelimiter( $file, $sample = 5 ){
    $delimsRegex = ",^.;:\t"; // whichever is first in the list will be the default
    $delims = str_split($delimsRegex);
    $delimCount = array_fill_keys($delims, 0);  // one counter per candidate, kept in priority order
    $file = array_slice($file, 0, $sample);  // keep at most $sample rows from the top of the array
    $notDelim = '/[^' . preg_quote($delimsRegex, '/') . ']/';  // matches any char that is not a candidate
    foreach ($file as $row) {
        // clean up .. strip evthg which is not a delim'r (this drops the line endings too);
        // no trim() here, it would also eat leading/trailing tabs and undercount them
        $row = preg_replace($notDelim, '', $row);
        foreach (str_split($row) as $char) {  // break it apart char by char
            if (isset($delimCount[$char])) {  // if the char is a delim ...
                $delimCount[$char]++;  // ... increment
            }
        }
    }
    $detected = array_keys($delimCount, max($delimCount));  // ties go to the earliest candidate
    // naturally, we will be calling "setDelimiter($detected[0])" here ..
    // .. and returning "$this" instead of the delim'r itself
    return $detected[0];
}

Right now, I'm passing in an array like this:

$csv = file($tabDelimitedFile);

Is there an element in the object that holds an array of the file like that? Or are there any other suggestions?

I'm kind of in a bind right now until I get this going, so any and all feedback will be greatly appreciated :)


Jul 12, 2012 at 4:52 PM

You'd need to put the call to this autodetector in the loadIntoExisting method of the PHPExcel/Reader/CSV.php file. Note that the reader processes the CSV a line at a time rather than loading every line into memory (we have enough memory issues without deliberately trying to create them), so there is no element holding the whole file as an array. Logically, you'd probably want to load up just a few lines immediately after the check for the BOM, set the $this->_delimiter value, and then remember to rewind the file afterwards.
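
A rough sketch of the "load a few lines, then rewind" step, using only plain PHP file functions. The sampleLines helper is hypothetical and not part of PHPExcel; it assumes the handle is already open and positioned just past any BOM, and it returns to that saved position rather than to byte 0 so the BOM is not read back in as data.

```php
<?php
// Hypothetical helper, not part of PHPExcel: read up to $sample lines from an
// already-open file handle, then put the file pointer back where it started.
function sampleLines($handle, $sample = 5)
{
    $start = ftell($handle);   // current position, e.g. just past the BOM
    $lines = array();
    while (count($lines) < $sample && ($line = fgets($handle)) !== false) {
        $lines[] = $line;
    }
    fseek($handle, $start);    // rewind so the normal line-by-line read sees every row
    return $lines;
}

// Usage sketch with a temporary tab-delimited file.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "a\tb\tc\n1\t2\t3\n4\t5\t6\n");

$handle = fopen($path, 'r');
$rows = sampleLines($handle, 2);   // first two lines only
$firstAgain = fgets($handle);      // reading resumes from the first line
fclose($handle);
unlink($path);
```

The array this returns has the same shape as the output of file(), so it could be passed to a detector such as detectDelimiter() above, with the result assigned to $this->_delimiter before the main read loop starts.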

Jul 12, 2012 at 7:39 PM

Thank you, thank you!

That is exactly what I needed.