ChunkFiltered Reader adding blank rows

Topics: User Forum
Jul 17, 2015 at 2:32 PM
Edited Jul 17, 2015 at 2:32 PM
I am using code found on another thread to configure and use the chunk filter to read a large Excel spreadsheet to insert into a MySQL database. I load the active sheet from the filtered read object into an array as follow to create an insert string for my query.
$sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true); 
I am getting blank rows inserted into my array equal to the number of rows from 0 to my starting row number (modified by my chunk size).

Example:
Reading 20 rows, chunk size of 10, starting on row 6:
I get 6 blank rows ($startRow) followed by the 10 rows from the chunk.
This is followed by 16 blank rows ($startRow + $chunkSize) and then the 10 data rows from the chunk.

It seems to be buffering the data read with blank rows up to the point where the data begins each time, and for the life of me, I can't figure out why. It's probably something simple, but I can't see it.

This is complete code for the read section (with string processing omitted):
class chunkReadFilter implements PHPExcel_Reader_IReadFilter {
     private $_startrow = 0;
     private $_endrow = 0;
     
     public function setRows($startRow, $chunkSize){
          $this->_startrow = $startRow;
          $this->_endrow = $startRow + $chunkSize;
     }
     
     public function readCell($column, $row, $worksheetName = ''){
          if(($row >= $this->_startrow && $row < $this->_endrow)){
               return TRUE;
          }
          
          return FALSE;
     }
}

$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$chunkSize = 10;
$chunkFilter = new chunkReadFilter();
$objReader->setReadFilter($chunkFilter);

for ($startRow = 6; $startRow <= 20; $startRow += $chunkSize){
     $chunkFilter->setRows($startRow, $chunkSize);
     $objPHPExcel = $objReader->load($inputFileName);
     
     $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);     
     
     foreach ($sheetData as $row=>$values){
         // Process string creation here
     }
     $objPHPExcel->disconnectWorksheets();
     unset($objPHPExcel);
     
   
}
Coordinator
Jul 17, 2015 at 2:47 PM
Yes, that's right.

Row 17 is still row 17 whether you read the whole worksheet or a chunk... chunk reading is about specifying which cells get populated with data, and reading in chunks does exactly that, it doesn't automagically change row numbers so that every new chunk starts from row 1, because that would break many spreadsheets; chunks always retain their correct row number..... it's up to your code to keep track of start row, and process only those rows that actually contain data
Jul 17, 2015 at 5:34 PM
Edited Jul 17, 2015 at 5:56 PM
So, since the row number is retained, I should be able to use the start row inside the foreach loop to only read parts of the array starting after the current start row?

Example:
Use something like if ($row >= $startRow) {  /*do X*/ }
What is there to keep my reads from becoming unwieldy? At some point, I may have thousands of blank rows at the beginning of my array?

I thought the premise of the chunk filter was to instruct the reader to only return the specified number of rows from the sheet it was reading. If that is the case, where are the empty rows coming from?