Code snippet for counting the total rows in a file?

Topics: Developer Forum
Mar 21, 2012 at 2:27 AM
Edited Mar 21, 2012 at 2:30 AM
	class chunkReadFilter implements PHPExcel_Reader_IReadFilter {
		private $_startRow = 0;
		private $_endRow = 0;

		public function setRows($startRow, $chunkSize) { 
			$this->_startRow    = $startRow; 
			$this->_endRow      = $startRow + $chunkSize;
		} 

		public function readCell($column, $row, $worksheetName = '') {
			//  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow 
			if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) { 
			   return true;
			}
			return false;
		} 
	}
	echo "Load from Excel file\n";

	//$objReader = PHPExcel_IOFactory::createReader($inputFileType);
	$objReader = new PHPExcel_Reader_Excel2007();
	$chunkSize = 100;
	$chunkFilter = new chunkReadFilter(); 
	$objReader->setReadFilter($chunkFilter); 
	$objReader->setReadDataOnly(true);
	$objReader->setLoadSheetsOnly($sheetname);

	for ( $startRow = 2; $startRow <= 65536; $startRow += $chunkSize ) {

		print $startRow . "\n";

		$chunkFilter->setRows($startRow,$chunkSize); 

		$objPHPExcel = $objReader->load( $xlsx_path );
		//    Do some processing here 

		//    Free up some of the memory 
		$objPHPExcel->disconnectWorksheets(); 
		unset($objPHPExcel); 
	}

I'd like something more efficient than 65536 as a value in that loop.  I have one xlsx with 250,000+ rows, and another with 200, all going through the same pipe.

It's awesome code.  Thank you. 

Mar 26, 2012 at 12:32 PM

It may be easier to just tally while iterating a whole sheet, or iterate just to tally...

foreach ($objPHPExcel->getRowIterator() as $row) {
	$j = 1;
	
	$cellIterator = $row->getCellIterator();
	$cellIterator->setIterateOnlyExistingCells(false);
	
	foreach ($cellIterator as $cell) {
		$num_cols = $j;
		$j++;
	} // end cell getter
	
	$num_rows = $i;
	$i++;
} // end row getter

Mar 26, 2012 at 10:29 PM

Excellent, thank you.  This is working with both Excel2007 and Excel5, whereas "getHighest" isn't working with Excel5 (at least for me)