removeRow slowness

Topics: Developer Forum
Sep 22, 2010 at 12:57 AM

I have an .xls sheet with approximately 4000 rows, I have written a php script to process each row and based on criteria, create an array of rows to be deleted.

The processing of the 4000 rows takes approximately 30 seconds, but the calls to delete approximately 1000 non-contigous rows times out after 15 minutes.

I did an experiment with a smaller file, a .xls with 25 rows and 7 deletes. The load/process time is 3 seconds, but the deletes bring the total to 45 seconds.

Is there a better way to process deletes? Here is a sample snipit where things slow:

foreach ($rowsToDelete as $k => $v)
{
 //echo $v."<br>";
 $activeSheet->removeRow($v, 1);

}

Coordinator
Sep 22, 2010 at 10:00 AM

Removing rows (or columns) is inherently slow because, not only does the code logic remove the cells and renumber the row of every cell that lies below the deleted row, it also checks through every cell in the worksheet looking for formulae that reference cells in the deleted row (or below) and adjusts them to match the new row numbers... likewise with any named ranges, cell validation rules, autofilters, etc. It's actually doing  great deal of work. Similarly, inserting new rows/columns as also very intensive.

One way you might improve the speed of execution is to delete the rows in reverse order... so instead of deleting row 10, then row 20, then row 30... delete row 30 first, then row 20, then row 10.

It's also faster if you can delete rows in contiguous blocks where possible, so if you want to delete rows 1-4, 6-8 and 12,

$activeSheet->removeRow(12, 1);
$activeSheet->removeRow(6, 3);
$activeSheet->removeRow(1, 4);

It's still not going to be instantaneous, but should be quicker