toArray returns 65382 rows

Topics: Developer Forum, User Forum
Oct 9, 2014 at 11:30 AM
I have a problem with reading only the rows in a sheet which contains data. I have a sheet with 38 rows. But when I use
$rows = count($objPHPExcel->getActiveSheet()->toArray()); 
I get 65382 rows and not 38 rows. And trying to iterate the sheet kills the webserver.

There is not 65382 rows of data, I have checked :)
Oct 9, 2014 at 3:53 PM
I have this problem a lot, especially in .xlsx spreadsheets. First of all, make sure there is not a row 65382 rows down with something in it. I've seen that, I think even a space can cause this. What usually happens, though, is the internal pointers in the spreadsheet think there is something way out down there when there isn't, and it loads all of the empty or non-existent rows between, times out, and bye bye program. I have critical processes that absolutely must complete, and this is a real show stopper.

What I do that solves this is to load the spreadsheet into LibreOffice ( I suppose you can use Excel if you must, I personally avoid Microsoft products but that is another story) and then just save it again. I save it as an .XLS, not .XLSX, and this fixes the problem most of the time. For reasons unknown to me, saving as .XLS instead of .XLSX can fix the problem. On those rare occasions when it doesn't, I have to copy/paste the contents of the spreadsheet into a new blank spreadsheet, and that fixes it.

One of the above fixes it 100% of the time.

I really wish we could specify a limit to the number of rows/columns when we do toArray because otherwise it times out. Sometimes I do this in an unattended session via ssh, and I don't know it failed until my users complain.
Coordinator
Oct 9, 2014 at 5:03 PM
Edited Oct 9, 2014 at 5:03 PM
Using the rangeToArray() method allows you to set a limit on toArray(); and use getHighestDataRow() and getHighestDataColumn() to find the highest row and column that actually contain data (though even a space in a cell qualifies as data
Oct 9, 2014 at 5:15 PM
Another problem I run into on some spreadsheets isn't when I get to the toArray point, it is here:

$objPHPExcel = PHPExcel_IOFactory::load($filename);

Sometimes this is where it goes south, uses up obscene amounts of memory, and then times out. I never get to this point:

$data = $objPHPExcel->getActiveSheet()->toArray(...

However, rangeToArray is awesome, and will solve a lot of these, as most of the time that is where it chokes.

Thanks!
Coordinator
Oct 9, 2014 at 6:09 PM
There's a whole host of advice on how to reduce memory when using load, from setting the Reader to only load data (ignoring styles, etc) to reading individual worksheets and/or only selected blocks of cells in the Reader documentation, and there's cell caching to reduce the memory used to store each cell
Oct 10, 2014 at 6:36 AM
Zootal wrote:
I have this problem a lot, especially in .xlsx spreadsheets. First of all, make sure there is not a row 65382 rows down with something in it. I've seen that, I think even a space can cause this. What usually happens, though, is the internal pointers in the spreadsheet think there is something way out down there when there isn't, and it loads all of the empty or non-existent rows between, times out, and bye bye program. I have critical processes that absolutely must complete, and this is a real show stopper.

What I do that solves this is to load the spreadsheet into LibreOffice ( I suppose you can use Excel if you must, I personally avoid Microsoft products but that is another story) and then just save it again. I save it as an .XLS, not .XLSX, and this fixes the problem most of the time. For reasons unknown to me, saving as .XLS instead of .XLSX can fix the problem. On those rare occasions when it doesn't, I have to copy/paste the contents of the spreadsheet into a new blank spreadsheet, and that fixes it.

One of the above fixes it 100% of the time.

I really wish we could specify a limit to the number of rows/columns when we do toArray because otherwise it times out. Sometimes I do this in an unattended session via ssh, and I don't know it failed until my users complain.
Yes, you're right. Strange things happened to the sheet when the user edited in Office for OS X. I don't have these problems in Libre Office.

One should not use MS products in a professional environment :)