Is there a less memory intensive way to read XLSX files?

Topics: Developer Forum
Mar 5, 2008 at 2:29 PM
Hi,
I wish to use PHPExcel to convert XLSX files to CSV or some other easy to work with format? (styling, etc does not matter)
I also might need to convert rather large files (up to about 40 MB), but a 2MB XLSX file needs more than 200MB of memory.
Is there possibly a way to read an XLSX file line by line in order to keep the memory usage down?

Thanks,
Andy
Coordinator
Mar 6, 2008 at 10:26 AM
There is a switch on the Excel2007 reader which does this:
$reader->setReadDataOnly(true);
Mar 6, 2008 at 7:48 PM
Thank you for the suggestion. By running that first I was able to load the 2 MB file, but I was still not able to load a 7.5 MB xlsx file.
I tried using some of the writers, such as html and csv, and found that csv would be best for me. I was able to convert all of the
xlsx files in the source code, but not my 2MB or 7.5 MB file. I was looking at the data structure for the cellCollection. It looked like there
is a lot of overhead when it comes to borders, margins, and fonts, etc. Would there be an easy way to disregard all of the styling, and
only load the values for each cell?
Coordinator
Mar 7, 2008 at 7:10 AM
Have you tried the latest build (Source Code tab)? Been doing some improvements yesterday.
Mar 7, 2008 at 2:22 PM
Yes, I am using revision 10003.
Here is the code I am using, does this look right?

include 'PHPExcel.php';
require 'PHPExcel/Reader/Excel2007.php';
include 'PHPExcel/Writer/CSV.php';

$objPHPExcel = new PHPExcel();

echo date('H:i:s') . " Load from Excel2007 file\n";
$objReader = new PHPExcelReaderExcel2007;
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load("/tmp/andy/new.xlsx");
$objWriter = new PHPExcelWriterCSV($objPHPExcel);
$objWriter->setSheetIndex(0);
$objWriter->save("/tmp/andy/test.csv");

Thanks,
Andy
Coordinator
Mar 10, 2008 at 10:20 AM
Looks perfect!

Would it help if I implemented a Excel reader option which only reads sheet X from a workbook?
Mar 10, 2008 at 9:43 PM
I think it would be more helpful to have an option that would extract cell values only, and disregard any styling properties.
Most spreadsheets I've dealt with have only had 1 sheet with actual data, and the rest are blank.
Coordinator
Mar 11, 2008 at 9:09 AM
$reader->setReadDataOnly(true); should do that.

How many rows/columns are in the sheet?
Mar 12, 2008 at 5:24 PM
Edited Mar 12, 2008 at 5:26 PM
Well the files I need to convert have a minimum of 5 columns, and probably up to 50,000 rows.
The test sheet I had contained 47,000 rows, and that was 2MB which was unable to load.
I was looking at a dump of the $objPHPExcel object, and it looks like it still stores a lot of extra arrays that aren't neccesary
Coordinator
Mar 13, 2008 at 8:50 AM
Will try to figure this out next week.
May 7, 2008 at 11:28 PM
I think this would solve my problem. I have a large .xlsx with four large sheets and the file size is over 7mb and running into memory issues. If I could read only sheet 1, I would be fine, but here is my concern : what if a sheet 1 has formulas pointing to other sheets?


maartenba wrote:
Looks perfect!

Would it help if I implemented a Excel reader option which only reads sheet X from a workbook?