refactor reading mechanics for XML/Excel2007 files

Topics: Developer Forum, Project Management Forum
Oct 14, 2011 at 7:27 AM
Edited Oct 14, 2011 at 8:13 AM



Well, I need to read really big files, and that is real trouble to do with your library.

First of all, I have already read all the articles about loading big files via a ReadFilter, but that doesn't actually solve the problem.


Here is a simple example: I have a 7 MB xlsx file whose sheet file is 20 MB (if you unpack the zip and look at it). After 'parsing' that file with a simple ReadFilter that always returns 'false', I have 40 MB+ of memory usage!!! Is this a joke? So I decided to look inside the library and found that you are using the 'simpleXML' extension. And my next thought was: 'Is it a joke?! I can't believe they did it!!!'

Really, guys... why do you use the simplexml extension when files can be really big? Any approach that loads the WHOLE XML document into memory is wrong by default. The only right way is stream loading, i.e. the SAX way. It doesn't matter which you choose, expat or the XMLReader/XMLWriter wrappers, but streaming is the only right way to load XML files. Of course you can use the simplexml extension to load settings, document info, etc., but for sheet data you must read it as a stream (SAX). It's faster, it uses less memory, and it's really the only proper way to do it.
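To show what I mean by stream reading, here is a rough sketch using XMLReader (strictly a pull parser rather than SAX callbacks, but it streams the document in the same node-by-node fashion). This is not PHPExcel code: the function name countSheetElements and the tiny sample sheet are made up for illustration, and I am assuming the usual SpreadsheetML element names 'row' and 'c' for rows and cells.

```php
<?php
// Hypothetical helper (not part of PHPExcel): walk a worksheet XML file
// node by node, so the whole document is never held in memory at once.
function countSheetElements(string $sheetXmlPath): array
{
    $reader = new XMLReader();
    if (!$reader->open($sheetXmlPath)) {
        throw new RuntimeException("Cannot open $sheetXmlPath");
    }

    $rows = 0;
    $cells = 0;
    // read() advances to the next node; only the current node is available.
    while ($reader->read()) {
        if ($reader->nodeType !== XMLReader::ELEMENT) {
            continue; // skip text, end tags, whitespace, etc.
        }
        if ($reader->localName === 'row') {
            $rows++;
        } elseif ($reader->localName === 'c') {
            $cells++;
        }
    }
    $reader->close();

    return ['rows' => $rows, 'cells' => $cells];
}

// Tiny made-up sheet, written to a temp file so the sketch runs on its own.
$sampleXml = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    . '<sheetData>'
    . '<row r="1"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c></row>'
    . '<row r="2"><c r="A2"><v>3</v></c></row>'
    . '</sheetData>'
    . '</worksheet>';

$tmpPath = tempnam(sys_get_temp_dir(), 'sheet');
file_put_contents($tmpPath, $sampleXml);

$stats = countSheetElements($tmpPath);
unlink($tmpPath);

echo "rows: {$stats['rows']}, cells: {$stats['cells']}\n";
```

With the zip extension loaded, the same function should also be able to read straight out of the archive through PHP's zip:// stream wrapper, e.g. a path like 'zip://big.xlsx#xl/worksheets/sheet1.xml' (both the file name and the entry name here are assumptions; the actual entry depends on the workbook). A real reader would of course build cell values instead of counting, but the loop structure stays the same.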


I really hope you will refactor your XML/Excel2007 reader/writer, because right now it's just a temporary solution for small files, without any real use in the real world.

Oct 14, 2011 at 2:23 PM

It does have real usage in the real world -- just not with big files yet.

I have come across this too, but I have app-specific workarounds for it, including editing the source myself, and PHPExcel is still the best (free) solution available.