May 7, 2013 at 1:09 AM
Edited May 8, 2013 at 7:09 PM
I've made some customizations to my code that allow it to identify both HTML files and CSV files.
CSV identification uses two underlying rules:
- No blank rows. If any blank row is detected, the file is treated as an invalid CSV file.
- A majority of the rows must have the same number of fields; otherwise the file is treated as an invalid CSV file.
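
To make the two rules concrete, here is a minimal sketch of how such a check could look. It is not the actual change made to CSV.php: the function name looksLikeCsv, the default comma delimiter, the "more than half" reading of "majority", and treating an empty file or a whitespace-only line as invalid are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of PHPExcel): applies the two rules above.
 * Assumptions: comma delimiter by default, "majority" means more than half
 * of the rows, and an empty or unreadable file counts as invalid.
 */
function looksLikeCsv($path, $delimiter = ',')
{
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        return false;
    }

    $fieldCounts = array(); // field count => number of rows with that count
    $totalRows = 0;

    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        // fgetcsv() reports a blank line as an array holding a single null.
        // A line containing only whitespace is also treated as blank here.
        if (count($row) === 1 && trim((string) $row[0]) === '') {
            fclose($handle);
            return false; // rule 1: no blank rows
        }
        $n = count($row);
        $fieldCounts[$n] = isset($fieldCounts[$n]) ? $fieldCounts[$n] + 1 : 1;
        $totalRows++;
    }
    fclose($handle);

    if ($totalRows === 0) {
        return false; // nothing to judge
    }

    // Rule 2: the most common field count must cover more than half the rows.
    return max($fieldCounts) > $totalRows / 2;
}
```

A check like this reads the whole file once, which is where the extra overhead comes from. A real implementation might cap the number of rows it samples.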
Would you be interested in adding these modifications into the existing code?
If so, I can paste the changes here. They are actually relatively minor changes.
- IOFactory.php - Added ability to pass options to identify and createReaderForFile methods.
- HTML.php - Added code to canRead method to more accurately test for a valid HTML file in reader.
- CSV.php - Added code to canRead method to test for a valid CSV file in reader.
The reader modifications add some overhead and time, but only when testing for a valid file, so hopefully it is not too onerous.
Let me know if you need me to paste those changes.
-- Addendum: I've just been looking over your changes to the canRead methods in the repository. Looks cleaner and more modular for flexibility. I'll just keep my custom code in place until the next PHPExcel version comes out with these fixes. I'll still have to add my code for the csvReader::_isValidFormat() method in order to trim down on invalid CSV files.