Converting HTML to a proper Excel spreadsheet

Topics: User Forum
Feb 21, 2012 at 3:19 PM

I have a file from a third-party supplier which I've been trying to open in PHPExcel so that I can read it and extract various pieces of data. After a lot of messing around with the various readers (Excel5, Excel2003XML, Excel2007), I decided to have a poke around in the file directly and discovered that it is in fact just a plain text file containing HTML, which happens to have been saved with a '.xls' extension. Excel handles these without too many problems (it throws up an error dialogue when I open the file, but it still displays the spreadsheet), whereas PHPExcel fails at the canRead() check.

Whilst I could in theory parse the HTML, I'd prefer not to as I already have code written to use PHPExcel. Is there a programmatic way to convert the HTML to Excel? Ideally in PHP but anything which doesn't involve having a Windows machine with Excel would suffice.
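The mismatch described above (HTML saved with a '.xls' extension) can be confirmed without opening the file in an editor by looking at its first few bytes. The following is only a rough sketch, in Python rather than PHP: the function names and the returned labels are made up for illustration, and while the OLE2 and ZIP signatures are fixed magic numbers, the XML and HTML checks are heuristics based on common leading tags.

```python
import sys

# Signature of the OLE2 compound file that wraps a binary .xls workbook
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive (.xlsx is a ZIP container)
ZIP_MAGIC = b"PK\x03\x04"


def sniff_spreadsheet_format(head: bytes) -> str:
    """Guess the real format of a '.xls' file from its leading bytes."""
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    # Skip a UTF-8 byte order mark and leading whitespace before text checks
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if text.startswith(b"<?xml"):
        return "xml"  # could be Excel 2003 XML, or XHTML
    if text.startswith((b"<!doctype html", b"<html", b"<table", b"<meta")):
        return "html"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 512 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_spreadsheet_format(handle.read(512))


if __name__ == "__main__" and len(sys.argv) > 1:
    print(sniff_file(sys.argv[1]))
```

A result of "html" for a '.xls' file would point to the situation described in this post, where none of the binary or XML readers can be expected to accept the file.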

Feb 21, 2012 at 3:26 PM
Edited Feb 21, 2012 at 3:29 PM


AFAIK Mark is working on an HTML reader for PHPExcel; you can see it in the project's "2012 roadmap":

[quote]I've got about 90% of an experimental HTML Reader that I want to include in this release... I'm just so fed up of people saying "PHPExcel doesn't work", and it turns out they're trying to read an HTML file with a .xls extension. Doesn't yet handle styles, or merged cells, but otherwise can handle some pretty atrocious HTML markup errors relatively cleanly.[/quote]

You just have to wait for the release :)

Edit: lol for my "[quote]" :P

Feb 21, 2012 at 3:33 PM

Unfortunately I can't really wait for the release, helpful though it would be to keep everything in one library. I don't actually need PHPExcel to support this functionality if there's a way to convert HTML->Excel in an external tool, as I would call that tool separately and then pass the final Excel file to PHPExcel.
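One possible shape for such an external step, sketched here in Python with only the standard library, is to pull the table cells out of the HTML and write them as CSV, which PHPExcel can then load through its CSV reader instead of a true Excel file. This is a minimal sketch under stated assumptions, not a full converter: the names `TableExtractor` and `html_table_to_csv` are hypothetical, styles and merged cells (colspan/rowspan) are ignored, nested tables are flattened into one list of rows, and every value comes out as plain text.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # row being built, or None outside a row
        self._cell = None  # text chunks of the open cell, or None

    def _close_cell(self):
        if self._cell is not None:
            # Join the chunks and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def close_row(self):
        self._close_cell()
        if self._row:  # skip rows that contain no cells
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th"):
            self._close_cell()  # tolerate a missing </td> or </th>
            if self._row is None:
                self._row = []  # cell that appears without a <tr>
            self._cell = []
        elif tag == "br" and self._cell is not None:
            self._cell.append(" ")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self.close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def html_table_to_csv(html: str) -> str:
    """Return the table cells found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    parser.close_row()  # flush a row left open by truncated markup
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()
```

The returned text can be written to a '.csv' file and handed to PHPExcel in place of the original. Whether the loss of cell types and formatting is acceptable depends on what the downstream code reads from the sheet.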

Feb 21, 2012 at 3:55 PM

I would do it with a VBS script and MS Excel, but you said you have to avoid that, so I really don't know.

You could try asking your friend Google; I don't think this problem is really related to PHPExcel.

Feb 22, 2012 at 8:28 PM
Edited Feb 22, 2012 at 8:28 PM

You may want to check out this discussion that contains some working code:

Here's how to create Excel from an HTML Table

Feb 23, 2012 at 4:54 PM
Edited Sep 5, 2013 at 10:05 PM

My company has created a product called DocRaptor that does just this. It's an API that converts HTML to Excel format with an HTTP POST request. DocRaptor is especially useful for quickly generating a lot of reports, and handles CSS better than comparable programs. Here's a link to DocRaptor's home page: DocRaptor
And a link to our coding examples: DocRaptor coding examples

Feb 23, 2012 at 4:58 PM

Unfortunately DocRaptor is based in the US, which rules it out for this particular scenario (the data can't be transferred outside the UK for data protection reasons).