Converting Excel5 spreadsheet to tab-delimited

Topics: User Forum
Nov 11, 2010 at 2:27 PM

How do I convert from xls to tab-delimited file?

Coordinator
Nov 11, 2010 at 9:27 PM

Read the file using the Excel5 Reader:

$inputFileType = 'Excel5';
$inputFileName = './example1.xls';
/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

Create a CSV Writer:

$outputFileType = 'CSV';
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, $outputFileType);

Set the separator to a TAB

$objWriter->setDelimiter("\t");

Identify which worksheet you want to save:

$objWriter->setSheetIndex(0);

And save the file

$outputFileName = './example1.tsv';
$objWriter2007->save($outputFileName);

 

Nov 12, 2010 at 2:37 AM
Edited Nov 12, 2010 at 2:38 AM

Thanks, Mark, maybe this should go into a FAQ, it is certainly not in the documentation?

Coordinator
Nov 12, 2010 at 8:24 AM

Probably not in the FAQ, but I've already started extending the documentation (currently only the Reader User Guide is included in the SVN code) for the next release to provide a new series of docs accompanied with working examples for all kinds of recipes (Reading, Writing, Reading Worksheets, Setting Worksheet Data, Styling, Working with Dates, Performing Calculations, Updating Template Workbooks, etc.). I'll certainly be including some examples for converting files between formats in that... although it will be an ongoing exercise extending that documentation.

Nov 16, 2010 at 1:07 PM

Mark, using the code below. I'm still getting this error:
Fatal error: Allowed memory size of 536870912 bytes exhausted
  (tried to allocate 520093666 bytes)
 
I uploaded the file in Issue# 14699
http://phpexcel.codeplex.com/workitem/14699

All I need to do is convert this thing to tab-delimited but,
of course, you need to read in the whole file to do that.
I am using cell caching but still run into this.

 

  set_include_path(get_include_path() . PATH_SEPARATOR . 'PHPExcel/Classes/'); 
  require_once 'PHPExcel/Classes/PHPExcel.php';
  require_once 'PHPExcel/Classes/PHPExcel/IOFactory.php';
  $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_discISAM;
  PHPExcel_Settings::setCacheStorageMethod($cacheMethod);
  $objReader = PHPExcel_IOFactory::createReaderForFile($newfilename);
  $objReader->setReadDataOnly(true);
  $objPHPExcel  = $objReader->load($newfilename);
  $outputFileType = 'CSV';
  $objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, $outputFileType);
  $objWriter->setDelimiter("\t");
  $objWriter->setSheetIndex(0);
  $filename = $user_id.'_'.$path_parts['filename'].'_'.date('m_d_Y').'.txt';
  if($debug) {showProgressMsg('Saving as '.$filename);}
  $objWriter->save($fileName);



Here is my php.ini memory limit setting:

memory_limit = 512M      ; Maximum amount of memory a script may consume (32MB)

 

How do I fix this? This user uploads this file weekly and I need to either convert it to tab-delimited

or somehow be able to load the entire file to parse.

 

Nov 16, 2010 at 1:39 PM

Here are some of the things I've done.

 

1. changed the caching method to cache_to_apc.

2. double checked memory_limit as 512M.

3. wrote script using exact same code to test on localhost development environment.

 

Locally, it works, but when I try it on my web server, it crashes with the error:

Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 520093666 bytes) in  

/home/listinve/public_html/PHPExcel/Classes/PHPExcel/Shared/OLERead.php on line 282

 

Now, I am reading emails and extracting this file from an attachment. Doing this may also be using

some of the memory that I am not using locally. I will look into doing some GC before calling this

code.

Nov 16, 2010 at 2:20 PM
Edited Nov 16, 2010 at 2:23 PM

Ok, done some GC by moving things around a bit. But I'm still getting the exhausted memory error even after I clear the memory down to 265288,

It was 15852664 before I cleared everything.

 

I'm at a loss here on what to do, the workbook it tried to load is only 1620449 bytes in size with 15187 rows and 5 columns.

That workbook is located in Issue 14701:

http://phpexcel.codeplex.com/workitem/14701

 

 

Nov 16, 2010 at 2:29 PM

What is the memory limit on your local server? Could it be that your local server runs 32bit, while production is 64 bit? (phpexcel uses slightly more memory per cell with 64bit php)

Coordinator
Nov 16, 2010 at 8:59 PM

My normal rule of thumb calculation for a 32-bit server is 1k/cell (doubled when writing) using default memory caching method, reduced to 1/3 of that when using DiscISAM or PHPTemp caching + the size of the PHPExcel script. On a 64-bit server, there's about 60% overhead. Your spreadsheet is 210,000 cells, giving an estimated memory requirement of 140MB + PHPExcel script size when using DiscISAM, or 225MB on a 64-bit server. setReadDataOnly(true) should reduce that further, but not by much as your workbook contains very little style information.

I tried running your code with the file that you uploaded earlier (41846 rows by 5 columns), and the peak memory usage was 140.5MB on a 32-bit server, 242.5MB on a 64-bit server. So I can't really understand where your memory is being used up. Another option for the reader would be to tell it only to read the first worksheet, but as the other two sheets are empty, that wouldn't help much. Another option would be to use a read filter to read the Excel in "bite-size chunks", writing each out to different tsv files, then concatenating all of those files together.



Nov 17, 2010 at 1:00 AM

So, how do I do a read filter to read it in chunks?

210,000 cells?   15187*5=75935, how do you calculate that?

@borft:

[quote]2. double checked memory_limit as 512M.[/quote]

Same for localhost and server. Both 32-bit.

Coordinator
Nov 17, 2010 at 7:52 AM
MrBaseball34 wrote:

210,000 cells?   15187*5=75935, how do you calculate that?

 The first workbook that you uploaded (40000 in-stock line items update 2010 NOV  8TH.xls) was 41846 rows by 5 columns... which is 209,230 cells

MrBaseball34 wrote:

So, how do I do a read filter to read it in chunks? 

$inputFileType 'Excel5';
$inputFileName './40000 in-stock line items update 2010 NOV  8TH.xls';


/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private 
$_startRow 0;
    private 
$_endRow  0;

    
/**  Set the list of rows that we want to read  */
    
public function setRows($startRow$chunkSize) {
        
$this->_startRow $startRow;
        
$this->_endRow   $startRow $chunkSize;
    }

    public function 
readCell($column$row$worksheetName '') {
        
//  Only read the heading row, and the configured rows
        
if (($row == 1) ||

           ($row >= $this->_startRow && $row $this->_endRow)) {
            return 
true;
        }
        return 
false;
    }
}


/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader PHPExcel_IOFactory::createReader($inputFileType);


/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize 2048;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new chunkReadFilter();

/**  Tell the Reader that we want to use the Read Filter  **/
$objReader->setReadFilter($chunkFilter);

/**  Loop to read our worksheet in "chunk size" blocks  **/
for ($startRow 2$startRow <= 65536$startRow += $chunkSize) {
    
/**  Tell the Read Filter which rows we want this iteration  **/
    
$chunkFilter->setRows($startRow,$chunkSize);
    
/**  Load only the rows that match our filter  **/
    
$objPHPExcel $objReader->load($inputFileName);
    
//    Do some processing here
}

Nov 22, 2010 at 5:27 PM
MarkBaker wrote:
MrBaseball34 wrote:

210,000 cells?   15187*5=75935, how do you calculate that?

 The first workbook that you uploaded (40000 in-stock line items update 2010 NOV  8TH.xls) was 41846 rows by 5 columns... which is 209,230 cells

But even the second one gets the error message.

 

Where you say "// Do some processing here" that is where I would save the CSV and then outside the loop concatenate them?

Jan 17, 2011 at 2:49 PM

Hi everybody,

I'm very new in PHP and PHPExcel, but it works great.

I'am just having a problem in saving CSV files : when a cell is empty, the separator (TAB) is not written into the text line.

How can I modify this (using a preference command ?)

Thanks for your answers.

Jul 8, 2014 at 1:10 PM
Edited Jul 8, 2014 at 1:27 PM
Hello!!
I'm trying to do almost same thing but i got a writing error.
I have a file in xls and try to update it and saving it in CSV, so here my stuff:
$inputFileName = "files/base.xls";
$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
$filename = "base_".date("Y_m_d_H_i_s")."_".$_SESSION["numcom"]."_export";

$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($inputFileName);

$sheet = $objPHPExcel->getActiveSheet();

//Doing some stuff here adding lot of data

$outputFileType = 'CSV';
    $objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, $outputFileType);
    $objWriter->setDelimiter("\t");
    $objWriter->setSheetIndex(0);
    $outputFileName = $filename.".csv";
    $objWriter->save(dirname(__FILE__) . "/files/updated/".$outputFileName);
Evry folder is in 777 and here my error:
Warning: fopen(/var/www/html/gestionv4/dov/files/updated/{filename}.csv): failed to open stream: Permission denied in /var/www/html/gestionv4/classes/PHPExcel/Writer/CSV.php on line 118
just for info i remplace the real name by {filename}
Any idea if i missing some step?
Jul 8, 2014 at 1:38 PM
Solved
Just need use of absolute path $_SERVER['DOCUMENT_ROOT'] and worked like a charm