preg_match bug in 1.7.5 - Compilation failed: PCRE does not support \\L, \\l, \\N, \\P, \\p, \\U, \\u, or \\X

Topics: Developer Forum, User Forum
Dec 16, 2010 at 1:58 AM

Hi all
After upgrading to 1.7.5, my worksheets started giving me 500 server errors, and I think I've tracked it down to a problem with a preg_match call in Worksheet.php.
Looking at the PHP error log, I found this:


[Thu Dec 16 13:51:14 2010] [error] [client xxx] PHP Warning: preg_match() [function.preg-match]: Compilation failed: PCRE does not support \\L, \\l, \\N, \\P, \\p, \\U, \\u, or \\X at offset 8 in /httpsdocs/PHPExcel/Worksheet.php on line 940
PHP Fatal error: Uncaught exception 'Exception' with message 'Worksheet!N11 -> Formula Error: An unexpected error occured' in /httpsdocs/PHPExcel/Cell.php:284\nStack trace:\n#0 /httpsdocs/functions.php(4075): PHPExcel_Cell->getCalculatedValue()\n#1 /httpsdocs/reports_xls.php(1570): report_weekly_sales('33', '2010', 'xxx', '15th August 201...', Array, '137')\n#2 {main}\n thrown in /httpsdocs/PHPExcel/Cell.php on line 284

I couldn't see what was wrong with cell N11; it had the same sort of numbers as the other rows in column N.

I wondered if the preg_match error relating to \\N might be the problem.

Worksheet.php rows 940-945 are:

if ((!preg_match('/^'.PHPExcel_Calculation::CALCULATION_REGEXP_CELLREF.'$/i', $pCoordinate, $matches)) &&
			(preg_match('/^'.PHPExcel_Calculation::CALCULATION_REGEXP_NAMEDRANGE.'$/i', $pCoordinate, $matches))) {
			$namedRange = PHPExcel_NamedRange::resolveRange($pCoordinate, $this);
			if (!is_null($namedRange)) {
				$pCoordinate = $namedRange->getRange();
				return $namedRange->getWorksheet()->getCell($pCoordinate);
			}
		}

So I pulled out the relevant code into a test.php file like this:

<?php
ini_set('display_errors', true);
ini_set('error_reporting', E_ALL);
$pCoordinate='N11';
$PHPExcel_Calculation_CALCULATION_REGEXP_CELLREF='((((?:\P{M}\p{M}*)+?)|(\'[^\']*\')|(\"[^\"]*\"))!)?\$?([a-z]{1,3})\$?(\d+)';
$PHPExcel_Calculation_CALCULATION_REGEXP_NAMEDRANGE='((((?:\P{M}\p{M}*)+?)|(\'[^\']*\')|(\"[^\"]*\"))!)?([_A-Z][_A-Z0-9]*)';
if ((!preg_match('/^'.$PHPExcel_Calculation_CALCULATION_REGEXP_CELLREF.'$/i', $pCoordinate, $matches)) &&
			(preg_match('/^'.$PHPExcel_Calculation_CALCULATION_REGEXP_NAMEDRANGE.'$/i', $pCoordinate, $matches))) {
}
?>

and got this error:

Warning: preg_match() [function.preg-match]: Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at offset 8 in /httpsdocs/test.php on line 7

Warning: preg_match() [function.preg-match]: Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at offset 8 in /httpsdocs/test.php on line 8

So am I right in thinking there's a problem with the code in Worksheet.php?

Dec 16, 2010 at 2:05 AM

btw this doesn't happen with 1.7.4

Coordinator
Dec 16, 2010 at 11:28 AM

This was a change to support UTF-8 sheet names in formulae. The regexp should be valid for most recent versions of PHP. What version of PHP are you using?

Dec 16, 2010 at 7:11 PM

Hi Mark

Our hosting server is running PHP 5.2.8. Should I ask our hosting company to upgrade?

In the meantime, is there a workaround I can use?

Cheers

Lucas

Coordinator
Dec 21, 2010 at 9:37 PM

I don't think upgrading PHP itself would help. Version 5.2.8 should be running against PCRE 7.8 (which should support \P) unless PHP was explicitly built against an earlier version of PCRE that didn't support this feature. I can't come up with a simple solution that doesn't break functionality for other users. However, if you aren't using multibyte names for worksheets, you can replace the CALCULATION_REGEXP_CELLREF and CALCULATION_REGEXP_NAMEDRANGE constants in Calculation.php with the versions from release 1.7.4.

Dec 21, 2010 at 10:42 PM

I ran into a similar preg_match problem today. My PHP version was 5.1.6 (or something like that). I upgraded to 5.2.x and it worked like a charm.

-wk

Dec 22, 2010 at 11:00 AM

I get the same warnings on FreeBSD 6.2 with PHP 5.3.3 and PCRE 8.00:

php -r 'print phpversion() . "\n" . PCRE_VERSION . "\n";'
5.3.3
8.00 2009-10-19

If this really is a PCRE version problem, then shouldn't the regexp constants be defined in an "if (version_compare(PCRE_VERSION, ...)) { } else { }" block?

This is my warning:

[Wed Dec 22 11:58:25 2010] [error] [client 127.0.0.1] PHP Warning:  preg_match() [<a href='function.preg-match'>function.preg-match</a>]: Compilation failed: support for \\P, \\p, and \\X has not been compiled at offset 8 in /PHPExcel/Worksheet.php on line 940

Coordinator
Dec 22, 2010 at 11:41 AM
Edited Dec 22, 2010 at 11:44 AM

I'm puzzled as to why this should be occurring. I've double-checked the PHP and PCRE version documentation, and it's pretty conclusive that this shouldn't be a problem; I've also tested it against PHP from 5.2.8 to 5.3.2, but it looks like I'm screwballed. PCRE 8.0.0 should certainly support the multibyte expressions, but it may be that PCRE was built without the --enable-unicode-properties or --enable-utf8 configuration switches.

I'll need to put in some additional tests to try to identify whether these settings are available, and revert to the version 1.7.4 regexps if PCRE doesn't support this feature.

This is already raised as Work Item 14898, so I'll post any further details of the code change against that.

Coordinator
Dec 22, 2010 at 3:47 PM

If you're suffering from this problem, can you please check to see whether the PREG_BAD_UTF8_OFFSET constant is defined. Thanks.
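
One way to run that check without triggering an "undefined constant" notice is PHP's defined() function, which takes the constant name as a string. The snippet below is only a sketch; the helper name report_constant is hypothetical and not part of PHPExcel.

```php
<?php
// Hypothetical helper: report whether a named constant exists,
// without triggering an "undefined constant" notice.
function report_constant($name)
{
    return $name . ' is ' . (defined($name) ? 'defined' : 'NOT defined');
}

echo report_constant('PREG_BAD_UTF8_OFFSET'), "\n";
// PCRE_VERSION is printed for comparison; it exists whenever the PCRE extension is loaded.
echo report_constant('PCRE_VERSION'), "\n";
```

Run it both from the command line and through the web server, since the two can behave differently.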

Dec 23, 2010 at 6:24 AM

Mark,

It doesn't seem to be defined if you meant in PHP:

php -r 'print PREG_BAD_UTF8_OFFSET . "\n";'
PHP Notice:  Use of undefined constant PREG_BAD_UTF8_OFFSET - assumed 'PREG_BAD_UTF8_OFFSET' in Command line code on line 1

Dec 23, 2010 at 6:48 AM

Mark,

Just to be sure, and because you were so baffled, I checked the Apache PHP module to see which PCRE version it was using. To my surprise, it is a different version, which explains why I'm having this problem:

PCRE_VERSION: 5.0 13-Sep-2004

Now I'm just baffled as to why the Apache PHP module uses a different pcre version to the cli PHP executable.

Anyway, is it possible to use some kind of "if else" construct based on PCRE_VERSION to make the regexps work for everyone? Defining them as class constants won't be possible then; they'll have to be static class variables initialized on the first constructor call.
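
Since the CLI binary and the Apache module can report different PCRE versions, it may help to print the same diagnostics from both and compare them. The following is a minimal sketch; the function name pcre_diagnostics is hypothetical.

```php
<?php
// Hypothetical helper: collect the values worth comparing between SAPIs.
function pcre_diagnostics()
{
    // PCRE_VERSION looks like "8.00 2009-10-19"; keep only the numeric part.
    $parts = explode(' ', PCRE_VERSION);
    return array(
        'sapi'         => PHP_SAPI,      // for example "cli" or "apache2handler"
        'php_version'  => PHP_VERSION,
        'pcre_version' => $parts[0],
        'pcre_raw'     => PCRE_VERSION,
    );
}

foreach (pcre_diagnostics() as $key => $value) {
    echo $key, ': ', $value, "\n";
}
```

Run it once with the php command-line executable and once as a page served by Apache; differing pcre_raw lines would confirm that the two are using different libraries.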

Coordinator
Dec 23, 2010 at 7:58 AM

Yes, I did mean the PHP PREG_BAD_UTF8_OFFSET constant, thanks.

If PHP is configured with --with-pcre-regex=DIR pointing to a PCRE built without the multibyte options, then you can get this type of situation, and a command-line pcretest can then subsequently report a different version.

The idea is that I can wrap a (global) constant definition within an "if test" outside of the Calculation class, then use that "global" constant to set the class constant.

Unfortunately, I can't simply rely on the value of PCRE_VERSION in my if test, because it's possible to have PCRE v8 with UTF-8 support disabled (the --enable-unicode-properties or --enable-utf8 configuration switches), so I need a method to identify that. My fallback option is to execute a preg_match using \P, then test for the error. However, it's possible that PREG_BAD_UTF8_OFFSET won't be defined unless the multibyte constructs are enabled, and that would be a much cleaner (and more efficient) "if test" than preg_match with an error trap. To test it myself, I would need to build PCRE from source and then rebuild PHP on top of that. Hopefully, those of you having the problem may be able to save me that effort.
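
The fallback option described above (try a pattern containing \P and see whether it compiles) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code that went into Calculation.php: the function name, the constant name EXAMPLE_REGEXP_SHEETNAME, and the ASCII-only fallback fragment are all hypothetical, and the fallback is not the actual 1.7.4 pattern.

```php
<?php
// Returns true when this PHP's PCRE can compile Unicode property escapes.
// preg_match() returns false (and raises a warning, silenced here with @)
// when the pattern fails to compile.
function pcre_supports_unicode_properties()
{
    return @preg_match('/^\P{M}$/', 'a') === 1;
}

// Sheet-name fragment taken from the 1.7.5 patterns quoted earlier in the thread.
$unicodeSheetName = '(?:\P{M}\p{M}*)+?';
// ASSUMPTION: an ASCII-only stand-in, NOT the real 1.7.4 pattern.
$asciiSheetName = '[A-Za-z0-9_]+';

// Define a global constant inside an "if test", outside any class.
if (!defined('EXAMPLE_REGEXP_SHEETNAME')) {   // hypothetical constant name
    define(
        'EXAMPLE_REGEXP_SHEETNAME',
        pcre_supports_unicode_properties() ? $unicodeSheetName : $asciiSheetName
    );
}

// Build a simplified cell-reference pattern from whichever fragment was chosen.
$pattern = '/^((' . EXAMPLE_REGEXP_SHEETNAME . ')!)?\$?([a-z]{1,3})\$?(\d+)$/i';
var_dump(preg_match($pattern, 'Sheet1!N11') === 1);
```

The probe costs one extra preg_match per request, which is the inefficiency mentioned above; a defined() check on a suitable constant would avoid it if such a constant reliably tracks the build options.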

Coordinator
Dec 24, 2010 at 10:32 AM
Edited Dec 24, 2010 at 11:10 AM

Hopefully, this issue is resolved with the latest changes to Calculation.php. The code can be found in the SVN repository.

Dec 27, 2010 at 7:54 PM

I grabbed the latest version of Calculation.php, and it solved the issue.  Thanks!

Jan 19, 2011 at 6:51 PM

We started out with a PHP error, "Multibyte function overloading in PHP must be disabled for string functions (2)," just for Tests/01simple.php.

The solution to that was to set mbstring.func_overload = 0 in php.ini. (Ideally you'd set this per directory in .htaccess, but this functionality has been removed from PHP?)

Then I received the error from this discussion, so using Calculation.php from SVN fixed it.

This is just a note for anyone else who may have such trouble. :)

Coordinator
Jan 19, 2011 at 10:16 PM

The "Multibyte function overloading in PHP must be disabled for string functions (2)" error is actually an exception thrown by PHPExcel itself, which tests for the mbstring.func_overload within the autoloader.

I should probably add something to the documentation to explain this, and how to resolve it.
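
For anyone who wants to confirm the setting before loading the library, the kind of check described above can be approximated as follows. This is a sketch, not PHPExcel's actual autoloader code; the function name string_functions_overloaded is hypothetical.

```php
<?php
// Bit value 2 of mbstring.func_overload overloads the string functions
// (strlen, substr, and so on). ini_get() returns false when the setting
// does not exist, which casts to 0 here.
function string_functions_overloaded()
{
    return (((int) ini_get('mbstring.func_overload')) & 2) === 2;
}

if (string_functions_overloaded()) {
    throw new Exception(
        'Multibyte function overloading in PHP must be disabled for string functions (2).'
    );
}
echo "mbstring.func_overload does not overload string functions\n";
```

If the exception is thrown, setting mbstring.func_overload = 0 in php.ini, as noted earlier in the thread, should clear it.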