composer/pcre
=============

PCRE wrapping library that offers type-safe `preg_*` replacements.

This library gives you a way to ensure `preg_*` functions do not fail silently, returning
unexpected `null`s that may not be handled.

As of 3.0 this library enforces [`PREG_UNMATCHED_AS_NULL`](#preg_unmatched_as_null) usage
for all matching and replaceCallback functions; [read more below](#preg_unmatched_as_null)
to understand the implications.

It thus makes it easier to work with static analysis tools like PHPStan or Psalm, as it
simplifies and reduces the possible return values from all the `preg_*` functions, which
are quite packed with edge cases. As of v2.2.0 / v3.2.0 the library also comes with a
[PHPStan extension](#phpstan-extension) for parsing regular expressions and giving you even better output types.

This library is a thin wrapper around `preg_*` functions with [some limitations](#restrictions--limitations).
If you are looking for a richer API to handle regular expressions, have a look at
[rawr/t-regx](https://packagist.org/packages/rawr/t-regx) instead.

[GitHub Actions](https://github.com/composer/pcre/actions)


Installation
------------

Install the latest version with:

```bash
$ composer require composer/pcre
```


Requirements
------------

* PHP 7.4.0 is required for 3.x versions
* PHP 7.2.0 is required for 2.x versions
* PHP 5.3.2 is required for 1.x versions


Basic usage
-----------

Instead of:

```php
if (preg_match('{fo+}', $string, $matches)) { ... }
if (preg_match('{fo+}', $string, $matches, PREG_OFFSET_CAPTURE)) { ... }
if (preg_match_all('{fo+}', $string, $matches)) { ... }
$newString = preg_replace('{fo+}', 'bar', $string);
$newString = preg_replace_callback('{fo+}', function ($match) { return strtoupper($match[0]); }, $string);
$newString = preg_replace_callback_array(['{fo+}' => fn ($match) => strtoupper($match[0])], $string);
$filtered = preg_grep('{[a-z]}', $elements);
$array = preg_split('{[a-z]+}', $string);
```

You can now call these on the `Preg` class:

```php
use Composer\Pcre\Preg;

if (Preg::match('{fo+}', $string, $matches)) { ... }
if (Preg::matchWithOffsets('{fo+}', $string, $matches)) { ... }
if (Preg::matchAll('{fo+}', $string, $matches)) { ... }
$newString = Preg::replace('{fo+}', 'bar', $string);
$newString = Preg::replaceCallback('{fo+}', function ($match) { return strtoupper($match[0]); }, $string);
$newString = Preg::replaceCallbackArray(['{fo+}' => fn ($match) => strtoupper($match[0])], $string);
$filtered = Preg::grep('{[a-z]}', $elements);
$array = Preg::split('{[a-z]+}', $string);
```

The main difference is that if a call fails (a PCRE error, as opposed to a pattern that simply does not match), it will throw a `Composer\Pcre\PcreException`
instead of returning `null` (or `false` in some cases), so you can now use the return values safely, relying on
the fact that they can only be strings (for replace), ints (for match) or arrays (for grep/split).

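To illustrate the silent failures this refers to, here is a minimal sketch using only native `preg_*` functions. Invalid UTF-8 combined with the `u` modifier is used as a deterministic way to trigger a PCRE error. The `strictReplace` helper is hypothetical and only approximates the idea of throwing instead of returning `null`; it is not the library's implementation.

```php
// Native preg_* calls report PCRE errors through their return value only.
$invalid = "\xff"; // not valid UTF-8, so the /u modifier makes PCRE fail

$matchResult = preg_match('{.}u', $invalid);            // false rather than 0
$matchError = preg_last_error();                        // PREG_BAD_UTF8_ERROR
$replaceResult = preg_replace('{.}u', 'x', $invalid);   // null rather than a string

// Hypothetical helper (not part of composer/pcre): surface the null as an exception.
function strictReplace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        throw new RuntimeException('preg_replace failed with error code ' . preg_last_error());
    }

    return $result;
}
```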
Additionally the `Preg` class provides match methods that return `bool` rather than `int`, for stricter type safety
when the number of pattern matches is not useful:

```php
use Composer\Pcre\Preg;

if (Preg::isMatch('{fo+}', $string, $matches)) { ... } // bool
if (Preg::isMatchAll('{fo+}', $string, $matches)) { ... } // bool
```

Finally the `Preg` class provides a few `*StrictGroups` method variants that ensure match groups
are always present and thus non-nullable, making it easier to write type-safe code:

```php
use Composer\Pcre\Preg;

// $matches is guaranteed to be an array of strings; if a subpattern does not match and produces a null, it will throw
if (Preg::matchStrictGroups('{fo+}', $string, $matches)) { ... }
if (Preg::matchAllStrictGroups('{fo+}', $string, $matches)) { ... }
```

**Note:** This is generally safe to use as long as you do not have optional subpatterns (i.e. `(something)?`
or `(something)*` or branches with a `|` that result in some groups not being matched at all).
A subpattern that can match an empty string like `(.*)` is **not** optional; it will be present as an
empty string in the matches. A non-capturing subpattern, even if optional like `(?:foo)?`, will not be present in
matches at all, so it is also not a problem to use these with `*StrictGroups` methods.

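The three cases in the note can be checked with native `preg_match` and the `PREG_UNMATCHED_AS_NULL` flag. This is only a sketch of the match shapes, assuming PHP 7.4 or later (where trailing unmatched groups are kept as `null`); the variable names are illustrative.

```php
// Empty-matching group: it takes part in the match and captures ''.
preg_match('{^(.*)bar$}', 'bar', $emptyGroup, PREG_UNMATCHED_AS_NULL);

// Optional group that is skipped: captured as null, which is the case
// a *StrictGroups method is documented to reject by throwing.
preg_match('{^(foo)?bar$}', 'bar', $optionalGroup, PREG_UNMATCHED_AS_NULL);

// Non-capturing group: never gets a slot in the matches, matched or not.
preg_match('{^(?:foo)?bar$}', 'bar', $nonCapturing, PREG_UNMATCHED_AS_NULL);
```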
If you would prefer a slightly more verbose usage, replacing by-ref arguments by result objects, you can use the `Regex` class:

```php
use Composer\Pcre\Regex;

// this is useful when you are just interested in knowing if something matched
// as it returns a bool instead of int(1/0) for match
$bool = Regex::isMatch('{fo+}', $string);

$result = Regex::match('{fo+}', $string);
if ($result->matched) { something($result->matches); }

$result = Regex::matchWithOffsets('{fo+}', $string);
if ($result->matched) { something($result->matches); }

$result = Regex::matchAll('{fo+}', $string);
if ($result->matched && $result->count > 3) { something($result->matches); }

$newString = Regex::replace('{fo+}', 'bar', $string)->result;
$newString = Regex::replaceCallback('{fo+}', function ($match) { return strtoupper($match[0]); }, $string)->result;
$newString = Regex::replaceCallbackArray(['{fo+}' => fn ($match) => strtoupper($match[0])], $string)->result;
```

Note that `preg_grep` and `preg_split` are only callable via the `Preg` class as they do not have
complex return types warranting a specific result object.

See the [MatchResult](src/MatchResult.php), [MatchWithOffsetsResult](src/MatchWithOffsetsResult.php), [MatchAllResult](src/MatchAllResult.php),
[MatchAllWithOffsetsResult](src/MatchAllWithOffsetsResult.php), and [ReplaceResult](src/ReplaceResult.php) class sources for more details.

Restrictions / Limitations
--------------------------

Due to type safety requirements a few restrictions are in place.

- matching using `PREG_OFFSET_CAPTURE` is made available via `matchWithOffsets` and `matchAllWithOffsets`.
  You cannot pass the flag to `match`/`matchAll`.
- `Preg::split` will also reject `PREG_SPLIT_OFFSET_CAPTURE` and you should use `splitWithOffsets`
  instead.
- `matchAll` rejects `PREG_SET_ORDER` as it also changes the shape of the returned matches. There
  is no alternative provided as you can fairly easily code around it.
- `preg_filter` is not supported as it has a rather crazy API; most likely you should instead
  use `Preg::grep` in combination with a loop and `Preg::replace`.
- `replace`, `replaceCallback` and `replaceCallbackArray` do not support an array `$subject`,
  only simple strings.
- As of 2.0, the library always uses `PREG_UNMATCHED_AS_NULL` for matching, which offers [much
  saner/more predictable results](#preg_unmatched_as_null). As of 3.0 the flag is also set for
  `replaceCallback` and `replaceCallbackArray`.

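As an illustration of coding around the missing `PREG_SET_ORDER` support, the sketch below regroups pattern-ordered matches into one array per match. `toSetOrder` is a hypothetical helper, not part of the library. Native `preg_match_all` is used to keep the example self-contained, on the assumption that the matches filled in by `Preg::matchAll` have the same pattern-ordered shape.

```php
// Hypothetical helper: turn [group => [matchIndex => value]] (pattern order)
// into [matchIndex => [group => value]] (set order).
function toSetOrder(array $matches): array
{
    $sets = [];
    foreach ($matches as $group => $values) {
        foreach ($values as $index => $value) {
            $sets[$index][$group] = $value;
        }
    }

    return $sets;
}

// Default flags give pattern order: all full matches first, then each group.
preg_match_all('{(\w)(\d)}', 'a1 b2', $matches);

// $sets[0] describes the first match, $sets[1] the second.
$sets = toSetOrder($matches);
```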
#### PREG_UNMATCHED_AS_NULL

As of 2.0, this library always uses `PREG_UNMATCHED_AS_NULL` for all `match*` and `isMatch*`
functions. As of 3.0 it is also done for `replaceCallback` and `replaceCallbackArray`.

This means your matches will always contain all capturing groups, either as `null` if unmatched
or as a string if matched.

The advantages in clarity and predictability are clearer if you compare the two outputs of
running this with and without `PREG_UNMATCHED_AS_NULL` in `$flags`:

```php
preg_match('/(a)(b)*(c)(d)*/', 'ac', $matches, $flags);
```

| no flag | PREG_UNMATCHED_AS_NULL |
| --- | --- |
| array (size=4) | array (size=5) |
| 0 => string 'ac' (length=2) | 0 => string 'ac' (length=2) |
| 1 => string 'a' (length=1) | 1 => string 'a' (length=1) |
| 2 => string '' (length=0) | 2 => null |
| 3 => string 'c' (length=1) | 3 => string 'c' (length=1) |
| | 4 => null |
| group 2 (any unmatched group preceding one that matched) is set to `''`. You cannot tell if it matched an empty string or did not match at all | group 2 is `null` when unmatched and a string if it matched, easy to check for |
| group 4 (any optional group without a matching one following) is missing altogether. So you have to check with `isset()`, but really you want `isset($m[4]) && $m[4] !== ''` for safety unless you are very careful to check that a non-optional group follows it | group 4 is always set, and null in this case as there was no match, easy to check for with `$m[4] !== null` |

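The comparison can be reproduced with a short script that uses native `preg_match` only. This is a sketch assuming PHP 7.4 or later, where trailing unmatched groups are kept as `null` when the flag is set; the variable names are illustrative.

```php
$pattern = '/(a)(b)*(c)(d)*/';

// Same pattern and subject, once without and once with the flag.
preg_match($pattern, 'ac', $withoutFlag);
preg_match($pattern, 'ac', $withFlag, PREG_UNMATCHED_AS_NULL);

// Without the flag, group 2 is an empty string and group 4 has no slot at all.
var_dump(count($withoutFlag), $withoutFlag[2] === '', isset($withoutFlag[4]));

// With the flag (PHP 7.4+), every group has a slot and unmatched ones are null.
var_dump(count($withFlag), $withFlag[2] === null, array_key_exists(4, $withFlag));
```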
PHPStan Extension
-----------------

If you do not use `phpstan/extension-installer`, you can enable the PHPStan extension by including `vendor/composer/pcre/extension.neon` in your PHPStan config.

The extension provides much better type information for `$matches` as well as regex validation where possible.

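A minimal sketch of the manual include, assuming your PHPStan configuration lives in a `phpstan.neon` file at the project root and that dependencies are installed in the default `vendor/` directory:

```neon
includes:
    - vendor/composer/pcre/extension.neon
```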
License
-------

composer/pcre is licensed under the MIT License, see the LICENSE file for details.