What is the use of preg_match in PHP?

Example

Use a regular expression to do a case-insensitive search for "w3schools" in a string:
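A minimal sketch of such a search, assuming an illustrative input string in $str (the variable names and sample text are assumptions, not part of the reference):

```php
<?php
// Sample input; the text is an assumption for illustration.
$str = "Visit W3Schools";
// The trailing "i" modifier makes the match case-insensitive.
$pattern = "/w3schools/i";
$found = preg_match($pattern, $str);
echo $found; // 1 when a match is found, 0 otherwise
```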

Definition and Usage

The preg_match() function returns whether a match was found in a string.

Syntax

preg_match(pattern, input, matches, flags, offset)

Parameter Values

Parameter  Description
pattern    Required. Contains a regular expression indicating what to search for
input      Required. The string in which the search will be performed
matches    Optional. The variable used in this parameter will be populated with an array containing all of the matches that were found
flags      Optional. A set of options that change how the matches array is structured:
  • PREG_OFFSET_CAPTURE - When this option is enabled, each match, instead of being a string, will be an array where the first element is a substring containing the match and the second element is the position of the first character of the substring in the input.
  • PREG_UNMATCHED_AS_NULL - When this option is enabled, unmatched subpatterns will be returned as NULL instead of as an empty string.
offset     Optional. Defaults to 0. Indicates how far into the string to begin searching. The preg_match() function will not find matches that occur before the position given in this parameter

Technical Details

Return Value: Returns 1 if a match was found, 0 if no matches were found and false if an error occurred
PHP Version: 4+
Changelog:
PHP 7.2 - Added the PREG_UNMATCHED_AS_NULL flag
PHP 5.3.6 - The function returns false when the offset is longer than the length of the input
PHP 5.2.2 - Named subpatterns can use the (?'name') and (?<name>) syntax in addition to the previous (?P<name>)

More Examples

Example

Use PREG_OFFSET_CAPTURE to find the position in the input string in which the matches were found:
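A minimal sketch of this usage, assuming an illustrative input string (the sample text and variable names are assumptions):

```php
<?php
$str = "Welcome to W3Schools";   // illustrative input
$pattern = "/w3schools/i";
preg_match($pattern, $str, $matches, PREG_OFFSET_CAPTURE);
// $matches[0] now holds [matched text, byte offset of the match]
print_r($matches);
```

Here $matches[0][0] is the matched text "W3Schools" and $matches[0][1] is its position, 11.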


(PHP 4, PHP 5, PHP 7, PHP 8)

preg_match - Perform a regular expression match

Description

preg_match(
    string $pattern,
    string $subject,
    array &$matches = null,
    int $flags = 0,
    int $offset = 0
): int|false

Parameters

pattern

The pattern to search for, as a string.

subject

The input string.

matches

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

flags

flags can be a combination of the following flags:

PREG_OFFSET_CAPTURE

If this flag is passed, the string offset (in bytes) of every match will also be returned. Note that this changes the value of matches into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.
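A minimal sketch of the flag in use; the pattern and subject are assumptions chosen to be consistent with the output listed for this flag:

```php
<?php
// Three capture groups, each reported together with its byte offset.
preg_match('/(foo)(bar)(baz)/', 'foobarbaz', $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
```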

The above example will output:

Array
(
    [0] => Array
        (
            [0] => foobarbaz
            [1] => 0
        )

    [1] => Array
        (
            [0] => foo
            [1] => 0
        )

    [2] => Array
        (
            [0] => bar
            [1] => 3
        )

    [3] => Array
        (
            [0] => baz
            [1] => 6
        )

)

PREG_UNMATCHED_AS_NULL

If this flag is passed, unmatched subpatterns are reported as null; otherwise they are reported as an empty string.
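A minimal sketch comparing both behaviours; the pattern and subject are assumptions chosen to be consistent with the dumps listed for this flag:

```php
<?php
// Group 2, (b)*, takes no part in the match of "ac".
preg_match('/(a)(b)*(c)/', 'ac', $matches);
var_dump($matches);        // group 2 is an empty string

preg_match('/(a)(b)*(c)/', 'ac', $matchesNull, PREG_UNMATCHED_AS_NULL);
var_dump($matchesNull);    // group 2 is NULL
```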

The above example will output:

array(4) {
  [0]=>
  string(2) "ac"
  [1]=>
  string(1) "a"
  [2]=>
  string(0) ""
  [3]=>
  string(1) "c"
}
array(4) {
  [0]=>
  string(2) "ac"
  [1]=>
  string(1) "a"
  [2]=>
  NULL
  [3]=>
  string(1) "c"
}

offset

Normally, the search starts from the beginning of the subject string. The optional parameter offset can be used to specify the alternate place from which to start the search (in bytes).

Note:

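Using offset is not equivalent to passing substr($subject, $offset) to preg_match() in place of the subject string, because pattern can contain assertions such as ^, $ or (?<=x). For example, matching the pattern /^def/ against the subject "abcdef" with the PREG_OFFSET_CAPTURE flag and an offset of 3 finds nothing and leaves $matches as an empty array, because ^ still refers to the real start of the subject, while matching the same pattern against substr($subject, 3) will produce: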

Array
(
    [0] => Array
        (
            [0] => def
            [1] => 0
        )

)

Alternatively, to avoid using substr(), use the \G assertion rather than the ^ anchor, or the A modifier instead, both of which work with the offset parameter.
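The four variants described in this note can be put side by side. This is a sketch, assuming the subject "abcdef" and an offset of 3; the variable names are illustrative:

```php
<?php
$subject = "abcdef";

// ^ anchors to the real start of $subject, so nothing matches at offset 3.
$withOffset = preg_match('/^def/', $subject, $m1, PREG_OFFSET_CAPTURE, 3);

// substr() builds a new string that starts with "def"; the offset is reported as 0.
$withSubstr = preg_match('/^def/', substr($subject, 3), $m2, PREG_OFFSET_CAPTURE);

// \G anchors at the offset position instead of the start of the string.
$withG = preg_match('/\Gdef/', $subject, $m3, PREG_OFFSET_CAPTURE, 3);

// The A modifier also anchors the pattern at the offset position.
$withA = preg_match('/def/A', $subject, $m4, PREG_OFFSET_CAPTURE, 3);
```

With \G or the A modifier the reported offset stays relative to the full subject (3 here), which is usually what later processing needs.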

Return Values

preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or false on failure.

Warning

This function may return Boolean false, but may also return a non-Boolean value which evaluates to false. Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.

Errors/Exceptions

If the regex pattern passed does not compile to a valid regex, an E_WARNING is emitted.
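A minimal sketch of checking the return value strictly, so that "no match" (0) and "error" (false) are not confused. The helper name describeMatch is hypothetical, and the @ operator is used here only to keep the E_WARNING out of the output:

```php
<?php
// Hypothetical helper that separates "no match" (0) from "error" (false).
function describeMatch(string $pattern, string $subject): string
{
    // @ suppresses the E_WARNING emitted when the pattern fails to compile.
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';
    }
    return $result === 1 ? 'match' : 'no match';
}

echo describeMatch('/php/i', 'PHP is fun'), "\n";      // valid pattern, matches
echo describeMatch('/python/', 'PHP is fun'), "\n";    // valid pattern, no match
echo describeMatch('/(unclosed/', 'PHP is fun'), "\n"; // invalid pattern
```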

Changelog

Version  Description
7.2.0    PREG_UNMATCHED_AS_NULL is now supported for the $flags parameter.

Examples

Example #1 Find the string of text "php"
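A minimal sketch for this example, assuming an illustrative sentence as the subject:

```php
<?php
// The "i" after the pattern delimiter indicates a case-insensitive search.
$found = preg_match('/php/i', 'PHP is the web scripting language of choice.');
echo $found ? "A match was found." : "A match was not found.";
```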

Example #2 Find the word "web"
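A minimal sketch for this example, assuming illustrative sentences as subjects. The \b escapes mark word boundaries, so a longer word that merely contains "web" is not matched:

```php
<?php
// \b marks a word boundary, so only the standalone word "web" matches.
$inSentence = preg_match('/\bweb\b/i', 'PHP is the web scripting language of choice.');

// "website" contains "web", but not as a separate word.
$inWebsite = preg_match('/\bweb\b/i', 'PHP is the website scripting language of choice.');
```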

Example #3 Getting the domain name out of a URL
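A minimal sketch for this example in two steps, assuming an illustrative URL: first capture the host name, then keep its last two dot-separated segments.

```php
<?php
// Get the host name from the URL (the scheme is optional).
preg_match('@^(?:http://)?([^/]+)@i', "http://www.php.net/index.html", $matches);
$host = $matches[1];

// Get the last two segments of the host name.
preg_match('/[^.]+\.[^.]+$/', $host, $matches);
$domain = $matches[0];
echo "domain name is: {$domain}\n";
```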

The above example will output:

domain name is: php.net

Example #4 Using named subpattern
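A minimal sketch for this example; the subject and group names are taken from the output listed for it:

```php
<?php
$str = 'foobar: 2008';
// (?P<name>...) defines a named subpattern; (?<name>...) works as well.
preg_match('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);
print_r($matches);
```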

The above example will output:

Array
(
    [0] => foobar: 2008
    [name] => foobar
    [1] => foobar
    [digit] => 2008
    [2] => 2008
)

Notes

Tip

Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() instead as it will be faster.
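A minimal sketch of the strpos() alternative, assuming an illustrative haystack. Because a hit at position 0 evaluates to false in a loose comparison, the result is compared with !== false:

```php
<?php
$haystack = "PHP is the web scripting language of choice.";

// strpos() returns the position of the first occurrence, or false if absent.
$containsWeb  = strpos($haystack, "web") !== false;
$containsPhp  = strpos($haystack, "PHP") !== false;  // found at position 0
$containsPerl = strpos($haystack, "Perl") !== false;
```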

See Also

  • PCRE Patterns
  • preg_quote() - Quote regular expression characters
  • preg_match_all() - Perform a global regular expression match
  • preg_replace() - Perform a regular expression search and replace
  • preg_split() - Split string by a regular expression
  • preg_last_error() - Returns the error code of the last PCRE regex execution
  • preg_last_error_msg() - Returns the error message of the last PCRE regex execution

force at md-t dot org

11 years ago

Simple regex

Regex quick reference
[abc]     A single character: a, b or c
[^abc]     Any single character but a, b, or c
[a-z]     Any single character in the range a-z
[a-zA-Z]     Any single character in the range a-z or A-Z
^     Start of line
$     End of line
\A     Start of string
\z     End of string
.     Any single character
\s     Any whitespace character
\S     Any non-whitespace character
\d     Any digit
\D     Any non-digit
\w     Any word character (letter, number, underscore)
\W     Any non-word character
\b     Any word boundary
(...)     Capture everything enclosed
(a|b)     a or b
a?     Zero or one of a
a*     Zero or more of a
a+     One or more of a
a{3}     Exactly 3 of a
a{3,}     3 or more of a
a{3,6}     Between 3 and 6 of a

options: i case insensitive; m multiline (^ and $ also match at line breaks); s make dot match newlines; x ignore whitespace in regex

MrBull

11 years ago

Sometimes it's useful to negate a string. The first method that comes to mind is [^(string)], but this of course won't work. There is a solution, but it is not very well known. This is the simple piece of code showing how a string is negated:

(?:(?!string).)

?: makes a non-capturing subpattern (see //www.php.net/manual/en/regexp.reference.subpatterns.php) and ?! is a negative lookahead. You put the negative lookahead in front of the dot because you want the regex engine to first check whether the string you are negating starts at this position. Only if it does not do you want to match an arbitrary character.

Hope this helps some ppl.
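A minimal sketch of the construct in a complete pattern. Repeating it with * between ^ and $ is an assumption about how it would typically be used, and the sample subjects are illustrative:

```php
<?php
// Matches only when "string" occurs nowhere in the subject:
// each character is consumed only if "string" does not start there.
$pattern = '/^(?:(?!string).)*$/';

$without = preg_match($pattern, 'some other text');  // "string" is absent
$with    = preg_match($pattern, 'a string here');    // "string" is present
```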

ruakuu at NOSPAM dot com

12 years ago

I was working on a site that needed Japanese and alphabetic letters and needed to
validate input using preg_match. I tried using \p{script} but it didn't work:

arash dot hemmat at gmail dot com

11 years ago

For those who search for a unicode regular expression example using preg_match here it is:

Check for Persian digits
preg_match( "/[^\x{06F0}-\x{06F9}\x]+/u" , '۱۲۳۴۵۶۷۸۹۰' );

daevid at daevid dot com

13 years ago

I just learned about named groups from a Python friend today and was curious if PHP supported them, guess what -- it does!!!

//www.regular-expressions.info/named.html
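A minimal sketch of named groups; the pattern and subject are assumptions chosen to be consistent with the result shown for this note:

```php
<?php
// Two named groups (foo, bar) around one unnamed group.
preg_match('/(?P<foo>abc)(.*)(?P<bar>xyz)/', 'abcdefghijklmnopqrstuvwxyz', $matches);
print_r($matches);
```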



will produce:

Array
(
    [0] => abcdefghijklmnopqrstuvwxyz
    [foo] => abc
    [1] => abc
    [2] => defghijklmnopqrstuvw
    [bar] => xyz
    [3] => xyz
)

Note that you actually get the named group as well as the numerical key
value too, so if you do use them, and you're counting array elements, be
aware that your array might be bigger than you initially expect it to be.

mohammad40g at gmail dot com

11 years ago

This sample is for checking Persian characters:

andre at koethur dot de

9 years ago

Be aware of bug //bugs.php.net/bug.php?id=50887 when using sub patterns: Un-matched optional sub patterns at the end won't show up in $matches.

Here is a workaround: Assign a name to all subpatterns you are interested in, and merge $match afterwards with a constant array containing some reasonable default values:
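A minimal sketch of this workaround. The pattern (a language code with an optional ";q=" value) and the names $defaults and $merged are assumptions for illustration; only the group names lang and qval come from the results shown for this note:

```php
<?php
// Hypothetical pattern: a language code with an optional ";q=" quality value.
$pattern = '/^(?<lang>[a-z]{2})(?:;q=(?<qval>[0-9.]+))?$/';
$defaults = ['lang' => '', 'qval' => ''];

preg_match($pattern, 'de', $match);
// The unmatched trailing group "qval" is absent from $match at this point.

// Defaults first, so that matched values overwrite them.
$merged = array_merge($defaults, $match);
print_r($merged);
```

Passing PREG_UNMATCHED_AS_NULL is another way to get predictable keys, at the cost of handling null values.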



This outputs:
Array
(
    [lang] => de
    [qval] =>
    [0] => de
    [1] => de
)

Instead of:
Array
(
    [0] => de
    [lang] => de
    [1] => de
)

solixmexico at outlook dot com

5 years ago

To validate directories on Windows I used this:

if( preg_match("#^([a-z]{1}\:{1})?[\\\/]?([\-\w]+[\\\/]?)*$#i",$_GET['path'],$matches) !== 1 ){
    echo("Invalid value");
}else{
    echo("Valid value");
}

The parts are:

#^ and $#i            Anchor the pattern so that the whole string must match, from start to end; the i modifier makes the match case-insensitive.
([a-z]{1}\:{1})?        The string may start with one letter and a colon, exactly one character each; this is the drive letter (C:)
[\\\/]?            The string may contain, but does not require, one slash or backslash after the drive letter (\ or /)
([\-\w]+[\\\/]?)*    One or more hyphens, letters, digits or underscores, optionally followed by a slash or backslash, to allow a directory like ("/" or "folderName" or "folderName/"); this group may be repeated zero or more times.

sainnr at gmail dot com

11 years ago

This sample regexp may be useful if you are working with DB field types.

(?P<type>\w+)($|\((?P<length>(\d+|(.*)))\))

For example, if you have a type such as "varchar(255)" or "text", the following fragment
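A minimal sketch of such a fragment, assuming the group names type and length that appear in the result shown for this note; the variable names are illustrative:

```php
<?php
$dbType = 'varchar(255)';   // illustrative field type
// "type" captures the type name; "length" captures what is inside the parentheses.
preg_match('/(?P<type>\w+)($|\((?P<length>(\d+|(.*)))\))/', $dbType, $field);
print_r($field);
```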



will output something like this:
Array ( [0] => varchar(255) [type] => varchar [1] => varchar [2] => (255) [length] => 255 [3] => 255 [4] => 255 )

ian_channing at hotmail dot com

11 years ago

When trying to check a file path that could be windows or unix it took me quite a few tries to get the escape characters right.

The Unix directory separator must be escaped once and the windows directory separator must be escaped twice.

This will match path/to/file and path\to\file.exe

preg_match('/^[a-z0-9_.\/\\\]*$/i', $file_string);

cmallabon at homesfactory dot com

11 years ago

Just an interesting note. I was just updating code to replace ereg() with strpos() and preg_match(), and the thought occurred that preg_match() could be optimized to quit early when only checking whether a string begins with something, for example


vs



As I guessed, strpos() is always faster (about 2x) for short strings like a URL, but for very long strings of several paragraphs (e.g. a block of XML), when the string doesn't start with the needle, preg_match() is twice as fast as strpos() as it doesn't scan the entire string.

So, if you are searching long strings and expect the check to normally be true (e.g. validating XML), strpos() is much faster, BUT if you expect it to often fail, preg_match() is the better choice.
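A minimal sketch of the two prefix checks being compared, assuming an illustrative URL and the needle "http". The strncmp() variant is an addition, not part of the note: it compares only the first bytes and so avoids scanning in either case. No timings are claimed here.

```php
<?php
$url = 'http://www.php.net/index.html';   // illustrative input

// Anchored regex: can give up as soon as the first characters differ.
$viaRegex = preg_match('/^http/', $url) === 1;

// strpos() searches for the needle anywhere, then the result is compared with 0.
$viaStrpos = strpos($url, 'http') === 0;

// strncmp() compares only the first 4 bytes and never scans further.
$viaStrncmp = strncmp($url, 'http', 4) === 0;
```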

geompse at gmail dot com

5 years ago

The function will return false and raise a warning if the input $subject is too long :
[PhpWarning] preg_match(): Subject is too long

I believe the limit is 1 or 2 GB because I was using a 2.2GB string.
While a parameter might exist to alter this limit, in my case it was possible and wiser to use

splattermania at freenet dot de

12 years ago

As I wasted lots of time finding a REAL regex for URLs and ended up building it on my own, I now have one that seems to work for all kinds of URLs:



Then, the correct way to check against the regex is as follows:

Anonymous

10 years ago

Here is a function that decreases the numbers inside a string (useful to convert DOM object into simplexml object)

e.g.: decremente_chaine("somenode->anode[2]->achildnode[3]") will return "somenode->anode[1]->achildnode[2]"

the numbering of the nodes in simplexml starts from zero, but from 1 in DOM xpath objects
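A hypothetical sketch of such a function, not the note's own code. It assumes that only the bracketed indexes need decrementing and uses preg_replace_callback() with an arrow function (PHP 7.4 or later):

```php
<?php
// Hypothetical reconstruction: subtract 1 from every [n] index in the path.
function decremente_chaine(string $chaine): string
{
    return preg_replace_callback(
        '/\[(\d+)\]/',
        fn(array $m): string => '[' . ((int) $m[1] - 1) . ']',
        $chaine
    );
}

echo decremente_chaine("somenode->anode[2]->achildnode[3]");
```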

aer0s

10 years ago

Simple function to return a sub-string following the preg convention. Kind of expensive, and some might say lazy but it has saved me time.

# preg_substr($pattern,$subject,[$offset]) function
# @author   aer0s
#  return a specific sub-string in a string using
#   a regular expression
# @param   $pattern   regular expression pattern to match
# @param   $subject   string to search
# @param   [$offset]   zero based match occurrence to return
#
# [$offset] is 0 by default which returns the first occurrence,
# if [$offset] is -1 it will return the last occurrence

function preg_substr($pattern,$subject,$offset=0){
    preg_match_all($pattern,$subject,$matches,PREG_PATTERN_ORDER);
    return $offset==-1?array_pop($matches[0]):$matches[0][$offset];
}

example:

             $pattern = "/model(\s|-)[a-z0-9]+/i";
             $subject = "Is there something wrong with model 654, Model 732, and model 43xl or is Model aj45B the preferred choice?";

             echo preg_substr($pattern,$subject);
             echo preg_substr($pattern,$subject,1);
             echo preg_substr($pattern,$subject,-1);

Returns something like:

             model 654
             Model 732
             Model aj45B

Jonny 5

10 years ago

Workaround for getting the offset in UTF-8
(in some cases mb_strpos might be an option as well)
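A hypothetical sketch of one such workaround, not the note's own code: take the byte offset reported by PREG_OFFSET_CAPTURE and count the UTF-8 characters that precede it. The sample subject is an assumption, and the source file is assumed to be saved as UTF-8:

```php
<?php
$subject = 'äöü def';   // each umlaut occupies 2 bytes in UTF-8

preg_match('/def/u', $subject, $matches, PREG_OFFSET_CAPTURE);
$byteOffset = $matches[0][1];              // offset in bytes

// Count the characters in the bytes before the match; with the u modifier
// "." matches one whole UTF-8 character, and s lets it match newlines too.
$prefix = substr($subject, 0, $byteOffset);
$charOffset = preg_match_all('/./su', $prefix);
```

Where the mbstring extension is available, mb_strlen($prefix, 'UTF-8') gives the same character count.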

akniep at rayo dot info

13 years ago

Bugs of preg_match (PHP-version 5.2.5)

In most cases, the following example will show one of two PHP-bugs discovered with preg_match depending on your PHP-version and configuration.