glibmm 2.80.0
Glib::Regex Class Reference (final)

Perl-compatible regular expressions - matches strings against regular expressions.

#include <glibmm/regex.h>

Public Types

enum class  CompileFlags {
  DEFAULT = 0x0 ,
  CASELESS = 1 << 0 ,
  MULTILINE = 1 << 1 ,
  DOTALL = 1 << 2 ,
  EXTENDED = 1 << 3 ,
  ANCHORED = 1 << 4 ,
  DOLLAR_ENDONLY = 1 << 5 ,
  UNGREEDY = 1 << 9 ,
  RAW = 1 << 11 ,
  NO_AUTO_CAPTURE = 1 << 12 ,
  OPTIMIZE = 1 << 13 ,
  FIRSTLINE = 1 << 18 ,
  DUPNAMES = 1 << 19 ,
  NEWLINE_CR = 1 << 20 ,
  NEWLINE_LF = 1 << 21 ,
  NEWLINE_CRLF = 0x300000 ,
  NEWLINE_ANYCRLF = 0x500000 ,
  BSR_ANYCRLF = 1 << 23 ,
  JAVASCRIPT_COMPAT = 1 << 25
}
 
enum class  MatchFlags {
  DEFAULT = 0x0 ,
  ANCHORED = 1 << 4 ,
  NOTBOL = 1 << 7 ,
  NOTEOL = 1 << 8 ,
  NOTEMPTY = 1 << 10 ,
  PARTIAL = 1 << 15 ,
  NEWLINE_CR = 1 << 20 ,
  NEWLINE_LF = 1 << 21 ,
  NEWLINE_CRLF = 0x300000 ,
  NEWLINE_ANY = 1 << 22 ,
  NEWLINE_ANYCRLF = 0x500000 ,
  BSR_ANYCRLF = 1 << 23 ,
  BSR_ANY = 1 << 24 ,
  PARTIAL_SOFT = 0x8000 ,
  PARTIAL_HARD = 1 << 27 ,
  NOTEMPTY_ATSTART = 1 << 28
}
 

Public Member Functions

void reference () const
 Increment the reference count for this object.
 
void unreference () const
 Decrement the reference count for this object.
 
GRegex * gobj ()
 Provides access to the underlying C instance.
 
const GRegex * gobj () const
 Provides access to the underlying C instance.
 
GRegex * gobj_copy () const
 Provides access to the underlying C instance. The caller is responsible for unrefing it. Use when directly setting fields in structs.
 
 Regex ()=delete
 
 Regex (const Regex &)=delete
 
Regex & operator= (const Regex &)=delete
 
Glib::ustring get_pattern () const
 Gets the pattern string associated with regex, i.e. a copy of the string passed to g_regex_new().
 
int get_max_backref () const
 Returns the number of the highest back reference in the pattern, or 0 if the pattern does not contain back references.
 
int get_capture_count () const
 Returns the number of capturing subpatterns in the pattern.
 
bool get_has_cr_or_lf () const
 Checks whether the pattern contains explicit CR or LF references.
 
int get_max_lookbehind () const
 Gets the number of characters in the longest lookbehind assertion in the pattern.
 
int get_string_number (Glib::UStringView name) const
 Retrieves the number of the subexpression named name.
 
CompileFlags get_compile_flags () const
 Returns the compile options that regex was created with.
 
MatchFlags get_match_flags () const
 Returns the match options that regex was created with.
 
bool match (Glib::UStringView string, Glib::MatchInfo &match_info, MatchFlags match_options=static_cast< MatchFlags >(0))
 Scans for a match in string for the pattern in regex.
 
bool match (Glib::UStringView string, MatchFlags match_options=static_cast< MatchFlags >(0))
 A match() method not requiring a Glib::MatchInfo.
 
bool match (Glib::UStringView string, int start_position, Glib::MatchInfo &match_info, MatchFlags match_options=static_cast< MatchFlags >(0))
 A match() method with a start position and a Glib::MatchInfo.
 
bool match (Glib::UStringView string, gssize string_len, int start_position, Glib::MatchInfo &match_info, MatchFlags match_options=static_cast< MatchFlags >(0))
 Scans for a match in string for the pattern in regex.
 
bool match (Glib::UStringView string, int start_position, MatchFlags match_options)
 A match() method with a start position not requiring a Glib::MatchInfo.
 
bool match (Glib::UStringView string, gssize string_len, int start_position, MatchFlags match_options)
 A match() method with a string length and start position not requiring a Glib::MatchInfo.
 
bool match_all (Glib::UStringView string, Glib::MatchInfo &match_info, MatchFlags match_options=static_cast< MatchFlags >(0))
 Using the standard algorithm for regular expression matching only the longest match in the string is retrieved.
 
bool match_all (Glib::UStringView string, MatchFlags match_options=static_cast< MatchFlags >(0))
 A match_all() method not requiring a Glib::MatchInfo.
 
bool match_all (Glib::UStringView string, int start_position, Glib::MatchInfo &match_info, MatchFlags match_options=static_cast< MatchFlags >(0))
 A match_all() method with a start position and a Glib::MatchInfo.
 
bool match_all (Glib::UStringView string, gssize string_len, int start_position, Glib::MatchInfo &match_info, MatchFlags match_options=static_cast< MatchFlags >(0))
 Using the standard algorithm for regular expression matching, only the longest match in the string is retrieved; it is not possible to obtain all the available matches.
 
bool match_all (Glib::UStringView string, int start_position, MatchFlags match_options)
 A match_all() method with a start position not requiring a Glib::MatchInfo.
 
bool match_all (Glib::UStringView string, gssize string_len, int start_position, MatchFlags match_options)
 A match_all() method with a start position and a string length not requiring a Glib::MatchInfo.
 
std::vector< Glib::ustring > split (Glib::UStringView string, MatchFlags match_options=static_cast< MatchFlags >(0))
 Breaks the string on the pattern, and returns an array of the tokens.
 
std::vector< Glib::ustring > split (const gchar *string, gssize string_len, int start_position, MatchFlags match_options=static_cast< MatchFlags >(0), int max_tokens=0) const
 Breaks the string on the pattern, and returns an array of the tokens.
 
std::vector< Glib::ustring > split (Glib::UStringView string, int start_position, MatchFlags match_options, int max_tokens) const
 
Glib::ustring replace (const gchar *string, gssize string_len, int start_position, Glib::UStringView replacement, MatchFlags match_options=static_cast< MatchFlags >(0))
 Replaces all occurrences of the pattern in regex with the replacement text.
 
Glib::ustring replace (Glib::UStringView string, int start_position, Glib::UStringView replacement, MatchFlags match_options)
 
Glib::ustring replace_literal (const gchar *string, gssize string_len, int start_position, Glib::UStringView replacement, MatchFlags match_options=static_cast< MatchFlags >(0))
 Replaces all occurrences of the pattern in regex with the replacement text.
 
Glib::ustring replace_literal (Glib::UStringView string, int start_position, Glib::UStringView replacement, MatchFlags match_options)
 
Glib::ustring replace_eval (Glib::UStringView string, gssize string_len, int start_position, MatchFlags match_options, GRegexEvalCallback eval, gpointer user_data)
 Replaces occurrences of the pattern in regex with the output of eval for that occurrence.
 

Static Public Member Functions

static Glib::RefPtr< Glib::Regex > create (Glib::UStringView pattern, CompileFlags compile_options=static_cast< CompileFlags >(0), MatchFlags match_options=static_cast< MatchFlags >(0))
 
static Glib::ustring escape_string (const Glib::ustring &string)
 
static bool match_simple (Glib::UStringView pattern, Glib::UStringView string, CompileFlags compile_options=static_cast< CompileFlags >(0), MatchFlags match_options=static_cast< MatchFlags >(0))
 Scans for a match in string for pattern.
 
static std::vector< Glib::ustring > split_simple (Glib::UStringView pattern, Glib::UStringView string, CompileFlags compile_options=static_cast< CompileFlags >(0), MatchFlags match_options=static_cast< MatchFlags >(0))
 Breaks the string on the pattern, and returns an array of the tokens.
 
static bool check_replacement (Glib::UStringView replacement, gboolean *has_references)
 Checks whether replacement is a valid replacement string (see g_regex_replace()), i.e. that all escape sequences in it are valid.
 

Protected Member Functions

void operator delete (void *, std::size_t)
 

Related Symbols

(Note that these are not member symbols.)

Glib::RefPtr< Glib::Regex > wrap (GRegex *object, bool take_copy=false)
 A Glib::wrap() method for this object.
 

Detailed Description

Perl-compatible regular expressions - matches strings against regular expressions.

The Glib::Regex functions implement regular expression pattern matching using syntax and semantics similar to Perl regular expression.

Some functions accept a start_position argument. Setting it differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion. For example, consider the pattern "\Biss\B", which finds occurrences of "iss" in the middle of words. ("\B" matches only if the current position in the subject is not a word boundary.) When applied to the string "Mississipi" from the fourth byte, namely "issipi", it does not match, because "\B" is always false at the start of the subject, which is deemed to be a word boundary. However, if the entire string is passed, but with start_position set to 4, it finds the second occurrence of "iss" because it is able to look behind the starting point to discover that it is preceded by a letter.
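The same effect can be reproduced outside GLib. The following is a minimal sketch that uses the C++ standard library's std::regex as a stand-in for Glib::Regex; its match_prev_avail flag plays a role comparable to start_position, in that it lets the engine inspect the character before the search start. The helper names search_shortened and search_from_offset are hypothetical and not part of glibmm.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Search a shortened copy of the subject. The engine cannot see what
// preceded the new start, so the start is treated as a word boundary.
bool search_shortened(const std::string& subject, std::size_t offset,
                      const std::regex& re)
{
  const std::string tail = subject.substr(offset);
  return std::regex_search(tail, re);
}

// Search the full subject from an offset and tell the engine that the
// character before the start position is available for inspection.
bool search_from_offset(const std::string& subject, std::size_t offset,
                        const std::regex& re)
{
  std::smatch m;
  return std::regex_search(
    subject.cbegin() + static_cast<std::ptrdiff_t>(offset), subject.cend(),
    m, re, std::regex_constants::match_prev_avail);
}
```

With the pattern "\Biss\B" and the subject "Mississipi" at offset 4, the first helper is expected to report no match and the second a match, mirroring the paragraph above.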

Note that, unless you set the Glib::Regex::CompileFlags::RAW flag, all the strings passed to these functions must be encoded in UTF-8. The lengths and the positions inside the strings are in bytes and not in characters, so, for instance, "\xc3\xa0" (i.e. "à") is two bytes long but it is treated as a single character. If you set Glib::Regex::CompileFlags::RAW, the strings can be non-valid UTF-8 strings and a byte is treated as a character, so "\xc3\xa0" is two bytes and two characters long.
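To make the byte-versus-character distinction concrete, here is a small sketch in plain C++ (no glibmm calls). The function name utf8_char_count is hypothetical; it assumes its input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count UTF-8 characters (code points) by skipping continuation bytes,
// which always have the bit pattern 10xxxxxx. Assumes valid UTF-8 input.
std::size_t utf8_char_count(const std::string& s)
{
  std::size_t count = 0;
  for (unsigned char byte : s)
  {
    if ((byte & 0xC0) != 0x80)
      ++count;
  }
  return count;
}
```

For "\xc3\xa0", std::string::size() reports 2 bytes while utf8_char_count() reports 1 character. Positions and lengths passed to the Regex functions are the byte values.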

When matching a pattern, "\n" matches only against a "\n" character in the string, and "\r" matches only a "\r" character. To match any newline sequence use "\R". This particular group matches either the two-character sequence CR + LF ("\r\n"), or one of the single characters LF (linefeed, U+000A, "\n"), VT (vertical tab, U+000B, "\v"), FF (formfeed, U+000C, "\f"), CR (carriage return, U+000D, "\r"), NEL (next line, U+0085), LS (line separator, U+2028), or PS (paragraph separator, U+2029).

The behaviour of the dot, circumflex, and dollar metacharacters is affected by newline characters. The default is to recognize any newline character (the same characters recognized by "\R"). This can be changed with the Glib::Regex::CompileFlags::NEWLINE_CR, NEWLINE_LF and NEWLINE_CRLF compile options, and with the Glib::Regex::MatchFlags::NEWLINE_ANY, NEWLINE_CR, NEWLINE_LF and NEWLINE_CRLF match options. These settings are also relevant when compiling a pattern if Glib::Regex::CompileFlags::EXTENDED is set and an unescaped "#" outside a character class is encountered. This indicates a comment that lasts until after the next newline.

Creating and manipulating the same Glib::Regex object from different threads is not a problem, as Glib::Regex does not modify its internal state between creation and destruction. Glib::MatchInfo, on the other hand, is not thread-safe.

The low-level regular expression functionality is provided by the excellent PCRE library written by Philip Hazel.

Since glibmm 2.14:

Member Enumeration Documentation

◆ CompileFlags

Enumerator
DEFAULT 

No special options set.

Since glibmm 2.74:
CASELESS 

Letters in the pattern match both upper- and lowercase letters.

This option can be changed within a pattern by a "(?i)" option setting.

MULTILINE 

By default, GRegex treats the strings as consisting of a single line of characters (even if it actually contains newlines).

The "start of line" metacharacter ("^") matches only at the start of the string, while the "end of line" metacharacter ("$") matches only at the end of the string, or before a terminating newline (unless Glib::Regex::CompileFlags::DOLLAR_ENDONLY is set). When Glib::Regex::CompileFlags::MULTILINE is set, the "start of line" and "end of line" constructs match immediately following or immediately before any newline in the string, respectively, as well as at the very start and end. This can be changed within a pattern by a "(?m)" option setting.

DOTALL 

A dot metacharacter (".") in the pattern matches all characters, including newlines.

Without it, newlines are excluded. This option can be changed within a pattern by a "(?s)" option setting.

EXTENDED 

Whitespace data characters in the pattern are totally ignored except when escaped or inside a character class.

Whitespace does not include the VT character (code 11). In addition, characters between an unescaped "#" outside a character class and the next newline character, inclusive, are also ignored. This can be changed within a pattern by a "(?x)" option setting.

ANCHORED 

The pattern is forced to be "anchored", that is, it is constrained to match only at the first matching point in the string that is being searched.

This effect can also be achieved by appropriate constructs in the pattern itself such as the "^" metacharacter.

DOLLAR_ENDONLY 

A dollar metacharacter ("$") in the pattern matches only at the end of the string.

Without this option, a dollar also matches immediately before the final character if it is a newline (but not before any other newlines). This option is ignored if Glib::Regex::CompileFlags::MULTILINE is set.

UNGREEDY 

Inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by "?".

It can also be set by a "(?U)" option setting within the pattern.

RAW 

Usually strings must be valid UTF-8 strings; with this flag they are treated as a raw sequence of bytes.

NO_AUTO_CAPTURE 

Disables the use of numbered capturing parentheses in the pattern.

Any opening parenthesis that is not followed by "?" behaves as if it were followed by "?:" but named parentheses can still be used for capturing (and they acquire numbers in the usual way).

OPTIMIZE 

Since 2.74 and the port to pcre2, requests JIT compilation, which, if the just-in-time compiler is available, further processes a compiled pattern into machine code that executes much faster.

However, it comes at the cost of extra processing before the match is performed, so it is most beneficial to use this when the same compiled pattern is used for matching many times. Before 2.74 this option used the built-in non-JIT optimizations in pcre1.

FIRSTLINE 

Limits an unanchored pattern to match before (or at) the first newline.

Since glibmm 2.34:
DUPNAMES 

Names used to identify capturing subpatterns need not be unique.

This can be helpful for certain types of pattern when it is known that only one instance of the named subpattern can ever be matched.

NEWLINE_CR 

Usually any newline character or character sequence is recognized.

Overrides the newline definition set when creating a new Regex, setting the '\r' character as line terminator.

If this option is set, the only recognized newline character is '\r'.

NEWLINE_LF 

Usually any newline character or character sequence is recognized.

Overrides the newline definition set when creating a new Regex, setting the '\n' character as line terminator.

If this option is set, the only recognized newline character is '\n'.

NEWLINE_CRLF 

Usually any newline character or character sequence is recognized.

Overrides the newline definition set when creating a new Regex, setting the '\r\n' characters sequence as line terminator.

If this option is set, the only recognized newline character sequence is '\r\n'.

NEWLINE_ANYCRLF 

Usually any newline character or character sequence is recognized.

Overrides the newline definition set when creating a new Regex; any '\r', '\n', or '\r\n' character sequence is recognized as a newline.

If this option is set, the only recognized newline character sequences are '\r', '\n', and '\r\n'.

Since glibmm 2.34:
BSR_ANYCRLF 

Usually any newline character or character sequence is recognised.

Overrides the newline definition for "\R" set when creating a new Regex; only '\r', '\n', or '\r\n' character sequences are recognized as a newline by "\R".

If this option is set, then "\R" only recognizes the newline characters '\r', '\n' and '\r\n'.

Since glibmm 2.34:
JAVASCRIPT_COMPAT 

Changes behaviour so that it is compatible with JavaScript rather than PCRE.

Since GLib 2.74 this is no longer supported, as libpcre2 does not support it.

Since glibmm 2.34:
Deprecated: 2.74.

◆ MatchFlags

Enumerator
DEFAULT 
ANCHORED 
NOTBOL 

Specifies that the first character of the string is not the beginning of a line, so the circumflex metacharacter should not match before it.

Setting this without Glib::Regex::CompileFlags::MULTILINE (at compile time) causes circumflex never to match. This option affects only the behaviour of the circumflex metacharacter; it does not affect "\A".

NOTEOL 

Specifies that the end of the subject string is not the end of a line, so the dollar metacharacter should not match it nor (except in multiline mode) a newline immediately before it.

Setting this without Glib::Regex::CompileFlags::MULTILINE (at compile time) causes dollar never to match. This option affects only the behaviour of the dollar metacharacter; it does not affect "\Z" or "\z".

NOTEMPTY 

An empty string is not considered to be a valid match if this option is set.

If there are alternatives in the pattern, they are tried. If all the alternatives match the empty string, the entire match fails. For example, if the pattern "a?b?" is applied to a string not beginning with "a" or "b", it matches the empty string at the start of the string. With this flag set, this match is not valid, so GRegex searches further into the string for occurrences of "a" or "b".
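The "a?b?" case can be tried with the C++ standard library, whose match_not_null flag is used here as a stand-in for NOTEMPTY. This is only a sketch with std::regex rather than Glib::Regex, and the FirstMatch struct and first_match function are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Result of a search: whether anything matched, where, and what text.
struct FirstMatch
{
  bool found;
  std::ptrdiff_t position;
  std::string text;
};

// Return the first match of `pattern` in `subject`, optionally treating
// an empty match as invalid so that the search continues further on.
FirstMatch first_match(const std::string& subject, const std::string& pattern,
                       bool reject_empty)
{
  const std::regex re(pattern);
  auto flags = std::regex_constants::match_default;
  if (reject_empty)
    flags |= std::regex_constants::match_not_null;

  std::smatch m;
  if (!std::regex_search(subject, m, re, flags))
    return {false, -1, ""};
  return {true, m.position(0), m.str(0)};
}
```

Called on the subject "xab", the default search is expected to stop at the empty match at position 0, whereas rejecting empty matches moves on to "ab" at position 1.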

PARTIAL 

Turns on the partial matching feature. For more documentation on partial matching, see g_match_info_is_partial_match().

NEWLINE_CR 
NEWLINE_LF 
NEWLINE_CRLF 
NEWLINE_ANY 

Overrides the newline definition set when creating a new Regex; any Unicode newline sequence is recognised as a newline.

These are '\r', '\n' and '\r\n', and the single characters U+000B LINE TABULATION, U+000C FORM FEED (FF), U+0085 NEXT LINE (NEL), U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR.

NEWLINE_ANYCRLF 
BSR_ANYCRLF 
BSR_ANY 

Overrides the newline definition for "\R" set when creating a new Regex; any Unicode newline character or character sequence is recognized as a newline by "\R".

These are '\r', '\n' and '\r\n', and the single characters U+000B LINE TABULATION, U+000C FORM FEED (FF), U+0085 NEXT LINE (NEL), U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR.

Since glibmm 2.34:
PARTIAL_SOFT 

An alias for Glib::Regex::MatchFlags::PARTIAL.

Since glibmm 2.34:
PARTIAL_HARD 

Turns on the partial matching feature.

In contrast to Glib::Regex::MatchFlags::PARTIAL_SOFT, this stops matching as soon as a partial match is found, without continuing to search for a possible complete match. See g_match_info_is_partial_match() for more information.

Since glibmm 2.34:
NOTEMPTY_ATSTART 

Like Glib::Regex::MatchFlags::NOTEMPTY, but only applied to the start of the matched string.

For anchored patterns this can only happen for patterns containing "\K".

Since glibmm 2.34:
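The numeric values listed for the newline enumerators overlap: NEWLINE_CRLF (0x300000) contains the bits of both NEWLINE_CR and NEWLINE_LF, and NEWLINE_ANYCRLF (0x500000) contains the bits of NEWLINE_ANY and NEWLINE_CR. The sketch below copies those values into plain constants and compares the newline selection as a whole. It assumes that the newline options form one multi-bit field, which the values suggest but this page does not state; NEWLINE_MASK and newline_field are hypothetical names, not glibmm symbols.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the MatchFlags listing on this page.
constexpr std::uint32_t NEWLINE_CR      = 1u << 20;
constexpr std::uint32_t NEWLINE_LF      = 1u << 21;
constexpr std::uint32_t NEWLINE_CRLF    = 0x300000;
constexpr std::uint32_t NEWLINE_ANY     = 1u << 22;
constexpr std::uint32_t NEWLINE_ANYCRLF = 0x500000;
constexpr std::uint32_t PARTIAL         = 1u << 15;
constexpr std::uint32_t PARTIAL_SOFT    = 0x8000;

// Hypothetical mask covering every bit used by the newline values above.
constexpr std::uint32_t NEWLINE_MASK = NEWLINE_CR | NEWLINE_LF | NEWLINE_ANY;

// Extract the newline selection from a combined flag word (assumption:
// the newline options are one multi-bit field, not independent bits).
constexpr std::uint32_t newline_field(std::uint32_t flags)
{
  return flags & NEWLINE_MASK;
}

static_assert(PARTIAL_SOFT == PARTIAL, "PARTIAL_SOFT has the same value as PARTIAL");
static_assert(NEWLINE_CRLF == (NEWLINE_CR | NEWLINE_LF), "CRLF shares bits with CR and LF");
static_assert(NEWLINE_ANYCRLF == (NEWLINE_ANY | NEWLINE_CR), "ANYCRLF shares bits with ANY and CR");
```

Under that assumption, testing a single bit cannot distinguish NEWLINE_CR from NEWLINE_CRLF, so the comparison is made on the masked field, for example newline_field(flags) == NEWLINE_CRLF.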

Constructor & Destructor Documentation

◆ Regex() [1/2]

Glib::Regex::Regex ( )
delete

◆ Regex() [2/2]

Glib::Regex::Regex ( const Regex & )
delete

Member Function Documentation

◆ check_replacement()

static bool Glib::Regex::check_replacement ( Glib::UStringView  replacement,
gboolean *  has_references 
)
static

Checks whether replacement is a valid replacement string (see g_regex_replace()), i.e. that all escape sequences in it are valid.

If has_references is not nullptr then replacement is checked for pattern references. For instance, the replacement text 'foo\n' does not contain references and may be evaluated without information about the actual match, but '\0\1' (whole match followed by first subpattern) requires a valid MatchInfo object.

Since glibmm 2.14:
Parameters
replacement: The replacement string.
has_references: Location to store information about references in replacement or nullptr.
Returns
Whether replacement is a valid replacement string.
Exceptions
Glib::RegexError

◆ create()

static Glib::RefPtr< Glib::Regex > Glib::Regex::create ( Glib::UStringView  pattern,
CompileFlags  compile_options = static_cast< CompileFlags >(0),
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)
static
Exceptions
Glib::RegexError

◆ escape_string()

static Glib::ustring Glib::Regex::escape_string ( const Glib::ustring &  string)
static

◆ get_capture_count()

int Glib::Regex::get_capture_count ( ) const

Returns the number of capturing subpatterns in the pattern.

Since glibmm 2.14:
Returns
The number of capturing subpatterns.

◆ get_compile_flags()

CompileFlags Glib::Regex::get_compile_flags ( ) const

Returns the compile options that regex was created with.

Depending on the version of PCRE that is used, this may or may not include flags set by option expressions such as (?i) found at the top-level within the compiled pattern.

Since glibmm 2.26:
Returns
Flags from Glib::Regex::CompileFlags.

◆ get_has_cr_or_lf()

bool Glib::Regex::get_has_cr_or_lf ( ) const

Checks whether the pattern contains explicit CR or LF references.

Since glibmm 2.34:
Returns
true if the pattern contains explicit CR or LF references.

◆ get_match_flags()

MatchFlags Glib::Regex::get_match_flags ( ) const

Returns the match options that regex was created with.

Since glibmm 2.26:
Returns
Flags from Glib::Regex::MatchFlags.

◆ get_max_backref()

int Glib::Regex::get_max_backref ( ) const

Returns the number of the highest back reference in the pattern, or 0 if the pattern does not contain back references.

Since glibmm 2.14:
Returns
The number of the highest back reference.

◆ get_max_lookbehind()

int Glib::Regex::get_max_lookbehind ( ) const

Gets the number of characters in the longest lookbehind assertion in the pattern.

This information is useful when doing multi-segment matching using the partial matching facilities.

Since glibmm 2.38:
Returns
The number of characters in the longest lookbehind assertion.

◆ get_pattern()

Glib::ustring Glib::Regex::get_pattern ( ) const

Gets the pattern string associated with regex, i.e. a copy of the string passed to g_regex_new().

Since glibmm 2.14:
Returns
The pattern of regex.

◆ get_string_number()

int Glib::Regex::get_string_number ( Glib::UStringView  name) const

Retrieves the number of the subexpression named name.

Since glibmm 2.14:
Parameters
name: Name of the subexpression.
Returns
The number of the subexpression or -1 if name does not exist.

◆ gobj() [1/2]

GRegex * Glib::Regex::gobj ( )

Provides access to the underlying C instance.

◆ gobj() [2/2]

const GRegex * Glib::Regex::gobj ( ) const

Provides access to the underlying C instance.

◆ gobj_copy()

GRegex * Glib::Regex::gobj_copy ( ) const

Provides access to the underlying C instance. The caller is responsible for unrefing it. Use when directly setting fields in structs.

◆ match() [1/6]

bool Glib::Regex::match ( Glib::UStringView  string,
Glib::MatchInfo &  match_info,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Scans for a match in string for the pattern in regex.

The match_options are combined with the match options specified when the regex structure was created, letting you have more flexibility in reusing Regex structures.

Unless Glib::Regex::CompileFlags::RAW is specified in the options, string must be valid UTF-8.

A MatchInfo structure, used to get information on the match, is stored in match_info if not nullptr. Note that if match_info is not nullptr then it is created even if the function returns false, i.e. you must free it regardless of whether the regular expression actually matched.

To retrieve all the non-overlapping matches of the pattern in string you can use g_match_info_next().

[C example ellipted]

string is not copied and is used in MatchInfo internally. If you use any MatchInfo method (except g_match_info_free()) after freeing or modifying string then the behaviour is undefined.

Since glibmm 2.14:
Parameters
string: The string to scan for matches.
match_options: Match options.
match_info: Pointer to location where to store the MatchInfo, or nullptr if you do not need it.
Returns
true if the string matched, false otherwise.

◆ match() [2/6]

bool Glib::Regex::match ( Glib::UStringView  string,
gssize  string_len,
int  start_position,
Glib::MatchInfo &  match_info,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Scans for a match in string for the pattern in regex.

The match_options are combined with the match options specified when the regex structure was created, letting you have more flexibility in reusing Regex structures.

Setting start_position differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion, such as "\b".

Unless Glib::Regex::CompileFlags::RAW is specified in the options, string must be valid UTF-8.

A MatchInfo structure, used to get information on the match, is stored in match_info if not nullptr. Note that if match_info is not nullptr then it is created even if the function returns false, i.e. you must free it regardless of whether the regular expression actually matched.

string is not copied and is used in MatchInfo internally. If you use any MatchInfo method (except g_match_info_free()) after freeing or modifying string then the behaviour is undefined.

To retrieve all the non-overlapping matches of the pattern in string you can use g_match_info_next().

[C example ellipted]

Since glibmm 2.14:
Parameters
string: The string to scan for matches.
string_len: The length of string, in bytes, or -1 if string is nul-terminated.
start_position: Starting index of the string to match, in bytes.
match_options: Match options.
match_info: Pointer to location where to store the MatchInfo, or nullptr if you do not need it.
Returns
true if the string matched, false otherwise.
Exceptions
Glib::RegexError

◆ match() [3/6]

bool Glib::Regex::match ( Glib::UStringView  string,
gssize  string_len,
int  start_position,
MatchFlags  match_options 
)

A match() method with a string length and start position not requiring a Glib::MatchInfo.

◆ match() [4/6]

bool Glib::Regex::match ( Glib::UStringView  string,
int  start_position,
Glib::MatchInfo &  match_info,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

A match() method with a start position and a Glib::MatchInfo.

Exceptions
Glib::RegexError

◆ match() [5/6]

bool Glib::Regex::match ( Glib::UStringView  string,
int  start_position,
MatchFlags  match_options 
)

A match() method with a start position not requiring a Glib::MatchInfo.

Exceptions
Glib::RegexError

◆ match() [6/6]

bool Glib::Regex::match ( Glib::UStringView  string,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

A match() method not requiring a Glib::MatchInfo.

◆ match_all() [1/6]

bool Glib::Regex::match_all ( Glib::UStringView  string,
Glib::MatchInfo &  match_info,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Using the standard algorithm for regular expression matching only the longest match in the string is retrieved.

This function uses a different algorithm so it can retrieve all the possible matches. For more documentation see g_regex_match_all_full().

A MatchInfo structure, used to get information on the match, is stored in match_info if not nullptr. Note that if match_info is not nullptr then it is created even if the function returns false, i.e. you must free it regardless of whether the regular expression actually matched.

string is not copied and is used in MatchInfo internally. If you use any MatchInfo method (except g_match_info_free()) after freeing or modifying string then the behaviour is undefined.

Since glibmm 2.14:
Parameters
string: The string to scan for matches.
match_options: Match options.
match_info: Pointer to location where to store the MatchInfo, or nullptr if you do not need it.
Returns
true if the string matched, false otherwise.

◆ match_all() [2/6]

bool Glib::Regex::match_all ( Glib::UStringView  string,
gssize  string_len,
int  start_position,
Glib::MatchInfo &  match_info,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Using the standard algorithm for regular expression matching, only the longest match in the string is retrieved; it is not possible to obtain all the available matches.

For instance, matching "<a> <b> <c>" against the pattern "<.*>" you get "<a> <b> <c>".

This function uses a different algorithm (called DFA, i.e. deterministic finite automaton), so it can retrieve all the possible matches, all starting at the same point in the string. For instance, matching "<a> <b> <c>" against the pattern "<.*>" you would obtain three matches: "<a> <b> <c>", "<a> <b>" and "<a>".

The number of matched strings is retrieved using g_match_info_get_match_count(). To obtain the matched strings and their position you can use, respectively, g_match_info_fetch() and g_match_info_fetch_pos(). Note that the strings are returned in reverse order of length; that is, the longest matching string is given first.

Note that the DFA algorithm is slower than the standard one and it is not able to capture substrings, so backreferences do not work.
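To illustrate what "all the possible matches starting at the same point" means, the sketch below enumerates them by brute force with the C++ standard library. It is not the DFA algorithm and does not call Glib::Regex; it simply tests every prefix of the subject against the whole pattern, longest first. The function name all_matches_at_start is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Brute-force emulation, NOT the DFA algorithm: collect every non-empty
// prefix of `subject` that the whole pattern matches, longest first.
std::vector<std::string> all_matches_at_start(const std::string& subject,
                                              const std::regex& re)
{
  std::vector<std::string> matches;
  for (std::size_t len = subject.size(); len > 0; --len)
  {
    const std::string candidate = subject.substr(0, len);
    if (std::regex_match(candidate, re))
      matches.push_back(candidate);
  }
  return matches;
}
```

For the subject "<a> <b> <c>" and the pattern "<.*>" this yields "<a> <b> <c>", "<a> <b>" and "<a>", in the same longest-first order described above.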

Setting start_position differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion, such as "\b".

Unless Glib::Regex::CompileFlags::RAW is specified in the options, string must be valid UTF-8.

A MatchInfo structure, used to get information on the match, is stored in match_info if not nullptr. Note that if match_info is not nullptr then it is created even if the function returns false, i.e. you must free it regardless of whether the regular expression actually matched.

string is not copied and is used in MatchInfo internally. If you use any MatchInfo method (except g_match_info_free()) after freeing or modifying string then the behaviour is undefined.

Since glibmm 2.14:
Parameters
string: The string to scan for matches.
string_len: The length of string, in bytes, or -1 if string is nul-terminated.
start_position: Starting index of the string to match, in bytes.
match_options: Match options.
match_info: Pointer to location where to store the MatchInfo, or nullptr if you do not need it.
Returns
true if the string matched, false otherwise.
Exceptions
Glib::RegexError

◆ match_all() [3/6]

bool Glib::Regex::match_all ( Glib::UStringView  string,
gssize  string_len,
int  start_position,
MatchFlags  match_options 
)

A match_all() method with a start position and a string length not requiring a Glib::MatchInfo.

Exceptions
Glib::RegexError

◆ match_all() [4/6]

bool Glib::Regex::match_all ( Glib::UStringView  string,
int  start_position,
Glib::MatchInfo &  match_info,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

A match_all() method with a start position and a Glib::MatchInfo.

Exceptions
Glib::RegexError

◆ match_all() [5/6]

bool Glib::Regex::match_all ( Glib::UStringView  string,
int  start_position,
MatchFlags  match_options 
)

A match_all() method with a start position not requiring a Glib::MatchInfo.

Exceptions
Glib::RegexError

◆ match_all() [6/6]

bool Glib::Regex::match_all ( Glib::UStringView  string,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

A match_all() method not requiring a Glib::MatchInfo.

◆ match_simple()

static bool Glib::Regex::match_simple ( Glib::UStringView  pattern,
Glib::UStringView  string,
CompileFlags  compile_options = static_cast< CompileFlags >(0),
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)
static

Scans for a match in string for pattern.

This function is equivalent to g_regex_match() but it does not require compiling the pattern with g_regex_new(), saving some lines of code when you just need to do a match without extracting substrings, capture counts, and so on.

If this function is to be called on the same pattern more than once, it's more efficient to compile the pattern once with g_regex_new() and then use g_regex_match().

Since glibmm 2.14:
Parameters
pattern  The regular expression.
string  The string to scan for matches.
compile_options  Compile options for the regular expression, or 0.
match_options  Match options, or 0.
Returns
true if the string matched, false otherwise.

◆ operator delete()

void Glib::Regex::operator delete ( void * ,
std::size_t   
)
protected

◆ operator=()

Regex & Glib::Regex::operator= ( const Regex & )
delete

◆ reference()

void Glib::Regex::reference ( ) const

Increment the reference count for this object.

You should never need to do this manually - use the object via a RefPtr instead.

◆ replace() [1/2]

Glib::ustring Glib::Regex::replace ( const gchar *  string,
gssize  string_len,
int  start_position,
Glib::UStringView  replacement,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Replaces all occurrences of the pattern in regex with the replacement text.

Backreferences of the form '\number' or '\g<number>' in the replacement text are interpolated by the number-th captured subexpression of the match; '\g<name>' refers to the captured subexpression with the given name. '\0' refers to the complete match, but '\0' followed by a number is the octal representation of a character. To include a literal '\' in the replacement, write '\\\\' (as it appears in a C or C++ string literal).
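
The interpolation rules can be illustrated with a small standard-library model. This is a simplified sketch, not GLib's implementation: expand_backreferences and its captures vector are hypothetical, and only single-digit '\number', '\g<number>' and the escaped backslash are covered. Named groups, multi-digit and octal forms, and all other escapes are deliberately rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified model (hypothetical, not GLib's code): expands "\N" (one digit),
// "\g<N>" and "\\" in `replacement`. captures[0] is the complete match,
// captures[N] the N-th captured subexpression.
std::string expand_backreferences(const std::string& replacement,
                                  const std::vector<std::string>& captures)
{
  std::string out;
  for (std::size_t i = 0; i < replacement.size(); ++i)
  {
    if (replacement[i] != '\\')
    {
      out += replacement[i];  // ordinary character, copied as-is
      continue;
    }
    if (++i == replacement.size())
      throw std::invalid_argument("stray backslash at end of replacement");

    const char c = replacement[i];
    if (c == '\\')
    {
      out += '\\';  // escaped backslash
    }
    else if (std::isdigit(static_cast<unsigned char>(c)))
    {
      const bool digit_follows = i + 1 < replacement.size() &&
        std::isdigit(static_cast<unsigned char>(replacement[i + 1]));
      if (digit_follows)
        throw std::invalid_argument("multi-digit and octal forms are not modeled");
      out += captures.at(static_cast<std::size_t>(c - '0'));  // "\N"
    }
    else if (c == 'g' && i + 1 < replacement.size() && replacement[i + 1] == '<')
    {
      const std::size_t close = replacement.find('>', i + 2);
      if (close == std::string::npos)
        throw std::invalid_argument("unterminated \\g<...>");
      // std::stoul throws for names, which this sketch does not support.
      const std::size_t group = std::stoul(replacement.substr(i + 2, close - i - 2));
      out += captures.at(group);  // "\g<N>"
      i = close;
    }
    else
    {
      throw std::invalid_argument("escape not covered by this sketch");
    }
  }
  return out;
}
```

With captures {"John Smith", "John", "Smith"}, the replacement text `\2, \1` expands to "Smith, John", and `<\0>` expands to "<John Smith>".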

There are also escapes that change the case of the following text:

  • \l: Convert the next character to lower case
  • \u: Convert the next character to upper case
  • \L: Convert to lower case until \E
  • \U: Convert to upper case until \E
  • \E: End case modification

If you do not need backreferences, use g_regex_replace_literal().

The replacement string must be UTF-8 encoded even if Glib::Regex::CompileFlags::RAW was passed to g_regex_new(). To use strings that are not UTF-8 encoded, use g_regex_replace_literal().

Setting start_position differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion, such as "\\b".
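
The difference can be demonstrated with std::regex as a stand-in, since it needs no extra dependencies. This is not the glibmm API: the function names are hypothetical, and std::regex_constants::match_prev_avail and match_not_bol play roughly the roles that start_position and NOTBOL play here. The first function searches from an offset while letting the engine look at the preceding character; the second searches a shortened copy that carries no such context.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Stand-in using std::regex, NOT the glibmm API.
// Searches `text` from byte `start`, telling the engine that the character
// before `start` exists and may be inspected by assertions such as \b.
bool search_from_offset(const std::string& text, std::size_t start, const std::regex& re)
{
  const auto flags = start > 0 ? std::regex_constants::match_prev_avail
                               : std::regex_constants::match_default;
  return std::regex_search(text.cbegin() + static_cast<std::ptrdiff_t>(start),
                           text.cend(), re, flags);
}

// Searches a shortened copy of `text`; the engine cannot know what preceded
// it, even though it is told that the copy does not start a line.
bool search_in_suffix(const std::string& text, std::size_t start, const std::regex& re)
{
  const std::string suffix = text.substr(start);
  return std::regex_search(suffix, re, std::regex_constants::match_not_bol);
}
```

For the pattern `\bcat` and the text "concat" with an offset of 3, search_from_offset() finds nothing because the "n" before "cat" means there is no word boundary, whereas search_in_suffix() reports a match because the shortened string appears to begin at a word boundary.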

Since glibmm 2.14:
Parameters
string  The string to perform matches against.
string_len  The length of string, in bytes, or -1 if string is nul-terminated.
start_position  Starting index of the string to match, in bytes.
replacement  Text to replace each match with.
match_options  Options for the match.
Returns
A newly allocated string containing the replacements.
Exceptions
Glib::RegexError

◆ replace() [2/2]

Glib::ustring Glib::Regex::replace ( Glib::UStringView  string,
int  start_position,
Glib::UStringView  replacement,
MatchFlags  match_options 
)
Exceptions
Glib::RegexError

◆ replace_eval()

Glib::ustring Glib::Regex::replace_eval ( Glib::UStringView  string,
gssize  string_len,
int  start_position,
MatchFlags  match_options,
GRegexEvalCallback  eval,
gpointer  user_data 
)

Replaces occurrences of the pattern in regex with the output of eval for that occurrence.

Setting start_position differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion, such as "\\b".

The following example uses g_regex_replace_eval() to replace multiple strings at once:

[C example elided]
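
The general shape of callback-driven replacement can be sketched with std::regex as a stand-in. This is not the glibmm API and does not reproduce the C example referred to above: replace_with_callback and spell_out_digits are hypothetical names, and the callback here simply returns the replacement text rather than following the GRegexEvalCallback signature.

```cpp
#include <functional>
#include <map>
#include <regex>
#include <string>

// Stand-in using std::regex, NOT the glibmm API: every match is handed to
// `eval`, and the string it returns takes the place of that match.
std::string replace_with_callback(
  const std::string& text, const std::regex& pattern,
  const std::function<std::string(const std::smatch&)>& eval)
{
  std::string result;
  auto copied_up_to = text.cbegin();
  for (std::sregex_iterator it(text.cbegin(), text.cend(), pattern), end; it != end; ++it)
  {
    const std::smatch& match = *it;
    result.append(copied_up_to, match[0].first);  // text between matches
    result += eval(match);                        // callback output for this match
    copied_up_to = match[0].second;
  }
  result.append(copied_up_to, text.cend());       // text after the last match
  return result;
}

// Replaces several different strings in one pass by looking each match up
// in a table (the table contents are illustrative).
std::string spell_out_digits(const std::string& text)
{
  static const std::map<std::string, std::string> words = {
    {"1", "ONE"}, {"2", "TWO"}, {"3", "THREE"}};
  static const std::regex digit("[123]");
  return replace_with_callback(text, digit,
    [](const std::smatch& m) { return words.at(m.str()); });
}
```

Calling spell_out_digits("1 and 2, then 3") gives "ONE and TWO, then THREE"; text without any match is returned unchanged.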

Since glibmm 2.14:
Parameters
string  String to perform matches against.
string_len  The length of string, in bytes, or -1 if string is nul-terminated.
start_position  Starting index of the string to match, in bytes.
match_options  Options for the match.
eval  A function to call for each match.
user_data  User data to pass to the function.
Returns
A newly allocated string containing the replacements.
Exceptions
Glib::RegexError

◆ replace_literal() [1/2]

Glib::ustring Glib::Regex::replace_literal ( const gchar *  string,
gssize  string_len,
int  start_position,
Glib::UStringView  replacement,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Replaces all occurrences of the pattern in regex with the replacement text.

replacement is inserted literally; to include backreferences, use g_regex_replace().

Setting start_position differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion, such as "\\b".

Since glibmm 2.14:
Parameters
string  The string to perform matches against.
string_len  The length of string, in bytes, or -1 if string is nul-terminated.
start_position  Starting index of the string to match, in bytes.
replacement  Text to replace each match with.
match_options  Options for the match.
Returns
A newly allocated string containing the replacements.
Exceptions
Glib::RegexError

◆ replace_literal() [2/2]

Glib::ustring Glib::Regex::replace_literal ( Glib::UStringView  string,
int  start_position,
Glib::UStringView  replacement,
MatchFlags  match_options 
)
Exceptions
Glib::RegexError

◆ split() [1/3]

std::vector< Glib::ustring > Glib::Regex::split ( const gchar *  string,
gssize  string_len,
int  start_position,
MatchFlags  match_options = static_cast< MatchFlags >(0),
int  max_tokens = 0 
) const

Breaks the string on the pattern, and returns an array of the tokens.

If the pattern contains capturing parentheses, then the text for each of the substrings will also be returned. If the pattern does not match anywhere in the string, then the whole string is returned as the first token.

As a special case, the result of splitting the empty string "" is an empty vector, not a vector containing a single string. The reason for this special case is that being able to represent an empty vector is typically more useful than consistent handling of empty elements. If you do need to represent empty elements, you'll need to check for the empty string before calling this function.
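
The rules above (captured separators are returned as tokens too, a string without any match comes back whole, and an empty input yields an empty vector) can be modeled with the standard library. A minimal sketch, using std::regex as a stand-in rather than glibmm; the function name and the pattern "([-+])" are illustrative, and nothing is claimed about how std::regex treats patterns that can match the empty string.

```cpp
#include <regex>
#include <string>
#include <vector>

// Stand-in using std::regex, NOT the glibmm API: splits on '-' or '+' and,
// because the separator is inside capturing parentheses, returns it as well.
std::vector<std::string> split_keeping_separators(const std::string& text)
{
  if (text.empty())
    return {};  // special case: empty input gives an empty vector

  static const std::regex separator("([-+])");
  std::vector<std::string> tokens;
  // -1 selects the text between matches, 1 selects the first capture group.
  std::sregex_token_iterator it(text.begin(), text.end(), separator, {-1, 1});
  const std::sregex_token_iterator end;
  for (; it != end; ++it)
    tokens.push_back(it->str());
  return tokens;
}
```

Splitting "a-b+c" this way gives "a", "-", "b", "+", "c", while "abc" comes back as the single token "abc".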

A pattern that can match empty strings splits string into separate characters wherever it matches the empty string between characters. For example, splitting "ab c" with the separator "\\s*" yields "a", "b" and "c".

Setting start_position differs from just passing over a shortened string and setting Glib::Regex::MatchFlags::NOTBOL in the case of a pattern that begins with any kind of lookbehind assertion, such as "\\b".

Since glibmm 2.14:
Parameters
string  The string to split with the pattern.
string_len  The length of string, in bytes, or -1 if string is nul-terminated.
start_position  Starting index of the string to match, in bytes.
match_options  Match-time option flags.
max_tokens  The maximum number of tokens to split string into. If this is less than 1, the string is split completely.
Returns
A vector of Glib::ustring tokens.
Exceptions
Glib::RegexError

◆ split() [2/3]

std::vector< Glib::ustring > Glib::Regex::split ( Glib::UStringView  string,
int  start_position,
MatchFlags  match_options,
int  max_tokens 
) const
Exceptions
Glib::RegexError

◆ split() [3/3]

std::vector< Glib::ustring > Glib::Regex::split ( Glib::UStringView  string,
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)

Breaks the string on the pattern, and returns an array of the tokens.

If the pattern contains capturing parentheses, then the text for each of the substrings will also be returned. If the pattern does not match anywhere in the string, then the whole string is returned as the first token.

As a special case, the result of splitting the empty string "" is an empty vector, not a vector containing a single string. The reason for this special case is that being able to represent an empty vector is typically more useful than consistent handling of empty elements. If you do need to represent empty elements, you'll need to check for the empty string before calling this function.

A pattern that can match empty strings splits string into separate characters wherever it matches the empty string between characters. For example, splitting "ab c" with the separator "\\s*" yields "a", "b" and "c".

Since glibmm 2.14:
Parameters
string  The string to split with the pattern.
match_options  Match-time option flags.
Returns
A vector of Glib::ustring tokens.

◆ split_simple()

static std::vector< Glib::ustring > Glib::Regex::split_simple ( Glib::UStringView  pattern,
Glib::UStringView  string,
CompileFlags  compile_options = static_cast< CompileFlags >(0),
MatchFlags  match_options = static_cast< MatchFlags >(0) 
)
static

Breaks the string on the pattern, and returns an array of the tokens.

If the pattern contains capturing parentheses, then the text for each of the substrings will also be returned. If the pattern does not match anywhere in the string, then the whole string is returned as the first token.

This function is equivalent to g_regex_split(), but it does not require compiling the pattern with g_regex_new() first. This saves a few lines of code when you only need to split a string, without extracting substrings, capture counts, and so on.

If this function is to be called on the same pattern more than once, it's more efficient to compile the pattern once with g_regex_new() and then use g_regex_split().

As a special case, the result of splitting the empty string "" is an empty vector, not a vector containing a single string. The reason for this special case is that being able to represent an empty vector is typically more useful than consistent handling of empty elements. If you do need to represent empty elements, you'll need to check for the empty string before calling this function.

A pattern that can match empty strings splits string into separate characters wherever it matches the empty string between characters. For example, splitting "ab c" with the separator "\\s*" yields "a", "b" and "c".

Since glibmm 2.14:
Parameters
pattern  The regular expression.
string  The string to scan for matches.
compile_options  Compile options for the regular expression, or 0.
match_options  Match options, or 0.
Returns
A vector of Glib::ustring tokens.

◆ unreference()

void Glib::Regex::unreference ( ) const

Decrement the reference count for this object.

You should never need to do this manually - use the object via a RefPtr instead.

Friends And Related Symbol Documentation

◆ wrap()

Glib::RefPtr< Glib::Regex > wrap ( GRegex *  object,
bool  take_copy = false 
)
related

A Glib::wrap() method for this object.

Parameters
object  The C instance.
take_copy  False if the result should take ownership of the C instance. True if it should take a new copy or ref.
Returns
A C++ instance that wraps this C instance.