Live Active security incident? Get immediate response
CWE Reference

CWE-185: Incorrect Regular Expression

Official CWE-185 CWE context with Glexia analysis, remediation guidance, related CVEs, and ATT&CK context.

Release 4.20weaknessDraft

Glexia's Take

CWE-185: Incorrect Regular Expression

Incorrect Regular Expression represents a recurring weakness pattern that can create exploitable paths when design, validation, or implementation controls are missing.

Executive Impact

  • Other: Unexpected State,Varies by Context: When the regular expression is not correctly specified, data might have a different format or type than the rest of the program expects, producing resultant weaknesses or errors.
  • Access Control: Bypass Protection Mechanism: In PHP, regular expression checks can sometimes be bypassed with a null byte, leading to any number of weaknesses.

Developer Pattern

CWE-185 is the kind of defect developers can usually prevent with explicit validation, safer framework defaults, and tests that exercise hostile input or unsafe state transitions.

Confidence

high confidence from CWE-185, 4.20.

Official CWE Definition

CWE-185: Incorrect Regular Expression

The product specifies a regular expression in a way that causes data to be improperly matched or compared.

When the regular expression is used in protection mechanisms such as filtering or validation, this may allow an attacker to bypass the intended restrictions on the incoming data.

Type
weakness
Abstraction
Class
Status
Draft
Source
MITRE CWE definition

Developer And Remediation Guidance

How teams prevent and detect this weakness

Causes

  • The following code takes phone numbers as input, and uses a regular expression to reject invalid phone numbers. An attacker could provide an argument such as: "; ls -l ; echo 123-456" This would pass the check, since "123-456" is sufficient to match the "\d+-\d+" portion of the regular expression.
  • This code uses a regular expression to validate an IP string prior to using it in a call to the "ping" command. Since the regular expression does not have anchors (CWE-777), i.e. is unbounded without ^ or $ characters, then prepending a 0 or 0x to the beginning of the IP address will still result in a matched regex pattern. Since the ping command supports octal and hex prepended IP addresses, it will use the unexpectedly valid IP address (CWE-1389). For example, "0x63.63.63.63" would be considered equivalent to "99.63.63.63". As a result, the attacker could potentially ping systems that the attacker cannot reach directly.

Remediation

  • Implementation: Regular expressions can become error prone when defining a complex language even for those experienced in writing grammars. Determine if several smaller regular expressions simplify one large regular expression. Also, subject the regular expression to thorough testing techniques such as equivalence partitioning, boundary value analysis, and robustness. After testing and a reasonable confidence level is achieved, a regular expression may not be foolproof. If an exploit is allowed to slip through, then record the exploit and refactor the regular expression.

Detection

  • Automated Static Analysis: Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially-vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.)

Mappings

Related CVEs, CWEs, and ATT&CK context

Related CWEs

Related CVEs

Related CVE mappings appear after CVE records are cross-indexed.

Open CWE CVE mapping

ATT&CK Relevance

ATT&CK relevance is shown only when reviewed or responsibly inferred.