CWE-185: Incorrect Regular Expression
Official CWE-185 CWE context with Glexia analysis, remediation guidance, related CVEs, and ATT&CK context.
Glexia's Take
CWE-185: Incorrect Regular Expression
Incorrect Regular Expression represents a recurring weakness pattern that can create exploitable paths when design, validation, or implementation controls are missing.
Executive Impact
- Other: Unexpected State,Varies by Context: When the regular expression is not correctly specified, data might have a different format or type than the rest of the program expects, producing resultant weaknesses or errors.
- Access Control: Bypass Protection Mechanism: In PHP, regular expression checks can sometimes be bypassed with a null byte, leading to any number of weaknesses.
Developer Pattern
CWE-185 is the kind of defect developers can usually prevent with explicit validation, safer framework defaults, and tests that exercise hostile input or unsafe state transitions.
Confidence
high confidence from CWE-185, 4.20.
Official CWE Definition
CWE-185: Incorrect Regular Expression
The product specifies a regular expression in a way that causes data to be improperly matched or compared.
When the regular expression is used in protection mechanisms such as filtering or validation, this may allow an attacker to bypass the intended restrictions on the incoming data.
Developer And Remediation Guidance
How teams prevent and detect this weakness
Causes
- The following code takes phone numbers as input, and uses a regular expression to reject invalid phone numbers. An attacker could provide an argument such as: "; ls -l ; echo 123-456" This would pass the check, since "123-456" is sufficient to match the "\d+-\d+" portion of the regular expression.
- This code uses a regular expression to validate an IP string prior to using it in a call to the "ping" command. Since the regular expression does not have anchors (CWE-777), i.e. is unbounded without ^ or $ characters, then prepending a 0 or 0x to the beginning of the IP address will still result in a matched regex pattern. Since the ping command supports octal and hex prepended IP addresses, it will use the unexpectedly valid IP address (CWE-1389). For example, "0x63.63.63.63" would be considered equivalent to "99.63.63.63". As a result, the attacker could potentially ping systems that the attacker cannot reach directly.
Remediation
- Implementation: Regular expressions can become error prone when defining a complex language even for those experienced in writing grammars. Determine if several smaller regular expressions simplify one large regular expression. Also, subject the regular expression to thorough testing techniques such as equivalence partitioning, boundary value analysis, and robustness. After testing and a reasonable confidence level is achieved, a regular expression may not be foolproof. If an exploit is allowed to slip through, then record the exploit and refactor the regular expression.
Detection
- Automated Static Analysis: Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially-vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.)
Mappings
Related CVEs, CWEs, and ATT&CK context
Related CWEs
ATT&CK Relevance
ATT&CK relevance is shown only when reviewed or responsibly inferred.