CWE-94: Improper Control of Generation of Code ('Code Injection')
Official CWE-94 CWE context with Glexia analysis, remediation guidance, related CVEs, and ATT&CK context.
Glexia's Take
CWE-94: Code Injection
Improper Control of Generation of Code ('Code Injection') represents a recurring weakness pattern that can create exploitable paths when design, validation, or implementation controls are missing.
Executive Impact
- Access Control: Bypass Protection Mechanism: In some cases, injectable code controls authentication; this may lead to a remote vulnerability.
- Access Control: Gain Privileges or Assume Identity: Injected code can access resources that the attacker is directly prevented from accessing.
- Integrity,Confidentiality,Availability: Execute Unauthorized Code or Commands: When a product allows a user's input to contain code syntax, it might be possible for an attacker to craft the code in such a way that it will alter the intended control flow of the product. As a result, code injection can often result in the execution of arbitrary code. Code injection attacks can also lead to loss of data integrity in nearly all cases, since the control-plane data injected is always incidental to data recall or writing.
- Non-Repudiation: Hide Activities: Often the actions performed by injected control code are unlogged.
Developer Pattern
CWE-94 is the kind of defect developers can usually prevent with explicit validation, safer framework defaults, and tests that exercise hostile input or unsafe state transitions.
Confidence
high confidence from CWE-94, 4.20.
Official CWE Definition
CWE-94: Improper Control of Generation of Code ('Code Injection')
The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.
Developer And Remediation Guidance
How teams prevent and detect this weakness
Causes
- This example attempts to write user messages to a message file and allow users to view them. While the programmer intends for the MessageFile to only include data, an attacker can provide a message such as:,which will decode to the following:,The programmer thought they were just including the contents of a regular data file, but PHP parsed it and executed the code. Now, this code is executed any time people view messages.,Notice that XSS (CWE-79) is also possible in this situation.
- edit-config.pl: This CGI script is used to modify settings in a configuration file. The script intends to take the 'action' parameter and invoke one of a variety of functions based on the value of that parameter - config_file_add_key(), config_file_set_key(), or config_file_delete_key(). It could set up a conditional to invoke each function separately, but eval() is a powerful way of doing the same thing in fewer lines of code, especially when a large number of functions or variables are involved. Unfortunately, in this case, the attacker can provide other values in the action parameter, such as:,This would produce the following string in handleConfigAction():,Any arbitrary Perl code could be added after the attacker has "closed off" the construction of the original function call, in order to prevent parsing errors from causing the malicious eval() to fail before the attacker's payload is activated. This particular manipulation would fail after the system() call, because the "_key(\$fname, \$key, \$val)" portion of the string would cause an error, but this is irrelevant to the attack because the payload has already been activated.
- This simple python3 script asks a user to supply a comma-separated list of numbers as input and adds them together. The eval() function can take the user-supplied list and convert it into a Python list object, therefore allowing the programmer to use list comprehension methods to work with the data. However, if code is supplied to the eval() function, it will execute that code. For example, a malicious user could supply the following string:,This would delete all the files in the current directory. For this reason, it is not recommended to use eval() with untrusted input.,A way to accomplish this without the use of eval() is to apply an integer conversion on the input within a try/except block. If the user-supplied input is not numeric, this will raise a ValueError. By avoiding eval(), there is no opportunity for the input string to be executed as code.,An alternative, commonly-cited mitigation for this kind of weakness is to use the ast.literal_eval() function, since it is intentionally designed to avoid executing code. However, an adversary could still cause excessive memory or stack consumption via deeply nested structures [REF-1372], so the python documentation discourages use of ast.literal_eval() on untrusted data [REF-1373].
- The following code is a workflow job written using YAML. The code attempts to download pull request artifacts, unzip from the artifact called pr.zip and extract the value of the file NR into a variable "pr_number" that will be used later in another job. It attempts to create a github workflow environment variable, writing to $GITHUB_ENV. The environment variable value is retrieved from an external resource. [object Object],[object Object]
Remediation
- Architecture and Design: Refactor your program so that you do not have to dynamically generate code.
- Architecture and Design: [object Object]
- Implementation: [object Object]
- Testing: Use dynamic tools and techniques that interact with the product using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The product's operation may slow down, but it should not become unstable, crash, or generate incorrect results.
- Operation: Run the code in an environment that performs automatic taint propagation and prevents any command execution that uses tainted variables, such as Perl's "-T" switch. This will force the program to perform validation steps that remove the taint, although you must be careful to correctly validate your inputs so that you do not accidentally mark dangerous inputs as untainted (see CWE-183 and CWE-184).
Detection
- Automated Static Analysis: Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then searching for potentially-vulnerable patterns that connect "sources" (origins of input) with "sinks" (destinations where the data interacts with external components, a lower layer such as the OS, etc.)
Mappings
Related CVEs, CWEs, and ATT&CK context
Related CWEs
- CWE-1336: Improper Neutralization of Special Elements Used in a Template Engine
- CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')
- CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')
- CWE-913: Improper Control of Dynamically-Managed Code Resources
- CWE-95: Improper Neutralization of Directives in Dynamically Evaluated Code ('Eval Injection')
- CWE-96: Improper Neutralization of Directives in Statically Saved Code ('Static Code Injection')
- CWE-98: Improper Control of Filename for Include/Require Statement in PHP Program ('PHP Remote File Inclusion')
ATT&CK Relevance
ATT&CK relevance is shown only when reviewed or responsibly inferred.