Summary:
New base class for all future SMT Exprs.
No major changes except moving `areEquivalent` and `getFloatSemantics` outside of `Z3Expr` to keep the class minimal.
Reviewers: NoQ, george.karpenkov
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49551
llvm-svn: 337917
Summary:
New base class for all future SMT sorts.
The only change is that the class implements methods `isBooleanSort()`, `isBitvectorSort()` and `isFloatSort()` so it doesn't rely on `Z3`'s enum.
Reviewers: NoQ, george.karpenkov
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49550
llvm-svn: 337916
Summary:
Although it is a big patch, the changes are simple:
1. There is one `Z3_Context` now, member of the `SMTConstraintManager` class.
2. `Z3Expr`, `Z3Sort`, `Z3Model` and `Z3Solver` are constructed with a reference to the `Z3_Context` in `SMTConstraintManager`.
3. All static functions are now members of `Z3Solver`, e.g, the `SMTConstraintManager` now calls `Solver.fromBoolean(false)` instead of `Z3Expr::fromBoolean(false)`.
Most of the patch only move stuff around except:
1. New method `Z3Sort MkSort(const QualType &Ty, unsigned BitWidth)`, that creates a sort based on the `QualType` and its width. Used to simplify the `fromData` method.
Unfortunate consequence of this patch:
1. `getInterpretation` was moved from `Z3Model` class to `Z3Solver`, because it needs to create a `Z3Sort` before returning the interpretation. This can be fixed by changing both `toAPFloat` and `toAPSInt` by removing the dependency of `Z3Sort` (it's only used to check which Sort was created and to retrieve the type width).
Reviewers: NoQ, george.karpenkov, ddcc
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49236
llvm-svn: 337915
Summary:
This patch creates `SMTContext` which will wrap a specific SMT context, through `SMTSolverContext`.
The templated `SMTSolverContext` class it's a simple wrapper around a SMT specific context (currently only used in the Z3 backend), while `Z3Context` inherits `SMTSolverContext<Z3_context>` and implements solver specific operations like initialization and destruction of the context.
This separation was done because:
1. We might want to keep one single context, shared across different `SMTConstraintManager`s. It can be achieved by constructing a `SMTContext`, through a function like `CreateSMTContext(Z3)`, `CreateSMTContext(BOOLECTOR)`, etc. The rest of the CSA only need to know about `SMTContext`, so maybe it's a good idea moving `SMTSolverContext` to a separate header in the future.
2. Any generic SMT operation will only require one `SMTSolverContext`object, which can access the specific context by calling `getContext()`.
Reviewers: NoQ, george.karpenkov
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49233
llvm-svn: 337914
A checker for detecting leaks resulting from allocating temporary
autoreleasing objects before starting the main run loop.
Checks for two antipatterns:
1. ObjCMessageExpr followed by [[NARunLoop mainRunLoop] run] in the same
autorelease pool.
2. ObjCMessageExpr followed by [[NARunLoop mainRunLoop] run] in no
autorelease pool.
Happens-before relationship is modeled purely syntactically.
rdar://39299145
Differential Revision: https://reviews.llvm.org/D49528
llvm-svn: 337876
The note is added in the following situation:
- We are throwing a nullability-related warning on an IVar
- The path goes through a method which *could have* (syntactically
determined) written into that IVar, but did not
rdar://42444460
Differential Revision: https://reviews.llvm.org/D49689
llvm-svn: 337864
Remove an assertion in RangeConstraintManager that expects such symbols to never
appear, while admitting that the constraint manager doesn't yet handle them.
Differential Revision: https://reviews.llvm.org/D49703
llvm-svn: 337769
Patch https://reviews.llvm.org/rC329780 not only rearranges comparisons but
also binary expressions. This latter behavior is not protected by the analyzer
option. Hower, since no complexity threshold is enforced to the symbols this
may result in exponential execution time if the expressions are too complex:
https://bugs.llvm.org/show_bug.cgi?id=38208. For a quick fix we extended the
analyzer option to also cover the additive cases.
This is only a temporary fix, the final solution should be enforcing the
complexity threshold to the symbols.
Differential Revision: https://reviews.llvm.org/D49536
llvm-svn: 337678
The last argument is expected to be the destination buffer size (or less).
Detects if it points to destination buffer size directly or via a variable.
Detects if it is an integral, try to detect if the destination buffer can receive the source length.
Updating bsd-string.c unit tests as it make it fails now.
Reviewers: george.karpenpov, NoQ
Reviewed By: george.karpenkov
Differential Revision: https://reviews.llvm.org/D48884
llvm-svn: 337499
StringRef's data() returns a string that may be non-null-terminated.
Switch to using StringRefs from const char pointers in visitor notes
to avoid problems.
llvm-svn: 337474
Summary:
This patch introduces a new member to SymExpr, which stores the symbol complexity, avoiding recalculating it every time computeComplexity() is called.
Also, increase the complexity of conjured Symbols by one, so it's clear that it has a greater complexity than its underlying symbols.
Reviewers: NoQ, george.karpenkov
Reviewed By: NoQ, george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49232
llvm-svn: 337472
DanglingInternalBufferChecker.
A pointer referring to the elements of a basic_string may be invalidated
by calling a non-const member function, except operator[], at, front,
back, begin, rbegin, end, and rend. The checker now warns if the pointer
is used after such operations.
Differential Revision: https://reviews.llvm.org/D49360
llvm-svn: 337463
Summary:
An assertion was added in D48205 to catch places where a `nonloc::SymbolVal` was wrapping a `loc` object.
This patch fixes that in the Z3 backend by making the `SValBuilder` object accessible from inherited instances of `SimpleConstraintManager` and calling `SVB.makeSymbolVal(foo)` instead of `nonloc::SymbolVal(foo)`.
Reviewers: NoQ, george.karpenkov
Reviewed By: NoQ
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49430
llvm-svn: 337304
The canonical representation of pointer &SymRegion{$x} casted to boolean is
"$x != 0", not "$x". Assertion added in r337227 catches that.
Differential Revision: https://reviews.llvm.org/D48232
llvm-svn: 337228
In the current SVal hierarchy there are multiple ways of representing certain
values but few are actually used and expected to be seen by the code.
In particular, a value of a symbolic pointer is always represented by a
loc::MemRegionVal that wraps a SymbolicRegion that wraps the pointer symbol
and never by a nonloc::SymbolVal that wraps that symbol directly.
Assert the aforementioned fact. Fix one minor violation of it.
Differential Revision: https://reviews.llvm.org/D48205
llvm-svn: 337227
Only suppress those cases where the null which came from the macro is
relevant to the bug, and was not overwritten in between.
rdar://41497323
Differential Revision: https://reviews.llvm.org/D48856
llvm-svn: 337213
Initializing a semaphore with a different constant most likely signals a different intent
rdar://41802552
Differential Revision: https://reviews.llvm.org/D48911
llvm-svn: 337212
Summary:
In `toAPSInt`, the Z3 backend was not checking the variable `Int`'s type and was always generating unsigned `APSInt`s.
This was found by accident when I removed:
```
llvm::APSInt ConvertedLHS, ConvertedRHS;
QualType LTy, RTy;
std::tie(ConvertedLHS, LTy) = fixAPSInt(*LHS);
std::tie(ConvertedRHS, RTy) = fixAPSInt(*RHS);
- doIntTypePromotion<llvm::APSInt, Z3ConstraintManager::castAPSInt>(
- ConvertedLHS, LTy, ConvertedRHS, RTy);
return BVF.evalAPSInt(BSE->getOpcode(), ConvertedLHS, ConvertedRHS);
```
And the `BasicValueFactory` started to complain about different `signedness`.
Reviewers: george.karpenkov, NoQ, ddcc
Reviewed By: ddcc
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49305
llvm-svn: 337169
Summary:
This patch removes the constraint dropping when taint tracking is disabled.
It also voids the crash reported in D28953 by treating a SymSymExpr with non pointer symbols as an opaque expression.
Updated the regressions and verifying the big projects now; I'll update here when they're done.
Based on the discussion on the mailing list and the patches by @ddcc.
Reviewers: george.karpenkov, NoQ, ddcc, baloghadamsoftware
Reviewed By: george.karpenkov
Subscribers: delcypher, llvm-commits, rnkovacs, xazax.hun, szepet, a.sidorin, ddcc
Differential Revision: https://reviews.llvm.org/D48650
llvm-svn: 337167
Marking a symbolic expression as live is non-recursive. In our checkers we
either use conjured symbols or conjured symbols plus/minus integers to
represent abstract position of iterators, so in this latter case we also
must mark the `SymbolData` part of these symbolic expressions as live to
prevent them from getting reaped.
Differential Revision: https://reviews.llvm.org/D48764
llvm-svn: 337151
It was not possible to disable alpha.unix.cstring.OutOfBounds checker's reports
since unix.Malloc checker always implicitly enabled the filter. Moreover if the
checker was disabled from command line (-analyzer-disable-checker ..) the out
of bounds warnings were nevertheless emitted under different checker names such
as unix.cstring.NullArg, or unix.Malloc.
This patch fixes the case sot that Malloc checker only enables implicitly the
underlying modeling of strcpy, memcpy etc. but not the warning messages that
would have been emmitted by alpha.unix.cstring.OutOfBounds
Patch by: Dániel Krupp
Differential Revision: https://reviews.llvm.org/D48831
llvm-svn: 337000
As the code for the checker grew, it became increasinly difficult to see
whether a function was global or statically defined. In this patch,
anything that isn't a type declaration or definition was moved out of the
anonymous namespace and is marked as static.
llvm-svn: 336901
Previously, the checker only tracked one raw pointer symbol for each
container object. But member functions returning a pointer to the
object's inner buffer may be called on the object several times. These
pointer symbols are now collected in a set inside the program state map
and thus all of them is checked for use-after-free problems.
Differential Revision: https://reviews.llvm.org/D49057
llvm-svn: 336835
This allows more qualification conversions, eg. conversion from
'int *(*)[]' -> 'const int *const (*)[]'
is now permitted, along with all the consequences of that: more types
are similar, more cases are permitted by const_cast, and conversely,
fewer "casting away constness" cases are permitted by reinterpret_cast.
llvm-svn: 336745
Summary:
This adds an option, max-symbol-complexity, so an user can set the maximum symbol complexity threshold.
Note that the current behaviour is equivalent to max complexity = 0, when taint analysis is not enabled and tests show that in a number of tests, having complexity = 25 yields the same results as complexity = 10000.
This patch was extracted and modified from Dominic Chen's patch, D35450.
Reviewers: george.karpenkov, NoQ, ddcc
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D49093
llvm-svn: 336671
DanglingInternalBufferChecker now tracks use-after-free problems related
to the incorrect usage of std::basic_string::data().
Differential Revision: https://reviews.llvm.org/D48532
llvm-svn: 336497
Add a bug visitor to DanglingInternalBufferChecker that places a note
at the point where the dangling pointer was obtained. The visitor is
handed over to MallocChecker and attached to the report there.
Differential Revision: https://reviews.llvm.org/D48522
llvm-svn: 336495
Extend MallocBugVisitor to place a note at the point where objects with
AF_InternalBuffer allocation family are destroyed.
Differential Revision: https://reviews.llvm.org/D48521
llvm-svn: 336489
Summary: In the provided test case the PathDiagnostic compare function was not able to find a difference.
Reviewers: xazax.hun, NoQ, dcoughlin, george.karpenkov
Reviewed By: george.karpenkov
Subscribers: a_sidorin, szepet, rnkovacs, a.sidorin, mikhail.ramalho, cfe-commits
Differential Revision: https://reviews.llvm.org/D48474
llvm-svn: 336275
Now, instead of adding the constraints when they are removed, this patch adds them when they first appear and, since we walk the bug report backward, it should be the last set of ranges generated by the CSA for a given symbol.
These are the number before and after the patch:
```
Project | current | patch |
tmux | 283.222 | 123.052 |
redis | 614.858 | 400.347 |
openssl | 308.292 | 307.149 |
twin | 274.478 | 245.411 |
git | 547.687 | 477.335 |
postgresql | 2927.495 | 2002.526 |
sqlite3 | 3264.305 | 1028.416 |
```
Major speedups in tmux and sqlite (less than half of the time), redis and postgresql were about 25% faster while the rest are basically the same.
Reviewers: NoQ, george.karpenkov
Reviewed By: george.karpenkov
Subscribers: rnkovacs, xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D48565
llvm-svn: 336002
In order to better support consumers of the plist output that don't
parse note entries just yet, a 'NotesAsWarnings' flag was added.
If it's set to true, all notes will be converted to warnings.
Differential Revision: https://reviews.llvm.org/D48285
llvm-svn: 335964
The refutation manager is removing a true bug from the test in this patch.
The problem is that the following constraint:
```
(conj_$1{struct o *}) - (reg_$3<int * r>): [-9223372036854775808, 0]
```
is encoded as:
```
(and (bvuge (bvsub $1 $3) #x8000000000000000)
(bvule (bvsub $1 $3) #x0000000000000000))
```
The issue is that unsigned comparisons (bvuge and bvule) are being generated instead of signed comparisons (bvsge and bvsle).
When generating the expressions:
```
(conj_$1{p *}) - (reg_$3<int * r>) >= -9223372036854775808
```
and
```
(conj_$1{p *}) - (reg_$3<int * r>) <= 0
```
both -9223372036854775808 and 0 are casted to pointer type and `LTy->isSignedIntegerOrEnumerationType()` in `Z3ConstraintManager::getZ3BinExpr` only checks if the type is signed, not if it's a pointer.
Reviewers: NoQ, george.karpenkov, ddcc
Subscribers: rnkovacs, NoQ, george.karpenkov, ddcc, xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D48324
llvm-svn: 335926
Add handling of the begin() funcion of containers to the iterator checkers,
together with the pre- and postfix ++ and -- operators of the iterators. This
makes possible the checking of iterators dereferenced ahead of the begin of the
container.
Differential Revision: https://reviews.llvm.org/D32642
llvm-svn: 335835
If range [m .. n] is stored for symbolic expression A - B, then we can deduce the range for B - A which is [-n .. -m]. This is only true for signed types, unless the range is [0 .. 0].
Differential Revision: https://reviews.llvm.org/D35110
llvm-svn: 335814
The ProgramState::assumeInBound() API is used by checkers to make an assumption
that a certain array index is within the array's bounds (i.e. is greater than or
equal to 0 and is less than the length of the array). When the type of the
index was unspecified by the caller, it assumed that the type is 'int', which
caused some indices and sizes to truncate during calculations.
Use ArrayIndexTy by default instead, which is used by the analyzer to represent
index types and is currently hardcoded to long long.
Patch by Bevin Hansson!
Differential Revision: https://reviews.llvm.org/D46944
llvm-svn: 335803
r335795 adds copy elision information to CFG. This commit allows static analyzer
to elide elidable copy constructors by constructing the objects that were
previously subject to elidable copy directly in the target region of the copy.
The chain of elided constructors may potentially be indefinitely long. This
only happens when the object is being returned from a function which in turn is
returned from another function, etc.
NRVO is not supported yet.
Differential Revision: https://reviews.llvm.org/D47671
llvm-svn: 335800
When a temporary object is materialized and through that obtain lifetime that
is longer than the duration of the full-expression, it does not require a
temporary object destructor; it will be destroyed in a different manner.
Therefore it's not necessary to include CXXBindTemporaryExpr into the
construction context for such temporary in the CFG only to make clients
throw it away.
Differential Revision: https://reviews.llvm.org/D47667
llvm-svn: 335798
When an object's class provides no destructor, it's less important to
materialize that object properly because we don't have to model the destructor
correctly, so previously we skipped the support for these syntax patterns.
Additionally, fix support for construction contexts of "static temporaries"
(temporaries that are lifetime-extended by static references) because
it turned out that we only had tests for them without destructors, which caused
us to regress when we re-introduced the construction context for such
temporaries.
Differential Revision: https://reviews.llvm.org/D47658
llvm-svn: 335796
Before C++17 copy elision was optional, even if the elidable copy/move
constructor had arbitrary side effects. The elidable constructor is present
in the AST, but marked as elidable.
In these cases CFG now contains additional information that allows its clients
to figure out if a temporary object is only being constructed so that to pass
it to an elidable constructor. If so, it includes a reference to the elidable
constructor's construction context, so that the client could elide the
elidable constructor and construct the object directly at its final destination.
Differential Revision: https://reviews.llvm.org/D47616
llvm-svn: 335795
Summary:
Add an extension point to allow registration of statically-linked Clang Static
Analyzer checkers that are not a part of the Clang tree. This extension point
employs the mechanism used when checkers are registered from dynamically loaded
plugins.
Reviewers: george.karpenkov, NoQ, xazax.hun, dcoughlin
Reviewed By: george.karpenkov
Subscribers: mgorny, mikhail.ramalho, rnkovacs, xazax.hun, szepet, a.sidorin, cfe-commits
Differential Revision: https://reviews.llvm.org/D45718
llvm-svn: 335740
In the current implementation, we run visitors until the fixed point is
reached.
That is, if a visitor adds another visitor, the currently processed path
is destroyed, all diagnostics is discarded, and it is regenerated again,
until it's no longer modified.
This pattern has a few negative implications:
- This loop does not even guarantee to terminate.
E.g. just imagine two visitors bouncing a diagnostics around.
- Performance-wise, e.g. for sqlite3 all visitors are being re-run at
least 10 times for some bugs.
We have already seen a few reports where it leads to timeouts.
- If we want to add more computationally intense visitors, this will
become worse.
- From architectural standpoint, the current layout requires copying
visitors, which is conceptually wrong, and can be annoying (e.g. no
unique_ptr on visitors allowed).
The proposed change is a much simpler architecture: the outer loop
processes nodes upwards, and whenever the visitor is added it only
processes current nodes and above, thus guaranteeing termination.
Differential Revision: https://reviews.llvm.org/D47856
llvm-svn: 335666
ExprWithCleanups wraps full-expressions that require temporary destructors
and highlights the moment of time in which these destructors need to be called
(i.e., "at the end of the full-expression...").
Such expressions don't necessarily return an object; they may return anything,
including a null or undefined value.
When the analyzer tries to understand where the null or undefined value came
from in order to present better diagnostics to the user, it will now skip
any ExprWithCleanups it encounters and look into the expression itself.
Differential Revision: https://reviews.llvm.org/D48204
llvm-svn: 335559
Conservative evaluation of a C++ method call would invalidate the object,
as long as the method is not const or the object has mutable fields.
When checking for mutable fields, we need to scan the type of the object on
which the method is called, which may be more specific than the type of the
object on which the method is defined, hence we look up the type from the
this-argument expression.
If arrow syntax or implicit-this syntax is used, this-argument expression
has pointer type, not record type, and lookup accidentally failed for that
reason. Obtain object type correctly.
Differential Revision: https://reviews.llvm.org/D48460
llvm-svn: 335555
This diff includes the logic for setting the precision bits for each primary fixed point type in the target info and logic for initializing a fixed point literal.
Fixed point literals are declared using the suffixes
```
hr: short _Fract
uhr: unsigned short _Fract
r: _Fract
ur: unsigned _Fract
lr: long _Fract
ulr: unsigned long _Fract
hk: short _Accum
uhk: unsigned short _Accum
k: _Accum
uk: unsigned _Accum
```
Errors are also thrown for illegal literal values
```
unsigned short _Accum u_short_accum = 256.0uhk; // expected-error{{the integral part of this literal is too large for this unsigned _Accum type}}
```
Differential Revision: https://reviews.llvm.org/D46915
llvm-svn: 335148
Summary:
If a constraint is something like:
```
$0 = [1,1]
```
it'll now be created as:
```
assert($0 == 1)
```
instead of:
```
assert($0 >= 1 && $0 <= 1)
```
In general, ~3% speedup when solving per query in my machine. Biggest improvement was when verifying sqlite3, total time went down from 3000s to 2200s.
I couldn't create a test for this as there is no way to dump the formula yet. D48221 adds a method to dump the formula but there is no way to do it from the command line.
Also, a test that prints the formula will most likely fail in the future, as different solvers print the formula in different formats.
Reviewers: NoQ, george.karpenkov, ddcc
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D48227
llvm-svn: 335116
Since `isPrimitiveType` was only used in an assert, a builbot with `-Werror`
and no asserts enabled failed to build it as it was unused.
llvm-svn: 335030
This checker analyzes C++ constructor calls, and reports uninitialized fields.
Due to the nature of this problem (uninitialized fields after an object
construction), this checker doesn't search for bugs, but rather is a tool to
enforce a specific programming model where every field needs to be initialized.
This checker lands in alpha for now, and a number of followup patches will be
made to reduce false negatives and to make it easier for the user to understand
what rules the checker relies on, eg. whether a derived class' constructor is
responsible for initializing inherited data members or whether it should be
handled in the base class' constructor.
Differential Revision: https://reviews.llvm.org/D45532
llvm-svn: 334935
Summary:
New method dump the SMT formula and the Z3 implementation.
There is no test because I only used it for debugging.
However, if requested, I can add an option to the static analyzer to dump the formula (whole program? per path?), maybe something like the trimmed graph but for SMT formulas.
Reviewers: NoQ, george.karpenkov, ddcc
Reviewed By: george.karpenkov
Subscribers: xazax.hun, szepet, a.sidorin
Differential Revision: https://reviews.llvm.org/D48221
llvm-svn: 334891
Not contexts themselves, but rather support for them in the analyzer.
Such construction contexts appear when C++17 mandatory copy elision occurs
while returning an object from a function, and presence of a destructor causes
a CXXBindTemporaryExpr to appear in the AST.
Additionally, such construction contexts may be chained, because a return-value
construction context doesn't really explain where the object is being returned
into, but only points to the parent stack frame, where the object may be
consumed by literally anything including another return statement. This
behavior is now modeled correctly by the analyzer as long as the object is not
returned beyond the boundaries of the analysis.
Differential Revision: https://reviews.llvm.org/D47405
llvm-svn: 334684
Not contexts themselves, but rather support for them in the analyzer.
Such construction contexts appear when C++17 mandatory copy elision occurs
during initialization, and presence of a destructor causes a
CXXBindTemporaryExpr to appear in the AST.
Similar C++17-specific constructors for return values are still to be supported.
Differential Revision: https://reviews.llvm.org/D47351
llvm-svn: 334683
The reasoning behind this change is similar to the previous commit, r334681.
Because members are already in scope when construction occurs, we are not
suffering from liveness problems, but we still want to figure out if the object
was constructed with construction context, because in this case we'll be able
to avoid trivial copy, which we don't always model perfectly. It'd also have
more importance when copy elision is implemented.
This also gets rid of the old CFG look-behind mechanism.
Differential Revision: https://reviews.llvm.org/D47350
llvm-svn: 334682
The very idea of construction context implies that first the object is
constructed, and then later, in a separate moment of time, the constructed
object goes into scope, i.e. becomes "live".
Most construction contexts require path-sensitive tracking of the constructed
object region in order to compute the outer expressions accordingly before
the object becomes live.
Semantics of simple variable construction contexts don't immediately require
that such tracking happens in path-sensitive manner, but shortcomings of the
analyzer force us to track it path-sensitively as well. Namely, whether
construction context was available at all during construction is a
path-sensitive information. Additionally, path-sensitive tracking takes care of
our liveness problems that kick in as the temporal gap between construction and
going-into-scope becomes larger (eg., due to copy elision).
Differential Revision: https://reviews.llvm.org/D47305
llvm-svn: 334681
When analyzing C++ code, a common operation in the analyzer is to discover
target region for object construction by looking at CFG metadata ("construction
contexts"), and then track the region path-sensitively until object construction
is resolved, where the amount of information, again, depends on construction
context.
Scan construction context only once for both purposes.
Differential Revision: https://reviews.llvm.org/D47304
llvm-svn: 334678
Loop widening can invalidate a reference. If the analyzer attempts to visit the
destructor to a non-existent reference, it will crash. This patch ensures that
the reference is preserved.
https://reviews.llvm.org/D47044
llvm-svn: 334554
removeInvalidation is a very problematic API, as it makes suppression
order-dependent.
Moreover, it was used only once, and could be rewritten in a much
cleaner way.
Differential Revision: https://reviews.llvm.org/D48045
llvm-svn: 334542
BugReporter.cpp is already severely overloaded, and those dump methods
are on PathDiagnostics and should belong in the corresponding
implementation file.
Differential Revision: https://reviews.llvm.org/D48035
llvm-svn: 334541
getEndPath is a problematic API, because it's not clear when it's called
(hint: not always at the end of the path), it crashes at runtime with
more than one non-nullptr returning implementation, and diagnostics
internal depend on it being called at some exact place.
However, most visitors don't actually need that: all they want is a
function consistently called after all nodes are traversed, to perform
finalization and to decide whether invalidation is needed.
Differential Revision: https://reviews.llvm.org/D48042
llvm-svn: 334540
Once we removed AlternateExtensive, I've looked closer into the
difference between Minimal and Extensive, and turns out, the difference
was not that large.
Differential Revision: https://reviews.llvm.org/D47756
llvm-svn: 334525
Rename AlternateExtensive to Extensive.
In 2013, five years ago, we have switched to AlternateExtensive
diagnostics by default, and Extensive was available under unused,
undocumented flag.
This change remove the flag, renames the Alternate
diagnostic to Extensive (as it's no longer Alternate), and ports the
test.
Differential Revision: https://reviews.llvm.org/D47670
llvm-svn: 334524
This simplifies some code which had StringRefs to begin with, and
makes other code more complicated which had const char* to begin
with.
In the end, I think this makes for a more idiomatic and platform
agnostic API. Not all platforms launch process with null terminated
c-string arrays for the environment pointer and argv, but the api
was designed that way because it allowed easy pass-through for
posix-based platforms. There's a little additional overhead now
since on posix based platforms we'll be takign StringRefs which
were constructed from null terminated strings and then copying
them to null terminate them again, but from a readability and
usability standpoint of the API user, I think this API signature
is strictly better.
llvm-svn: 334518
Symbols are cleaned up from the program state map when they go out of scope.
Memory regions are cleaned up when the corresponding object is destroyed, and
additionally in 'checkDeadSymbols' in case destructor modeling was incomplete.
Differential Revision: https://reviews.llvm.org/D47416
llvm-svn: 334352
This check will mark raw pointers to C++ standard library container internal
buffers 'released' when the objects themselves are destroyed. Such information
can be used by MallocChecker to warn about use-after-free problems.
In this first version, 'std::basic_string's are supported.
Differential Revision: https://reviews.llvm.org/D47135
llvm-svn: 334348
This breaks the OpenFlags enumeration into two separate
enumerations: OpenFlags and CreationDisposition. The first
controls the behavior of the API depending on whether or not
the target file already exists, and is not a flags-based
enum. The second controls more flags-like values.
This yields a more easy to understand API, while also allowing
flags to be passed to the openForRead api, where most of the
values didn't make sense before. This also makes the apis more
testable as it becomes easy to enumerate all the configurations
which make sense, so I've added many new tests to exercise all
the different values.
llvm-svn: 334221
Temporary object constructor inlining was disabled in r326240 for code like
const int &x = A().x;
because automatic destructor for the lifetime-extended object A() was not
working correctly in CFG.
CFG was fixed in r333941, so inlining can be re-enabled. CFG for lifetime
extension through aggregates still needs to be fixed.
Differential Revision: https://reviews.llvm.org/D44239
llvm-svn: 333946
Summary: This is a prototype of a bug reporter visitor that invalidates bug reports by re-checking constraints of certain states on the bug path using the Z3 constraint manager backend. The functionality is available under the `crosscheck-with-z3` analyzer config flag.
Reviewers: george.karpenkov, NoQ, dcoughlin, rnkovacs
Reviewed By: george.karpenkov
Subscribers: rnkovacs, NoQ, george.karpenkov, dcoughlin, xbolva00, ddcc, mikhail.ramalho, MTC, fhahn, whisperity, baloghadamsoftware, szepet, a.sidorin, gsd, dkrupp, xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D45517
llvm-svn: 333903
Summary:
This patch implements a simple SMTConstraintManager API, and requires the implementation of two methods for now: `addRangeConstraints` and `isModelFeasible`.
Update Z3ConstraintManager to inherit it and implement required methods.
I also moved the method to dump the SMT formula from D45517 to this patch.
This patch was created based on the reviews from D47640.
Reviewers: george.karpenkov, NoQ, ddcc, dcoughlin
Reviewed By: george.karpenkov
Differential Revision: https://reviews.llvm.org/D47689
llvm-svn: 333899
Summary:
Moved `RangedConstraintManager` header from `lib/StaticAnalyzer/Core/` to `clang/StaticAnalyzer/Core/PathSensitive/`. No changes to the code.
Reviewers: NoQ, george.karpenkov, dcoughlin
Reviewed By: george.karpenkov
Subscribers: NoQ, george.karpenkov, dcoughlin, ddcc
Differential Revision: https://reviews.llvm.org/D47640
llvm-svn: 333862
ExprEngine already maintains three internal program state traits to track
path-sensitive information related to object construction: pointer returned by
operator new, and pointer to temporary object for two different purposes - for
destruction and for lifetime extension. We'll need to add 2-3 more in a few
follow-up commits.
Merge these traits into one because they all essentially serve one purpose and
work similarly.
Differential Revision: https://reviews.llvm.org/D47303
llvm-svn: 333719
Summary: Clang does not have a corresponding QualType for a 1-bit APSInt, so use the BoolTy and extend the APSInt. Split from D35450. Fixes PR37622.
Reviewers: george.karpenkov, NoQ
Subscribers: mikhail.ramalho, xazax.hun, szepet, rnkovacs, cfe-commits, a.sidorin
Differential Revision: https://reviews.llvm.org/D47603
llvm-svn: 333704
Memoize simplification so that we didn't need to simplify the same symbolic
expression twice within the same program state.
Gives ~25% performance boost on the artificial test in test/Analysis/hangs.c.
Differential Revision: https://reviews.llvm.org/D47402
llvm-svn: 333671
When neither LHS nor RHS of a binary operator expression can be simplified,
return the original expression instead of re-evaluating the binary operator.
Such re-evaluation was causing recusrive re-simplification which caused
the algorithmic complexity to explode.
Differential Revision: https://reviews.llvm.org/D47155
llvm-svn: 333670
Previously, the checker was using the nullability of the expression,
which is nonnull IFF both receiver and method are annotated as _Nonnull.
However, the receiver could be known to the analyzer to be nonnull
without being explicitly marked as _Nonnull.
rdar://40635584
Differential Revision: https://reviews.llvm.org/D47510
llvm-svn: 333612
Summary: Since the `addTransitionImpl()` has a check about same state transition, there is no need to check it in `ArrayBoundCheckerV2.cpp`.
Reviewers: NoQ, xazax.hun, george.karpenkov
Reviewed By: NoQ
Subscribers: szepet, rnkovacs, a.sidorin, cfe-commits, MTC
Differential Revision: https://reviews.llvm.org/D47451
llvm-svn: 333531
Summary: If the access is out of bounds, return UndefinedVal. If it is missing an explicit init, return the implicit zero value it must have.
Reviewers: NoQ, xazax.hun, george.karpenkov
Reviewed By: NoQ
Subscribers: szepet, rnkovacs, a.sidorin, cfe-commits
Differential Revision: https://reviews.llvm.org/D46823
llvm-svn: 333417
These functions are obsolete. The analyzer would advice to replace them with
memcmp(), memcpy() or memmove(), or memset().
Patch by Tom Rix!
Differential Revision: https://reviews.llvm.org/D41881
llvm-svn: 333326
Because template parameter lists were not displayed
in the plist output, it was difficult to decide in
some cases whether a given checker found a true or a
false positive. This patch aims to correct this.
Differential Revision: https://reviews.llvm.org/D46933
llvm-svn: 333275
Summary: I could also move `RangedConstraintManager.h` under `include/` if you agree as it seems slightly out of place under `lib/`.
Patch by Réka Kovács
Reviewers: NoQ, george.karpenkov, dcoughlin, rnkovacs
Reviewed By: NoQ
Subscribers: mikhail.ramalho, whisperity, xazax.hun, baloghadamsoftware, szepet, a.sidorin, dkrupp, cfe-commits
Differential Revision: https://reviews.llvm.org/D45920
llvm-svn: 333179
Again, strlc* does not return a pointer so the zero size case doest not fit.
Reviewers: NoQ, george.karpenkov
Reviewed by: NoQ
Differential Revision: https://reviews.llvm.org/D47007
llvm-svn: 333060
Since there is no perfect way bind the non-zero value with the default binding, this patch only considers the case where buffer's offset is zero and the char value is 0. And according to the value for overwriting, decide how to update the string length.
Reviewers: dcoughlin, NoQ, xazax.hun, a.sidorin, george.karpenkov
Reviewed By: NoQ
Differential Revision: https://reviews.llvm.org/D44934
llvm-svn: 332463
Previously plist-html output produced multi-file HTML reports
but only single-file Plist reports.
Change plist-html output to produce multi-file Plist reports as well.
Differential Revision: https://reviews.llvm.org/D46902
llvm-svn: 332417
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
Explicitly avoided changing the strings in the clang-format tests.
Differential Revision: https://reviews.llvm.org/D44975
llvm-svn: 332350
A common pattern is that the code in the block does not write into the
variable explicitly, but instead passes it to a helper function which
performs the write.
Differential Revision: https://reviews.llvm.org/D46772
llvm-svn: 332300
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
We weren't invalidating our unions correctly. The previous behavior in
invalidateRegionsWorker::VisitCluster() was to direct-bind an UnknownVal
to the union (at offset 0).
For that reason we were never actually loading default bindings from our unions,
because there never was any default binding to load, and the value
that is presumed when there's no default binding to load
is usually completely incorrect (eg. UndefinedVal for stack unions).
The new behavior is to default-bind a conjured symbol (of irrelevant type)
to the union that's being invalidated, similarly to what we do for structures
and classes. Then it becomes safe to load the value properly.
Differential Revision: https://reviews.llvm.org/D45241
llvm-svn: 331563
C allows us to write any bytes into any memory region. When loading weird bytes
from memory regions of known types, the analyzer is required to make sure that
the loaded value makes sense by casting it to an appropriate type.
Fix such cast for loading values that represent void pointers from non-void
pointer type places.
Differential Revision: https://reviews.llvm.org/D46415
llvm-svn: 331562
The bindDefault() API of the ProgramState allows setting a default value
for reads from memory regions that were not preceded by writes.
It was used for implementing C++ zeroing constructors (i.e. default constructors
that boil down to setting all fields of the object to 0).
Because differences between zeroing consturctors and other forms of default
initialization have been piling up (in particular, zeroing constructors can be
called multiple times over the same object, probably even at the same offset,
requiring a careful and potentially slow cleanup of previous bindings in the
RegionStore), we split the API in two: bindDefaultInitial() for modeling
initial values and bindDefaultZero() for modeling zeroing constructors.
This fixes a few assertion failures from which the investigation originated.
The imperfect protection from both inability of the RegionStore to support
binding extents and lack of information in ASTRecordLayout has been loosened
because it's, well, imperfect, and it is unclear if it fixing more than it
was breaking.
Differential Revision: https://reviews.llvm.org/D46368
llvm-svn: 331561
Many glvalue expressions aren't of their respective reference type -
they are simply glvalues of their value type.
This was causing problems when we were trying to obtain type of the original
expression while evaluating certain glvalue bit-casts.
Fixed by artificially forging a reference type to provide to the casting
procedure.
Differential Revision: https://reviews.llvm.org/D46224
llvm-svn: 331558
When loading from a variable or a field that is declared as constant,
the analyzer will try to inspect its initializer and constant-fold it.
Upon success, the analyzer would skip normal load and return the respective
constant.
The new behavior also applies to fields/elements of brace-initialized structures
and arrays.
Patch by Rafael Stahl!
Differential Revision: https://reviews.llvm.org/D45774
llvm-svn: 331556
FunctionProtoType.
We previously re-evaluated the expression each time we wanted to know whether
the type is noexcept or not. We now evaluate the expression exactly once.
This is not quite "no functional change": it fixes a crasher bug during AST
deserialization where we would try to evaluate the noexcept specification in a
situation where we have not deserialized sufficient portions of the AST to
permit such evaluation.
llvm-svn: 331428
The return values of the newly supported functions were not handled correctly:
strlcpy()/strlcat() return string sizes rather than pointers.
Differential Revision: https://reviews.llvm.org/D45177
llvm-svn: 331401
Summary:
The filename is currently taken from the start of the path, while the
line and column are taken from the end of the path.
This didn't matter until cross-file path reporting was added.
Reviewers: george.karpenkov, dcoughlin, vlad.tsyrklevich
Reviewed By: george.karpenkov, vlad.tsyrklevich
Subscribers: xazax.hun, szepet, a.sidorin, cfe-commits
Differential Revision: https://reviews.llvm.org/D45611
llvm-svn: 331361
Summary: Add `TaintBugVisitor` to the ArrayBoundV2, DivideZero, VLASize to be able to indicate where the taint information originated from.
Reviewers: NoQ, george.karpenkov, xazax.hun, a.sidorin
Reviewed By: NoQ
Subscribers: szepet, rnkovacs, cfe-commits, MTC
Differential Revision: https://reviews.llvm.org/D46007
llvm-svn: 331345
When a '>>' token is split into two '>' tokens (in C++11 onwards), or (as an
extension) when we do the same for other tokens starting with a '>', we can't
just use a location pointing to the first '>' as the location of the split
token, because that would result in our miscomputing the length and spelling
for the token. As a consequence, for example, a refactoring replacing 'A<X>'
with something else would sometimes replace one character too many, and
similarly diagnostics highlighting a template-id source range would highlight
one character too many.
Fix this by creating an expansion range covering the first character of the
'>>' token, whose spelling is '>'. For this to work, we generalize the
expansion range of a macro FileID to be either a token range (the common case)
or a character range (used in this new case).
llvm-svn: 331155
Avoid crash when the sub-expression of operator delete[] is of array type.
This is not the same as simply using a delete[] syntax.
We're still not properly calling destructors in this case in the analyzer.
Differential Revision: https://reviews.llvm.org/D46146
llvm-svn: 331014
If 'A' is a C++ aggregate with a reference field of type 'C', in code like
A a = { C() };
C() is lifetime-extended by 'a'. The analyzer wasn't expecting this pattern and
crashing. Additionally, destructors aren't added in the CFG for this case,
so for now we shouldn't be inlining the constructor for C().
Differential Revision: https://reviews.llvm.org/D46037
llvm-svn: 330882
Normally the analyzer begins path-sensitive analysis from functions within
the main file, even though the path is allowed to go through any functions
within the translation unit.
When a recent version of WebKit is compiled, the "unified sources" technique
is used, that assumes #including multiple code files into a single main file.
Such file would have no functions defined in it, so the analyzer wouldn't be
able to find any entry points for path-sensitive analysis.
This patch pattern-matches unified file names that are similar to those
used by WebKit and allows the analyzer to find entry points in the included
code files. A more aggressive/generic approach is being planned as well.
Differential Revision: https://reviews.llvm.org/D45839
llvm-svn: 330876
Note diagnostic pieces are an additional way of highlighting code sections to
the user. They aren't part of the normal path diagnostic sequence. They can
also be attached to path-insensitive reports.
Notes are already supported by the text output and scan-build.
Expanding our machine-readable plist output format to be able to represent notes
opens up the possibility for various analyzer GUIs to pick them up.
Patch by Umann Kristóf!
Differential Revision: https://reviews.llvm.org/D45407
llvm-svn: 330766
Printing of ConcreteInts with size >64 bits resulted in assertion failure
in get[Z|S]ExtValue() because these methods are only allowed to be used
with integers of 64 max bit width. This patch fixes the issue.
llvm-svn: 330605
Summary: `TaintBugVisitor` is a universal visitor, and many checkers rely on it, such as `ArrayBoundCheckerV2.cpp`, `DivZeroChecker.cpp` and `VLASizeChecker.cpp`. Moving `TaintBugVisitor` to `BugReporterVisitors.h` enables other checker can also track where `tainted` value came from.
Reviewers: NoQ, george.karpenkov, xazax.hun
Reviewed By: george.karpenkov
Subscribers: szepet, rnkovacs, a.sidorin, cfe-commits, MTC
Differential Revision: https://reviews.llvm.org/D45682
llvm-svn: 330596
If a pointer cast fails (evaluates to an UnknownVal, i.e. not implemented in the
analyzer) and such cast is in fact the last use of the pointer, the pointer
symbol is no longer referenced by the program state and a leak is
(mis-)diagnosed.
"Escape" the pointer upon a failed cast, i.e. inform the checker that we can no
longer reliably track it.
Differential Revision: https://reviews.llvm.org/D45698
llvm-svn: 330380
r315736 added support for the misplaced CF_RETURNS_RETAINED annotation on
CFRetain() wrappers. It works by trusting the function's name (seeing if it
confirms to the CoreFoundation naming convention) rather than the annotation.
There are more false positives caused by users using a different naming
convention, namely starting the function name with "retain" or "release"
rather than suffixing it with "retain" or "release" respectively.
Because this isn't according to the naming convention, these functions
are usually inlined and the annotation is therefore ignored, which is correct.
But sometimes we run out of inlining stack depth and the function is
evaluated conservatively and then the annotation is trusted.
Add support for the "alternative" naming convention and test the situation when
we're running out of inlining stack depth.
rdar://problem/18270122
Differential Revision: https://reviews.llvm.org/D45117
llvm-svn: 330375
Summary:
Clean carriage returns from lib/ and include/. NFC.
(I have to make this change locally in order for `git diff` to show sane output after I edit a file, so I might as well ask for it to be committed. I don't have commit privs myself.)
(Without this patch, `git rebase`ing any change involving SemaDeclCXX.cpp is a real nightmare. :( So while I have no right to ask for this to be committed, geez would it make my workflow easier if it were.)
Here's the command I used to reformat things. (Requires bash and OSX/FreeBSD sed.)
git grep -l $'\r' lib include | xargs sed -i -e $'s/\r//'
find lib include -name '*-e' -delete
Reviewers: malcolm.parsons
Reviewed By: malcolm.parsons
Subscribers: emaste, krytarowski, cfe-commits
Differential Revision: https://reviews.llvm.org/D45591
Patch by Arthur O'Dwyer.
llvm-svn: 330112
Summary:
`this` pointer is not an l-value, although we have modeled `CXXThisRegion` for `this` pointer, we can only bind it once, which is when we start to inline method. And this patch fixes https://bugs.llvm.org/show_bug.cgi?id=35506.
In addition, I didn't find any other cases other than loop-widen that could invalidate `this` pointer.
Reviewers: NoQ, george.karpenkov, a.sidorin, seaneveson, szepet
Reviewed By: NoQ
Subscribers: xazax.hun, rnkovacs, cfe-commits, MTC
Differential Revision: https://reviews.llvm.org/D45491
llvm-svn: 330095
Expression rearrangement in SValBuilder (see rL329780) crashes with an assert if the type of the integer is different from the type of the symbol. This fix adds a check that prevents rearrangement in such cases.
Differential Revision: https://reviews.llvm.org/D45557
llvm-svn: 330064
Since the range-based constraint manager (default) is weak in handling comparisons where symbols are on both sides it is wise to rearrange them to have symbols only on the left side. Thus e.g. A + n >= B + m becomes A - B >= m - n which enables the constraint manager to store a range m - n .. MAX_VALUE for the symbolic expression A - B. This can be used later to check whether e.g. A + k == B + l can be true, which is also rearranged to A - B == l - k so the constraint manager can check whether l - k is in the range (thus greater than or equal to m - n).
The restriction in this version is the the rearrangement happens only if both the symbols and the concrete integers are within the range [min/4 .. max/4] where min and max are the minimal and maximal values of their type.
The rearrangement is not enabled by default. It has to be enabled by using -analyzer-config aggressive-relational-comparison-simplification=true.
Co-author of this patch is Artem Dergachev (NoQ).
Differential Revision: https://reviews.llvm.org/D41938
llvm-svn: 329780
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
removeUnneededCalls() is responsible for removing path diagnostic pieces within
functions that don't contain "interesting" events. It makes bug reports
much tidier.
When a stack frame is known to be interesting, the function doesn't descend
into it to prune anything within it, even other callees that are totally boring.
Fix the function to prune boring callees in interesting stack frames.
Differential Revision: https://reviews.llvm.org/D45117
llvm-svn: 329102