What is a good tool to analyse code for unsafe code fragments? [closed]
Closed 1 year ago.
I'm looking for a tool that would do basic testing of PHP scripts.
This time I'm not interested in a complex testing solution that requires writing tests for each and every piece of code. Nor am I interested in functional tests.
What I expect this tool to do is to point out code fragments that might result in invalid usage of:
- variables;
- function arguments;
- function return values.
Invalid usage would be, for example, using NULL as an object or as an array, or passing a boolean as an argument that is required to be an array or an object (or NULL).
Additionally, it might also check whether "good practices" are used, e.g., it might warn that it's not safe to use if ( $condition ) $doSomething; when it finds something like that.
To illustrate what the tool should detect as "unsafe" code fragments, here are a couple of examples.
In this case, $filter may be NULL, but it's used as an array:
<?php
function getList(array $filter = null) {
    $sql = '...';
    // ...
    foreach ( $filter as $field => $value ) {
        // ...
    }
    // ...
}
In this case, $res might be false, but it's used as a resource:
<?php
$res = mysql_query('...');
while ( $row = mysql_fetch_assoc($res) ) {
    // ...
}
mysql_free_result($res);
And here are the corresponding "safe" code fragments:
// $filter is used only if it's non-empty; as the argument list declares, it can only be NULL or an array
function getList(array $filter = null) {
    $sql = '...';
    // ...
    if ( $filter ) {
        foreach ( $filter as $field => $value ) {
            // ...
        }
    }
    // ...
}
...
// $filter is forced to be an array
function getList(array $filter = null) {
    $filter = (array)$filter;
    $sql = '...';
    // ...
    foreach ( $filter as $field => $value ) {
        // ...
    }
    // ...
}
...
// if $res is false, mysql_fetch_assoc() and mysql_free_result() never execute
$res = mysql_query('...');
if ( $res === false ) {
    trigger_error('...');
    return;
}
while ( $row = mysql_fetch_assoc($res) ) {
    // ...
}
mysql_free_result($res);
...
// if $res is false, mysql_fetch_assoc() and mysql_free_result() are skipped
$res = mysql_query('...');
if ( $res !== false ) {
    while ( $row = mysql_fetch_assoc($res) ) {
        // ...
    }
    mysql_free_result($res);
}
This is not basic testing, but requires non-trivial analysis of the token stream.
You can either use
- phpmd: scans PHP source code and looks for potential problems such as possible bugs, dead code, suboptimal code, and overcomplicated expressions
- phpcs: phpcs tokenises PHP, JavaScript and CSS files and detects violations of a defined set of coding standards. It is an essential development tool that ensures your code remains clean and consistent. It can also help prevent some common semantic errors made by developers.
or - if that doesn't suffice - look at the bytecode level with
- bytekit-cli: bytekit-cli provides a command-line tool that leverages the Bytekit extension to perform common code analysis tasks on the PHP bytecode level.
You will have to write your own sniffs for that.
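As a rough illustration of what working on the token stream involves, here is a minimal sketch that uses PHP's built-in token_get_all() to flag brace-less if bodies, the "good practice" check from the question. It is not a phpcs sniff and does not use the phpcs or phpmd APIs; the function name findBracelessIfs() and the sample source are made up for this example, and the check ignores many edge cases.

```php
<?php
// Hypothetical helper (not part of phpcs or phpmd): returns the line numbers
// of if/elseif statements whose body is not wrapped in braces.
function findBracelessIfs(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $lines = [];

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || !in_array($tokens[$i][0], [T_IF, T_ELSEIF], true)) {
            continue;
        }
        $ifLine = $tokens[$i][2];

        // Skip the parenthesised condition by tracking nesting depth.
        $depth = 0;
        for ($j = $i + 1; $j < $count; $j++) {
            if ($tokens[$j] === '(') {
                $depth++;
            } elseif ($tokens[$j] === ')') {
                $depth--;
                if ($depth === 0) {
                    break;
                }
            }
        }

        // Skip whitespace and comments that follow the condition.
        for ($j++; $j < $count; $j++) {
            $t = $tokens[$j];
            if (is_array($t) && in_array($t[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
                continue;
            }
            break;
        }

        // Anything other than "{" (or ":" for the alternative syntax) starts a brace-less body.
        if ($j < $count && $tokens[$j] !== '{' && $tokens[$j] !== ':') {
            $lines[] = $ifLine;
        }
    }

    return $lines;
}

// Made-up sample input for the sketch.
$sample = <<<'PHP'
<?php
if ($a) doSomething();
if ($b) {
    doSomethingElse();
}
if (check($c))
    return;
PHP;

echo implode(', ', findBracelessIfs($sample)), "\n"; // prints: 2, 6
```

Even this simple style check needs manual parenthesis matching, which hints at why the NULL and false cases from the question are much harder to catch with tokens alone.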
Further tools and resources:
- http://phpqatools.org/
A good IDE should point out basic errors as part of the development process.
I use Netbeans, for example, and it highlights common code mistakes such as variables which are defined but not used, or misuse of the assignment operator in an if() condition where an equality operator is more usual (i.e. writing if($x = $y) instead of if($x == $y)).
Basic stuff like that shows up in Netbeans with a yellow warning triangle by the line number. Other IDEs will have similar features.
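To give an idea of how simple this class of check is, here is a small sketch of an assignment-in-condition warning built on PHP's token_get_all(). It is not how Netbeans implements its hint; the function name findAssignmentsInIf() is invented for this example, and the check relies on the tokenizer returning a plain "=" as a one-character string while "==" comes back as a T_IS_EQUAL token.

```php
<?php
// Hypothetical helper: returns the line numbers of if/elseif conditions
// that contain a plain assignment operator.
function findAssignmentsInIf(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $lines = [];

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || !in_array($tokens[$i][0], [T_IF, T_ELSEIF], true)) {
            continue;
        }

        $depth = 0;
        for ($j = $i + 1; $j < $count; $j++) {
            $t = $tokens[$j];
            if ($t === '(') {
                $depth++;
            } elseif ($t === ')') {
                if (--$depth === 0) {
                    break; // end of the condition
                }
            } elseif ($t === '=' && $depth > 0) {
                $lines[] = $tokens[$i][2]; // line of the if keyword
                break;
            }
        }
    }

    return $lines;
}

// Made-up sample input for the sketch.
$sample = <<<'PHP'
<?php
if ($x = $y) { a(); }
if ($x == $y) { b(); }
PHP;

echo implode(', ', findAssignmentsInIf($sample)), "\n"; // prints: 2
```

A sketch like this will also flag intentional assignments such as if ($row = fetch()), which is why IDEs show it as a warning and not as an error.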
I don't think it picks up the specific error conditions you described, but it certainly picks up a fair number of errors, and even for the errors you talked about, this is where I would expect those kinds of things to be flagged up, rather than in a separate tool.
What you want is traditionally called a static analysis tool. What such tools often do is determine, for each point in the code, what facts are known about the variables (after X = NULL, the tool knows X is NULL), and then propagate those facts along the various control flow paths to see whether the state of the variables is inconsistent with an operation (e.g., after finding that X is NULL, finding code that must be executed and that attempts to access X as an array).
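The propagation idea can be sketched on a toy scale. The code below does not parse PHP; it assumes a made-up mini representation of a program as nested arrays (assign, if, foreach) and only tracks whether a variable may be NULL or an array. The names analyse() and mergeStates() are invented for this sketch, and a real analyzer would need far more (loops, function calls, aliasing).

```php
<?php
// Toy program representation (an assumption for this sketch, not a real PHP AST):
//   ['assign', $var, $kind]          $kind is 'null' or 'array'
//   ['if', $thenStmts, $elseStmts]   condition unknown, so either branch may run
//   ['foreach', $var, $line]         $var is used as an array on $line

// Walks the statements, updating the facts known about each variable and
// collecting a warning whenever a possibly-NULL variable is used as an array.
function analyse(array $stmts, array $state, array &$warnings): array
{
    foreach ($stmts as $stmt) {
        switch ($stmt[0]) {
            case 'assign':
                // After an assignment exactly one fact holds for the variable.
                $state[$stmt[1]] = [$stmt[2] => true];
                break;
            case 'if':
                // Analyse both branches from the same incoming state, then merge.
                $then = analyse($stmt[1], $state, $warnings);
                $else = analyse($stmt[2], $state, $warnings);
                $state = mergeStates($then, $else);
                break;
            case 'foreach':
                $kinds = $state[$stmt[1]] ?? ['unknown' => true];
                if (isset($kinds['null'])) {
                    $warnings[] = sprintf(
                        'line %d: $%s may be NULL but is used as an array',
                        $stmt[2],
                        $stmt[1]
                    );
                }
                break;
        }
    }

    return $state;
}

// Joins the facts from two control flow paths: a variable may have any kind
// it could have on either path.
function mergeStates(array $a, array $b): array
{
    $merged = [];
    foreach (array_unique(array_merge(array_keys($a), array_keys($b))) as $var) {
        // A variable not assigned on one path is 'unknown' on that path.
        $merged[$var] = ($a[$var] ?? ['unknown' => true]) + ($b[$var] ?? ['unknown' => true]);
    }
    return $merged;
}

// Rough model of the question's getList(): $filter defaults to NULL, the
// caller may or may not pass an array, then $filter is iterated.
$program = [
    ['assign', 'filter', 'null'],
    ['if', [['assign', 'filter', 'array']], []],
    ['foreach', 'filter', 5],
];

$warnings = [];
analyse($program, [], $warnings);
print_r($warnings); // one warning: line 5: $filter may be NULL but is used as an array
```

Adding ['assign', 'filter', 'array'] just before the foreach entry, which models the (array)$filter cast from the question, makes the warning disappear because only the "array" fact reaches the loop.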
To do this well, you need a complete PHP parser producing ASTs, symbol tables telling you at least the scopes of PHP variables, some way to determine control and data flow, and a bunch of patterns over this collective set of information that detect various kinds of coding errors.
One such tool for PHP is PHPSat. It appears to do some of this and you can likely download and run it (I have no specific experience with it). The technology on which it is built, Stratego, is at least appropriate for the task; Stratego produces ASTs and can collect facts from various places in them, although I don't think it is very good at control and data flow. This is in contrast to a tool that simply has access to PHP tokens, such as PHPCS mentioned in another answer; computing control and data flow from just the tokens is such a nightmare that in practice it won't get done at all.
The right machinery seems to be hiding in Paul Biggar's thesis. I can't seem to find any hints that anybody picked this up and used it as the basis for a static analyzer, though.