Reverse Engineering a Perl script based on a core dump
A friend's server (yes, really. Not mine.) was broken into and we discovered a perl binary running some bot code. We could not find the script itself (probably eval'ed as received over the network), but we managed to create a core dump of the perl process.
Running strings on the core gave us some hints (hostnames, usernames / passwords), but not the source code of the script.
We'd like to know what the script was capable of doing, so we'd like to reverse-engineer the perl code that was running inside that perl interpreter.
Searching around, the closest thing to a perl de-compiler I found is the B::Deparse module, which seems to be perfectly suitable for converting the bytecode of the parse-trees back into readable code.
Now, how do I get B::Deparse to operate on a core dump? Or, alternatively, how could I restart the program from the core, load B::Deparse and execute it?
Any ideas are welcome.
ysth asked me on IRC to comment on your question. I've done a whole pile of stuff "disassembling" compiled perl and stuff (just see my CPAN page [http://search.cpan.org/~jjore]).
Perl compiles your source to a tree of OP* structs, which occasionally have C pointers to SV*, which are perl values. Your core dump now has a bunch of those OP* and SV* stashed.
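To make the OP* tree concrete, here is a minimal sketch that walks the op tree of a live sub through the B module and prints each op's name. The sub `sample` and the helper `op_names` are made up for illustration; the walk follows the same first-child / sibling pattern that B's own tree walkers use. It only works on a running interpreter, not on a core dump.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

# Hypothetical sub whose compiled form we want to look at.
sub sample { my $x = shift; return $x + 1 }

# Collect the name of every op reachable from $op, depth first.
sub op_names {
    my ($op) = @_;
    my @names = ( $op->name );

    # Only ops flagged OPf_KIDS have children: visit the first child,
    # then each of its siblings, until we hit a B::NULL (address 0).
    if ( $op->flags & OPf_KIDS ) {
        for ( my $kid = $op->first; $$kid; $kid = $kid->sibling ) {
            push @names, op_names($kid);
        }
    }
    return @names;
}

my $cv = svref_2object( \&sample );    # B::CV object wrapping the CV*
our @names = op_names( $cv->ROOT );    # ROOT is the top OP* of the sub

print join( ' ', @names ), "\n";
```

The exact list of op names varies between perl versions, so treat the output as a guide to the shape of the tree rather than a stable format.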
The best possible world would be to have a perl module like B::Deparse do the information-understanding work for you. It works by using a light interface to perl memory in the B::OP and B::SV classes (documented in B, perlguts, and perlhack). This is unrealistic for you because a B::* object is just a pointer into memory with accessors to decode the struct for our use. Consider:
require Data::Dumper;
require Scalar::Util;
require B;

my $value   = 'this is a string';
my $sv      = B::svref_2object( \ $value );
my $address = Scalar::Util::refaddr( \ $value );

local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Purity   = 1;
print Data::Dumper::Dumper(
    {
        address => $address,
        value   => \ $value,
        sv      => $sv,
        sv_attr => {
            CUR           => $sv->CUR,
            LEN           => $sv->LEN,
            PV            => $sv->PV,
            PVBM          => $sv->PVBM,
            PVX           => $sv->PVX,
            as_string     => $sv->as_string,
            FLAGS         => $sv->FLAGS,
            MAGICAL       => $sv->MAGICAL,
            POK           => $sv->POK,
            REFCNT        => $sv->REFCNT,
            ROK           => $sv->ROK,
            SvTYPE        => $sv->SvTYPE,
            object_2svref => $sv->object_2svref,
        },
    }
);
When run, this shows that the B::PV object (which ISA B::SV) is truly just an interface to the memory representation of the compiled string "this is a string":
$VAR1 = {
          'address' => 438506984,
          'sv' => bless( do{\(my $o = 438506984)}, 'B::PV' ),
          'sv_attr' => {
                         'CUR' => 16,
                         'FLAGS' => 279557,
                         'LEN' => 24,
                         'MAGICAL' => 0,
                         'POK' => 1024,
                         'PV' => 'this is a string',
                         'PVBM' => 'this is a string',
                         'PVX' => 'this is a string',
                         'REFCNT' => 2,
                         'ROK' => 0,
                         'SvTYPE' => 5,
                         'as_string' => 'this is a string',
                         'object_2svref' => \'this is a string'
                       },
          'value' => do{my $o}
        };
$VAR1->{'value'} = $VAR1->{'sv_attr'}{'object_2svref'};
This, however, implies that any code using B::* must actually operate on live memory. Tye McQueen thought he remembered a C debugger which could fully revive a working process given a core dump. My gdb can't. gdb can let you dump the contents of your OP* and SV* structs. You would most likely just read the dumped structs to interpret your program's structure. You could, if you wished, use gdb to dump the structs, then synthetically create B::* objects which behave, in interface, as if they were ordinary ones, and use B::Deparse on those. At root, our deparser and other debug dumping tools are mostly object oriented, so you could just "fool" them by creating a pile of fake B::* classes and objects.
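As a rough illustration of the "fake object" idea, the sketch below builds a stand-in for a B::PV that answers the same accessor calls from a plain hash instead of live memory. The package name `Fake::B::PV` and the helper `describe_pv` are invented for this example, and the field values are copied by hand from the Data::Dumper output shown earlier; in a real recovery they would come from structs printed by the debugger. Getting B::Deparse itself to accept such objects would need far more of the B::* interface (ops, pads, CVs) than this shows.

```perl
use strict;
use warnings;

# Hypothetical stand-in for B::PV, backed by a hash rather than live memory.
package Fake::B::PV;

sub new {
    my ( $class, %fields ) = @_;
    return bless {%fields}, $class;
}

# Generate read-only accessors named after some of the real B::PV methods.
for my $name (qw( PV PVX CUR LEN FLAGS REFCNT )) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$name" } = sub { $_[0]{$name} };
}

package main;

# Values as they appeared in the dump of the string's SV.
our $fake = Fake::B::PV->new(
    PV     => 'this is a string',
    PVX    => 'this is a string',
    CUR    => 16,
    LEN    => 24,
    FLAGS  => 279557,
    REFCNT => 2,
);

# Code written against the B::PV interface only calls methods,
# so it cannot tell the fake from the real thing.
sub describe_pv {
    my ($sv) = @_;
    return sprintf '%s (%d of %d bytes used)', $sv->PV, $sv->CUR, $sv->LEN;
}

print describe_pv($fake), "\n";
```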
You may find reading the B::Deparse class's coderef2text method instructive. It accepts a function reference, casts it to a B::CV object, and uses that as input to the deparse_sub method:
require B;
require B::Deparse;
sub your_function { ... }
my $cv = B::svref_2object( \ &your_function );
my $deparser = B::Deparse->new;
print $deparser->deparse_sub( $cv );
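For comparison, here is a small runnable sketch that uses the public coderef2text method on a live sub. The sub `add_one` is just a placeholder. coderef2text returns the body of the sub, braces included, so the `sub add_one` prefix is added by hand when printing.

```perl
use strict;
use warnings;
use B::Deparse ();

# Placeholder sub to deparse.
sub add_one { my ($n) = @_; return $n + 1 }

my $deparser = B::Deparse->new;

# coderef2text sets up the deparser state and then deparses the CV.
our $text = $deparser->coderef2text( \&add_one );

print "sub add_one $text\n";
```

The deparsed text is regenerated from the op tree, so it is equivalent to the original source but may differ in layout, and it includes any pragmas that were in effect when the sub was compiled.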
For gentler introductions to OP* and related ideas, see the updated PerlGuts Illustrated and Optree guts.
I doubt there's a tool out there that does this out of the box, so...
1. Find the source code to the version of perl you were running. This should help you understand the memory layout of the perl interpreter. It will also help you figure out if there's a way to take a shortcut here (e.g. if bytecode is preceded by an easy to find header in memory or something).
2. Load up the binary + core dump in a debugger, probably gdb.
3. Use the information in the perl source code to guide you in convincing the debugger to spit out the bytecode you're interested in.
4. Once you have the bytecode, B::Deparse should be able to get you to something more readable.
Well, undump will turn that core dump back into a binary executable (if you can find a working version). You should then be able to load that into perl and -MO=Deparse it.