mirror of
https://codeberg.org/KeybadeBlox/JSRF-Decompilation.git
synced 2026-02-20 02:07:02 +03:00
Create docs directory; begin "Decompiling C++"
parent
683818b637
commit
547f2ba179
3 changed files with 159 additions and 19 deletions
152
documentation/decompilingcpp.md
Normal file
@ -0,0 +1,152 @@
# Decompiling C++
Like most (all?) Xbox titles and most sixth-generation games more generally,
JSRF is written not in assembly or C, as earlier games were, but in C++. C++
introduces new features that both complicate the final machine code and weaken
the correspondence between said machine code and the original C++ source.

This guide will cover various C++ features appearing in JSRF, explaining how
they manifest in the game's executable and how to properly decompile them, to
the extent possible. Basic familiarity with C features (e.g. functions,
structs) and how to decompile them is assumed.


## Name Mangling
(TODO: cover name mangling, on the off chance you actually get symbol names,
like from debug info; also why symbol names don't match in objdiff)


## Classes
C++ classes evolve the C struct to associate the data structure with functions,
which are called methods in this context. Classes can also inherit from one or
more other classes, gaining their data members and access to their methods.
Certain special methods called constructors and destructors can also be added
to a class, and these can be called implicitly when an instance of a class
comes into or goes out of scope. Classes can also have fields and methods
marked as private, but these permissions are usually completely erased during
compilation and don't need to be respected by a decompilation.

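To make the implicit constructor and destructor calls concrete, here is a
minimal sketch. `Tracker`, `liveCount` and `countInsideScope` are hypothetical
names invented for this illustration and have nothing to do with JSRF's code.

```c++
// Hypothetical example: counts how many Tracker objects currently exist
static int liveCount = 0;

struct Tracker {
    Tracker()  { liveCount += 1; }  // Runs when an instance comes into scope
    ~Tracker() { liveCount -= 1; }  // Runs when an instance goes out of scope
};

// Returns the value of liveCount observed while a Tracker is alive
int countInsideScope() {
    int seen;
    {
        Tracker t;         // Constructor called implicitly here
        seen = liveCount;  // One higher than before the scope was entered
    }                      // Destructor called implicitly here
    return seen;
}
```

Neither call appears in the source of `countInsideScope()`, but both would show
up as ordinary function calls in its disassembly.
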
### `class` vs. `struct`
The `struct` keyword can still be used in C++ and is equivalent to `class`,
except that the former makes all members public by default and the latter makes
them all private by default. Since there's not much reason to make anything
private in a decompilation, one will usually use `struct` declarations rather
than `class`.

```c++
// These two declarations are equivalent
class SomeClass {
public:  // Makes everything after this public
    float    someMemberVariable;
    unsigned anotherMemberVariable;
};

struct SomeStruct {
    float    someMemberVariable;
    unsigned anotherMemberVariable;
};
```

A reasonable way to implement a struct with a parent class in Ghidra is to
define the base class normally, and then define the child with a first member
called `super` of the parent class type. Members specific to the child class
can then be inserted afterwards.

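As a rough check of why the `super` approach works, the sketch below compares
an inherited struct against a struct that embeds its parent as a first member.
`SomeStructChild_Ghidra` and the two helper functions are hypothetical names
made up for this example, and the layouts are only expected to agree for simple
cases like this one (no virtual methods, single inheritance) on typical
compilers.

```c++
#include <cstddef>

// Base and child as they might be written in C++ source
struct SomeStruct {
    float    someMemberVariable;
    unsigned anotherMemberVariable;
};

struct SomeStructChild : SomeStruct {
    char * additionalMemberVariable;
};

// Hypothetical Ghidra-style equivalent: parent embedded as a member `super`
struct SomeStructChild_Ghidra {
    SomeStruct super;
    char *     additionalMemberVariable;
};

// Byte offset of the child's own member in the real inherited struct
std::size_t inheritedOffset() {
    SomeStructChild child = SomeStructChild();
    char const * base   = reinterpret_cast<char const *>(&child);
    char const * member =
        reinterpret_cast<char const *>(&child.additionalMemberVariable);
    return static_cast<std::size_t>(member - base);
}

// Byte offset of the same member in the Ghidra-style struct
std::size_t ghidraOffset() {
    return offsetof(SomeStructChild_Ghidra, additionalMemberVariable);
}
```

If the two offsets match, a field access decompiled against the Ghidra-style
struct reads the same bytes that the original C++ code did.
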
### Class Methods
Methods are functions declared within a class's scope, like so:
```c++
class SomeClass {
public:  // Without this, a class's members would all be private
    // Regular data members
    float    someMemberVariable;
    unsigned anotherMemberVariable;

    // Methods declared in class definition
    SomeClass(int anArgument);  // Constructor
    ~SomeClass();               // Destructor

    void         regularMethod(unsigned anArgument);
    virtual void virtualMethod(char * anArgument);
    static void  staticMethod (char * anArgument);

    // Can also provide entire definition in class
    float anotherMethod(float x) {
        this->someMemberVariable += x;
        return this->someMemberVariable;
    }
};

// Definition of a method declared in class
void SomeClass::regularMethod(unsigned anArgument) {
    this->anotherMemberVariable -= anArgument;
}
```

Methods can then be accessed and called with member access syntax, like
`classInstance.regularMethod(3)` and `instancePtr->anotherMethod(1.2)`.

Static methods are indistinguishable from regular functions in compiled code,
so they probably won't see much use in decompilations. They don't have access
to the `this` pointer that other types of methods can use.

Regular methods are similar to regular functions, but have an implicit first
argument called `this` that points to the object that the method was called
on. Some C++ implementations use a different calling convention for method
calls: Microsoft's implementation for the Xbox, for instance, uses the
`__thiscall` convention, where the `this` pointer is passed in the ECX register
while all other arguments are passed on the stack.

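The implicit `this` argument can be illustrated by writing the same method
twice, once as a method and once as a free function taking the object pointer
explicitly. `SomeClass_regularMethod`, `self` and `bothFormsAgree` are
hypothetical names for this sketch; it uses the default calling convention
rather than `__thiscall`, so it only shows the logical equivalence, not the
register usage.

```c++
struct SomeClass {
    float    someMemberVariable;
    unsigned anotherMemberVariable;

    void regularMethod(unsigned anArgument) {
        this->anotherMemberVariable -= anArgument;
    }
};

// Hypothetical free-function view of the same method, as a decompiler might
// show it: `self` stands in for the implicit `this` argument
void SomeClass_regularMethod(SomeClass * self, unsigned anArgument) {
    self->anotherMemberVariable -= anArgument;
}

// Applies both forms to identical objects and reports whether they agree
bool bothFormsAgree() {
    SomeClass a = { 1.0f, 10u };
    SomeClass b = { 1.0f, 10u };
    a.regularMethod(3);
    SomeClass_regularMethod(&b, 3);
    return a.anotherMemberVariable == b.anotherMemberVariable
        && a.anotherMemberVariable == 7u;
}
```
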
Constructors and destructors function largely like regular methods, but
implicitly return the `this` pointer.

Virtual methods are methods that can be overridden in child classes. They're
not called directly, but instead called through a hidden first member that
points to an array of method function pointers, usually called a vtable (Visual
C++ 7 calls it `` ClassName::`vftable' ``). If a destructor specifically is
made virtual, additional "deleting destructors" may be generated as well, which
are methods taking one boolean argument that call the destructor and then,
depending on the argument, free the object's memory.

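The vtable mechanism can be imitated by hand with plain structs and function
pointers, which is roughly the shape it takes in a decompiler. Everything in
this sketch (`Base`, `BaseVtable`, `callVirtual` and so on) is a hypothetical
stand-in, not a JSRF class or the compiler's actual output.

```c++
// Hypothetical C-style view of a class with one virtual method
struct Base;

// The "vtable": function pointers shared by all instances of a class
struct BaseVtable {
    int (*virtualMethod)(Base * self, int anArgument);
};

struct Base {
    BaseVtable const * vtable;  // Hidden first member in real compiled code
    int                value;
};

// Implementations for the base class and for a child that overrides it
int Base_virtualMethod (Base * self, int x) { return self->value + x; }
int Child_virtualMethod(Base * self, int x) { return self->value * x; }

static BaseVtable const baseVtable  = { Base_virtualMethod  };
static BaseVtable const childVtable = { Child_virtualMethod };

// What a call like `obj->virtualMethod(x)` boils down to
int callVirtual(Base * obj, int x) {
    return obj->vtable->virtualMethod(obj, x);
}

// Same call site, different behaviour depending on the object's vtable
int callOnBase()  { Base b = { &baseVtable,  5 }; return callVirtual(&b, 3); }
int callOnChild() { Base c = { &childVtable, 5 }; return callVirtual(&c, 3); }
```

In real compiled code the constructor is what stores the vtable pointer into
the object, which is one way to tell which class a constructor belongs to.
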
(TODO: how to implement methods and vtables in Ghidra)

### Inheritance
Child classes can be used in most places that their parent class can be used:
```c++
// Class inheriting from SomeStruct
struct SomeStructChild : SomeStruct {
    // Inherits these from SomeStruct:
    // float    someMemberVariable;
    // unsigned anotherMemberVariable;
    char * additionalMemberVariable;
};

// Could call this with either a SomeStruct* or SomeStructChild* argument
float getSomeMemberVariable(SomeStruct const * const ss) {
    return ss->someMemberVariable;
}
```


## The `new` and `delete` Operators
Objects can be allocated on the heap with `new` and released with `delete`.
The former can both allocate and construct the object, while the latter
destroys the object and then frees its memory, analogous to calling `free()`.
Each has a corresponding `operator new()` or `operator delete()` function
called implicitly.

The generated code for a use of `new` with a constructor (like
`SomeStruct * ss = new SomeStruct(7)`) performs the allocator and constructor
calls separately, roughly as follows (as it would appear in Ghidra; note that
Ghidra shows explicitly the passing of the `this` pointer):
```c++
SomeStruct *ss;
ss = (SomeStruct *)operator_new(0xc);
if (ss == NULL) {
    ss = NULL;  // No, I'm not sure what the point of reassigning NULL is
}
else {
    SomeStruct::SomeStruct(ss, 7);  // `this` passed explicitly as first argument
}
```
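
For comparison, the same two-step sequence can be written by hand in standard
C++ using `::operator new` and placement `new`. This is only a sketch of the
idea: the `SomeStruct` below is a simplified stand-in with a single `value`
member, and `manualNew`, `manualDelete` and `roundTrip` are hypothetical
helpers. Standard `operator new` reports failure by throwing rather than
returning `NULL`, so the `NULL` check seen above is left out.

```c++
#include <new>

// Simplified stand-in for a class with a constructor and destructor
struct SomeStruct {
    int value;
    SomeStruct(int anArgument) : value(anArgument) {}
    ~SomeStruct() {}
};

// Roughly what `new SomeStruct(anArgument)` expands to
SomeStruct * manualNew(int anArgument) {
    void * memory = ::operator new(sizeof(SomeStruct));  // Allocator call
    return new (memory) SomeStruct(anArgument);          // Constructor call
}

// Roughly what `delete ss` expands to
void manualDelete(SomeStruct * ss) {
    if (ss != 0) {
        ss->~SomeStruct();      // Destructor call
        ::operator delete(ss);  // Deallocator call
    }
}

// Allocates, reads back the constructed value, then frees
int roundTrip() {
    SomeStruct * ss = manualNew(7);
    int const result = ss->value;
    manualDelete(ss);
    return result;
}
```

When decompiling, a pair of calls like this is normally written back as a
single `new` or `delete` expression rather than spelled out.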


## Exception Handling

@ -170,13 +170,9 @@ importing headers from `decompile/src/` into Ghidra
 (`File > Parse C Source...`) to get access to JSRF classes, and use Ghidra's
 decompilation of the function in the CodeBrowser as a starting point for
 writing your matching function, exercising whatever C++ and x86 assembly
-knowledge you have. Exception handling code in particular can appear in
-unexpected places (e.g. around `new` statements and in constructors) and has
-unambiguous but nonobvious signs in the disassembly, so it might be worth
-[reading](https://www.openrce.org/articles/full_view/21) up
-[on](https://www.openrce.org/articles/full_view/23) how they're
-[implemented](https://web.archive.org/web/20101007110629/http://www.microsoft.com/msj/0197/exception/exception.aspx)
-to learn to recognize them in disassembly and recreate them in C++ code.
+knowledge you have. If you have basic decompilation experience but are new to
+decompiling C++ specifically, you might want to take a look at the
+[Decompiling C++](decompilingcpp.md) article.

 Whenever you have some decompiled code that you'd like to contribute to the
 repository, commit it to your local copy of the repository and create a merge

@ -280,12 +276,3 @@ actually been implemented yet.

 When you have it all sorted out, make a merge request to share your work with
 us!
-
-
-# Special Topics
-This would be a good place to include guidance on some trickier aspects of
-reverse engineering C++ code, like an accessible explanation of navigating
-exception handling in Ghidra, implementing classes with virtual methods or
-inheritance Ghidra and writing decompiled code for them, or what in the world a
-COM object is and how to make Ghidra understand it (especially the one wrapping
-all of JSRF's Direct3D calls).

@ -16,9 +16,10 @@ The approach of this decompilation is to:

 We are currently engaging in the first two steps simultaneously, decompiling
 code as it's delinked. Further details on these steps can be found in the
-[contribution guide](contributing.md). Step 3 will use the linker from the
-same Visual C++ 7.0 already used to compile object files. Step 4 is expected
-to use the `cxbe` tool found in e.g. [nxdk](https://github.com/XboxDev/nxdk).
+[contribution guide](documentation/gettingstarted.md). Step 3 will use the
+linker from the same Visual C++ 7.0 already used to compile object files. Step
+4 is expected to use the `cxbe` tool found in e.g.
+[nxdk](https://github.com/XboxDev/nxdk).

 ## Contributing
 Anybody interested in joining the effort is welcome to read the