mirror of
https://codeberg.org/KeybadeBlox/JSRF-Decompilation.git
synced 2026-02-20 02:07:02 +03:00
Create docs directory; begin "Decompiling C++"
parent 683818b637
commit 547f2ba179
3 changed files with 159 additions and 19 deletions
152
documentation/decompilingcpp.md
Normal file
@ -0,0 +1,152 @@
# Decompiling C++
Like most (all?) Xbox titles and most sixth-generation games more generally,
JSRF is written not in assembly or C, as earlier games were, but in C++.
C++ introduces new features that both complicate the final machine code and
weaken the correspondence between that machine code and the original C++
source.

This guide covers various C++ features appearing in JSRF, explaining how
they manifest in the game's executable and how to properly decompile them, to
the extent possible. Basic familiarity with C features (e.g. functions,
structs) and how to decompile them is assumed.


## Name Mangling
(on the off chance you actually get symbol names, like from debug info; also
why symbol names don't match in objdiff)


## Classes
C++ classes extend the C struct by associating the data structure with code,
in the form of functions called methods. Classes can also inherit from one or
more other classes, sharing their data members and access to their methods.
Certain special methods called constructors and destructors can also be added
to a class, and these can be called implicitly when an instance of a class
comes into or goes out of scope. Classes can also have fields and methods
marked as private, but these permissions are usually completely erased during
compilation and don't need to be respected by a decompilation.

### `class` vs. `struct`
The `struct` keyword can still be used in C++ and is equivalent to `class`,
except that `struct` makes all members public by default while `class` makes
them private by default. Since there's not much reason to make anything private
in a decompilation, one will usually declare types with `struct` rather than
`class`.

```c++
// These two declarations are equivalent
class SomeClass {
public: // Makes everything after public
    float someMemberVariable;
    unsigned anotherMemberVariable;
};

struct SomeStruct {
    float someMemberVariable;
    unsigned anotherMemberVariable;
};
```

A reasonable way to implement a struct that inherits from another in Ghidra is
to define the base class normally, and then define the child with a first
member called `super` of the parent class type. Members specific to the child
class can then be inserted afterwards.
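
As a rough check that the `super` approach describes the same bytes as real inheritance, the sketch below declares a child both ways and compares sizes and field offsets. `SomeStructChildFlat`, `offsetIn` and `layoutsMatch` are names made up for this illustration, and the result reflects how mainstream compilers lay out simple single inheritance rather than anything the C++ standard guarantees.

```c++
// Parent type, declared as in the examples above.
struct SomeStruct {
    float    someMemberVariable;
    unsigned anotherMemberVariable;
};

// Child written with real C++ inheritance.
struct SomeStructChild : SomeStruct {
    char * additionalMemberVariable;
};

// Hypothetical Ghidra-style equivalent: the parent embedded as a first member
// named `super`, with child-specific members after it.
struct SomeStructChildFlat {
    SomeStruct super;
    char *     additionalMemberVariable;
};

// Byte offset of a member within the object that contains it.
template <typename Object, typename Member>
long offsetIn(Object const & object, Member const & member) {
    return static_cast<long>(
        reinterpret_cast<char const *>(&member) -
        reinterpret_cast<char const *>(&object));
}

// True if both declarations have the same size and the same field offsets.
bool layoutsMatch() {
    SomeStructChild     child = SomeStructChild();
    SomeStructChildFlat flat  = SomeStructChildFlat();

    return sizeof child == sizeof flat
        && offsetIn(child, child.someMemberVariable)
           == offsetIn(flat, flat.super.someMemberVariable)
        && offsetIn(child, child.anotherMemberVariable)
           == offsetIn(flat, flat.super.anotherMemberVariable)
        && offsetIn(child, child.additionalMemberVariable)
           == offsetIn(flat, flat.additionalMemberVariable);
}
```

If `layoutsMatch()` returns true for the compiler in use, a field read through `super` in Ghidra corresponds to the inherited field of the same name in the C++ source.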

### Class Methods
Methods are functions declared within a class's scope, like so:
```c++
class SomeClass {
public: // Without this, members of a class are private and can't be used from outside
    // Regular data members
    float someMemberVariable;
    unsigned anotherMemberVariable;

    // Methods declared in class definition
    SomeClass(int anArgument); // Constructor
    ~SomeClass(); // Destructor

    void regularMethod(unsigned anArgument);
    virtual void virtualMethod(char * anArgument);
    static void staticMethod(char * anArgument);

    // Can also provide entire definition in class
    float anotherMethod(float x) {
        this->someMemberVariable += x;
        return this->someMemberVariable;
    }
};

// Definition of a method declared in class
void SomeClass::regularMethod(unsigned anArgument) {
    this->anotherMemberVariable -= anArgument;
}
```

Methods can then be accessed and called with member access syntax, like
`classInstance.regularMethod(3)` and `instancePtr->anotherMethod(1.2)`.

Static methods are indistinguishable from regular functions in compiled code,
so they probably won't see much use in decompilations. They don't have access
to the `this` pointer that other types of methods can use.

Regular methods are similar to regular functions, but have an implicit first
argument called `this`, a pointer to the object that the method was called
on. Some C++ implementations use a different calling convention for method
calls; Microsoft's implementation for the Xbox, for example, uses the
`__thiscall` convention, where the `this` pointer is passed in the ECX register
while all other arguments are passed on the stack.
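
To make the implicit `this` argument concrete, here is a minimal sketch that writes the same method twice: once as a method, and once as a free function with the pointer spelled out. The struct is a trimmed-down stand-in for `SomeClass`, and `SomeClass_regularMethod`, `viaMethod` and `viaFreeFunction` are hypothetical names used only for this illustration.

```c++
// Trimmed-down stand-in for SomeClass with a single regular method.
struct SomeClass {
    float    someMemberVariable;
    unsigned anotherMemberVariable;

    void regularMethod(unsigned anArgument) {
        this->anotherMemberVariable -= anArgument;
    }
};

// Hypothetical free-function spelling of the same method, with the implicit
// `this` argument written out.  Under __thiscall, `self` would arrive in ECX
// and `anArgument` on the stack.
void SomeClass_regularMethod(SomeClass * const self, unsigned anArgument) {
    self->anotherMemberVariable -= anArgument;
}

// Calls the method with member access syntax.
unsigned viaMethod(unsigned start, unsigned anArgument) {
    SomeClass instance = { 0.0f, start };
    instance.regularMethod(anArgument);
    return instance.anotherMemberVariable;
}

// Calls the free function, passing the object's address explicitly.
unsigned viaFreeFunction(unsigned start, unsigned anArgument) {
    SomeClass instance = { 0.0f, start };
    SomeClass_regularMethod(&instance, anArgument);
    return instance.anotherMemberVariable;
}
```

Both wrappers produce the same result. The difference in compiled code lies only in how the pointer is passed, which is why decompiler output for a method call tends to resemble the free-function form.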

Constructors and destructors function largely like regular methods, but
implicitly return the `this` pointer.

Virtual methods are methods that can be overridden in child classes. They're
not called directly, but instead called through a hidden first member that
points to an array of method function pointers, usually called a vtable (Visual
C++ 7 calls it `` ClassName::`vftable' ``). If a destructor specifically is
made virtual, additional "deleting destructors" may be generated as well, which
are methods taking one boolean argument that call the destructor and then,
depending on the argument, free the object's memory.
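
The vtable mechanism can be imitated by hand, which is roughly what a virtual call looks like once decompiled. The following is only a sketch under that assumption: `Shape`, `ShapeVtable` and every function in it are hypothetical names, and a real compiler generates both the table and the hidden pointer itself.

```c++
struct Shape;

// Hand-written stand-in for a compiler-generated vtable: one function pointer
// per virtual method, in declaration order.
struct ShapeVtable {
    int (*area)(Shape const * self);
};

struct Shape {
    ShapeVtable const * vtable; // Plays the role of the hidden first member
    int width;
    int height;
};

// Two "overrides" of the same virtual method.
int Rect_area(Shape const * self) { return self->width * self->height; }
int Tri_area (Shape const * self) { return self->width * self->height / 2; }

// One table per class, shared by all instances of that class.
ShapeVtable const rectVtable = { Rect_area };
ShapeVtable const triVtable  = { Tri_area };

// A virtual call: load the vtable pointer, pick the slot, call indirectly.
int callArea(Shape const * shape) {
    return shape->vtable->area(shape);
}

int rectArea(int width, int height) {
    Shape shape = { &rectVtable, width, height };
    return callArea(&shape);
}

int triArea(int width, int height) {
    Shape shape = { &triVtable, width, height };
    return callArea(&shape);
}
```

`callArea` never names `Rect_area` or `Tri_area`; which one runs depends only on the table the object points to, which is why virtual calls show up in disassembly as indirect calls through the object's first field.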

(TODO: how to implement methods and vtables in Ghidra)

### Inheritance
Child classes can be used in most places that their parent class can be used:
```c++
// Class inheriting from SomeStruct
struct SomeStructChild : SomeStruct {
    // Inherits these from SomeStruct:
    // float someMemberVariable;
    // unsigned anotherMemberVariable;
    char * additionalMemberVariable;
};

// Could call this with either a SomeStruct* or SomeStructChild* argument
float getSomeMemberVariable(SomeStruct const * const ss) {
    return ss->someMemberVariable;
}
```


## The `new` and `delete` Operators
One way to allocate and free objects in C++ is with `new` and `delete`. The
former can both allocate and construct the object, while the latter destroys
the object and frees its memory, analogous to calling `free()`. Each has a
corresponding `operator new()` or `operator delete()` function called
implicitly.

The generated code for a use of `new` with a constructor (like
`SomeStruct * ss = new SomeStruct(7)`) performs the allocator and constructor
calls separately, roughly as follows (as it would appear in Ghidra; note that
Ghidra explicitly shows the `this` pointer being passed):
```c++
SomeStruct *ss;
ss = (SomeStruct *)operator_new(0xc);
if (ss == NULL) {
    ss = NULL; // No, I'm not sure what the point of reassigning NULL is
}
else {
    SomeStruct::SomeStruct(ss, 7);
}
```
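
The same allocate-then-construct split can be written out by hand in standard C++, which may help when matching the pattern in the listing above. This is a sketch rather than what JSRF's runtime does: `Tracked`, `manualNew` and `manualDelete` are hypothetical names, and `std::nothrow` is used only so that the null check is reachable, which is an assumption about allocator behaviour and not something taken from the game.

```c++
#include <cstddef>
#include <new>

// Hypothetical class that counts live instances so construction and
// destruction can be observed.
struct Tracked {
    int value;
    static int liveCount;

    Tracked(int v) : value(v) { ++liveCount; }
    ~Tracked() { --liveCount; }
};
int Tracked::liveCount = 0;

// Manual equivalent of `new Tracked(v)`: allocate raw memory first, then run
// the constructor on it only if the allocation succeeded.
Tracked * manualNew(int v) {
    void * const memory = ::operator new(sizeof(Tracked), std::nothrow);
    if (memory == NULL) {
        return NULL;
    }
    return new (memory) Tracked(v); // Placement new: constructor call only
}

// Manual equivalent of `delete p`: run the destructor, then free the memory.
void manualDelete(Tracked * p) {
    if (p == NULL) {
        return;
    }
    p->~Tracked();
    ::operator delete(p);
}
```

Calling `manualNew(7)` followed by `manualDelete()` on the result takes `Tracked::liveCount` from 0 to 1 and back to 0, mirroring the two halves that the compiler emits for a single `new` or `delete` expression.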


## Exception Handling
@ -170,13 +170,9 @@ importing headers from `decompile/src/` into Ghidra
 (`File > Parse C Source...`) to get access to JSRF classes, and use Ghidra's
 decompilation of the function in the CodeBrowser as a starting point for
 writing your matching function, exercising whatever C++ and x86 assembly
-knowledge you have. Exception handling code in particular can appear in
-unexpected places (e.g. around `new` statements and in constructors) and has
-unambiguous but nonobvious signs in the disassembly, so it might be worth
-[reading](https://www.openrce.org/articles/full_view/21) up
-[on](https://www.openrce.org/articles/full_view/23) how they're
-[implemented](https://web.archive.org/web/20101007110629/http://www.microsoft.com/msj/0197/exception/exception.aspx)
-to learn to recognize them in disassembly and recreate them in C++ code.
+knowledge you have. If you have basic decompilation experience but are new to
+decompiling C++ specifically, you might want to take a look at the
+[Decompiling C++](decompilingcpp.md) article.
 
 Whenever you have some decompiled code that you'd like to contribute to the
 repository, commit it to your local copy of the repository and create a merge
@ -280,12 +276,3 @@ actually been implemented yet.
 
 When you have it all sorted out, make a merge request to share your work with
 us!
-
-
-# Special Topics
-This would be a good place to include guidance on some trickier aspects of
-reverse engineering C++ code, like an accessible explanation of navigating
-exception handling in Ghidra, implementing classes with virtual methods or
-inheritance Ghidra and writing decompiled code for them, or what in the world a
-COM object is and how to make Ghidra understand it (especially the one wrapping
-all of JSRF's Direct3D calls).
@ -16,9 +16,10 @@ The approach of this decompilation is to:
 
 We are currently engaging in the first two steps simultaneously, decompiling
 code as it's delinked. Further details on these steps can be found in the
-[contribution guide](contributing.md). Step 3 will use the linker from the
-same Visual C++ 7.0 already used to compile object files. Step 4 is expected
-to use the `cxbe` tool found in e.g. [nxdk](https://github.com/XboxDev/nxdk).
+[contribution guide](documentation/gettingstarted.md). Step 3 will use the
+linker from the same Visual C++ 7.0 already used to compile object files. Step
+4 is expected to use the `cxbe` tool found in e.g.
+[nxdk](https://github.com/XboxDev/nxdk).
 
 ## Contributing
 Anybody interested in joining the effort is welcome to read the