First-er draft of contributing.md

Added some notes on keeping data structure definitions in decompiled source
files, plus a placeholder section for special topics to be written up in the
future.
KeybadeBlox 2025-12-18 21:36:30 -05:00
parent 83553a3d24
commit 6ac4cdc5ed

@@ -145,7 +145,9 @@ automatically. Otherwise, one has to click on one of the corresponding
functions in one pane and the other function in the other pane to tell objdiff
to link them. Common cases of this are class methods (the names won't match)
and implicitly generated functions, such as exception handling code placed in
`.text$x` in the recompiled object file.
`.text$x` in the recompiled object file. Keep in mind that objdiff also
appears to misidentify many symbols as functions even if they're data in e.g.
the `.data` section, which confuses the overall match percentage somewhat.
Clicking on a function that's been linked across both object files shows a diff
of the disassembly of both versions of the function, with any differences
@@ -155,17 +157,18 @@ reaches 100%. Depending on how you configure objdiff, it will rebuild
automatically whenever you save a change to a source file, or you can manually
rebuild with the "Build" button at the top of the right pane.
There are no hard instructions to give for writing decompiled code. Use
Ghidra's decompilation of the function in the CodeBrowser as a starting point,
and exercise whatever C++ and x86 assembly knowledge you have. Exception
handling code in particular can appear in unexpected places (around `new`
statements, in constructors) and has unambiguous but nonobvious signs in the
disassembly, so it might be worth
[reading](https://www.openrce.org/articles/full_view/21)
[up](https://www.openrce.org/articles/full_view/23)
[on](https://web.archive.org/web/20101007110629/http://www.microsoft.com/msj/0197/exception/exception.aspx)
how they're implemented to learn to recognize them in disassembly and recreate
them in C++ code.
There are no concrete instructions to give for writing decompiled code. Try
importing headers from `decompile/src/` into Ghidra
(`File > Parse C Source...`) to get access to JSRF classes, and use Ghidra's
decompilation of the function in the CodeBrowser as a starting point for
writing your matching function, exercising whatever C++ and x86 assembly
knowledge you have. Exception handling code in particular can appear in
unexpected places (e.g. around `new` statements and in constructors) and has
unambiguous but nonobvious signs in the disassembly, so it might be worth
[reading](https://www.openrce.org/articles/full_view/21) up
[on](https://www.openrce.org/articles/full_view/23) how they're
[implemented](https://web.archive.org/web/20101007110629/http://www.microsoft.com/msj/0197/exception/exception.aspx)
to learn to recognize them in disassembly and recreate them in C++ code.
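As a rough illustration of why exception handling shows up around `new`, here
is a minimal sketch in standard C++. `Widget`, `makeWidget` and
`constructorFailureFreesMemory` are hypothetical names invented for this
example and are not taken from JSRF. No `try` block is written in
`makeWidget`, yet the compiler still has to emit cleanup code there: if the
constructor throws, the memory obtained from `operator new` is released
through the matching `operator delete` before the exception propagates.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical class, used only to illustrate the point.
struct Widget {
    static int frees;  // counts calls to Widget::operator delete

    explicit Widget(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }

    static void operator delete(void* p) {
        ++frees;
        std::free(p);
    }
};

int Widget::frees = 0;

// No try/catch appears here, but the compiler generates unwinding code:
// a throwing constructor causes the allocation to be freed automatically.
Widget* makeWidget(bool fail) {
    return new Widget(fail);
}

// Returns true if the implicit cleanup ran exactly once.
bool constructorFailureFreesMemory() {
    int before = Widget::frees;
    try {
        makeWidget(true);
    } catch (const std::runtime_error&) {
        return Widget::frees == before + 1;
    }
    return false;
}
```

In disassembly, that implicit cleanup is the kind of code that can end up in a
separate exception handling section rather than inline with the function body,
so a function containing nothing but a `new` expression may still carry
exception handling scaffolding.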
Whenever you have some decompiled code that you'd like to contribute to the
repository, commit it to your local copy of the repository and create a merge
@@ -247,7 +250,23 @@ Once an object is ready for extracting, its `Delink?` column should be set to
updated to include it (give it an entry in the `units` list, modelled after
other existing entries minus the `complete` and `symbol_mappings` fields), plus
a `.cpp` file (and `.hpp` file if suitable) should be added for it in
the `decompile/src/` directory. Give the extraction via `delink.sh` a test and
make sure everything's working right for this new object file in objdiff.
the `decompile/src/` directory. Make sure that any relevant data structures
you've figured out are included in the new source files, then give extraction
via `delink.sh` a test. Add a new prerequisite to `all:` at the top of the
`Makefile` in the root of the `decompile/` directory, and add an entry at the
bottom to record which header files need to be up to date to build the new
object file (including anything included transitively!). Finally, make sure
that the new object file builds in objdiff, even if its functions haven't
actually been implemented yet.
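One possible way to record a data structure in a header is sketched below. The
structure, its field names and its offsets are all hypothetical and only show
the idea: write down the offsets you found in Ghidra as comments, keep unknown
regions as padding arrays, and let the compiler check the layout. The
`LAYOUT_CHECK` macro is an assumption of this sketch rather than an existing
project helper; it uses a negative array size to fail the build, which should
also work on compilers that predate `static_assert`.

```cpp
#include <cstddef>  // offsetof

// Hypothetical compile-time check: the array size becomes -1, which is a
// compile error, whenever the condition is false.
#define LAYOUT_CHECK(name, cond) \
    typedef char layout_check_##name[(cond) ? 1 : -1]

// Hypothetical structure; names and offsets are illustrative only.
struct ExampleNode {
    unsigned int   id;        // 0x00
    unsigned short state;     // 0x04
    unsigned char  unk06[2];  // 0x06: purpose not yet known
    float          pos[3];    // 0x08
};

// Fail the build if the layout drifts from what was observed.
LAYOUT_CHECK(ExampleNode_state, offsetof(ExampleNode, state) == 0x04);
LAYOUT_CHECK(ExampleNode_pos,   offsetof(ExampleNode, pos)   == 0x08);
LAYOUT_CHECK(ExampleNode_size,  sizeof(ExampleNode)          == 0x14);
```

Checks like these catch accidental layout changes at build time, before they
show up as mismatches in objdiff.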
Finally, make a merge request to share your work with us!
When you have it all sorted out, make a merge request to share your work with
us!
# Special Topics
This would be a good place to include guidance on some trickier aspects of
reverse engineering C++ code, like an accessible explanation of navigating
exception handling in Ghidra, implementing classes with virtual methods or
inheritance in Ghidra and writing decompiled code for them, or what in the
world a COM object is and how to make Ghidra understand it (especially the one
wrapping all of JSRF's Direct3D calls).