Updated Decompiling a function (markdown)

AnonymousRandomPerson 2023-08-24 23:35:30 -04:00
parent a99783c490
commit eb006cceb2

@ -95,6 +95,7 @@ And now to add the load to the function body.
```
s32 param_0_value = *param_0;
```
Note that the line of code above is currently optimized out of the compiled assembly, since `param_0_value` is loaded but not used. Don't worry, it will be used by the time the end of the function is reached.
Now for the next lines of assembly:
```
@ -114,27 +115,88 @@ else
param_0_result = 0;
}
```
This code can be simplified to:
```
s32 param_0_result = param_0_value != 0;
```
The code above remains optimized out of the compiled assembly. Even though `param_0_value` is used to assign `param_0_result`, `param_0_result` is not used, so this section of code is still considered unused.
Since `param_0_result` is now used
The function can be cleaned up a bit. For example, the `param_1 == 0x0` would be clearer as `param_1 == NULL`. If you know what the function does in the context of game functionality, you can clean up further by naming variables. Cleanup is not strictly required as long as the function is matching, but it will help people who are reading the function you decompiled.
The next line of assembly is:
```
and r0, r0, #0xff
```
Taken literally, this would be the following line of code:
```
param_0_result &= 0xFF;
```
We will revisit this later. For now, let's continue to the last line of assembly:
```
bx lr
```
The function returns after assigning a value to `r0`, which means the current value of `r0` at this point is the return value. In this case, that would be `param_0_result`.
```
return param_0_result;
```
Now that `param_0_result` is used, the assembly for the code above is generated.
![Screenshot 2023-08-24 at 10 46 36 PM](https://github.com/pret/pmd-sky/assets/6516839/c8be36e0-2aad-48c2-b13b-65a8a91311f3)
Cool, a match! Technically this could be considered a stopping point, but let's look back at the code and clean it up a bit now that it matches.
The first item of note is the `&= 0xFF`. A bitwise `and` with `0xFF` is special, as it takes the 8 least significant bits of the number. This indicates that `param_0_result` is likely a `u8`, with the `and` being an automatic cast added by the compiler. This means it is not necessary to add the `&=` manually, and the previous line can instead be:
```
u8 param_0_result = param_0_value != 0;
```
If you make this change, the compiled assembly still matches. This demonstrates an important point: there are often multiple ways to write C code that all produce the same assembly.
Note that the type is specifically an unsigned `u8` type rather than a signed `s8` type. A signed type often produces different assembly, as the sign bit needs to be handled specially. For example, if you change the `u8` to an `s8` here, the assembly will use `lsl` and `asr` to cast the value rather than `and`.
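To make the two casts concrete, here is a minimal sketch in plain C of what each one computes. The typedefs are stand-ins assumed to resemble the decomp's own, and the helper function names are hypothetical and exist only for this illustration. The shift-based version relies on arithmetic right shift of negative values, which is implementation-defined in C but is what common compilers do.

```c
#include <stdint.h>

/* Stand-in typedefs; assumed to resemble the decomp's custom types. */
typedef uint8_t u8;
typedef int8_t s8;
typedef int32_t s32;
typedef uint32_t u32;

/* Casting to u8 keeps only the 8 least significant bits... */
u8 cast_to_u8(s32 value)
{
    return (u8)value;
}

/* ...which is the same result as an explicit `and` with 0xFF. */
u8 mask_low_byte(s32 value)
{
    return (u8)(value & 0xFF);
}

/* Casting to s8 must also copy bit 7 into the upper bits (sign extension).
 * Out-of-range conversion to a signed type is implementation-defined. */
s32 cast_to_s8(s32 value)
{
    return (s8)value;
}

/* The same sign extension written as a left shift followed by an
 * arithmetic right shift, mirroring an lsl/asr pair by 24 bits. */
s32 sign_extend_with_shifts(s32 value)
{
    return (s32)((u32)value << 24) >> 24;
}
```

For an input of `0xFF`, the unsigned helpers give 255 while the signed helpers give -1, which is why the compiler cannot reuse a plain `and` for the signed case.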
Since the returned value is a `u8`, the function's return type can be changed to that as well.
```
u8 ov29_022E0354(s32 *param_0)
```
Back in the `if` statement, `param_0` is a pointer, so it can be compared to the `NULL` macro instead of 0 for clarity.
```
if (param_0 == NULL)
```
Note that the function only returns 1 or 0, which indicates that the return type is boolean. There is no specific `bool8` type, so the `u8` type will suffice here. However, the `return 0` can be changed to use the boolean macros, turning into `return FALSE`. Remember to use the special boolean macros (`TRUE` and `FALSE`) instead of the regular boolean keywords (`true` and `false`).
Now for some more standard code cleanup. All of this:
```
s32 param_0_value = *param_0;
u8 param_0_result = param_0_value != 0;
return param_0_result;
```
can be simplified to:
```
return *param_0 != 0;
```
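Putting the cleanup steps together, the whole function might look like the sketch below. The typedefs and the `TRUE`/`FALSE` macros are assumed stand-ins so that the snippet compiles on its own; in the decomp they would come from the project's headers, and the brace style here is only a guess.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the decomp's own definitions (assumed, not copied). */
typedef uint8_t u8;
typedef int32_t s32;
#define TRUE 1
#define FALSE 0

/* Returns FALSE for a null pointer; otherwise reports whether the
 * pointed-to value is nonzero. */
u8 ov29_022E0354(s32 *param_0)
{
    if (param_0 == NULL)
    {
        return FALSE;
    }
    return *param_0 != 0;
}
```

Compiling this with a desktop compiler only checks the logic. Whether it matches the target assembly still has to be confirmed in decomp.me with the project's compiler.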
Finally, if you already know what the function does in the context of game functionality, or if you want to research the game to learn this, you can name the function and its variables.
![Screenshot 2023-08-24 at 12 00 46 AM](https://github.com/pret/pmd-sky/assets/6516839/5d6759d4-73b3-42d4-af1a-f668c671a8f7)
The function is ready to add back to the decomp. [Here](https://decomp.me/scratch/41gTB) is the completed scratch. Before getting to that, though, let's go over the other decomp approach using an automated decompiler.
### Starting with automated decompiler output
If you haven't already set up Ghidra, follow [this guide](https://www.starcubelabs.com/reverse-engineering-ds/#setting-up-a-reverse-engineering-environment) to do so. Once Ghidra is set up, choose the overlay of the function you're decompiling and find the function within the overlay. Copy the decompiler output into decomp.me as a starting point.
> ![Screenshot 2023-08-22 at 11 12 53 PM](https://github.com/pret/pmd-sky/assets/6516839/c40587da-45df-4c29-9912-6a66b9304608)
> ![Screenshot 2023-08-24 at 10 14 27 PM](https://github.com/pret/pmd-sky/assets/6516839/35aaf71c-ca97-476c-8dcc-1585ba885417)
>
> Decompiled function in Ghidra
> ![Screenshot 2023-08-23 at 10 46 59 PM](https://github.com/pret/pmd-sky/assets/6516839/92edf8db-ee64-4d21-a110-c3b2c1d970b0)
> ![Screenshot 2023-08-24 at 10 16 36 PM](https://github.com/pret/pmd-sky/assets/6516839/7d6885c2-65c4-4a14-baec-c74813246ece)
>
> decomp.me with the Ghidra decompiler's output
This function is already labeled in Ghidra because of previous reverse engineering work done with [pmdsky-debug](https://github.com/UsernameFodder/pmdsky-debug). This knowledge can be helpful if present, but it won't always be there.
Ghidra uses primitive C types, but the decomp uses custom typedefs for its types, so the primitive types should be converted to the custom types. For example, `int` becomes `s32` and `bool` becomes `u8`. Also, use the macros `FALSE` and `TRUE` for booleans instead of `false` and `true`. Here's what the function looks like after cleaning up these types and macros, along with indentation and newlines.
![Screenshot 2023-08-24 at 12 02 58 AM](https://github.com/pret/pmd-sky/assets/6516839/d6bbd5ca-0d7f-4dcf-9a06-e3e0013495a4)
![Screenshot 2023-08-24 at 10 18 46 PM](https://github.com/pret/pmd-sky/assets/6516839/33e6c3f7-6a50-497e-8129-a5bda2ee5fad)
The function now compiles successfully, but the compiled assembly does not match the target assembly. In the vast majority of cases, the automated decompiler will not produce matching output. You'll have to read the target assembly and the mismatches and see what changes can be made to the C code to possibly produce a match.
@ -171,7 +233,7 @@ if (param_1 == (s32 *)0x0)
}
return *param_1 != 0;
```
![Screenshot 2023-08-24 at 12 04 19 AM](https://github.com/pret/pmd-sky/assets/6516839/36462445-f0d6-4334-adbe-6c66a81c2ef9)
![Screenshot 2023-08-24 at 10 19 18 PM](https://github.com/pret/pmd-sky/assets/6516839/e9da7e71-e746-469f-9d5e-20d7ffc84273)
That did the trick! The compiled and target assembly are now matching.