When splat marks a fragment-internal symbol as undefined (e.g.
'D_8140DD78 = 0x8140DD78;' in undefined_syms_auto.ld), the elf parser
records target_section = SectionAbsolute and target_section_offset =
the symbol's literal vram. Previously the recompilation walker treated
this as a non-relocatable reference and emitted the link-time literal
as a lui+addiu pair. At runtime the containing fragment loads at a
non-canonical address, so the access misses by the relocation delta
and lands in unwritten memory.
Same pattern as the prior bss-remap (28de57f) and unsorted-relocs
(506b9fc) fixes: producer/consumer asymmetry where some references
get RELOC and others bake in the link-time literal.
Fix: after the bss → parent remap, walk the registered relocatable
sections; if the absolute value falls inside one, redirect
reloc_section to that section index and use bss_remap_offset_adjustment
to subtract the new section's vram base. The downstream target_relocatable
check then treats the reference as relocatable and emits RELOC_HI16/LO16.
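A minimal sketch of the redirect described above, not the actual patch. The `Section` and `RelocTarget` structs and the `redirect_absolute` helper are hypothetical stand-ins for the recompiler's own records, and the sketch folds the vram-base subtraction into the returned offset rather than going through bss_remap_offset_adjustment:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified section record (not the real Context layout).
struct Section {
    uint32_t vram;     // link-time base address of the section
    uint32_t size;     // section size in bytes
    bool relocatable;  // whether the section may load at a non-canonical address
};

// Hypothetical result: which section the symbol belongs to and its offset there.
struct RelocTarget {
    uint16_t section_index;
    uint32_t section_offset;
};

// If an absolute symbol value falls inside a relocatable section, return that
// section's index and the value rebased against the section's vram.
std::optional<RelocTarget> redirect_absolute(const std::vector<Section>& sections,
                                             uint32_t abs_value) {
    for (std::size_t i = 0; i < sections.size(); i++) {
        const Section& s = sections[i];
        if (s.relocatable && abs_value >= s.vram && abs_value - s.vram < s.size) {
            return RelocTarget{ static_cast<uint16_t>(i), abs_value - s.vram };
        }
    }
    // Not inside any relocatable section: keep treating it as a true absolute.
    return std::nullopt;
}
```

With a relocatable section based at 0x81400000, the value 0x8140DD78 maps to offset 0xDD78 in that section, which is the shape of the RELOC_HI16 operand reported below.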
Verified on PokemonStadiumRecomp: the attract-path G_DL target
0x8140DD78 (a Gfx array in fragment34, used by fragment62) now emits
as RELOC_HI16(147, 0xDD78) instead of literal 0x8141<<16. Stadium
attract advances past the prior send_dl=1157 freeze; environment
renders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bss → parent text section remap was happening AFTER the
target_relocatable check. Bss sections are not themselves marked
relocatable; their parent text sections are. So a HI16/LO16 pair
targeting a bss section (e.g. .fragment34_bss) hit
target_relocatable=false and the reloc was silently dropped — the
lui/addiu emitted as link-time literals.
Symptom: producer/consumer asymmetry across fragments. fragment62's
func_8432D5B0 writes to D_8140E720 (in fragment34's bss) using
emitted literal `S32(0x8141 << 16)` — i.e., the canonical link
addr 0x8140E720. fragment34's func_8140C204 reads D_8140E720 via
RELOC_HI16(147, 0xE720) against the RUNTIME base. When fragment34
is loaded at a non-canonical runtime address (e.g. 0x80114C10),
the writer hits canonical RDRAM[0x40E720] while the reader hits
runtime+0xE720 = RDRAM[0x123330] — different locations. Reader
sees uninitialized memory (observed value 3, near-NULL deref at
0xD3 in func_8140C204).
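The mismatch can be checked with a small sketch of the lui+addiu arithmetic. The helper names are hypothetical, and the canonical base 0x81400000 is inferred from the symbol address and the 0xE720 offset quoted above rather than stated in the source:

```cpp
#include <cstdint>

// %hi: upper half, rounded up when the low half is negative as a signed 16-bit value.
constexpr uint16_t hi16(uint32_t addr) {
    return static_cast<uint16_t>((addr + 0x8000u) >> 16);
}

// %lo: lower half, later sign-extended by addiu.
constexpr uint16_t lo16(uint32_t addr) {
    return static_cast<uint16_t>(addr & 0xFFFFu);
}

// Address produced by `lui reg, hi` followed by `addiu reg, reg, lo`.
constexpr uint32_t lui_addiu(uint16_t hi, uint16_t lo) {
    const int32_t signed_lo = static_cast<int32_t>(lo ^ 0x8000u) - 0x8000;
    return (static_cast<uint32_t>(hi) << 16) + static_cast<uint32_t>(signed_lo);
}

constexpr uint32_t canonical_base = 0x81400000u; // inferred link-time base (assumption)
constexpr uint32_t runtime_base   = 0x80114C10u; // example runtime load address from above
constexpr uint32_t symbol_offset  = 0xE720u;     // D_8140E720 relative to the section base

// Writer: link-time literal baked into the instruction pair.
constexpr uint32_t writer_addr =
    lui_addiu(hi16(canonical_base + symbol_offset), lo16(canonical_base + symbol_offset));
// Reader: RELOC_HI16/LO16 resolved against the runtime base.
constexpr uint32_t reader_addr =
    lui_addiu(hi16(runtime_base + symbol_offset), lo16(runtime_base + symbol_offset));

static_assert(hi16(0x8140E720u) == 0x8141u, "low half is negative, so hi rounds up");
static_assert(writer_addr == 0x8140E720u, "writer hits the canonical address");
static_assert(reader_addr == 0x80123330u, "reader hits runtime base + 0xE720");
static_assert(writer_addr != reader_addr, "the two accesses disagree");
```

The upper half is 0x8141 rather than 0x8140 because 0xE720 is negative as a signed 16-bit immediate, which is why the emitted literal reads `0x8141 << 16`.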
Two changes:
1. Hoist the bss-remap above the target_relocatable check so the
   parent's relocatable flag is what gates emit, not the bss
   section's.
2. When remapping, add (bss_vram - parent_vram) to
   target_section_offset so it stays relative to the new (parent)
   base. The reloc's stored target_section_offset is computed
   relative to the bss section's vram in elf.cpp; the parent text
   section starts before bss in the link layout, so the offset
   needs the bss-vs-parent vram delta added (typically equal to
   the parent text size).
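The two changes can be sketched together as follows. This is an illustration under stated assumptions, not the code from the patch: `SectionInfo`, `Reloc`, and `remap_and_check` are hypothetical names, and the bss start address used in the usage note is made up for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified section record.
struct SectionInfo {
    uint32_t vram;     // link-time base address
    bool relocatable;  // bss sections are not marked relocatable themselves
    int bss_parent;    // index of the parent text section, or -1 if not a bss section
};

// Hypothetical, simplified reloc record.
struct Reloc {
    int target_section;
    uint32_t target_section_offset; // initially relative to the target section's vram
};

// Remap a bss target to its parent first, then apply the relocatable check.
// Returns true when the reference should be emitted as RELOC_HI16/LO16.
bool remap_and_check(const std::vector<SectionInfo>& sections, Reloc& reloc) {
    const SectionInfo& target = sections[reloc.target_section];
    if (target.bss_parent >= 0) {
        const SectionInfo& parent = sections[target.bss_parent];
        // Change 2: rebase the offset from the bss vram to the parent vram.
        reloc.target_section_offset += target.vram - parent.vram;
        reloc.target_section = target.bss_parent;
    }
    // Change 1: the check now sees the parent's flag, not the bss section's.
    return sections[reloc.target_section].relocatable;
}
```

For example, with a parent text section at 0x81400000 and an assumed bss start of 0x8140E000, a reloc at bss offset 0x720 becomes parent offset 0xE720 and passes the relocatable check.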
Verified: Stadium attract demo now runs without the
func_8140C204:0xD3 crash. Reaches frame 2138 cleanly through
fragment62 + fragment34 dispatch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add function hooks to mod symbol format
* Add function sizes to section function tables
* Add support for function hooks in live generator
* Add an option to the context to force function lookup for all non-relocated function calls
* Include relocs in overlay data
* Include R_MIPS_26 relocs in symbol file dumping/parsing
* Add manual patch symbols (syms.ld) to the output overlay file and relocs
* Fix which relocs were being emitted for patch sections
* Fix sign extension issue with mfc1, add TODO for banker's rounding
This commit implements the "live recompiler", which is another backend for the recompiler that generates platform-specific assembly at runtime. This is still static recompilation as opposed to dynamic recompilation, as it still requires information about the binary to recompile and leverages the same static analysis that the C recompiler uses. However, similarly to dynamic recompilation, it's aimed at recompiling binaries at runtime, mainly for modding purposes.
The live recompiler leverages a library called sljit to generate platform-specific code. This library provides an API that's implemented on several platforms, including the main targets of this component: x86_64 and ARM64.
Performance is expected to be slower than the C recompiler, but should still be fast enough to run large amounts of recompiled code without issue. Considering these ROMs can often be run through an interpreter and still hit their full speed, performance should not be a concern for running native code even if it's less optimal than the C recompiler's codegen.
As mentioned earlier, the main use of the live recompiler will be for loading mods in the N64Recomp runtime. This makes it so that modders don't need to ship platform-specific binaries for their mods, and allows fixing bugs with recompilation down the line without requiring modders to update their binaries.
This PR also includes a utility for testing the live recompiler. It accepts binaries in a custom format that contain the instructions, input data, and target data. Documentation for the test format as well as most of the tests that were used to validate the live recompiler can be found here. The few remaining tests are hastily hacked-together binaries, so they need to be cleaned up and will probably be uploaded at a later date. The only test in that suite that doesn't currently succeed is the div test, due to unknown behavior when the two operands aren't properly sign extended to 64 bits. This has no bearing on practical usage, since the inputs will always be sign extended as expected.
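For reference, "properly sign extended to 64 bits" means the upper 32 bits of the register replicate bit 31 of the lower half. A small sketch of that check follows; the helper names are hypothetical and this says nothing about what div itself does with other inputs:

```cpp
#include <cstdint>

// Sign-extend a 32-bit value into a 64-bit register image.
constexpr uint64_t sign_extend_32(uint32_t v) {
    return static_cast<uint64_t>(static_cast<int64_t>(v ^ 0x80000000u) - 0x80000000LL);
}

// True when the upper 32 bits of `reg` are copies of bit 31.
constexpr bool is_sign_extended_32(uint64_t reg) {
    return sign_extend_32(static_cast<uint32_t>(reg)) == reg;
}

static_assert(is_sign_extended_32(0xFFFFFFFF80000000ull), "negative 32-bit value, extended");
static_assert(!is_sign_extended_32(0x0000000080000000ull), "bit 31 set but upper half is zero");
```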
* implement nrm filename toml input
* change name of mod toml setting to 'mod_filename'
* add renaming and re mode
* fix --dump-context arg, fix entrypoint detection
* refactor re_mode to function_trace_mode
* adjust trace mode to use a general TRACE_ENTRY() macro
* fix some renaming and trace mode comments, revert no toml entrypoint code, add TODO to broken block
* fix arg2 check and usage string
* Terminate offline mod recompilation if any functions fail to recompile
* Fixed edge case with switch case jump table detection when lo16 immediate is exactly 0
* Prevent emitting duplicate reference symbol defines in offline mod recompilation
* Fix function calls and add missing runtime function pointers in offline mod recompiler
* Remove reference context from parse_mod_symbols argument
* Add support for special dependency names (self and base recomp), fix non-compliant offline mod recompiler output
* Fix export names not being set on functions when parsing mod syms, add missing returns to mod parsing
* Switch offline mod recompilation to use a base global event index instead of per-event global indices
* Add support for creating events in normal recompilation
* Output recomp API version in offline mod recompiler
* Removed dependency version from mod symbols (moved to manifest)
* Added mod manifest generation to mod tool
* Implement mod file creation in Windows
* Fixed some error prints not using stderr
* Implement mod file creation on posix systems
* De-hardcode symbol file path for offline mod recompiler
* Fix duplicate import symbols issue and prevent emitting unused imports
* Initial implementation of binary operation table
* Initial implementation of unary operation table
* More binary op types, moved binary expression string generation into separate function
* Added and implemented conditional branch instruction table
* Fixed likely swap on bgezal, fixed extra indent branch close and missing indent on branch statement
* Add operands for other uses of float registers
* Added CHECK_FR generation to binary operation processing, moved float comparison instructions to binary op table
* Finished moving float arithmetic instructions to operation tables
* Added store instruction operation table
* Created Generator interface, separated operation types and tables and C generation code into new files
* Fix mov.d using the wrong input operand
* Move recompiler core logic into a core library and make the existing CLI consume the core library
* Removed unnecessary config input to recompilation functions
* Moved parts of recomp_port.h into new internal headers in src folder
* Changed recomp port naming to N64Recomp
* Remove some unused code and document which Context fields are actually required for recompilation
* Implement mod symbol parsing
* Restructure mod symbols to make replacements global instead of per-section
* Refactor elf parsing into static Context method for reusability
* Move elf parsing into a separate library
* WIP elf to mod tool, currently working without relocations or API exports/imports
* Make mod tool emit relocs and patch binary for non-relocatable symbol references as needed
* Implemented writing import and exports in the mod tool
* Add dependencies to the mod symbol format, finish exporting and importing of mod symbols
* Add first pass offline mod recompiler (generates C from mods that can be compiled and linked into a dynamic library)
* Add strict mode and ability to generate exports for normal recompilation (for patches)
* Move mod context fields into base context, move import symbols into separate vector, misc cleanup
* Some cleanup by making some Context members private
* Add events (from dependencies and exported) and callbacks to the mod symbol format and add support to them in elf parsing
* Add runtime-driven fields to offline mod recompiler, fix event symbol relocs using the wrong section in the mod tool
* Move file header writing outside of function recompilation
* Allow cross-section relocations, encode exact target section in mod relocations, add way to tag reference symbol relocations
* Add local symbol addresses array to offline mod recompiler output and rename original one to reference section addresses
* Add more comments to the offline mod recompiler's output
* Fix handling of section load addresses to match objcopy behavior, added event parsing to dependency tomls, minor cleanup
* Fixed incorrect size used for finding section segments
* Add missing includes for libstdc++
* Rework callbacks and imports to use the section name for identifying the dependency instead of relying on per-dependency tomls
* Consolidate context dumping toggle into a single bool, begin work on data symbol context dumping
* Added data symbol context dumping
* Fix mthi/mtlo implementation
* Add option to control unpaired LO16 warnings