Commit Graph

21 Commits

Author SHA1 Message Date
Matthew Stanley
5b42a76748 recomp: pattern auto-discovery for dynamic-asset slot fragments (Shape A)
Adds [[input.decompressed_section_pattern]] for slots where many
fragments share a link vram (e.g. Stadium streams 279+ different
fragments through vram 0x8FF00000 across the game). Per-fragment
[[input.decompressed_section]] entries don't scale to that cardinality
and miss the runtime-swap dispatch problem entirely.

Engine pipeline:
  1. Scan baserom.z64 for every Yay0 wrapper.
  2. For each, decompress 0x40 bytes and check whether the prefix
     matches the expected J <vram + 0x20> trampoline + FRAGMENT magic.
     Wrappers in PERS-SZP form are detected by the -0x18 prefix.
  3. For matches, fully decompress and FNV-1a-64 hash the body.
  4. Deduplicate by content hash (Stadium has ~11 byte-identical
     duplicates across its 293 wrappers).
  5. Synthesize one Section per unique content. Section names
     <base_name>__rom_<wrapper_offset>; functions become
     func_<vram>__rom_<offset> via the existing collision-suffix
     machinery (default for pattern-discovered sections, since
     collisions are the EXPECTED case here).
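
Steps 3 and 4 above can be sketched roughly as follows. This is a minimal
illustration, not the engine's actual code: the Fragment struct and the
dedupe_by_content name are hypothetical, and only the FNV-1a-64 constants
are standard. It trusts the 64-bit hash and does not byte-compare bodies
on a hash match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical record for one fully decompressed wrapper body.
struct Fragment {
    uint32_t wrapper_offset;    // ROM offset of the Yay0 wrapper
    std::vector<uint8_t> body;  // decompressed bytes
};

// Standard 64-bit FNV-1a: XOR each byte in, then multiply by the FNV prime.
inline uint64_t fnv1a64(const uint8_t* data, size_t len) {
    assert(data != nullptr || len == 0);
    uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 0x100000001b3ULL;           // FNV prime
    }
    return hash;
}

// Keeps the first fragment seen for each content hash, preserving scan order.
inline std::vector<Fragment> dedupe_by_content(const std::vector<Fragment>& found) {
    std::unordered_map<uint64_t, size_t> seen;  // hash -> index into `unique`
    std::vector<Fragment> unique;
    for (const Fragment& frag : found) {
        const uint64_t h = fnv1a64(frag.body.data(), frag.body.size());
        if (seen.emplace(h, unique.size()).second) {
            unique.push_back(frag);
        }
    }
    return unique;
}
```

Keeping the first occurrence means the surviving section name carries the
lowest wrapper offset when wrappers are scanned in ROM order.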

Implementation function (the +0x20 entry) gets a basic forward CFG
walk to determine its size:
  - Walk instructions tracking forward branch targets within the func.
  - Stop at jr $ra IF no tracked forward branches still need to be
    reached.
  - Falls back to first-jr-ra heuristic if walk is inconclusive.
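
A rough sketch of such a walk, assuming the function body is already
available as host-order 32-bit instruction words. The names
is_cond_branch and walk_function_size are hypothetical, and a return
value of 0 stands in for "inconclusive" so the caller can fall back to
the first-jr-ra heuristic. It ignores jump tables and J/JAL targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True for MIPS conditional branches that carry a 16-bit PC-relative offset.
inline bool is_cond_branch(uint32_t instr) {
    const uint32_t op = instr >> 26;
    if (op >= 0x04 && op <= 0x07) return true;             // beq, bne, blez, bgtz
    if (op >= 0x14 && op <= 0x17) return true;             // beql, bnel, blezl, bgtzl
    if (op == 0x01) {                                      // REGIMM group
        const uint32_t rt = (instr >> 16) & 0x1F;
        // bltz, bgez, bltzl, bgezl and their and-link forms
        return rt <= 0x03 || (rt >= 0x10 && rt <= 0x13);
    }
    if (op == 0x11) return ((instr >> 21) & 0x1F) == 0x08; // bc1f / bc1t
    return false;
}

// Returns the function size in bytes (including the jr $ra delay slot),
// or 0 if no terminating jr $ra is found.
inline size_t walk_function_size(const std::vector<uint32_t>& words) {
    constexpr uint32_t kJrRa = 0x03E00008;
    size_t furthest = 0;  // highest word index targeted by a forward branch so far
    for (size_t i = 0; i + 1 < words.size(); i++) {
        const uint32_t instr = words[i];
        if (is_cond_branch(instr)) {
            // Branch targets are relative to the delay slot (index i + 1).
            const int64_t offset = static_cast<int16_t>(instr & 0xFFFF);
            const int64_t target = static_cast<int64_t>(i) + 1 + offset;
            if (target > static_cast<int64_t>(i) &&
                static_cast<size_t>(target) > furthest) {
                furthest = static_cast<size_t>(target);
            }
        } else if (instr == kJrRa && furthest <= i + 1) {
            // No pending forward target lies beyond this return's delay slot.
            return (i + 2) * 4;
        }
    }
    return 0;
}
```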

Pattern-synthesized recompile failures are non-fatal: pattern sections
have rom_addr in the synthetic 0xFE000000 range, and main.cpp's recompile
loop logs and skips them instead of calling std::exit. This lets the build
proceed even when our basic CFG walk misjudges a function with an unusual
shape (e.g. computed jumps through jump tables we don't analyze). Stadium's
Path-3 single-fragment case (fragment78 wrapper at ROM 0x9E93F0)
still recompiles cleanly; ~225 of 282 dynamic-slot fragments
recompile, ~57 fail and skip.
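
The skip decision could look something like the sketch below. All names
here are hypothetical, and it assumes "0xFE000000 range" means any
rom_addr whose top byte is 0xFE; the real check in main.cpp may differ.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kSyntheticRomBase = 0xFE000000u;  // value from the commit message

// Assumption: a synthetic address is one whose top byte is 0xFE.
constexpr bool is_synthetic_rom(uint32_t rom_addr) {
    return (rom_addr & 0xFF000000u) == kSyntheticRomBase;
}

enum class FailureAction { Abort, LogAndSkip };

// Real-ROM sections keep the fatal behavior; synthetic ones are skipped.
constexpr FailureAction on_recompile_failure(uint32_t section_rom_addr) {
    return is_synthetic_rom(section_rom_addr) ? FailureAction::LogAndSkip
                                              : FailureAction::Abort;
}

static_assert(on_recompile_failure(0xFE9E93F0u) == FailureAction::LogAndSkip,
              "synthetic sections are skipped");
```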

Validation on Stadium's 0x8FF00000 slot:
  - 293 Yay0 wrappers found (vs. 279 from the prior validate script;
    the earlier scan undercounted due to a tight 1KB decode window).
  - 282 sections after dedupe (11 collapsed as content-identical).
  - Build proceeds to completion; no Stadium boot regression
    (logo + PIKA jingle still render).

Outstanding for next session — runtime side:
  - Modify register_runtime_fragment in librecomp/src/overlays.cpp
    to read bytes at fragment_ptr (first 0x40 → fall back to full
    body for the residual ~5%), hash, and look up the matching
    section. Currently it picks by id alone, so for slot 0x8FF00000
    only ONE of the 282 sections gets bound to func_map at any time
    (the most-recently registered).
  - Refactor cross-section R_MIPS_32 retargeting to use a vram
    hashmap (currently O(N²) which gets expensive at 282 sections).
  - Relink fragment78's prior single-fragment block can stay; it
    works alongside patterns and serves as the "I know exactly which
    one I want" form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:47:44 -07:00
Matthew Stanley
b517a7195a recomp: build-time decompression of CPU-decompressed-at-runtime fragments
Adds [[input.decompressed_section]] toml block + Yay0/PERS-SZP wrapper
decoders + an in-memory section synthesis pass. Required for games
like Pokemon Stadium, whose CPU-side decompressor materializes fragment
bytes at runtime, so the static recompiler can't see them in
the ELF/ROM-direct path.
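
For orientation, a minimal Yay0 decoder following the commonly documented
layout (magic, decompressed size, link-table offset, chunk offset, then
mask words starting at 0x10). This is a sketch and not the code in
compression/yay0.cpp: the yay0_decode name and the optional `limit`
parameter for prefix-only decoding are assumptions, and the PERS-SZP
outer wrapper is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

static uint32_t read_be32(const std::vector<uint8_t>& d, size_t off) {
    if (off + 4 > d.size()) throw std::runtime_error("truncated Yay0 stream");
    return (uint32_t(d[off]) << 24) | (uint32_t(d[off + 1]) << 16) |
           (uint32_t(d[off + 2]) << 8) | uint32_t(d[off + 3]);
}

// Decodes up to `limit` bytes of a Yay0 stream (all of it by default).
inline std::vector<uint8_t> yay0_decode(const std::vector<uint8_t>& src,
                                        size_t limit = SIZE_MAX) {
    if (src.size() < 16 || src[0] != 'Y' || src[1] != 'a' ||
        src[2] != 'y' || src[3] != '0') {
        throw std::runtime_error("not a Yay0 stream");
    }
    size_t out_size = read_be32(src, 4);
    if (out_size > limit) out_size = limit;
    size_t link_pos = read_be32(src, 8);    // 16-bit back-reference entries
    size_t chunk_pos = read_be32(src, 12);  // literal bytes and long counts
    size_t mask_pos = 16;                   // mask words follow the header
    uint32_t mask = 0;
    int bits_left = 0;
    std::vector<uint8_t> out;
    out.reserve(out_size);
    while (out.size() < out_size) {
        if (bits_left == 0) {
            mask = read_be32(src, mask_pos);
            mask_pos += 4;
            bits_left = 32;
        }
        if (mask & 0x80000000u) {
            out.push_back(src.at(chunk_pos++));  // set bit: literal byte
        } else {
            // Clear bit: copy `count` bytes from `dist` bytes back in the output.
            const uint16_t link = static_cast<uint16_t>(
                (src.at(link_pos) << 8) | src.at(link_pos + 1));
            link_pos += 2;
            const size_t dist = (link & 0x0FFFu) + 1;
            size_t count = link >> 12;
            count = (count == 0) ? size_t(src.at(chunk_pos++)) + 18 : count + 2;
            if (dist > out.size()) throw std::runtime_error("bad back-reference");
            for (size_t i = 0; i < count && out.size() < out_size; i++) {
                const uint8_t b = out[out.size() - dist];
                out.push_back(b);
            }
        }
        mask <<= 1;
        bits_left--;
    }
    return out;
}
```

Calling yay0_decode(bytes, 0x40) gives a cheap prefix decode for header
checks without paying for the full body.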

User-facing config:
    [[input.decompressed_section]]
    name = "fragment78"
    vram = 0x8FF00000
    rom_wrapper = 0x9E93F0
    wrapper_format = "pers_szp_yay0"

Pipeline:
  1. compression/{yay0,pers_szp}.{h,cpp} decode the wrapper.
  2. decompressed.cpp parses the FRAGMENT-format header (relocOffset,
     sizeInRam) + Stadium-format reloc table, translates it to
     N64Recomp::Reloc entries (R_MIPS_32/26/HI16/LO16) with paired
     HI16/LO16 immediate computation, and synthesizes a Section
     handed to the existing recompilation pipeline. Stores
     decompressed bytes into context.rom at synthetic_rom =
     0xFE000000 | rom_wrapper to keep them out of real-ROM addr space.
  3. Two functions per fragment: the +0x00 entry trampoline (J + nop)
     and the +0x20 implementation (runs to first jr ra in body).
  4. After all decompressed sections are added, retargets each
     R_MIPS_32 reloc to whichever existing section's vram range
     contains its target address (cross-section pointer support).
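
The paired HI16/LO16 immediate computation in step 2 comes down to the
usual MIPS rule: the low half is sign-extended when added, so the high
half must round up whenever bit 15 of the address is set. A small
sketch with hypothetical helper names (not the N64Recomp API):

```cpp
#include <cassert>
#include <cstdint>

struct HiLo {
    uint16_t hi;  // immediate for the lui
    uint16_t lo;  // immediate for the paired addiu / load / store
};

// Splits an absolute address into a HI16/LO16 immediate pair.
constexpr HiLo split_hi_lo(uint32_t addr) {
    return HiLo{static_cast<uint16_t>((addr + 0x8000u) >> 16),
                static_cast<uint16_t>(addr & 0xFFFFu)};
}

// Recombines a pair the way the CPU does: (hi << 16) + sign_extend(lo).
constexpr uint32_t join_hi_lo(uint16_t hi, uint16_t lo) {
    return (static_cast<uint32_t>(hi) << 16) +
           static_cast<uint32_t>(static_cast<int32_t>(static_cast<int16_t>(lo)));
}
```

For example, 0x8FF0C000 splits into hi = 0x8FF1 and lo = 0xC000, because
the low half is read as -0x4000.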

Adds [output] collision_policy:
  "error"  (default) — abort the build if two emitted symbols collide
                       on name; print both colliders + how to opt in.
  "suffix"           — auto-disambiguate by appending __rom_<rom_addr>
                       to colliding symbols. Suffix only appears where
                       collisions exist.
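
A sketch of how the two policies might behave. The Symbol struct and
resolve_collisions are hypothetical, and formatting rom_addr as 8
uppercase hex digits is an assumption; the commit only specifies the
__rom_<rom_addr> shape. The real "error" path also prints both
colliders and the opt-in hint, which this sketch reduces to an exception.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

enum class CollisionPolicy { Error, Suffix };

struct Symbol {  // hypothetical stand-in for an emitted symbol
    std::string name;
    uint32_t rom_addr;
};

inline void resolve_collisions(std::vector<Symbol>& syms, CollisionPolicy policy) {
    std::unordered_map<std::string, int> uses;  // how many symbols share each name
    for (const Symbol& s : syms) uses[s.name]++;
    for (Symbol& s : syms) {
        if (uses.at(s.name) < 2) continue;  // unique names are left untouched
        if (policy == CollisionPolicy::Error) {
            throw std::runtime_error("symbol name collision: " + s.name);
        }
        char suffix[32];
        std::snprintf(suffix, sizeof(suffix), "__rom_%08X",
                      static_cast<unsigned>(s.rom_addr));
        s.name += suffix;
    }
}
```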

Validated end-to-end on Stadium's fragment78 (wrapper at ROM 0x9E93F0,
decomp_size=0x25340, 319 relocs). Recompiled func_8FF00020 dispatches
to runtime_addr+0x24DC0 correctly; Stadium boots past the prior
crash point, no regression on the N64 logo + PIKA jingle.

Future work: pattern form ([[input.decompressed_section_pattern]]) for
slots like vram 0x8FF00000 where Stadium streams 279 different
fragments at the same link addr. Validation script
(tools/_validate_dynfrag.py in the consumer repo) confirms 268 distinct
content-hashes, 23MB total payload — feasible as engine work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:47:39 -07:00
Wiseguy
afc2ff93a5
Implement mdebug parsing for static symbols in IDO elfs (#155)
Co-authored-by: Tharo <17233964+Thar0@users.noreply.github.com>
2025-09-06 18:44:18 -04:00
Wiseguy
38df8e3ddc
Mod function hooking (#124)
* Add function hooks to mod symbol format

* Add function sizes to section function tables

* Add support for function hooks in live generator

* Add an option to the context to force function lookup for all non-relocated function calls

* Include relocs in overlay data

* Include R_MIPS_26 relocs in symbol file dumping/parsing

* Add manual patch symbols (syms.ld) to the output overlay file and relocs

* Fix which relocs were being emitted for patch sections

* Fix sign extension issue with mfc1, add TODO for banker's rounding
2025-01-26 21:52:46 -05:00
Ethan Lafrenais
36b5d9ae33
PIC Jump Table Support (#120)
* Support for $gp relative jump table calls
2025-01-16 00:40:50 -05:00
LittleCube
351482e9c6
Fix TRACE_ENTRY and move function_sizes (#112) 2025-01-04 21:49:31 -05:00
Wiseguy
66062a06e9
Implement live recompiler (#114)
This commit implements the "live recompiler", which is another backend for the recompiler that generates platform-specific assembly at runtime. This is still static recompilation as opposed to dynamic recompilation, as it still requires information about the binary to recompile and leverages the same static analysis that the C recompiler uses. However, similarly to dynamic recompilation it's aimed at recompiling binaries at runtime, mainly for modding purposes.

The live recompiler leverages a library called sljit to generate platform-specific code. This library provides an API that's implemented on several platforms, including the main targets of this component: x86_64 and ARM64.

Performance is expected to be slower than the C recompiler, but should still be plenty fast enough for running large amounts of recompiled code without an issue. Considering these ROMs can often be run through an interpreter and still hit their full speed, performance should not be a concern for running native code even if it's less optimal than the C recompiler's codegen.

As mentioned earlier, the main use of the live recompiler will be for loading mods in the N64Recomp runtime. This makes it so that modders don't need to ship platform-specific binaries for their mods, and allows fixing bugs with recompilation down the line without requiring modders to update their binaries.

This PR also includes a utility for testing the live recompiler. It accepts binaries in a custom format which contain the instructions, input data, and target data. Documentation for the test format as well as most of the tests that were used to validate the live recompiler can be found here. The few remaining tests were hacked together binaries that I put together very hastily, so they need to be cleaned up and will probably be uploaded at a later date. The only test in that suite that doesn't currently succeed is the div test, due to unknown behavior when the two operands aren't properly sign extended to 64 bits. This has no bearing on practical usage, since the inputs will always be sign extended as expected.
2024-12-31 16:11:40 -05:00
LittleCube
17438755a1
Implement nrm filename toml input, renaming list, trace mode, and context dumping flag (#111)
* implement nrm filename toml input

* change name of mod toml setting to 'mod_filename'

* add renaming and re mode

* fix --dump-context arg, fix entrypoint detection

* refactor re_mode to function_trace_mode

* adjust trace mode to use a general TRACE_ENTRY() macro

* fix some renaming and trace mode comments, revert no toml entrypoint code, add TODO to broken block

* fix arg2 check and usage string
2024-12-24 02:10:26 -05:00
Wiseguy
5b17bf8bb5
Modding Support PR 1 (Instruction tables, modding support, mod symbol format, library conversion) (#89)
* Initial implementation of binary operation table

* Initial implementation of unary operation table

* More binary op types, moved binary expression string generation into separate function

* Added and implemented conditional branch instruction table

* Fixed likely swap on bgezal, fixed extra indent branch close and missing indent on branch statement

* Add operands for other uses of float registers

* Added CHECK_FR generation to binary operation processing, moved float comparison instructions to binary op table

* Finished moving float arithmetic instructions to operation tables

* Added store instruction operation table

* Created Generator interface, separated operation types and tables and C generation code into new files

* Fix mov.d using the wrong input operand

* Move recompiler core logic into a core library and make the existing CLI consume the core library

* Removed unnecessary config input to recompilation functions

* Moved parts of recomp_port.h into new internal headers in src folder

* Changed recomp port naming to N64Recomp

* Remove some unused code and document which Context fields are actually required for recompilation

* Implement mod symbol parsing

* Restructure mod symbols to make replacements global instead of per-section

* Refactor elf parsing into static Context method for reusability

* Move elf parsing into a separate library

* WIP elf to mod tool, currently working without relocations or API exports/imports

* Make mod tool emit relocs and patch binary for non-relocatable symbol references as needed

* Implemented writing import and exports in the mod tool

* Add dependencies to the mod symbol format, finish exporting and importing of mod symbols

* Add first pass offline mod recompiler (generates C from mods that can be compiled and linked into a dynamic library)

* Add strict mode and ability to generate exports for normal recompilation (for patches)

* Move mod context fields into base context, move import symbols into separate vector, misc cleanup

* Some cleanup by making some Context members private

* Add events (from dependencies and exported) and callbacks to the mod symbol format and add support to them in elf parsing

* Add runtime-driven fields to offline mod recompiler, fix event symbol relocs using the wrong section in the mod tool

* Move file header writing outside of function recompilation

* Allow cross-section relocations, encode exact target section in mod relocations, add way to tag reference symbol relocations

* Add local symbol addresses array to offline mod recompiler output and rename original one to reference section addresses

* Add more comments to the offline mod recompiler's output

* Fix handling of section load addresses to match objcopy behavior, added event parsing to dependency tomls, minor cleanup

* Fixed incorrect size used for finding section segments

* Add missing includes for libstdc++

* Rework callbacks and imports to use the section name for identifying the dependency instead of relying on per-dependency tomls
2024-08-26 23:06:34 -04:00
Wiseguy
f8d439aeee
Add option to output multiple functions per file, defaults to 50 (#88) 2024-08-15 00:17:09 -04:00
Mr-Wiseguy
4161ef68cc Made recompilation header include configurable 2024-08-15 00:00:25 -04:00
Wiseguy
ba4aede49c
Add symbol reference file mechanism for elf recompilation (#82)
* Consolidate context dumping toggle into a single bool, begin work on data symbol context dumping
* Added data symbol context dumping
* Fix mthi/mtlo implementation
* Add option to control unpaired LO16 warnings
2024-07-02 21:42:22 -04:00
Gilles Siberlin
6eb7d5bd3e
Implement hook insertion (#73)
* Implement function hook insertion

* Fix recompiled code indentation

* Add _matherr to renamed_funcs

* Replace after_vram by before_vram

* Emit dummy value if relocatable_sections_ordered is empty
2024-05-31 23:31:50 -04:00
Mr-Wiseguy
e0e52d1fc3
Symbol file toml update (#52)
* Symbol input file mechanism

* Migration to new toml lib

---------

Co-authored-by: dcvz <david@dcvz.io>
2024-05-16 22:33:08 -04:00
Mr-Wiseguy
50d55bd171 Added manual sections input option, fixed bug with multiplications and added mthi/lo instructions 2024-04-20 20:00:29 -04:00
Mr-Wiseguy
be275c198a Added single-file mode and absolute symbol options (for patch recompilation) 2023-11-12 14:50:50 -05:00
Mr-Wiseguy
d249363fe5 Misc upgrades including mips3 float mode support, skip overwriting existing files if they're identical to the current recompiled output 2023-10-29 20:53:17 -04:00
Mr-Wiseguy
302dd091c2 Implement application of single-instruction patches 2023-03-24 20:28:36 -04:00
Mr-Wiseguy
9949813018 Implemented parsing of instruction patches in config file 2023-03-24 19:22:30 -04:00
Mr-Wiseguy
7df3e28c76 Implemented function stubbing 2023-03-24 18:04:21 -04:00
Mr-Wiseguy
fba0085946 Added toml11 and implemented initial config file parsing, replaces command-line arg inputs 2023-03-24 17:11:17 -04:00