3603 Commits

Author SHA1 Message Date
JosJuice
a6278030c2 JitArm64: Fix twx
The conditions were in reverse order (maybe someone was reading the
PowerPC manual and forgot about IBM's bit numbering), and additionally
the two conditions for unsigned comparison were wrong.

Fixes https://bugs.dolphin-emu.org/issues/14054.
2026-05-31 11:49:34 +02:00
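
To illustrate the bit-numbering pitfall mentioned above, here is a minimal sketch of how the TO field of tw/twi selects its trap conditions. IBM numbers bits from the most significant end, so "TO bit 0" of the 5-bit field is the mask 0x10. The constant names and the TrapConditionMet helper are hypothetical and are not taken from the JitArm64 code.

```cpp
#include <cassert>
#include <cstdint>

// TO-field conditions written as masks on the 5-bit field.
// IBM bit 0 is the most significant bit, hence 0x10 rather than 0x01.
constexpr uint32_t TO_LT_SIGNED = 0x10;    // TO bit 0: a < b (signed)
constexpr uint32_t TO_GT_SIGNED = 0x08;    // TO bit 1: a > b (signed)
constexpr uint32_t TO_EQ = 0x04;           // TO bit 2: a == b
constexpr uint32_t TO_LT_UNSIGNED = 0x02;  // TO bit 3: a < b (unsigned)
constexpr uint32_t TO_GT_UNSIGNED = 0x01;  // TO bit 4: a > b (unsigned)

// Hypothetical helper: returns true if any selected condition holds,
// meaning the trap should be taken.
bool TrapConditionMet(uint32_t to, uint32_t ra, uint32_t rb)
{
  assert(to < 32);  // TO is a 5-bit field
  const int32_t a = static_cast<int32_t>(ra);
  const int32_t b = static_cast<int32_t>(rb);
  return ((to & TO_LT_SIGNED) != 0 && a < b) ||
         ((to & TO_GT_SIGNED) != 0 && a > b) ||
         ((to & TO_EQ) != 0 && ra == rb) ||
         ((to & TO_LT_UNSIGNED) != 0 && ra < rb) ||
         ((to & TO_GT_UNSIGNED) != 0 && ra > rb);
}
```

With ra = 0xFFFFFFFF and rb = 1, the signed "less than" condition holds (-1 < 1) while the unsigned one does not, which is the kind of case that exposes swapped conditions.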
JosJuice
058c7021b8 JitArm64: Fix DR check in MSRUpdated
TBZ takes the index of the bit to test, not a LogicalImm.
2026-05-01 10:23:59 +02:00
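
The difference between a bit mask and a bit index can be sketched as follows. MSR[DR] is the mask 0x10 in the 32-bit MSR, while a test-bit instruction such as TBZ wants the index of that bit, which is 4. Both helpers are hypothetical illustrations, not emitter functions from Dolphin.

```cpp
#include <cassert>
#include <cstdint>

// MSR[DR] (data address translation) expressed as a mask.
constexpr uint32_t MSR_DR_MASK = 0x10;

// Hypothetical helper: converts a single-bit mask into the bit index
// that a test-bit instruction expects as its operand.
unsigned BitIndexFromMask(uint32_t mask)
{
  assert(mask != 0 && (mask & (mask - 1)) == 0);  // exactly one bit set
  unsigned index = 0;
  while ((mask & 1) == 0)
  {
    mask >>= 1;
    ++index;
  }
  return index;
}

// Models the condition under which TBZ branches: the tested bit is zero.
bool BitIsZero(uint32_t value, unsigned bit_index)
{
  assert(bit_index < 32);
  return ((value >> bit_index) & 1) == 0;
}
```

Passing the mask 0x10 where the index 4 is expected would test bit 16 instead of bit 4.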
Dr. Dystopia
9ae9c12938 Replace find(x) != npos with contains(x) - Core 2026-04-20 09:36:08 +02:00
Martino Fontana
95dec13203 Improve usage of std::move and const reference parameters
Accomplished using `run-clang-tidy` with `performance-move-const-arg,performance-unnecessary-value-param,modernize-pass-by-value`.

Changed arguments to const references, removed them where inappropriate (e.g. sink parameters). Same with std::move.

Manually reviewed each change to make sure that it makes sense, and did something more appropriate where possible.
2026-04-17 12:39:46 +02:00
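
As a rough illustration of the two patterns these clang-tidy checks push towards, the sketch below contrasts a read-only parameter with a sink parameter. CountSpaces and Label are made-up names and do not come from the Dolphin code base.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Read-only parameter: the function only inspects the string, so a const
// reference avoids copying it.
std::size_t CountSpaces(const std::string& text)
{
  std::size_t count = 0;
  for (const char c : text)
  {
    if (c == ' ')
      ++count;
  }
  return count;
}

// Sink parameter: the constructor keeps the string, so it takes it by value
// and moves it into the member. Callers passing a temporary pay no copy.
class Label
{
public:
  explicit Label(std::string name) : m_name(std::move(name)) {}

  const std::string& GetName() const { return m_name; }

private:
  std::string m_name;
};
```

Taking a sink parameter by const reference would force a copy even when the caller hands over a temporary, which is why such parameters are left as by-value.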
Jordan Woyak
23e8a3c569 Core/PowerPC: Minor code cleanup to CheckExternalExceptions function. 2026-04-10 14:57:52 -05:00
JosJuice
f3adef75ed Core: Raise PI interrupt when accessing unmapped memory
Unmapped on the physical level, not the MMU level.

Fixes booting Game Boy Interface. Previously, Game Boy Interface thought
it was running on a Wii because accessing MEM2 didn't raise a PI
interrupt, and as a result tried to exit to the Homebrew Channel in a
way Dolphin's HLE doesn't recognize. (Dolphin's HLE catches jumps to
0x80001800, but GBI is running without address translation at this point
and therefore jumps to 0x00001800 instead.)
2026-04-04 17:45:17 +02:00
JosJuice
f7b9c1f034 Jit: Move dcbx ENABLE_IF
INSTRUCTION_START is supposed to be before anything else in the
function. The difference only matters if INSTRUCTION_START gets
redefined, though.
2026-04-02 11:14:39 +02:00
JosJuice
904ed4b785 Jit64: Use dcbz slow path with accurate dcache
Jit64::dcbz's fast path bypasses the dcache, so we shouldn't use it if
accurate dcache is turned on. This fixes the graphical corruption that
would occur in Mario Kart Wii's menu FMVs with accurate dcache.

JitArm64 never had this problem, because it implements dcbz in a
different way. It calls EmitBackpatchRoutine, which already has a check
for accurate dcache.
2026-04-02 11:06:28 +02:00
JosJuice
ff4a7c8a95 Reword the invalid read/write panic alert
Some users seem to be under the impression that the panic alert is
saying that enabling MMU will fix the issue, but that's not what it
actually says. Let's try to make this a bit clearer.
2026-03-24 22:25:01 +01:00
JosJuice
989a95a177 Core: Add INI-only setting for page table fastmem 2026-03-17 18:27:05 +01:00
JosJuice
c1a26808ce Jit: Use RangeSet for physical_addresses
This makes JitBaseBlockCache::ErasePhysicalRange around 50% faster and
PPCAnalyzer::Analyze around 40% faster. Rogue Squadron 2's notoriously
laggy action of switching to and from cockpit view is made something
like 20-30% faster by this, though this is a very rough measurement.
2026-02-23 23:01:49 +01:00
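
To give an idea of why a range set suits this kind of data, here is a toy version that stores disjoint half-open ranges and merges neighbours on insert. SimpleRangeSet is an assumption made for illustration; it is not the RangeSet class the commit refers to, and its interface is invented.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Toy range set: disjoint half-open ranges [from, to), keyed by start.
class SimpleRangeSet
{
public:
  void Insert(uint32_t from, uint32_t to)
  {
    if (from >= to)
      return;

    // First range that starts after `from`; the one before it may overlap.
    auto it = m_ranges.upper_bound(from);
    if (it != m_ranges.begin())
    {
      auto prev = std::prev(it);
      if (prev->second >= from)
      {
        // The previous range overlaps or touches the new one: absorb it.
        from = prev->first;
        if (prev->second > to)
          to = prev->second;
        it = m_ranges.erase(prev);
      }
    }

    // Absorb every following range that starts inside or right at the end.
    while (it != m_ranges.end() && it->first <= to)
    {
      if (it->second > to)
        to = it->second;
      it = m_ranges.erase(it);
    }

    m_ranges.emplace(from, to);
  }

  // True if any stored range intersects [from, to).
  bool Overlaps(uint32_t from, uint32_t to) const
  {
    if (from >= to)
      return false;
    auto it = m_ranges.lower_bound(to);  // first range starting at or after `to`
    if (it == m_ranges.begin())
      return false;
    --it;  // last range starting before `to`
    return it->second > from;
  }

  std::size_t Size() const { return m_ranges.size(); }

private:
  std::map<uint32_t, uint32_t> m_ranges;  // start -> end (exclusive)
};
```

A block covering many consecutive instructions collapses into a single entry, so an overlap query costs one ordered lookup instead of one lookup per address.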
JosJuice
36f45dce44 Move RangeSet from Externals to Common
This is a very small library, and as I understand it, it was more or less
developed for Dolphin.

This moves the two relevant files from Externals to Common, changes the
namespace to Common, reformats the code, and adds Dolphin copyright
notices. The change in copyright notice and license was approved by
AdmiralCurtiss.
2026-02-23 22:55:38 +01:00
Dentomologist
3d16e0c5be
Merge pull request #14343 from TryTwo/toggle_breaking
BreakpointsWidget: Add option to toggle all breaking
2026-02-19 21:55:38 -08:00
TryTwo
f9c7731f4d Debugger/BreakpointsWidget: Add option to disable/enable all breaking without affecting individual breakpoint enabled states.
Allows you to quickly stop breaking, play the game, then re-enable breaking. Useful if you have many active breakpoints, but need to run the game.
2026-02-19 16:23:42 -07:00
Martino Fontana
c9457cf906 Jit: Emit Branch Watch code only if it's enabled
JIT code related to Branch Watch was emitted if the debugging UI was active: the emitted code would dynamically check whether Branch Watch is active.
However, this causes two problems:
1. It decreases performance by just having the debugging UI enabled
2. It clutters the host assembly in the JIT tab, making it harder to read (unaware readers will wonder what these instructions are for)

With this PR, code related to Branch Watch is emitted only if Branch Watch itself is active, fixing the issues above.
The JIT cache will now be wiped whenever the feature is toggled, causing a slight stutter. However, this isn't the kind of feature that is toggled over and over, so IMO it is an acceptable trade-off.
2026-02-15 11:03:02 +01:00
JosJuice
0ce95299f6 Core: Don't create page table mappings before R/C bits are set
This gets rid of the hack of setting the R and C bits pessimistically,
reversing the performance regression in Rogue Squadron 3.
2026-02-04 21:35:22 +01:00
JosJuice
9462e9d890 Core: Update page table mappings incrementally
Removing and readding every page table mapping every time something
changes in the page table is very slow. Instead, let's generate a diff
and ask Memmap to update only the diff.
2026-02-04 21:35:20 +01:00
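
The diffing idea can be sketched like this, assuming mappings are modelled as a map from logical page to physical page. The Mappings alias, the MappingDiff struct and ComputeDiff are hypothetical; the real data structures and the Memmap interface are not shown in the commit message.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Assumed model: logical page address -> physical page address.
using Mappings = std::map<uint32_t, uint32_t>;

struct MappingDiff
{
  std::vector<uint32_t> removed;  // logical pages whose old mapping must go
  Mappings added;                 // mappings that must be (re)created
};

// Computes what has to change to get from old_mappings to new_mappings.
// Entries that are identical in both are left alone.
MappingDiff ComputeDiff(const Mappings& old_mappings, const Mappings& new_mappings)
{
  MappingDiff diff;

  for (const auto& [logical, physical] : old_mappings)
  {
    const auto it = new_mappings.find(logical);
    if (it == new_mappings.end() || it->second != physical)
      diff.removed.push_back(logical);
  }

  for (const auto& [logical, physical] : new_mappings)
  {
    const auto it = old_mappings.find(logical);
    if (it == old_mappings.end() || it->second != physical)
      diff.added.emplace(logical, physical);
  }

  return diff;
}
```

A remapped page appears in both lists, so the consumer can process all removals first and then all additions.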
JosJuice
7b885b857e Core: Postpone page table updates when DR is unset
Page table mappings are only used when DR is set, so if page tables are
updated when DR isn't set, we can wait with updating page table mappings
until DR gets set. This lets us batch page table updates in the Disney
Trio of Destruction, improving performance when the games are loading
data. It doesn't help much for GameCube games, because those run tlbie
with DR set.

The PowerPCState struct has had its members slightly reordered. I had to
put pagetable_update_pending less than 4 KiB from the start so AArch64's
LDRB (immediate) can access it, and I also took the opportunity to move
some other members around to cut down on padding.
2026-02-04 21:34:07 +01:00
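
A minimal sketch of the deferral described above, using a toy state struct. Only the pagetable_update_pending name comes from the commit message; ToyState, its other members and the functions are assumptions. The static_assert mirrors the layout constraint: AArch64's LDRB (immediate) with an unsigned offset can only reach byte offsets 0 to 4095.

```cpp
#include <cstddef>
#include <cstdint>

constexpr uint32_t MSR_DR = 0x10;  // data address translation enabled

// Toy stand-in for the CPU state, not the real PowerPCState.
struct ToyState
{
  uint32_t msr = 0;
  bool pagetable_update_pending = false;
  int mapping_rebuilds = 0;  // counts how often the expensive update ran
};

// The flag must be reachable with a 12-bit unsigned byte offset.
static_assert(offsetof(ToyState, pagetable_update_pending) < 4096,
              "flag must be within LDRB (immediate) range");

void RebuildMappings(ToyState& state)
{
  ++state.mapping_rebuilds;
}

// Called when the guest changes the page table (e.g. signalled by tlbie).
void OnPageTableChanged(ToyState& state)
{
  if ((state.msr & MSR_DR) != 0)
    RebuildMappings(state);  // mappings are in use: update now
  else
    state.pagetable_update_pending = true;  // not in use: defer
}

// Called when the guest writes MSR.
void SetMSR(ToyState& state, uint32_t new_msr)
{
  state.msr = new_msr;
  if ((new_msr & MSR_DR) != 0 && state.pagetable_update_pending)
  {
    state.pagetable_update_pending = false;
    RebuildMappings(state);  // one update covers all deferred changes
  }
}
```

Any number of page table changes made while DR is unset collapse into a single rebuild once DR is set again.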
JosJuice
083f3a7e0e Core: Create fastmem mappings for page address translation
Previously we've only been setting up fastmem mappings for block address
translation, but now we also do it for page address translation. This
increases performance when games access memory using page tables, but
decreases performance when games set up page tables.

The tlbie instruction is used as an indication that the mappings need to
be updated.

There are some accuracy downsides:

* The TLB is now effectively infinitely large, which matters if games
  don't use tlbie when modifying page tables.
* The R and C bits for page table entries get set pessimistically rather
  than when the page is actually accessed.

No games are known to be broken by these inaccuracies, but unfortunately
the second inaccuracy causes a large performance regression in Rogue
Squadron 3. You still get the old, more accurate behavior if Enable
Write-Back Cache is on.
2026-02-04 21:33:56 +01:00
JosJuice
d3ec630904 Core: Pre-shift pagetable_hashmask left by 6
This will make the upcoming commits just a little bit neater to
implement.
2026-02-01 12:39:33 +01:00
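
For context, a PTEG is 64 bytes, so the masked hash is shifted left by 6 when forming a PTEG address. The sketch below shows that masking first and shifting afterwards gives the same result as shifting first and applying a pre-shifted mask. The function names are hypothetical, and the exact expression used in Dolphin is an assumption.

```cpp
#include <cstdint>

// Mask first, then shift by log2(64) to turn the index into a byte offset.
constexpr uint32_t PtegAddress(uint32_t base, uint32_t hashmask, uint32_t hash)
{
  return base | ((hash & hashmask) << 6);
}

// Same computation with a mask that was shifted left by 6 ahead of time.
constexpr uint32_t PtegAddressPreShifted(uint32_t base, uint32_t hashmask_shifted,
                                         uint32_t hash)
{
  return base | ((hash << 6) & hashmask_shifted);
}

// Example values chosen for illustration only.
static_assert(PtegAddress(0x00100000, 0x3FF, 0x12345) == 0x0010D140, "");
static_assert(PtegAddressPreShifted(0x00100000, 0x3FFu << 6, 0x12345) == 0x0010D140, "");
```

Storing the mask pre-shifted means code that already works with byte offsets can apply it directly.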
JosJuice
08884746ed Core: Detect SR updates 2026-02-01 12:39:33 +01:00
Martino Fontana
a14c88ba67 Remove unused imports
Yellow squiggly lines begone!
Done automatically on .cpp files through `run-clang-tidy`, with manual corrections to the mistakes.
If an import is directly used, but is technically unnecessary since it's recursively imported by something else, it is *not* removed.
The tool doesn't touch .h files, so I did some of them by hand while fixing errors due to old recursive imports.
Not everything is removed, but the cleanup should be substantial enough.
Because this was done on Linux, code that isn't used on it is mostly untouched.
(Hopefully no open PR is depending on these imports...)
2026-01-25 16:12:15 +01:00
JosJuice
3221e982d3
Merge pull request #13900 from JosJuice/jit-fma-double-rounding
Jit: Implement error-free transformation for single-precision FMA
2026-01-23 21:43:18 +01:00
Dentomologist
009c53ab89
Merge pull request #14146 from jordan-woyak/cached-interp-fix-function-cast-warning
CachedInterpreter: Replace reinterpret_cast with std::bit_cast to resolve -Wcast-function-type-mismatch warnings.
2026-01-21 13:29:44 -08:00
JosJuice
3b1a4739bc JitArm64: Special-case fmadds with single-precision inputs
If all inputs to an fmadds instruction (including cousins like fmsubs,
fnmadd...) are single-precision, then the result is identical between a
double-precision calculation with an error-free transform (whether the
calculation is fused or not) and a single-precision FMA instruction
(must be fused). So as a performance optimization in JitArm64, if we
were going to use double precision with EFT but the inputs are singles,
instead we'll use a normal single-precision FMA instruction without
anything extra. This lets us skip both the EFT and double-to-single
conversions.

Also renaming `inaccurate_fma` to `nonfused` because it's confusing that
`inaccurate_fma` and `m_accurate_fmadds` have such similar names
despite controlling separate things.
2026-01-18 20:03:54 +01:00
JosJuice
58487f1633 Jit: Implement error-free transformation for single-precision FMA
This implements the equivalent of 07443e2d41 in Jit64 and JitArm64.
Aims to fix https://bugs.dolphin-emu.org/issues/13865.
2026-01-18 20:02:49 +01:00
JosJuice
6ac7ffcdd7 Jit64: Return FixupBranch from HandleNaNs
This will be used in the next commit to skip running code that's
unnecessary when the result is NaN.
2026-01-18 20:02:49 +01:00
JosJuice
d5067b6276 Jit64: Replace MOVSD with MOVAPD in software FMA
Should be a little faster by avoiding false dependencies. Note that
there is one remaining MOVSD that really needs to be a MOVSD.
2026-01-18 20:02:49 +01:00
JosJuice
caad84c636 JitArm64: Reduce register pressure for inaccurate FMA with accurate NaNs
If result_reg is set to a temporary register instead of VD because of
accurate NaNs, there's no need to allocate a secondary temporary
register because of inaccurate FMA.
2026-01-18 20:02:49 +01:00
JosJuice
addededecf JitArm64: Always use double precision for inaccurate FMA
When we're emulating single-precision FMA using an FMA instruction,
there's no precision benefit from using a double-precision instruction,
assuming all inputs are single-precision. But when we're emulating
single-precision FMA using separate multiplication and addition
instructions, there is.

This change increases the precision of inaccurate FMA to the same level
as Jit64, which matters since the only reason we have the inaccurate
FMA mode is for sync compatibility with Jit64.
2026-01-18 10:36:00 +01:00
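
One way to see where the extra precision comes from: a single has a 24-bit significand, so the product of two singles needs at most 48 bits and fits a double's 53-bit significand without rounding. With separate multiply and add instructions in double precision, only the addition rounds. The sketch below checks this with std::fma, which recovers the rounding error of a product. The helper names are made up for illustration.

```cpp
#include <cmath>

// True if a * b, computed in double precision, has no rounding error.
// fma(da, db, -product) evaluates da * db - product with a single rounding,
// which yields exactly the error made when computing product.
bool ProductIsExactInDouble(float a, float b)
{
  const double da = a;
  const double db = b;
  const double product = da * db;
  return std::fma(da, db, -product) == 0.0;
}

// Same check, but with the multiplication done in single precision.
bool ProductIsExactInSingle(float a, float b)
{
  const float product = a * b;
  return std::fma(a, b, -product) == 0.0f;
}
```

For a = b = 1 + 2^-23, the exact product is 1 + 2^-22 + 2^-46. A single cannot hold the last term, but a double can.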
iwubcode
6d40f4e897
Merge pull request #14265 from JoshuaVandaele/std-unreachable
c++23: Replace Common::Unreachable with std::unreachable
2026-01-17 22:32:53 -06:00
JMC47
035bcffc63
Merge pull request #14289 from Sintendo/typos
Fix various typos and spelling mistakes
2026-01-17 19:10:50 -05:00
Joshua Vandaële
e822cc3715
c++23: Replace Common::Unreachable with std::unreachable
Requires at least GCC 12, Clang 15, MSVC 19.32, or AppleClang 14.0.3.
2026-01-17 23:53:21 +01:00
iwubcode
b556bd99d7
Merge pull request #14268 from JoshuaVandaele/std-tounderlying
c++23: Replace Common::ToUnderlying with std::to_underlying
2026-01-17 16:49:57 -06:00
Sintendo
1e0473e44f Fix various typos and spelling mistakes 2026-01-17 20:11:38 +01:00
JMC47
2aee998a8e
Merge pull request #14199 from JosJuice/jit64-rcoparg-isimm
Jit64: Return current value from RCOpArg::IsImm
2026-01-12 13:06:09 -05:00
JosJuice
3ea366119f Jit64: Make TrampolineInfo smaller
Combined with the previous commit, this brings the TrampolineInfo struct
down to 48 bytes. This matters, because Jit64 has a big
std::unordered_map where it stores many megabytes of TrampolineInfo
entries.

The key saving comes from shrinking the len member from u32 to u16. It
should be safe to even turn it into a u8, but going that far brings no
additional savings due to how the padding works out.
2026-01-11 19:12:26 +01:00
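
The padding effect can be reproduced with a small hypothetical struct. None of the members below except the len name come from the commit message, and the layouts are not those of the real TrampolineInfo.

```cpp
#include <cstdint>

// Hypothetical layouts that differ only in the width of `len`.
struct InfoWithU32Len
{
  uint64_t address;
  uint32_t offset;
  uint32_t len;
  uint16_t flags;  // pushes the struct past 16 bytes, so it gets padded
};

struct InfoWithU16Len
{
  uint64_t address;
  uint32_t offset;
  uint16_t len;
  uint16_t flags;  // now fits in the space freed by shrinking len
};

struct InfoWithU8Len
{
  uint64_t address;
  uint32_t offset;
  uint8_t len;
  uint16_t flags;  // alignment of flags reintroduces one byte of padding
};
```

On typical 64-bit ABIs these come out as 24, 16 and 16 bytes respectively: going from u32 to u16 removes a whole alignment step, while going further to u8 only converts a used byte into padding.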
Joshua Vandaële
55f0715ad4
c++23: Replace Common::ToUnderlying with std::to_underlying
Requires at least GCC 11, Clang 13, MSVC 19.30 (VS2022 17.0), or AppleClang 13.1.6 (XCode 13.3).
2026-01-09 23:49:10 +01:00
Joshua Vandaële
74b1930da4
JitArm64_RegCache: Fix "is always true" warnings 2025-12-29 11:12:07 +01:00
JMC47
e17f6cff30
Merge pull request #13959 from Sintendo/jitarm64-subfx-merge
JitArm64_Integer: Merge subfx and subfcx
2025-12-22 13:27:38 -05:00
JosJuice
fca27c375a Jit64: Explicitly get imm for clobbered stores
If we're on an x64 CPU that doesn't have the MOVBE extension, trying to
SwapAndStore a host register results in that register's value getting
clobbered with the swapped value. Jit64::stX and Jit64::stXx detect this
case, and if necessary, emit a MOV to a register that's fine to clobber.

This logic was broken by the merge of PR 12134. Jit64::stX and
Jit64::stXx were assuming that if RegCache::IsImm returns true for a
guest register, calling RegCache::Use or RegCache::BindOrImm for that
guest register would result in an immediate. However, PR 12134 made it
possible for a guest register to have both a host register and an
immediate in the register cache at the same time. When this happens,
RegCache::IsImm returns true, yet RegCache::Use and RegCache::BindOrImm
return an RCOpArg whose Location returns a host register. (To make it
extra confusing, RCOpArg::IsImm calls RegCache::IsImm if the RCOpArg
came from RegCache, so RCOpArg::IsImm returns true!)

To fix this, in cases where Jit64::stX and Jit64::stXx explicitly need
an immediate to avoid having to emit an extra MOV, let's call
RegCache::Imm32 so that we're certain that we're getting an immediate.

This fixes an issue on older x64 CPUs that manifested as e.g. completely
broken graphics in Spyro: Enter the Dragonfly.
2025-12-08 23:19:10 +01:00
JosJuice
48009fd898 Jit64: Return current value from RCOpArg::IsImm
The constant propagation PR made it so that a guest register can be
present in the register cache as both a host register and an immediate
at the same time. If such a guest register is requested from the
register cache, the register cache prefers returning it as a host
register. However, RCOpArg::IsImm still returns true in this case. This
is confusing, especially since OpArg::IsImm does not return true if the
RCOpArg is converted into an OpArg.

This commit makes RCOpArg::IsImm check whether RCOpArg::Location returns
an immediate, so that RCOpArg::IsImm returns false when a host register
is being used. Code that wants to know whether an immediate exists in
the register cache rather than whether an immediate is currently being
used should call RegCache::IsImm instead.
2025-12-07 23:09:07 +01:00
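
The distinction between "an immediate exists in the cache" and "an immediate is what is currently being used" can be modelled with a toy guest register. All names below are hypothetical; the comments indicate which Jit64 function each helper is loosely meant to mirror.

```cpp
#include <cstdint>
#include <optional>

// Toy model of one guest register in a register cache.
struct ToyGuestReg
{
  std::optional<int> host_reg;        // host register index, if bound
  std::optional<uint32_t> immediate;  // known constant value, if any
};

enum class Location
{
  HostRegister,
  Immediate,
  Memory,
};

// The cache prefers handing out the host register when both are present.
Location CurrentLocation(const ToyGuestReg& reg)
{
  if (reg.host_reg.has_value())
    return Location::HostRegister;
  if (reg.immediate.has_value())
    return Location::Immediate;
  return Location::Memory;
}

// Loosely mirrors RegCache::IsImm: is a constant value known at all?
bool HasKnownImmediate(const ToyGuestReg& reg)
{
  return reg.immediate.has_value();
}

// Loosely mirrors RCOpArg::IsImm after this commit: is the operand that
// would actually be emitted an immediate?
bool IsUsedAsImmediate(const ToyGuestReg& reg)
{
  return CurrentLocation(reg) == Location::Immediate;
}
```

For a register that has both a host register and a known constant, HasKnownImmediate returns true while IsUsedAsImmediate returns false.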
JosJuice
213dc1c9af
Merge pull request #14178 from Dentomologist/jit64_avoid_passing_immediate_to_non_immediate_parameter
Jit64: Avoid passing immediate to non-immediate parameter
2025-12-01 20:01:52 +01:00
JosJuice
0c024de591 Jit64: Flush carry flag in FallBackToInterpreter
We have an optimization where the guest carry flag is kept in the host
carry flag between certain back-to-back pairs of integer instructions.
If the second instruction falls back to the interpreter, then
FallBackToInterpreter should flush the carry flag to m_ppc_state,
otherwise the interpreter reads a stale carry flag and at some later
point Jit64 trips the "Attempt to modify flags while flags locked!"
assertion.

An alternative solution would be to not store the guest carry flag in
the host carry flag to begin with if we know the next instruction is
going to fall back to the interpreter, but knowing that in advance is
non-trivial. Since interpreter fallbacks aren't exactly intended to be
super optimized, I went for the flushing solution instead, which is how
JitArm64 already works. In most cases, the emitted code shouldn't even
differ between these two solutions.

Note that the problematic situation only happens if the first integer
instruction doesn't fall back to the interpreter but the second one
does. This used to be impossible because there's no "JIT disable"
setting that's granular enough to disable some integer instructions but
not all, but with the constant propagation PR, it's possible if constant
propagation is able to entirely evaluate the first instruction but not
the second.
2025-11-29 11:45:43 +01:00
Dentomologist
c2d277c5d1 Jit64: Avoid passing immediate to non-immediate parameter
Call `UseNoImm` instead of `Use` on parameter `a` of `MultiplyImmediate`
since `Ra` gets passed to `IMUL` which asserts that parameter is not an
immediate.
2025-11-26 16:27:26 -08:00
Sintendo
a18cf5693e JitArm64: Remove some unused includes 2025-11-23 09:54:53 +01:00
Sintendo
419f90107d JitArm64_Integer: Merge subfx and subfcx
The optimizations for subfcx introduced in #13852 also apply to subfx.
Rather than duplicating the logic, we merge the handlers, like we did
in #10120 for x86.
2025-11-23 09:54:45 +01:00
Jordan Woyak
4a89300929 CachedInterpreter: Replace reinterpret_cast with std::bit_cast to resolve -Wcast-function-type-mismatch warnings. 2025-11-20 15:31:23 -06:00
JosJuice
49e9cd42d4 JitArm64: Call GetImm before BindToRegister in subfcx
When BindToRegister is called, the register cache marks the relevant
guest register as no longer containing an immediate. However, subfcx was
calling GetImm after BindToRegister. This led to a lot of panic alerts
after 2995aa5be4 added an assert to GetImm to check that the passed-in
register is an immediate.

Both before and after 2995aa5be4, the actual value of the immediate
wasn't overwritten by BindToRegister, only the fact that the register
is an immediate. Because of this, the emitted code happened to work
correctly.
2025-11-17 20:00:36 +01:00
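
The ordering problem can be shown with a toy cache holding a single register. ToyRegCache is an assumption for illustration and only borrows the GetImm and BindToRegister names from the commit message. As described above, binding clears the "is immediate" flag but leaves the stored value untouched.

```cpp
#include <cassert>
#include <cstdint>

// Toy cache for a single guest register.
class ToyRegCache
{
public:
  void SetImmediate(uint32_t value)
  {
    m_value = value;
    m_is_imm = true;
  }

  bool IsImm() const { return m_is_imm; }

  uint32_t GetImm() const
  {
    assert(m_is_imm);  // the kind of check that exposed the wrong ordering
    return m_value;
  }

  // Binding marks the register as no longer being an immediate.
  // The stored value itself is not overwritten.
  void BindToRegister() { m_is_imm = false; }

private:
  uint32_t m_value = 0;
  bool m_is_imm = false;
};

// Correct ordering: read the immediate before binding the register.
uint32_t ReadImmThenBind(ToyRegCache& cache)
{
  const uint32_t imm = cache.GetImm();
  cache.BindToRegister();
  return imm;
}
```

Swapping the two calls in ReadImmThenBind would trip the assert even though the value read back would happen to be the same.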
JosJuice
b9d9f36ce5 JitArm64: Replace dirty flag and partially replace RegType enum
Like Jit64, JitArm64 now keeps track of the location of a guest register
using three booleans: Whether it is in ppcState, whether it is in a host
register, and whether it is a known immediate. The RegType enum remains
only for the purpose of keeping track of what format FPRs are stored in
in host registers.
2025-11-16 09:52:09 +01:00