This should have been done when rebasing 6cc4f593e5 after the merge of
3a00ff625e. There are no correctness implications as far as I know,
only very minor performance implications.
FL_EVIL is only used for blocking instructions from being reordered.
There are three types of instructions which have FL_EVIL set:
1. CR operations. The previous commits improved our CR analysis
and removed FL_EVIL from these instructions.
2. Load/store operations. These are always blocked from reordering
due to always having canCauseException set.
3. isync. I don't know if we actually need to prevent reordering
around this one, since as far as I know we only do reorderings
that are guaranteed to not change the behavior of the program.
But just in case, I've renamed FL_EVIL to FL_NO_REORDER instead of
removing it entirely, so that it can be used for this instruction.
Other than the CR instructions, which we now analyze properly,
all the covered instructions are not integer operations and also
have either FL_ENDBLOCK or FL_EVIL set, so there are two other
checks in CanSwapAdjacentOps that will reject them.
This brings the analysis done for condition registers
more in line with the analysis done for GPRs and FPRs.
This gets rid of the old wantsCR member, which wasn't actually
used anyway. In case someone wants it again in the future, they
can compute the bitwise inverse of crDiscardable.
Not sure if this was causing correctness issues – it depends on whether
the HLE code was actually reading the discarded registers – but it was
at least causing annoying assert messages in one piece of homebrew.
This bug has been lurking in the code ever since I added the discard
functionality. It doesn't seem to be triggered all that often,
and on top of that the emitted code only runs conditionally, so I'm not
sure if people have been affected by this bug in practice or not.
We can utilize the fact that if swapping two instructions causes another
swap to become possible, that swap involves one of the two instructions
we just swapped. There's need to search for new swap opportunities as
far back as the beginning of the block; we can just go one step back.
This fixes a problem I was having where using frame advance with the
debugger open would frequently cause panic alerts about invalid addresses
due to the CPU thread changing MSR.DR while the host thread was trying
to access memory.
To aid in tracking down all the places where we weren't properly locking
the CPU, I've created a new type (in Core.h) that you have to pass as a
reference or pointer to functions that require running as the CPU thread.
In a code block where a guest register is accessed at least twice and the
last access is a write and the register is not discardable immediately
after the second-to-last instruction (perhaps there is an instruction
in between that can cause an exception), currently Dolphin's JITs will
flush the register after the second-to-last instruction.
It would be better if we replaced the flush after the second-to-last
instruction with a flush that only happens if the exception path is
taken. This change accomplishes that by marking guest registers as
"in use" not just when they are used as inputs but also when they are
used as outputs, preventing the loop in DoJit from flushing the
register until after the last access.
This piece of code is rather hard to understand, but my best guess
at what it's trying to do is that it tries to create opportunities
to skip writing CRs back to ppcState if we know that there are no
CR instructions (or branch instructions, etc) between an instruction
that writes to a CR register and the next blr. This is technically
inaccurate emulation, but as long as games don't do anything too
weird with their ABIs, I suppose it doesn't break anything.
So why do I want to get rid of it? Well, other than breaking some
hypothetical weird game, I imagine it could trip up people trying
to debug a game who are looking at the CR contents. And the code
is just plain confusing. (blr clearly doesn't write to CRs!)
This is done entirely through interpreter fallbacks. It would
probably be possible to implement this using host exception
handlers instead, but I think it would be a lot of complexity
for a rarely used feature, so let's not do it for now.
For performance reasons, there are two settings for this feature:
One setting which does enables just what True Crime: New York City
needs and one setting which enables it all. The latter makes
almost all float instructions fall back to the interpreter.
SPDX standardizes how source code conveys its copyright and licensing
information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX
tags are adopted in many large projects, including things like the Linux
kernel.
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.
Fixes a 62ce1c7 regression.
This commit adds a new "discarded" state for registers.
Discarding a register is like flushing it, but without
actually writing its value back to memory. We can discard
a register only when it is guaranteed that no instruction
will read from the register before it is next written to.
Discarding reduces the register pressure a little, and can
also let us skip a few flushes on interpreter fallbacks.
The output of instructions like fabsx and ps_sel is store-safe
if and only if the relevant inputs are. The old code was always
marking the output as store-safe if the output was a single,
and never otherwise.
Also, the old code was treating the output of psq_l/psq_lu as
store-safe, which seems incorrect (if dequantization is disabled).
Converts the remaining PowerPC code over to fmt-capable logging.
Now, all that's left to convert over are the lingering remnants within
the frontend code.
The idea of this code was to not unroll loops, but it was completely broken.
So we've unrolled all loops, but only up to the second iteration.
Honestly, a better check would test if we branch to code which is already in the compiling block. But this is out of scope for now.
But testing shows that this unrolling actually improve the performance. So instead of fixing this bug, this check can be dropped.