dolphin

mirror of https://github.com/dolphin-emu/dolphin.git synced 2026-08-03 02:02:57 -05:00

Author	SHA1	Message	Date
LC	fa91b47863	Merge pull request #9054 from sepalani/hle-cleanup HLE cleanup	2020-09-07 22:36:19 -04:00
Jordan Woyak	0a63340c20	Merge pull request #9037 from shuffle2/code-cleanup Code cleanup	2020-08-30 19:43:23 -05:00
Sepalani	4c75b96254	HLE: Improve naming Replace 'function' with 'hook' when appropriate	2020-08-28 20:29:05 +04:00
Kate	5981a1929d	Add support for FreeBSD/arm64	2020-08-27 21:54:04 +01:00
Shawn Hoffman	938fd4e438	use constexpr for some compile-time expressions	2020-08-23 13:57:05 -07:00
Shawn Hoffman	79f5ea0474	initialize some variables which need to be	2020-08-23 13:57:05 -07:00
Tilka	a161e58591	Merge pull request #8914 from JosJuice/jit64-low-dcbz Jit64: Implement low DCBZ hack	2020-08-08 21:19:16 +01:00
JosJuice	76228fa482	Jit64: Implement low DCBZ hack I was hoping this would improve the performance of Cars 2 by avoiding interpreter fallbacks, but it doesn't seem to have made any measurable impact.	2020-08-08 22:03:34 +02:00
Tilka	3101d957b6	Merge pull request #8886 from JosJuice/stack-check-instruction PatchEngine: Attempt to fix crash in IsStackSane	2020-08-08 20:59:48 +01:00
Tilka	76b955e090	Merge pull request #8940 from RenaKunisaki/master add Break On Hit and Log On Hit for instruction breakpoints	2020-08-08 19:46:10 +01:00
Tilka	6d0bc03e00	Merge pull request #8992 from Sintendo/fselx-avx Jit64: Avoid unnecessary MOVAPS instructions	2020-08-08 19:38:48 +01:00
JosJuice	8b4f16a310	JitArm64: Avoid double rounding in fctiwzx FCVT doesn't necessarily round to zero, so the result might be inaccurate if we use it. To ensure correct rounding, we use FCVTS from double FPR to 32-bit GPR. Unfortunately, FCVTS can't do double FPR to single FPR.	2020-08-07 22:44:04 +02:00
Sintendo	08bdeefe05	Jit64AsmCommon: Use AVX in ConvertDoubleToSingle Using AVX we can eliminate another MOVAPS instruction here. Before: 0F 28 C8 movaps xmm1,xmm0 66 0F DB 0D CF 2C 00 00 pand xmm1,xmmword ptr [1F8D283B220h] After: C5 F9 DB 0D D2 2C 00 00 vpand xmm1,xmm0,xmmword ptr [271835FB220h]	2020-08-02 18:07:47 +02:00
Sintendo	31755bc13a	Jit64: fselx - Optimize SSE4.1 packed Pretty much the same optimization we did for AVX, although slightly more constrained because we're stuck with the two-operand instruction where destination and source have to match. We could also specialize the case where registers b, c, and d are all distinct, but I decided against it since I couldn't find any game that does this. Before: 66 0F 57 C0 xorpd xmm0,xmm0 66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9 41 0F 28 CE movaps xmm1,xmm14 66 41 0F 38 15 CC blendvpd xmm1,xmm12,xmm0 44 0F 28 F1 movaps xmm14,xmm1 After: 66 0F 57 C0 xorpd xmm0,xmm0 66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9 66 45 0F 38 15 F4 blendvpd xmm14,xmm12,xmm0	2020-07-29 17:28:48 +02:00
Sintendo	afb86a12ab	Jit64: fselx - Optimize AVX packed For the packed variant, we can skip the final MOVAPS and write the result directly into the destination register. Before: 66 0F 57 C0 xorpd xmm0,xmm0 66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9 C4 C3 09 4B CC 00 vblendvpd xmm1,xmm14,xmm12,xmm0 44 0F 28 F1 movaps xmm14,xmm1 After: 66 0F 57 C0 xorpd xmm0,xmm0 66 41 0F C2 C1 06 cmpnlepd xmm0,xmm9 C4 43 09 4B F4 00 vblendvpd xmm14,xmm14,xmm12,xmm0	2020-07-29 17:06:52 +02:00
Sintendo	a52774ca63	Jit64: fselx - Add AVX path AVX has a four-operand VBLENDVPD instruction, which allows for the first input and the destination to be different. By taking advantage of this, we no longer need to copy one of the inputs around and we can just reference it directly, provided it's already in a register (I have yet to see this not be the case). Before: 66 0F 57 C0 xorpd xmm0,xmm0 F2 41 0F C2 C6 06 cmpnlesd xmm0,xmm14 41 0F 28 CE movaps xmm1,xmm14 66 41 0F 38 15 CA blendvpd xmm1,xmm10,xmm0 F2 44 0F 10 F1 movsd xmm14,xmm1 After: 66 0F 57 C0 xorpd xmm0,xmm0 F2 41 0F C2 C6 06 cmpnlesd xmm0,xmm14 C4 C3 09 4B CA 00 vblendvpd xmm1,xmm14,xmm10,xmm0 F2 44 0F 10 F1 movsd xmm14,xmm1	2020-07-28 23:17:18 +02:00
Rena Kunisaki	a553f22385	Add Break On Hit and Log On Hit for instruction breakpoints	2020-07-11 13:38:58 -04:00
MerryMage	a10447eae2	JitArm64_Paired: Fix ps_msub when d == b	2020-07-01 20:11:54 +01:00
Tillmann Karras	a04ac23794	JitArm64: no intermediate rounding for paired FMA	2020-07-01 00:24:08 +01:00
Tillmann Karras	2a46c1f86f	JitArm64: annotate intentional fallthrough	2020-07-01 00:10:15 +01:00
OatmealDome	089ffb9ef4	JitArm64: Don't assume fastmem arena is available	2020-06-29 00:42:56 -04:00
JosJuice	364ef76ba1	PatchEngine: Attempt to fix crash in IsStackSane HostIsInstructionRAMAddress uses XCheckTLBFlag::OpcodeNoException, so we should also use XCheckTLBFlag::OpcodeNoException when reading, to ensure that we use the IBAT (as opposed to the DBAT) for both.	2020-06-18 11:57:00 +02:00
Pierre Bourdon	dd1fc711c7	PowerPC: partially implement thermal related SPRs Doesn't support triggering interrupts when the thermal threshold is exceeded, but allows polling for temperature information. The THRM[123] registers are documented in most PPC datasheets, see e.g. this PPC750CX one: http://datasheets.chipdb.org/IBM/PowerPC/750/750cx_um3-17-05.pdf	2020-06-18 07:37:44 +02:00
Jun Su	bb75050f68	Jit: fix warning -Winvalid-offsetof Remove the warning: warning: offsetof within non-standard-layout type ‘JitBlock’ is conditionally-supported JitBlock contains non-trivial types now. Split the fields with trivial types that need to be accessed from JIT code into the JitBlockData structure.	2020-05-04 18:26:56 +02:00
Minty-Meeo	cc858c63b8	Configurable MEM1 and MEM2 sizes at runtime via Dolphin.ini Changed several enums from Memmap.h to be static vars and implemented Get functions to query them. This seems to have boosted speed a bit in some titles? The new variables and some previously statically initialized items are now initialized via Memory::Init() and the new AddressSpace::Init(). s_ram_size_real and the new s_exram_size_real in particular are initialized from new OnionConfig values "MAIN_MEM1_SIZE" and "MAIN_MEM2_SIZE", only if "MAIN_RAM_OVERRIDE_ENABLE" is true. GUI features have been added to Config > Advanced to adjust the new OnionConfig values. A check has been added to State::doState to ensure savestates with memory configurations different from the current settings aren't loaded. The STATE_VERSION is now 115. FIFO Files have been updated from version 4 to version 5, now including the MEM1 and MEM2 sizes from the time of DFF creation. FIFO Logs not using the new features (OnionConfig MAIN_RAM_OVERRIDE_ENABLE is false) are still backwards compatible. FIFO Logs that do use the new features have a MIN_LOADER_VERSION of 5. Thanks to the order of function calls, FIFO logs are able to automatically configure the new OnionConfig settings to match what is needed. This is a bit hacky, though, so I also threw in a failsafe for if the conditions that allow this to work ever go away. I took the liberty of adding a log message to explain why the core fails to initialize if the MIN_LOADER_VERSION is too great. Some IOS code has had the function "RAMOverrideForIOSMemoryValues" appended to it to recalculate IOS Memory Values from retail IOSes/apploaders to fit the extended memory sizes. Worry not, if MAIN_RAM_OVERRIDE_ENABLE is false, this function does absolutely nothing. A hotfix in DolphinQt/MenuBar.cpp has been implemented for RAM Override.	2020-04-28 12:10:50 -05:00
Lioncash	ee200d09eb	Jit64/Jit64_Tables: Construct tables at compile-time Utilizing constexpr, we can eliminate the need to construct the tables at runtime and just do all the work at compile-time. Making for less moving parts overall. The general structure is more or less the same, however rather than one single initialization function, each table is built off an immediately executed lambda function. This is nice, since it narrows the scope of the table building logic down to the tables that actually need it.	2020-04-28 17:12:24 +02:00
Sintendo	19dda51a0d	Jit64: subfx - Use LEA when possible Similar to what we do for addx. Since we're calculating b - a and because subtraction is not commutative, we can only apply this when source register a holds the constant. Before: 45 8B EE mov r13d,r14d 41 83 ED 08 sub r13d,8 After: 45 8D 6E F8 lea r13d,[r14-8]	2020-04-21 22:45:47 +02:00
Sintendo	89646c898f	Jit64: addx - Skip ADD after MOV when possible We can get away with skipping the addition when we know we're dealing with a constant zero. Just a MOV will suffice in this case. Once again, we don't bother to add separate handling for when overflow is needed, because no titles would ever hit that path during my testing. Before: 8B 7D F8 mov edi,dword ptr [rbp-8] 83 C7 00 add edi,0 After: 8B 7D F8 mov edi,dword ptr [rbp-8]	2020-04-21 22:45:47 +02:00
Sintendo	50f7a7d248	Jit64: addx - Prefer smaller MOV+ADD sequence ADD has a smaller encoding for immediates that can be expressed as an 8-bit signed integer (in other words, between -128 and 127). MOV lacks this compact representation. Since addition allows us to swap the source registers, we can always get the shortest sequence here by carefully checking if we're dealing with a small immediate first. If we are, move the other source into the destination and add the small immediate onto that. For large immediates the reverse is preferable. Before: 41 BE 40 00 00 00 mov r14d,40h 44 03 75 A8 add r14d,dword ptr [rbp-58h] After: 44 8B 75 A8 mov r14d,dword ptr [rbp-58h] 41 83 C6 40 add r14d,40h Before: 44 8B 7D F8 mov r15d,dword ptr [rbp-8] 41 81 C7 00 68 00 CC add r15d,0CC006800h After: 41 BF 00 68 00 CC mov r15d,0CC006800h 44 03 7D F8 add r15d,dword ptr [rbp-8]	2020-04-21 22:42:02 +02:00
Sintendo	2481660519	Jit64: addx - Emit MOV when possible When the source registers are a simple register and a constant zero and overflow isn't needed, emitting LEA is kinda silly. This will occasionally save a single byte for certain registers due to how x86 encoding works. More importantly, LEA takes up execution resources while MOV does not. Before: 41 8D 7D 00 lea edi,[r13] After: 41 8B FD mov edi,r13d	2020-04-21 22:36:20 +02:00
Sintendo	1c25e6352a	Jit64: addx - Emit nothing when possible When the destination register matches a source register, the other source register contains zero, and overflow isn't needed, the instruction becomes a nop and we don't need to emit anything. We could add specialized handling for the case where overflow is needed, but none of the titles I tried would hit this path. Before: 83 C7 00 add edi,0 After:	2020-04-21 22:35:17 +02:00
Sintendo	f1c3ab359d	Jit64: addx - Deduplicate branches part 2 No functional change, just simplify some repeated logic in the case where we're dealing with exactly one immediate and one simple register when overflow isn't needed.	2020-04-21 22:06:46 +02:00
Sintendo	72fbdf1a6b	Jit64: addx - Deduplicate branches part 1 No functional change, just simplify some repeated logic for the cases where the destination register matches one of the sources.	2020-04-21 22:06:39 +02:00
container1234	75a69b1145	Breakpoints: Fix crash after clearing all memory breakpoints	2020-03-14 21:57:09 +09:00
Tilka	e323f47ceb	Merge pull request #8472 from degasus/jitsetting Core/Jits: Adds an option to disable the register cache.	2020-02-08 13:49:33 +00:00
Techjar	a106c99826	Jit64: Don't use PEXT in DoubleToSingle on AMD Zen This was causing severe slowdown in some games.	2020-01-26 22:10:46 -05:00
Tilka	709862b818	Merge pull request #8120 from MerryMage/cdts Jit64: Make DoubleToSingle a common asm routine	2020-01-25 19:10:37 +00:00
Connor McLaughlin	efc1ee8e6a	Merge pull request #8537 from degasus/fastmem Core/HW -> PowerPC/JIT: Fastmem arena construction	2020-01-14 09:38:15 +10:00
Tilka	98f645daac	Merge pull request #8158 from Sintendo/jitopts x64 micro-optimizations	2020-01-06 14:09:43 +01:00
Sintendo	12fcbac2a3	Jit64: addx - Emit LEA for register + immediate Prefer LEA over MOV + ADD when dealing with immediates. Before: 44 8B EE mov r13d,esi 41 83 C5 20 add r13d,20h After: 44 8D 6E 20 lea r13d,[rsi+20h]	2020-01-05 23:39:13 +01:00
Sintendo	8e7b6f4178	Jit64: addx - Prefer ADD over LEA when possible The old logic would always emit LEA when both sources are in a register and OE is disabled. However, ADD is still preferable when one of the sources matches the destination. Before: 45 8D 6C 35 00 lea r13d,[r13+rsi] After: 44 03 EE add r13d,esi	2020-01-05 23:23:56 +01:00
David Korth	c2dd2e8a2e	Use std::istringstream or std::ostringstream instead of std::stringstream where possible. This removes std::iostream from the inheritance chain, which reduces overhead slightly.	2019-12-29 23:45:02 -05:00
David Korth	9f3b9acad9	PowerPC.cpp: No need to explicitly initialize ppcState. "ppcState{}" is stored in the .data segment, which means the full ~4 MB is stored in the executable. "ppcState" is stored in the .bss segment, which means it only stores a note that tells it to allocate and zero ~4 MB at runtime.	2019-12-29 23:45:02 -05:00
degasus	aad8aab698	Jit64: Disable the fast address check if fastmem is disabled. This was a huge speedup with disabled fastmem, but it still requires the fastmem arena. So let's disable it for now, even if this commit has a huge performance hit with disabled fastmem.	2019-12-28 13:41:57 +01:00
degasus	d735943aa2	Jit64: Use safe memory helpers for psq_l* without fastmem. RMEM won't help if there is no fastmem arena, so let's use our memory helpers.	2019-12-28 13:41:57 +01:00
degasus	74cb692591	Jit64: Only activate dcbz fastpath with fastmem. The code is safe not to create memory errors, but it accesses the fastmem area.	2019-12-28 13:41:57 +01:00
degasus	c6019f9814	PowerPC/Jit: Create fastmem arena on init.	2019-12-28 13:41:57 +01:00
degasus	9d88180df7	MMU: Use the Memory helpers for physical memory. physical_base is a fastmem helper. Its access is unsafe and might not be available without a Jit.	2019-12-28 12:57:51 +01:00
Stenzek	d744c5a148	Compile fixes for Windows-on-ARM64	2019-12-28 19:20:41 +10:00
Léo Lam	3cf2857aac	Merge pull request #8520 from lioncash/analyst-tidy PowerPC/PPCAnalyst: Remove unimplemented LogFunctionCall prototype	2019-12-15 12:07:38 +01:00


2254 Commits
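
Commit ee200d09eb above describes building the Jit64 tables at compile time, with each table produced by an immediately executed lambda. The following is a minimal C++17 sketch of that general pattern only; it is not Dolphin's code. The names Op, Handler, Entry, kEntries, kDispatch, Dispatch and the three handler functions are hypothetical and chosen for illustration. The only PowerPC facts assumed are that the primary opcode is the top 6 bits of the instruction word and that primary opcode 14 is addi.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical result type so the example can show which handler ran.
enum class Op { Fallback, Addi, Table31 };

// Hypothetical handler signature: takes the raw 32-bit instruction word.
using Handler = Op (*)(std::uint32_t);

Op DoFallback(std::uint32_t) { return Op::Fallback; }
Op DoAddi(std::uint32_t) { return Op::Addi; }
Op DoTable31(std::uint32_t) { return Op::Table31; }

// One row of the sparse source list: primary opcode -> handler.
struct Entry
{
  std::size_t opcode;
  Handler handler;
};

// Sparse list of the opcodes this sketch knows about.
constexpr std::array<Entry, 2> kEntries{{{14, DoAddi}, {31, DoTable31}}};

// The dense 64-entry table is built by an immediately executed lambda.
// Because the variable is constexpr, all of this runs at compile time.
constexpr std::array<Handler, 64> kDispatch = [] {
  std::array<Handler, 64> table{};
  for (auto& slot : table)
    slot = DoFallback;  // default every slot to the fallback handler
  for (const auto& entry : kEntries)
    table[entry.opcode] = entry.handler;  // overwrite the known opcodes
  return table;
}();

// Compile-time checks: the table was populated without any runtime init.
static_assert(kDispatch[14] == DoAddi);
static_assert(kDispatch[0] == DoFallback);

// Look up the handler by primary opcode (top 6 bits) and call it.
Op Dispatch(std::uint32_t inst)
{
  return kDispatch[inst >> 26](inst);
}
```

Calling Dispatch(14u << 26) selects the addi handler, and any opcode missing from kEntries falls through to DoFallback. Keeping the fill logic inside the lambda scopes it to the one table that needs it, which is the benefit the commit message points out.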