[D3D9] Rayman Origins: Mostly missing Electoon hair in World 1 Finale #2545
Comments
DadSchoorse added a commit to DadSchoorse/dxvk that referenced this issue Mar 16, 2022
That PR fixes the issue for me on nvidia, please test.
misyltoad pushed a commit that referenced this issue Mar 16, 2022
TheIronWolfModding added a commit to TheIronWolfModding/dxvk that referenced this issue Apr 23, 2022
…s unmap (Thanks, k0bin) (#7) * [dxvk] Introduce DxvkStagingBuffer * [d3d11] Use DxvkStagingBuffer in D3D11DeviceContext * [dxvk] Use staging buffer for gamma ramp uploads * [dxvk] Remove unused updateImage function * [dxvk] Use DxvkStagingBuffer in DxvkContext * [dxvk] Remove DxvkStagingDataAlloc Unused and overly clunky. * [dxvk] Allow large sysmem allocations on 64-bit platforms again Since we frequently discard staging buffers now, having larger chunks is actually beneficial again. * [dxvk] Introduce transient memory flag for staging buffers Potentially reduces fragmentation by putting short-lived staging buffers and sysmem resources created by the application into different memory pools. * [dxvk] Rework HUD font texture initialization We really shouldn't need a separate context for this. * [dxvk] Remove unused trimStagingBuffers method * [dxvk] Don't suballocate large staging buffer allocations Otherwise we'll risk wasting almost half the staging buffer memory. Creating a temporary buffer is cheap enough, so just do that. * [dxvk] Reduce context staging buffer size to 4 MiB Same idea as before, just create a temporary buffer for larger resources. This can avoid frequent Vulkan memory allocations and deallocations since many small buffers are more likely to fit into a single memory chunk than a small number of large buffers, thus reducing the overall memory footprint. * [dxvk] Add stat counter for pipeline barriers * [dxvk] Display barrier count in draw call HUD item * [dxvk] Introduce DxvkDevice::waitForResource Blocks on the queue thread's condition variable instead of busy-waiting, and tracks synchronization with new stat counters. Cleanup is rearranged to minimize delays before signals and resources are notified. 
* [d3d11] Use new waitForResource method * [d3d9] Use new waitForResource method * [hud] Display GPU synchronization in HUD * [d3d11] Add d3d11.maxImplicitDiscardSize option * [util] Bump maxImplicitDiscardSize for Quantum Break Otherwise we're synchronizing and frame times are garbage. * [util] Enable apitrace mode for Nier Replicant Game is broken and reads back dynamic vertex/index buffers over PCI-E. * [d3d11] Handle subresource field in copy/move operations Derp. * [d3d11] Enable stall tracking for timestamp queries Because games are dumb and don't understand that the GPU doesn't work synchronously with the render thread. * [d3d11] Add implicit flush after tracking sequence numbers Flushing early when using a tracked resource may reduce stalls. * [dxvk] Repurpose initImage method This is now supposed to clear images of any type, and only to be used for resource initialization after creation. * [d3d11] Use initImage to clear uninitialized image resources * [d3d9] Use initImage to clear uninitialized image resources * [dxvk] Remove unused clear methods * [dxvk] Add command buffer parameter to cmdFillBuffer * [dxvk] Add initBuffer method * [d3d11] Use initBuffer method * [d3d9] Use initBuffer method * [dxvk] Track buffer as used in initBuffer Git ate my commit when I was testing something... * [d3d11] Consider empty CS chunks when tracking resources Avoids deadlocks if we track multiple resources and flush in between. * [dxvk] Free existing staging buffer before creating a new one * [d3d11] Set zero stride when binding null vertex buffer * [dxvk] Remove null check when setting vertex stride Move the responsibility to the front-end instead. * [dxvk] Don't use spinlocks for CS chunk pool No reason to anymore since SRWLocks are fast enough here. * [dxvk] Introduce lock-free list * [dxvk] Use lock-free list for graphics pipeline lookup And use a proper mutex if we do have to synchronize, so that we can avoid busy-waits. 
* [dxvk] Use lock-free list for compute pipeline lookup * [meta] Readme update. * [meta] Readme update * [meta] Readme update * [dxvk] Use lock-free list for render pass instances And replace the spin lock with a regular mutex. * [dxvk] Get rid of spinlock when allocating GPU events This is not performance-critical * [d3d9] Handle different mip chain lengths in UpdateTexture * [d3d9] Track last staging resource usage with a sequence number * [d3d9] Synchronize only to given sequence number in WaitForResources * [d3d9] Store buffer map mode in D3D9CommonBuffer * [d3d9] Add option to disable direct buffer mapping * [d3d9] Disable direct buffer mapping for RE games * [meta] Bump arch-mingw-github-action to v7 * [dxvk] Only mark transfer buffers as transient Otherwise we may accidentally catch things like uniform buffers as well. * [hud] Greatly simplify text rendering in the HUD * [hud] Greatly simplify frame time graph rendering * [tests] Allow includes when compiling HLSL shaders * [util] Add computeMipLevelOffset * [d3d9] Unify texture uploads * [d3d9] Rename WrittenByGPU to NeedsReadback * [d3d9] Clean up texture locking We had two code paths that largely did the same. * [d3d9] Don't set NeedsReadback for POOL_SYSMEM textures ... or in SetRenderTarget because we always do readback for render targets. * Revert "[dxvk] Use lock-free list for graphics pipeline lookup" This reverts commit 67e2ee1. TIW: causes 20% perf drop. * [vr] Upddate VR sync calls to new way of syncing. Still unsure if seq nr used is correct. * [meta] Version bump. * [d3d11] Apply apitrace mode to image upload buffers * [dxvk] Use lock-free list for graphics pipeline lookup And use a proper mutex if we do have to synchronize, so that we can avoid busy-waits. * test * [vr] Comment upddate. 
* [d3d11] Use appropriate memory types for directly mapped images * [d3d11] Introduce d3d11.maxDynamicImageBufferSize option * [meta] Update example config file * [util] Set maxDynamicImageBufferSize for Total War: Warhammer III Massively increases performance since the game otherwise keeps uploading a huge 48 MiB texture in every frame. * [meta] Formatting. * [d3d9] Fix synchronization after readback * [d3d9] Fix sysmem readback * [dxvk] Fork lockless List<T> into main and GTR2 specific versions. * [meta] Readme update. * Update README.md * Update README.md * Update README.md * [d3d11] Replace apitrace mode option with something more granular And enable it only for vertex and index buffers in Nier Replicant. * [util] Enable cached vertex and index buffers for FFXIV Fixes some weird performance issues on the Garlemald map. Doesn't seem to affect performance in other areas. * [meta] Update example configuration file * [dxvk] Invalidate buffer in clearBuffer if possible * [util] Use CPU-cached constant buffers for Anno 1800 Sigh. * [util] Enable cached vertex and index buffers for The Evil Within Large performance win. * [meta] Update README * [meta] Release 1.10 * [d3d10] Forward OpenSharedResource to D3D11 implementation Trivial since the requested IID is passed by the application. * [d3d9] Fix default initialization of some state values If we end up being the same as what we are, we don't dirty initially. * [d3d9] Set initial dirty state flags We had a bug where initial state values caused the data to not get sent to the backend. Let's fix that going forward and dirty everything we possibly can on device creation. * [build] Enable -Wimplicit-fallthrough I got bit by this in D3D9. 
* [d3d9] Fix fallthroughs in PickFormats * [dxso] Fix ExpP fallthrough * [dxso] Fix potential fallthrough in RasterizerOut * [d3d9] Add fallthrough comment to SetRenderState Silences a warning * [d3d11] Add fallthrough comment to PickFormats Silences a warning * [dxbc] Use new [[fallthrough]] attribute * [dxvk] Use new [[fallthrough]] attribute * [dxso] Fix ExpP instruction on Shader Model 2+ * [dxvk] Enable VK_KHR_EXTERNAL_MEMORY_WIN32 if available. * [dxvk] Add shared handle access to DxvkImage memory. Based off preliminary work from Josh. * [util] Add helpers for shared resource metadata access. * [d3d11] Add support for shared ID3D11Texture2D resources. * [d3d9] Add support for shared IDirect3DTexture9 resources. * [d3d11] Explicitly handle R32-compatible UAV formats * [d3d11] Reimplement R11G11B10 UAV clears without R32 views * [d3d9] Fix shared handle check for exporting images Co-authored-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> * [d3d9] Fix texture formats that can be exported * [dxvk] Force dedicated allocation for exportable images The Nvidia driver does not set prefers-/requiresDedicatedAllocation for exportable images on its own. This makes DXVK ignore the dedicated allocation struct ptr which also contains VkExportMemoryAllocateInfo or VkImportMemoryWin32HandleInfoKHR. * [d3d9] Mark backend image as shared for shared resources Otherwise, the backend may not transition the image to the correct layout after each submission. * [util] Add another weeb game to the list of workarounds Sophie is apparently D3D9 an we already have Lydie and Suelle in there, so it's just this on missing from that series. * [dxvk] Normalize color write masks for non-RGBA formats * [d3d9] Fix CS thread synchronization for directly mapped buffers * [d3d9] Update texture sequence number AFTER using it * [d3d9] Update buffer seq number in FlushBuffer How did I miss this?! 
* [dxgi] Add DXVK_ENABLE_NVAPI envvar Add a new environment variable DXVK_ENABLE_NVAPI as an environment-level override for 'nvapiHack'. This will allow for DLSS (and other NvAPI-backed features) to be available without the user manually writing a configuration file, allowing for more seamless integration with Proton's launch script. * [dxvk] Fix color write mask normalization Previously we'd set too many bits by accident here. Also, we should not modify partial write masks to include unnecessary bits. Only do this if we can actually promote to a full write mask for consistency. * [d3d9] Don't expose D32 format Not supported anywhere except REF device it seems... *sigh* Supercedes: doitsujin#2547 * [dxso] Implement zerowins for Lerp. Fixes doitsujin#2545. * [dxso] Emit spirv OpCross if we can. * [build] Cleanup build system. No changes except dropping support for msvc before 15.3. * [d3d11] Fix D3D11UserDefinedAnnotation declaration Mark it as final too. * [dxvk] Define IDXVKUserDefinedAnnotation Something common to share for perf markers between D3D9 and D3D11. Inherits from the public D3D11 interface. * [util] Move DecodeD3DColor to util This will be used in the D3D11UserDefinedAnnotation implementation to handle PIX calls which contain a color. * [d3d11] Use IDXVKUserDefinedAnnotation * [d3d9] Implement D3D9GlobalAnnotationList * [d3d9] Implement D3D9UserDefinedAnnotation * [d3d9] Add hidden exports for registering annotations Adds DXVK_RegisterAnnotation at ordinal 28257 and DXVK_UnRegisterAnnotation at ordinal 28258. * [d3d11] Register annotation interfaces with D3D9 Some apps try use the D3DPERF_ functions for debug markers/annotations. This utilizes the DXVK_RegisterAnnotation hidden functions to share the interfaces. 
Co-authored-by: Oleg Kuznetsov <okouznetsov@nvidia.com> * [dxvk] Add a config option to enable debug utils in addition to DXVK_PERF_EVENTS=1 * [d3d9] Fix Visual Studio build to resolve 'operator !=' is ambiguous error for RECT * [util] Enable cached dynamic resources for AC3 and AC4 Without it, AC3 chugs along at 40 FPS on my 5950X. * [dxbc] Support switch-case fallthrough Apparently this is a thing in Shader Model 4, although FXC cannot emit it. * [d3d11] Use smart pointer for swap chain back buffer * [d3d11] Get strong reference to swap chain in swap chain back buffers * [dxgi] Work around swapchain use-after-free bugs Affects Divinity: Original Sin Enhanced Edition. Requires Wine hack to delay memory deallocation to not crash during resolution changes. * [dxbc] Handle fallthrough around default properly * [dxvk] Zero-initialize newly allocated buffer slices on creation Fixes random flicker in God of War. Since patch 1.0.9, the game's lighting system relies on MAP_DISCARD returning a zero-initialized memory slices for its constant buffers, or some lights would get skipped in various compute passes. Changing the memset to e.g. write 0xFF instead of 0 shows this issue. * [util] Set frame latency to 1 for God of War Frame pacing is horrible otherwise, as of the 1.0.9 update. * [d3d9] UpdateTexture: Handle automatic mip gen properly * [dxbc] Generate smallest possible vectors for local arrays FXC is buggy and always emits vec4 in the array declaration, so we'll have to analyze the used components ourselves. * [dxbc] Only emit temp array range check for dynamically indexed stores Generates less code and makes things slightly more readable. * [dxbc] Actually do the skip range check thing properly Turns out the first attempt only worked because my test case didn't do any dynamically indexed stores at all, but broke everything else. Oops. * [util] Enable d3d9.deferSurfaceCreation for Stranger of Paradise FFO Reportedly required for VRR to work. 
Game still doesn't work here. * [d3d9] Disable culling when the app passes an invalid value * [d3d9] Remove evictManagedOnUnlock This is annoying to maintain and hopefully won't be necessary anymore. * [d3d9] Use regular memory for managed data * [d3d9] Use memory mapped files for managed data * [d3d9] Count frames This will be used to identify managed textures that have not been used for a while. * [d3d9] Unmap unused textures after some time * [d3d9] Add HUD item for managed memory * [d3d9] Make unmapping delay configurable * [d3d9] Make unmapping more aggressive for Borderlands * [util] Enable cached constant buffers for Frostpunk Massively improves CPU-bound performance. * [meta] Release 1.10.1 * [util] Enable d3d9.deferSurfaceCreation for Atelier Sophie 2 2022 and K-T are still using D3D9 for video stuff. * [dxvk] Filter out unnecessary access flags when recording barriers Rationale is as follows: - srcAccess never needs to contain read flags, since any memory being read must have been made visible before by a write operation - dstAccess is only relevant if srcAccess contains a write, because reads alone cannot modify memory and thus do not require making the same memory available again. An exception are layout transitions. Doesn't really change performance in anything as far as I can tell, but we avoid some unnecessary UBO cache flushes in compute-heavy scenarios. * [d3d11] Always export correct shared handle type from ::GetSharedHandle and ::CreateSharedHandle Before we just assumed that the calls here would match the corresponding flag value (D3D11_RESOURCE_MISC_SHARED -> ::GetSharedHandle, D3D11_RESOURCE_MISC_SHARED_NTHANDLE -> ::CreateSharedHandle), but it turns out that its possible to set both flags and use both methods. Now we always tell Vulkan to export a KMT handle if D3D11_RESOURCE_MISC_SHARED is present, and use openKmtHandle to get an NT handle when needed. 
* [util] Limit Limbo to 60 fps Fixes: doitsujin#2564 * [meta] Move apitrace guide over to dxvk repo * [dxvk] GPU query reset path Require VK_EXT_host_query_reset instead. This fallback path is untested nowadays and too slow to be useful. * [d3d11] Only apply anisotropy override to linear samplers Mirrors D3D9, more or less. * [util] Fix typo in app profiles Accidentally broke everything. * [util] Force sampler type spec const for Star Wars TFU2 The game tries to binda 2D texture to a slot that is declared as a 3D texture in the shader. This causes one particle effect to be completely black because DXVK does not bind the texture * [d3d9] Filter window messages when processing WM_ACTIVATEAPP. * [d3d9] Ignore multiple app activation window messages. * [d3d9] Calculate slice alignment when uploading straight from the mapping buffer * [util] Use cached constant buffers for Armored Warfare * [util] correct enableDebugUtils conf to default False * [util] remove allowLockFlagReadonly from conf * [spirv] Implement faster in-memory compression for shaders Seems to be anything up to 3x as fast to decode than the previous code, with the compression ratio being slightly worse. Encoding seems faster as well. 
* [dxvk] Introduce new way to create DxvkShader objects * [dxvk] Use new DxvkShaderCreateInfo struct to retrieve shader info * [dxvk] Use new DxvkShader constructor for swap chain shaders * [dxvk] Use new DxvkShader constructor for HUD shaders * [dxbc] Use new DxvkShader constructor * [dxso] Use new DxvkShader constructor * [d3d9] Use new DxvkShader constructor for fixed-function shaders * [d3d9] Use new DxvkShader constructor for SWVP emulation * [d3d9] Use new DxvkShader constructor for format conversion * [d3d11] Use new DxvkShaderCreateInfo struct to retrieve shader info * [d3d11] Use new DxvkShader constructor for video shaders * [dxvk] Remove old shader creation code * [d3d9+util] Enable invariant position by default * [util] Implement thread helpers on non-Windows platforms * [util] Add strlcpy helper strncpy is not safe. * [d3d9] Use strlcpy helper * [util] Implement env helpers on non-Windows platforms * [util] Add missing include to thread.h * [d3d9] Remove evictManagedOnUnlock This is annoying to maintain and hopefully won't be necessary anymore. * [d3d9] Use regular memory for managed data * [d3d9] Use memory mapped files for managed data * [d3d9] Count frames This will be used to identify managed textures that have not been used for a while. * [d3d9] Unmap unused textures after some time * [d3d9] Add HUD item for managed memory * [d3d9] Make unmapping delay configurable * [d3d9] Make unmapping more aggressive for Borderlands * [d3d9] Only bind depth buffer if the depth or stencil test is enabled * [d3d9] Only bind RT if we actually write to it The alternative render path for shadow maps in Dead Space relies on this. * [dxbc] Implement range check for private array reads We already do this for stores. 
Co-authored-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Co-authored-by: Robin Kertels <robin.kertels@gmail.com> Co-authored-by: Joshua Ashton <joshua@froggi.es> Co-authored-by: Oschowa <oschowa@web.de> Co-authored-by: Derek Lesho <dlesho@codeweavers.com> Co-authored-by: Liam Middlebrook <lmiddlebrook@nvidia.com> Co-authored-by: Georg Lehmann <dadschoorse@gmail.com> Co-authored-by: Oleg Kuznetsov <okouznetsov@nvidia.com> Co-authored-by: Paul Gofman <gofmanp@gmail.com> Co-authored-by: Blisto91 <47954800+Blisto91@users.noreply.github.com>
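Of the commits listed above, the one that names this issue is "[dxso] Implement zerowins for Lerp. Fixes doitsujin#2545." As a rough illustration of what "zero wins" means for a lerp, here is a minimal sketch, not DXVK's actual code. It assumes the D3D9 `lrp` definition `dst = src0 * (src1 - src2) + src2` and assumes that the game relies on `0 * x` evaluating to `0` even when `x` is infinite or NaN. The helper names `mulZeroWins`, `lerpIeee` and `lerpZeroWins` are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: a multiply in which a zero operand always
// produces zero, even if the other operand is infinity or NaN.
inline float mulZeroWins(float a, float b) {
  return (a == 0.0f || b == 0.0f) ? 0.0f : a * b;
}

// Plain IEEE lerp following the lrp formula: t * (a - b) + b.
// With t == 0 and a infinite, 0 * inf yields NaN and poisons the result.
inline float lerpIeee(float t, float a, float b) {
  return t * (a - b) + b;
}

// Same formula, but the multiply uses zero-wins semantics, so a
// zero weight cleanly selects b regardless of what a contains.
inline float lerpZeroWins(float t, float a, float b) {
  return mulZeroWins(t, a - b) + b;
}
```

With a weight of zero and an infinite first input, `lerpIeee` returns NaN while `lerpZeroWins` returns the second input unchanged. That kind of difference could plausibly explain geometry disappearing, although the thread itself does not spell out the exact shader involved.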
In Rayman Origins, every world ends with a finale. In the World 1 finale, the hair of little creatures known as Electoons is supposed to grow and form a pathway that the player traverses to reach the end. With DXVK, this hair is mostly missing; only small chunks of it are visible.
DXVK
Correct Behavior (Achieved with dgVoodoo2)
Software information
Rayman Origins (207490), default settings
System information
Apitrace file(s)
Rayman Origins.trace
Log files
Rayman Origins_d3d9.log