[D3D9] Rayman Origins: Mostly missing Electoon hair in World 1 Finale #2545

Closed
Sterophonick opened this issue Mar 16, 2022 · 2 comments · Fixed by #2546

Comments

Sterophonick commented Mar 16, 2022

In Rayman Origins, every single world has a finale at the end. In the world 1 finale, the hair from little creatures known as Electoons is supposed to grow and create a pathway for the player to traverse to get to the end. When using DXVK, this hair is mostly missing, but little chunks of it are still visible.

DXVK

image

Correct Behavior (Achieved with dgVoodoo2)

image

Software information

Rayman Origins (207490), default settings

System information

  • GPU: Mesa Intel CometLake-U GT2 [UHD Graphics]
  • Driver: Mesa 21.3.7
  • Wine version: GE-Proton7-9
  • DXVK version: 89b1f02 (with async)

Apitrace file(s)

Rayman Origins.trace

Log files

Rayman Origins_d3d9.log

DadSchoorse added a commit to DadSchoorse/dxvk that referenced this issue Mar 16, 2022
@DadSchoorse
Contributor

That PR fixes the issue for me on nvidia, please test.

@Sterophonick
Author

> That PR fixes the issue for me on nvidia, please test.

That fixed it! Fantastic work!
image

DadSchoorse added a commit to DadSchoorse/dxvk that referenced this issue Mar 16, 2022
misyltoad pushed a commit that referenced this issue Mar 16, 2022
TheIronWolfModding added a commit to TheIronWolfModding/dxvk that referenced this issue Apr 23, 2022
…s unmap (Thanks, k0bin) (#7)

* [dxvk] Introduce DxvkStagingBuffer

* [d3d11] Use DxvkStagingBuffer in D3D11DeviceContext

* [dxvk] Use staging buffer for gamma ramp uploads

* [dxvk] Remove unused updateImage function

* [dxvk] Use DxvkStagingBuffer in DxvkContext

* [dxvk] Remove DxvkStagingDataAlloc

Unused and overly clunky.

* [dxvk] Allow large sysmem allocations on 64-bit platforms again

Since we frequently discard staging buffers now, having larger chunks
is actually beneficial again.

* [dxvk] Introduce transient memory flag for staging buffers

Potentially reduces fragmentation by putting short-lived staging buffers
and sysmem resources created by the application into different memory pools.

* [dxvk] Rework HUD font texture initialization

We really shouldn't need a separate context for this.

* [dxvk] Remove unused trimStagingBuffers method

* [dxvk] Don't suballocate large staging buffer allocations

Otherwise we'll risk wasting almost half the staging buffer memory.
Creating a temporary buffer is cheap enough, so just do that.
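
The waste described above can be illustrated with a small sketch. This is not DXVK's allocator: `StagingSketch`, its size cutoff parameter, and the `StagingSource` enum are hypothetical names, and alignment is ignored. It only models a linear suballocator that discards the unused tail of its buffer whenever the next request does not fit.

```cpp
#include <cassert>
#include <cstddef>

// Where an allocation ended up (hypothetical classification).
enum class StagingSource { Suballocated, FreshBuffer, Temporary };

class StagingSketch {
public:
  // maxSuballocSize is an assumed cutoff; the commit message does not
  // state the threshold DXVK actually uses.
  StagingSketch(std::size_t bufferSize, std::size_t maxSuballocSize)
  : m_size(bufferSize), m_maxSuballoc(maxSuballocSize) {
    assert(maxSuballocSize <= bufferSize);
  }

  StagingSource alloc(std::size_t size) {
    // Large requests get their own temporary buffer and leave the
    // shared staging buffer untouched.
    if (size > m_maxSuballoc)
      return StagingSource::Temporary;

    // Request does not fit: the unused tail of the current buffer is
    // thrown away and a fresh buffer is started.
    if (m_offset + size > m_size) {
      m_wasted += m_size - m_offset;
      m_offset = size;
      return StagingSource::FreshBuffer;
    }

    m_offset += size;
    return StagingSource::Suballocated;
  }

  // Total bytes discarded at the end of replaced buffers.
  std::size_t wasted() const { return m_wasted; }

private:
  std::size_t m_size;
  std::size_t m_maxSuballoc;
  std::size_t m_offset = 0;
  std::size_t m_wasted = 0;
};
```

With a 4096-byte buffer and no cutoff, two 2049-byte requests in a row discard 2047 bytes, which is almost half the buffer. With a cutoff of 2048 bytes the same requests become temporary buffers and nothing is discarded.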

* [dxvk] Reduce context staging buffer size to 4 MiB

Same idea as before, just create a temporary buffer for larger resources.

This can avoid frequent Vulkan memory allocations and deallocations since
many small buffers are more likely to fit into a single memory chunk than
a small number of large buffers, thus reducing the overall memory footprint.

* [dxvk] Add stat counter for pipeline barriers

* [dxvk] Display barrier count in draw call HUD item

* [dxvk] Introduce DxvkDevice::waitForResource

Blocks on the queue thread's condition variable instead of busy-waiting,
and tracks synchronization with new stat counters. Cleanup is rearranged
to minimize delays before signals and resources are notified.

* [d3d11] Use new waitForResource method

* [d3d9] Use new waitForResource method

* [hud] Display GPU synchronization in HUD

* [d3d11] Add d3d11.maxImplicitDiscardSize option

* [util] Bump maxImplicitDiscardSize for Quantum Break

Otherwise we're synchronizing and frame times are garbage.

* [util] Enable apitrace mode for Nier Replicant

Game is broken and reads back dynamic vertex/index buffers over PCI-E.

* [d3d11] Handle subresource field in copy/move operations

Derp.

* [d3d11] Enable stall tracking for timestamp queries

Because games are dumb and don't understand that the GPU doesn't
work synchronously with the render thread.

* [d3d11] Add implicit flush after tracking sequence numbers

Flushing early when using a tracked resource may reduce stalls.

* [dxvk] Repurpose initImage method

This is now supposed to clear images of any type, and only to be
used for resource initialization after creation.

* [d3d11] Use initImage to clear uninitialized image resources

* [d3d9] Use initImage to clear uninitialized image resources

* [dxvk] Remove unused clear methods

* [dxvk] Add command buffer parameter to cmdFillBuffer

* [dxvk] Add initBuffer method

* [d3d11] Use initBuffer method

* [d3d9] Use initBuffer method

* [dxvk] Track buffer as used in initBuffer

Git ate my commit when I was testing something...

* [d3d11] Consider empty CS chunks when tracking resources

Avoids deadlocks if we track multiple resources and flush in between.

* [dxvk] Free existing staging buffer before creating a new one

* [d3d11] Set zero stride when binding null vertex buffer

* [dxvk] Remove null check when setting vertex stride

Move the responsibility to the front-end instead.

* [dxvk] Don't use spinlocks for CS chunk pool

No reason to anymore since SRWLocks are fast enough here.

* [dxvk] Introduce lock-free list

* [dxvk] Use lock-free list for graphics pipeline lookup

And use a proper mutex if we do have to synchronize,
so that we can avoid busy-waits.

* [dxvk] Use lock-free list for compute pipeline lookup

* [meta] Readme update.

* [meta] Readme update

* [meta] Readme update

* [dxvk] Use lock-free list for render pass instances

And replace the spin lock with a regular mutex.

* [dxvk] Get rid of spinlock when allocating GPU events

This is not performance-critical

* [d3d9] Handle different mip chain lengths in UpdateTexture

* [d3d9] Track last staging resource usage with a sequence number

* [d3d9] Synchronize only to given sequence number in WaitForResources

* [d3d9] Store buffer map mode in D3D9CommonBuffer

* [d3d9] Add option to disable direct buffer mapping

* [d3d9] Disable direct buffer mapping for RE games

* [meta] Bump arch-mingw-github-action to v7

* [dxvk] Only mark transfer buffers as transient

Otherwise we may accidentally catch things like uniform buffers as well.

* [hud] Greatly simplify text rendering in the HUD

* [hud] Greatly simplify frame time graph rendering

* [tests] Allow includes when compiling HLSL shaders

* [util] Add computeMipLevelOffset

* [d3d9] Unify texture uploads

* [d3d9] Rename WrittenByGPU to NeedsReadback

* [d3d9] Clean up texture locking

We had two code paths that largely did the same.

* [d3d9] Don't set NeedsReadback for POOL_SYSMEM textures

... or in SetRenderTarget because we always do readback for render targets.

* Revert "[dxvk] Use lock-free list for graphics pipeline lookup"

This reverts commit 67e2ee1.

TIW: causes 20% perf drop.

* [vr] Update VR sync calls to new way of syncing. Still unsure if seq nr used is correct.

* [meta] Version bump.

* [d3d11] Apply apitrace mode to image upload buffers

* [dxvk] Use lock-free list for graphics pipeline lookup

And use a proper mutex if we do have to synchronize,
so that we can avoid busy-waits.

* test

* [vr] Comment update.

* [d3d11] Use appropriate memory types for directly mapped images

* [d3d11] Introduce d3d11.maxDynamicImageBufferSize option

* [meta] Update example config file

* [util] Set maxDynamicImageBufferSize for Total War: Warhammer III

Massively increases performance since the game otherwise keeps
uploading a huge 48 MiB texture in every frame.

* [meta] Formatting.

* [d3d9] Fix synchronization after readback

* [d3d9] Fix sysmem readback

* [dxvk] Fork lockless List<T> into main and GTR2 specific versions.

* [meta] Readme update.

* Update README.md

* Update README.md

* Update README.md

* [d3d11] Replace apitrace mode option with something more granular

And enable it only for vertex and index buffers in Nier Replicant.

* [util] Enable cached vertex and index buffers for FFXIV

Fixes some weird performance issues on the Garlemald map. Doesn't seem
to affect performance in other areas.

* [meta] Update example configuration file

* [dxvk] Invalidate buffer in clearBuffer if possible

* [util] Use CPU-cached constant buffers for Anno 1800

Sigh.

* [util] Enable cached vertex and index buffers for The Evil Within

Large performance win.

* [meta] Update README

* [meta] Release 1.10

* [d3d10] Forward OpenSharedResource to D3D11 implementation

Trivial since the requested IID is passed by the application.

* [d3d9] Fix default initialization of some state values

If we end up being the same as what we are, we don't dirty initially.

* [d3d9] Set initial dirty state flags

We had a bug where initial state values caused the data to not get sent to the backend.

Let's fix that going forward and dirty everything we possibly can on device creation.

* [build] Enable -Wimplicit-fallthrough

I got bit by this in D3D9.

* [d3d9] Fix fallthroughs in PickFormats

* [dxso] Fix ExpP fallthrough

* [dxso] Fix potential fallthrough in RasterizerOut

* [d3d9] Add fallthrough comment to SetRenderState

Silences a warning

* [d3d11] Add fallthrough comment to PickFormats

Silences a warning

* [dxbc] Use new [[fallthrough]] attribute

* [dxvk] Use new [[fallthrough]] attribute

* [dxso] Fix ExpP instruction on Shader Model 2+

* [dxvk] Enable VK_KHR_EXTERNAL_MEMORY_WIN32 if available.

* [dxvk] Add shared handle access to DxvkImage memory.

Based off preliminary work from Josh.

* [util] Add helpers for shared resource metadata access.

* [d3d11] Add support for shared ID3D11Texture2D resources.

* [d3d9] Add support for shared IDirect3DTexture9 resources.

* [d3d11] Explicitly handle R32-compatible UAV formats

* [d3d11] Reimplement R11G11B10 UAV clears without R32 views

* [d3d9] Fix shared handle check for exporting images

Co-authored-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>

* [d3d9] Fix texture formats that can be exported

* [dxvk] Force dedicated allocation for exportable images

The Nvidia driver does not set prefers-/requiresDedicatedAllocation
for exportable images on its own.

This makes DXVK ignore the dedicated allocation struct ptr
which also contains VkExportMemoryAllocateInfo or
VkImportMemoryWin32HandleInfoKHR.

* [d3d9] Mark backend image as shared for shared resources

Otherwise, the backend may not transition the image to the correct
layout after each submission.

* [util] Add another weeb game to the list of workarounds

Sophie is apparently D3D9 and we already have Lydie and Suelle in there,
so it's just this one missing from that series.

* [dxvk] Normalize color write masks for non-RGBA formats

* [d3d9] Fix CS thread synchronization for directly mapped buffers

* [d3d9] Update texture sequence number AFTER using it

* [d3d9] Update buffer seq number in FlushBuffer

How did I miss this?!

* [dxgi] Add DXVK_ENABLE_NVAPI envvar

Add a new environment variable DXVK_ENABLE_NVAPI as an environment-level
override for 'nvapiHack'. This will allow for DLSS (and other
NvAPI-backed features) to be available without the user manually writing
a configuration file, allowing for more seamless integration with
Proton's launch script.

* [dxvk] Fix color write mask normalization

Previously we'd set too many bits by accident here. Also, we should
not modify partial write masks to include unnecessary bits. Only do
this if we can actually promote to a full write mask for consistency.
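
One possible reading of the rule described above, as a sketch only: the mask is promoted to all four channels when every component the format actually has is being written, and a partial mask is otherwise left untouched. The names `normalizeWriteMask`, `ComponentMask`, and the bit constants are hypothetical and are not taken from the DXVK source.

```cpp
#include <cstdint>

using ComponentMask = std::uint32_t;

// Hypothetical channel bits for illustration.
constexpr ComponentMask R = 1u << 0;
constexpr ComponentMask G = 1u << 1;
constexpr ComponentMask B = 1u << 2;
constexpr ComponentMask A = 1u << 3;
constexpr ComponentMask Full = R | G | B | A;

// Promote to a full write mask only if all components present in the
// format are written; never add bits to a partial mask.
constexpr ComponentMask normalizeWriteMask(ComponentMask writeMask,
                                           ComponentMask formatMask) {
  return ((writeMask & formatMask) == formatMask) ? Full : writeMask;
}
```

Under this reading, writing `R` to a red-only format is treated the same as a full write, while writing only `R | G` to an RGBA format stays a partial write.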

* [d3d9] Don't expose D32 format

Not supported anywhere except REF device it seems... *sigh*

Supersedes: doitsujin#2547

* [dxso] Implement zerowins for Lerp.

Fixes doitsujin#2545.
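
The commit message does not spell out what "zerowins" means. Assuming it refers to the D3D9-style multiply in which a zero operand always produces zero (where IEEE 754 would give NaN for 0 times infinity), the effect on a `lrp`-style blend can be sketched as follows. The helper names are hypothetical and this is CPU-side C++, not the SPIR-V that DXVK emits.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Multiply where a zero operand wins: the result is 0 even if the
// other operand is infinity or NaN.
inline float mulZeroWins(float a, float b) {
  return (a == 0.0f || b == 0.0f) ? 0.0f : a * b;
}

// lrp-style blend, factor * (a - b) + b, using the zero-wins multiply.
inline float lerpZeroWins(float factor, float a, float b) {
  return mulZeroWins(factor, a - b) + b;
}

// The same blend with plain IEEE 754 arithmetic, for comparison.
inline float lerpIeee(float factor, float a, float b) {
  return factor * (a - b) + b;
}
```

With a factor of 0 and an infinite `a`, the IEEE version yields NaN while the zero-wins version returns `b`. A NaN reaching a vertex position or color is the kind of thing that could make geometry disappear, although the issue thread does not confirm that this is the exact failure in Rayman Origins.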

* [dxso] Emit spirv OpCross if we can.

* [build] Cleanup build system.

No changes except dropping support for msvc before 15.3.

* [d3d11] Fix D3D11UserDefinedAnnotation declaration

Mark it as final too.

* [dxvk] Define IDXVKUserDefinedAnnotation

Something common to share for perf markers between D3D9 and D3D11.

Inherits from the public D3D11 interface.

* [util] Move DecodeD3DColor to util

This will be used in the D3D11UserDefinedAnnotation implementation to handle PIX calls which contain a color.

* [d3d11] Use IDXVKUserDefinedAnnotation

* [d3d9] Implement D3D9GlobalAnnotationList

* [d3d9] Implement D3D9UserDefinedAnnotation

* [d3d9] Add hidden exports for registering annotations

Adds DXVK_RegisterAnnotation at ordinal 28257 and DXVK_UnRegisterAnnotation at ordinal 28258.

* [d3d11] Register annotation interfaces with D3D9

Some apps try to use the D3DPERF_ functions for debug markers/annotations.

This utilizes the DXVK_RegisterAnnotation hidden functions to share the interfaces.

Co-authored-by: Oleg Kuznetsov <okouznetsov@nvidia.com>

* [dxvk] Add a config option to enable debug utils in addition to DXVK_PERF_EVENTS=1

* [d3d9] Fix Visual Studio build to resolve 'operator !=' is ambiguous error for RECT

* [util] Enable cached dynamic resources for AC3 and AC4

Without it, AC3 chugs along at 40 FPS on my 5950X.

* [dxbc] Support switch-case fallthrough

Apparently this is a thing in Shader Model 4, although FXC cannot emit it.

* [d3d11] Use smart pointer for swap chain back buffer

* [d3d11] Get strong reference to swap chain in swap chain back buffers

* [dxgi] Work around swapchain use-after-free bugs

Affects Divinity: Original Sin Enhanced Edition. Requires Wine hack to
delay memory deallocation to not crash during resolution changes.

* [dxbc] Handle fallthrough around default properly

* [dxvk] Zero-initialize newly allocated buffer slices on creation

Fixes random flicker in God of War. Since patch 1.0.9, the game's lighting
system relies on MAP_DISCARD returning zero-initialized memory slices for
its constant buffers, or some lights would get skipped in various compute
passes. Changing the memset to e.g. write 0xFF instead of 0 shows this issue.

* [util] Set frame latency to 1 for God of War

Frame pacing is horrible otherwise, as of the 1.0.9 update.

* [d3d9] UpdateTexture: Handle automatic mip gen properly

* [dxbc] Generate smallest possible vectors for local arrays

FXC is buggy and always emits vec4 in the array declaration,
so we'll have to analyze the used components ourselves.

* [dxbc] Only emit temp array range check for dynamically indexed stores

Generates less code and makes things slightly more readable.

* [dxbc] Actually do the skip range check thing properly

Turns out the first attempt only worked because my test case didn't
do any dynamically indexed stores at all, but broke everything else.
Oops.

* [util] Enable d3d9.deferSurfaceCreation for Stranger of Paradise FFO

Reportedly required for VRR to work. Game still doesn't work here.

* [d3d9] Disable culling when the app passes an invalid value

* [d3d9] Remove evictManagedOnUnlock

This is annoying to maintain and hopefully won't be necessary anymore.

* [d3d9] Use regular memory for managed data

* [d3d9] Use memory mapped files for managed data

* [d3d9] Count frames

This will be used to identify managed textures that
have not been used for a while.

* [d3d9] Unmap unused textures after some time

* [d3d9] Add HUD item for managed memory

* [d3d9] Make unmapping delay configurable

* [d3d9] Make unmapping more aggressive for Borderlands

* [util] Enable cached constant buffers for Frostpunk

Massively improves CPU-bound performance.

* [meta] Release 1.10.1

* [util] Enable d3d9.deferSurfaceCreation for Atelier Sophie 2

2022 and K-T are still using D3D9 for video stuff.

* [dxvk] Filter out unnecessary access flags when recording barriers

Rationale is as follows:
- srcAccess never needs to contain read flags, since any memory being
  read must have been made visible before by a write operation
- dstAccess is only relevant if srcAccess contains a write, because
  reads alone cannot modify memory and thus do not require making the
  same memory available again. An exception are layout transitions.

Doesn't really change performance in anything as far as I can tell, but
we avoid some unnecessary UBO cache flushes in compute-heavy scenarios.
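
The two rules in the rationale can be written down as a small filter. This is a sketch under stated assumptions: the flag bits, `BarrierAccess`, and `filterAccess` are hypothetical stand-ins, not Vulkan's `VkAccessFlags` values or DXVK's actual code.

```cpp
#include <cstdint>

using AccessFlags = std::uint32_t;

// Hypothetical access bits for illustration only.
constexpr AccessFlags ShaderRead    = 1u << 0;
constexpr AccessFlags ShaderWrite   = 1u << 1;
constexpr AccessFlags TransferRead  = 1u << 2;
constexpr AccessFlags TransferWrite = 1u << 3;
constexpr AccessFlags WriteMask     = ShaderWrite | TransferWrite;

struct BarrierAccess {
  AccessFlags src;
  AccessFlags dst;
};

// Rule 1: drop read bits from srcAccess.
// Rule 2: keep dstAccess only if srcAccess still contains a write,
//         or if the barrier performs a layout transition.
constexpr BarrierAccess filterAccess(AccessFlags src, AccessFlags dst,
                                     bool layoutTransition) {
  return BarrierAccess{
    src & WriteMask,
    ((src & WriteMask) != 0 || layoutTransition) ? dst : AccessFlags(0)
  };
}
```

A read-after-read barrier without a layout transition therefore ends up with both masks empty, while a write followed by a read keeps only the write bit on the source side.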

* [d3d11] Always export correct shared handle type from ::GetSharedHandle and ::CreateSharedHandle

Before, we just assumed that the calls here would match the corresponding flag value (D3D11_RESOURCE_MISC_SHARED -> ::GetSharedHandle, D3D11_RESOURCE_MISC_SHARED_NTHANDLE -> ::CreateSharedHandle), but it turns out that it's possible to set both flags and use both methods. Now we always tell Vulkan to export a KMT handle if D3D11_RESOURCE_MISC_SHARED is present, and use openKmtHandle to get an NT handle when needed.

* [util] Limit Limbo to 60 fps

Fixes: doitsujin#2564

* [meta] Move apitrace guide over to dxvk repo

* [dxvk] GPU query reset path

Require VK_EXT_host_query_reset instead. This fallback path is
untested nowadays and too slow to be useful.

* [d3d11] Only apply anisotropy override to linear samplers

Mirrors D3D9, more or less.

* [util] Fix typo in app profiles

Accidentally broke everything.

* [util] Force sampler type spec const for Star Wars TFU2

The game tries to bind a 2D texture to a slot that is declared
as a 3D texture in the shader. This causes one particle effect
to be completely black because DXVK does not bind the texture.

* [d3d9] Filter window messages when processing WM_ACTIVATEAPP.

* [d3d9] Ignore multiple app activation window messages.

* [d3d9] Calculate slice alignment when uploading straight from the mapping buffer

* [util] Use cached constant buffers for Armored Warfare

* [util] correct enableDebugUtils conf to default False

* [util] remove allowLockFlagReadonly from conf

* [spirv] Implement faster in-memory compression for shaders

Seems to be anything up to 3x as fast to decode than the previous code,
with the compression ratio being slightly worse. Encoding seems faster
as well.

* [dxvk] Introduce new way to create DxvkShader objects

* [dxvk] Use new DxvkShaderCreateInfo struct to retrieve shader info

* [dxvk] Use new DxvkShader constructor for swap chain shaders

* [dxvk] Use new DxvkShader constructor for HUD shaders

* [dxbc] Use new DxvkShader constructor

* [dxso] Use new DxvkShader constructor

* [d3d9] Use new DxvkShader constructor for fixed-function shaders

* [d3d9] Use new DxvkShader constructor for SWVP emulation

* [d3d9] Use new DxvkShader constructor for format conversion

* [d3d11] Use new DxvkShaderCreateInfo struct to retrieve shader info

* [d3d11] Use new DxvkShader constructor for video shaders

* [dxvk] Remove old shader creation code

* [d3d9+util] Enable invariant position by default

* [util] Implement thread helpers on non-Windows platforms

* [util] Add strlcpy helper

strncpy is not safe.
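
For context, `strncpy` leaves the destination without a terminating null byte when the source is as long as or longer than the given count. A minimal sketch of a BSD-style `strlcpy` follows; the name `strlcpySketch` is hypothetical, and the signature and return value of the helper actually added to DXVK are not shown in this log and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most dstSize - 1 characters from src and always
// null-terminates dst when dstSize is non-zero. Returns the length
// of src so that callers can detect truncation.
inline std::size_t strlcpySketch(char* dst, const char* src,
                                 std::size_t dstSize) {
  assert(src != nullptr);
  const std::size_t srcLen = std::strlen(src);

  if (dstSize != 0) {
    const std::size_t n = srcLen < dstSize - 1 ? srcLen : dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
  }

  return srcLen;
}
```

A return value greater than or equal to `dstSize` means the copy was truncated.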

* [d3d9] Use strlcpy helper

* [util] Implement env helpers on non-Windows platforms

* [util] Add missing include to thread.h

* [d3d9] Remove evictManagedOnUnlock

This is annoying to maintain and hopefully won't be necessary anymore.

* [d3d9] Use regular memory for managed data

* [d3d9] Use memory mapped files for managed data

* [d3d9] Count frames

This will be used to identify managed textures that
have not been used for a while.

* [d3d9] Unmap unused textures after some time

* [d3d9] Add HUD item for managed memory

* [d3d9] Make unmapping delay configurable

* [d3d9] Make unmapping more aggressive for Borderlands

* [d3d9] Only bind depth buffer if the depth or stencil test is enabled

* [d3d9] Only bind RT if we actually write to it

The alternative render path for shadow maps in Dead Space relies on this.

* [dxbc] Implement range check for private array reads

We already do this for stores.

Co-authored-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Co-authored-by: Robin Kertels <robin.kertels@gmail.com>
Co-authored-by: Joshua Ashton <joshua@froggi.es>
Co-authored-by: Oschowa <oschowa@web.de>
Co-authored-by: Derek Lesho <dlesho@codeweavers.com>
Co-authored-by: Liam Middlebrook <lmiddlebrook@nvidia.com>
Co-authored-by: Georg Lehmann <dadschoorse@gmail.com>
Co-authored-by: Oleg Kuznetsov <okouznetsov@nvidia.com>
Co-authored-by: Paul Gofman <gofmanp@gmail.com>
Co-authored-by: Blisto91 <47954800+Blisto91@users.noreply.github.com>