Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proposal for scalar layout for constant buffers #317

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Tobski
Copy link

@Tobski Tobski commented Sep 12, 2024

This proposal adds an attribute to specify that the layout of constant buffers uses C-style packing rules, according to the size of the scalar components of matrices and vectors, rather than the packing rules currently used for constant buffers when they are involved.

@Tobski
Copy link
Author

Tobski commented Sep 12, 2024

@microsoft-github-policy-service agree company="AMD"

@damyanp damyanp added this to the Shader Model 6.9 milestone Sep 26, 2024
@llvm-beanz
Copy link
Collaborator

A few thoughts/questions, mostly around some extra details that would be useful.

Usage

I think you're suggesting we add a source-level attribute something like:

[bufferLayout(legacy)]
cbuffer CB { ... }
[bufferLayout(scalar)]
ConstantBuffer<MyStruct> CBuf;

Is that correct? Should we also have a command line argument to change the default value?
This gets into details that don't yet need to be captured, but we should remember that we'll need to keep track of this in the reflection metadata so that the reflection can communicate the expected layout for a CBV.

Expected Alignments

Can you expand a bit on the expected alignments for different types? Most C & C++ compilers do support vector types with 16-byte alignment, so they wouldn't necessarily be aligned to their element component. How would the new packing rules apply for something like:

struct MyConstants {
  int X;
  int64_t Y[2];
  int16_t Z[3];
  float __attribute__((vector_size(4))) V; // equivalent to float4
};

@Tobski
Copy link
Author

Tobski commented Oct 11, 2024

A few thoughts/questions, mostly around some extra details that would be useful.

Usage

I think you're suggesting we add a source-level attribute something like:

[bufferLayout(legacy)]
cbuffer CB { ... }
[bufferLayout(scalar)]
ConstantBuffer<MyStruct> CBuf;

Is that correct? Should we also have a command line argument to change the default value? This gets into details that don't yet need to be captured, but we should remember that we'll need to keep track of this in the reflection metadata so that the reflection can communicate the expected layout for a CBV.

Yes that's what I'm thinking yep, and I agree a compiler flag to switch the default would be useful.

Expected Alignments

Can you expand a bit on the expected alignments for different types? Most C & C++ compilers do support vector types with 16-byte alignment, so they wouldn't necessarily be aligned to their element component. How would the new packing rules apply for something like:

struct MyConstants {
  int X;
  int64_t Y[2];
  int16_t Z[3];
  float __attribute__((vector_size(4))) V; // equivalent to float4
};

In the Vulkan version of this feature against SPIR-V, vector size doesn't change the alignment requirements; we basically treat them the same as arrays of floats. So the alignment required is that of a float - 4 bytes. I would expect the same here for float4 in HLSL. Granted this is not the same as C/C++ language extensions, but vectors aren't really treated the same way on modern GPUs as they are in CPU ISAs, which is why it works this way in Vulkan (and every supporting vendor's hardware).

I could see an argument for matching C/C++ vector extensions, but it's going to depend on what developers want to see here. We could add in a way to switch between these two packing modes if we think it's helpful though.

@llvm-beanz llvm-beanz added the Design Meeting Agenda item for the design meeting label Nov 12, 2024
@Alexander-Johnston
Copy link

I've updated this with a first pass to a detailed design. I have the open question of if metadata will be required to determine if the constant buffer is in legacy or scalar layout. Currently when examining the DXIL output of a shader it can be impossible to determine if a constant buffer is in legacy or scalar layout. My (limited) understanding of DX reflection makes me think we do need some metadata to state when a constant buffer is in scalar layout so the runtime can act appropraitely, but I'd like to hear other opinions on this, and if we need to support this or not.

Copy link
Author

@Tobski Tobski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking on the detail sections of this proposal, looks good overall, just a few clarifying questions!

# Scalar constant buffer layout

* Proposal: [0023](0023-scalar-constant-buffer-layout.md)
* Author(s): [Tobias Hector](https://github.com/tobski)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Alexander-Johnston you should probably add yourself as author at this point!

```

As all values must be aligned to their scalar size, a scalar or element of a
composite type may not lay across the 16 byte boundary.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to say 4-byte? Or was this meant to refer to the legacy layout?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the language is probably a bit redundant. If scalar objects are aligned to their size for all scalar objects under 16-bytes, it would be impossible to cross a 16-byte boundary. HLSL doesn't have any larger than 8-bytes, but I don't think we need language about not crossing a 4-byte boundary because 64-bit types would be 8-byte aligned and will cross the 4-byte boundary between 8-byte alignments.

As all values must be aligned to their scalar size, a scalar or element of a
composite type may not lay across the 16 byte boundary.
In the following example the double is aligned identically in both legacy and
scalar layout, to 8 bytes, demonstrating this point.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on what the prior sentence was supposed to say this might need some more explanation.

showing the cbuffer loads and extractvalue instructions needed to load each
element of a 2x2 float matrix.
Note that the legacy example requires two 16 byte loads from the cbuffer, while
the scalar example requires only one.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As someone not entirely familiar with DXIL, it would be useful to explain here why the legacy version needs an extra load, because it's not at all clear here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cbuffer storage for a matrix<T,2,2> is basically a vector<T,2>[2]. The packing rules for arrays in cbuffers are that each array index starts on a new "row" of the cbuffer. You can see this visualized here:
https://maraneshi.github.io/HLSL-ConstantBufferLayoutVisualizer/?visualizer=MYIwrgZhCmBOAE0AeBDAtgBwDbXgbwCh5j4IsB7FAFwCYkb41qBuAgX2aA

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks, that does help quite a bit - would be good to get this explanation added to the proposal!


### SPIR-V Support

SPIR-V already supports this scalar layout proposal through the
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
SPIR-V already supports this scalar layout proposal through the
SPIR-V already supports this scalar layout proposal, as exercised by the

(noting that SPIR-V can express this natively, the Vulkan extension just makes it valid to use with Vulkan)

### Interchange Format Additions

This feature does not require any change to the DXIL language to represent
scalar layouts.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems questionable...

I'd really like @tex3d to think on this. I suspect we could use the existing cbuffer load instructions to emulate the scalar layout but that may have limitations and impacts on things like the reflection data which we would need to surface the layout to users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should attempt to implement scalar layout on top of the CBufferLoadLegacy DXIL op.

While it should be possible to map arbitrary accesses of the scalar layout to the legacy cbuffer load op, it would be ugly, and there would be complications (especially around dynamic indexing). The reason the non-legacy cbuffer load (CBufferLoad) DXIL op exists is to support this kind of scalar layout in a future compiler version. The intended mechanism was through a global switch, rather than a per-cbuffer (range) flag, however.

In any case, scalar layout is (as far as I can tell) the same as the layout we would use in structured buffer and what's described by the data layout, plus the default vector alignment modification DXC uses.

We already surface reflection information for structured buffers using the data layout rather than using the constant buffer offsets from the type annotation metadata. It shouldn't be a stretch to use this for scalar-layout constant buffer reflection, but we will need to retain a flag for the cbuffer to guide the reflection code.

There is also code in DXC for marking constants as used for reflection which would be impacted, though that code is problematic (such as for offset array indexing) and needs a new approach anyway.

Regarding the existing "CBufferLoad" DXIL op (non-legacy version), we intended to retroactively disallow it on existing shader models, since we know that it's not HLK tested or well supported by existing drivers, reserving it for a new shader model once we had the HLK testing and driver support lined up.

There is also a proposal for alignas (#154), which would interact with this, requiring some solution for better-than-default alignment of fields and a corresponding reflection solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Design Meeting Agenda item for the design meeting
Projects
Status: No status
Status: Triaged
Development

Successfully merging this pull request may close these issues.

5 participants