Microsoft DirectX 9.0

This section contains reference information for the pixel shader version 2_x instructions.

There are several types of pixel shader instructions, as shown in the table. Columns to the right mean the following:

- Instruction slots - Number of instruction slots used by each instruction.
- Setup - A pixel shader must have a version instruction and it must be the first instruction.
- Arithmetic - These instructions provide the mathematical operations in a shader.
- Macro-ops - These instructions combine arithmetic instructions to provide higher level functionality. The use of macro-ops is optional. It is preferable to use macro-ops, especially the matrix multiply instructions (m3x2, m3x3, m3x4, m4x3, m4x4), because they provide optimization opportunities to the underlying implementation.
- Texture - These instructions are used to load and sample texture data, and to modify texture coordinates.
- Flow control - These instructions provide static and dynamic flow control to the execution of instructions.
- New - These instructions are new to this version.

Name | Description | Instruction slots | Setup | Arithmetic | Macro-ops | Texture | Flow Control | New |
---|---|---|---|---|---|---|---|---|
abs | Absolute value | 1 | | x | | | | |
add | Add two vectors | 1 | | x | | | | |
break | Break out of a loop...endloop or rep...endrep block | 1 | | | | | x | x |
break_comp | Conditionally break out of a loop...endloop or rep...endrep block, with a comparison | 3 | | | | | x | x |
break pred | Break out of a loop...endloop or rep...endrep block, based on a predicate | 3 | | | | | x | x |
call | Call a subroutine | 2 | | | | | x | x |
callnz | Call a subroutine if a boolean register is not zero | 3 | | | | | x | x |
callnz pred | Call a subroutine if a predicate register is not zero | 3 | | | | | x | x |
cmp | Compare source to 0 | 1 | | x | | | | |
crs | Cross product | 2 | | | x | | | |
dcl | Map a vertex element type to an input vertex register | 0 | x | | | | | |
dcl_textureType | Declare the texture dimension for a sampler | 0 | x | | | | | |
def | Define constants | 0 | x | | | | | |
dp2add | 2-D dot product and add | 1 | | x | | | | |
dp3 | 3-D dot product | 1 | | x | | | | |
dp4 | 4-D dot product | 1 | | x | | | | |
dsx | Rate of change in the x-direction | 2 | | x | | | | x |
dsy | Rate of change in the y-direction | 2 | | x | | | | x |
else | Begin an else block | 1 | | | | | x | x |
endif | End an if...else block | 1 | | | | | x | x |
endrep | End of a repeat block | 2 | | | | | x | x |
exp | Full precision 2^{x} | 1 | | x | | | | x |
frc | Fractional component | 1 | | x | | | | |
if | Begin an if block | 3 | | | | | x | x |
if_comp | Begin an if block with a comparison | 3 | | | | | x | x |
if pred | Begin an if block with predication | 3 | | | | | x | x |
label | Label | 0 | | | | | x | x |
log | Full precision log_{2}(x) | 1 | | x | | | | |
lrp | Linear interpolate | 2 | | | x | | | |
m3x2 | 3x2 multiply | 2 | | | x | | | |
m3x3 | 3x3 multiply | 3 | | | x | | | |
m3x4 | 3x4 multiply | 4 | | | x | | | |
m4x3 | 4x3 multiply | 3 | | | x | | | |
m4x4 | 4x4 multiply | 4 | | | x | | | |
mad | Multiply and add | 1 | | x | | | | |
max | Maximum | 1 | | x | | | | |
min | Minimum | 1 | | x | | | | |
mov | Move | 1 | | x | | | | |
mul | Multiply | 1 | | x | | | | |
nop | No operation | 1 | | x | | | | |
nrm | Normalize | 3 | | | x | | | |
pow | x^{y} | 3 | | | x | | | |
ps | Version | 0 | x | | | | | |
rcp | Reciprocal | 1 | | x | | | | |
rep | Repeat | 3 | | | | | x | x |
ret | End of a subroutine | 1 | | | | | x | x |
rsq | Reciprocal square root | 1 | | x | | | | |
setp | Set the predicate register | 1 | | | | | x | x |
sincos | Sine and cosine | 8 | | | x | | | |
sub | Subtract | 1 | | x | | | | |
texkill | Kill pixel render | 2(tex) | | | | x | | |
texld | Sample a texture | 1 + 3CUBE | | | | x | | |
texldb | Texture sampling with level of detail (LOD) bias from w-component | 6(tex) | | | | x | | |
texldd | Texture sampling with user-provided gradients | 3 | | | | x | | x |
texldp | Texture sampling with projective divide by w-component | 3 + 1CUBE | | | | x | | |

Where:

- tex - 1 texture instruction slot. However, if D3DPS20CAPS_NOTEXINSTRUCTIONLIMIT is set, this instruction is counted against the arithmetic (non-texture) instruction count.
- 1 + 3CUBE means 1 + 3 if the texture is a cube map.
- 3 + 1CUBE means 3 + 1 if the texture is a cube map.
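The "+ nCUBE" notation can be read as a small cost function: a base cost that is always charged, plus an extra cost charged only when the sampled texture is a cube map. The following is a minimal sketch of that reading, not part of the Direct3D API. The names `SlotCost`, `Instruction`, `slots_used`, and `total_slots` are hypothetical, and the costs are copied from the table in this section.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one table entry: slots that are always charged,
// plus extra slots charged only for cube maps (the "+ nCUBE" part).
struct SlotCost {
    int base;
    int cube_extra;
};

// Costs copied from the table: texld is "1 + 3CUBE", texldp is "3 + 1CUBE".
constexpr SlotCost kTexld  = {1, 3};
constexpr SlotCost kTexldp = {3, 1};
constexpr SlotCost kMad    = {1, 0};
constexpr SlotCost kNrm    = {3, 0};

// One instruction in a shader, with a flag saying whether it samples a cube map.
struct Instruction {
    SlotCost cost;
    bool samples_cube_map;
};

// Slots consumed by a single instruction.
int slots_used(const Instruction& ins) {
    assert(ins.cost.base >= 0 && ins.cost.cube_extra >= 0);
    return ins.cost.base + (ins.samples_cube_map ? ins.cost.cube_extra : 0);
}

// Slots consumed by a whole instruction list.
int total_slots(const std::vector<Instruction>& shader) {
    int total = 0;
    for (const Instruction& ins : shader) {
        total += slots_used(ins);
    }
    return total;
}
```

Under this reading, a texld from a cube map costs 4 slots and a texld from a 2-D texture costs 1, so a shader that samples a cube map once, normalizes the result with nrm, and applies one mad uses 4 + 3 + 1 = 8 slots.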

Shaders have restrictions on maximum instruction counts, as well as on nesting depths for static and dynamic flow control instructions.

Total Instruction slots: 512 maximum

The maximum number of instructions run is indicated by the MaxPShaderInstructionsExecuted cap in D3DCAPS9.

- On hardware that supports ps_2_0, it should be at least 96 (more if the hardware supports the flow control caps).

This cap can be set to the constant D3DINFINITEINSTRUCTIONS, indicating that the actual number of instructions run is unlimited.

The device driver should clamp the total number of instructions run to the value of the D3DRS_MAXPIXELSHADERINST render state. The legal values for this render state are powers of 2; if any other integer is set, the next nearest power of 2 is assumed. This render state defaults to D3DINFINITEINSTRUCTIONS.
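The clamping rule can be sketched as follows. This is an illustration under stated assumptions, not driver code. It assumes that "next nearest" means rounding up to the next power of 2, which the text does not spell out. The constant `kInfiniteInstructions` is a local stand-in for D3DINFINITEINSTRUCTIONS (a real program would take the value from the Direct3D 9 headers), and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Local stand-in for D3DINFINITEINSTRUCTIONS (assumed sentinel value).
constexpr std::uint32_t kInfiniteInstructions = 0xFFFFFFFFu;

// Assumption: a non-power-of-2 setting is rounded UP to the next power of 2.
// Precondition: 1 <= n <= 2^31, so the result fits in 32 bits.
std::uint32_t round_up_to_pow2(std::uint32_t n) {
    assert(n >= 1 && n <= 0x80000000u);
    std::uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

// Effective limit on instructions run: the device cap, further clamped by
// the render state unless the render state is left at its "infinite" default.
std::uint32_t effective_instruction_limit(std::uint32_t cap,
                                          std::uint32_t render_state) {
    if (render_state == kInfiniteInstructions) {
        return cap;  // default render state adds no extra clamp
    }
    return std::min(cap, round_up_to_pow2(render_state));
}
```

With these assumptions, a render state of 100 is treated as 128, so a device whose cap is 512 would run at most 128 instructions, while a device whose cap is 96 would still be limited to 96.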

D3DCAPS9.D3DPSHADERCAPS2_0.DynamicFlowControlDepth represents the nesting depth of dynamic flow control instructions: if, break, and break_comp. The value is equal to the nesting depth of the if_comp block. The range of values for this cap is 0 to 24. If this cap is zero, the device does not support dynamic flow control instructions.

D3DCAPS9.D3DPSHADERCAPS2_0.StaticFlowControlDepth represents the nesting depth of static flow control instructions: loop/rep and call/callnz. The range of values for this cap is 1 to 4. Note that **loop**/**rep** count toward the same nesting depth, and **call**/**callnz** count toward the same nesting depth. If this cap is zero, the device does not support static flow control instructions.
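The two depth caps can be checked against a shader's nesting requirements before the shader is created. The sketch below uses plain structs rather than the Direct3D headers so that it stays self-contained. `FlowCaps`, `ShaderNesting`, and `nesting_supported` are hypothetical names; in an application, the two cap values would be read from the device caps. The sketch assumes that loop/rep nesting and call/callnz nesting are each compared separately against StaticFlowControlDepth, which is one reading of the note above.

```cpp
#include <cassert>

// Mirrors the two caps described above (hypothetical struct, not the SDK type).
struct FlowCaps {
    int dynamic_depth;  // DynamicFlowControlDepth
    int static_depth;   // StaticFlowControlDepth
};

// Deepest nesting a given shader needs (hypothetical, filled in by the author).
struct ShaderNesting {
    int if_comp_depth;   // deepest nesting of if_comp blocks
    int loop_rep_depth;  // deepest nesting of loop/rep blocks
    int call_depth;      // deepest nesting of call/callnz
};

// Returns true if the device caps allow the shader's nesting depths.
// A cap of zero means that kind of flow control is not supported at all,
// so any shader that needs depth >= 1 of that kind is rejected.
bool nesting_supported(const FlowCaps& caps, const ShaderNesting& need) {
    assert(need.if_comp_depth >= 0 && need.loop_rep_depth >= 0 &&
           need.call_depth >= 0);
    const bool dynamic_ok = need.if_comp_depth <= caps.dynamic_depth;
    // Assumption: loop/rep and call/callnz are separate counters, each
    // bounded by the same static depth cap.
    const bool static_ok = need.loop_rep_depth <= caps.static_depth &&
                           need.call_depth <= caps.static_depth;
    return dynamic_ok && static_ok;
}
```

For example, a shader with one if_comp block is rejected on a device that reports a dynamic depth of 0, and a shader with two nested rep blocks is rejected on a device that reports a static depth of 1.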