• JP Lee

IPC with GPU.

We also look at the disassembled HW code and, more importantly, at performance: the cycle count that each individual instruction contributes in a shader.

Write the basic OpenGL ES shaders as .vsh and .psh files, the PowerVR shader format used for native shader compilation on iOS.

See below.





IPC with GPU.

Installation of SDK.

PowerVR

*Without a VPN, the download takes about 10 hours.

OVERVIEW.

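Execution time (Response time)

The time the GPU needs to complete one job.

GPU Execution time.

The actual time the GPU spends executing a job (excluding the time spent importing and exporting data).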

Clock period

The time needed to complete one clock cycle. Also called the clock cycle time.

Clock rate

The reciprocal of the clock period (clock cycle time): the number of clock cycles in one second.

GPU clock cycles = Instructions for program x Average clock cycles per instruction

Clock cycles per instruction

The number of clock cycles needed to execute each instruction.
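To make the terms above concrete, here is a minimal Python sketch that combines them. It is only an illustration: the function names are hypothetical, and the instruction count, CPI, and clock rate are made-up numbers, not measurements from a PowerVR device. It uses the standard relation that execution time equals clock cycles multiplied by the clock period.

```python
def gpu_clock_cycles(instruction_count, avg_cpi):
    """GPU clock cycles = instructions for program x average clock cycles per instruction."""
    return instruction_count * avg_cpi


def gpu_execution_time(instruction_count, avg_cpi, clock_rate_hz):
    """Execution time in seconds = clock cycles x clock period (1 / clock rate)."""
    clock_period_s = 1.0 / clock_rate_hz
    return gpu_clock_cycles(instruction_count, avg_cpi) * clock_period_s


# Hypothetical shader: 20 instructions, 1.5 cycles each on average, 500 MHz clock.
cycles = gpu_clock_cycles(20, 1.5)
time_s = gpu_execution_time(20, 1.5, 500e6)

print(cycles)  # 30.0
print(time_s)  # roughly 60 nanoseconds, expressed in seconds
```

Lowering either the instruction count or the average cycles per instruction shortens the execution time at a fixed clock rate, which is why the rest of this post looks at per-instruction cycle counts.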

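SIMT: short for Single Instruction, Multiple Threads, meaning that one instruction is processed by multiple threads at the same time. In other words, think of it as one program running on many threads simultaneously. CUDA follows the SIMT model.

Because the warp size is 32, one instruction in a warp is made up of 32 operations. (In the SIMT model, a warp size of 32 means the warp has 32 threads; one instruction runs on all 32 threads at once, which means 32 operations.)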
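The warp arithmetic can be sketched in a few lines of Python. This is a toy model under stated assumptions: `simt_execute` and `operations_issued` are hypothetical helper names, and the Python loop runs the lanes one after another, whereas real hardware runs them in parallel.

```python
WARP_SIZE = 32  # threads per warp, as stated in the text


def simt_execute(instruction, lane_inputs):
    """Apply one instruction to every thread (lane) of a warp."""
    return [instruction(x) for x in lane_inputs]


def operations_issued(instruction_count, warp_count, warp_size=WARP_SIZE):
    """Total operations = instructions x warps x threads per warp."""
    return instruction_count * warp_count * warp_size


# One "multiply by 2" instruction executed across a full warp.
lanes = list(range(WARP_SIZE))
results = simt_execute(lambda x: x * 2, lanes)

print(len(results))              # 32
print(operations_issued(10, 4))  # 1280
```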

Related technical documents.

EXPERIMENT SHADER INSTRUCTION PER CYCLE

Example of IPC. (PowerVR)



To aid understanding, the Lambert lighting model is used.

Run PVRShaderEditor 2.9 (x64).


Compiler setup.


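// DiffuseTexture, TexCoord, powVal, DestColor and oColor are declared elsewhere in the shader.
void main()
{
    vec4 diffuseTex = texture(DiffuseTexture, TexCoord);

    // The ordinary way to write this: call pow() directly.
    diffuseTex = pow(diffuseTex, powVal);

    oColor = diffuseTex * DestColor;
}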



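// DiffuseTexture, TexCoord, DestColor and oColor are declared elsewhere in the shader.
void main()
{
    vec4 diffuseTex = texture(DiffuseTexture, TexCoord);

    // Optimized: pow() replaced by an approximation built from multiplications.
    diffuseTex = 5.0 * (diffuseTex * diffuseTex) * 0.19875;

    oColor = diffuseTex * DestColor;
}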
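How close is the approximation? The optimized line reduces to 0.99375 * x * x, so it can only resemble pow() when the exponent is near 2. The Python sketch below checks this numerically on the CPU. The exponent 2.2 is an assumption for illustration, since the post does not state the value of powVal, and the helper names are hypothetical.

```python
def exact_pow(x, exponent):
    """Reference result, what pow(x, exponent) computes in the shader."""
    return x ** exponent


def approx_pow(x):
    """Same arithmetic as the optimized shader line: 5.0 * (x * x) * 0.19875."""
    return 5.0 * (x * x) * 0.19875


# Assumed exponent; the original shader's powVal is not given in the post.
exponent = 2.2

# Sample every 8-bit color value in the range [0, 1].
samples = [i / 255.0 for i in range(256)]
max_error = max(abs(exact_pow(x, exponent) - approx_pow(x)) for x in samples)

print(max_error)
```

Under that assumption the largest difference stays within a few hundredths over the 0 to 1 color range. Whether that is acceptable depends on the look you need, so compare the rendered result as well as the cycle count.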



Homework

Write a Fresnel function and reduce the IPC as much as possible.

Hint.
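One possible starting point, offered as an assumption and not as the original hint from this post, is Schlick's approximation of the Fresnel term: F = F0 + (1 - F0) * (1 - cosTheta)^5. The fifth power can be built from three multiplications instead of a pow() call, in the same spirit as the pow replacement shown earlier. The Python sketch below only checks that the two forms agree numerically. The function names and the F0 value of 0.04 are illustrative, and the actual cycle counts still have to be read from the disassembly in PVRShaderEditor.

```python
def fresnel_schlick_pow(cos_theta, f0):
    """Schlick's approximation written with a power call."""
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5


def fresnel_schlick_mul(cos_theta, f0):
    """Same formula with the fifth power expanded into multiplications."""
    t = 1.0 - cos_theta
    t2 = t * t          # t^2
    t5 = t2 * t2 * t    # t^5
    return f0 + (1.0 - f0) * t5


# Compare both forms for a few view angles (cos_theta between 0 and 1).
for cos_theta in (0.0, 0.25, 0.5, 0.75, 1.0):
    a = fresnel_schlick_pow(cos_theta, 0.04)
    b = fresnel_schlick_mul(cos_theta, 0.04)
    print(cos_theta, a, b)
```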



Related topic on optimizing.

https://www.opengl.org/wiki/GLSL_Optimizations



Vertex shader.

lambert.vsh


The disassembled HW code for lambert.vsh is shown below.







Pixel Shader.

lambert.psh


Disassembled HW Code



example.






© 2014 by LEEGOONZ HOME.