Hi All,
I love neoscrypt. I have written my own GPU version and am testing it on PXC at Theblocksfactory. :)
Ralph
Hello Everyone,
I’m back. O0
After 4 days and nights, I finally got my neoscrypt code optimized successfully by the lovely OpenCL compiler.
The current result: ScratchRegs is reduced to 224 and the overall hash rate on one R9 290 is 160-170 kH/s. :)
With 5 R9 290s, I get around 800-830 kH/s locally and 780-800 kH/s on pxc.theblocksfactory.
link: http://i58.tinypic.com/noxd9h.jpg
My rig:
5 ASUS R9 290
Win8.1
GPU: default core and memory frequency.
AppSDK: 2.9.1
Catalyst Driver: 14.4
Plus: coding OpenCL is a real nightmare. Commenting out one line or adding one useless line can make the result 100% different.
It cost me my national holiday, but the result is exciting.
Love neoscrypt, hate OpenCL code, but enjoying the fun.
Ralph
Hi,
I’m so glad to meet you here!!! :)
Actually, I played with the original cgminer for a while, but I found it contains too much code that I do not need.
So this time, I just added simple OpenCL initialization and execution code to cpuminer (originally the neoscrypt cpuminer) and got it working.
(I like the neoscrypt cpuminer code, simple, easy to understand and maintain.)
(Apologies to the neoscrypt code: I removed a lot of what, to my understanding, was unnecessary code compared to the original.)
As a result, the code is heavily customized for my R9 290 and has no hardware monitoring function.
For me, the goal is a very thin GPU miner for neoscrypt that is easy to debug and code.
As for neoscrypt being designed to resist ASICs, I think that will not work as well as people expect.
I write FPGA code for PTS (for fun :)) and know that the current memory operations are not a problem for an FPGA. As long as people are willing to invest in it, an ASIC can be made very soon.
Thanks
Regards.
Ralph
Hi All,
I think it’s better to look into the current code:
If you compile it using CodeXL 1.5 with optimization disabled, the ScratchRegs number is huge: more than 1500.
With optimization enabled, the ScratchRegs number is still more than 400.
The key issue with neoscrypt is that it does too much dynamic copying: the buffer positions for B and A are calculated at run time.
That makes it hard to make full use of uint4, and the OpenCL compiler needs lots of vector registers (VGPRs) to calculate the next buffer position.
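A minimal sketch of the buffer-position problem, written in plain C for illustration and not taken from the miner. It assumes the buffer is stored as 32-bit words and that the read position is a byte offset known only at run time. The helper name `load_word_at` and its parameters are hypothetical. One common workaround is to build each unaligned word from two aligned loads plus shifts, which is where the extra registers go:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the 32-bit little-endian word that starts at
 * byte offset byte_off of a word-aligned buffer, using aligned loads only. */
static uint32_t load_word_at(const uint32_t *buf, uint32_t nwords,
                             uint32_t byte_off)
{
    uint32_t idx   = byte_off >> 2;         /* aligned word holding the first byte */
    uint32_t shift = (byte_off & 3u) * 8u;  /* bit offset inside that word */

    assert(idx < nwords);
    uint32_t lo = buf[idx];
    if (shift == 0)
        return lo;                          /* already aligned: one load is enough */

    assert(idx + 1 < nwords);
    uint32_t hi = buf[idx + 1];             /* unaligned: the word straddles two loads */
    return (lo >> shift) | (hi << (32u - shift));
}
```

With a constant offset the compiler can resolve `idx` and `shift` at compile time. With a run-time offset every word costs two loads, two shifts, and the index arithmetic, and a whole uint4 cannot simply be fetched in one aligned access.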
The second difficulty is that it uses too many local arrays, and indexing local arrays hurts the OpenCL code’s performance.
Because of these two issues, the registers are used up very quickly and the data spills to global memory.
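To illustrate the array-indexing point, here is a small plain C sketch (an illustration only, with hypothetical function names, not the miner's code). The assumption is that a GPU compiler cannot keep an array in registers when the index is only known at run time, so the array ends up in scratch memory, while a chain of selects over scalar values can stay in registers:

```c
#include <stdint.h>

/* Run-time index: the array has to live somewhere addressable. */
static uint32_t pick_indexed(const uint32_t v[4], uint32_t i)
{
    return v[i & 3u];
}

/* Same result using scalar values and selects only, so no addressable
 * storage is required for the four inputs. */
static uint32_t pick_select(uint32_t v0, uint32_t v1, uint32_t v2,
                            uint32_t v3, uint32_t i)
{
    uint32_t lo = (i & 1u) ? v1 : v0;   /* candidate from the pair 0/1 */
    uint32_t hi = (i & 1u) ? v3 : v2;   /* candidate from the pair 2/3 */
    return (i & 2u) ? hi : lo;          /* bit 1 chooses between the pairs */
}
```

The select version does more arithmetic per access, so it only pays off when the array is small and the alternative is a spill. Whether it actually lowers the ScratchRegs number has to be checked in the compiler statistics for each kernel.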
Directly converting the C code to OpenCL is just the very first step; the ScratchRegs number then needs to be reduced to 0.
I have rewritten the original code and reduced the ScratchRegs number to 372 in non-optimized mode.
I’m not modifying cgminer 3.7.2, but adding code to the original cpuminer and simplifying it. (Currently it works on Win8.1 only.)
Without optimization, my code runs at 95 kH/s on an R9 290, and with 5 R9 290s I get 440 kH/s on pxc.theblocksfactory.
With optimization enabled, the code runs at 145 kH/s and the ScratchRegs number drops to around 220.
But unfortunately, it then produces wrong nonces with very weird intermediate values. I am still testing it.
I’d like to discuss any technical details with all of you.
Thanks
Ralph