Message boards : Number crunching : Rosetta@home using AVX / AVX2 ?
| Author | Message | 
|---|---|
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 | 
| rjs5 Joined: 22 Nov 10 Posts: 274 Credit: 23,730,845 RAC: 0 | 
 "GIMPS project introduces AVX512 support" PrimeGrid uses GIMPS. I am seeing the benefit on my 9980XE machine. I have not measured it accurately, but it looked like about a 30% improvement on PrimeGrid LLR. When a dense AVX application starts crunching, the CPU throttles back the clock because of the higher CPU power usage. This highlighted an issue with some CPUs. Some CPUs are designed with one AVX unit per core instead of one AVX unit per thread. This means that the AVX unit can only be used by one of the threads on that core at a time. On systems with a single AVX unit per core, the bottleneck this creates causes the application to run slower than it would without AVX-512. "Performance" is a fickle thing. | 
| G.L.I.S. Joined: 25 Dec 08 Posts: 26 Credit: 2,780,490 RAC: 0 | 
 That is also a PrimeGrid app, but the catch is that only Intel CPUs can handle it, and at a much higher CPU power consumption. Mind you, SSE2 / SSE3 would be fine for me too ... even Phenom (K10) CPUs can handle them. | 
| G.L.I.S. Joined: 25 Dec 08 Posts: 26 Credit: 2,780,490 RAC: 0 | 
 That is also a PrimeGrid app, but the catch is that only Intel CPUs can handle it, and at a much higher CPU power consumption. Of course: less time, same credit score (IMHO). | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 Ryzen 3xxx will support 256-bit AVX (Ryzen 1xxx and 2xxx have only 128-bit AVX units): "AMD has improved IPC by roughly 15% (though that can vary by workload), doubled the L3 cache size to keep data as close to the execution units as possible, and doubled floating point performance by stepping up to two 256-bit floating point units (FPUs) that enable support for AVX2 instructions." | 
| mmonnin Joined: 2 Jun 16 Posts: 61 Credit: 25,390,629 RAC: 0 | 
 To clarify, Zen 2 will execute AVX2 in a single cycle. Zen/Zen+ can run AVX2 but needs 2 cycles to complete each operation. | 
| G.L.I.S. Joined: 25 Dec 08 Posts: 26 Credit: 2,780,490 RAC: 0 | 
 Greetings to all. I believe that the applications that deal with proteins in Folding@home use fast SIMD code. Personally, I hope (and I wish this for the whole project) that we will not have to get through all of 2020 without Rosetta@home providing it properly. Bye | 
| G.L.I.S. Joined: 25 Dec 08 Posts: 26 Credit: 2,780,490 RAC: 0 | 
 I would add that you could start with Linux, and also take the ARM platform into account, since these crunching devices are becoming widespread too: https://www.google.it/search?q=arm+hardkernel+odroid-n2&newwindow=1&sxsrf=ACYBGNRTcA7SJ1QcOdNKgjTC9WppFijsNg:1571610265481&source=lnms&tbm=isch&sa=X&ved=0ahUKEwj8s-W88KvlAhXko4sKHd9zBCwQ_AUIFCgD&biw=1485&bih=929 | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 C++20 is ready and will be published in a few months!! "C++20, the most impactful revision of C++ in a decade, is done!" | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 It seems that GCC is better than ICC | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 It now seems that version 4.12 is a native 64-bit version for Windows. Great! Is it time for SSEx/AVX support?? | 
| Jan Vaclavik Joined: 26 Sep 05 Posts: 5 Credit: 465,351 RAC: 18 | 
 "It now seems that version 4.12 is a native 64-bit version for Windows." Are there even any x86-64 CPUs without at least SSE2 support? | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 "Is it time for SSEx/AVX support??" SSE2, from Wikipedia: "Introduced by Intel with the initial version of the Pentium 4 in 2000... AMD added support for SSE2 in 2003" | 
| Klimax Joined: 27 Apr 07 Posts: 44 Credit: 2,805,042 RAC: 0 | 
 "Is it time for SSEx/AVX support??" It should be noted that x86-64 mandates SSE2 support, and as such any x86-64 CPU supports it. | 
| Jan Vaclavik Joined: 26 Sep 05 Posts: 5 Credit: 465,351 RAC: 18 | 
 "SSE2, from Wikipedia:" I know, but back in the day there were more manufacturers, like VIA, and Intel sometimes released CPUs, like the Atom line, that did not support all the instruction sets. But it seems you are right: all x86-64 CPUs support SSE2 and, except for the first AMD K8, they support SSE3. | 
| [VENETO] boboviz Joined: 1 Dec 05 Posts: 2124 Credit: 12,426,657 RAC: 2,579 | 
 "But it seems you are right: all x86-64 CPUs support SSE2 and, except for the first AMD K8, they support SSE3." As I said in the past, even if an SSE2 version gives only 0.5% more computational power, the top ten systems will outweigh all the remaining old systems that don't support this extension. Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but I don't know if that is true | 
| Grant (SSSF) Joined: 28 Mar 20 Posts: 1895 Credit: 18,534,891 RAC: 0 | 
 "Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but I don't know if that is true" I'm thinking different compilation flags, making sure mathsafe or similar is used, then checking that the resulting output of the application is as expected. Grant Darwin NT | 
| Laurent Joined: 15 Mar 20 Posts: 14 Credit: 88,800 RAC: 0 | 
 "Rjs5 said that introducing SSE2 is not so difficult (recompilation with some tricks), but I don't know if that is true" It is. The keyword is auto-vectorization. It was already available in most of the better compilers sometime around 2000-2005. I remember it kicking in for the Pentium MMX extensions... Just as a reminder, that's the Pentium I in today's numbering. Nowadays it is often faster to just write clean code without any extras and tell the compiler to do the magic than to attempt the AVX/SSE/whatever magic yourself. Even the free Visual Studio tiers can do that. Bonus: compilers usually emit code that runs on ALL CPUs, unless you screw up the parameters. The code contains fall-back paths to run if an extension is not there. The only real advantage of dedicated exes for AVX, SSE, ... is slightly smaller exes (come on, we all download WUs way bigger than the exes...). It's a different thing for GPUs. Compilers are not yet smart enough to do that level of vectorization. | 
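
To make the auto-vectorization point a bit more concrete, here is a minimal sketch in C. It is not Rosetta code: `saxpy` and `simd_level` are made-up names for illustration. The first function is the kind of plain loop an optimizing compiler can usually vectorize without intrinsics (with GCC, for example, `-O3`, optionally `-mavx2`, and `-fopt-info-vec` to report which loops were vectorized). Whether a compiler adds fall-back paths on its own depends on the compiler and the flags used, so the second function shows one explicit way to ask the CPU at run time, assuming GCC or Clang on an x86 target.

```c
#include <stddef.h>

/* Plain scalar loop: y[i] = a * x[i] + y[i].
 * `restrict` promises the compiler that x and y do not overlap, which
 * makes it easier for the optimizer to vectorize the loop on its own. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Highest SIMD level (of the three checked) that the CPU reports at run time:
 * 0 = none detected, 1 = SSE2, 2 = AVX, 3 = AVX2.
 * Assumption: GCC/Clang builtins, which exist only on x86 targets;
 * on any other compiler or architecture this simply returns 0. */
int simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return 3;
    if (__builtin_cpu_supports("avx"))  return 2;
    if (__builtin_cpu_supports("sse2")) return 1;
#endif
    return 0;
}
```

A caller could use `simd_level()` once at start-up to pick between differently compiled builds of the same routine, then compare their output against the scalar build to check that the results are as expected.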
         ©2025 University of Washington 
https://www.bakerlab.org