asm.s
asmtest.c
avxfull.c
avxleast.c
avxmid.c
benchmark.sh
BENCHMARK.txt
chacha.h
normal.c
README.txt
test.c

This is my best attempt at making a decent ChaCha20 implementation.

You need a machine with both the `avx512vl` and `avx512f` flags. Other AVX flags don't count. To check, run `lscpu | grep avx`.

The full AVX file is about 167.2% faster than the no-AVX version, which is very significant. The asm file, though, is only about 1.4% faster than the full AVX file.
- Note: the init function is about 50% faster, but that isn't a majority of the work.

So while I am happy to say that I can hand-roll ASM better than gcc, it probably isn't worth it in most situations.

The actual benchmark, which I took on a 2-core Xeon Platinum 8175M AWS instance, is in BENCHMARK.txt for anyone who is curious.
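
The flag requirement above (both `avx512f` and `avx512vl`) can also be checked from C. The following is only a minimal sketch and is not part of this repo: `has_flag`, `flags_ok`, and `cpu_ok` are hypothetical names. `flags_ok` does the same thing as reading the `lscpu` output by eye, matching whole flag names only. `cpu_ok` assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if `flag` appears in `flags` as a whole
 * whitespace-separated token, so "avx512vl" does not match a longer name. */
int has_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    while ((p = strstr(p, flag)) != NULL) {
        int starts_token = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends_token = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts_token && ends_token)
            return 1;
        p += n; /* skip this partial match and keep searching */
    }
    return 0;
}

/* Hypothetical helper: 1 only when BOTH required flags are listed,
 * e.g. in the flags line printed by lscpu. */
int flags_ok(const char *flags)
{
    return has_flag(flags, "avx512f") && has_flag(flags, "avx512vl");
}

/* Runtime check using GCC/Clang builtins on x86.
 * On other compilers or targets this sketch simply reports 0. */
int cpu_ok(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") &&
           __builtin_cpu_supports("avx512vl");
#else
    return 0;
#endif
}
```

A caller could test `cpu_ok()` at startup and fall back to a non-AVX path when it returns 0. Whether that fallback makes sense for this code is a design choice left open here.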