SUMMARY: cl.h causes John the Ripper to choke and Hashcat to hang when executing clCreateKernel(program[gpu_id], "wpapsk_final_sha256", &ret_code) for wpapsk. How can I fix this problem, especially (or at the very least) for Hashcat?
NOTE: This post is a clarification and elaboration of my earlier post (which incorrectly and insufficiently identified the problem with OpenCL for Mali):
Rockchip-mali-midgard: OpenCL broken; please add earlier compilation (version) to repository and fix the current one
To enable OpenCL (initially just for use with Hashcat), I installed the OpenCL ICD Loader plus clinfo. I also installed three additional packages for John the Ripper (JtR), which I use for purposes of comparison.
apt-get install ocl-icd-libopencl1 clinfo libopenmpi2 openmpi-bin argon2
The following additional packages will be installed:
libgfortran3 libhwloc-plugins libhwloc5 libibverbs1 openmpi-common
apt-get install mesa-opencl-icd
When I ran JtR, I got off to an auspicious start (4266 c/s virtual for the whole GPU) with Ubuntu Bionic's CL headers, from opencl-c-headers.
NOTE: any performance is acceptable, as this exercise is one of education and understanding on a tight budget.
john-1.9.0-jumbo-1/src$ …/run/john --test --format=opencl --force-scalar
Device 1@wifi: Mali-T860
Benchmarking: sha1crypt-opencl, (NetBSD) [PBKDF1-SHA1 OpenCL]… DONE
Speed for cost 1 (iteration count) of 64000 and 40000
Raw: 38.6 c/s real, 4266 c/s virtual
Later, I was able to achieve better results.
Bionic’s CL, from opencl-c-headers, works in ~3-5min.
john-1.9.0-jumbo-1-compiled_with_1bionic_headers/src$ …/run/john --test --format=opencl --force-scalar
Benchmarking: sha1crypt-opencl, (NetBSD) [PBKDF1-SHA1 OpenCL]… DONE
Speed for cost 1 (iteration count) of 64000 and 40000
Raw: 53.7 c/s real, 5120 c/s virtual
NOTE: when I accidentally ran the test again, the speed doubled.
john-1.9.0-jumbo-1-compiled_with_1bionic_headers/src$ …/run/john --test --format=opencl --force-scalar
Device 1@wifi: Mali-T860
Benchmarking: sha1crypt-opencl, (NetBSD) [PBKDF1-SHA1 OpenCL]… DONE
Speed for cost 1 (iteration count) of 64000 and 40000
Raw: 113 c/s real, 6400 c/s virtual
But I noticed that JtR choked on /at least/ one format: the one I need to use.
…/run/john --test --format=wpapsk-opencl
Device 1@wifi: Mali-T860
Benchmarking: wpapsk-opencl, WPA/WPA2/PMF/PMKID PSK [PBKDF2-SHA1 OpenCL]… 0: OpenCL CL_INVALID_PROGRAM_EXECUTABLE (-45) error in opencl_wpapsk_fmt_plug.c:270 - Error creating kernel
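For reference, the numeric code in that message (-45) can be looked up in whichever cl.h is installed. The following is only a minimal sketch, assuming the header uses the usual "#define NAME value" layout; cl_error_name is a hypothetical helper name of my own, not part of JtR or of any OpenCL package. According to the OpenCL specification, clCreateKernel returns CL_INVALID_PROGRAM_EXECUTABLE when there is no successfully built executable for the program, so the kernel build step may be worth checking as well.

```shell
# Look up the symbolic name of a numeric OpenCL status code in a header.
# Usage: cl_error_name CODE [HEADER]
# HEADER defaults to /usr/include/CL/cl.h (an assumption about the install path).
cl_error_name() {
    code=$1
    header=${2:-/usr/include/CL/cl.h}
    # Match lines such as: #define CL_INVALID_PROGRAM_EXECUTABLE   -45
    # The appended "" forces a string comparison, so hex values never match by accident.
    awk -v code="$code" \
        '$1 == "#define" && $2 ~ /^CL_/ && ($3 "") == (code "") { print $2 }' \
        "$header"
}

# Example: cl_error_name -45
```

A code that is shared by several definitions (0, for instance) prints every matching name, one per line.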
The problematic line (270) is the sixth one shown below (the one that begins "wpapsk_final_sha256 = clCreateKernel...").
~/JtR/john-1.9.0-jumbo-1/src$ grep clCreateKernel opencl_wpapsk_fmt_plug.c
crypt_kernel = wpapsk_init = clCreateKernel(program[gpu_id], "wpapsk_init", &ret_code);
wpapsk_loop = clCreateKernel(program[gpu_id], "wpapsk_loop", &ret_code);
wpapsk_pass2 = clCreateKernel(program[gpu_id], "wpapsk_pass2", &ret_code);
wpapsk_final_md5 = clCreateKernel(program[gpu_id], "wpapsk_final_md5", &ret_code);
wpapsk_final_sha1 = clCreateKernel(program[gpu_id], "wpapsk_final_sha1", &ret_code);
wpapsk_final_sha256 = clCreateKernel(program[gpu_id], "wpapsk_final_sha256", &ret_code); /* JtR chokes here. */
wpapsk_final_pmkid = clCreateKernel(program[gpu_id], "wpapsk_final_pmkid", &ret_code);
The function clCreateKernel is declared in cl.h, as shown below:
grep clCreateKernel /usr/include/CL/cl.h
clCreateKernel(cl_program /* program */,
clCreateKernelsInProgram(cl_program /* program */,
I replaced the CL directory with the one from ComputeLibrary-19.05/include/CL
mv /usr/include/CL /usr/include/CL.orig
cp -a ComputeLibrary-19.05/include/CL /usr/include/
but this change gave me the same results.
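To make this kind of header swap easy to undo between tests, the two commands can be wrapped in a pair of functions. This is a minimal sketch only; swap_headers and restore_headers are hypothetical names of my own, and the ".orig" suffix simply mirrors the backup name used above.

```shell
# Replace a header directory with another copy, keeping a backup.
# Usage: swap_headers NEW_DIR TARGET_DIR
swap_headers() {
    new=$1
    target=$2
    backup="$target.orig"
    if [ ! -d "$new" ]; then
        echo "no such directory: $new" >&2
        return 1
    fi
    # Refuse to clobber an earlier backup
    if [ -e "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi
    mv "$target" "$backup" && cp -a "$new" "$target"
}

# Undo swap_headers: drop the replacement and put the backup back.
# Usage: restore_headers TARGET_DIR
restore_headers() {
    target=$1
    backup="$target.orig"
    if [ ! -d "$backup" ]; then
        echo "no backup found: $backup" >&2
        return 1
    fi
    rm -rf "$target" && mv "$backup" "$target"
}

# Example (as root):
#   swap_headers ComputeLibrary-19.05/include/CL /usr/include/CL
#   restore_headers /usr/include/CL
```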
The first test (--format=opencl) worked, albeit significantly more slowly than with Bionic’s CL.
…/run/john --test --format=opencl --force-scalar
Device 1@wifi: Mali-T860
Benchmarking: sha1crypt-opencl, (NetBSD) [PBKDF1-SHA1 OpenCL]… DONE
Speed for cost 1 (iteration count) of 64000 and 40000
Raw: 33.7 c/s real, 2133 c/s virtual
…
wpapsk-opencl gave the same error as before.
I posted my results to the JtR forum, but was unable to get a resolution.
Hashcat, like JtR (results omitted), recognized the GPU.
hashcat -I
hashcat (v5.1.0) starting…
OpenCL Info:
Platform ID #1
Vendor : ARM
Name : ARM Platform
Version : OpenCL 1.2 v1.r14p0-01rel0-git(966ed26).f44c85cb3d2ceb87e8be88e7592755c3
Device ID #1
Type : GPU
Vendor ID : 2147483648
Vendor : ARM
Name : Mali-T860
Version : OpenCL 1.2 v1.r14p0-01rel0-git(966ed26).f44c85cb3d2ceb87e8be88e7592755c3
Processor(s) : 4
Clock : 200
Memory : 229/919 MB allocatable
OpenCL Version : OpenCL C 1.2 v1.r14p0-01rel0-git(966ed26).f44c85cb3d2ceb87e8be88e7592755c3
Driver Version : 1.2
Upon issuing the benchmarking command, the screen (but not the system) hung,
hashcat -b -m 2500
…
OpenCL Platform #1: ARM
-
Device #1: Mali-T860, 229/919 MB allocatable, 4MCU
…
Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)
for between half an hour and an hour without results the first time (when I interrupted the process), and for 4 hours, ending in a spontaneous reboot, the second.
Please let me know how I can fix cl.h (or anything else) so that clCreateKernel(program[gpu_id], "wpapsk_final_sha256", &ret_code) does not choke when processing wpapsk, especially (or at the very least) for Hashcat.
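In the meantime, to avoid another multi-hour hang while testing, the benchmark can be given a wall-clock limit. This is a minimal sketch, assuming GNU coreutils timeout is available; run_limited is a hypothetical wrapper name of my own, and the 30m limit in the example is arbitrary.

```shell
# Run a command with a wall-clock limit.
# Usage: run_limited DURATION COMMAND [ARGS...]
run_limited() {
    limit=$1
    shift
    # -k 10: send SIGKILL if the command is still alive 10 s after SIGTERM
    timeout -k 10 "$limit" "$@"
    status=$?
    # GNU timeout exits with 124 when the limit was reached
    if [ "$status" -eq 124 ]; then
        echo "timed out after $limit: $*" >&2
    fi
    return "$status"
}

# Example: run_limited 30m hashcat -b -m 2500
```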