2024-09-02

August: Finalising the Parser

In this hectic month, I continued working on the parser for the path mapping files. The parsing itself is now finished, although I still need to change how the parsed information is stored. Right now, the data structures are not optimised to also support line mapping.

Because the parser parses user input, it is important that it is robust. It should be able to handle correctly formatted files, but also files with errors.

It is not always possible to think of every failure situation up front, and therefore a common technique is to use a fuzzer. For PHP there is, for example, Infection PHP for mutation testing. For C and C++, a commonly used tool is AFL++. It provides a compiler wrapper and a run-time to fuzz the input to your application. You first provide a template, which it then modifies to try to break your code.

The template that I used in this case was a minimal map file:

remote_prefix: /usr/local/www
local_prefix: /home/derick/project
/projects/example.php:5-17 = /example.php:8
/example.php:5-17 = /example.php:8-20
/example.php:17 = /example.php:20
/projects/php-web/ = /php-web/

In addition to the template you also need to provide a shim: the program that, in my case, takes the argument given to it and then parses that as a file. The AFL++ compiler wrapper instruments the shim so that the fuzzer can track which code paths an input reaches and detect crashes.
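A shim of this kind can be very small. The following is a minimal sketch and not the shim used for Xdebug: `parse_map_stream` is a hypothetical stand-in for the real parser entry point (which is not shown here), and `run_shim` is an illustrative name for the part that turns the first argument into an open file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real parser entry point. It only counts
 * the lines that contain " = ", so that the shim has something to call. */
int parse_map_stream(FILE *fp)
{
    char line[1024];
    int rules = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, " = ") != NULL) {
            rules++;
        }
    }
    return rules;
}

/* The shim: treat the first argument as a file name and parse that file.
 * afl-fuzz replaces "@@" on its command line with the path of each
 * generated input, so that path arrives here as argv[1]. */
int run_shim(int argc, char **argv)
{
    FILE *fp;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <map-file>\n", argv[0]);
        return 2;
    }

    fp = fopen(argv[1], "r");
    if (fp == NULL) {
        return 1; /* an unreadable input file is not a parser bug */
    }

    parse_map_stream(fp);
    fclose(fp);
    return 0;
}
```

A one-line `main` that returns `run_shim(argc, argv)` completes the program. It would then be compiled with the AFL++ wrapper instead of the regular compiler, for example `afl-cc -o afl-test shim.c`, where `shim.c` is an assumed file name.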

You then run the fuzzer, for example with:

AFL_SKIP_CPUFREQ=1 afl-fuzz -b 7 -i fuzz-seeds -o fuzz-output -- ./afl-test @@

The fuzzer then runs your program with your template files (in fuzz-seeds in my example), and after that with loads of mutated variants of them. For each run, it replaces the @@ with the path of the input file.

Its status screen looks something like:

american fuzzy lop ++4.21c {default} (./afl-test) [explore]          
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│        run time : 0 days, 0 hrs, 0 min, 19 sec      │  cycles done : 0     
│   last new find : 0 days, 0 hrs, 0 min, 6 sec       │ corpus count : 44    
│last saved crash : none seen yet                     │saved crashes : 0     
│ last saved hang : none seen yet                     │  saved hangs : 0     
├─ cycle progress ─────────────────────┬─ map coverage┴──────────────────────┤
│  now processing : 0.1 (0.0%)         │    map density : 0.01% / 0.01%      
│  runs timed out : 0 (0.00%)          │ count coverage : 2.37 bits/tuple    
├─ stage progress ─────────────────────┼─ findings in depth ─────────────────┤
│  now trying : interest 32/8          │ favored items : 1 (2.27%)           
│ stage execs : 285/9626 (2.96%)       │  new edges on : 26 (59.09%)         
│ total execs : 4403                   │ total crashes : 0 (0 saved)         
│  exec speed : 248.6/sec              │  total tmouts : 0 (0 saved)         
├─ fuzzing strategy yields ────────────┴─────────────┬─ item geometry ───────┤
│   bit flips : 5/1408, 2/1407, 1/1405               │    levels : 2         
│  byte flips : 0/176, 0/175, 0/173                  │   pending : 44        
│ arithmetics : 6/11.9k, 0/20.6k, 0/20.3k            │  pend fav : 1         
│  known ints : 0/1481, 0/6354, 0/0                  │ own finds : 43        
│  dictionary : 0/0, 0/0, 0/0, 0/0                   │  imported : 0         
│havoc/splice : 0/0, 0/0                             │ stability : 100.00%   
│py/custom/rq : unused, unused, unused, unused       ├───────────────────────┘
│    trim/eff : 16.98%/83, 83.52%                    │          [cpu007: 12%]
└─ strategy: explore ────────── state: started :-) ──┘

The fuzzer found a few bugs in my parser that made it crash. One example that I hadn't thought of is a line starting with a =:

remote_prefix: /usr/local/www
local_prefix: /home/derick/project
=/example.php:42-5 = /example.php

You can find the other cases it found in a dedicated test file.
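What exactly went wrong inside the parser for such a line is not described here. As an illustration only, one way to guard against it is to reject a rule whose left-hand side is empty before doing anything else with it. The helper below is a hypothetical sketch, not code from Xdebug, and it assumes that a rule is split at the first = on the line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "remote = local" at the first '=' and report
 * where each side starts and how long it is, without surrounding blanks.
 * Returns 0 on success and -1 when the line is not a usable rule. */
int split_rule(const char *line, const char **lhs, size_t *lhs_len,
               const char **rhs, size_t *rhs_len)
{
    const char *eq = strchr(line, '=');
    const char *end;

    if (eq == NULL) {
        return -1; /* no separator at all */
    }

    /* Left side: trim trailing blanks before the '='. */
    end = eq;
    while (end > line && (end[-1] == ' ' || end[-1] == '\t')) {
        end--;
    }
    if (end == line) {
        return -1; /* nothing before the '=', as in "=/example.php:42-5 = ..." */
    }
    *lhs = line;
    *lhs_len = (size_t)(end - line);

    /* Right side: skip leading blanks, trim trailing blanks and newline. */
    *rhs = eq + 1;
    while (**rhs == ' ' || **rhs == '\t') {
        (*rhs)++;
    }
    end = *rhs + strlen(*rhs);
    while (end > *rhs && (end[-1] == ' ' || end[-1] == '\t' ||
                          end[-1] == '\n' || end[-1] == '\r')) {
        end--;
    }
    if (end == *rhs) {
        return -1; /* nothing after the '=' */
    }
    *rhs_len = (size_t)(end - *rhs);
    return 0;
}
```

With this check in place, the fuzzer-generated line above is turned into an ordinary parse error, while a line such as "/example.php:17 = /example.php:20" still splits into its two sides.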

In September I hope to finish the parser (by storing things more efficiently internally) and to create APIs to do the mapping.

I have spent 8½ hours in August on Native Xdebug Path Mapping.