Birb - Faster Web Discovery

Web discovery?

Penetration testing of web applications often involves scanning for API endpoints, directories and similar resources.

This kind of brute-force enumeration can become quite time-consuming in some cases. Moreover, the diversity of server implementations and framework/application behavior makes it particularly difficult to build a "universal" scanner - some manual tuning is almost always required to get the best results.

The Birb approach

Birb is a very fast web discovery tool.

Written in C, it uses parallel processing to work through a wordlist as fast as possible, consistently reaching the highest request rate the server will allow.

In other words, Birb requires very few resources on the user's system, and is limited only by the network capabilities and processing power of the target.

How it works

To achieve this, Birb forks into as many processes as requested by the user, splitting the wordlist into one chunk per process. Scanner processes therefore share almost nothing with each other, maximizing parallelism.

Thanks to this, Birb can easily run with more than 100 simultaneous processes on a modern mid-range laptop.
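To make this concrete, here is a minimal, simplified sketch of the fork-and-chunk pattern described above. It is not Birb's actual source code - the worker count, the wordlist contents and the scan_chunk() placeholder are all made up for illustration:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS 4  /* placeholder worker count */

/* Placeholder for the real work: each worker would open its own
   connections and request every word in its slice. */
static void scan_chunk(const char **words, size_t count)
{
    for (size_t i = 0; i < count; i++)
        printf("[pid %d] testing /%s\n", (int)getpid(), words[i]);
}

int main(void)
{
    const char *wordlist[] = { "admin", "api", "backup", "login",
                               "static", "uploads", "v1", "v2" };
    size_t total = sizeof(wordlist) / sizeof(wordlist[0]);
    size_t chunk = (total + NPROCS - 1) / NPROCS;   /* ceil(total / NPROCS) */

    for (int p = 0; p < NPROCS; p++) {
        size_t start = (size_t)p * chunk;
        if (start >= total)
            break;
        size_t count = (start + chunk > total) ? total - start : chunk;

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(1);
        }
        if (pid == 0) {                  /* child: scan its slice, then exit */
            scan_chunk(wordlist + start, count);
            _exit(0);
        }
        /* parent: keep forking the remaining workers */
    }

    while (wait(NULL) > 0)               /* reap all workers */
        ;
    return 0;
}

Because every worker owns its slice of the wordlist outright, no locking or inter-process communication is needed once the workers have been forked.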

Birb also makes use of HTTP pipelining, sending a user-defined number of requests in bulk before the responses even start coming in. Some servers accept only one request at a time, requiring the user to throttle Birb through command-line arguments, but more modern or more powerful servers may accept upwards of 300 requests at a time (Birb's default).

Finally, Birb doesn't rely on any pre-made HTTP library - it features its own HTTP implementation instead, optimized for this exact workload.
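As a rough illustration of what pipelining over a raw socket looks like - again, a simplified sketch rather than Birb's actual implementation - the snippet below writes several GET requests back-to-back on one connection before reading any response. The host, the paths and the use of plaintext HTTP on port 80 are placeholders, and a real scanner would parse response framing (Content-Length / chunked encoding) instead of reading until the connection closes:

#include <netdb.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    const char *host = "target.example";                /* placeholder target */
    const char *paths[] = { "/admin", "/api", "/backup" };
    size_t npaths = sizeof(paths) / sizeof(paths[0]);

    struct addrinfo hints = { 0 }, *res;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, "80", &hints, &res) != 0)
        return 1;
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0)
        return 1;
    freeaddrinfo(res);

    /* Write the whole bulk of requests before reading anything back. */
    for (size_t i = 0; i < npaths; i++) {
        char req[512];
        int n = snprintf(req, sizeof(req),
                         "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: %s\r\n\r\n",
                         paths[i], host,
                         i + 1 == npaths ? "close" : "keep-alive");
        if (write(fd, req, (size_t)n) < 0)
            break;
    }

    /* Responses arrive in the same order the requests were sent. */
    char buf[4096];
    ssize_t r;
    while ((r = read(fd, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)r, stdout);

    close(fd);
    return 0;
}

The key point is that all requests go out before any response is read, so the server can keep its pipeline full instead of paying a full round trip for every test.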

Get it

The latest version of Birb can be downloaded here: birb-1.19.2

It requires Linux or WSL (Windows Subsystem for Linux), and pretty much nothing else.

Usage

Birb is run with a set of arguments that control various aspects of the scan, such as the number of processes, the number of requests to pipeline in bulk, and scan filters.

Keys can then be used at runtime (during the scan) to trigger various actions.

Any time a directory is encountered, Birb will descend into it and run the wordlist again inside it. At any point, the "p" key can be used to ignore the current directory and step back up one level, while the "x" key steps back all the way to the root directory. Pressing the "z" key pauses Birb (resume with "Z"). The "c" key pauses Birb and opens an interactive command line. Alternatively, the "e" key cancels the scan session and exits.

Basic

In its most common form, Birb is run with just a wordlist and a target:

$ birb path/to/wordlist.txt https://target.example/

Tuning

Performance can often be improved by tweaking the "-b/--bulk" and "-p/--procs" command-line parameters.

$ birb -b 10 -p 10 path/to/wordlist.txt https://target.example/

Filtering

When a scan produces false positives (uninteresting results), we can turn to Birb's filtering engine, which offers two types of filters:

Content-Length filtering (-l SIZE) can be used to ignore any response whose body is exactly SIZE bytes long.

Regex filtering (-r REGEX) can be used to filter out any response which matches the REGEX regular expression (PCRE format). The expression is applied to the entire response, including HTTP status code, headers and body.

Multiple filters can be combined for best results.

$ birb -r 'HTTP/1.. 502 ' -r 'does not exist' -l 218 path/to/wordlist.txt https://target.example/

Advanced

Below is the complete help text, which can be obtained with the "birb --help" command.

Usage: birb [OPTION...] WORDLIST URL
Birb - Web Discovery Tool

Birb uses a wordlist to interact with HTTP servers. Two modes of operation are
available:

* URL discovery (Normal mode) - Words are used to construct URLs, allowing
Birb to discover directories and endpoints within websites.

* Body-fuzzing - Words are used to alter the request body, while the URL never
changes. To use body-fuzzing, specify a payload containing the fuzz marker
([birb:fuzz]) using the '-d' argument.

The following keys can be used at runtime:
[p] Step back one directory level
[x] Step back to root directory
[c] Open interactive command line
[z] Pause activity
[Z] Resume activity
[e] Abort and exit

The interactive command line offers the following commands:
help - Show available commands
run - Exit command line and resume activity
exit - Terminate activity
show - Display current parameters
verbose [LEVEL] - Set verbose level (no args = display current value)
bulk [REQS] - Set bulk request count (no args = display current value)
delay [SECS] - Set request delay (no args = display current value)
ares [0/1] - Consider all results / don't ignore 404s (no args =
display current value)
head [HEADER] - Set header in request (no args = display headers)
rmhead <HSTART> - Remove any header that starts with HSTART
ext [EXT] - Add extension (no args = display extensions)
rmext <EXT> - Remove extension
regex [REGEX] - Filter out responses where REGEX matches (no args =
display regex filters)
rmrex - Remove regex filter (defined through -r/regex)
clen [LENGTH] - Filter out responses where HTTP content-length is LENGTH
(no args = display clen filters)
rmcl - Remove content-length filter (defined through -l/clen)
frex [REGEX] - Select responses where REGEX matches as valid (no args =
display positive regex selectors)
rmfr - Remove positive regex selector (defined through -F/frex)
drex [REGEX] - Select responses where REGEX matches as directories (no
args = display dir regex selectors)
rmdr - Remove dir regex selector (defined through -D/drex)

-a, --allres Consider all responses (don't ignore 404s)
-b, --bulk=REQS Send out REQS requests before even getting a
response (default: 300)
-c, --cert=CERT_FILE Use CERT_FILE as client certificate (PKCS12)
-d, --body=DATA Add an HTTP body to the request (And enable
body-fuzzing if '[birb:fuzz]' is present)
-D, --drex=REGEX Select responses where REGEX matches as
directories
-e, --ext=EXTENSION Add EXTENSION to list of extensions to be tested
for every word ('.' will be added if absent)
-E, --extfile=EXT_FILE Load every word from EXT_FILE and add to list of
extensions
-f, --path=REGEX Skip any path that matches REGEX
-F, --frex=REGEX Select responses where REGEX matches as valid
-h, --header=HEADER Set HTTP header HEADER in request
-i, --skip=WORD Skip a word from the wordlist
-l, --clen=LENGTH Filter out responses where HTTP content-length is
LENGTH
-m, --method=METH HTTP verb to use (default: GET)
-n, --noenc Don't URL-encode words
-o, --log=LOG_FILE Save log output to LOG_FILE
-p, --procs=PROCS Parallelize scanning using PROCS processes
(default: 1)
-r, --regex=REGEX Filter out responses where REGEX matches
-s, --suffix=SUFFIX Append SUFFIX after every word
-t, --retry=COUNT Try COUNT times before giving up on a test
(Default: 3)
-v, --verbose Produce verbose output
-w, --delay=SECS Wait SECS seconds before each request
-x, --dump=DIR Dump interesting responses to files in DIR
-z, --cpwd=CERT_PASSWORD Use CERT_PASSWORD as the password for the client
certificate (-c)
-?, --help Give this help list
--usage Give a short usage message
-V, --version Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options. Report bugs to <eresse@dooba.io>.