xfuzz GitHub repository: https://github.com/kernelmethod/xfuzz
For your first programming assignment, you will design a web fuzzer similar to ffuf, which featured heavily in Labs 2 and 3. This program will be written in Python, and will primarily focus on using the aiohttp library to perform fuzzing as quickly as possible.
Getting started
For this assignment, you will be developing a tool called xfuzz. The behavior of xfuzz is quite similar to ffuf and wfuzz, in that it takes a wordlist, a URL, and one or more parameters, and replaces all occurrences of the word FUZZ with terms from the wordlist.
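For instance, the core substitution step can be sketched in a few lines (a minimal illustration only; the function name and URL are made up for this example, and the real tool must also substitute into headers, request bodies, and other parameters):

```python
def substitute(template: str, word: str) -> str:
    """Replace every occurrence of FUZZ with a wordlist entry."""
    return template.replace("FUZZ", word)

for word in ["admin", "login", "secret"]:
    print(substitute("http://example.org/enum/FUZZ", word))
# → http://example.org/enum/admin
# → http://example.org/enum/login
# → http://example.org/enum/secret
```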
I have already created an xfuzz GitHub repository with some skeleton code you can use to start creating your tool. You should start by following the installation instructions to download the xfuzz code and install all of the Python packages you will need to run it and to run tests for it.
Once you’ve done that, you can take a look at the assignment instructions for details on what you’ll need to do for this assignment. In brief: you will need to build out xfuzz (starting from the fuzz() function in xfuzz/fuzz.py) to include all of the features that you see when you run

python3 -m xfuzz --help
In addition, to get 100% on the assignment, you will need to make xfuzz as fast as possible.
Testing
To ensure that your implementation is correct, you should test xfuzz using the interactive server as well as pytest, as described in the README. In addition, I have stood up a separate version of the test server on http://cs3710.kerneltrick.org. Once you’ve started working on xfuzz, you should try running it against the server, e.g.

python3 -m xfuzz -u http://cs3710.kerneltrick.org/enum/FUZZ \
    -w test/wordlists/common.txt
Grading
This assignment is scored out of 9 points, broken down as follows:
Correct implementation (7 points)
You will receive 7 points for having a correct implementation of xfuzz. We will determine this using a test suite based on PyTest. Note that this test suite primarily checks that xfuzz fuzzes the correct URLs. If your fuzzer does not generate correct output (see the expected program output), we may take off additional points.
If you wish to try running the test suite yourself, see the instructions for using PyTest in the repository README.
To get all seven points, xfuzz does not need to be especially fast. However, to ensure that we can grade your assignments in a timely manner, the test suite will automatically stop running after 15 minutes. Any tests still running after that time will automatically fail.
Fast implementation (2 points)
You will get two additional points for having a reasonably fast implementation of xfuzz. Our primary criterion will be to run
python3 -m xfuzz -w test/wordlists/common.txt \
-H 'Content-Type: application/json' \
-X POST -mc 200 -d '{"username": "admin", "password": "FUZZ"}' \
-u http://cs3710.kerneltrick.org/auth/login
against the live server running on http://cs3710.kerneltrick.org. You will get one point if you can do a full scan of the server with this command in < 2 minutes, and two points if you can do it in < 1 minute.
Note: for grading purposes we will run these tests locally with 200ms simulated latency to ensure consistency. The server on http://cs3710.kerneltrick.org has been configured so that no matter what your internet connection is like, xfuzz will run strictly faster during our grading than it does during your tests against this machine.
Hints
Windows users
Some of the commands provided in the assignment description don’t quite work the same way in the Windows command prompt. If you are having difficulties, my first suggestion would be to run the commands in PowerShell (which should be pre-installed on your machine), or to install WSL (Windows Subsystem for Linux) and run them there.
Otherwise, here are some fixes you can make to the provided commands to get them to work correctly on your machine:
- Replace the / character in paths to files with a backslash \.
- Replace calls to python and python3 with python.exe, e.g.: python.exe -m xfuzz --help
- Replace single quotation marks ' with double quotation marks ". In addition, you should escape double quotation marks that are inside of commands.
With these changes, the command
python3 -m xfuzz -w test/wordlists/common.txt \
-H 'Content-Type: application/json' \
-X POST -mc 200 -d '{"username": "admin", "password": "FUZZ"}' \
-u http://cs3710.kerneltrick.org/auth/login
would become the following:
python.exe -m xfuzz -w test\wordlists\common.txt -u http://cs3710.kerneltrick.org/auth/login -H "Content-Type: application/json" -X POST -mc 200 -d "{\"username\": \"admin\", \"password\": \"FUZZ\"}"

(Note that the command is written on a single line here, since the Unix-style \ line continuation doesn’t work in the Windows command prompt or PowerShell.)
Writing a fast fuzzer
Before you focus on optimization, you should ensure that your fuzzer implementation is correct and that it implements all of the features that you need.
Once you’re ready to start focusing on making your fuzzer fast, there are multiple routes you can take. Your fuzzer’s biggest bottleneck is in waiting for HTTP requests to complete; your CPU might be waiting hundreds of milliseconds for an HTTP request to finish, which is a lifetime from the CPU’s perspective (which usually operates on the order of nanoseconds). Therefore, you’ll want to find a way to run multiple HTTP requests concurrently.
To this end, I suggest using Python’s asynchronous I/O features to their full extent. Recall that aiohttp uses Python’s async / await keywords so that you can run multiple HTTP requests concurrently, for instance:
import aiohttp
import asyncio

async def main():
    tasks = []
    urls = [
        "http://www.example.org/a",
        "http://www.example.org/b",
        "http://www.example.org/c",
    ]
    async with aiohttp.ClientSession() as sess:
        # Start all three requests without waiting for any of them
        for u in urls:
            task = asyncio.create_task(sess.request("GET", u))
            tasks.append(task)
        # Now wait for all of them to complete
        responses = await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(main())
In this example, we made HTTP requests to three different URLs concurrently and then waited for them all to complete with asyncio.gather. Now, instead of waiting for the three requests to complete one after another, we just wait for them all to complete at the same time.
I recommend structuring your program so that it consists of a “job scheduler”, which identifies work that needs to be done (in this assignment, HTTP requests) and “workers”, which perform that work. The scheduler puts jobs onto a queue, while workers take jobs off the queue and run them concurrently.
To implement this in Python, you can use asyncio.Queue to create the work queue and asyncio.create_task to construct the scheduler and workers. Here is some rough pseudocode for what that might look like (the documentation for asyncio.Queue includes another example):
import asyncio

N_WORKERS = 10  # Number of concurrent workers; tune to taste

async def fuzz(args):
    # Perform some pre-processing here with input arguments, e.g.
    # building one job per wordlist entry...
    jobs = ...
    queue = asyncio.Queue()
    tasks = []

    # Create a scheduler task to queue up jobs
    s = asyncio.create_task(scheduler(queue, jobs))
    tasks.append(s)

    # Create workers to consume jobs
    for _ in range(N_WORKERS):
        w = asyncio.create_task(start_worker(queue))
        tasks.append(w)

    # Wait for the scheduler and the workers to finish
    await asyncio.gather(*tasks)

async def scheduler(queue, jobs):
    # Put jobs onto the queue so that workers can execute them
    for job in jobs:
        await queue.put(job)

    # Put None onto the queue once for each worker so that they know
    # there isn't any more work to do
    for _ in range(N_WORKERS):
        await queue.put(None)

async def start_worker(queue):
    while True:
        # Get some new work off the queue
        job = await queue.get()
        try:
            # If the job is `None`, there's no more work to do, so the
            # worker can exit
            if job is None:
                break
            do_work_for_job(job)
        finally:
            # Mark the job as being completed
            queue.task_done()
One important note: if you choose this method, you do not want to create a new worker task for every single HTTP request you make. Your machine will spawn thousands of HTTP requests almost instantaneously, and your program will (probably) crash from resource exhaustion.
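Another way to sketch the same idea is to bound the number of in-flight requests with asyncio.Semaphore instead of a fixed worker pool. In the illustration below, each request is simulated with asyncio.sleep (in xfuzz it would be an aiohttp call), and the limit of 10 is an arbitrary choice:

```python
import asyncio

async def fetch(sem, word):
    # Acquire the semaphore before starting the (simulated) request,
    # so that at most `limit` requests are ever in flight at once.
    async with sem:
        await asyncio.sleep(0.01)  # stand-in for an HTTP request
        return word

async def run(words, limit=10):
    sem = asyncio.Semaphore(limit)
    # gather() preserves the order of its arguments, so the results
    # come back in wordlist order even though requests overlap.
    return await asyncio.gather(*(fetch(sem, w) for w in words))

results = asyncio.run(run(["admin", "login", "test"]))
print(results)  # → ['admin', 'login', 'test']
```

One caveat: this still creates one asyncio task per word up front. Tasks are much cheaper than open sockets, so the semaphore avoids the resource-exhaustion problem above, but for very large wordlists the queue-and-workers structure keeps memory usage flatter.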