Spider Module for the webhazer project.
emile c85d9a268d added the TLS-finder 4 years ago

webhazer-spider

Spider Module for the webhazer project.

The module works as follows:

  1. Take the next page from the queue.
  2. Find all HTML links (<a> ... </a>) on that page.
  3. For each link, locate the href attribute and extract its value (the URL).
  4. Submit each URL found to the queue.

This way, every link on a page is appended to the queue, which partially reveals the directory structure behind that page.

To prevent loops, the queue should check whether a URL that is about to be added is already present. If it is, the URL should not be appended.
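
One possible shape for such a queue is sketched below, using only the standard library. The type urlQueue and its methods are hypothetical names, not the project's actual queue. The sketch also assumes that a URL should stay "known" after it has been taken off the queue, since otherwise a page that links back to an already visited page would be crawled again.

```go
package main

import "fmt"

// urlQueue is a hypothetical FIFO queue that refuses duplicates.
// It remembers every URL it has ever accepted, including ones
// that have already been popped (an assumption of this sketch).
type urlQueue struct {
	pending []string
	seen    map[string]struct{}
}

func newURLQueue() *urlQueue {
	return &urlQueue{seen: make(map[string]struct{})}
}

// Push appends url unless it was accepted before.
// It reports whether the URL was actually added.
func (q *urlQueue) Push(url string) bool {
	if _, ok := q.seen[url]; ok {
		return false
	}
	q.seen[url] = struct{}{}
	q.pending = append(q.pending, url)
	return true
}

// Pop removes and returns the oldest pending URL.
// The second return value is false when the queue is empty.
func (q *urlQueue) Pop() (string, bool) {
	if len(q.pending) == 0 {
		return "", false
	}
	url := q.pending[0]
	q.pending = q.pending[1:]
	return url, true
}

func main() {
	q := newURLQueue()
	for _, u := range []string{"/", "/about", "/", "/contact", "/about"} {
		if !q.Push(u) {
			fmt.Println("skipped duplicate:", u)
		}
	}
	for {
		u, ok := q.Pop()
		if !ok {
			break
		}
		fmt.Println("crawl:", u)
	}
}
```

The map gives a constant-time membership check on average, so the duplicate test does not slow down as the queue grows. URLs are compared as plain strings here, so /about and /about/ count as different entries unless they are normalized first.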


Requirements:

$ go get golang.org/x/net/html