DiscordDB: Creating Parasitic File Servers via Upload Abuse

IPM is a security firm focused on sociotechnical security. While "technical security" seeks to identify and remove unintended flaws in the architecture of platforms, "sociotechnical security" is about identifying and removing the incentives for worst-faith users to abuse platform affordances against their explicit intent.

Making this "sociotechnical" distinction brings into the frame many issues that are not typically considered security issues but are becoming existential threats for a range of businesses. Social platforms have misinformation problems due (in part) to fake accounts spreading it; online marketplaces face algorithmic manipulation from sellers jockeying for position; and platforms with weak security around analytics face all sorts of ad and impression-count fraud.


We recently investigated a systemic category of sociotechnical security issue, lax validation of upload affordances, and explored how the lack of strong form validation for uploads creates significant, mitigatable risk. Platforms tend to prefer low-friction interfaces, and tend to give users increasing flexibility in the affordances they provide.

Discord’s in-chat upload feature is overly optimistic in assuming good faith in user behavior, and in doing so exposes a huge flaw. Because files uploaded into a channel are not validated, and because the upload functionality has no verification or restriction scheme beyond requiring an active user session, Discord’s file server can be hijacked for nearly any arbitrary use case.

Discord is far from the only company facing this issue: we have identified, or been notified of, the same problem on many popular online platforms. As a proof of concept, we used Discord's unpublished file-upload APIs to store a copy of GPT-2 on their servers. For verification purposes, we provide the scripts needed to download the model, load it, and confirm that it runs.

The Discord vulnerability, to be fair, seems at least half-intended. On other platforms we’ve investigated, by contrast, we’ve been able to hijack upload functionality purportedly intended only for multimedia. In those cases, form validation should have easily excluded files that do not appear to be images, such as JSON data.
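As a rough illustration of the kind of validation a multimedia-only upload form could apply, the sketch below checks the leading bytes of a payload against a few well-known image signatures. The function names, the set of accepted formats, and the 10 MB default cap are our assumptions for illustration, not any platform's actual implementation. A signature check alone can still be bypassed by appending data after a valid header, so a stricter service would likely decode and re-encode the image as well.

```python
from typing import Optional

# Leading byte signatures ("magic numbers") for a few common image formats.
IMAGE_SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif87a": b"GIF87a",
    "gif89a": b"GIF89a",
}


def sniff_image_type(payload: bytes) -> Optional[str]:
    """Return a format name if the payload starts with a known image signature."""
    for name, signature in IMAGE_SIGNATURES.items():
        if payload.startswith(signature):
            return name
    return None


def accept_upload(payload: bytes, max_bytes: int = 10 * 1024 * 1024) -> bool:
    """Accept only non-empty payloads under the size cap that look like images.

    The size cap is a hypothetical default chosen for this example.
    """
    if not payload or len(payload) > max_bytes:
        return False
    return sniff_image_type(payload) is not None
```

With a check like this in place, a JSON document such as `b'{"key": "value"}'` is rejected because it matches none of the signatures, while a payload that begins with the PNG signature passes the first gate.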

On Discord, the upload affordance at least appears to have been intended for uploading any file within a ≈10 MB limit, ostensibly for easy sharing between users within a channel. This upload affordance is not meaningfully rate limited, however, so we can send gigabytes of data through a channel in a short time frame simply by splitting a large file into smaller sub-files. Because uploads are not individually signed and verified by Discord’s servers, the upload process is easily automated: all that is required is to copy a request that was successfully sent from the browser and change the payload to be sent.
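One possible mitigation for the missing rate limit is a per-user byte budget. The sketch below is a minimal token bucket measured in bytes. The class name, capacity, and refill rate are hypothetical values chosen for illustration, and a production service would need to persist this state per user rather than hold it in memory.

```python
import time
from typing import Optional


class UploadBudget:
    """Token bucket measured in bytes.

    The bucket starts full at `capacity` bytes and refills at `rate`
    bytes per second, never exceeding `capacity`.
    """

    def __init__(self, capacity: float, rate: float, now: Optional[float] = None):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, size: int, now: Optional[float] = None) -> bool:
        """Return True and spend `size` bytes if the budget covers the upload."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```

With an illustrative budget of 25 MB that refills at 1 MB per second, two back-to-back 10 MB uploads are allowed and a third is refused until enough time has passed. Splitting a large file into sub-limit pieces no longer helps, because the budget counts total bytes rather than individual files.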


Below we provide our proof-of-concept script, which downloads GPT-2 via Discord. We also provide this script as a GitHub gist. The script starts from a URL that points to a “manifest” JSON file. The manifest contains, first, a note from IPM in which we claim ownership of the manifest and, second, a listing of all the files required for re-download, in the order in which they must be stitched back together. Once all files are downloaded, they are decoded, reconstituted, and loaded in aitextgen.

The way this script is written makes it clear that an upload of any scale or complexity could use a similar scheme: the file is encoded, split by byte count, and uploaded in parts, and then a final file is uploaded that describes the upload. With a bit more reworking, a general-purpose upload/download module could be productionized to fully realize the parasitic file store attack we imagine.
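The encode, split, and describe steps of that scheme can be sketched entirely offline, with no network calls. The manifest layout below (`name`, `sha256`, `parts`) and the part-naming pattern are hypothetical and differ from the manifest that IPM's script consumes. The SHA-256 digest is our addition, included so that a reassembled file can be checked against the original.

```python
import base64
import hashlib
from typing import Dict, List


def split_encoded(data: bytes, chunk_size: int) -> List[bytes]:
    """Base64-encode `data`, then cut the encoded text into pieces of at most chunk_size bytes."""
    encoded = base64.b64encode(data)
    return [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]


def build_manifest(name: str, data: bytes, chunks: List[bytes]) -> Dict:
    """Describe the chunks in reassembly order, plus a digest of the original bytes."""
    return {
        "name": name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "parts": [
            # Zero-padded indices keep lexical order equal to reassembly order.
            {"filename": f"{name}.part{index:05d}", "size": len(chunk)}
            for index, chunk in enumerate(chunks)
        ],
    }


def reassemble(chunks: List[bytes], manifest: Dict) -> bytes:
    """Join chunks (assumed to be in manifest order), decode, and verify the digest."""
    data = base64.b64decode(b"".join(chunks))
    if hashlib.sha256(data).hexdigest() != manifest["sha256"]:
        raise ValueError("reassembled data does not match the manifest digest")
    return data
```

Because the file is encoded once and then split, the chunks concatenate back into a single valid base64 string regardless of where the cuts fall. This sketch is independent of the proof-of-concept script, which relies on IPM's own manifest format.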


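import json
import pathlib
import subprocess

import requests
from aitextgen import aitextgen

# Manifest published by IPM; it lists every chunk needed to rebuild the model.
manifest_url = "https://cdn.discordapp.com/attachments/810918431142576201/810936686250557511/manifest.json"
download_dir = pathlib.Path("download")
download_dir.mkdir(parents=True, exist_ok=True)

USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:85.0) Gecko/20100101 Firefox/85.0"


def request(url, user_agent=USER_AGENT):
    """Fetch a URL as text, presenting a browser-like user agent."""
    response = requests.get(url, headers={"user-agent": user_agent})
    response.raise_for_status()  # fail loudly instead of saving an error page
    return response.text


# The payload is decoded twice, which implies the manifest was stored as a JSON-encoded string.
manifest = json.loads(json.loads(request(manifest_url)))

# Each entry holds the upload response for one base64-encoded chunk.
for file in manifest["urls_to_download"]:
    content = request(file["response"]["attachments"][0]["url"])
    (download_dir / file["filename"]).write_text(content)

# Stitch the chunks back together in glob order, then decode them.
commands = [
    f"cat {download_dir}/base64_gpt2_split* > base64_gpt2_unsplit.bin",
    "base64 -d base64_gpt2_unsplit.bin > pytorch_rebuilt.bin",
    f"base64 -d {download_dir}/config_base64.json > config.json",
]
for command in commands:
    subprocess.run(command, shell=True, check=True)

# Load the rebuilt weights and generate a sample to confirm the model runs.
ai = aitextgen(model="pytorch_rebuilt.bin", config="config.json")
ai.generate()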
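The reassembly step in the script shells out to `cat` and `base64`, which assumes a Unix-like environment. A portable alternative, sketched below, does the same work in Python. The function name and its parameters are ours, and it assumes the part files sort correctly by name, as the shell glob in the script also assumes. It holds the whole encoded file in memory, which may be impractical for very large models.

```python
import base64
from pathlib import Path


def rebuild_from_parts(download_dir: str, prefix: str, output_path: str) -> int:
    """Concatenate base64 part files in sorted name order, decode, and write the result.

    Returns the number of decoded bytes written to `output_path`.
    """
    parts = sorted(Path(download_dir).glob(prefix + "*"))
    if not parts:
        raise FileNotFoundError(f"no files matching {prefix}* in {download_dir}")
    # Joining before decoding mirrors `cat parts* | base64 -d`.
    encoded = b"".join(part.read_bytes() for part in parts)
    decoded = base64.b64decode(encoded)
    Path(output_path).write_bytes(decoded)
    return len(decoded)
```

Under those assumptions, calling `rebuild_from_parts("download", "base64_gpt2_split", "pytorch_rebuilt.bin")` would stand in for the first two shell commands.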



