python

Feb 17, 2025 ∞

We wrote a pretty powerful AWS infra tool in Python, and haven’t done much with it in a couple of years except to update its dependencies.

I’m kinda thinking about porting it to Rust to get some extra type constraints (it’s already mypy-clean, but…) and make it a single executable deployment.

Oct 7, 2024 ∞

Python 3.13 launched today. I’ve done it. I’ve lived long enough to see a less-GIL’ed Python released to the public. Until now there’s been an unvirtuous cycle:

Python isn’t good at running CPU-intensive threaded code. → No one writes code like that. → There was no pressure to remove the GIL because no one writes code that would benefit from it. → Repeat.

I hope this is the first giant step toward good Python multithreading.
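
For anyone who wants to poke at this, here’s a minimal sketch of the kind of CPU-bound threaded code I mean. The function names (count_primes, gil_enabled, count_in_threads) are mine and purely illustrative. Keep in mind that the free-threaded build in 3.13 is optional and experimental, so on a default build this still runs one thread at a time; I’m deliberately not claiming any particular speedup.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Deliberately CPU-bound: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def gil_enabled():
    """Report whether the GIL is active in this interpreter."""
    # sys._is_gil_enabled() exists from Python 3.13 on; older versions
    # always run with the GIL, so fall back to True there.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else check()


def count_in_threads(limits, workers=4):
    """Run count_primes for each limit on a thread pool, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    print("GIL enabled:", gil_enabled())
    print(count_in_threads([200_000] * 4))
```

Time that script on a regular build and on a free-threaded one and compare. The results are the same either way; only the wall-clock time should differ.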

May 14, 2024 ∞

Python 3.13 is removing more Amiga “dead batteries” modules, like chunk:

The chunk module provides support for reading and writing Electronic Arts’ Interchange File Format. IFF is an old audio file format originally introduced for Commodore and Amiga. The format is no longer relevant.

I’m sure that’s the right thing to do. It still saddens me.

Apr 16, 2024 ∞

Palo Alto's exploited Python code

watchTowr Labs has a nice blog post dissecting CVE-2024-3400. It’s very readable. Go check it out.

The awfulness of Palo Alto’s Python code in this snippet stood out to me:

def some_function():
    ...
    if source_ip_str is not None and source_ip_str != "": 
        curl_cmd = "/usr/bin/curl -v -H \"Content-Type: application/octet-stream\" -X PUT \"%s\" --data-binary @%s --capath %s --interface %s" \
                     %(signedUrl, fname, capath, source_ip_str)
    else:
        curl_cmd = "/usr/bin/curl -v -H \"Content-Type: application/octet-stream\" -X PUT \"%s\" --data-binary @%s --capath %s" \
                     %(signedUrl, fname, capath)
    if dbg:
        logger.info("S2: XFILE: send_file: curl cmd: '%s'" %curl_cmd)
    stat, rsp, err, pid = pansys(curl_cmd, shell=True, timeout=250)
    ...

def dosys(self, command, close_fds=True, shell=False, timeout=30, first_wait=None):
    """call shell-command and either return its output or kill it
       if it doesn't normally exit within timeout seconds"""

    # Define dosys specific constants here
    PANSYS_POST_SIGKILL_RETRY_COUNT = 5

    # how long to pause between poll-readline-readline cycles
    PANSYS_DOSYS_PAUSE = 0.1

    # Use first_wait if time to complete is lengthy and can be estimated 
    if first_wait == None:
        first_wait = PANSYS_DOSYS_PAUSE

    # restrict the maximum possible dosys timeout
    PANSYS_DOSYS_MAX_TIMEOUT = 23 * 60 * 60
    # Can support upto 2GB per stream
    out = StringIO()
    err = StringIO()

    try:
        if shell:
            cmd = command
        else:
            cmd = command.split()
    except AttributeError: cmd = command

    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=1, shell=shell,
             stderr=subprocess.PIPE, close_fds=close_fds, universal_newlines=True)
    timer = pansys_timer(timeout, PANSYS_DOSYS_MAX_TIMEOUT)

It uses string building to create a curl command line. Then it passes that command line down into a function that calls subprocess.Popen(cmd_line, shell=True), which means the shell gets to interpret any metacharacters that end up in the interpolated values. What? No! Don’t ever do that!
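
Here’s a rough sketch of what I’d expect instead: build the command as a list of arguments and never involve a shell. The helper names (build_curl_argv, upload) are hypothetical, and I’m assuming the same four inputs as the snippet above. This isn’t Palo Alto’s fix, just the standard way to call a subprocess.

```python
import subprocess


def build_curl_argv(signed_url, fname, capath, source_ip=None):
    """Build the curl command as an argument list, one value per element."""
    argv = [
        "/usr/bin/curl", "-v",
        "-H", "Content-Type: application/octet-stream",
        "-X", "PUT", signed_url,
        "--data-binary", f"@{fname}",
        "--capath", capath,
    ]
    # Same condition as the original: skip --interface for None or "".
    if source_ip:
        argv += ["--interface", source_ip]
    return argv


def upload(signed_url, fname, capath, source_ip=None, timeout=250):
    """Run curl directly (no shell), so metacharacters are never interpreted."""
    return subprocess.run(
        build_curl_argv(signed_url, fname, capath, source_ip),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # the caller inspects returncode itself
    )
```

With a list and the default shell=False, a value like "x; rm -rf /" reaches curl as one literal argument. You’d still want to validate the inputs, since a value that starts with a dash could be read by curl as an option.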

I fed that code into the open source bandit static analyzer. It reported two findings, one of them high severity and high confidence:

ᐅ bandit pan.py
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.12.1
Run started:2024-04-16 17:14:52.240258

Test results:
>> Issue: [B604:any_other_function_with_shell_equals_true] Function call with shell=True parameter identified, possible security issue.
   Severity: Medium   Confidence: Low
   CWE: CWE-78 (https://cwe.mitre.org/data/definitions/78.html)
   More Info: https://bandit.readthedocs.io/en/1.7.8/plugins/b604_any_other_function_with_shell_equals_true.html
   Location: ./pan.py:14:26
13              logger.info("S2: XFILE: send_file: curl cmd: '%s'" % curl_cmd)
14          stat, rsp, err, pid = pansys(curl_cmd, shell=True, timeout=250)
15

--------------------------------------------------
>> Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with shell=True identified, security issue.
   Severity: High   Confidence: High
   CWE: CWE-78 (https://cwe.mitre.org/data/definitions/78.html)
   More Info: https://bandit.readthedocs.io/en/1.7.8/plugins/b602_subprocess_popen_with_shell_equals_true.html
   Location: ./pan.py:49:8
48              bufsize=1,
49              shell=shell,
50              stderr=subprocess.PIPE,
51              close_fds=close_fds,
52              universal_newlines=True,
53          )
54          timer = pansys_timer(timeout, PANSYS_DOSYS_MAX_TIMEOUT)

--------------------------------------------------

Code scanned:
        Total lines of code: 41
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0
                Low: 0
                Medium: 1
                High: 1
        Total issues (by confidence):
                Undefined: 0
                Low: 1
                Medium: 0
                High: 1
Files skipped (0):

From that we can infer that Palo Alto does not use effective static analysis on their Python code. If they did, this code would not have made it to production.
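
To show how low the bar is, here’s a toy check built on Python’s standard ast module that flags calls passing a literal shell=True. It’s my own illustration (find_shell_true is a made-up name), not how bandit is implemented, and it’s far less thorough: it only catches the literal at the call site, so it would miss a pass-through like shell=shell.

```python
import ast


def find_shell_true(source, filename="<string>"):
    """Return (line, column) for every call that passes the literal shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # Only the literal True counts; variables are ignored here.
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append((node.lineno, node.col_offset))
    return hits


if __name__ == "__main__":
    sample = "pansys(curl_cmd, shell=True, timeout=250)\n"
    for line, col in find_shell_true(sample):
        print(f"line {line}, col {col}: call with shell=True")
```

Even something this crude would have flagged the pansys(curl_cmd, shell=True, ...) call.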