Where the Wild Things Are: A Complete Analysis of Jia Tan’s GitHub History and the XZ Utils Software Supply Chain Breach

Hayden Smith • Apr 02, 2024

The following is a story about the recent XZ Utils security breach and how things came about. For more context on the exploit, take a stroll over to here.


What can I say? My mother only read me picture books growing up.


Once upon a time there was a software developer working for a nation-state who was an extremely patient and persistent attacker. They created a GitHub account on January 26th, 2021.

After joining, it appears Jia Tan did a good amount of development in private repositories, working away on projects we can’t see and will likely never know about.



This pattern continues for some time, and then Jia Tan appears to need some time off and disappears in August of 2021. Who knows what happened... they could have lost funding for their operation or gone on PTO, or maybe Jia Tan was a multi-person operation with developers onboarding and offboarding like any other project.


They start to break out of their shell and need to establish bona fides to look like the real deal. So they have to get PRs approved, starting with some really basic stuff.


Targeting Libarchive 2021

What is interesting is that it appears, in our opinion, Libarchive was a legitimate target. It fits the target profile as it is:

  • OSS
  • Widely used by almost all Linux/UNIX systems
  • Written in C
  • Deals with file compression


The only thing outside the targeting profile is that Libarchive has a ton of maintainers. Jia attempts to make multiple commits into master, some of which are unsuccessful. In fact, this is all Jia did in September and October 2021.

The first attempt comes when Jia tries to add a dependency downloader script for APT and yum to Libarchive. This is extremely suspicious and dangerous: as Jia proved in the XZ attack, modifying helper scripts, shell scripts, and gitignore files is what allowed the backdoor to operate. Fortunately, the Libarchive maintainers close the issue and push back significantly.

If you rack up enough of this work and build rapport with your target, they begin to trust you.


This pattern pretty much continues for a year: small PRs for pretty meaningless additions to projects, and then numerous commits in private repositories.


He makes some changes on October 4, 2021.



They also come across as a novice who doesn't seem to understand APT and yum package management, which I found surprising, even if it was 3 years ago. 3 years isn’t enough time to jump from “idk how to yum update” to “inserting a backdoor into a widely used project to cause catastrophic meltdown mode”. Check this out:


After being told off by the Libarchive folks (kudos to them), he focuses only on commits within private repositories. Then, on October 18th, Jia is added to the Tukaani org on GitHub, which is ultimately responsible for the XZ Utils git repository originally maintained by Lasse Collin. Other sources make it clear that Lasse was pressured into giving up primary ownership of XZ. This is not Lasse’s fault. XZ just meets the targeting profile:

  • OSS
  • Widely used by almost all Linux/UNIX systems
  • Written in C
  • Deals with file compression
  • Far fewer maintainers than Libarchive


This is when Jia has complete access to the repo. He is in.

Jia works on some small-ish issues relevant to the XZ Utils repo, removing the old README and creating a new one on December 12, 2022.

Lasse and Jia also continue to work on issues as they arise with certain topics. This does a few things from a social engineering perspective:

1. Establishes some level of bona fides soon after being added to the project

2. Helps build rapport


On February 6, 2022, Jia Tan makes legitimate commits to the XZ project that effectively add arguments to the LZMA and LZMA2 encoders.

These aren’t crazy commits at this point. He is building trust, and slow-rolling it at that, trying not to raise any eyebrows. Trust can be an extremely effective offensive security tool.


You can see him already trying to review and close issues with the help of Lasse. His commentary does a lot to build rapport with other users.
Take a closer look below:

This continues for some time in 2022 with commits peppered throughout. 


Summer 2023 – Game On

Jia appears to take a vacation of sorts because he is dormant for 3 months, but he makes some basic commits before taking off, making himself the primary.


He is able to add himself as a new contributor to oss-fuzz. This is where the wheels start to fall off.


He is laying the groundwork for disabling ifunc by modifying the build.sh script, which nullifies any value oss-fuzz could provide in detecting this attack. This is exactly what happens on July 7, 2023. After that change is merged, Jia has the green light to modify XZ as he sees fit and install his backdoor.
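One way a reviewer might catch a change like this is to flag any diff to a fuzzing build script that switches a feature off. The sketch below is a minimal illustration, not oss-fuzz tooling: the function name `find_disabled_features` and the sample diff are hypothetical, and it assumes the change shows up as an added `--disable-*` or `--without-*` configure flag in a unified diff.

```python
import re

# Matches autoconf-style switches such as --disable-ifunc or --without-foo.
DISABLE_FLAG = re.compile(r"--(?:disable|without)-[A-Za-z0-9][A-Za-z0-9_-]*")


def find_disabled_features(unified_diff: str) -> list[str]:
    """Return the disable/without flags that appear on added lines of a diff."""
    flags = []
    for line in unified_diff.splitlines():
        # Added lines start with a single '+'; '+++' is the file header.
        if line.startswith("+") and not line.startswith("+++"):
            flags.extend(DISABLE_FLAG.findall(line))
    return flags


# Hypothetical diff, written for illustration only.
SAMPLE_DIFF = """\
--- a/build.sh
+++ b/build.sh
@@ -1,3 +1,4 @@
 ./configure \\
   --enable-static \\
+  --disable-ifunc \\
   --disable-shared
"""

print(find_disabled_features(SAMPLE_DIFF))
```

A flag turning up here is not proof of anything malicious. It is only a prompt for a human to ask why a fuzzing build needs a feature turned off.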

On August 24th, he forks squashfs-tools. This is suspicious as it could be used for offensive purposes, specifically if you are targeting embedded systems. It’s unclear whether he saw it as a potential target or as something useful in the next step of his kill chain. He forks the code and makes a ton of adjustments, too many to summarize in a picture book. He makes no commits back to squashfs-tools, but works on it openly in a forked public repo.

While he is busy disabling tools and setting up his software supply chain attack, he gets into a weird argument and shoots down an idea to merge a mirror of XZ under the tukaani-project org, which would centralize things and essentially take down the mirror. Lasse and Jia reject the request from the GitHub user below and actually “thank” the other maintainer of the mirror. It’s interesting because the user requests that the mirror be bumped to the latest 5.6.0 compromised version of XZ.

Through February, he remains fairly dormant until he uploads the build-to-host.m4 script. This is the script that checks whether the system is x86-64 Linux, in addition to checking whether the package manager is Debian or RPM. If you go back to the top of the blog, he started in 2021 wanting to add a bash script to Libarchive specifying these two package managers as well.
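To make that targeting check concrete, here is a minimal sketch a defender could use to see whether a build host falls inside the profile described above. It is not the m4 code itself. The function name and the two packaging signals it uses (a `debian/rules` file and an `RPM_ARCH` environment variable) are assumptions chosen for illustration.

```python
import os
import platform


def matches_described_profile(machine: str, system: str,
                              has_debian_rules: bool, rpm_arch: str) -> bool:
    """True when a build looks like x86-64 Linux under Debian or RPM packaging."""
    is_x86_64_linux = system == "Linux" and machine in ("x86_64", "amd64")
    # Assumed signals: a debian/rules file, or RPM_ARCH set by an RPM build.
    is_deb_or_rpm_build = has_debian_rules or rpm_arch == "x86_64"
    return is_x86_64_linux and is_deb_or_rpm_build


if __name__ == "__main__":
    # Evaluate the current host and working directory.
    print(matches_described_profile(
        machine=platform.machine(),
        system=platform.system(),
        has_debian_rules=os.path.isfile("debian/rules"),
        rpm_arch=os.environ.get("RPM_ARCH", ""),
    ))
```

The point of the sketch is how narrow the gate is: an ordinary developer build on a laptop returns False, which helps explain why the backdoor stayed quiet outside of distribution packaging.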

One week later, he finalizes the backdoor by adding the obfuscated code in the XZ repository. 

This was the final step. As of February 24th, he had a fully operational backdoor. All he needed at this point was to proliferate the backdoor into as many organizations as possible to establish a beachhead for a later date. Lastly, on March 25th, he takes the time to update the security.md file as his very last commit, asking that any vulnerability or security finding against XZ be sent to him privately by email.


On March 29, 2024, the backdoor is found by researcher Andres Freund. A few hours later, GitHub suspends Jia Tan’s account.


Soon after that, the XZ repo was completely locked.



5 TAKEAWAYS

1. The System Worked - The OSS community discovered the attack and then ran its own version of incident response, containing it by identifying every commit and PR by Jia Tan and alerting the other community maintainers.

2. Someone else started the fight, but the OSS community finished it - Kudos to everyone who participated in this effort to secure OSS to help other maintainers that were impacted in the past by Jia Tan. The OSS community didn't back down from a fight even when they were starting in a very disadvantageous position.

3. OSS maintainers are targets, and really good ones, because everyone relies on OSS. Compromising one is a great way for an attacker to seize a huge portion of any target's software attack surface at once.

4. OSS Maintainers need better support from people that have both time and money - not all of these maintainers work for billion dollar companies. They need help. We don't think the answer is just asking even more from the community.

5. This actor puts the A, the P, and the T in APT – this had all of the markings of a group effort. This was not a solo actor. It involved planning, targeting, reconnaissance, funding, and near flawless execution. Even in the commit history, the level of software development experience seems to increase dramatically in a short period of time.
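The containment step in takeaway 1, identifying every commit by a given author, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the process the community actually followed: the helper names are hypothetical, and matching on name or email is only a starting point, since author fields in git are self-reported and trivially changed.

```python
import subprocess

# Tab-separated fields: commit hash, author name, author email, subject.
LOG_FORMAT = "%H%x09%an%x09%ae%x09%s"
FIELDS = ("hash", "name", "email", "subject")


def read_git_log(repo_path: str) -> str:
    """Return the formatted log for all refs of a local clone."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", f"--format={LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_log(log_text: str) -> list[dict]:
    """Turn tab-separated log lines into dictionaries, skipping malformed lines."""
    commits = []
    for line in log_text.splitlines():
        parts = line.split("\t", 3)  # the subject may itself contain tabs
        if len(parts) == 4:
            commits.append(dict(zip(FIELDS, parts)))
    return commits


def commits_by_author(commits: list[dict], needle: str) -> list[dict]:
    """Case-insensitive match of needle against author name or email."""
    needle = needle.lower()
    return [
        c for c in commits
        if needle in c["name"].lower() or needle in c["email"].lower()
    ]
```

With a local clone, `commits_by_author(parse_log(read_git_log("path/to/clone")), "jia tan")` yields the list to review by hand. Keeping the parsing separate from the `git` call makes the filter easy to run against saved logs from several repositories.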


Stay vigilant and continue to hunt the threat at all times.


Happy Hunting!


