Untracking that directory you forgot to .gitignore 4 years ago

Aug 17, 2025, 5:40 PM
Δ
Aug 18, 2025, 6:00 AM
Jujutsu SiteV2

When I started my website repo, I forgot to git ignore…anything. Which meant I had been tracking .DS_Store files, temporary 11ty build outputs (located in _site), and an entire node_modules directory for ages. At some point I finally fixed that going forwards (and ignored/untracked those files) but that still meant I had megabytes upon megabytes of old file and diff data floating around in the commits.

Since I want to merge my notes and the website code itself into a single repo, I figured I should clean that up before making an even bigger mess.

With Git, that kind of thing is usually within the purview of git filter-repo (a la GitHub's sensitive data removal guide) because messing with commit history is scary forbidden magic.

Jujutsu is pretty chill about jumping around commits willy-nilly and doing nonsense to them, so I figured it would be up to the task of time-traveling file deletion. Unfortunately, I had a hard time finding any obvious or "idiomatic" ways to do it, but I did land on a couple viable approaches.

Safety first! Before going any further, it's probably not a bad idea to record the jj operation ID so you can restore the repo state in case you screw anything up. For example, mine was: 27d99ed5dad8, which I found by doing jj op log and copying the ID of the operation immediately before I started doing anything wild.
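If copying the ID by hand feels error-prone, the same bookkeeping can be scripted. This is only a minimal sketch: the helper names and the default file location are made up for illustration, and it assumes `jj op log` accepts `--no-graph`, `--limit`, and a template with `id.short()`. The file is deliberately kept outside the repo so jj doesn't start tracking it.

```shell
# Hypothetical helpers; the default file location is an arbitrary choice.
SAFE_OP_FILE="${TMPDIR:-/tmp}/jj-safe-op"

# Write the ID of the most recent operation to a file outside the repo.
save_op_id() {
  jj op log --no-graph --limit 1 --template 'id.short() ++ "\n"' > "${1:-$SAFE_OP_FILE}"
}

# Roll the repo back to the operation recorded by save_op_id.
restore_saved_op() {
  jj op restore "$(cat "${1:-$SAFE_OP_FILE}")"
}
```

Run `save_op_id` before touching anything, and `restore_saved_op` if things go sideways.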

Slipping past the gate

Jujutsu tries to keep you from breaking things too bad by marking shared historical commits (e.g. ones from a remote's history) as "immutable" and preventing you from editing them by default. To edit them anyways, you just need to pass a --ignore-immutable flag when you use either jj new or jj edit to go modify stuff.

Step 0: Kill baby Commitler

First things first, we need to go back through history to stop a grave error, i.e. add a proper .gitignore file right before I started accidentally committing thousands of lines of JS dependencies. To figure out where it all went wrong, I ran:

jj log -r 'files(node_modules)'

which told me that Jujutsu change mn was the first change where the node_modules directory (my biggest offender) appeared.
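If that log is long, the revset can probably be narrowed to just the earliest matching changes. A small sketch, assuming the `roots()` revset function (which keeps only the commits in a set that have no ancestors in that set) composes with `files()` the way the docs suggest; `first_changes_touching` is a made-up helper name:

```shell
# Print the short change IDs of the earliest changes that touch a path.
# Takes a plain path (not a glob pattern) as its only argument.
first_changes_touching() {
  jj log --no-graph -r "roots(files(\"$1\"))" -T 'change_id.short() ++ "\n"'
}
```

Calling `first_changes_touching node_modules` should then print only the change(s) where the directory first shows up.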

So all I needed to do was slip in a .gitignore tweak before that point and resolve the resulting conflicts.

Inserting a new change before mn was fairly simple:

# Create a new change *immediately before* the offending commit
jj new --before mn --ignore-immutable
# Tweak .gitignore to ignore `node_modules`, `_site`, etc.
vim .gitignore

That does the trick! Except now there are conflicts in every one of the 80+ changes after this point that need to be resolved.
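For reference, the .gitignore tweak itself doesn't need an editor. A sketch of the same step as a tiny helper (the function name is hypothetical, and the patterns in the usage line are just the three offenders mentioned above):

```shell
# Append each pattern to .gitignore in the current directory,
# skipping any pattern that is already present as an exact line.
add_ignore_patterns() {
  for pattern in "$@"; do
    grep -qxF -- "$pattern" .gitignore 2>/dev/null \
      || printf '%s\n' "$pattern" >> .gitignore
  done
}

# Usage (run from the repo root while editing the new change):
#   add_ignore_patterns .DS_Store _site/ node_modules/
```

Because duplicates are skipped, running it twice leaves the file unchanged.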

Approach 1: Manual untrack

The first way I tried resolving the resulting conflicts was going to each conflict and manually untracking the still-extant files via jj file untrack.

For example, the next change after mn is kyk, which also changed the relevant files and now has a conflict. If we navigate to this change and untrack it, the conflict will be resolved and we will have successfully scrubbed the extra data from this part of history:

# Still editing "immutable" stuff, the flag may or may not be needed.
jj edit kyk --ignore-immutable

# Now we're in `kyk`, which has a conflict from our previous fixes.
# We need to delete the files that shouldn't have been there:
jj file untrack .DS_Store _site 'glob:**/.DS_Store' node_modules

Rather than manually moving to the next conflict by name, we can automatically find the next one (and do the untracking in one line) via jj next with some extra flags to move from conflict to conflict and edit the changes themselves instead of creating empty ones that need to be squashed:

jj next --conflict --edit && jj file untrack .DS_Store _site 'glob:**/.DS_Store' node_modules
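That one-liner could in principle be wrapped in a loop. This is a hedged sketch, not something from the process above: it assumes `jj next --conflict --edit` exits non-zero once there are no conflicted descendants left (worth checking on your jj version), and it will still stop and prompt if the next commit is ambiguous. The function name is made up.

```shell
# Repeatedly hop to the next conflicted change and untrack the given paths.
# Stops when `jj next` fails (assumed to mean "no conflicted descendants left")
# or when an untrack fails.
untrack_through_conflicts() {
  while jj next --conflict --edit; do
    jj file untrack "$@" || return 1
  done
}

# Usage:
#   untrack_through_conflicts .DS_Store _site 'glob:**/.DS_Store' node_modules
```

Doing it one change at a time is slower, but it does leave room to eyeball each change before moving on.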

Approach 2: restore

I noticed that jj restore has this little flag available:

--restore-descendants — Preserve the content (not the diff) when rebasing descendants

That sounded kind of similar to what I was trying to do! I wanted to keep the files deleted, regardless of changes.

It turns out jj restore is also a viable approach, but is still a bit manual (you can't restore multiple revs at once; you have to go one by one).

This command restores the state of the specified files from one change (--from) to another (--into):

jj restore --from <CHANGE 1> --into <CHANGE 2> .DS_Store _site 'glob:**/.DS_Store' node_modules --restore-descendants

In my case, the state is "deleted", because I don't want them to exist. The drawback in my particular case with --restore-descendants is that for all of the files I'm messing with, basically every commit introduces new changes on the files anyways, so it doesn't actually help much. If it had been a set of static files that lay dormant for hundreds of commits (e.g. you accidentally dropped some sensitive data somewhere and just now realized it), this probably would've been more successful at fixing things in one go.

It's worth noting that jj diffedit allegedly does the same thing (and has a --restore-descendants flag, too), but instead of operating on a per-file level it lets you tweak individual sub-file diffs.

Approach 3: Big squash/absorb?

What I really wanted to be able to do was say, "Hey, you know anywhere in the history where file X existed? Modify all of those changes so that they actually don't add/modify file X." untrack and restore don't let you modify more than one commit at a time, which led to a pretty manual process. (And, to be fair, that wasn't the worst thing in the world for me; it let me make sure I wasn't messing anything else up along the way.)

However, squash and absorb do allow you to squash/absorb into multiple revisions. So maybe that's a viable approach? Write up some "remove files X/Y/Z" diff and squash it? I wasn't quite able to get it to work, unfortunately. Maybe I'll try again some other time.

In the end, I just used a mix of the first two approaches and cleaned each commit manually.
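Whichever approach gets used, it seems worth checking afterwards that nothing was missed. A rough sketch with two made-up helpers, assuming the `files()` and `conflicts()` revset functions and the `change_id.short()` / `description.first_line()` template methods behave as documented:

```shell
# List changes whose diff still touches any of the given plain paths
# (globs are not handled by this simple quoting).
changes_still_touching() {
  for path in "$@"; do
    jj log --no-graph -r "files(\"$path\")" \
      -T 'change_id.short() ++ " " ++ description.first_line() ++ "\n"'
  done
}

# List changes that still contain unresolved conflicts.
changes_with_conflicts() {
  jj log --no-graph -r 'conflicts()' -T 'change_id.short() ++ "\n"'
}
```

If both `changes_still_touching node_modules _site` and `changes_with_conflicts` print nothing, the cleanup is presumably complete.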

Cleanup: fixing date metadata

Unfortunately, despite preserving the historical flow of commits throughout this process, all my edits changed the timestamps: now the history makes it look like I wrote the whole repo in the last 2 days.

After a bunch of extra digging, I figured out some tricks to change this metadata, too!

Change/commit metadata

There are actually two kinds of timestamps associated with a Jujutsu change. You can see this in action via the show command, which has dates for both "author" and "committer" that may be different:

% jj show

Commit ID: 1a39541d6444596f43b9f76c082ed92a886ba5c3
Change ID: mnzkntkrpxtzmzuslssxsuumrvztroly
Author   : Jackson Mostoller <me@example.org> (2022-01-26 17:41:21)
Committer: Jackson Mostoller <me@example.org> (2025-08-16 23:28:10)

Handily, in my case it shows us that the "author" timestamp has been preserved even after all the messy history changes, so we just need to update the "committer" timestamp.

Modifying metadata

Author details can be updated via jj describe --reset-author if you set appropriate env vars like JJ_USER or JJ_EMAIL (as mentioned here and hinted in the docs). Not mentioned is that there are a few other env vars that are checked for, including…JJ_TIMESTAMP!

We could manually copy-paste timestamps, but jj's templating syntax can do us one better, because it can directly get us the author timestamp string (cutting out all extra formatting with --no-graph, and outputting as RFC3339 via the .format("%+") method):

% jj log --no-graph --revisions @ --template 'self.author().timestamp().format("%+")'

2022-01-26 17:41:21.000 -07:00

That means we can use the above command to get the author timestamp, temporarily put it into the JJ_TIMESTAMP var, and reset the committer timestamp via jj describe, all in one horrific command:

JJ_TIMESTAMP=$(jj log --no-graph -r@ -T 'self.author().timestamp().format("%+")') jj describe --reset-author --no-edit

Checking the results with jj show, it looks like it correctly changed the commit timestamp:

% jj show

Commit ID: 71e228b133b154b328bf1dfe046fe179f4a7ebab
Change ID: mnzkntkrpxtzmzuslssxsuumrvztroly
Author   : Jackson Mostoller <me@example.org> (2022-01-26 17:41:21)
Committer: Jackson Mostoller <me@example.org> (2022-01-26 17:41:20)

(Not sure about that 1-second difference, but it's close enough for me!)

Now to just do that to, uh, all the commits.

Doing it to all the commits

The workflow for this actually wasn't too bad because jj next has a handy feature where, if the "next" commit is ambiguous, it pops up with an interactive picker to choose your path:

% jj next --edit

ambiguous next commit, choose one to target:
1: tzorrlpo 7a9ab3bc blah blah blah description
2: vqqkqqul 0facf803 lalala description of other change
q: quit the prompt
enter the index of the commit you want to target: <TYPE RESPONSE HERE>

So to blast through updating all the commits, I just kept running this command over and over, picking branching commits as needed:

JJ_TIMESTAMP=$(jj log --no-graph -r@ -T 'self.author().timestamp().format("%+")') jj describe --reset-author --no-edit && jj next --edit
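A non-interactive variant might also work, though it is not what was done above. This sketch assumes `jj describe` accepts a revision as a positional argument, that `jj log --reversed` lists oldest changes first, and that every change ID in the revset is unambiguous; the function name is hypothetical. Going oldest-first matters because rewriting a change also rewrites its descendants, so each change should get its final timestamp after its ancestors do.

```shell
# For every change in the given revset (oldest first), copy the author
# timestamp into the committer timestamp via JJ_TIMESTAMP.
reset_committer_dates() {
  revset="$1"
  for change in $(jj log --no-graph --reversed -r "$revset" -T 'change_id ++ "\n"'); do
    ts="$(jj log --no-graph -r "$change" -T 'self.author().timestamp().format("%+")')"
    JJ_TIMESTAMP="$ts" jj describe "$change" --reset-author --no-edit || return 1
  done
}

# Usage (the revset is whatever range of history was rewritten):
#   reset_committer_dates 'mn::@'
```

Since branches are covered by the revset itself, there is no picker to click through.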

YOLO force-push to main

Force-pushing to main is, of course, the only way to actually manifest all the changes, and the point of doing any of this in the first place.

A quick jj git push after modifying literally all of your "immutable" commits should do the trick of calcifying your historical revisionism. If you have a serious repo with, y'know, protections and stuff, there might be more hoops to jump through here. Good luck!