Deleting history of a file in AWS CodeCommit
29 Nov 2021
Table of Contents
- Installed git-filter-repo
- Back up GitConfig files
- Clean up my repository
- Unsuccessfully ran git filter-repo
- Deleted and re-cloned my repository from AWS
- Unsuccessfully reran git filter-repo
- Gitignore
- Reran git-filter-repo
- Took notes about CodeCommit web console before pushing
- Force-pushed from my desktop to CodeCommit
- Inspected CodeCommit web console after pushing
- Re-cloned and examined SourceTree
Has anyone here ever tried to completely erase a file that contained a password from AWS CodeCommit? I got stuck using git-filter-repo – things seemed to work on my computer, but not in the cloud.
(In my case, a completely new file was created that did the same job without a password, so I simply needed to make it look like the file had never existed.)
I followed these directions, but while I can no longer see any trace of the file’s existence when using SourceTree to peruse my local repo’s commit history, it’s still in the commit history displayed at console.aws.amazon.com after force-pushing my local repo to CodeCommit.
(In the web console, the “most recent” version of the file shows as not found, but the file is still present in the console’s commit history, whereas SourceTree shows no trace of it in my local repo.)
Below are the steps I took.
Installed git-filter-repo
- From `c:\miscsoftware\git-filter-repo\`, ran: `git clone https://github.com/newren/git-filter-repo#simple-example-with-comparisons .`
- From a random directory, ran: `git filter-repo --version`
  - Got a reply of: `git: 'filter-repo' is not a git command. See 'git --help'.`
- Added `c:\miscsoftware\git-filter-repo\` to my username’s `PATH` environment variable.
- From a random directory, ran: `git filter-repo --version`
  - Got a number along the lines of `1234abcd5678` as a reply.
Looks like I successfully installed Elijah Newren’s git-filter-repo.
Back up GitConfig files
- Backed up the contents of my repository’s “local” Git Config file, `c:\my_repo\.git\config`, into a plaintext file on my desktop.
- Just to be safe, also backed up the contents of my system’s “global” Git Config file, `c:\users\my_username\.gitconfig`, into a plaintext file on my desktop.
- Just to be safe, also backed up the contents of my system’s “portable” Git Config file, `c:\ProgramData\Git\config`.
- There was no file named anything like `gitconfig` at `C:\Program Files\Git\mingw64\etc`, so I guess my system doesn’t have a “system” Git Config file.
Clean up my repository
- Ran a fresh `git pull` to download a copy of my Git repository from AWS CodeCommit into my local computer’s copy of my repository.
- Got rid of all files I was working on in `c:\my_repo\`, moving them over to my desktop, so that SourceTree didn’t show anything in the “Unstaged files” panel.
Unsuccessfully ran git filter-repo
- In my Git Bash software, from `MINGW64:/c/my_repo`, ran: `git filter-repo --invert-paths --path /c/my_repo/sub_folder/bad_script.java`
Or maybe it was: `git filter-repo --invert-paths --path sub_folder/bad_script.java`
I forgot.
Sadly, this errored out and I didn’t copy the error message down. To summarize, git-filter-repo complained that my repository wasn’t “freshly cloned” enough.
Deleted and re-cloned my repository from AWS
- Deleted the contents of `c:\my_repo\`
- From `MINGW64:/c/my_repo`, ran: `git clone https://git-codecommit.my-region.amazonaws.com/v1/repos/my_repo .`
- Replaced the contents of `c:\my_repo\.git\config` with the backup I saved earlier.
Unsuccessfully reran git filter-repo
- In my Git Bash software, from `MINGW64:/c/my_repo`, ran: `git filter-repo --invert-paths --path /c/my_repo/sub_folder/bad_script.java`
  - Results something along the lines of:
      Parsed 5 commits
      HEAD is now at 1234ab testing git
      Enumerating objects: 780, done.
      Counting objects: 100% (780/780), done.
      Delta compression using up to 8 threads
      Compressing objects: 100% (195/195), done.
      Writing objects: 100% (780/780), done.
      Total 780 (delta 560), reused 780 (delta 560), pack-reused 0
      New history written in 2.8 seconds; now repacking/cleaning...
      Repacking your repo and cleaning out old unneeded objects
      Completely finished after 8.4 seconds.
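In hindsight, the absolute path is a likely reason this run changed nothing: as far as I can tell, `--path` is matched against paths relative to the repository root, so `/c/my_repo/sub_folder/bad_script.java` matches no file and `--invert-paths` then has nothing to drop. A minimal sketch for checking the spelling Git itself uses, run in a throwaway repository (the temp directory, the demo identity, and the file contents are invented for illustration):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository standing in for c:\my_repo (hypothetical demo data).
demo_dir="$(mktemp -d)"
git init -q "$demo_dir"
cd "$demo_dir"
mkdir sub_folder
echo 'password = "hunter2"' > sub_folder/bad_script.java
git add sub_folder/bad_script.java
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "add bad script"

# List the file exactly as Git records it: relative to the repository root.
tracked_path="$(git ls-files -- sub_folder/bad_script.java)"
echo "path as Git sees it: $tracked_path"

# Count the commits on any branch that touch that path.
commit_count="$(git log --all --oneline -- "$tracked_path" | wc -l | tr -d ' ')"
echo "commits touching it: $commit_count"

# That relative spelling is the one to hand to filter-repo (not run here):
#   git filter-repo --invert-paths --path "$tracked_path"
```

Running the same `git log --all --oneline -- <path>` check after filter-repo finishes is a quick way to confirm the rewrite actually removed something: it should print no commits.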
Gitignore
Not yet realizing things had failed…
- Created a `c:\my_repo\.gitignore` file (there wasn’t one) and put `sub_folder/bad_script.java` into it.
- Opened my repo in SourceTree. The `.gitignore` file is in “Unstaged files” and `bad_script.java` isn’t, despite still existing at `c:\my_repo\sub_folder\bad_script.java`, which seems fine.
  - Except that if I change the contents of the gitignore to add an `x` to the end of the ignored filename, `bad_script.java` still doesn’t appear in “unstaged files.” Weird.
- Went to SourceTree history view, and in the commit where `bad_script.java` was first added to the repository, sadly, `sub_folder/bad_script.java` was still in the “added files” list, and I could see the password in the preview of the file’s contents.
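The “Weird” part above is probably ordinary Git behavior: `.gitignore` only affects untracked files, and at this point `bad_script.java` was still tracked, so an unmodified tracked file never shows up as unstaged no matter what the ignore file says. A small sketch in a throwaway repository (the temp directory, the demo identity, and the file contents are invented for the demo):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository with the file already committed (hypothetical demo data).
demo_dir="$(mktemp -d)"
git init -q "$demo_dir"
cd "$demo_dir"
mkdir sub_folder
echo 'password = "hunter2"' > sub_folder/bad_script.java
git add sub_folder/bad_script.java
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "add bad script"

# Ignore the file after the fact, as in the step above.
echo 'sub_folder/bad_script.java' > .gitignore

# The file is still tracked: ls-files lists it despite the ignore rule.
still_tracked="$(git ls-files -- sub_folder/bad_script.java)"

# Status reports only the new .gitignore; the tracked, unmodified file is silent.
status_before="$(git status --porcelain)"

# Break the ignore rule by appending an x; the status output does not change.
echo 'sub_folder/bad_script.javax' > .gitignore
status_after="$(git status --porcelain)"

echo "tracked:       $still_tracked"
echo "status before: $status_before"
echo "status after:  $status_after"
```

So the `.gitignore` was neither helping nor hurting here; it would only matter for a file Git is not already tracking.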
Reran git-filter-repo
- In my Git Bash software, from `MINGW64:/c/my_repo`, ran: `git filter-repo --invert-paths --path sub_folder/bad_script.java`
  - Results something along the lines of:
      Parsed 80 commits
      HEAD is now at f62b9d5 testing git
      Enumerating objects: 784, done.
      Counting objects: 100% (784/784), done.
      Delta compression using up to 8 threads
      Compressing objects: 100% (196/196), done.
      Writing objects: 100% (784/784), done.
      Total 784 (delta 572), reused 782 (delta 571), pack-reused 0
      New history written in 0.2 seconds; now repacking/cleaning...
      Repacking your repo and cleaning out old unneeded objects
      Completely finished after 1.5 seconds.
- Back in SourceTree, I clicked to another commit in this history, then clicked to the one I was hoping to see my file disappear from. I scrolled down the “added files” list to the appropriate place in the alphabet and noticed `sub_folder/bad_script.java` was no longer in the list. Great!
- Furthermore, there was no longer a `c:\my_repo\sub_folder\bad_script.java` file on my hard drive.
Took notes about CodeCommit web console before pushing
- I visited `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/browse/refs/heads/main/--/sub_folder/bad_script.java?#` and validated that, in the cloud, before I had pushed anything from my desktop, `bad_script.java` still existed, password and all in its body.
- I also found `bad_script.java` at `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/commit/1234567890987654321?region=my-region` in the “go to file” suggestions, and I could see it in the file list, and see its contents, password and all, if I clicked on that suggestion.
- Clicking that commit’s “browse” took me to `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/browse/1234567890987654321/--/sub_folder?region=my-region`, where I could see the contents of the file, password and all.
- Clicking “browse” through the file tree for a later commit, I was able to see the contents of the file at `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/browse/0987654321234567890/--/sub_folder?region=my-region`, password and all.
Force-pushed from my desktop to CodeCommit
- With my “gitignore” still languishing in “unstaged files,” I skipped over the part of the directions that said to commit it to the repo. Maybe that was a mistake, but I didn’t figure I needed it, since the file had been deleted from my hard drive. Locally, I hit “push” in SourceTree.
- From `MINGW64:/c/my_repo`, ran: `git push origin --force --all`
  - Results:
      fatal: 'origin' does not appear to be a git repository
      fatal: Could not read from remote repository.
      Please make sure you have the correct access rights
      and the repository exists.
  - Oops. That’s right, the directions for using git-filter-repo warned me that my “remote” was going to disappear from `c:\my_repo\.git\config`.
- Replaced the contents of `c:\my_repo\.git\config` with the backup I saved earlier.
- From `MINGW64:/c/my_repo`, ran: `git push origin --force --all`
  - Results something along the lines of:
      Enumerating objects: 784, done.
      Counting objects: 100% (784/784), done.
      Delta compression using up to 8 threads
      Compressing objects: 100% (196/196), done.
      Writing objects: 100% (784/784), 3.9 MiB | 1.3 MiB/s, done.
      Total 784 (delta 572), reused 784 (delta 572), pack-reused 0
      remote: processing ...
      To https://git-codecommit.my-region.amazonaws.com/v1/repos/my_repo
       + 9876fe...9876fe main -> main (forced update)
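Restoring the whole config file from backup worked, but since the only thing that went missing was the `origin` remote, re-adding that one remote with `git remote add` should be enough. A sketch under that assumption, with a local bare repository standing in for the CodeCommit URL (all paths and the demo identity are hypothetical):

```shell
#!/usr/bin/env bash
set -euo pipefail

work_dir="$(mktemp -d)"
remote_dir="$work_dir/remote.git"   # stand-in for the CodeCommit repository
local_dir="$work_dir/local"         # stand-in for c:\my_repo after filter-repo

git init -q --bare "$remote_dir"
git init -q "$local_dir"
cd "$local_dir"
echo "hello" > README.txt
git add README.txt
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "initial commit"

# There is no 'origin' at this point, so add it back by hand.
# With CodeCommit this would be the https://git-codecommit... URL used for the clone.
git remote add origin "$remote_dir"
git remote -v

# Same push as in the steps above.
git push -q origin --force --all

# Compare the local branch tip with what the remote now holds.
local_head="$(git rev-parse HEAD)"
branch_name="$(git symbolic-ref --short HEAD)"
remote_head="$(git --git-dir="$remote_dir" rev-parse "refs/heads/$branch_name")"
echo "local:  $local_head"
echo "remote: $remote_head"
```

This keeps any other local settings untouched instead of overwriting them with an older copy of the file.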
Inspected CodeCommit web console after pushing
- Inspected `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/browse/refs/heads/main/--/sub_folder/bad_script.java?#`.
  - Results promising: I got a `PathDoesNotExistException` header with a `Could not find path sub_folder/bad_script.java;` body as a big red error message.
But then it’s bad news:
- Not only is `bad_script.java` still in the picklist suggester at `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/commit/1234567890987654321?region=my-region`, but I could indeed jump to it and see its highlight.
- “Browse File Contents” still let me see it at `https://console.aws.amazon.com/codesuite/codecommit/repositories/my_repo/browse/1234567890987654321/--/sub_folder/bad_script.java?region=my-region`
I wonder what happened.
- Did I mess up not committing the `.gitignore` before pushing?
- Did I mess up running `git filter-repo` with a `.gitignore` in place?
in place? - Are there things cached in AWS that I need to request that they fix?
- If so, would setting up a new greenfield AWS remote and force-pushing to it do the trick?
- If that works, would nuking & re-setting-up the remote work? Or would it still fail if the remote had the same name due to caching?
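One possible explanation, not confirmed for CodeCommit specifically: a force-push moves the branch to the rewritten commits, but it does not delete the old commit objects from the remote, so anything that asks for the old commit ID directly (as the console URLs above do) can still get the file. Plain Git behaves this way. The sketch below shows it with a local bare repository standing in for CodeCommit and `git commit --amend` standing in for the filter-repo rewrite (paths, identity, and file contents are invented):

```shell
#!/usr/bin/env bash
set -euo pipefail

work_dir="$(mktemp -d)"
remote_dir="$work_dir/remote.git"   # stand-in for the CodeCommit repository
local_dir="$work_dir/local"

git init -q --bare "$remote_dir"
git init -q "$local_dir"
cd "$local_dir"
git config user.name "Demo"
git config user.email "demo@example.com"

# First push: history that contains the secret.
mkdir sub_folder
echo "readme" > README.txt
echo 'password = "hunter2"' > sub_folder/bad_script.java
git add .
git commit -q -m "add files"
git remote add origin "$remote_dir"
git push -q origin --all
old_commit="$(git rev-parse HEAD)"

# Rewrite history so the file was never committed (stand-in for filter-repo).
git rm -q --cached sub_folder/bad_script.java
git commit -q --amend -m "add files"
git push -q origin --force --all
new_commit="$(git rev-parse HEAD)"

# The branch on the remote now points at the rewritten commit...
branch_name="$(git symbolic-ref --short HEAD)"
remote_tip="$(git --git-dir="$remote_dir" rev-parse "refs/heads/$branch_name")"

# ...but the old commit, and the file inside it, are still stored there.
old_type="$(git --git-dir="$remote_dir" cat-file -t "$old_commit")"
leaked="$(git --git-dir="$remote_dir" show "$old_commit:sub_folder/bad_script.java")"

echo "remote tip:              $remote_tip"
echo "old object type:         $old_type"
echo "old file still readable: $leaked"
```

If CodeCommit stores objects the same way, that would match what the console shows: the branch view reports the path as missing while the old commit URL still serves the file. In that case, pushing the rewritten history to a brand-new repository would avoid carrying the old objects along, since they no longer exist in the local clone.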
Re-cloned and examined SourceTree
- Deleted the contents of `c:\my_repo\` again.
- From `MINGW64:/c/my_repo`, again ran: `git clone https://git-codecommit.my-region.amazonaws.com/v1/repos/my_repo .`
- Replaced the contents of `c:\my_repo\.git\config` with the backup I saved earlier again.
- Clicked away from history and back into history in SourceTree. Clicked to the commit where the file had been added. I scrolled down the “added files” list to the appropriate place in the alphabet, and `sub_folder/bad_script.java` is still gone from that commit on my hard drive. There still isn’t a `c:\my_repo\sub_folder\bad_script.java` file on my hard drive.
WEIRD.
I wish AWS had provided proper documentation about thoroughly cleaning secrets out of repositories in the cloud the way GitHub did.
I’m stumped.