Help migrating Git LFS repos from legacy to v2

Hello,

I need some help moving from renkulab v1 to v2, in particular because of the git LFS files on v1.
I cannot push my LFS files to GitHub or another external provider; they are too large.
What I envision is:

  1. Pull git + git LFS to my laptop.
  2. Convert all LFS-tracked files to regular files (I am not concerned about the history of the LFS files; only the latest version matters).
  3. Remove those LFS files from the entire git commit history.
  4. Push this new history to GitHub.
  5. Sort out the repo I broke by removing the LFS files, using Zenodo or similar, at some point in the distant future.

After asking a chatbot, this looks riskier and less straightforward than I expected. I am now wondering whether my only simple option, without dedicating major time to this, is to take the latest snapshot, remove the LFS files, delete the git history, and do an initial push to the remote provider.

Could someone write a step-by-step guide on how best to proceed, if keeping the commit history without LFS is possible?

If this is indeed not something one can automate, please let me know as well, so that I go with option B (no git history).

Hi Firat, does using git push --no-verify work for your scenario? According to Stack Overflow, this skips the attempt to upload the git LFS files.

Also, https://gitlab.datascience.ch/ should support git LFS. Is that not an option for you?

@firat I have not tested this but after googling around it seems you can do the following:

  1. Clone the repo locally using the mirror option: git clone --mirror <url>
  2. cd into the repo
  3. Install git-filter-repo (https://github.com/newren/git-filter-repo), a filter-branch replacement for quickly rewriting git repository history. See INSTALL.md in that repository for instructions.
  4. git filter-repo --path unwanted_lfs_file --invert-paths
  5. git push origin --force --mirror

This removes the LFS files from the entire history, and the last command pushes the changes to the remote called origin. The files will then be completely gone (locally and in the remote repo) and won't be stored in git or LFS.

You can also do the following if you want to remove the LFS tracking of those files you removed:

  1. git lfs untrack <path>
  2. git lfs prune
  3. commit and push the changes
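
For step 4 of the filter-repo recipe above, listing every path by hand gets tedious when many files are involved. Here is an untested sketch, assuming git-lfs and git-filter-repo are installed and that it is run inside the fresh mirror clone. It collects every path ever tracked by LFS with git lfs ls-files and passes the list to filter-repo through --paths-from-file. The function names and the file name in the usage comment are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: strip every LFS-tracked path from history in one pass.

# Normalise a list of paths read from stdin: drop blank lines and duplicates.
unique_paths() {
  grep -v '^[[:space:]]*$' | sort -u
}

# Write all paths that were ever tracked by LFS (across all refs) to file $1.
list_lfs_paths() {
  git lfs ls-files --all --name-only | unique_paths > "$1"
}

# Remove every path listed in file $1 from the whole history.
strip_paths_from_history() {
  git filter-repo --invert-paths --paths-from-file "$1"
}

# Usage (inside the mirror clone; the file name is arbitrary):
#   list_lfs_paths /tmp/lfs-paths.txt
#   strip_paths_from_history /tmp/lfs-paths.txt
```

It is worth reading the generated list before running the second function, since the rewrite cannot be undone once it has been force-pushed.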

Also, if you want to start a new repo without the history, the easiest way is to create a brand-new empty repo, on GitHub for example, then copy over the files you want in the new repo, commit, and push them.
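
A rough sketch of that no-history route, again untested. It assumes the source directory is a normal checkout with the real file contents present (for example after git lfs pull), and that the remote URL points at an empty repo you have already created. One detail to watch: if .gitattributes still contains filter=lfs rules, the new commit will turn the files back into LFS pointers, so the sketch drops those rules first. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: start a fresh repository from the latest snapshot.

# Remove LFS filter rules from a .gitattributes file so that re-added
# files are stored as regular git blobs rather than LFS pointers.
strip_lfs_attributes() {
  local file="$1"
  [ -f "$file" ] || return 0
  # grep exits non-zero when every line is an LFS rule; that is fine here.
  grep -v 'filter=lfs' "$file" > "$file.tmp" || true
  mv "$file.tmp" "$file"
}

# $1: existing checkout, $2: new directory, $3: URL of the empty remote.
# The body runs in a subshell so that cd and set -e do not leak out.
snapshot_to_new_repo() (
  set -e
  src="$1"; dest="$2"; remote="$3"
  mkdir -p "$dest"
  cp -R "$src"/. "$dest"/        # copy everything, including dotfiles
  rm -rf "$dest/.git"            # drop the old history
  cd "$dest"
  strip_lfs_attributes .gitattributes
  git init
  git add -A
  git commit -m "Initial commit"
  git remote add origin "$remote"
  git push -u origin HEAD
)
```

Files that were too large to push before will still be too large as regular files, so they may need to be deleted from the new directory before committing.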

Hi @laura.kinkead,
Thanks for the suggestions. I will try SDSC GitLab with LFS.
I tried git push --no-verify, but it does not work:

git push --no-verify github

Enumerating objects: 734, done.
Counting objects: 100% (734/734), done.
Delta compression using up to 8 threads
Compressing objects: 100% (540/540), done.
Writing objects: 100% (734/734), 1.30 GiB | 9.66 MiB/s, done.
Total 734 (delta 183), reused 732 (delta 182), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (183/183), done.
remote: error: GH008: Your push referenced at least 138 unknown Git LFS objects:
remote: 5c58f39707752d22224527bef976a2e614eb0b5dfc5206b723b518f15b74e743
remote: 005bd9b0b94bfec99e55a1cf1f7c8f2cd60136f0543705b7a189d9331b4fcf98
remote: a78a7af5f388fdc7941d31672eb517568934424fe3cc89ded896e45498a7178c
remote: …
remote: Try to push them with 'git lfs push --all'.
To github.com:firatozdemir/oadat-evaluate.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'github.com:firatozdemir/oadat-evaluate.git'

There are a lot of unwanted LFS files, and since there are also a lot of repos to move in a short time, I did not want to go through this list manually for the git filter-repo --path unwanted_lfs_file --invert-paths command.
I suppose if SDSC GitLab does not work, I will use
git lfs ls-files --all --name-only to delete the LFS files (or references), copy the remaining files to a new repo, and make a hello-world commit for all projects.

EDIT: Do not use the script below.

Thanks for all the responses. I was able to push to SDSC GitLab using the following steps:

export RENKU_REPO="git@gitlab.renkulab.io:firat.ozdemir/oadat-evaluate.git"
export REPO_NAME="oadat-evaluate"
export NEW_REPO="git@gitlab.datascience.ch:dlbirhoui/oadat-evaluate.git"

#git clone $RENKU_REPO $REPO_NAME
#cd $REPO_NAME
#git fetch --all
#git lfs install && git lfs fetch --all 
#git lfs checkout
#git remote add gitlab $NEW_REPO
#git push gitlab; git push gitlab

There might be redundant commands in between, but this worked for this repo. I will try it with other repos as well in the coming days.

I need to make a correction. The above would only push the default branch.

The following seems to be a more complete script that pushes all branches and LFS objects:

export RENKU_REPO="git@gitlab.renkulab.io:.../....git"
export REPO_NAME="repo-dir-name"
export NEW_REPO="git@gitlab.datascience.ch:.../....git"

git clone --mirror $RENKU_REPO $REPO_NAME.git
cd $REPO_NAME.git
git lfs fetch --all 

git remote set-url origin $NEW_REPO

git lfs push origin --all
git push --mirror

cd ..
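
Since there are several repos to move, the same steps can be wrapped in a function and called once per repo. This is a sketch under the same assumptions as the script above (SSH access to both hosts, target project already created and empty). migrate_repo and repo_dir_name are helper names I made up.

```shell
#!/usr/bin/env bash
# Sketch only: the mirror-and-push steps from above as a reusable function.

# Derive a local directory name from a git URL,
# e.g. "host:group/my-repo.git" -> "my-repo".
repo_dir_name() {
  basename "$1" .git
}

# Mirror one repository, including all branches, tags and LFS objects.
# $1: source URL, $2: destination URL.
# The body runs in a subshell, so the caller's working directory is kept
# and the first failing command aborts only this migration.
migrate_repo() (
  set -e
  src="$1"; dst="$2"
  dir="$(repo_dir_name "$src").git"
  git clone --mirror "$src" "$dir"
  cd "$dir"
  git lfs fetch --all
  git remote set-url origin "$dst"
  git lfs push origin --all
  git push --mirror
)

# Usage, with the variables defined above:
#   migrate_repo "$RENKU_REPO" "$NEW_REPO"
```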

One final comment.
It seems I need to do all this manually, but what would have saved me a good hour, if not several, is a
"Download my repos on Renku GitLab" button that does something like what I am doing now:

export RENKU_REPO="git@gitlab.renkulab.io:cvl/cardiac_calcification_regression.git"
export REPO_NAME="cardiac_calcification_regression"

rm -rf $REPO_NAME
git clone --mirror $RENKU_REPO $REPO_NAME.git
cd $REPO_NAME.git
git lfs install && git lfs fetch --all
git lfs checkout
cd ..
tar czf "$REPO_NAME.git"-$(date +%F).tar.gz "$REPO_NAME.git"

The button would then download the tar.gz file, and the process would iterate over all of the user's repos (which I currently have to do manually).

In GitLab, you can create an access token with the read_api scope, then call the API to list all your projects and iterate over that list, invoking the script you defined above.

For example,

curl -H "PRIVATE-TOKEN: $GITLAB_TOKEN" https://gitlab.renkulab.io/api/v4/projects

Though, this xkcd may also be relevant in this case.
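
To make the iteration concrete, here is a rough sketch of how the listing could be paged through. It assumes jq is installed and GITLAB_TOKEN is exported. The membership, simple, per_page and page query parameters restrict the result to projects you are a member of and return up to 100 per page, since the bare call above returns only the first page and may include other visible projects. The function names are made up.

```shell
#!/usr/bin/env bash
# Sketch only: print the SSH clone URL of every project you are a member of.

# Build the URL for one page of the project listing.
# $1: GitLab base URL, $2: page number.
projects_page_url() {
  printf '%s/api/v4/projects?membership=true&simple=true&per_page=100&page=%s\n' "$1" "$2"
}

# Walk through the pages until one comes back empty (or a request fails).
# $1: GitLab base URL. Requires GITLAB_TOKEN in the environment.
list_project_ssh_urls() {
  local base="$1" page=1 urls
  while :; do
    urls=$(curl -fsS -H "PRIVATE-TOKEN: $GITLAB_TOKEN" \
             "$(projects_page_url "$base" "$page")" \
           | jq -r '.[].ssh_url_to_repo')
    [ -n "$urls" ] || break
    printf '%s\n' "$urls"
    page=$((page + 1))
  done
}

# Usage: feed each URL to whatever per-repo script you settled on, e.g.
#   list_project_ssh_urls https://gitlab.renkulab.io | while read -r url; do
#     echo "would archive $url"
#   done
```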
