Sparse Checkout Not Supported in actions/checkout@v2
At the time of writing, GitHub's built-in actions/checkout@v2 does not support sparse (partial) checkout, so I decided to perform the checkout manually with sparse checkout enabled. For reference, there is an open issue about this on GitHub, with comments offering some suggestions.
Note that the script I'm using is PowerShell, since the project I'm working on needs Windows to build.
A working demo can be found at my GitHub repo. The rest of the post goes through some of my findings.
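For orientation, a manual sparse checkout can be sketched roughly as follows. This is a minimal sketch and not necessarily what the demo repo does: the folder name, the branch name, and the shallow fetch are assumptions made for illustration, and authentication (covered in the next section) is left out.

```powershell
# Hypothetical values: adjust the folder and branch to the repository.
$sparsePath = "folder/subfolder"
$branch = "main"

git init .
git remote add origin "https://github.com/$Env:GITHUB_REPOSITORY"

# Enable sparse checkout and list the paths that should be materialized.
git config core.sparseCheckout true
New-Item -ItemType Directory -Force -Path ".git/info" | Out-Null
# Write as ASCII so git does not have to deal with a BOM or UTF-16.
Set-Content -Path ".git/info/sparse-checkout" -Value "$sparsePath/" -Encoding ascii

# Fetch only the tip of the branch, then check it out (detached HEAD).
git fetch --depth 1 origin $branch
git checkout --detach FETCH_HEAD
```

Only the paths matching the patterns in .git/info/sparse-checkout end up in the working tree; everything else stays out of the checkout.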
Authentication
One tricky part was authentication. A simple way would be just adding the token to the repo URL:
git remote add origin https://${Env:GITHUB_ACTOR}:${{secrets.GITHUB_TOKEN}}@github.com/$Env:GITHUB_REPOSITORY
But that's not what actions/checkout does. Also, it's not a good idea to put credentials in the URL, since it may be logged by the server (though in this case it might not be that critical, since it's the GitHub server itself and the auth token expires after each run). So, to follow what actions/checkout does, I tried to use the header. According to the log from when I used actions/checkout, it looked like the value was just the token itself, since it's masked out:
"C:\Program Files\Git\cmd\git.exe" config --local http.https://github.com/.extraheader "AUTHORIZATION: basic ***"
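But when I tried it, it didn't work. After looking at the actions/checkout code, I found out that it actually wraps the token with additional data: it prefixes the token with x-access-token: and base64-encodes the result, as in HTTP basic authentication. Here's the snippet of the job that handles this: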
- name: Git - Setup auth extraheaders config
  run: |
    $authToken="x-access-token:${{ secrets.GITHUB_TOKEN }}"
    $bytes=[System.Text.Encoding]::UTF8.GetBytes($authToken)
    $encodedAuthToken=[Convert]::ToBase64String($bytes)
    git config --local http.https://github.com/.extraheader "AUTHORIZATION: basic ${encodedAuthToken}"
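To make the wrapping easier to see, here is a small sketch that builds the header from a dummy token and decodes it again. The function name ConvertTo-BasicAuthHeader and the dummy token are made up for illustration; only the x-access-token: prefix and the base64 step come from the job snippet above.

```powershell
# Hypothetical helper: builds the extraheader value from a token.
function ConvertTo-BasicAuthHeader {
    param([string]$Token)
    $bytes = [System.Text.Encoding]::UTF8.GetBytes("x-access-token:$Token")
    "AUTHORIZATION: basic " + [Convert]::ToBase64String($bytes)
}

# Round trip with a dummy token to confirm what the header really carries.
$header = ConvertTo-BasicAuthHeader -Token "dummy-token"
$encoded = $header.Substring("AUTHORIZATION: basic ".Length)
[System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($encoded))
# prints "x-access-token:dummy-token"
```

Decoding shows why passing the bare token did not work: the server expects a user:password pair, not the token alone.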
Fake Log Output..?
Well, I suppose technically it's not fake, but one very interesting thing I noticed in the actions/checkout code is that the log masking is set explicitly, rather than delegated to the general token masking feature of the log output. It executes the git config command with the mask as the actual parameter. Afterwards, it rewrites the config file with the correct value, to prevent the OS from capturing a command line with the base64 token in it. So to people just looking at the log output, it looks as if it executed the git config command with the token value, but in reality, it didn't.
In addition to the regular token value, GitHub Actions will also automatically mask the encodedAuthToken above.
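The placeholder approach described above can be sketched in PowerShell along these lines. This is an illustration of the idea, not the actions/checkout code itself (which is written in TypeScript); the function name Set-GitExtraHeader and its parameters are hypothetical.

```powershell
# Hypothetical helper: sets the extraheader without putting the secret on a command line.
function Set-GitExtraHeader {
    param(
        [Parameter(Mandatory)][string]$EncodedAuthToken,
        [string]$RepoPath = "."
    )
    $key = "http.https://github.com/.extraheader"
    $placeholder = "AUTHORIZATION: basic ***"

    # Only the placeholder ever appears on a command line.
    git -C $RepoPath config --local $key $placeholder

    # Swap the placeholder for the real value by editing the config file directly.
    $configPath = Join-Path (Resolve-Path $RepoPath).Path ".git/config"
    $content = [System.IO.File]::ReadAllText($configPath)
    if (-not $content.Contains($placeholder)) {
        throw "Placeholder not found in $configPath"
    }
    $content = $content.Replace($placeholder, "AUTHORIZATION: basic $EncodedAuthToken")
    [System.IO.File]::WriteAllText($configPath, $content)
}
```

In the job it would be called after the base64 value has been computed, for example Set-GitExtraHeader -EncodedAuthToken $encodedAuthToken.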
LFS
To handle sparse checkout for LFS, I looked at how the Jenkins Git plugin does it. I first checked the log from an old Jenkins job I had set up before moving to GitHub, then skimmed through the plugin code to confirm. That led me to lfs.fetchinclude:
git config lfs.fetchinclude folder/subfolder
LFS Pull?
Since git lfs is included in the Git installation on the host runner, the checkout will actually go through the filters and download the LFS files. But I've seen some cases where it missed a few, so I decided to add git lfs pull origin. I could probably just enable skip-smudge on the checkout and keep lfs pull, for possibly better performance.
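The skip-smudge idea could look roughly like the sketch below. It is untested in terms of actual performance gain, and it assumes that a git fetch has already populated FETCH_HEAD and that folder/subfolder is the sparse path; both are assumptions for illustration.

```powershell
# Restrict LFS downloads to the sparse folder (hypothetical path).
git config lfs.fetchinclude "folder/subfolder"

# Skip the smudge filter so the checkout leaves LFS pointer files in place.
$Env:GIT_LFS_SKIP_SMUDGE = "1"
git checkout --detach FETCH_HEAD   # assumes an earlier "git fetch"
Remove-Item Env:GIT_LFS_SKIP_SMUDGE

# Download the LFS objects in one pass and replace the pointer files.
git lfs pull origin
```

The point of the split is that the checkout no longer downloads LFS files one by one through the filter; the single git lfs pull fetches them afterwards.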
Further Improvements
As mentioned above, just like actions/checkout, the job script can be modified to write the authentication into the config file instead of executing the git config command, so that the OS will not capture it.
The job only handles main/master branch builds. It should probably parse the ref to find the branch, or use the explicit SHA.
If you're running this on a self-hosted runner, you should also do some cleanup before the checkout. For GitHub-hosted runners, it's probably not necessary, since every new run starts with a clean slate.
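For the branch handling, one possible approach is to derive the checkout target from the default GITHUB_REF and GITHUB_SHA environment variables. The function name Get-CheckoutTarget is hypothetical, and falling back to the SHA for anything that is not a branch ref is an assumption, not something the current job does.

```powershell
# Hypothetical helper: returns the branch name for branch refs, otherwise the commit SHA.
function Get-CheckoutTarget {
    param([string]$Ref, [string]$Sha)
    if ($Ref -like "refs/heads/*") {
        return $Ref.Substring("refs/heads/".Length)
    }
    return $Sha
}

# GITHUB_REF and GITHUB_SHA are set by GitHub Actions for every run.
$target = Get-CheckoutTarget -Ref $Env:GITHUB_REF -Sha $Env:GITHUB_SHA
```

The resulting $target could then replace the hard-coded branch name in the fetch command, for example git fetch --depth 1 origin $target.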
Self-Hosted Runner
One thing I want to mention about GitHub Actions: compared to Azure DevOps, I like that GitHub Actions' self-hosted runners are free, so I can add multiple self-hosted runners for a repo without worrying about cost. Azure DevOps charges $15 per month for each runner if you want more than one for your organization.