Class Meeting 4 The version control workflow
Today’s topic is version control..
4.1 Learning Objectives
From this lesson, students are expected to be able to demonstrate each of the git/GitHub functionality listed here.
4.2 Working with git and GitHub
Before we dive into concepts, it’s important to distinguish between a local repo and remote repo.
- Local refers to things on your own computer. A local repo is a repo found on your hard drive.
- Remote refers to things on the internet. A remote repo lives on GitHub (and possibly other places).
Note that you can have more than one remote repo! Because of this, git names the remote repos. We will only ever be using one remote in STAT 545, and by default, this remote is named origin.
4.2.1 Preliminary: configuring git (3 min)
You’ll need to config your git using the command line.
Your RStudio will probably be able to “find” git. But if it can’t, you’ll encounter errors. See happygitwithr: see-git for help.
Optional (but recommended): After class, you might want to cache your credentials so that you don’t have to keep inserting your password.
4.2.2 The typical workflow (8 min)
The majority of your interaction with version control will be a pull/stage/commit/push workflow, explained here. For another resource on this, check out happygitwithr: rstudio-git-github.
- Clone your repository if you don’t have a local copy.
Once you have a local copy of the repository, then working on a project involves frequent use of these three steps:
- Pull the remote repo.
- Make changes to your files, stage and commit them.
- Push the changes (after perhaps pulling again).
Git treats the remote repository as the “official” version of the repository. This means that your local copy is a second class citizen – the repository you have locally must be up-to-date with the remote repository before you are allowed to push your work. If there are commits on the remote repository that are not present locally, git will throw an error if you try to push your changes.
Integrate version control as you do work!
- Workflow without version control: save your files spontaneously.
- Workflow with version control: save your files spontaneously, commit your changes after every “step” in your work, and push your changes in case of fire.
Committing often ensures that you can trace back all the work you did. This results in transparency with the way your project has developed, which is a very effective workflow. But, from my experience at least, making half baked work viewable might face you with feelings of vulnerability. I encourage you to push past this.
4.2.3 The typical workflow: Activity (5 min)
Let’s make a change to our repository from local.
- Cloning your participation repo.
- In RStudio, File -> New Project -> Version Control -> Git.
- You should see a
Git
tab in RStudio, upper-right corner window. If not, see happygitwithr: see-git for help. - Take a look at the files you just downloaded!
- Make your README a little nicer. Maybe fix up the title.
- Stage and commit the changes:
- In the Git tab in RStudio, click the checkboxes for the files that you want to commit. This is called “staging”.
- Click the “Commit” button.
- Enter a commit message.
- Click “commit”.
- Push to your remote repository (which is named “origin”)
- Click the up arrow in the Git panel in RStudio.
4.2.4 Git Clients (3 min)
We just saw that RStudio can “talk to” git. But there are other ways we can use git locally. To “directly” interact with git, we type commands in the terminal (or bash) like git clone
, git commit
, etc. If you want to access the full functionality of git, you’ll need to use the terminal.
Alternatively, there are git clients that provide a visual dashboard for interacting with git. RStudio is one example. Others:
- GitHub Desktop
- Source Tree
- Gitkraken
- …
In STAT 545, we’ll be using the RStudio git client, but you can use whichever method you prefer.
4.2.5 Merge conflicts (5 min)
If you change a file locally, and that same file (and the same lines) get changed on the remote repository in a different way, you’ll end up with a merge conflict that you’ll need to resolve. Remember that your local copy is a second class citizen compared to the remote version, so you’ll have to resolve things locally before pushing to the remote.
4.2.6 Merge conflicts: Activity (5 min)
Let’s make a merge conflict, and fix it.
- Edit a line of your README both locally and remotely to something different in both cases. Commit both changes.
- Try pulling your remote changes. You’ll get a merge conflict.
- Update the file that has the conflict, commit your changes, and push.
4.2.7 Branching (8 min)
Branching is the idea of making commits somewhere other than your main repository on GitHub. Even cloning a repo is a type of branch.
Git and GitHub allow us to make branches within a repository, and we can do this both locally and on GitHub (although it seems the RStudio git client doesn’t provide functionality for local merging). Let’s check out the STAT545-UBC.github.io repo to see some branches (which is actually being phased out).
Eventually, you may want to merge a branch back to its predecessor. This is called merging. A merge specifically on GitHub initiates a pull request – the idea here being that you’d like the predecessor branch to pull the commits from the child branch, and you’re requesting it from the collaborators on your repository (which is sometimes just yourself). Example from STAT545-UBC.github.io again. For more info on pull requests, see this GitHub tutorial.
There are many reasons you may want to branch. Here are some:
- A collaborator wants to make a change to the repo, but the end product of the change requires review from collaborators.
- You want to make changes, but don’t want to “deploy” the changes until later (such as if pushing to github triggers a website build).
- If you want to try something “risky”, it’s just safer to work on a branch.
4.2.8 Branching: Activity (5 min)
Let’s organize our participation repo in a branch.
- Create a new branch locally, called “organizing” (we could have also made this on GitHub):
- Click the “Git” tab in the upper-right panel of RStudio
- Within that window’s option bar, click .
- Name your branch and create!
- Stage and commit the new files.
- Restructure your repository in a more sensible way, using folders (locally).
- Stage and commit the changes; push to GitHub.
- Explore:
- switch between branches to see that the repo structure is different.
- Merge the branch to “master” via GitHub by making a pull request.
4.2.9 Undoing Changes (5 min)
There are many ways that work can be “undone” in git. We will only investigate two of the simpler methods. For more advanced methods, like reverting to a previous commit, check out these resources by bitbucket and GitHub – you’ll need to use the command line.
The two most useful “undo”s are:
- Undoing your (uncommited) work to the previous commit.
- Browsing the repo at previous states, and taking files from there.
We’ll demonstrate (1) in an activity.
4.2.10 Undoing Changes: Activity (2 min)
Here’s how to go back to the most recent commit.
- First, make and save a change to (say) a README file in your participation repo.
- In the Git panel of RStudio, stage the file that you want to return to the previous commit. Click “More” -> “Revert…” -> “Yes”
That’s it!
4.2.11 Getting errors? (3 min)
It’s not unusual to experience some errors in git, especially if you’re first learning how to use it. Try to get yourself unstuck with the concepts we’ve discussed here first.
But, you might find yourself stuck. The git documentation is full of jargon, making it difficult to read and therefore difficult to debug things. There’s even a parody on it. If you are in this position, it’s best to just burn it all down. There’s even an xkcd comic on this.
4.2.12 Tagging a Release (5 min)
Tagging a release on GitHub is like putting a “star” next to a particular commit. It highlights a particular point in time of your repository that is noteworthy, typically after achieving some milestone. It’s just easier than having to manually keep track of noteworthy points in your commit history.
Examples:
- After every year of finishing STAT 545A/547M, we tag a release so that we can easily navigate to earlier versions of the course.
- After sufficient development of an R package like ggplot2, a new release is tagged corresponding to the version of the package.
4.2.13 Tagging a Release: Activity (3 min)
Congratulations! We finished the first two weeks of STAT 545A, which focussed on tools. To mark this milestone, let’s tag a release on our participation repositories.
- On your GitHub repo, click “releases”
- Click “Create a new release”
- Fill in the fields:
- It probably makes sense to use a versioning system like
cm004
here.
- It probably makes sense to use a versioning system like
- “Publish Release”.