When starting this tutorial, we assume:
We start with you working on a repository on your own, without interference from colleagues.
A new project starts with a new repository, which can easily be set up on GitHub.com.
For this first exercise, we have chosen to create a dummy repository:
(you do not have to copy the example, choose any name :-))
The resulting repository layout will look like this:
(Remember that there is a Clone or Download button on this page!)
As we will mainly use Git(Hub) to manage R/Rmarkdown scripts, the existing integration of RStudio and Git provides a convenient way to add version control.
Just to make sure, in case you have not yet installed Git and RStudio:
When installed properly:
In order to start using version control in Rstudio, we have to do some configuration the first time. You only need to do this configuration once.
First of all, tell RStudio where to find the Git installation.
1. Go to Tools > Global Options
2. Select Git/SVN
3. If necessary, enter the path to your Git executable.*
*If you do not know where Git is installed, open your command line (cmd in Start for Windows users). When the cmd is open, type where git and hit enter. The path should be something like: C:/Program Files (x86)/Git/bin/git.exe. Still in trouble? Check this out.
Next, we have to tell Git who we are, so that it can make the connection to the online account. To do so, Git requires the configuration of your GitHub (!) username and GitHub email:
1. Go to Tools > Shell to open the Git Shell
2. Type the following commands:
   git config --global user.name "mygithubusername"
   git config --global user.email "my.name@inbo.be"
Use your GitHub username! You can check if you’re set up correctly by typing git config --global --list in the same shell.
When successful, the configuration is done. Congratulations!
We have created a repository online and set up a working Git within RStudio. Hence, we can start working on the code locally by downloading the repository to our computer. RStudio provides a convenient way to start a new project as a Git repository.
1. Go to File > New Project..., select Version Control, choose Git
To get the HTTPS link, click the green button and copy the link titled Clone with HTTPS:
An example of the project setup using an existing Git repository:
In your File explorer, search for your project folder and check the content. Does this correspond to what is shown online on your repository website?
.gitignore
When you start a new project, RStudio always adds a .gitignore file if one does not already exist (you can also create one during the online setup). A .gitignore file lists all the files that Git should not take into account. An example is the myprojectname.Rproj file, as this is a user/computer-specific file.
Hence, we can ignore the myprojectname.Rproj file by adding it to the .gitignore text file. We can do this inside RStudio:
1. Go to the git pane (the tab that says Git),
2. right click on the .Rproj file and select Ignore...
3. Check that .gitignore is correctly updated and click Save.
As our dummy project is called favourite-fruit-color, the .Rproj file is called favourite-fruit-color.Rproj. We right click on the file and select Ignore...:
When to use .gitignore:
Note that you can use wildcards, e.g. *.Rproj, to exclude a group of files from version control:
Check the content of the .gitignore file. If satisfied, click Save:
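For those who like to verify things from the Shell, the effect of a wildcard pattern can be checked outside the RStudio dialog as well. The following is only a sketch: it works in a throwaway directory created with mktemp (an assumption of this example, not part of the tutorial project) and reuses the file names of our dummy project.

```shell
# Hypothetical scratch repository, created only for this illustration.
demo="$(mktemp -d)"
cd "$demo"
git init -q

# One file we want to track and one user-specific project file.
echo "# favourite-fruit-color" > README.md
touch favourite-fruit-color.Rproj

# Ignore every .Rproj file with a single wildcard pattern.
echo "*.Rproj" > .gitignore

# List the files Git sees as untracked; the .Rproj file should not be listed.
git status --porcelain

# Ask Git which rule (file, line, pattern) causes a file to be ignored.
git check-ignore -v favourite-fruit-color.Rproj
```

git check-ignore exits with a non-zero status when the file is not ignored, which makes it handy for a quick check whenever a pattern does not seem to work.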
As the .gitignore is a newly added file (or at least adapted), we can commit this change and provide a commit message:
1. Go to the git pane
2. Check the box next to .gitignore
3. Click Commit
4. Write a commit message and click commit
5. Click Close to remove the commit summary
The status of the .gitignore file before the file was committed, with the yellow question mark saying that the file is yet unknown to Git:
By checking the box and clicking commit, we add the file and commit this addition with a commit message:
We get a small technical overview of the alterations we provided with this commit:
It is good practice to commit often, so you will do this a lot. Each commit should only contain changes related to a single problem/element/… Each commit is a snapshot of your project and the messages describe the story of your project.
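The same add-and-commit cycle can also be run from the Shell. The sketch below is an illustration under a few assumptions: it uses a throwaway directory, a made-up local identity (so your global configuration is left untouched) and an example commit message.

```shell
# Throwaway repository with a local (not global) identity; both are assumptions.
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

echo "*.Rproj" > .gitignore

# Before staging, '??' marks a file that is still unknown to Git
# (the yellow question mark in RStudio).
git status --short

# Checking the box in the git pane corresponds to 'git add' ...
git add .gitignore

# ... and the Commit button corresponds to 'git commit' with a message.
git commit -q -m "Ignore RStudio project files"

# One line per commit: the story of the project so far.
git log --oneline
```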
As documentation is crucial, providing some more information in the README.md file will help others (and yourself in a couple of months/years) to understand the aim of the project. Just as we adapted the README.md file online earlier, we can do the same locally:
1. Adapt the README.md file inside RStudio
Note how the lines changed by this commit are reported in green and red:
Note that the git pane displays "Your branch is ahead of ‘origin/main’ by 2 commits". This is effectively a warning that there is no backup of these two commits yet! In order to store these changes on GitHub as well, we have to push our changes to GitHub:
1. Click push in the git pane
RStudio gives you a message about the status of your local commits versus those stored on GitHub:
Is the README.md adapted online? Where do you find the Commits overview online?
It is a tedious task to retype your credentials each time you want to push anything to GitHub. Luckily, you can store your credentials when using HTTPS:
1. Click the more button in the git pane and select Shell
2. Type git config --global credential.helper store
3. Type exit to quit the shell
The next time Git needs your credentials, it will ask for them one more time and store them!
The more button has a wheel symbol:
As mentioned earlier, you should commit often and make sure each commit links to a specific change/problem. Sometimes, this means that you have to split the additions in a single file into two individual commits. RStudio provides the interface to include only specific lines of code in a commit:
To summarize, the following actions can be executed:
- stage xyz: add the xyz to the commit
- unstage xyz: remove the xyz from the commit
- discard xyz: revert the changes in the xyz (be careful, this can’t be undone!)
and xyz can be
- line
- selection of lines
- chunk
1. Adapt README.md locally
2. Create a commit for each change:
   - Click stage selection
   - Write a commit message and click Commit
Sometimes, conflicts will appear, for example because a collaborator was working on exactly the same lines of code or because of a mistake in your workflow. No worries, we’ll teach you how to fix conflicts by creating one.
Update your README.md online on GitHub, on the exact same line you just edited locally in the previous exercise.
If you do not remember how to change files online, check this online tutorial again. Check the result of your online commit, similar to:
Back in RStudio, try to push (click push) your local changes.
Git provides a warning about the remote changes on the same repository:
Click Pull to download the changes from GitHub
Git notices the CONFLICT and asks you what to do next:
Keep calm and resolve conflict!
Each conflict is always marked by the combination of the following elements:
<<<<<<< HEAD
your local code
=======
the code as it exists on Github
>>>>>>> origin/main
To solve a conflict, you have to decide which version of the code you want to keep.
Open README.md in RStudio:
<<<<<<< HEAD
A simple analysis to visualize my favourite fruit colour.
=======
A simple analysis to discover my favourite fruit color.
>>>>>>> origin/main
Choose what you want to keep and only keep that code in the file:
A simple analysis to discover my favourite fruit colour.
Commit the adaptation with a commit message
Click Push to update GitHub
If you encounter a conflict in the future, repeat this procedure.
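To practise the procedure without touching a real project, the whole conflict scenario can be replayed in the Shell. This is a hedged sketch, not the course's own exercise: a local bare repository stands in for GitHub, a second clone (named online here) stands in for the edit made on the website, and all paths and identities are made up for the illustration.

```shell
work="$(mktemp -d)"

# "local" is your working copy, with one initial commit on main.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.org"
echo "A simple analysis." > README.md
git add README.md
git commit -q -m "Add README"

# A bare clone stands in for GitHub (assumption); register it as origin.
git clone -q --bare . "$work/remote.git"
git remote add origin "$work/remote.git"
git push -q -u origin main

# Stand-in for the edit made online: a second clone changes the same line.
git clone -q "$work/remote.git" "$work/online"
git -C "$work/online" config user.name "Online Edit"
git -C "$work/online" config user.email "online@example.org"
echo "A simple analysis to discover my favourite fruit color." > "$work/online/README.md"
git -C "$work/online" commit -q -am "Update README online"
git -C "$work/online" push -q origin main

# Meanwhile, the same line is edited and committed locally.
echo "A simple analysis to visualize my favourite fruit colour." > README.md
git commit -q -am "Update README locally"

# The push is rejected because the remote holds a commit we do not have yet.
git push -q origin main || echo "push rejected: pull first"

# Pulling (with a merge) stops and reports a CONFLICT in README.md.
git pull --no-rebase origin main || echo "conflict detected"
cat README.md

# Resolve: keep only the text you want, then commit and push.
echo "A simple analysis to discover my favourite fruit colour." > README.md
git add README.md
git commit -q -m "Resolve README conflict"
git push -q origin main
```

The cat README.md step prints the file with the conflict markers described above, so you can inspect them before the resolution overwrites the file.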
During a project, new files will be added to the project folder, which need to be version controlled as well. New directories and files can be added and committed, just like any other adaptation.
By clicking the box next to a file, the file is staged (i.e. ready to be committed). Staging a new directory will stage all files in the directory. However, you cannot stage empty directories!
1. Create a fruits.csv file in a /data subdirectory
2. Add a link to the file in README.md (use relative paths), mentioning the purpose of the file
3. Commit both changes (the new file and the README.md adaptation) in a single commit
RStudio provides information about the status of the files:
- the README.md file is adapted (blue Modified box)
- data/ is currently unknown to Git (yellow Question mark box)
Clicking both will make them ready to be committed. Note that the data/fruits.csv file gets a green Added box:
Sometimes, you have just committed some adaptations, but then notice that you missed some lines of code that should be included in the same commit. Git provides the ability to amend the previous commit with some additional adaptations.
1. Adapt fruits.csv
2. Commit as usual, but check amend previous commit in the commit message box
Warning: don’t do this on commits that have been pushed. That would result in conflicts.
Next to the Commit box, the amend option is available:
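In the Shell, the amend checkbox corresponds to the --amend flag of git commit. A minimal sketch, assuming a throwaway repository and invented contents for fruits.csv:

```shell
# Throwaway repository with a local identity (assumptions for this sketch).
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

# First attempt: commit the data file (the contents are hypothetical).
mkdir data
printf 'fruit,color\napple,red\n' > data/fruits.csv
git add data/fruits.csv
git commit -q -m "Add fruits data"

# One row was forgotten: stage it and fold it into the previous commit,
# keeping the original commit message (--no-edit).
echo "banana,yellow" >> data/fruits.csv
git add data/fruits.csv
git commit -q --amend --no-edit

# The history still shows a single commit for the data file.
git log --oneline
```

Amending replaces the previous commit with a new one, which is why it should only be done on commits that have not been pushed yet.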
One of the advantages of using version control with regular commits is that you get a history of your project. You can check the history (the series of commits) both online on GitHub and in RStudio:
1. Click history in the git pane
Notice the presence of History twice in the following image; the one we need is in the git pane (the history of your commit messages).
You can click on each of the commits to verify the adaptations that were part of the specific commit:
It is good practice to ALWAYS work on (short-lived) branches. It allows you to experiment freely until you are satisfied with the result, and it ensures your main branch provides the latest stable version of your analysis/project.
1. Click the Create branch symbol* (next to the current branch name)
2. Select the remote origin
3. Check Sync branch with remote
4. Click Create
*the create branch symbol looks as follows:
Make sure your branch name makes sense for the work you are planning to do. Use lower case characters and - in between words (NO spaces!). For this example, we decided to use the name analysis-script for the branch.
Older versions of RStudio do not provide the functionality to create a branch directly! You will have to use the Shell to create the new branch:
1. checkout to new branch: git checkout -b analysis-script
2. Verify the active branch: git branch
3. Run git push -u origin analysis-script to activate the Push/Pull buttons
Shell-based approach to create a new branch:
This will also activate the pull and push buttons. From now on, you can commit as many times as you want/need to this branch and push the commits to GitHub, where it will be stored under the same branch name:
We can work on the newly created branch and start making adaptations:
1. Add a script to a /src directory
2. Commit the adaptation to the branch
3. Push to send the adaptations to GitHub
The first time you switch branches and you see files disappearing, you might be wondering what is happening. No worries, Git is just making sure you only see those files relevant for the active branch. In the next exercise, we will explore this feature in more detail:
1. Go to the branch dropdown in the git pane
2. Switch to the main branch by selecting (LOCAL BRANCHES) -> main
3. Switch back to your own branch (analysis-script)
As in our example the branch name is analysis-script, the dropdown shows this as the currently active branch. Clicking on the dropdown provides an overview of all branches:
RStudio provides you information about the switch to the main branch:
Make sure you are currently working on your newly created branch (e.g. analysis-script) before you proceed!
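The "disappearing files" behaviour is easy to reproduce in the Shell. The sketch below assumes a throwaway repository and a hypothetical script name (src/analysis.R); only the branch names are taken from the tutorial.

```shell
# Throwaway repository with a local identity (assumptions for this sketch).
demo="$(mktemp -d)"
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.org"
echo "# favourite-fruit-color" > README.md
git add README.md
git commit -q -m "Add README"

# Create the branch and commit a (hypothetical) script on it.
git checkout -q -b analysis-script
mkdir src
echo 'print("My favourite fruit colour is red")' > src/analysis.R
git add src/analysis.R
git commit -q -m "Add analysis script"

# On main, the script is not part of the project (yet), so it is not shown.
git checkout -q main
ls

# Back on the branch, the script is there again.
git checkout -q analysis-script
ls src

# The asterisk marks the active branch.
git branch
```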
When you are satisfied with the work done in a branch, it is time to bring these adaptations to the main branch as part of the stable analysis. NEVER merge locally on your machine; we will ALWAYS merge a branch online, by making use of a Pull request!
1. push your branch (if not already done)
2. Create a pull request online
GitHub actually suggests a Compare & pull request on the webpage in a new yellow box. If you do not see the message, you can still initiate a Pull request. The Pull request aims to insert the adaptations in your branch into the main branch.
In a new dialog, you can provide a short summary line about the Pull request with the option to add additional information. By clicking the Create pull request button, you actually propose the merge into the main branch.
When you are working on your own and are sure about the changes, you can merge the Pull request yourself. However, this is the ideal moment for a review of the code (you can ask people to review your code, and automated checks can be added as well).
1. Merge the Pull request.
For the moment, your adaptations are integrated into the main branch online, but not yet on your local computer. So, you have to update the local main branch:
1. Switch to the main branch (top right in the Git pane)
2. Click Pull
3. Remove the local branch in the Shell: git branch -d analysis-script
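The clean-up after a merged Pull request can be rehearsed offline as well. In the sketch below, a local bare repository stands in for GitHub and a merge performed in a second clone stands in for clicking the merge button online; both are assumptions made so the example runs on its own, as are the paths and the file name analysis.R.

```shell
work="$(mktemp -d)"

# Local repository with an initial commit on main.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.org"
echo "# favourite-fruit-color" > README.md
git add README.md
git commit -q -m "Add README"

# Bare clone as stand-in for GitHub (assumption).
git clone -q --bare . "$work/remote.git"
git remote add origin "$work/remote.git"
git push -q -u origin main

# Work on a branch and publish it.
git checkout -q -b analysis-script
echo 'print("red")' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
git push -q -u origin analysis-script

# Stand-in for merging the Pull request online.
git clone -q "$work/remote.git" "$work/online"
git -C "$work/online" config user.name "Online Merge"
git -C "$work/online" config user.email "online@example.org"
git -C "$work/online" merge -q --no-ff -m "Merge analysis-script" origin/analysis-script
git -C "$work/online" push -q origin main

# The actual clean-up: switch to main, pull, delete the merged branch.
git checkout -q main
git pull -q --no-rebase origin main
git branch -d analysis-script
git branch
```

git branch -d refuses to delete a branch that has not been merged, so it is a safe default for this clean-up step.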
You now have experienced all the major steps to effectively use version control with Git. We can now apply these same steps when working together.
On your own repository, you have all admin rights to add Collaborators in the Settings section of GitHub:
When invited to collaborate, you have to Accept invitation in order to start working on the repository:
Issues can be used for multiple purposes: alerting colleagues about a bug in the code, proposing new features, discussing specific steps in the analysis,…
1. Create an issue online
2. Add a label
When there is no coding involved, e.g. you just need to update a few text lines in a markdown file, the online GitHub features can be sufficient:
1. Create a pull request and assign your collaborator
2. Merge into the main branch
In this section, all the individual steps come together to collaborate on the code of your project. First, we will start with a rehearsal of the individual steps as an exercise. Afterwards, we will try out a workflow that can be used as a step-by-step procedure in the future.
Working on a local branch - merging online
You can interpret the following exercise as a wrap-up of the steps learnt in the previous section:
1. clone the other repository to your local computer
2. Create a branch locally with a different name
3. Commit your adaptation
4. Push your branch to the remote repository
5. Create a Pull request
6. merge online when appropriate
Tip: All functionalities are described in the previous sections
The steps of the previous exercise provide the main building blocks. Nevertheless, when working on a project, a step-by-step procedure can help you remember the workflow in the beginning. A dedicated version for RStudio is available at this repo as a separate workflow overview.
Once you are satisfied with the status of your analysis, it makes sense to create a release:
Follow this tutorial to create a release.
- … main branch.
- … main branch.
- … pull request without their consent.
There’s no such thing as a free lunch…
…but if you’re hungry:
Information combined at INBO Git course.
You’re welcome to provide issues, pull requests,…