2 Git & Github

The two main tools you’ll learn about to start are:

  • Git is a version control system that lets you track changes to files over time. These can be any kind of file (e.g. doc, pdf, xls), but differences are most easily seen in plain-text formats (e.g. txt, csv, md). You can roll back changes made by you or by others. This makes a repository a safe playground for collaboration: you can experiment without fear, because any change can be undone.

  • Github is a website for storing your Git-versioned files remotely. It has many nice features, such as visualizing differences between images, rendering and diffing map data files, rendering text data files, and tracking changes in text.
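As a rough illustration of the rollback idea, the sketch below tracks a small CSV file, makes an experimental edit, and then discards it. The temporary folder, the file name data.csv, and the identity values are placeholders chosen for this example; they are not part of the tutorial's project.

```shell
# Sketch: track a file with Git, change it, then roll the change back.
set -eu
demo=$(mktemp -d)                            # throwaway folder (hypothetical location)
cd "$demo"
git init -q .
git config user.name "USER"                  # placeholder identity, stored in this repo only
git config user.email "USER@SOMEWHERE.EDU"

echo "site,count" > data.csv
git add data.csv
git commit -q -m "add data.csv"

echo "A,10" >> data.csv                      # an experimental edit
git diff --stat                              # summary of what changed since the last commit

git checkout -- data.csv                     # discard the edit; the committed version returns
cat data.csv
```

Because the edit was never committed, `git checkout -- data.csv` simply restores the last committed version of the file.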

Steps:

  1. Create Github login
  2. Create project website with Github Pages
  3. Edit README.md in Markdown
  4. Create HTML website content with R Markdown

2.1 Setup Github & Git

  1. Create a Github account at http://github.com, if you don’t already have one. For the username, I recommend all lower-case letters, as short as you can. If you use an email ending in .edu, you can request free private repositories via the GitHub Education discount.

  2. Configure Git with global settings. Open the Bash shell for Git and type the following:

# display your version of git
git --version

# replace USER with your Github user account
git config --global user.name USER

# replace USER@SOMEWHERE.EDU with the email you used to register with Github
git config --global user.email USER@SOMEWHERE.EDU

# list your config to confirm user.* variables set
git config --list
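
To check a single setting rather than scanning the full list, `git config` can read one key back. The sketch below is only an illustration: it works in a throwaway repository and omits `--global`, so the values are stored in that repository alone and your real settings are left untouched. The variable names are made up for this example.

```shell
# Sketch: set and read back individual config values in a scratch repository.
set -eu
repo=$(mktemp -d)                                      # hypothetical scratch location
git init -q "$repo"

# without --global, the values apply to this repository only
git -C "$repo" config user.name "USER"
git -C "$repo" config user.email "USER@SOMEWHERE.EDU"

# passing only the key prints its current value
name=$(git -C "$repo" config user.name)
email=$(git -C "$repo" config user.email)
echo "user.name=$name"
echo "user.email=$email"
```

A repository-level value takes precedence over the global one, which can be handy if one project needs a different email address.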

2.2 Github Workflows

The two most common workflow models for working with Github repositories depend on your permissions:

  1. writable: Push & Pull (simplest)

  2. read only: Fork & Pull Request (extra steps)

We will only go over the first (writable) mode. For more on the second mode, see Forking Projects · GitHub Guides.

2.2.1 Push & Pull

repo location            initialize   edit           update
github.com/OWNER/REPO    create
~/github/REPO            clone        commit, push   pull

Note that OWNER could be either an individual USER or group ORGANIZATION, which has member USERs.
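
The cycle of create, clone, commit, push and pull can be tried out without a Github account. In this sketch a local bare repository stands in for github.com/OWNER/REPO; the folder names (REPO.git, REPO, REPO-copy) and the file notes.txt are assumptions made for the demonstration.

```shell
# Sketch of the Push & Pull cycle against a local stand-in for the Github repository.
set -eu
base=$(mktemp -d)
remote="$base/REPO.git"                      # plays the role of github.com/OWNER/REPO

# initialize: create the shared repository, then clone it
git init -q --bare "$remote"
git clone -q "$remote" "$base/REPO"          # Git may warn that the repository is empty
cd "$base/REPO"
git config user.name "USER"                  # placeholder identity
git config user.email "USER@SOMEWHERE.EDU"

# edit: commit, then push
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "add notes"
git push -q -u origin HEAD                   # HEAD means the current branch

# a second working copy, e.g. on a collaborator's machine
git clone -q "$remote" "$base/REPO-copy"

# another edit is committed and pushed from the first copy
echo "second line" >> notes.txt
git commit -q -a -m "extend notes"
git push -q

# update: the second copy pulls the new commit
git -C "$base/REPO-copy" pull -q --ff-only
cat "$base/REPO-copy/notes.txt"
```

With a real Github repository, the only difference is that the clone URL looks like https://github.com/OWNER/REPO.git instead of a local path.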

2.3 Create Repository p2p-demo

Now you will create a Github repository for a project.

  1. Create a repository called p2p-demo.

    Please be sure to tick the box to Initialize this repository with a README. Otherwise defaults are fine.

  2. Create a branch called gh-pages.

    Per pages.github.com, since this will be a project site, only web files in the gh-pages branch will show up at http://USER.github.io/REPO. For a user (or organization) site, the REPO must be named USER.github.io (or ORG.github.io), and then the default master branch will contain the web files for the website http://USER.github.io (or http://ORG.github.io). See also User, Organization, and Project Pages - Github Help.

  3. Set the default branch to gh-pages, NOT the default master.

  4. Delete the branch master, which will not be used.
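
Steps 2 to 4 above are done in the Github web interface, but roughly the same result can be sketched from the command line. This example again uses a local bare repository as a stand-in for Github. The `symbolic-ref` line is only a local substitute for changing the default branch in the repository settings, and all folder names are hypothetical.

```shell
# Sketch: create gh-pages, make it the default branch, and delete the original branch.
set -eu
base=$(mktemp -d)
remote="$base/REPO.git"                      # local stand-in for the Github repository
git init -q --bare "$remote"
git clone -q "$remote" "$base/REPO"          # Git may warn that the repository is empty
cd "$base/REPO"
git config user.name "USER"                  # placeholder identity
git config user.email "USER@SOMEWHERE.EDU"
echo "# REPO" > README.md
git add README.md
git commit -q -m "initial commit"
git push -q -u origin HEAD
old=$(git symbolic-ref --short HEAD)         # original default branch (master in this tutorial)

# 2. create the gh-pages branch and publish it
git checkout -q -b gh-pages
git push -q -u origin gh-pages

# 3. make gh-pages the default branch (on Github: a repository setting)
git -C "$remote" symbolic-ref HEAD refs/heads/gh-pages

# 4. delete the original branch, locally and on the remote
git branch -q -d "$old"
git push -q origin --delete "$old"

git branch -a                                # only gh-pages should remain
```

The order matters: a remote will normally refuse to delete the branch it currently treats as its default, which is why step 3 comes before step 4.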

2.3.1 Edit README.md in Markdown

Commit your first change by editing README.md, which is written in Markdown, a simple syntax for conversion to HTML. Update the contents of README.md with the following, which includes headers, a numbered list, and an image:

# p2p-demo

Wrangling data with R.

## Introduction

This repository demonstrates **software** and _formats_:

1. **Git**
1. **Github**
1. _Markdown_
1. _Rmarkdown_

## Conclusion

![](https://octodex.github.com/images/labtocat.png)

Now click on Preview changes to see the Markdown rendered as HTML.

Notice the syntax for:

  • numbered list gets automatically sequenced: 1., 1.
  • headers get rendered at multiple levels: #, ##
  • link: [](http://...)
  • image: ![](http://...)
  • italics: _word_
  • bold: **word**

See Mastering Markdown · GitHub Guides and add some personalized content of your own to the README, like a bulleted list or blockquote.

2.4 Create index.html

By default, index.html is the file served when someone visits the site's address. Go ahead and create a new file named index.html with the following basic HTML:

<!DOCTYPE html>
<html>
<body>

<h1>My First Heading</h1>

<p>My first paragraph.</p>

</body>
</html>

You’ll be prompted to clone this repository into a folder on your local machine.

See GitHub Desktop User Guides for more. You could also do this from the Bash shell for Git with the command git clone https://github.com/USER/REPO.git, replacing USER with your Github username and REPO with p2p-demo. Or you can use the Github Desktop App menu File -> Clone Repository…

2.5 Create RStudio Project with Git Repository

Next, you will clone the repository onto your local machine using RStudio. I recommend creating it in a folder github under your user or Documents folder.

Open RStudio and, under the menu File -> New Project… -> Version Control -> Git, enter the repository URL with the .git extension (also available from the repository’s Clone button).

If it all works correctly then you should see the files downloaded and showing up in the Files pane of RStudio. If RStudio is configured correctly to work with Git, then you should also see a Git pane.

2.6 Create index.Rmd in Rmarkdown

Back in RStudio, let’s create a new Rmarkdown file, which allows us to weave markdown text with chunks of R code to be evaluated and output content like tables and plots.

Choose File -> New File -> Rmarkdown… -> Document, select the HTML output format, and click OK.

You can give it a Title of “My Project”. After you click OK, most importantly use File -> Save As and name the file index (it will be saved with the filename extension, as index.Rmd).

Some initial text is already provided for you. Let’s go ahead and “Knit HTML”.

Notice how the markdown is rendered as before, and how R code chunks are surrounded by 3 backticks and {r LABEL}. The chunks are evaluated and return output text in the case of summary(cars) and an output plot in the case of plot(pressure).

Notice how the code plot(pressure) is not shown in the HTML output because of the R code chunk option echo=FALSE.

Before we continue exploring Rmarkdown, visit the Git pane, check all modified (M) or untracked (?) files, click Commit, enter a message like “added index” and click the “Commit” button. Then click Push (up green arrow) to push the locally committed changes on your laptop up to the Github repository online. This will update https://github.com/USER/p2p-demo, and now you can also see your project website with a default index.html viewable at http://USER.github.io/p2p-demo
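
The Git pane actions above correspond roughly to four commands in the Bash shell. The sketch below runs them against a local stand-in for the Github repository, so it can be tried safely. The setup lines, folder names, and file contents are assumptions for the demo, and the one-line index.Rmd is only a placeholder for the real document.

```shell
# Sketch: the command-line equivalent of check files, Commit, and Push.
set -eu
base=$(mktemp -d)
remote="$base/p2p-demo.git"                  # local stand-in for github.com/USER/p2p-demo
git init -q --bare "$remote"
git clone -q "$remote" "$base/p2p-demo"      # Git may warn that the repository is empty
cd "$base/p2p-demo"
git config user.name "USER"                  # placeholder identity
git config user.email "USER@SOMEWHERE.EDU"
echo "# p2p-demo" > README.md
git add README.md
git commit -q -m "initial commit"
git push -q -u origin HEAD

# local work: one modified file and one new file
echo "Wrangling data with R." >> README.md
echo "placeholder for the Rmarkdown document" > index.Rmd

git status --short                           # M marks modified files, ?? marks untracked ones
git add README.md index.Rmd                  # like ticking the checkboxes in the Git pane
git commit -q -m "added index"               # like the Commit button
git push -q                                  # like the Push arrow
```

Note that `git status --short` shows untracked files as `??`, where the RStudio Git pane shows a single `?`.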

For more on Rmarkdown:

A more advanced topic worth mentioning is dealing with merge conflicts.
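
For a first impression of what a merge conflict looks like, here is a minimal sketch in a throwaway repository. Two branches change the same line of a file, so Git cannot merge them automatically and leaves conflict markers in the file. The branch name experiment and the file notes.txt are hypothetical, and the resolution shown (overwriting the file by hand) is just one simple option.

```shell
# Sketch: provoke a merge conflict, inspect it, and resolve it by hand.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "USER"                  # placeholder identity
git config user.email "USER@SOMEWHERE.EDU"

echo "title: draft" > notes.txt
git add notes.txt
git commit -q -m "start notes"
main_branch=$(git symbolic-ref --short HEAD) # whatever the default branch is called

# change the line on a side branch
git checkout -q -b experiment
echo "title: experiment" > notes.txt
git commit -q -a -m "experiment title"

# change the same line differently on the original branch
git checkout -q "$main_branch"
echo "title: final" > notes.txt
git commit -q -a -m "final title"

# the merge stops because both branches edited the same line
git merge -q experiment || echo "merge stopped: conflict in notes.txt"
conflicted=$(grep -c '^<<<<<<<' notes.txt)   # count the conflict-marker blocks
cat notes.txt                                # shows the <<<<<<<, ======= and >>>>>>> markers

# resolve: write the version you want, stage it, and conclude the merge
echo "title: final experiment" > notes.txt
git add notes.txt
git commit -q -m "resolve title conflict"
```

The text between the `<<<<<<<` and `=======` markers comes from the current branch, and the text between `=======` and `>>>>>>>` comes from the branch being merged in.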

2.7 Exercise: Gil’s Israel Sites Dataset

Gil Rilov shared the following dataset for us to play with:

Please download and open this dataset. Your task is to investigate this dataset and prepare it for submission to OBIS.

2.7.1 Task: Provide Excel cell ranges for how you would divide the data into tables

For reading and wrangling data in R, please see cheat sheets and resources mentioned in: