Notes on my R / Git workflow
- Category: R-Stats
- Published on 10 March 2013
These are some notes on my current R / git workflow, which is quite fluid, and git has enough quirks that I usually forget part of it!
Creating Projects
I've used both RStudio and Eclipse. RStudio makes it easier to create a 'project' and add a local git repo to it, but Eclipse has more functionality (like roxygen comment generation), so I prefer Eclipse.
In Eclipse 3.7, I have both StatET and EGit installed. To start, create a new project as normal (File > New > R Project) and add any starting content, such as R and Data folders, a readme, etc.
Right click on the project name and select Team > Share Project. Select Git and then create a local Git repo. For some reason Eclipse has a check box to create the repo within the Eclipse workspace, and then gives you a warning that it's not recommended.
Then there are a few ways to commit: right click on the project and choose Team > Commit, or use the Git Staging view tab. Whichever route you take, select which files to commit and enter a comment. Your name and email are stored in Preferences > Team > eGit.
Backing Up 'locally'
To 'back up' (and potentially make available anywhere) I have a Linux server called Pegasus tucked away somewhere that does many, many jobs. It's actually an old work desktop and a tad underpowered, but it does the job.
One job is to act as a backup server, and that goes for git too, using two pieces of software, Gitosis and ViewGit (although it seems Gitosis hasn't been updated in a few years and isn't being actively maintained, which means no new bugs!).
To add a new repo to my server
On the local machine:
cd ~/gitosis-admin
kate gitosis.conf
(add lines for the new repo, save and close)
git commit -a -m "add repos for xxx"
git push
(the push is what makes Gitosis pick up the new configuration)
Then cd to the repo you're adding:
git remote add pegasus gitosis@pegasus:PaulHurleyMisc.git
git push pegasus master
and the repo is magically on the server. I can even visit http://pegasus/viewgit/index.php and see the new repo sitting there.
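The last two commands can be tried out without a server. This is only a minimal sketch: a local bare repository stands in for gitosis@pegasus:PaulHurleyMisc.git, and the scratch directory, user name and email are made up for the example. It reads the current branch name instead of assuming master, since newer Git installs may use a different default.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area; the bare repo below stands in for the Gitosis server.
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"

# A small local project with one commit.
git init --quiet "$work/project"
cd "$work/project"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity
echo "notes" > readme.md
git add readme.md
git commit --quiet -m "first commit"

# The same two steps as above, pointed at the stand-in remote.
git remote add pegasus "$work/server.git"
branch=$(git rev-parse --abbrev-ref HEAD)
git push --quiet pegasus "$branch"

# List the branches the "server" now holds.
git ls-remote --heads pegasus
```

Naming the remote (pegasus) rather than reusing origin makes it easy to keep several backup targets side by side on the same repo.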
Backing up to the cloud AKA Github
For things I'm happy to share, I use GitHub as a cloud-based way to share code (https://github.com/paulhurleyuk). The thing that always catches me out is that the repo has to be created on GitHub before I can push to it.
So create a repo on GitHub,
then, on your local machine:
git remote add github https://www.github.com/paulhurleyuk/testrepo.git
git push github master
and I then get an error because the push is rejected: the GitHub repo already contains something my local repo doesn't (usually a readme.md that I also have locally), so I need to do
git pull github master
and then merge/drop any changes before doing git push again.
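The rejected push and the pull-then-merge fix can be reproduced locally. This is a sketch under assumptions, not a record of what actually happened: a local bare repo stands in for GitHub, the file contents and identity are invented, and the conflict is resolved by keeping the local readme.md. The --no-rebase and --allow-unrelated-histories options are there because recent Git versions require them when the two histories share no commits; older versions did not.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)   # hypothetical scratch directory

# Stand-in for the GitHub repo: a bare repo that already holds a readme commit.
git init --quiet --bare "$work/github.git"
git init --quiet "$work/seed"
cd "$work/seed"
git config user.name "Example User"
git config user.email "user@example.com"
branch=$(git symbolic-ref --short HEAD)
echo "created on the site" > readme.md
git add readme.md
git commit --quiet -m "initial readme"
git push --quiet "$work/github.git" "$branch"

# Local project with its own readme.md and its own, unrelated history.
git init --quiet "$work/project"
cd "$work/project"
git config user.name "Example User"
git config user.email "user@example.com"
echo "written locally" > readme.md
git add readme.md
git commit --quiet -m "local readme"
git remote add github "$work/github.git"

# First push: rejected, because the remote has a commit we do not have.
if git push --quiet github "$branch" 2>/dev/null; then
    echo "push went through (not expected in this setup)"
else
    echo "push rejected, pulling first"
fi

# Pull the remote history in; both sides added readme.md, so the merge stops.
git pull --quiet --no-rebase --allow-unrelated-histories github "$branch" \
    || echo "merge stopped on readme.md"

# Keep the local version, finish the merge, and push again.
git checkout --ours readme.md
git add readme.md
git commit --quiet -m "merge remote readme, keep local version"
git push --quiet github "$branch"
```

Creating the GitHub repo empty (no readme) avoids the whole dance, since there is then nothing on the remote to merge.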
Some assorted Links
https://help.ubuntu.com/community/Git
http://ao2.it/wiki/How_to_setup_a_GIT_server_with_gitosis_and_gitweb
http://lostechies.com/jasonmeridth/2010/05/25/gitosis-and-gitweb-part-1-setup/
Photo Challenge Egg-stravaganza
- Category: Photography
- Published on 05 November 2012
This week's photo challenge was 'An Egg, a torch, and a piece of paper'. Well, I cheated a little (the kids have lost my torch, so I used my SB-24 speedlight, and this is technically an animation, not a single image), but there you go.

Geotagging photos with Digikam and OpenGPS Tracker on my Android
- Category: Photography
- Published on 21 October 2012
I love my Android! I used to manually geotag photos from memory using Digikam, but after I recently got an HTC One V Android phone I thought there must be another way. And there is. Following a guide here, I downloaded OpenGPS Tracker on my phone, started it tracking, and put it in my pocket on a recent walk. I then took photos as normal.
When I got home I downloaded the photos from my camera, 'Shared' my tracks as files on my phone, and transferred the gpx files to my laptop. I could then use the geolocation function in Digikam to match my photos to the tracks. I had to extend the time limit for matching to 240 seconds, but other than that it worked really well.
You can see the results in my Amsterdam Flickr set
Creating SVG Plots from R
- Category: R-Stats
- Published on 12 October 2012
I recently wanted to create a ggplot that I could then 'tweak' further. This is my solution: create an .svg file which can be loaded into a suitable application (I prefer Inkscape) and edited further.
# Build an example Plot
library(ggplot2)
dataframe <- data.frame(fac = factor(c(1:4)), data1 = rnorm(400, 100, sd = 15))
dataframe$data2 <- dataframe$data1 * c(0.25, 0.5, 0.75, 1)
testplot <- qplot(x = fac, y = data2, data = dataframe, colour = fac, geom = c("boxplot",
"jitter"))
testplot

# Produce a PNG plot
library(Cairo)
Cairo(800, 800, file = "testplot12200.png", type = "png", bg = "transparent",
pointsize = 12, units = "px", dpi = 200)
testplot
dev.off()

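# Produce an SVG file
# width and height are passed by name only (20 x 20 inches); also passing
# 800, 800 positionally would push those values onto other arguments
library(Cairo)
Cairo(width = 20, height = 20, file = "cairo_2.svg", type = "svg", bg = "transparent",
    pointsize = 12, units = "in", dpi = 400)
print(testplot)
dev.off()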
What lens should I buy next? Analysing and graphing a Digikam database using R
- Category: R-Stats
- Published on 10 October 2012
I use the open source photo management software Digikam (along with other tools such as Gimp and DarkTable). I obviously need very little encouragement to combine my geeky hobbies, so I quickly tried to interrogate Digikam with R, which is easy because Digikam keeps all its image info in a SQLite database, which R has support for.
So this post shows how I did it, along with some of the output, such as the focal length of my images over time. Looks like I need a telephoto lens! (This script and my Digikam db are on GitHub here.)
library(RSQLite)
## Loading required package: DBI
library(ggplot2)
library(plyr)
m <- dbDriver("SQLite")
basedir <- "/home/paul/RStudio/DigikamR/"
con <- dbConnect(m, dbname = paste(basedir, "data/digikam4.db", sep = ""))
Now we've opened the database, we can examine some of the tables within it.
# List the tables in the database
dbListTables(con)
## [1] "AlbumRoots" "Albums" "DownloadHistory"
## [4] "ImageComments" "ImageCopyright" "ImageHaarMatrix"
## [7] "ImageHistory" "ImageInformation" "ImageMetadata"
## [10] "ImagePositions" "ImageProperties" "ImageRelations"
## [13] "ImageTagProperties" "ImageTags" "Images"
## [16] "Searches" "Settings" "TagProperties"
## [19] "Tags" "TagsTree"
# List the columns of some of the interesting tables
names(dbReadTable(con, "ImageInformation"))
## [1] "imageid" "rating" "creationDate"
## [4] "digitizationDate" "orientation" "width"
## [7] "height" "format" "colorDepth"
## [10] "colorModel"
names(dbReadTable(con, "ImageComments"))
## [1] "id" "imageid" "type" "language" "author" "date"
## [7] "comment"
names(dbReadTable(con, "ImageMetadata"))
## [1] "imageid" "make"
## [3] "model" "lens"
## [5] "aperture" "focalLength"
## [7] "focalLength35" "exposureTime"
## [9] "exposureProgram" "exposureMode"
## [11] "sensitivity" "flash"
## [13] "whiteBalance" "whiteBalanceColorTemperature"
## [15] "meteringMode" "subjectDistance"
## [17] "subjectDistanceCategory"
names(dbReadTable(con, "ImageProperties"))
## [1] "imageid" "property" "value"
names(dbReadTable(con, "ImagePositions"))
## [1] "imageid" "latitude" "latitudeNumber"
## [4] "longitude" "longitudeNumber" "altitude"
## [7] "orientation" "tilt" "roll"
## [10] "accuracy" "description"
names(dbReadTable(con, "Images"))
## [1] "id" "album" "name"
## [4] "status" "category" "modificationDate"
## [7] "fileSize" "uniqueHash"
names(dbReadTable(con, "TagProperties"))
## [1] "tagid" "property" "value"
And now we can pull some of the interesting tables into data frames
# Pull some of the information together
Imgs <- dbReadTable(con, "Images")
ImgComments <- dbReadTable(con, "ImageComments")
ImgMeta <- dbReadTable(con, "ImageMetadata")
ImgInfo <- dbReadTable(con, "ImageInformation")
# and merge it together
ImgMerge <- merge(Imgs, ImgMeta, by.x = "id", by.y = "imageid")
ImgMerge <- merge(ImgMerge, ImgInfo, by.x = "id", by.y = "imageid")
# clean it up
ImgMerge$make <- as.factor(ImgMerge$make)
ImgMerge$model <- as.factor(ImgMerge$model)
ImgMerge$faperture <- as.factor(ImgMerge$aperture)
ImgMerge$fexposureTime <- as.factor(ImgMerge$exposureTime)
ImgMerge$fmodel <- as.factor(ImgMerge$model)
ImgMerge$Year <- format(as.POSIXct(ImgMerge$creationDate), format = "%Y")
ImgMerge$Month <- format(as.POSIXct(ImgMerge$creationDate), format = "%b")
Here are some plots
# and draw some graphs
ggplot(data = subset(ImgMerge, focalLength < 60), aes(x = as.POSIXct(creationDate),
y = focalLength, colour = model)) + geom_point()

ggplot(data = ImgMerge, aes(x = focalLength)) + geom_histogram(binwidth = 5,
aes(colour = as.factor(model))) + facet_grid(model ~ .)

qplot(data = ImgMerge, x = as.numeric(as.character(aperture)), y = log(as.numeric(as.character(exposureTime))),
colour = as.factor(model), geom = "point")
## Warning: Removed 2638 rows containing missing values (geom_point).

ggplot(data = subset(ImgMerge, model == "NIKON D5000"), aes(x = focalLength)) +
geom_histogram(binwidth = 5) + facet_grid(Year ~ .)

ggplot(data = subset(ImgMerge, model == "NIKON D5000"), aes(x = as.POSIXct(creationDate),
y = focalLength)) + geom_point()
## Warning: Removed 14 rows containing missing values (geom_point).

ggplot(data = subset(ImgMerge, model == "NIKON D5000" & focalLength < 60), aes(x = as.POSIXct(creationDate),
y = focalLength)) + geom_point(alpha = 0.2)

Git Error when pushing with a large file
- Category: R-Stats
- Published on 09 October 2012
Quick note: I had an error recently where neither RStudio, EGit nor the command line would push my repo to GitHub. I can't remember the exact error, although after some googling I found this SO answer that solved it:
git config http.postBuffer 524288000
This fixed my problem.
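For reference, here is a small sketch of what that setting does, run in a throwaway repository (the scratch directory is made up for the example). Without --global the value is written to the current repo's own config only, and the number is a size in bytes.

```shell
#!/bin/sh
set -e

# Hypothetical throwaway repo so the setting does not touch a real project.
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Raise the HTTP post buffer for this repo only (no --global).
git config http.postBuffer 524288000

# Read the value back, then show it in MiB.
git config --get http.postBuffer              # prints 524288000
echo "$((524288000 / 1024 / 1024)) MiB"       # prints 500 MiB
```

Since the setting only affects pushes over HTTP(S), it should make no difference for an SSH remote like the Gitosis one.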
