Statistics and R « The Echinacea Project

Dealing with XML in R

The XML package for R is available on CRAN and is based on RSXML. It may make parsing the XML that the Topcon software puts out easier than writing a parser ourselves.
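A minimal sketch of what reading points with the XML package could look like. The XML layout here (a points element holding point elements with lat and lon children) is made up for illustration and is not real Topcon output; only the XML package functions (xmlParse, xpathSApply, xmlGetAttr, xmlValue) are real.

```r
# Sketch only: this XML layout is hypothetical, not actual Topcon output.
library(XML)

xml_text <- '<points>
  <point id="101"><lat>45.8201</lat><lon>-95.7203</lon></point>
  <point id="102"><lat>45.8202</lat><lon>-95.7210</lon></point>
</points>'

# parse the string into an internal document
doc <- xmlParse(xml_text, asText = TRUE)

# pull one attribute and two child values out of every <point> node
pts <- data.frame(
  id  = xpathSApply(doc, "//point", xmlGetAttr, "id"),
  lat = as.numeric(xpathSApply(doc, "//point/lat", xmlValue)),
  lon = as.numeric(xpathSApply(doc, "//point/lon", xmlValue)),
  stringsAsFactors = FALSE
)
pts
```

The XPath expressions would need to change to match whatever element names the real files use.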

file for Ian’s phenology analysis

Ian wants to quantify the overlap of flowering time between all pairs of plants in his experiment. The following R script reads his file with the flowering schedule of all 31 plants in his experiment and writes a file called ianPhenPairs.csv that has the flowering schedule of every possible pair combination (one per line). Note that each plant has a separate record as a sire and as a dam, so every pair appears twice.

# script ian.phenology.r

pp <- read.csv("https://echinaceaproject.org/wp-content/uploads/ian.phenology.final.csv")

str(pp)

p <- merge(pp,pp, by = NULL)

str(p)

31^2 == dim(p)[1]

write.csv(p, "ianPhenPairs.csv", row.names = FALSE)
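Once the pairs are built, one possible way to quantify overlap is to count the days both plants are flowering. This is only a sketch on toy data: the column names (plant, start, end) and the dates are hypothetical and may not match Ian's file, and counting shared days is just one of several possible overlap measures.

```r
# Toy schedule; column names and dates are made up for illustration.
toy <- data.frame(plant = c("a", "b", "c"),
                  start = as.Date(c("2008-07-01", "2008-07-05", "2008-07-20")),
                  end   = as.Date(c("2008-07-10", "2008-07-15", "2008-07-25")),
                  stringsAsFactors = FALSE)

pr <- merge(toy, toy, by = NULL)        # all pairs; columns get .x and .y suffixes
pr <- pr[pr$plant.x != pr$plant.y, ]    # drop each plant paired with itself

# the shared interval runs from the later start to the earlier end
first.day <- pmax(as.numeric(pr$start.x), as.numeric(pr$start.y))
last.day  <- pmin(as.numeric(pr$end.x), as.numeric(pr$end.y))

# inclusive day count; pairs that never overlap get 0
pr$overlap.days <- pmax(0, last.day - first.day + 1)
pr[, c("plant.x", "plant.y", "overlap.days")]
```

Here plants a and b share 5 July through 10 July, so their overlap is 6 days, while c overlaps with neither.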

Topcon GPS data

I dumped the Topcon’s GPS data into a csv. This data hasn’t been cleaned and contains a couple of errors that have yet to be fixed, but it should be enough to flesh out some R magic to help parse the output. The point number, lat, long, and {all the entered data for the points} are comma separated, the last being one big glob of stuff.

The first couple of values are proper points but use the wrong dictionary, so those will have to be done separately.

stipa.csv
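A rough sketch of splitting lines like these into point number, lat, long, and the trailing glob. The two sample lines are invented (they are not taken from stipa.csv), and the pattern assumes the first three fields never contain commas.

```r
# Invented sample lines in the described layout: point, lat, long, glob
raw <- c("101,45.8201,-95.7203,echinacea,flag 12,note",
         "102,45.8202,-95.7210,stipa")

# capture the first three comma-separated fields, then everything else
m <- regmatches(raw, regexec("^([^,]*),([^,]*),([^,]*),(.*)$", raw))
ok <- lengths(m) == 5   # full match plus four captured groups

gps <- data.frame(point = sapply(m[ok], `[`, 2),
                  lat   = as.numeric(sapply(m[ok], `[`, 3)),
                  long  = as.numeric(sapply(m[ok], `[`, 4)),
                  glob  = sapply(m[ok], `[`, 5),
                  stringsAsFactors = FALSE)
gps
raw[!ok]   # lines that did not fit the pattern, to fix by hand
```

Keeping the glob as a single string leaves the dictionary-specific parsing for a later step.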

03 July – row/position R data and csv files

I only need 680 positions/site, because the seeds will be in between the plug points. So attached is a .doc and an R file w/ the script to create 3 sites with ~680 positions in each. I have also attached the resulting .csv file, 3 columns “site”, “row”, and “pos”.

KG_row&pos_03 July.doc

KG_row&pos_03 July.R

KG_positions_03July.csv
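For reference, a table with the same three columns can be sketched with expand.grid. The 20 rows by 34 positions layout is an assumption chosen only because it gives 680 per site; the attached script may lay things out differently.

```r
# Assumed layout: 20 rows x 34 positions = 680 positions per site.
positions <- expand.grid(pos = 1:34, row = 1:20, site = 1:3)
positions <- positions[, c("site", "row", "pos")]   # put columns in file order

table(positions$site)   # number of positions in each site
# write.csv(positions, "positions_sketch.csv", row.names = FALSE)  # hypothetical file name
```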

Here’s the breakdown:
site breakdown.xls

Next steps:

Assign each of the new.env ids to a row and position. See file: sane3blocks.csv

Create labels.

Put labels on envelopes.

Assign each plug to a row and position (keeping in mind that they’re already randomized in the trays.)

Develop planting protocol.

Organize materials for planting.

Mow sites.

Plant.
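The id assignment step could be sketched like this. The ids and the 20 by 34 layout are hypothetical, and this shuffles ids across all sites at once, which may not be what the block design in sane3blocks.csv calls for.

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable

# assumed layout, 680 positions in each of 3 sites
positions <- expand.grid(pos = 1:34, row = 1:20, site = 1:3)

# hypothetical ids, one per position
ids <- sprintf("env%04d", 1:nrow(positions))

# sample() with no size argument returns a random permutation
positions$id <- sample(ids)
head(positions)
```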

plants for Katie

Here’s a list of plants that are available for Katie to use in inb1:
plantsForKatie.csv
I made this list with this script: plantsForKatieKoch.r

Allegra’s pollination data .csv file

Here is my dataset that I am working on analyzing in R as a .csv file.

halverson.data.091.csv

Stuart, here is my R script so far:

halverson.data.analysis1.R

I made new columns in the .csv spreadsheet for the factors and levels we discussed. I will work on a list of hypotheses to test. I think I changed the definition of “y” when I did my 24 hour analysis. Can I give “y” a different name for each analysis? Or does the code need to read a defined “y” each time?

Thanks for the help and check out the graph of 24 hours and the summary m2.

Allegra
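On the naming question: a model formula in R just refers to whatever column or object name it is given, so each analysis can have its own response name. A small illustration with made-up numbers (the names hours, y24, and y48 are hypothetical and not from halverson.data.091.csv):

```r
# Made-up data with a separate response column for each analysis
d <- data.frame(hours = c(1, 2, 3, 4, 5, 6),
                y24   = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2),
                y48   = c(1.0, 1.5, 2.2, 2.4, 3.1, 3.4))

m24 <- lm(y24 ~ hours, data = d)   # 24 hour analysis
m48 <- lm(y48 ~ hours, data = d)   # a second analysis with its own response

coef(m24)
coef(m48)
```

Giving each response its own name avoids accidentally overwriting "y" between analyses.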

shrivel data

Here’s a snippet of R code showing how to extract info from the shrivel character data (a file is below).

df <- data.frame(shrivel.txt =c("x", "xoxx", "xxxx", "oooo", "xoooo"))
df      # start off with this data frame
str(df)

df$shrivel.count <- nchar(as.character(df$shrivel.txt)) #add column

vx <- gsub("o", "", df$shrivel.txt)  # replace o with ""
vx
df$shrivel.xs <- nchar(vx)           # make a new column in df

vo <- gsub("x", "", df$shrivel.txt)  # replace x with ""
vo
df$shrivel.os <- nchar(vo)           # make a new column in df

str(df)
df      # final data frame

codeForAllegra.r
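A possible follow-up check, assuming each string should contain only x and o: the two counts should add up to the total length, and rows where they don't probably have a typo. The "xo?x" entry is invented to show a failing row.

```r
# Toy data; the third entry has a stray character on purpose
df <- data.frame(shrivel.txt = c("x", "xoxx", "xo?x", "oooo"),
                 stringsAsFactors = FALSE)
txt <- as.character(df$shrivel.txt)

df$shrivel.count <- nchar(txt)
df$shrivel.xs <- nchar(gsub("[^x]", "", txt))   # keep only the x's
df$shrivel.os <- nchar(gsub("[^o]", "", txt))   # keep only the o's

# flag rows where x's plus o's do not account for every character
bad <- df$shrivel.xs + df$shrivel.os != df$shrivel.count
df[bad, ]
```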

lists of random numbers for visors

Here’s a snippet of code I used to generate files to upload to visors.

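makeRandFileForVisor <- function(size = 50, fname = "xyz"){
  # write the numbers 1:size in random order to E:\shared\rand<size><fname>.txt
  # (backslashes in the path must be doubled inside an R string)
  write.table(sample(1:size),
              file = paste("E:\\shared\\rand",
                           size,
                           fname,
                           ".txt",
                           sep=""),
              quote= FALSE,
              row.names= FALSE,
              col.names= paste("rand",size,fname, sep=""))
}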
visors <- c("ag","dr","kg","ad","cr","gk",
            "mmj","mj","ah","gd","sw","rs")
for (i in visors) {
makeRandFileForVisor(20,i)
makeRandFileForVisor(50,i)
makeRandFileForVisor(100,i)
makeRandFileForVisor(200,i)
}
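For trying this on a machine without an E: drive, here is a variant sketch that takes the output folder as an argument. The function name makeRandFile and the dir parameter are hypothetical additions, and it defaults to R's temporary directory.

```r
makeRandFile <- function(size = 50, fname = "xyz", dir = tempdir()) {
  colname <- paste("rand", size, fname, sep = "")
  path <- file.path(dir, paste(colname, ".txt", sep = ""))
  # one column holding 1:size in random order, with a header line
  write.table(sample(1:size), file = path, quote = FALSE,
              row.names = FALSE, col.names = colname)
  invisible(path)   # return the path so the file can be checked
}

f <- makeRandFile(20, "ag")
back <- read.table(f, header = TRUE)
sort(back[[1]])     # sorting the shuffled numbers should give back 1 to 20
```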

visiting flags

This file lists flags in random orders suitable for pollinator observation tomorrow.

Here’s the R code used:

flagOrder <- function() {
  # print five random orderings of flags A-H, one per line, then a blank line
  for (j in 1:5) cat(sample(LETTERS[1:8]), "\n")
  cat("\n")
}
for (i in 1:20) flagOrder()
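The code above prints to the console. To write the orders straight to a file, something like the following sketch would work; the file name flagOrders.txt and the helper oneOrder are hypothetical.

```r
# one random ordering of flags A-H as a single string
oneOrder <- function() paste(sample(LETTERS[1:8]), collapse = " ")

# 20 blocks, each with five orderings followed by a blank line
blocks <- replicate(20, c(replicate(5, oneOrder()), ""), simplify = FALSE)

out <- file.path(tempdir(), "flagOrders.txt")   # hypothetical file name
writeLines(unlist(blocks), out)
```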

spp coordinates

I generated a list of 40 random UTM coordinates for SPP and posted them here: sppRandCoords.csv.

Here’s the R code I used to generate random coordinates…

df <-  data.frame(order= 1:40,
                  E= round(runif(40,  286100,  286900),2),
                  N= round(runif(40, 5077080, 5077500),2))
write.csv(df, file= "sppRandCoords.csv", row.names= FALSE)

I gleaned the rough SPP corner coordinates from Google Earth (UTM zone 15T):

NE 286900 E 5077500 N

SE 286900 E 5077080 N

NW 286100 E 5077500 N

SW 286100 E 5077080 N
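As a sketch, the runif limits can be derived from the corner table rather than typed in by hand, which makes it easy to confirm that every point falls inside the box. The seed is arbitrary and only there to make the sketch repeatable, so these points will not match sppRandCoords.csv.

```r
# the four corners listed above
corners <- data.frame(corner = c("NE", "SE", "NW", "SW"),
                      E = c(286900, 286900, 286100, 286100),
                      N = c(5077500, 5077080, 5077500, 5077080))

e.lim <- range(corners$E)   # min and max easting
n.lim <- range(corners$N)   # min and max northing

set.seed(42)                # arbitrary seed, only for repeatability
pts <- data.frame(order = 1:40,
                  E = round(runif(40, e.lim[1], e.lim[2]), 2),
                  N = round(runif(40, n.lim[1], n.lim[2]), 2))

# every point should lie inside the bounding box
all(pts$E >= e.lim[1] & pts$E <= e.lim[2] &
    pts$N >= n.lim[1] & pts$N <= n.lim[2])
```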

Here's a snippet of R code to make a plot of the points and to make a file with latitudes & longitudes:

df <- read.csv(
"https://echinaceaproject.org/wp-content/uploads/sppRandCoords.csv")
# plot the points, labeled by their order
plot(df$E, df$N, asp = 1, type = "n")
text(df$E, df$N, labels= df$order)
library(PBSmapping)
# as.EventData expects columns named EID, X, and Y
names(df) <- c("EID", "X", "Y")
df <- as.EventData(df)
attr(df, "projection") <- "UTM"
attr(df, "zone") <- 15
fred <- convUL(df, km=FALSE)   # convert UTM (in meters) to longitude/latitude
write.csv(fred, file= "sppRandLL.csv", row.names= FALSE)

Here's a link to those 40 random points in a lat/long projection: sppRandLL.csv.
