Load Excel files of phenotypic data from GrainGenes covering multiple locations and years, combine them into one data frame, and separate the genotype information from the phenotypic data.

Usage

format_phenot(p2d, years, locs, traits, cols2rem = NULL, distMatchTrait = 8)

Arguments

p2d

path to directory where tables are saved

years

numeric vector of years to look for

locs

character vector of tab names to look for (including Entry), or of location names that identify the trial. If several names correspond to one trial, list each variant in the vector and give each variant the final trial name as its vector name.

traits

character vector of trait names to look for. If several names correspond to one trait, list each variant as a value and use the desired trait name as its vector name. For example: traits=c("VSK","Heading","FDK"); names(traits)=c("VSK","HD","VSK")
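
A minimal sketch of how such a named vector can be built in base R, using the trait names from the example above. The grouping step at the end is only an illustration of what the names encode; it is not claimed to be what format_phenot does internally.

```r
# Values are the spellings to look for; names are the desired trait names.
traits <- c("VSK", "Heading", "FDK")
names(traits) <- c("VSK", "HD", "VSK")

# Equivalent one-step construction with inline names.
traits_inline <- c(VSK = "VSK", HD = "Heading", VSK = "FDK")
stopifnot(identical(traits, traits_inline))

# Illustration only: group the spellings by the name they map to.
variants <- split(unname(traits), names(traits))
print(variants)
```

Building the vector with inline names keeps each spelling next to the name it maps to, which makes a mismatched pair easier to spot.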

cols2rem

character vector of column names to remove, to avoid bad matching

distMatchTrait

numeric value, distance threshold for string matching. Default is 8. A larger distance yields more matches but is more prone to false matches.
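
To give a feel for how a distance threshold trades recall against false matches, here is a small sketch using the edit distance from utils::adist in base R. This assumes an edit-distance style of matching and a "less than or equal" cutoff; the metric actually used by format_phenot is not stated here. The helper within_dist and the column headers are hypothetical.

```r
# Hypothetical column headers as they might appear in a spreadsheet.
headers <- c("Heading", "Heading date", "Yield")

# Hypothetical helper: keep candidates whose edit distance to the target
# is at most max_dist (assumed cutoff rule, for illustration only).
within_dist <- function(target, candidates, max_dist) {
  d <- as.vector(utils::adist(target, candidates))
  candidates[d <= max_dist]
}

# Distances from the sought name to every header.
print(utils::adist("Heading", headers))

# A strict threshold keeps only the exact spelling.
print(within_dist("Heading", headers, 0))

# A loose threshold also accepts unrelated short names.
print(within_dist("Heading", headers, 8))
```

With short trait names, a threshold close to the length of the names accepts almost any candidate, which is one reason to remove troublesome columns through cols2rem.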

Value

list of 4 components:

  • var.match.info: data frame of variable matching

  • sheet.match.info: data frame of sheet matching (finding the relevant tabs)

  • phenot: data frame of combined phenotypic data for all years and locations

  • entry.info: data frame of combined genotype information
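
A hedged usage sketch showing a call and how the four components might be inspected. The directory, years, and tab names are placeholders rather than real data, and the call is guarded so the snippet still runs when the package that provides format_phenot is not loaded.

```r
# Component names as documented in the Value section.
expected <- c("var.match.info", "sheet.match.info", "phenot", "entry.info")

# Placeholder inputs (hypothetical path, years, and tab names).
p2d <- "path/to/tables"
years <- c(2020, 2021)
locs <- c("Entry", "LocationA")
traits <- c(VSK = "VSK", HD = "Heading", VSK = "FDK")

if (exists("format_phenot", mode = "function")) {
  res <- format_phenot(p2d, years, locs, traits,
                       cols2rem = NULL, distMatchTrait = 8)

  # Check the matching tables before trusting the combined data.
  print(head(res$var.match.info))
  print(head(res$sheet.match.info))

  # Combined phenotypes and genotype information.
  str(res$phenot)
  str(res$entry.info)

  # Report any documented component that is missing from the result.
  print(setdiff(expected, names(res)))
}
```

Reviewing var.match.info and sheet.match.info first is a reasonable habit, since a loose distMatchTrait can pair a column with the wrong trait.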

Author

Charlotte Brault