Course outline: Introductory R for Biologists training
I. Introduction and preliminaries
1. Overview
Making R more friendly, R and available GUIs
Related software and documentation
R and statistics
Using R interactively
An introductory session
Getting help with functions and features
R commands, case sensitivity, etc.
Recall and correction of previous commands
Executing commands from or diverting output to a file
Data permanency and removing objects
Good programming practice: self-contained scripts and good readability (e.g. structured scripts, documentation, Markdown)
Installing packages: CRAN and Bioconductor
2. Reading data
Txt files (read.delim)
CSV files
3. Simple manipulations: numbers, vectors and arrays
Vectors and assignment
Vector arithmetic
Generating regular sequences
Logical vectors
Missing values
Character vectors
Index vectors; selecting and modifying subsets of a data set
Array indexing. Subsections of an array
Index matrices
The array() function and simple operations on arrays, e.g. multiplication, transposition
Other types of objects
4. Lists and data frames
Constructing and modifying lists
Concatenating lists
Data frames
Making data frames
Working with data frames
Attaching arbitrary lists
Managing the search path
5. Data manipulation
Selecting, subsetting observations and variables
Filtering, grouping
Recoding, transformations
Aggregation, combining data sets
Forming partitioned matrices, cbind() and rbind()
The concatenation function, c(), with arrays
Character manipulation, stringr package
Short introduction to grep and regexpr
6. More on Reading data
XLS, XLSX files
readr and readxl packages
SPSS, SAS, Stata, … and other data formats
Exporting data to txt, csv and other formats
7. Grouping, loops and conditional execution
Grouped expressions
Control statements
Conditional execution: if statements
Repetitive execution: for loops, repeat and while
Introduction to apply, lapply, sapply, tapply
8. Functions
Creating functions
Optional arguments and default values
Variable number of arguments
Scope and its consequences
9. Simple graphics in R
Creating a Graph
Density Plots
Dot Plots
Bar Plots
Line Charts
Pie Charts
Scatter Plots
Combining Plots
II. Statistical analysis in R
1. Probability distributions
R as a set of statistical tables
Examining the distribution of a set of data
2. Testing of Hypotheses
Tests about a Population Mean
Likelihood Ratio Test
One- and two-sample tests
Chi-Square Goodness-of-Fit Test
Kolmogorov-Smirnov One-Sample Statistic
Wilcoxon Signed-Rank Test
Two-Sample Test
Wilcoxon Rank Sum Test
Mann-Whitney Test
Kolmogorov-Smirnov Test
3. Multiple Testing of Hypotheses
Type I Error and FDR
ROC curves and AUC
Multiple Testing Procedures (BH, Bonferroni etc.)
4. Linear regression models
Generic functions for extracting model information
Updating fitted models
Generalized linear models
The glm() function
Logistic Regression
Linear Discriminant Analysis
Unsupervised learning
Principal Components Analysis
Clustering Methods (k-means, hierarchical clustering, k-medoids)
5. Survival analysis (survival package)
Survival objects in R
Kaplan-Meier estimate, log-rank test, parametric regression
Confidence bands
Censored (interval censored) data analysis
Cox PH models, constant covariates
Cox PH models, time-dependent covariates
Simulation: Model comparison (Comparing regression models)
6. Analysis of Variance
Two-Way Classification of ANOVA
III. Worked problems in bioinformatics
Short introduction to limma package
Microarray data analysis workflow
Data download from GEO: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
Data processing (QC, normalisation, differential expression)
Volcano plot
Clustering examples and heatmaps