This lesson is in the early stages of development (Alpha version)

Exploring tabular data with R using a transcriptomic example dataset: Instructor Notes

Learner profiles

These learner profiles describe some examples of our target audience. They were developed with reference to people who have taken Carpentries courses at the University of Edinburgh and who filled out a survey we conducted on research computing needs in Edinburgh’s School of Biological Sciences.

Jenny is a professor in plant sciences working on the response of crops to viral infections.

She has been analyzing data using MS Excel for a decade, but has never programmed or used the command line before. Her PhD student, who did the bioinformatics for the lab, just graduated and left.

Jenny would like to be able to plan and execute next-generation sequencing experiments in her lab, but more immediately needs to be able to use publicly available data to advance her research. Another group in the field recently published a paper containing transcriptomic analysis of maize infected with Maize Iranian mosaic virus. Jenny would like to know whether some genes she is studying change their expression in this dataset, but the paper does not mention them specifically.

Because of Jenny’s heavy teaching load, two children, and other commitments, she doesn’t have a lot of time to devote to learning new skills.

Data Carpentry will teach Jenny how to get started with downloading, wrangling, and visualizing publicly available data. It will kick-start her learning of tools that make it easier to analyze and visualize her data.

Bob has just started a PhD in biotechnology after spending a few years working in the brewing industry.

Bob plays around with robots for fun, has programmed a little in Python, and did an undergraduate project using JavaScript, but has never used R before.

His PhD project is likely to involve some transcriptomic and proteomic analysis, and he is curious to know how that analysis might work and what R can do. He wants some insight into which tools are actually going to be useful to him during the PhD.

Data Carpentry will teach Bob about R/tidyverse’s capabilities for summarizing and displaying data, and give him a flavour of the kinds of analyses that are used and useful in transcriptomics.

Sneha is a second year PhD student in cell biology, conducting experiments to study RNA splicing in mammalian cells.

She moved straight from an honours undergraduate program in biology to the PhD. Sneha has not used R or Python before, but has used Excel and GraphPad Prism to analyze data and considers herself an intermediate (non-beginner) user of those tools.

Lack of programming skills is Sneha’s biggest frustration in her work. It would really help if she could acquire the skills to analyse RNA-seq and microarray data.

Programming and data visualization using R/Python is Sneha’s biggest perceived need.

Data Carpentry will build Sneha’s confidence that she can learn to do this. It will push her to install the software on her computer, introduce her to some tools, and put her in touch with others who will learn alongside her over the coming months.