Understand How to Use Visuals to Efficiently Explore and Effectively Communicate Data
Over the course of the first day, participants develop an appreciation for the visual language of quantitative data. Drawing on an understanding of visual perception, colour and data, we explore different ways of representing the same data, not only to explore it for ourselves but also to communicate a meaningful result to our audience.
In this example from a previous participant, sugar concentrations were measured over time in peppermint plants grown under different conditions. Among the many ways in which this seemingly straightforward data-set could be visualised, we identified the grammar that most readily communicated a clear result. In this case we can observe that sugar concentrations, in particular sucrose, increase over time under high CO2 drought conditions.
Learn Practical and Flexible Commands for Generating Meaningful Publication-quality Graphics using R
The second day begins with a hands-on tutorial in ggplot2, the R package which implements the grammar of graphics plotting concept introduced on the first day. All major grammatical elements are discussed and demonstrated using a built-in data-set.
By the end of the day students should have developed a visual solution for a data-set they have brought to class.
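As a rough illustration of how the grammatical elements combine in ggplot2, here is a minimal sketch. It assumes the built-in `mtcars` data-set (the text does not say which built-in data-set the tutorial uses), and the choice of variables, palette and theme is purely illustrative.

```r
library(ggplot2)

# Assumption: mtcars stands in for the (unnamed) built-in data-set.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +  # data + aesthetic mappings
  geom_point(size = 2, alpha = 0.8) +                              # geometry
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +        # statistics: linear fit per group
  scale_colour_brewer(name = "Cylinders", palette = "Dark2") +     # scales
  facet_wrap(~ am, labeller = label_both) +                        # facets: one panel per transmission type
  labs(x = "Weight (1000 lb)", y = "Fuel economy (mpg)") +         # labels
  theme_classic()                                                  # theme

print(p)
```

Each `+` adds one layer of the grammar, so swapping a single line (for example a different geometry) changes the representation while the data and mappings stay the same. A plot built this way can be written to a file with `ggsave()`.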
Develop Tools for Building Interactive Interfaces for Dynamic Data Exploration
Day 3 focuses on making interactive plots that can be shared with your colleagues or published on the web. We will explore interactivity in two ways.
First, we will learn about making interfaces that provide access to your data. This includes all the familiar interface features such as pulldown menus, check-boxes, radio buttons, etc. but also upload and download features to use the same script on different data sets or save the visualisations for publication.
In this example from a previous workshop, the student’s data-set consists of 4200 relative ΔCT observations (a proxy for gene expression). These observations were contained in 279 unique combinations of 5 other variables: the experimental CO2 concentration used, the number of days after hatching, the individual gene, the gene group and the tank from which the sample came.
All these elements come together in an interactive interface, which also provides the ability to choose the plot geometry. Different plot geometries reveal different trends in the data-set. The average relative ΔCT per tank is plotted upon activation of a check-box when the dot plot geometry is chosen.
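An interface of this kind could be sketched in shiny roughly as follows. This is not the participant's app: the data are simulated, and the column names (`tank`, `gene`, `co2`, `delta_ct`), the helper `make_plot()` and the two geometries offered are all assumptions made for illustration.

```r
library(shiny)
library(ggplot2)

# Simulated stand-in data; names and values are hypothetical.
set.seed(1)
expr <- expand.grid(tank = paste0("T", 1:6), gene = c("geneA", "geneB"),
                    rep = 1:10, stringsAsFactors = FALSE)
expr$co2 <- ifelse(expr$tank %in% c("T1", "T2", "T3"), "ambient", "high")
expr$delta_ct <- rnorm(nrow(expr))

# Build the plot for one gene with the chosen geometry.
make_plot <- function(data, gene, geometry, show_means) {
  d <- data[data$gene == gene, ]
  p <- ggplot(d, aes(x = tank, y = delta_ct, colour = co2))
  p <- switch(geometry,
    dot = p + geom_point(position = position_jitter(width = 0.15, height = 0),
                         alpha = 0.6),
    box = p + geom_boxplot()
  )
  # The per-tank mean is only added for the dot plot geometry.
  if (geometry == "dot" && show_means) {
    p <- p + stat_summary(fun = mean, geom = "point", shape = 95, size = 10)
  }
  p
}

ui <- fluidPage(
  selectInput("gene", "Gene", choices = unique(expr$gene)),
  radioButtons("geometry", "Plot geometry",
               choices = c("Dot plot" = "dot", "Box plot" = "box")),
  checkboxInput("means", "Show mean per tank (dot plot only)", value = FALSE),
  downloadButton("save", "Download plot"),
  plotOutput("plot")
)

server <- function(input, output) {
  current_plot <- reactive(
    make_plot(expr, input$gene, input$geometry, input$means)
  )
  output$plot <- renderPlot(current_plot())
  output$save <- downloadHandler(
    filename = "plot.pdf",
    content = function(file) {
      ggsave(file, current_plot(), device = "pdf", width = 6, height = 4)
    }
  )
}

# Launch only in an interactive session.
if (interactive()) shinyApp(ui, server)
```

Keeping the plotting logic in an ordinary function, separate from the interface, means the same code can be used in a static script and reused by both the on-screen plot and the download handler.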
Second, we will explore the use of tooltips and brushing to interact with the plot itself. An example of tooltips is shown in the following interactive binomial distribution app. When the user hovers over a bar, the point probability is given. Tooltips and brushing can be used with scatter plots and linked to tables, providing information on a single observation or many data points.
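The hover behaviour described above might be approximated with shiny's plot interaction options, as in the sketch below. It is an assumption-laden illustration rather than the app referred to in the text: the helper names `binom_table()` and `hover_label()`, the slider ranges and the text-based readout in place of a floating tooltip are all choices made here.

```r
library(shiny)
library(ggplot2)

# Point probabilities P(X = k) for k = 0..n.
binom_table <- function(n, p) {
  data.frame(k = 0:n, prob = dbinom(0:n, size = n, prob = p))
}

# Translate a hover x-coordinate into the nearest bar and its probability.
hover_label <- function(x, n, p) {
  if (is.null(x)) return("Hover over a bar")
  k <- round(x)
  if (k < 0 || k > n) return("Hover over a bar")
  sprintf("P(X = %d) = %.4f", k, dbinom(k, size = n, prob = p))
}

ui <- fluidPage(
  sliderInput("n", "Number of trials", min = 1, max = 50, value = 10),
  sliderInput("p", "Probability of success", min = 0, max = 1, value = 0.5),
  plotOutput("plot", hover = hoverOpts(id = "plot_hover")),
  verbatimTextOutput("info")
)

server <- function(input, output) {
  output$plot <- renderPlot(
    ggplot(binom_table(input$n, input$p), aes(x = k, y = prob)) +
      geom_col()
  )
  # input$plot_hover is NULL while the pointer is outside the plot.
  output$info <- renderText(
    hover_label(input$plot_hover$x, input$n, input$p)
  )
}

# Launch only in an interactive session.
if (interactive()) shinyApp(ui, server)
```

For scatter plots, shiny also provides `nearPoints()` and `brushedPoints()`, which return the rows of a data frame under the pointer or inside a brushed region and can feed a linked table.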
The Data Visualisation workshop enhances participants’ understanding of the importance of visually communicating their research results. Added to this, students leave the workshop with confidence that they can create impactful, well-designed figures using the R statistics package.
The high overall score of the Data Visualisation workshop rests on the unique features of our training approach. Students single out the personal knowledge and one-on-one attention of the instructor for praise. This approach helps them put the theory of Data Visualisation into practice when making their own figures. The tailor-made workshop manual is prized as a rich source of data visualisation ideas that inspires students to choose and implement optimal visualisations for their data.
Essentially anyone with some knowledge of R and some form of quantitative data that they need to visualise. This can be an in-house data scientist responsible for generating reports for colleagues and supervisors, a scientist preparing their next publication or presentation, or a journalist who wants to add another tool to their data journalism kit – anyone who appreciates that visualisation is an essential component of the data analysis and communication process.
Typically, participants in previous workshops have been graduate students in the life sciences, but we also invite people from outside science to participate.
In the past, participants have commented that we place too much emphasis on prior knowledge of R, to the point that some prospective participants decided not to attend. We prefer to err on the side of caution.
Data handling is not covered in this workshop. In short, if you have no experience using R, you may find this workshop very challenging. This is particularly true if you have a poor understanding of your data and how to handle it. For this situation we refer you to the Data Analysis Workshop, which provides an introduction to programming in R.
In our experience, if you have already worked with other programming/scripting languages (e.g. Python or MATLAB), you will have a much easier time, even if you only know the basics of R.
Another consideration is the format of your data – how your variables and observations are arranged. Rearranging your data manually is not only time-consuming but also error-prone. To help with this, we will briefly cover reshaping data. If you have a difficult time with R, this will again be a challenging part of the workshop for you. The less comfortable you are with R, the cleaner your data should be.
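To give a sense of what reshaping involves, the sketch below converts a small "wide" table into the "long" layout that ggplot2 generally expects, with one row per observation. The data values are invented, and the use of `tidyr::pivot_longer()` is an assumption; the text does not say which reshaping tools the workshop covers.

```r
library(tidyr)

# Hypothetical wide data: one row per plant, one column per sugar.
wide <- data.frame(
  plant    = c("p1", "p2", "p3"),
  sucrose  = c(1.2, 1.5, 1.1),
  glucose  = c(0.8, 0.9, 0.7),
  fructose = c(0.5, 0.6, 0.4)
)

# Long format: one row per plant-sugar combination.
long <- pivot_longer(
  wide,
  cols      = c(sucrose, glucose, fructose),
  names_to  = "sugar",
  values_to = "concentration"
)

print(long)
```

In the long layout, `sugar` becomes a variable in its own right, so it can be mapped directly to colour, facets or the x-axis.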
In short: no. Although there are plenty of built-in data-sets in R, which we make use of in the workshop, we emphasise that working on your own data is the most beneficial use of your time.
The workshop will be led by Dr. Rick Scavetta, a biologist and co-founder of Science Craft. Rick has over three years of experience developing data visualisation solutions and teaching on-site Data Visualisation workshops for scientists of various disciplines. He is frequently hosted by graduate schools associated with Max Planck Institutes and Clusters of Excellence across Germany.
Participants consistently remark on Rick’s professional yet approachable presence in workshops. The atmosphere is relaxed, fun and participatory – everyone is encouraged to contribute their opinions and experiences – which fosters a positive learning environment.
Rick has authored the reference book used in the workshop and also offers his services as a “visual editor” for scientific publications. Last year, Rick spoke at the re:publica conference in Berlin and was invited to present concepts in data visualisation for Quarks & Co., a popular science program on WDR in Germany.
Coffee, snacks and lunch for each day are included in the workshop price. No accommodation arrangements are provided.
No, participants are expected to bring their own computers and data.
The workshop is limited to 12 participants. Each 8-hour day consists of 6.5 hours of instruction and exercise time plus coffee breaks (2 x 15min) and a lunch break (1 hour). In addition, each student receives a reference book, written by the instructor, containing additional material.
Early bird registration (until 12.07.2015) is 500 EUR. Regular registration (13.07.2015 – 23.08.2015) is 550 EUR, and late registration (24.08.2015 – 01.09.2015) is 600 EUR.
You can apply for participation via Betahaus at this link.
We will work with you to develop an appropriate visualisation solution. So far, most participants have been able to use the introduced packages to handle their data and produce appropriate and meaningful plots. If there are specific plot types you are keen on making, e.g. triangle plots, chord diagrams, Sankey diagrams, networks, etc., it would be helpful if you inform us in advance of the workshop. Special packages or data classes may be required.
HTML widgets are an exciting and growing, but still relatively new, development in R. They will not be a focus of the workshop, but depending on the interest and R proficiency of the participants, we may explore some of these plot types.
For the second day, we’ll rely mostly on ggplot2 and associated packages such as RColorBrewer, gridExtra, GGally and ggthemes. For the third day, we’ll move on to R Markdown and shiny.
D3 is wonderful – but this workshop focuses on using R. R has gained immensely in popularity in recent years – both within and outside academia – because it is remarkably flexible and sits at the crossroads of powerful statistics, data analysis and data visualisation. For scientists already accustomed to using R, there is no need to move to another language to produce both publication-quality figures and interactive graphics.
We acknowledge the limitations of data visualisation in R – there is no one tool for all jobs. If there is a D3 plot type that you are keen on producing, this may not be the workshop for you.
Unfortunately, we are not able to offer subsidised or reduced registration fees.
If you require immediate results, we offer a Visual Editing service. You can submit enquiries directly to firstname.lastname@example.org.
Our goal is to offer this as a recurring in-house workshop in Berlin. You may want to attend a future workshop.
If you are associated with a research institute or graduate school, you may consider organising a Data Analysis workshop, which will give you the necessary background. You can follow this up with a Data Visualisation workshop at a later date.