Zoomable scatter: cancer prevalence and survival rates
April 23, 2016
Last Friday, De Tijd published a multimedia article on cancer in Belgium and immunotherapy. To set the stage, the author wanted to include some numbers on the prevalence of different types of cancer in Belgium. This was the original chart (fantastic how they replicate the gradient in the legend):
My colleague Raphael Cockx, who produces our multimedia articles, saw ‘some room for improvement’ and obtained the data behind the graphic plus additional data on survival rates for the different cancers. He passed the data to me and asked me what we could do with it.
I made a quick sketch and came up with a kind of connected scatterplot: for every type of cancer, the prevalence rate goes on one axis and the survival rate on the other. Lines connect the dots for both sexes.
I made a first draft with Plot.ly to see what the data looked like. Although lung and breast cancer are outliers that push all other cancers into a tight cluster at the bottom, there are a lot of interesting things to see in the graph.
I wanted ♀ and ♂ symbols for the dots, and this is something Plot.ly can’t do. So I looked into Highcharts, because I had seen an example that used Font Awesome for markers.
But it became clear rather quickly that making what I wanted to make would be difficult with Highcharts too. So it was time to take out the Swiss army knife of dataviz, which is of course D3.js.
I wanted a responsive chart, so I googled ‘D3 responsive scatterplot’, which led me to this block. It already had tooltips with D3 tooltip. This proved to be a very good starting point.
Below is the final result (see it live here, in Dutch). Use the buttons to zoom in and out of the rarer types of cancer, and touch the symbols for the exact numbers.
Some code
D3 and SVG have some native symbols, but ♀ and ♂ are not among them. So I had to recreate these symbols with circles and lines in D3. The code groups the circle and the lines for every symbol and translates each group to the right place in the scatterplot.
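A minimal sketch of how such symbols could be built, not the original code. The field names (sex, prevalence, survival), the proportions of the lines and the choice of prevalence for the x axis are all assumptions for illustration. symbolLines only computes geometry; drawSymbols expects a D3 selection and two D3 scales to be passed in.

```javascript
// Line segments for one symbol, drawn around the origin (0, 0).
// r is the circle radius; the proportions are illustrative guesses.
function symbolLines(sex, r) {
  if (sex === "female") {
    // Venus: a stem below the circle plus a crossbar
    return [
      { x1: 0, y1: r, x2: 0, y2: 2.5 * r },
      { x1: -0.6 * r, y1: 1.75 * r, x2: 0.6 * r, y2: 1.75 * r }
    ];
  }
  // Mars: an arrow pointing to the upper right (SVG y grows downward)
  const k = Math.SQRT1_2;   // cos(45°) = sin(45°)
  const start = r * k;      // where the arrow leaves the circle
  const tip = 2.2 * r * k;  // arrow tip
  const head = 0.8 * r;     // length of the two arrow head strokes
  return [
    { x1: start, y1: -start, x2: tip, y2: -tip },
    { x1: tip, y1: -tip, x2: tip - head, y2: -tip },
    { x1: tip, y1: -tip, x2: tip, y2: -tip + head }
  ];
}

// One <g> per data point, moved into place with a translate.
// `svg` is a D3 selection; `x` and `y` are D3 scales (assumed to exist).
function drawSymbols(svg, data, x, y, r) {
  const groups = svg.selectAll("g.symbol")
      .data(data)
    .enter().append("g")
      .attr("class", d => "symbol " + d.sex)
      .attr("transform", d =>
        "translate(" + x(d.prevalence) + "," + y(d.survival) + ")");

  groups.append("circle").attr("r", r).attr("fill", "none");

  // Nested join: each group gets the line segments for its own sex
  groups.selectAll("line")
      .data(d => symbolLines(d.sex, r))
    .enter().append("line")
      .attr("x1", l => l.x1).attr("y1", l => l.y1)
      .attr("x2", l => l.x2).attr("y2", l => l.y2);

  return groups;
}
```

Because every part is drawn relative to the origin of its group, moving a symbol later only means changing the transform on the group. Stroke colour and width are left to CSS here.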
For the connecting lines between the male and female symbols, I used d3.nest to group the data for males and females for the same type of cancer. Then I filtered out cancer types that occur in only one sex, leaving only the types that have numbers for both sexes.
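The grouping and filtering can be sketched in plain JavaScript. This is a stand-in, not the original code: nestByCancer mimics the { key, values } entries that d3.nest().key(...).entries(...) returns, so the example runs without D3. The field names and the sample rows are made up.

```javascript
// Group rows by cancer type into { key, values } entries,
// the same shape d3.nest().key(...).entries(...) produces.
function nestByCancer(rows) {
  const byKey = new Map();
  rows.forEach(row => {
    if (!byKey.has(row.cancer)) byKey.set(row.cancer, []);
    byKey.get(row.cancer).push(row);
  });
  return Array.from(byKey, ([key, values]) => ({ key, values }));
}

// Keep only the cancer types that have both a male and a female row.
function bothSexes(rows) {
  return nestByCancer(rows).filter(group =>
    group.values.some(d => d.sex === "male") &&
    group.values.some(d => d.sex === "female"));
}

// Placeholder rows, not real figures
const sample = [
  { cancer: "type A", sex: "male",   prevalence: 10, survival: 50 },
  { cancer: "type A", sex: "female", prevalence: 8,  survival: 60 },
  { cancer: "type B", sex: "female", prevalence: 30, survival: 80 },
  { cancer: "type C", sex: "male",   prevalence: 4,  survival: 40 },
  { cancer: "type C", sex: "female", prevalence: 3,  survival: 45 }
];

console.log(bothSexes(sample).map(group => group.key));
// → [ 'type A', 'type C' ]
```

Each remaining entry holds exactly the two points a connecting line needs, so the filtered array can be bound directly to the line elements.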
On resize and when the user clicks one of the buttons on top, axes are resized and every element on the chart follows along. This is done by resetting the range of the scales and putting everything on the chart in its new place using the new scales. For the buttons, these transitions are animated.
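A rough sketch of that update cycle, under a few assumptions: the field names are hypothetical, a resize only changes the pixel range of the scales, and a zoom button changes the x domain (my guess at how zooming would be done). linearScale is a small stand-in for a D3 linear scale so the sketch runs on its own; groups is expected to be a D3 selection of the symbol groups.

```javascript
// Stand-in for a D3 linear scale, with chainable domain/range setters.
function linearScale(domain, range) {
  const scale = v =>
    range[0] + (v - domain[0]) / (domain[1] - domain[0]) * (range[1] - range[0]);
  scale.domain = d => { domain = d; return scale; };
  scale.range = r => { range = r; return scale; };
  return scale;
}

// Transform string for one data point under the current scales.
function place(d, x, y) {
  return "translate(" + x(d.prevalence) + "," + y(d.survival) + ")";
}

// Move every symbol group to its new position.
// Only button clicks are animated, via a D3 transition.
function reposition(groups, x, y, animate) {
  const target = animate ? groups.transition().duration(750) : groups;
  target.attr("transform", d => place(d, x, y));
}

// Window resize: new pixel ranges, same domains, no animation.
function onResize(groups, x, y, width, height) {
  x.range([0, width]);
  y.range([height, 0]);   // SVG y grows downward
  reposition(groups, x, y, false);
}

// Zoom button: new x domain (assumption), same ranges, animated.
function onZoom(groups, x, y, maxPrevalence) {
  x.domain([0, maxPrevalence]);
  reposition(groups, x, y, true);
}
```

The axes, connecting lines and labels would be updated in the same handlers, since they all read their positions from the same two scales.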
For every male-female pair of symbols, the graph shows the type of cancer as a label. Labels for rare types of cancer are only shown when zoomed in.
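One way the label rule could be expressed, as a sketch only. The cut-off value RARE_BELOW and the field names are hypothetical; each pair is assumed to be an object with a values array holding the male and female rows. labels is expected to be a D3 selection of text elements bound to those pairs.

```javascript
// Hypothetical cut-off: types whose prevalence stays below this count as rare.
const RARE_BELOW = 20;

// Should the label for this pair be visible at the current zoom level?
function labelVisible(pair, zoomedIn) {
  const maxPrevalence = Math.max(...pair.values.map(d => d.prevalence));
  return zoomedIn || maxPrevalence >= RARE_BELOW;
}

// Show or hide each label; returning null removes the inline style again.
function updateLabels(labels, zoomedIn) {
  labels.style("display", pair => labelVisible(pair, zoomedIn) ? null : "none");
}
```

Calling updateLabels from the zoom button handlers keeps the labels in step with the rest of the chart.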