This is an ongoing article where I explore a core piece of functionality provided by d3: selections and data binding. Why? I realized I didn't fully understand how it works, and that was limiting the benefit I get from the library. Writing this is mostly a way to clarify my own thoughts on the topic.
Let's start with the basics:
import { select } from 'd3-selection';

const sel = select('body')
  .selectAll('h2');
The first part of the statement, select('body'), uses the select method from the d3-selection package. It returns a new selection, which is a JavaScript object with all the selection functionality attached to it. By functionality I mean a bunch of methods available on this object. Exploring the returned selection object (for example, by logging it to the console) is informative.
It is important to stop here for a second and review the constructs Mike Bostock (d3's author) uses to provide chainable APIs like this one. He has a wonderful article that covers this. I am going to go over it in my own words.
He uses a "configurable" function that returns another function. The inner function has access to all the config parameters we pass to the configurable function (via a closure). The function he returns has functionality attached (setter/getter methods). When used as a setter, each of those methods returns the function itself. Thanks to that, we can chain calls together.
Let's build a simple example that encapsulates the logic to add two numbers to solidify these concepts.
function calculator(opts = {}) {
  let {x = 0, y = 0} = opts;

  function engine() {
    return x + y;
  }

  engine.x = function (value) {
    if (!arguments.length) return x;
    x = value;
    return engine;
  };

  engine.y = function (value) {
    if (!arguments.length) return y;
    y = value;
    return engine;
  };

  return engine;
}
Let's use the code we have just created. First, we call our new function and provide the two numbers we want to add: c = calculator({x: 1, y: 2}). When the function executes, it assigns the x and y values from the opts parameter (an object) to the local variables x and y. After that, it creates a function (engine) that encapsulates the logic we need (adding two numbers in this case). Then, it attaches two combined setter/getter functions (methods) to engine. Each one checks whether an argument was passed: with no argument it returns the current addend (x or y); with an argument it stores the new value and returns the engine function. Because the setters return engine, we can do things like:
c.x(100).y(200)() // 300
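Two details are easy to miss here: every call to calculator creates its own closure, so two engines never share their x and y, and only the setter form returns engine. The sketch below restates a condensed version of calculator so that it runs on its own; the accessor helper and the state object are my own shorthand for illustration, not something d3 provides.

```javascript
// Condensed restatement of calculator so this snippet is self-contained.
function calculator({ x = 0, y = 0 } = {}) {
  const state = { x, y };
  const engine = () => state.x + state.y;

  // Hypothetical helper: builds a combined getter/setter for one field.
  const accessor = (name) => function (value) {
    if (!arguments.length) return state[name]; // getter: returns the stored value
    state[name] = value;                       // setter: stores the new value...
    return engine;                             // ...and returns engine for chaining
  };

  engine.x = accessor('x');
  engine.y = accessor('y');
  return engine;
}

const a = calculator({ x: 1, y: 2 });
const b = calculator();

a.x(10);                         // only changes a's closure
const sumA = a();                // 12
const sumB = b();                // 0, b still has its defaults
const currentX = a.x();          // 10, a plain number, so a chain would end here
const sameEngine = a.y(5) === a; // true, the setter hands back the engine
```

Since the getter returns a number rather than engine, a getter call can only sit at the end of a chain.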
Excellent. We are ready to move on. Let's bring back the original d3 statement:
const sel = select('body')
  .selectAll('h2');
We are chaining another selection call, .selectAll('h2'), but at this stage we have already narrowed the "selection space" down to descendants of the body element. That's because we run selectAll on the result of the first selection (a selection object) rather than on the whole document.
Now that we have selected the elements we are interested in, we can go ahead and perform data binding, that is, link our data to the elements we have selected. We do it with the data method of our selection object.
const sel = select('body')
  .selectAll('h2')
  .data(data);
data is an array of values. Those values will be assigned to the selected elements. But what happens if we have fewer elements than values? Or if we have more elements than values in our data array? The data method of our selection returns an object that knows how to deal with that. It does so by providing a method for each of these cases.
data() returns the elements that already exist and are now linked to our dataset. That's called the update selection. We can access the other selections via enter() and exit(). enter() gives us access to elements that do not exist yet but for which we have data, and exit() yields existing elements that have no data associated with them. Let's write some code to exercise these concepts.
This is a helper function that executes the data binding and exercises the enter, update, and exit selections:
function basicDataJoin(selector, data) {
  const sel = select(selector)
    .selectAll('span')
    .data(data);

  sel.text(d => d)
    .attr('class', 'update');

  sel.enter()
    .append('span')
    .text(d => d)
    .attr('class', 'enter');

  sel.exit().remove();
}
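Before running it, it may help to see what the join computes once the DOM is stripped away. The following is only a sketch of matching elements and data by position, not d3's implementation: joinByIndex, its arguments, and the shape of its return value are names I made up for illustration, and plain strings stand in for DOM nodes.

```javascript
// Plain-JavaScript model of an index-based join; no DOM and no d3 involved.
// `elements` stands in for the nodes matched by selectAll, `data` for the
// array passed to data(). Names and return shape are illustrative only.
function joinByIndex(elements, data) {
  const shared = Math.min(elements.length, data.length);
  return {
    // Element i is paired with datum i for as long as both exist.
    update: elements.slice(0, shared).map((element, i) => ({ element, datum: data[i] })),
    // Data left over once we run out of elements.
    enter: data.slice(shared),
    // Elements left over once we run out of data.
    exit: elements.slice(shared),
  };
}

// Empty container: every datum ends up in enter.
const firstPass = joinByIndex([], ['a', 'b', 'c']);

// Three existing elements, two data values: two updates and one exit.
const secondPass = joinByIndex(['span0', 'span1', 'span2'], ['a', 'x']);
```

In the first case all three values land in enter; in the second, span0 and span1 are paired with 'a' and 'x' while span2 is left in exit.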
Now, let's run basicDataJoin('#numbers_example_1', ['a', 'b', 'c']). The enter and update classes are styled with different colors (salmon and dark red), so we can tell new elements apart from existing ones.
Excellent, all our new elements are there and they are properly bound to our data. Let's now call the same function twice, with different data the second time, to see how the join treats elements that already exist:
basicDataJoin('#numbers_example_2', ['a', 'b', 'c']);
basicDataJoin('#numbers_example_2', ['a', 'x', 'y']);
And we get:
That may look strange to you: every letter is treated as an existing element, even though 'x' and 'y' are new data. What is happening here is that the first element is assigned to the first datum, the second element to the second datum, and so on. We are joining by index. But d3 provides an alternative way to perform this binding: we can pass data() a key function, which is evaluated for the datum bound to each selected element and for each new datum. The value returned by that function is what d3 uses to match elements with data. Let's write another helper function:
function advanceDataJoin(selector, data) {
  const f = (d) => d.letter;
  const sel = select(selector).selectAll('span').data(data, f);

  sel.text(f)
    .attr('class', 'update');

  sel.enter()
    .append('span')
    .text(f)
    .attr('class', 'enter');

  sel.exit().remove();
}
Let's run that now like this:
advanceDataJoin('#numbers_example_3', [{letter: 'a'}, {letter: 'b'}]);
advanceDataJoin('#numbers_example_3', [{letter: 'b'}, {letter: 'c'}]);
And that's what we wanted. Now, after the second call, a is gone (it was in the exit selection), b is part of the update selection, and c is part of the enter selection.
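To convince myself of that result without a browser, here is a rough plain-JavaScript model of a keyed join. It is a sketch under simplifying assumptions, not d3's code: joinByKey is a made-up name, plain objects stand in for DOM nodes, and duplicate keys are not handled. The one detail taken from d3 is that the bound datum lives on the node in a property called __data__.

```javascript
// Plain-JavaScript model of a keyed join; no DOM and no d3 involved.
// Each existing "element" is an object carrying its bound datum in __data__.
function joinByKey(elements, data, key) {
  // Index the existing elements by the key of their currently bound datum.
  const byKey = new Map(elements.map((element) => [key(element.__data__), element]));
  const update = [];
  const enter = [];

  for (const datum of data) {
    const k = key(datum);
    if (byKey.has(k)) {
      const element = byKey.get(k);
      element.__data__ = datum; // rebind the matching element to the new datum
      update.push(element);
      byKey.delete(k);          // whatever stays in the map has no datum left
    } else {
      enter.push(datum);        // no element carries this key yet
    }
  }
  return { update, enter, exit: [...byKey.values()] };
}

const key = (d) => d.letter;
const existing = [{ __data__: { letter: 'a' } }, { __data__: { letter: 'b' } }];
const result = joinByKey(existing, [{ letter: 'b' }, { letter: 'c' }], key);

const updateLetters = result.update.map((el) => el.__data__.letter); // ['b']
const enterLetters = result.enter.map(key);                          // ['c']
const exitLetters = result.exit.map((el) => el.__data__.letter);     // ['a']
```

The three arrays line up with what the page shows: b is updated, c enters, and a exits.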