Pluralities in Semantics
(part 1)

posted by Darryl on 10 Dec 2014

In this post I'm going to discuss some of the design issues encountered while implementing Language Engine, specifically the issue of plural noun phrases.

First, I need to briefly introduce some basic ideas about the meanings of sentences. Suppose that we want to interpret the sentence "Vir stabbed the Emperor" into some logical form as the meaning. A very traditional approach to this would be to assign it the meaning

stabbed(vir,emperor)

You could interpret this as a boolean expression, which is quite common in logic, or you could interpret this as just some piece of abstract syntax, or any number of other things. How you get this from the sentence doesn't really matter for my purposes here.

To represent the relationship of sentence to meaning, we'll write

[[Vir stabbed the Emperor]] = stabbed(vir,emperor)

with the double brackets indicating the function that interprets English into the logic.

This approach to meanings works pretty nicely for all sorts of sentences, and works especially well for mathematics and programming, where plurality is absent, or at least very different from what we find in natural language. However, when we begin to consider the meanings of sentences with plural noun phrases such as "John and Susan" or "the pilots", things get a bit more interesting.

Consider, for a start, the sentence "John spoke", which we might take to mean

[[John spoke]] = spoke(john)

If we use a plural subject noun phrase instead, such as "John and Susan spoke", what could this mean? On the one hand, there's a reading in which each of them spoke separately, so that we might say something like

[[John and Susan spoke]] = spoke(john) & spoke(susan)

but there is another reading in which they spoke to one another. What would the meaning of that be? Would we need a different predicate, perhaps as in

[[John and Susan spoke]] = spoke2(john,susan)

If we took this route, things would get very messy as we added more and more people to the subject noun phrase, since every group size would need its own predicate.

But perhaps we don't need to do any of this at all. One solution that's been proposed is that the reading where they speak to each other isn't part of the meaning at all, but an assumption that listeners make. That is to say, perhaps the sentence only has the conjunctive meaning

spoke(john) & spoke(susan)

and we simply assume, barring indications to the contrary, that they were speaking to one another. After all, they can't speak to one another without speaking-full-stop, and it's pretty common that when someone speaks, they're speaking to someone, so it seems reasonable to assume that they were speaking to one another, right? Maybe we don't need anything fancy after all. We'd still need to figure out what was going on with this noun phrase though, but that goes back to the question of how we derive meanings from sentences, which I said we'll ignore.

Unfortunately this won't always work. While it's certainly possible to do this with "spoke", other verbs are more complicated. Consider "lift", as in "John and Susan lifted the crate". Now, this can perfectly well mean that each of them lifted the crate on their own, as in

lifted(john,crate) & lifted(susan,crate)

but it's entirely possible to say this to mean that they lifted the crate together. We probably don't want to use a conjunctive meaning for this reading, however, because unlike speaking, it might be that neither of them did, or even could, lift the crate on their own. That is to say, while it's true that knowing "John and Susan spoke" lets us conclude "John spoke", which the conjunctive meaning

spoke(john) & spoke(susan)

makes possible with some trivial logic, it's not the case that "John and Susan lifted the crate" lets us conclude "John lifted the crate", and in fact neither of them might be strong enough to lift the crate independently. So what could the meaning of this sentence be, since we can't push this problem into an assumption like we did before?
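To make the difference in inferences concrete, here's a minimal Python sketch. It isn't anything from Language Engine; the helpers `conjunctive` and `entails` are names I've made up for illustration. It treats a conjunctive meaning as a set of atomic facts, so that concluding a single conjunct is just set membership.

```python
# Hypothetical illustration: a conjunctive meaning is modelled as the set of
# its conjuncts, each conjunct being a tuple (predicate, arg1, arg2, ...).

def conjunctive(pred, subjects, *objects):
    """Build the conjunctive meaning: one atomic fact per subject."""
    return frozenset((pred, subject) + objects for subject in subjects)

def entails(meaning, fact):
    """A conjunction lets us conclude each of its conjuncts."""
    return fact in meaning

spoke = conjunctive("spoke", ["john", "susan"])
lifted = conjunctive("lifted", ["john", "susan"], "crate")

# "John and Susan spoke" lets us conclude "John spoke": the inference we want.
print(entails(spoke, ("spoke", "john")))

# The same recipe also licenses "John lifted the crate", which is exactly
# the inference we do NOT want for the lifting-together reading.
print(entails(lifted, ("lifted", "john", "crate")))
```

Both checks succeed, which is the problem: the conjunctive recipe can't tell the two verbs apart.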

Here's another sentence with similar properties: "the Starfuries surrounded the vessel". Here it's definitely not the case that any individual Starfury surrounded the vessel — that would be a pretty big Starfury! Huge even! What happened was, the group of them surrounded the vessel, encircling it or whatever.

So something has to be done to figure this out. What could be a solution? The easiest thing to do, arguably, is to just have sets of entities as the arguments to predicates. For singular noun phrases we get singleton sets; for plurals we get non-singleton sets. What could be simpler? Now we can just say:

[[Vir stabbed the Emperor]] = stabbed({vir}, {emperor})

[[John spoke]] = spoke({john})

[[John and Susan spoke]] = spoke({john}) & spoke({susan})
  (for one reading)

[[John and Susan spoke]] = spoke({john,susan})
  (for the other reading)

[[John and Susan lifted the crate]] = lifted({john,susan}, {crate})

[[the Starfuries surrounded the vessel]]
  = surrounded({sf0,sf1,sf2},{vessel})
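As a rough sketch of how this might look in code, here's some Python that stores facts whose arguments are sets of entities. Everything here is assumed for illustration: the `FACTS` table, the helper names `group`, `collective` and `distributive`, and the simplification that objects are always singletons. The fact table also happens to describe a situation where John and Susan each spoke, but not to one another.

```python
# Hypothetical fact store: every argument of a predicate is a set of entities.
# frozenset is used because ordinary sets are not hashable.

def group(*entities):
    return frozenset(entities)

FACTS = {
    ("stabbed", group("vir"), group("emperor")),
    ("spoke", group("john")),
    ("spoke", group("susan")),
    ("lifted", group("john", "susan"), group("crate")),
    ("surrounded", group("sf0", "sf1", "sf2"), group("vessel")),
}

def collective(pred, subjects, *objects):
    """Collective reading: the whole group fills the subject slot at once."""
    fact = (pred, group(*subjects)) + tuple(group(o) for o in objects)
    return fact in FACTS

def distributive(pred, subjects, *objects):
    """Distributive reading: the predicate holds of each subject's singleton."""
    return all(collective(pred, [s], *objects) for s in subjects)

print(distributive("spoke", ["john", "susan"]))            # each spoke separately
print(collective("lifted", ["john", "susan"], "crate"))    # they lifted it together
print(distributive("lifted", ["john", "susan"], "crate"))  # neither lifted it alone
```

The first two checks come out true and the third false, so the two readings are finally distinguishable.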

Hooray, we've got a solution to the problem! We're done, right? Well, maybe not. This solution is less than ideal for at least two reasons.

Firstly, sentences like "John lifted the crate with Susan" should mean more or less the same thing as "John and Susan lifted the crate" on the collective-lifting reading, but there is no single noun phrase that has both John and Susan, so we would need some really rather fancy way to get that single set from two distinct noun phrases.

And secondly, if we want to store these predicates (stabbed, spoke, lifted, surrounded) in a database, we would need to have compound data columns, which can be rather nasty since these sets can be arbitrarily large. Logically it's no problem, but computationally it's a bit of a mess to have that.
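Here's one way the mess might show up, sketched with Python's built-in sqlite3 module. The table layout and the `pack` helper are assumptions of mine, not how Language Engine stores anything: the set is squeezed into one text column by joining the sorted names with commas.

```python
import sqlite3

# Hypothetical schema: one table per predicate, one column per argument.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lifted (subject TEXT, object TEXT)")

def pack(entities):
    """Squeeze a set of entities into one compound column value."""
    return ",".join(sorted(entities))

conn.execute(
    "INSERT INTO lifted VALUES (?, ?)",
    (pack({"john", "susan"}), pack({"crate"})),
)

# Looking up the exact set works, but only because pack() sorts the names.
whole_set = conn.execute(
    "SELECT subject, object FROM lifted WHERE subject = ?",
    (pack({"susan", "john"}),),
).fetchall()

# Asking "what did John take part in lifting?" finds nothing with plain
# equality, because "john" is buried inside the compound value.
by_member = conn.execute(
    "SELECT subject, object FROM lifted WHERE subject = ?",
    ("john",),
).fetchall()

print(whole_set)
print(by_member)
```

The second query comes back empty, so answering it would need string matching or an extra membership table, and either way the simple one-row-per-fact picture is gone.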

Is there an alternative way of representing meanings that will let us avoid these two problems? In the next post, I'll introduce some tools that will let us give a "yes" answer.

If you have comments or questions, get in touch. I'm @psygnisfive on Twitter, augur on freenode (in #languagengine and #haskell). Here's the HN thread if you prefer that mode, and also the Reddit threads.