Monday, January 23, 2012

New embedded clause options

In the previous post, I discussed the current state of embedded clause thought in Koa, and was about to go on to other theoretical possibilities that might do a better job of maintaining the spirit of the rest of the language's design.

I have a contender, actually. It occurred to me spontaneously while walking the dog, and I'm not sure just how crazy it actually is: I know there are languages that do it this way, but none of the ones I speak. It goes like this.

What if i doesn't just mark a verb with a third-person nominal subject like I've always said? What if it marks the clause itself as being independent, finite, verbal? And what if it can be replaced with other particles to switch the clause into one of the other two main Koa roles, nominals and adjectivals? We could end up with something like:

i - verbal
u - adjectival
ko - nominal

ka toto i paólo mo pili
DEF child FIN smell SIM lizard
"the kid smells like a lizard"

ka toto u paólo mo pili
DEF child REL smell SIM lizard
"the kid that smells like a lizard"

ka toto ko paólo mo pili
DEF child NOM smell SIM lizard
"the kid smelling like a lizard"

I was astonished: I had never ever before considered the possibility that i might alternate with anything else, but suddenly I seemed to be looking at a system that completely paralleled the rest of Koa syntax. And there's no question about scope, because the clause type of every verb is clearly identified.
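As a rough sketch of the three-way alternation, the particle inventory can be treated as a small lookup table. The particles and gloss labels come from the examples above; the function names (`mark_clause`, `clause_role`) are hypothetical, and the role lookup is only meant for simple, unembedded clauses.

```python
# Particle inventory as proposed in this post; helper names are hypothetical.
CLAUSE_MARKERS = {
    "i": ("FIN", "verbal"),
    "u": ("REL", "adjectival"),
    "ko": ("NOM", "nominal"),
}


def mark_clause(subject: str, predicate: str, marker: str) -> str:
    """Join a nominal subject and a predicate with a clause-type marker."""
    if marker not in CLAUSE_MARKERS:
        raise ValueError(f"unknown clause marker: {marker!r}")
    return f"{subject} {marker} {predicate}"


def clause_role(clause: str):
    """Return the role signalled by the first clause marker, or None.

    Assumes a simple clause with no embedding, so the first marker
    found is the one that classifies the whole clause.
    """
    for word in clause.split():
        if word in CLAUSE_MARKERS:
            return CLAUSE_MARKERS[word][1]
    return None


for marker in CLAUSE_MARKERS:
    clause = mark_clause("ka toto", "paólo mo pili", marker)
    print(clause, "->", clause_role(clause))
```

Swapping a single particle is all it takes to move the same subject and predicate between the three roles, which is the parallelism the examples are meant to show.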

Well, sort of. There's no i in clauses with a pronominal subject, of course, since we've always assumed that the pronoun was taking its place; should there be an u or ko, then? My best answer to this question so far amounts to a complete reenvisioning of Koa clause structure, from this


to this


In other words, i is no longer a kind of pronoun: it marks the status of the clause, and the pronoun falls between it and the verb. All we need to say is that the i is generally deleted before pronouns, and all of our existing material still conforms. U and ko, then, would seemingly need to be present even with pronouns, though I'm open to further thought on this.
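The ordering and deletion rule just described can be sketched as a small function. This is only an illustration under stated assumptions: the pronoun set is limited to the forms glossed in this post (ni, nu, ta, tu), the deletion of i before a pronoun is treated as always applying, and `render_clause` is a hypothetical name.

```python
# Pronouns glossed in this post: 1SG, 1PL, 3SG, 3PL.
PRONOUNS = {"ni", "nu", "ta", "tu"}


def render_clause(marker, subject, verb_phrase):
    """Linearize a clause as: nominal subject, marker, pronoun, verb phrase.

    - A nominal subject precedes the marker.
    - A pronominal subject follows the marker.
    - "i" is dropped before a pronoun; "u" and "ko" are kept.
    - subject may be None for a subjectless embedded clause.
    """
    if subject in PRONOUNS:
        parts = [] if marker == "i" else [marker]
        parts += [subject, verb_phrase]
    elif subject:
        parts = [subject, marker, verb_phrase]
    else:
        parts = [marker, verb_phrase]
    return " ".join(parts)


print(render_clause("i", "ni", "loha le Susi"))
print(render_clause("u", "ni", "loha"))
print(render_clause("ko", "ni", "loha le Susi"))
print(render_clause("i", "le Keoni", "halu ko ipo a sahi koa"))
```

Because only i is ever deleted, existing sentences with pronominal subjects come out unchanged, while u and ko clauses keep their marker visible.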

(i) ni loha le Susi
(FIN) 1SG love NAME Susie
"I love Susie"

ka mina [ u ni loha ] i le Susi
DEF woman [ REL 1SG love ] FIN NAME Susie
"Susie is the woman I love"

poka i ilo [ ko ni loha le Susi ]
everyone FIN know [ NOM 1SG love NAME Susie ]
"everyone knows I love Susie"

My thought is that generally a clause would either have a pronominal or a nominal subject, but not both -- in other words, not le Keoni i ta paólo mo pili "John [he] smells like a lizard" -- but this will also require more thought. It has occurred to me that using "they" in this way might be an option for overt number marking on the subject, something we currently have no way of doing:

a susi i tu si iune ka nene ni
INDEF wolf FIN 3PL PERF steal DEF baby 1SG
"some wolves stole my baby"

But we'll leave that aside for the moment. Just to establish exactly what we're talking about here, I'm going to recast all of the preceding example sentences (and one or two new ones) using this strategy.

ka talo [ u nu ma asu (ne ta) ] i piku lia
DEF house [ REL 1PL IMPF dwell (LOC 3SG) ] FIN small too
"the house we live in is too small"

ka kane [ u ma ipo ka sahi ni ] i ia paha
DEF man [ REL IMPF drink DEF wine 1SG ] FIN AFF evil
"the man who is drinking my wine is certainly evil"

ka sahi [ le Keoni u si ipo ] i si miláho
DEF wine [ NAME John REL PERF drink ] FIN PERF INCEP-rot
"the wine John drank had gone bad"

ka [ le Keoni u si ipo ] i koa nai
DEF [ NAME John REL PERF drink ] FIN good some
"the one John drank was pretty good"

le Keoni i halu [ ko ipo a sahi koa ]
NAME John FIN want [ NOM drink INDEF wine good ]
"John wants to drink some good wine"

le Keoni i halu [ le Malía ko ipo a sahi koa ]
NAME John FIN want [ NAME Mary NOM drink INDEF wine good ]
"John wants Mary to drink some good wine"

le Keoni i na ma mai koa lo [ ko si ipo a sahi pua ]
NAME John FIN NEG IMPF feel good REASON [ NOM PERF drink INDEF wine bad ]
"John doesn't feel well because he drank some bad wine"

ni si kulu [ le Keoni ko na ma mai koa ]
1SG PERF hear [ NAME John NOM NEG IMPF feel good ]
"I heard that John isn't feeling well"

pai [ le Keoni na ma mai koa ]
day [ NAME John NEG IMPF feel good ]
"A John-not-feeling-well day"

ti pai i [ le Keoni na ma mai koa ]
this day FIN [ NAME John NEG IMPF feel good ]
"this day is John-not-feeling-well-y"

vo ka sene [ u si tapa ka hili [ u si suo ka lepa [ u ne ka talo [ le Iako u si tei ] ] ] ]
here's DEF cat [ REL PERF kill DEF mouse [ REL PERF eat DEF bread [ REL LOC DEF house [ NAME Jack REL PERF make ] ] ] ]
"this is the cat that killed the mouse that ate the bread that was in the house that Jack built"

In theory, this all works beautifully. Unfortunately, though, much though it pains me given the gorgeous symmetry of this system, I'm concerned that in many cases this might just be too weird, too typologically marked, for an IAL. For what it's worth, though, I do note that the use of ko above mimics Latin complement clause structure surprisingly closely, imagining ko + verb as equivalent to an infinitive.

Leaving this idea for further deliberation, there's something in the last series of examples that I'd like to point out. Take a look at these two sentences:

pai [ le Keoni na ma mai koa ]
day [ NAME John NEG IMPF feel good ]
"A John-not-feeling-well day"

ti pai i [ le Keoni na ma mai koa ]
this day FIN [ NAME John NEG IMPF feel good ]
"this day is John-not-feeling-well-y"

These both lack either a ko or an u, since any other adjectival phrase wouldn't have one either: pai mehísi "foggy day," ti pai i mehísi "this day is foggy," etc. It occurs to me that, now that there's no i in there, if we were going to put the ko or u back in, it might make just as much sense to slap it on the beginning of the clause rather than in front of the verb. In other words:

ni si kulu ko [ le Keoni na ma mai koa ]
1SG PERF hear NOM [ NAME John NEG IMPF feel good ]
"I heard that John isn't feeling well"

or really, we ought to think of it like this:

ni si kulu ko [ le Keoni Ø na ma mai koa ]
1SG PERF hear NOM [ NAME John NONFIN NEG IMPF feel good ]
"I heard that John isn't feeling well"

In other words, the function of i would, among other things, be to mark the clause as finite; removing it would allow the clause to behave like any other predicate. Let's see how this would affect the rest of our example set.

ka talo (u) [ nu ma asu (ne ta) ] i piku lia
DEF house (REL) [ 1PL IMPF dwell (LOC 3SG) ] FIN small too
"the house we live in is too small"

ka kane (u) [ ma ipo ka sahi ni ] i ia paha
DEF man (REL) [ IMPF drink DEF wine 1SG ] FIN AFF evil
"the man who is drinking my wine is certainly evil"

ka sahi (u) [ le Keoni Ø si ipo ] i si miláho
DEF wine (REL) [ NAME John NONFIN PERF drink ] FIN PERF INCEP-rot
"the wine John drank had gone bad"

ka [ le Keoni Ø si ipo ] i koa nai
DEF [ NAME John NONFIN PERF drink ] FIN good some
"the one John drank was pretty good"

le Keoni i halu ko [ ipo a sahi koa ]
NAME John FIN want NOM [ drink INDEF wine good ]
"John wants to drink some good wine"

le Keoni i halu ko [ le Malía Ø ipo a sahi koa ]
NAME John FIN want NOM [ NAME Mary NONFIN drink INDEF wine good ]
"John wants Mary to drink some good wine"

le Keoni i na ma mai koa lo ko [ si ipo a sahi pua ]
NAME John FIN NEG IMPF feel good REASON NOM [ PERF drink INDEF wine bad ]
"John doesn't feel well because he drank some bad wine"

vo ka sene (u) [ si tapa ka hili (u) [ si suo ka lepa (u) [ ne ka talo (u) [ le Iako Ø si tei ] ] ] ]
here's DEF cat (REL) [ PERF kill DEF mouse (REL) [ PERF eat DEF bread (REL) [ LOC DEF house (REL) [ NAME Jack NONFIN PERF make ] ] ] ]
"this is the cat that killed the mouse that ate the bread that was in the house that Jack built"

My first reaction is that this instinctively feels like the best so far. I see two objections we'll need to investigate:

1) Why do all these embedded clauses have to be non-finite? I'm asking both in terms of Koa structure and in terms of what is typologically reasonable.

I'm thinking, with undisguised relief, that the finiteness question might actually be a bit tautological. If I say kunu kona "black dog," why does the adjective "have to be non-finite" here? Well, it has to be non-finite because it's in a position in which all Koa predicates are non-finite. The same is true of phrases like ka kona "the black one." Koa predicates are finite only when preceded by i or a pronoun. Looking at it this way, I don't see that there's any other way to do it.

2) What happens when I interpret a phrase like ko le Malía ipo a sahi koa, translated above effectively as "Mary('s) drinking some good wine," according to the usual rules of Koa predicate relationships? Does it make any sense?

Well, let's see. As we know, when two Koa predicates stand in the order XY, Y modifies or describes X. If Y is specified, the relationship is seen as genitive. Ipo a sahi koa, then, will mean "drinker of good wine."

Since this phrase has no specifier, it modifies rather than possesses the head; how to translate this into English? "Drinker-of-good-wine Mary," perhaps; or "Mary, drinker of good wine." Okay.

Lastly, ko turns the whole predicate into an abstract idea: "drinker-of-good-wine-Mary-ness." My parser just broke. Let's try again.

If puna means "red one," then ko puna means "the quality/concept of being a red one," thus "redness." Can we apply this back onto our longer predicate? "The idea of Mary, (being a) drinker of good wine." "The idea of drinker-of-good-wine Mary." Great Scott, this may just work -- I really wasn't daring to hope for the literal translation to make any sense at all.

Okay. Wow. The suddenness of this discovery has kind of left me reeling.

For relative clauses, then — that is, clauses that serve to further describe or delimit their head — there are two options. One is the internally-headed relative clause, which is fully finite and marked with ke preceding the head; and the other is a standard gapping strategy for which the clause is in its non-finite form (i.e. no i occurs before the verb phrase) and may, like any other "adjectival" predicate, be preceded by u.

Complement and adverbial clauses, at this point, now seem to have one solid strategy: the clause appears in its non-finite form (sans i). If the clause is in a nominal environment, the usual specifier would be ko. One note: predicates frequently appear without a specifier when preceded by a "preposition": la koto "home(wards)," etc. If clauses are to have the same rules as other predicates, we might be able to say not only

lo ko [ si ipo a sahi pua ]
REASON NOM [ PERF drink INDEF wine bad ]
"because of having drunk some bad wine"

but also

lo [ si ipo a sahi pua ]
REASON [ PERF drink INDEF wine bad ]
"because of having drunk some bad wine"

Or with a clause with an overt subject:

lo (ko) [ le Keoni si lahe ]
REASON (NOM) [ NAME John PERF leave ]
"because John left"

Although, now that I think about it, the deleted specifier has always been ka or a in the past, so maybe deleting it here would mess with the intelligibility of the phrase. So yes, all seems fine...I think. I'll need to let this sit for a while and try it out in all kinds of different contexts to make sure there are no unwanted effects.

So here's a thought. Since modifier clauses have two strategies (non-finite and finite), what if the same were true of nominal clauses? We have our non-finite strategy worked out above; I think my favorite of the finite options was with ve used as a complementizer.

Now, deciding to do this would mean using up one of only four remaining particles (we've currently got hi, ve, ie and iu). We're going to have to evaluate whether that really makes sense. I do think, though, that requiring all complement and adverbial clauses to be non-finite would be typologically unusual in its restrictiveness. What we're looking at, then, are the following kinds of alternatives:

le Mia i si sano [ ve ka moa i ma lalu poli ]
NAME Mia FIN PERF say [ COMP DEF chicken FIN IMPF sing much ]
"Mia said that the chickens were singing a lot"

le Mia i si sano ko [ ka moa ma lalu poli ]
NAME Mia FIN PERF say NOM [ DEF chicken IMPF sing much ]


he [ ve ta ma tuvo hiki ]
TIME [ COMP 3SG IMPF cut grass ]
"while he was mowing the lawn"

he ko [ ta ma tuvo hiki ]
TIME NOM [ 3SG IMPF cut grass ]

I imagine that one or another of the strategies would make more sense, and be likely to be used more, in specific contexts. I suppose these trends will emerge with lots more use, and like all similar situations in Koa, it'll never actually be incorrect either way.
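To make the two nominal-clause strategies concrete, here is a minimal sketch that builds both variants from the same subject and verb phrase. It assumes the rules as stated so far: the finite variant is introduced by ve and keeps i (deleted before pronouns), while the non-finite variant is introduced by ko and has no i. The function names and the four-pronoun set are illustrative assumptions, not settled grammar.

```python
# Pronouns glossed in this post: 1SG, 1PL, 3SG, 3PL.
PRONOUNS = {"ni", "nu", "ta", "tu"}


def finite_clause(subject: str, verb_phrase: str) -> str:
    """Finite clause: i before the verb phrase, dropped after a pronoun."""
    if subject in PRONOUNS:
        return f"{subject} {verb_phrase}"
    return f"{subject} i {verb_phrase}"


def nonfinite_clause(subject: str, verb_phrase: str) -> str:
    """Non-finite clause: the same material with no i at all."""
    return f"{subject} {verb_phrase}"


def nominal_clause(subject: str, verb_phrase: str, finite: bool) -> str:
    """Embed a clause nominally with ve (finite) or ko (non-finite)."""
    if finite:
        return "ve " + finite_clause(subject, verb_phrase)
    return "ko " + nonfinite_clause(subject, verb_phrase)


for subject, vp in [("ka moa", "ma lalu poli"), ("ta", "ma tuvo hiki")]:
    print(nominal_clause(subject, vp, finite=True))
    print(nominal_clause(subject, vp, finite=False))
```

With a pronominal subject the two variants differ only in the choice of ve versus ko, since there is no i left to remove.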

So hey! That ended up being quite a bit easier than I expected, actually. The next task is to go back through all of my existing multiple-clause structures and see how they fare under these new models. In particular, I'm concerned about frames like te tai ko... "it's possible that...," "maybe..." I'll report soon.

Sunday, January 22, 2012

Embedded clauses: the show so far

This is a big one.

I've been deliberately staying agnostic for the last 12 to 13 years, biding my time until I felt I had the wisdom or clarity to make a decision. Meanwhile, though, my interim strategies have been seeing so much use that they've actually been influencing other important choices that will not be easy to disentangle. It's clearly time to make up my mind.

The matter under consideration is that of embedded clauses. These fall into the two broad categories of modifiers (relative clauses) and nominals (complement and adverbial clauses), though as ever these categories are fluid in Koa.

Before diving into this discussion, I should first mention that one of Koa's relativization strategies, the internally-headed relative clause, is actually fairly uncontroversial. In this structure the particle ke marks the head while it remains in situ:

ke kane i si ipo ka sahi ni
QU man 3P PERF drink DEF wine 1SG
"the man who drank my wine"

le Keoni i si ipo ke sahi
NAME John 3P PERF drink QU wine
"the wine John drank"

nu ma asu ne ke talo
1PL IMPF dwell LOC QU house
"the house we live in"

If the head is focalized, note that this changes the sense from one of modification to identification:

ke kane sa si ipo ka sahi ni
QU man FOC PERF drink DEF wine 1SG
"which man drank my wine," as in "I don't know..."

ke sahi sa le Keoni si ipo
QU wine FOC NAME John PERF drink
"which wine John drank"

ke talo sa nu ma asu (ne ta)
QU house FOC 1PL IMPF dwell (LOC 3SG)
"which house we live in"

So far so good. This is easy to form, pretty unproblematic to parse, and works well much of the time. It feels a little odd to speakers of IE and neighboring languages, though, which is enough of a reason to have another option even without the fact that long chains of relative clauses can end up pretty unparseable using this strategy:

vo ke sene i si tapa ke hili i si suo ke lepa i tai ne le Iako i si tei ke talo
~ "this is which cat killed which mouse ate which bread was in Jack built which house"
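Mechanically, the internally-headed strategy amounts to taking an ordinary finite clause and swapping the head's specifier for ke while leaving everything else in place. The sketch below assumes the head is a single word directly preceded by a one-word specifier (ka or a); `internally_headed` is a hypothetical helper, not part of any existing tool.

```python
# Specifiers that ke is assumed to replace in this sketch.
SPECIFIERS = {"ka", "a"}


def internally_headed(clause: str, head: str) -> str:
    """Mark `head` in situ by replacing its specifier with ke."""
    tokens = clause.split()
    position = tokens.index(head)  # raises ValueError if the head is absent
    if position == 0 or tokens[position - 1] not in SPECIFIERS:
        raise ValueError(f"{head!r} is not preceded by a specifier")
    tokens[position - 1] = "ke"
    return " ".join(tokens)


print(internally_headed("ka kane i si ipo ka sahi ni", "kane"))
print(internally_headed("le Keoni i si ipo ka sahi", "sahi"))
print(internally_headed("nu ma asu ne ka talo", "talo"))
```

Nothing moves and nothing is deleted, which is why the structure is easy to form but gives the listener no bracketing help once several such clauses are chained.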

What we need is a way for a clause to modify a nominal head that stays in its usual position within the matrix clause: that is, something more along the lines of a traditional Indo-European relative clause. This is easy when the head is the subject of the relative clause, because in that syntactic context the verb phrase can just as easily be considered adjectival anyway:

ka kane [ ma ipo ka sahi ni ] i ia paha
DEF man [ IMPF drink DEF wine 1SG ] 3P AFF evil
"the man drinking my wine is certainly evil"

We have the "relativizer" particle u at our disposal as well, which heretofore we've described as marking a phrase as both adjectival and pragmatically important, as in

ka sahi u puna
DEF wine REL red
"the red wine, the wine which is red, etc."

Our sample relative clause, then, could optionally also incorporate u:

ka kane [ u ma ipo ka sahi ni ] i ia paha
DEF man [ REL IMPF drink DEF wine 1SG ] 3P AFF evil
"the man who is drinking my wine is certainly evil"

I suspect that the difference between these would be about the same as the difference between "the man drinking my wine" and "the man who's drinking my wine" in English: in other words, pretty ethereal. One can envision contexts in which one or the other sounds better, but in general they're equivalent. The above is unproblematic, and indeed follows automatically from the basic principles of Koa structure.

Once the head occupies a position other than subject within the relative clause, though, we immediately run into apparently insoluble problems -- or at least, problems whose solutions have not seemed obvious to me for 12 to 13 years. Suppose, for example, that we want to say "The wine John drank had gone bad." Calquing the English structure would lead us to do something like this:

ka sahi [ u le Keoni i si ipo ] i si miláho
DEF wine [ REL NAME John 3P PERF drink ] 3P PERF INCEP-rot
"the wine John drank had gone bad"

It seems so normal that I might not even object, but then I remember the optionality of u in my previous examples. It was optional because the relative clause was "adjectival" on its own, with the same meaning. The same cannot be said here: there is no precedent anywhere in the language for a phrase like le Keoni i si ipo being able to directly modify anything.

Likewise, this clause doesn't seem to be modular, able to be slipped into any syntactic position, like all other parts of Koa are. For example, it should be possible to say this:

?ka [ le Keoni i si ipo ] i koa nai
DEF [ NAME John 3P PERF drink ] 3P good some
"the one John drank was pretty good"

...but I have no confidence in this at all. I don't see why I should expect that the bracketed phrase should have the given English translation considering the remainder of Koa grammar, other than my English language intuition.

What we want is some way of forming a clause in Koa that would sound something like "the John-having-drunk(-it) wine" when translated literally into English. How would this be done?

Let's leave this for a moment and turn to the other embedded clause type. Complement and adverbial clauses occur in environments in which they are formally nominal -- that is to say, ordinarily one would see "nouns" in those contexts -- so a reasonable starting assumption might be that clauses of this type would have some kind of specifier. In fact, ko works very well for this purpose when, as with relative clauses, there isn't a subject expressed within the embedded phrase:

le Keoni i halu ko [ ipo a sahi koa ]
NAME John 3P want ABSTR [ drink INDEF wine good ]
"John wants to drink some good wine"

le Keoni i na ma mai koa lo ko [ si ipo a sahi pua ]
NAME John 3P NEG IMPF feel good REASON ABSTR [ PERF drink INDEF wine bad ]
"John doesn't feel well because he drank some bad wine"

The second example, which in English is an adverbial clause, might just as well be translated "John doesn't feel well because of having drunk some bad wine," and as such demonstrates why ko is the appropriate particle: ko si ipo a sahi pua really does mean "[the idea/state of] having drunk some bad wine." It's a little more of a stretch as a complement clause: I'm not sure it's as obvious that the literal "John wants [ drinking some good wine ]" ought to have the given meaning. Nonetheless, it's the only remotely reasonable way of doing this that I've ever come up with.

Having used this kind of structure for some time now, I found it easy enough to start regarding ko as a kind of complementizer, having in its scope an entire following clause. It seemed like this was a reasonable extension of its usual role of marking abstract concepts. Using it this way, we might see phrases like this:

ni si kulu ko [ le Keoni i na ma mai koa ]
1SG PERF hear ABSTR [ NAME John 3P NEG IMPF feel good ]
"I heard that John isn't feeling well"

You may be noticing a similarity developing with what happens with relative clauses. The problem is that ko marks the abstraction of a root. That's its one function. Every particle in Koa has one function. In using it this way I've done a very natural, linguistically neutral thing, but a fundamentally very un-Koa thing. Given the meaning of ko everywhere else, does it make any sense to express "John not feeling well" as ko le Keoni i na ma mai koa, in the same way that ko puna means "redness?" I'm not convinced that it does.

Scope is definitely part of this feeling. Even though the languages I know best don't have any problem parsing the appropriate scope of a complementizer, I feel uncomfortable assuming that everyone should just understand where this ko-phrase ends.

Another spot of discomfort is in the fact that, by preposing ko, I'm making this clause into a nominal. The clause, especially with that i in there, feels awfully finite for a nominalization.

This also fails the modularity test. If ko mevúa means "raininess," and pai mevúa means "rainy day," we should be able to say:

pai [ le Keoni i na ma mai koa ]
day [ NAME John 3P NEG IMPF feel good ]
"A John-not-feeling-well day"

Similarly, since I can say ti pai i mevúa "this day is rainy," why not:

ti pai i [ le Keoni i na ma mai koa ]
this day 3P [ NAME John 3P NEG IMPF feel good ]
"this day is John-not-feeling-well-y"

I don't think either of these is very well motivated. Although I wish I could clearly articulate why, my instinct is strong enough that I don't think I can use this kind of structure moving forward...unless I decide that it's okay for ko to lead a double life as a complementizer, in which its structures are not modular in the same way as other sort-of nominals.

If I'm opening up that line of inquiry, there's also the option of using one of my few remaining particles as a bona fide complementizer: perhaps ve, in homage to Bislama:

ni si kulu ve [ le Keoni i na ma mai koa ]
1SG PERF hear COMP [ NAME John 3P NEG IMPF feel good ]
"I heard that John isn't feeling well"

and even

pai ve [ le Keoni i na ma mai koa ]
day COMP [ NAME John 3P NEG IMPF feel good ]
"A John-not-feeling-well day"

It doesn't seem to work as well without an overt subject on the embedded clause, I suppose because what follows ve here is supposed to be a fully formed finite expression. We'd need to use structures like

...lo ve [ ta si ipo a sahi pua ]
REASON COMP [ 3SG PERF drink INDEF wine bad ]
"...because he drank some bad wine"

which seems to work reasonably well. What about this, though?

?le Keoni i halu ve [ ta ipo a sahi koa ]
NAME John 3P want COMP [ 3SG drink INDEF wine good ]
"John wants to drink some good wine"

Not so much. Whether we read this as "John wants himself to drink..." or "John wants him to drink," it's not inspiring any applause. I guess its structure is parallel to the same kind of clause in Greek/Romanian/Bulgarian/etc., though:

ο Γιάννης θέλει να πιει καλό κρασί
DEF John want-3SG COMP drink-3SG good wine
"John wants to drink good wine"

Anyway, before looking much more closely at that kind of strategy, or giving up on my principles, I would like to see if it might be possible to come up with a way of doing all this that really does work the way I had been envisioning. This is already a ridiculously long post, so we'll go on to that in the next one.

Thursday, January 19, 2012

Backing up: the illusion of null derivation in Koa

The following is a slightly modified transcript of a correspondence with Allison regarding the previous post.

I think I should clarify something that I've clearly been too lax about explaining in the past: this "singer" issue that I know people have had problems with intuitively.

The thing about Koa is that it doesn't actually have parts of speech that correspond to European languages at all. I keep cavalierly using terms like "nominalization" as if it were entirely clear to everyone else exactly what I mean, but in fact you don't really have nominalization in Koa. In the same way, there's no verbalization, or "adjectivalization": just words used as predicates or modifiers.

My suggesting that ka lalu is the nominalized form of the verb lalu, then, is actually almost completely unhelpful. I'm realizing that it goes beyond that to the point of being positively deceptive.

The idea behind "words" in Koa is that they take their apparent lexical class -- their sense -- from the place they're used in a clause, but that none of this is inherent. If we take a simple clause like ka kane mata i ma luke "the short man is/was reading," we could equally visualize the "nominal" here as

a noun: the man
an adjective: the male one
a verb: the one-who-is-male, the one being male

likewise, the "adjective" mata could be seen as

an adjective: short
a verb: who-is-short, being short, "shorting"
a noun: a short one (appositive)

and the "predicate" could be framed as

a verb: is reading
a noun: is being a reader
an adjective: is a reading one, is one who is reading

The above is true for Koa not just in theory but in practice. There is unavoidably a question of arbitrariness in terms of what arrangement of semantic roles to foist onto each lexeme in order to make these interpretations what they are, and this is something I want to talk about more in a minute. There is not, however, anything arbitrary about the way the words resolve into each apparent lexical class once you know their basic meaning.

This is all building to an attempt to get at the feeling that it doesn't make intuitive sense for the "nominalized" form of "sing" to mean "singer." I think it's important to note that it actually doesn't mean "singer" with any of the semantic or aspectual/modal sense of the English word. It might be helpful to look at what the verb lalu actually means in a sentence like le Keoni i lalu. I've translated it as "John sings," but that statement is aspectually ambiguous in English.

The best example I can come up with is in the context of, say, a party, at which a certain portion of the contingent has decided it's time to start singing some songs. The folks who want to sing start asking less-well-known attendees whether they'd like to participate by saying "Do you sing?" to which they might respond, "Yeah, I sing" or "No, I don't sing." In Koa, these would get translated as Ai se lalu? Ia, ni lalu and Na, ni na lalu. In a sense, then, le Keoni i lalu might be more helpfully translated as "John is willing to sing," or "John can sing." The assertion is that John has the general potential to sing, whatever the specific realization of that fact in context might be.

I tend to translate ka into English as "the." It's more accurate to say that ka placed before a Koa predicate ("predicate" just meaning "content word" in the Koa grammatical tradition) gives it the sense "a definite instantiation of the meaning of the root," in other words "the one which..." From i puna "is red," then, we get ka puna "the one which is red," "the red one." In the same way, i lalu "sings" gives us ka lalu "the one which can/will sing, the one which sings, the 'singer'." Looking at it another way, i lalu could be equally correctly translated as "is one who can/will sing, is one who sings, is a 'singer'."

Given the semantics of a particular predicate, then, there is only one possible meaning that it could have in its nominal, adjectival or verbal role. There's no question of what semantic role to "promote" during the apparent process of nominalization: it's the same semantic role that it has everywhere else as well. What I've been calling null derivation should really be called "apparent null derivation," because there isn't actually any derivation going on here. Traditional concepts of lexical class just don't apply.
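One way to picture "apparent null derivation" is a root that stores exactly one predicate sense, with the three roles being nothing more than three frames wrapped around it. The sketch below is only an illustration: the `Root` class, its method names, and the English paraphrase templates are assumptions, and the glosses are simplified from the ones given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Root:
    form: str
    sense: str  # a single English predicate gloss, e.g. "is red"

    def verbal(self, subject: str) -> str:
        """i + root: the sense is predicated of a subject."""
        return f"{subject} {self.sense}"

    def nominal(self) -> str:
        """ka + root: a definite instantiation of the same sense."""
        return f"the one which {self.sense}"

    def adjectival(self, head: str) -> str:
        """head + root: the same sense restricting a head."""
        return f"{head} which {self.sense}"


puna = Root("puna", "is red")
lalu = Root("lalu", "sings")

print(puna.nominal())
print(lalu.nominal())
print(lalu.verbal("John"))
print(lalu.adjectival("the man"))
```

No method changes or reinterprets `sense`; only the surrounding frame differs, so there is no separate question of which semantic role a "nominalization" promotes.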

At least, that's the way it's always been; what I brought up in my previous post was the possibility of adding another layer of arbitrariness that would firmly reintroduce traditional lexical classes into the language, for the theoretical benefit of greater intuitiveness and/or greater word-worthiness of basic roots in each lexical class guise, or of reducing the average number of morphemes per word to give the language more of the feel of a creole. The disadvantage, beyond the increase in arbitrariness itself, would be that it would break the carefully constructed, elegant system above.

When it comes to which thematic roles to map onto arguments for a given word, of course it's unavoidably true that it's philosophically an arbitrary choice. My guiding strategy, though, is based in the assumption that pragmatically most of that theoretical arbitrariness disappears. Once in a while one sees a language make some bizarre encoding choices -- Maltese where "thief" means literally "one who is considered a thief," for example, being my favorite -- but by and large this isn't what I've seen languages do. Dogs are nominal. Killing is verbal, and the agent will be the subject. Inasmuch as a language has a class anything like adjectives, "big" will be an adjective. I would suggest that, statistically, there are good and bad choices where these assignments are concerned.

Where there is genuine disagreement across large chunks of the language spectrum, I've tried to make my decisions at least consistent and predictable. Experiencers are encoded as subjects in Koa, for example. Bodily substances are their own nominal subjects (i.e. i taku means "is blood" not "bleeds"). I do actually intend to present the dictionary in a somewhat Loglanny way, demonstrating a clause frame for each word to illustrate the semantic structure; ideally my choices would feel intuitive or at least reasonable to the greatest percentage of humans, and where there is guesswork for a learner, there is at least a system on which to lean. This kind of arbitrariness is, I feel, pretty distinct from Esperanto's where kombo is the verbal noun "combing" but broŝo is the instrument "brush."

I've been relying on my intuition based on a very large number of studied languages in making these choices, but I really ought to be more scientific about it. In general I'm being pretty agnostic about the thematic role/argument deployment of potentially problematic words, with the understanding that I'll firm that up (or not -- some ambiguity is probably okay) later on through philological investigation.

I wouldn't want anyone to think I'm getting caught up in my own idea or anything. I'm firmly grounded in the fact that IAL design is always fundamentally going to be a variety of intellectual navel-gazing. The process is fascinating to me, though, and I really do have the conceit of thinking, or at least hoping, that this particular language, when it's "finished," would be easier to learn for a much greater number of people than Esperanto -- the founding goal back in 1999.

The question I was posing in the previous post, then, was that of whether this goal would be best served by coherence to an established and predictable, though more complex, system, or by the addition of ambiguity in order that more forms might be morphologically simpler and more typologically neutral. I'm leaning strongly towards the former at this point, if only for the reason that Koa was designed from the bottom up with the existing system informing every choice, and this change would destroy all internal consistency. It would be better, in terms of optimality, to start over from scratch if this were going to be a primary design goal.

Thursday, January 12, 2012

Back to null derivation

A core principle of Koa design from the very, very beginning has been avoiding the kinds of problems caused by inherent lexical class in Esperanto. By this I refer to the fact that, for example, the root komb- is inherently verbal, which gives us kombi "to comb" and, counterintuitively for me, kombo "combing." In order to designate a nominal comb, one needs an instrumental affix: kombilo. Martel- "hammer," on the other hand, is nominal, so we have martelo "hammer," marteli "to hammer," and construct the verbal noun with an affix: martelado "hammering."
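The Esperanto problem can be shown in a few lines: the same endings produce different kinds of word depending on a root class the learner simply has to memorize. This sketch covers only the two roots discussed above; the class labels and function names are illustrative, and the pattern is not claimed to generalize to every Esperanto root.

```python
# Inherent class of each root, which must be learned root by root.
ROOT_CLASS = {"komb": "verbal", "martel": "nominal"}


def verb(root: str) -> str:
    """Both classes form the verb with -i."""
    return root + "i"


def action_noun(root: str) -> str:
    """'The act of X-ing': bare -o for verbal roots, -ado for nominal ones."""
    return root + ("o" if ROOT_CLASS[root] == "verbal" else "ado")


def instrument_noun(root: str) -> str:
    """'The tool': -ilo for verbal roots, bare -o for nominal ones."""
    return root + ("ilo" if ROOT_CLASS[root] == "verbal" else "o")


for root in ROOT_CLASS:
    print(root, verb(root), action_noun(root), instrument_noun(root))
```

Every function except `verb` has to consult `ROOT_CLASS`, which is exactly the hidden lookup that Koa's design is trying to eliminate.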

I've always felt that this was a terribly sloppy state of affairs for an IAL, one that would be confusing enough for learners even if the textbooks covered it properly, which they barely do. Koa, I thought, would conquer this territory by making the business of part-of-speech conversion completely logical and, therefore, predictable. Some examples of the way this kind of thing works follow:

ka kane "the man"
le Keoni i kane "John is a man"
ka moa kane "the male chicken"

ka puna "the red one"
ka ame i puna "the bird is red"
ka moa puna "the red chicken"

ka lalu "the one who is willing/able to sing, the singing one, the singer"
le Keoni i lalu "John sings"
ka kane lalu "the singing man, the man who sings"

ka pa lalu > ka palálu "the thing sung, i.e. the song"
le Amazing Grace i palálu "Amazing Grace is a song"
ka iune palálu "the one who steals songs, the song thief"

ka ne ka talo "the one in the house"
le Keoni i ne ka talo "John is in the house"
ka moa ne ka talo "the chicken in the house, the chicken that is in the house"

All of these structures are 100% parallel, in a manner entirely different from the way e.g. Esperanto does it. The idea is that every part of the language should follow this framework. The problem is that I'm worrying that by doing so, I'm diverging seriously from cross-linguistic neutrality and, in some cases, basic common sense.
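
To make the parallelism concrete, here is a minimal sketch in Python. It is only an illustration: the function names are invented for this purpose, and it assumes that a predicate (a bare root, a pa-form, or a locative phrase) is just a string dropped unchanged into one of three frames.

```python
# Hypothetical helpers: each frame treats every predicate identically,
# whatever its English translation would suggest about part of speech.

def nominal(pred):
    """'the one that is/does PRED'"""
    return f"ka {pred}"

def predicate(subject, pred):
    """'SUBJECT is/does PRED'"""
    return f"{subject} i {pred}"

def modifier(head, pred):
    """'the HEAD that is/does PRED'"""
    return f"ka {head} {pred}"

for pred in ["kane", "puna", "lalu", "palálu", "ne ka talo"]:
    print(nominal(pred), "|", predicate("le Keoni", pred), "|", modifier("moa", pred))
```

The loop is purely mechanical, so some of the sentences it prints are semantically odd; the point is only that no predicate needs special treatment in any of the three positions.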

I've written about this before, but I think the time has come to do some more rigorous investigation of the consequences of these assumptions, and evaluation of their reasonableness.

One of the most obvious effects of this system is that a lot of basic roots in English get encoded in Koa via the passive marker pa. "Song" above is an example of this: whereas in Esperanto kanti means "to sing" and the noun form kanto means "song," Koa lalu when used nominally means something like "one who sings in an aorist kind of way." This is not an obviously useful concept to be able to express, but it's necessary to maintain the parallelism of structures.

I should note that ka lalu doesn't really quite mean "the singer," in the sense of someone who sings a lot and constructs their identity partially around this fact. For this kind of meaning, that is, something characterized by the meaning of the root, we have an affix -ma, so láluma "singer." One could also employ the usitative particle va to form ka va lalu or ka valálu "the one who frequently sings," "the singer." The meaning of the bare form is easier to visualize in the negative: a na lalu "a person who doesn't/won't sing."

If ka lalu doesn't have a particularly useful existence with its current semantics, I have to ask this question: what if ka lalu meant "the song" instead? Not because it's logical, but because it's highly intuitive and useful. Some other examples of this general quandary:

suo "eat" > pasúo "food" (see below also)
lule "think, believe" > palúle "opinion"
haku "braid, weave" > paháku "braid (in hair)"
siva "tie" > pasíva "knot"
komo "wear" > pakómo "clothing"
kaka "go poop" > pakáka "feces" (the most egregious: surely "poop" should be monomorphemic in any language?)

In addition to this kind of concrete passive nominalization, we have a lot of abstract active forms:

nuku "sleep" > konúku "sleep (n.)"
mua "die" > komúa "death"
moe "dream" > komóe "dream (n.)" (or should this be pamóe?)
ela "live" > koéla "life"
suo "eat" > kosúo "meal"
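
The two lists above follow a regular spelling pattern, which can be sketched in Python. The rule used here is my own inference from the listed forms, not a stated rule of Koa: the particle is fused onto the root and an acute accent is written on the root's first vowel. The name derive is hypothetical.

```python
# Assumption: the accent lands on the first vowel of the root,
# as in every derived form listed above.
ACUTE = {"a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú"}

def derive(particle, root):
    """Fuse a particle (pa, ko) onto a root, accenting the root's first vowel."""
    for i, ch in enumerate(root):
        if ch in ACUTE:
            return particle + root[:i] + ACUTE[ch] + root[i + 1:]
    raise ValueError(f"no vowel found in root {root!r}")

print(derive("pa", "suo"))   # pasúo
print(derive("ko", "suo"))   # kosúo
print(derive("ko", "ela"))   # koéla
```

Notice that the same root suo yields both pasúo and kosúo: the distinction between "food" and "meal" is carried entirely by the particle, which is exactly what would be lost if the bare root had to pick one of the two meanings.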

In all these cases, we theoretically have the option of equipping the nominalized base root morpheme with the same meaning that currently needs derivational morphology to attain. What would be the advantages and disadvantages of such a system?

Before getting into syntactical ambiguities, we can state right off the bat that this would give Koa that same frustrating arbitrariness of intrinsic root meanings as Esperanto. Suo above is the perfect example: should this root used as a noun mean "food" or "meal?" Esperanto chooses the latter with manĝo, but it could really go either way. I don't like the thought of having to look up and memorize the meanings of various derived forms of every word: the whole point with Koa is that it's all there, free to interpret, in the morphology and syntax.

Leaving aside that qualm, I'd like to see if there are any really conspicuous structural problems with this idea. One issue that comes up repeatedly is that of what happens when the root is used as a predicate: as things stand, there is no formal difference among a predicate nominal, a stative verb, and any other kind of verb. Thus, le Keoni i lalu means "John sings" and is identical in structure with le Keoni i moa "John is a chicken."

If lalu, for example, means "sing" as a verb and "song" as a noun, clauses like le Keoni i lalu suddenly become ambiguous. It could either continue to mean "John sings," or more fancifully, "John is a song."

There are three possible responses to this that I see. First, we could decide that this potential ambiguity is unacceptable, and can the idea right now. Second, we could point out that the second interpretation is entirely semantically anomalous and therefore the ambiguity is artificial: in actual context, the meaning would be clear. Third, we could eliminate the ambiguity by requiring verb phrases to bear a tense/aspect/mood marker: thus the verbal meaning would have to be expressed as le Keoni i va lalu, currently meaning "John sings regularly/habitually." This is not quite the same as the aorist sense of le Keoni i lalu, but we could theoretically add this to the arsenal of va.

Well, I have no truck with monosignificance -- all languages are full of ambiguity -- so I can throw out the first response. I'm not a fan of the third either, because (A) I don't want to have to mark every verb phrase this way, and (B) it doesn't actually eliminate the ambiguity anyway, because le Keoni i va lalu could equally be interpreted as "John is often a song." This means that deciding to make this change to Koa null derivation semantics would entail accepting a healthy dose of intrinsic ambiguity into the language, for better or for worse.
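
The reasoning behind (B) can be sketched as a toy model in Python. Everything here is hypothetical: the two small lexicons stand for the current system and the proposed change, and the function simply lists the available readings of a predicate root. Adding va marks the clause as habitual but does nothing to choose between senses.

```python
# Toy lexicons (illustrative only): predicate glosses available for each root.
CURRENT = {"lalu": ["sings"], "moa": ["is a chicken"]}
PROPOSED = {"lalu": ["sings", "is a song"], "moa": ["is a chicken"]}

def readings(subject, root, lexicon, habitual=False):
    """List the English readings of 'SUBJECT i (va) ROOT' under a lexicon."""
    suffix = " (habitually)" if habitual else ""
    return [f"{subject} {gloss}{suffix}" for gloss in lexicon[root]]

print(readings("John", "lalu", CURRENT))                  # one reading
print(readings("John", "lalu", PROPOSED))                 # two readings
print(readings("John", "lalu", PROPOSED, habitual=True))  # still two readings
```

Under the proposed lexicon the count of readings is the same with or without va, which is why requiring a tense/aspect/mood marker would not buy back the lost clarity.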

One good thing about this that I'd like to throw in before I forget is that it would also obviate the need for a helper verb. Thus instead of tei kaka "go poop" (if kaka is nominal, that is), kaka could have both verbal and nominal force.

Returning to ambiguity, I would like to point one thing out. In more poetic contexts, where potential meanings range more freely, I find I can easily come up with examples where this ambiguity would no longer be trivial. Take the theoretical Koa sentence ka ela ni i lalu, for instance. I don't think there's a problem with ka ela ni "my life" (this would be ka koéla ni in standard Koa) despite the fact that it could also be saying "my living one," whatever that means. The predicate, though, could mean either "sings" (standard Koa i lalu) or "is a song" (standard Koa i palálu), and given the poetic nature of the utterance, there's really no reason to prefer one reading over the other. Some ambiguity is, of course, acceptable in poetry, but this seems to seriously encumber the expressiveness of the language. I think this may be the strongest argument yet in favor of not making this change.

In terms of typological appropriateness, I really need more data. I can say that, from the perspective of the inflectional IE languages I speak, I have nothing to worry about either way. "Food" in English is unrelated to "eat," but transparently connected to "feed." In Polish we have jedzenie, literally the verbal noun "eating." Spanish has comida, literally "(feminine) eaten thing": an exact parallel of Koa pasúo. What do more isolating languages do, though? I have absolutely no idea. Inexcusably, I don't have a grammar of Mandarin, Vietnamese, Burmese, Thai or any other similarly isolating language, but Bislama, Malay and Yoruba ought to give me something to work with. I'll come back with part II soon; in the meantime, I think I'm seeing some reasons to leave things as they are.