Wednesday, April 30, 2008

Koa verbs and thematic roles

Worth mentioning before I forget to mention it again: in designing Koa I do not want to make Indo-European-style decisions about verb valence and thematic roles. The subject position, therefore, will always be used for agent and experiencer, and the object position for patient and recipient.

This removes the guesswork that usually comes with learning how a language handles things like "I think," "I dreamt," "I like," "it seemed," etc. -- all of these have an experiencer subject, including "seem," which would then need to be glossed as something like "perceive."

Naturally, we don't know yet how we're going to do predicate nominatives in sentences like "the house looks big," which is apparently yet another thing to think about. Ka talo i pa nae FOO iso, or something.

And no more of this "is 'boil' transitive or intransitive?" business for us like in Esperanto -- the agent is subject and patient is object, always, so if we're trying to translate "the water boiled," and there's only a patient, we know we'll need a passive structure. Same with "the door opened," etc.

I'm sure that trying to be this firm is going to get me into trouble sooner or later, but it is at least a good principle to go by, I think.

Aha, here's problem #1: if mua means "die," wouldn't the subject be the patient, the participant undergoing change of state? But the only possible agent of that verb would make it mean "kill," and "be killed" has, obviously, extremely different semantics from "die." Okay, Einstein, figure that one out.

Plural pronouns

After years and years of deliberation, we've finally got the singular pronoun system in the bag:

I = ni
you = se
he/she/it = li

Well, okay, there is the fact that "I" and "he" would be homophonous for many Cantonese speakers, and just that teeny glimmer of uncertainty about se, and whether it might be better if it were te so that sentences like the following wouldn't be such tongue-twisters:

Ei se sa si sano li tia?
Q 2SG TOPIC PERF=say 3SG that
"Was it you who told him that?"

I think this would be slightly less ridiculous as Ei te sa si sano li tia? But the thing is that I really like "you" as se, damn it all! I've been struggling with this sodding issue for years and years, and it always comes back to the same thing.

Say, what if the topicalizer were something like ta instead of sa? So Ei se ta si sano li tia? That's actually even better. I might have less allegiance to sa as TOPIC than to se as "you," though my heart weeps at the thought nonetheless. Something to think about.

ANYWAY, this isn't what I was intending to talk about. The thing is, aside from nu as "we" that I've always felt ambivalent about, we have no plural pronouns. I see five possible approaches here:

1) No plural pronouns. Nouns don't have plurals in Koa, so why should pronouns?
2) Completely different forms for singular and plural pronouns (like ni and nu -- I could also have li and lu in parallel).
3) Plurals formed by reduplicating the singular: nini, sese, lili. Note that we could not then use reduplication pragmatically as we had previously been throwing around.
4) Plurals formed by adding a morpheme, like Mandarin wǒ, nǐ, tā "I, you, he" and wǒmen, nǐmen, tāmen "we, y'all, they." This morpheme could either never show up elsewhere in the grammar, or be used for certain other specified purposes as in Mandarin.
5) Some kind of hodgepodge of systems: combinations of the above, or maybe a separate morpheme for "we," since that's semantically different from the plural of "I," but then "y'all" could be either seni (typologically consistent with creoles) or just se as in English, and I suppose we don't absolutely need a 3PL pronoun. Or hey, how about pose = "all of you," poli = "all of them?" (po)nu, seni/pose, poli? Hm.

As always, I just don't know where to go with this. Aside from #1, which I don't think I'll really be considering, all of these seem possible and none of them strikes me as overwhelmingly better than the others. Mimblewimble.

What about those semivowels, anyway?

Should we finally make the decision to add /w/ and /y/ to the phonology? There would need to be certain distribution restrictions, in short:

• /y/ cannot occur before /i/ or after /i/ or /e/ due to the serious danger of confusion between e.g. ia/iya, ea/eya, and the fact that [ji] is difficult to pronounce and tends to be unstable cross-linguistically.

• In parallel fashion, /w/ cannot occur before /u/ or after /u/ or /o/ due to the danger of confusion between e.g. ua/uwa, oa/owa, and the fact that [wu] is difficult to pronounce and tends to be unstable.

For these reasons, /w/ and /y/ probably can't be used in the derivational system: we'd ideally like to be able to apply every suffix to every root, but -wa, for example, could not be added to any root ending in -o or -u -- the distinction between pokoa/pokowa or pukua/pukuwa really can't bear functional load.
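To make the restrictions concrete, here is a rough Python sketch of a checker. The function names (semivowel_ok, can_take_suffix) are just mine for illustration, and it only encodes the two rules above, nothing else about Koa phonotactics.

```python
def semivowel_ok(word):
    """Return True if every w/y in `word` obeys the two restrictions above."""
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        # /y/ may not precede /i/ or follow /i/ or /e/
        if ch == "y" and (nxt == "i" or prev in ("i", "e")):
            return False
        # /w/ may not precede /u/ or follow /u/ or /o/
        if ch == "w" and (nxt == "u" or prev in ("u", "o")):
            return False
    return True


def can_take_suffix(root, suffix):
    """A suffix is usable on a root only if the combined form is still legal."""
    return semivowel_ok(root + suffix)


# -wa is fine on a root ending in -e or -i, but blocked after -o or -u
for root in ("mehe", "sihi", "talo", "poko", "puku"):
    print(root, can_take_suffix(root, "wa"))
```

Run over the whole lexicon, something like this would show how many roots a given semivowel suffix could never attach to, which is the real cost of the idea.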

Come to think of it, what about the muya/moya or tiwa/tewa kind of distinction? This seems harder, at least for English speakers, than anything else in the phonology. And, and, and, can we have both ona and wona, ena and yena? See, this is exactly why this thought process stalled out last time.

Okay, putting the above panic attack out of our thoughts for a moment, what would these putative roots look like, anyway? Here are 50 from Randword:

weu
niwo
piwe
niwa
yolu
kewi
yuka
lawa
hiwe
hewo
hewi
kaye
hewa
siwi
puyo
siwa
wali
lewo
yoya
toya
yona
hayu
woo
poyu
noyo
newi
tewi
wawo
yeso
kiwe
hawa
naya
oyu
sewa
yuyo
toyu
siwo
mewa
lewe
kawo
womu
payu
koyu
siwe
yumi
iwe
wole
mewe
taye
kiwa

Hm. I like some of these very much, others not at all. This is a really hard decision to make. I have a conservative streak with Koa that's probably at least partially responsible for my continuing to like this language after all these years, and it's sort of urging me not to do any rash phoneme-adding at this point.

The bottom line is that the above words don't feel like Koa to me, they feel more like Yorùbá or something. There's nothing wrong with this in itself, of course, but I don't think it's the direction I want to go in right now with this language. So there it is.

Huge changes this morning

It's time to take the plunge. Ever since September 13th, 1999 my phonology has been completely unchanged, and the whole time I've been a bit uncomfortable with /c/: whether it's pronounced [S], [tS], [ts], [dZ], or whatever, its presence really isn't appropriate in the phonology of an IAL. I've held onto it because I didn't want to affect my math, and I couldn't think of anything better to replace it with: [f], for instance, might be a possibility, but it's so easy to confuse with [s] especially on the phone, and lots of languages (e.g. Finnish, Japanese) don't really have it. I toyed around with [j] and [w] for a while, but I wasn't ever sure I liked those even with massive distribution restrictions -- and they didn't buy me back my particles, which is one of the big issues.

Because with my current 10 consonants and 5 vowels, I have 50 CVs (particles) and 2500 CVCVs (roots). If I drop this down to 9 consonants, that leaves me only 45 CVs and 2025 CVCVs.
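The arithmetic is simple enough to sketch in a few lines of Python, assuming (as I do here) that every consonant can combine with every vowel and that a root is just two CV syllables back to back. The function name is only for illustration.

```python
def inventory(consonants, vowels=5):
    """Return (number of CV particles, number of CVCV roots)."""
    cv = consonants * vowels   # each consonant pairs with each vowel
    cvcv = cv * cv             # a root is two CV syllables in sequence
    return cv, cvcv


for n in (10, 9):
    cv, cvcv = inventory(n)
    print(f"{n} consonants: {cv} CVs, {cvcv} CVCVs")
```

So one consonant costs 5 particles but 475 roots, which is why the decision stings more than it first appears.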

I was realizing this morning, though, that only one of my particles currently uses /c/ (co "all") and I could be pretty okay with changing this to po. I'm sad to lose cumo "squash" and colo "run," but it wouldn't be the worst thing in the world if they were sumo and polo instead. In short, since this is clearly the right thing to do if I want to hold onto my self-righteousness about the viability of Koa as an IAL, /c/ needs to go. So god speed your way, /c/, jedź z bogiem. It's been an honor working with you these nine years.

The second big change is not quite so traumatic, but it's sort of a big deal because things have been the way they are since about 2001 and I'm pretty used to it. So: I recognized last year that hi wasn't really working for me as the 3rd person pronoun, for reasons of euphony in sentences like the following:

Ei se si sano hi ko se loha hi?
Q 2SG=PERF=say 3SG COMP=2SG=love 3SG
"Did you tell her that you love her?"

Sano hi is marginally acceptable to me, but loha hi is just too hard to say -- the problem is that hi is phonetically extremely weak. I had talked about replacing hi with ti, and ti with to, but the latter in particular I really wasn't prepared to do. So here's the solution: the 3rd-person singular pronoun is now li. This gives us Ei se si sano li ko se loha li? for the above sentence, a huge improvement.

I swear Esperanto had nothing to do with this one.

...and now to change my dictionaries and welcome this (hopefully) bright new day in the world of Koa. Next question: should I reconsider [j] and [w] to swell my root numbers a bit? It would be typologically consistent. Back after these important messages.

Tuesday, April 29, 2008

Numbers: Some Options

I see five options for our numeric system, each with advantages and disadvantages:

1) A separate basic root for each number at each decimal place, i.e. toru = "four," pima = "forty," etc.; pima toru = "forty-four."

Pros: It's very clear, no possibility of misunderstanding.
Cons: It uses up far, far too many roots and requires an unreasonable feat of memorization on the part of the learner. Unacceptably ridiculous, in short.

2) Each numeral has a monosyllabic root, and these are simply stacked to build complex numbers. So if sa = "1," he = "2," pu = "5," then hesapu = "215."

Pros: Numbers are very short and simple.
Cons: This works unlike any other derivational process in Koa; it's kind of weird how "100" and "10" aren't overtly mentioned. Despite the fact that this was my initial idea back in 2001 or whenever it was, I don't think it feels right at this point.

3) Each numeral and decimal place has its own bisyllabic root, e.g. sulu = "four," kumi = "ten," súlukumi sulu = "forty-four."

Pros: Very clear; both numeral and decimal place are phonetically salient.
Cons: Resultant forms are the longest of any of our options, especially as we keep going -- súlupima súlukumi sulu = "444," etc. Perhaps more seriously, all numbers above ten have four syllables, the only four-syllable words (I believe) in Koa.

4) Each number is composed of two CV roots, one for number and the other for decimal place; these are arranged in sequence for building. Thus if su = "four," to = "x10^0," lu = "x10^1," then sulu suto or maybe sulu su = "forty-four."

Pros: Number roots are nice and short, easy to remember and work with.
Cons: We have to choose between eliminating the vast number of resultant bisyllabic forms from our root array on the one hand (unacceptable), or having a large number of roots that mean one thing in numerical context and something else elsewhere. Sulu, for instance, could mean "forty" when followed by the counting word pi as in sulu pi cumo "forty squashes," but "helmsman" elsewhere: Ka sulu i si suo sulu pi cumo = "The helmsman ate forty squashes." Languages typically do just fine even with huge numbers of homophones (cf. Mandarin), so this might be okay.

5) Each base number has a bisyllabic root, which is then modified by a monosyllabic suffix to indicate decimal place. Thus, if sulu = "4," -ma = "x10^0" and -ho = "x10^1," then súluho súluma or súluho sulu = "44."

Pros: We're using a minimum number of roots, retain monosignificance, and have highly distinct, easily understandable and memorable forms. We retain the basic Koa derivational system, although the suffixes are being used in a different way than they would with nominal roots.
Cons: Numbers are kind of long -- súlune súluho sulu = "444" as opposed to e.g. sune sulu su -- though not ridiculous. Are they distinct enough from each other?

I suppose length isn't something to which we should attach prime importance given, for example, seitsemänsataa kahdeksankymmentäyhdeksän which seems to work just fine as "789" in Finnish.
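For comparison, here is a small Python sketch of how options 2 and 5 would build numbers. It uses only the forms quoted above (sa, he, pu for option 2; sulu, -ho, -ne for option 5), so it can't handle any other digits. The function names are mine, -ne as "x10^2" is read off the "444" example, and skipping zero digits and leaving the units place bare are my assumptions, not decisions.

```python
# Option 2: monosyllabic digit roots, simply stacked (only these three are known)
OPTION2_DIGITS = {1: "sa", 2: "he", 5: "pu"}

# Option 5: bisyllabic digit root plus a decimal-place suffix
OPTION5_ROOTS = {4: "sulu"}
OPTION5_SUFFIXES = {0: "", 1: "ho", 2: "ne"}  # units left bare (assumption)

ACUTE = str.maketrans("aeiou", "áéíóú")


def stress_first(root):
    """Mark stress on the first vowel, as in súluho."""
    for i, ch in enumerate(root):
        if ch in "aeiou":
            return root[:i] + ch.translate(ACUTE) + root[i + 1:]
    return root


def stacked(n):
    """Option 2: concatenate digit roots; raises KeyError for unknown digits."""
    return "".join(OPTION2_DIGITS[int(d)] for d in str(n))


def suffixed(n):
    """Option 5: one word per nonzero digit, suffixed for its decimal place."""
    digits = str(n)
    words = []
    for pos, d in enumerate(digits):
        if d == "0":
            continue  # assumption: zero places are simply not mentioned
        place = len(digits) - 1 - pos
        root = OPTION5_ROOTS[int(d)]
        suffix = OPTION5_SUFFIXES[place]
        words.append(stress_first(root) + suffix if suffix else root)
    return " ".join(words)


print(stacked(215))   # hesapu
print(suffixed(444))  # súlune súluho sulu
```

Seeing the two side by side makes the tradeoff plain: option 2 is three syllables for "215" while option 5 is eight for "444," but only option 5 says anything about the decimal places out loud.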

I think we can throw options 1 and 2 away outright, which leaves us 3, 4, and 5 to choose between -- we'll think it over. In the meantime, note that whichever way we go, words of quantity need to be separated from the noun by pi, with a slew of unanswered questions, chief among them being whether pi counts as an article. Is "four squashes"...

sulu pi cumo
sulu pi a cumo
a sulu pi cumo
a sulu pi a cumo


In parting, I'm pleased to see that, no matter how we end up doing this, we should have no trouble constructing phrases along the lines of E-o du instruistoj da studentoj, e.g. sulu mehe pi sahi = "four men's worth of beer."

Saturday, February 24, 2007

Genitive Relationships

I think it's time to finally figure this thing out. Supposing we want to express something like "my father's house," through the years we've usually been thinking of something like

ka talo o ni ato
the=house GEN=1SG=father
"my father's house"

The big question in here is the o. Is it really necessary? What all is it used for? While trying to imagine what I'd do without o, it occurred to me that a structure like this might work:

ka talo ni ato
the=house 1SG=father

This follows our rule of head-modifier word order, and all; but would it work acceptably with the rest of the grammar? Well, just now I was thinking about direct objects. Take this sentence:

ni tata i loha ko sihi
1SG=dad 3P=love NONINST=vegetable
"my dad loves vegetables"

If this is how we're treating direct objects, and I think it is, then we should also have the following in parallel with our usual verb/noun/adjective series:

ka loha ko sihi
DEF=love NONINST=vegetable
"the vegetable lover"

ka mehe loha ko sihi
DEF=man love NONINST=vegetable
"the vegetable-loving man"

Note that we're going to have to look pretty critically at this last example, as this is basically a relative clause and I'm not totally sure that the clauses are going to be sufficiently distinct from each other articulated like this. But for the moment, let's assume this is okay to continue the argument.

If ka loha ko sihi means "the lover of vegetables," then we've got our genitive phrase! Of course, this is what a Latin speaker would call an "objective genitive," not a possessive genitive. But I really can't come up with a situation where there could be ambiguity between the two.

Well...except for one, whose importance I'm not yet sure of. How does one express the possessor of an agent? In most cases it's not problematic: "my father's builders" would be pretty unlikely to mean "the builders who built my father." But. I apologize for the following choice of verb, but it's the only one I can come up with that exhibits true ambiguity. What about e.g. "the emperor's killers?" I think a phrase like this probably occurs in Dune somewhere. There would be no way to know whether the assassins being spoken of in fact disposed of the emperor, or are employed disposing of others for the emperor. But maybe there's some way to resolve this by means of additional morphology. For instance, if we put a suffix on "killers" that indicates that the verb is the individual's profession (like we did with "wine salesman"), the ambiguity would no longer be pragmatically troublesome.

I'd like to take this moment to give thanks for the fact that this is not Loglan and therefore I don't need to worry about being able to clearly and unambiguously express every logical possibility.

Anyway, I feel like I'm getting away from my original topic. The point of all this is that I believe genitive relationships can be expressed simply by SPEC-NOUN SPEC-NOUN.

At this decision, though, a disquieting realization arrives. If we take our sentences about vegetables from above and use a pronoun object instead of a generic noun, look what happens:

ka tata i loha ni
DEF=dad 3P=love 1SG
"dad loves me"

ka loha ni
DEF=love 1SG
"the one who loves me," "the lover of me," "my lover"

We've just used ka X ni = "my X," instead of ni X which is what we've been fervently believing in for years. Uh-oh.

Okay, so I did think of this possibility originally -- in fact, in Ea opi le Koa I wrote ka tata o ni for "my dad." But I liked the way that saying ni tata was (1) shorter, and (2) treated "my" as a kind of determiner, which it is in a way.

Can I have it both ways? I don't think I can avoid the fact that there's no reason ka tata ni shouldn't mean "my dad," so what would be the difference between that and ni tata? Purely pragmatic? Hm. Well, variety is the spice of life, sure, but this is supposed to be an IAL.

There is one other disadvantage to doing pronoun possession this way, though I don't think it's strong enough to warrant throwing out the idea. If a postposed monosyllabic pronoun indicates possession, that means I can't use anything pronominal as a formative for my derivational system. That is to say, if ka talo ni = "my house," then there cannot be a word ka taloni. It's not that I'm into all that monosignificance crap, but this is just far too ambiguous for a well-designed system.

So, then, it looks like, instead of the 45-47 CV derivational formatives I was hoping for, I'm going to have to deal with around 40. Not such a huge loss, I suppose, but in a language with such a small phonology every such loss is pretty important. Here's what I have to exclude so far:

ka, a, ko, ti, i, ni, se, hi, nu

...plus whatever I end up choosing for 2nd and 3rd plural, if I decide I need them. Damn and blast.

Switching gears completely, I do want to mention that I've been throwing around the idea of providing optional reduplicative pronouns in addition to the usual monosyllabic forms. Thus, "I" would be either ni or nini, "you" = se or sese, &c. I momentarily thought, "wait! then possession could use the long forms, thus avoiding the derivation problem!" But ka talo nini is just far, far too long for a simple possessive phrase, in my opinion.

To sum up: "the X of the Y" is to be expressed as ka X ka Y. If Y is pronominal, the structure, at this point, can be either Y X or ka X Y.

Uh...one last thought before I give it a rest for tonight. If ka talo ni and ni talo are really equivalent, then either of the following would have to be acceptable for "my father loves me":

ka ato i loha ni
DEF=dad 3P=love 1SG

ka ato i ni loha
DEF=dad 3P=1SG=love

Um. Huh. This will require thought.

Friday, February 23, 2007

NP Modifiers

Though we have not yet RIGOROUSLY defined the semantics of this operation, we've determined that when two nouns stand in the relationship X Y, X is interpreted as the head and Y as the adjectival modifier. What if Y is not a noun, though?

Turkish, for instance, has a special marker that ties locative phrases to NPs. So

resim masa-da-Ø
picture table-LOC-3sg
"the picture is on the table"

But if we want to say "The picture on the table is pretty," we can't just say

*resim masa-da güzel-Ø
picture table-LOC pretty-3sg

...despite the fact that it makes perfect sense in direct translation. Instead, we see the following:

masa-da-ki resim güzel-Ø
table-LOC-ADJ picture pretty-3sg
"the picture on the table is pretty"

For this reason I was nervous about making any unresearched decisions about similar PPs in Koa. The sentence "the man in the house is bad" (and yes, I know I desperately need to expand my lexicon), for instance, in literal translation would run ka mehe ne ka talo i pua. But is

mehe ne ka talo
man LOC=DEF=house
"man in the house"

okay? Usually I asked this question and then pondered whether it ought to be mehe o ne ka talo, with this o showing that the preceding lexeme is about to be modified by something (I believe o is going to figure in my possessives -- maybe I'll explore this next time). I think, though, that I may be able finally to answer this one satisfactorily. Let's compare the following parallel structures:

As a predicate:

ka talo i iso
DEF=house 3P=big
"the house is large"

ka mehe i ne ka talo
DEF=man 3P=LOC=DEF=house
"the man is in the house"

As an instantiated adjectival:

ka iso
DEF=big
"the big one"

ka ne ka talo
DEF=LOC=DEF=house
"the one in the house"

Therefore, directly modifying a noun, and voilà:

ka talo iso
DEF=house big
"the big house"

ka mehe ne ka talo
DEF=man LOC=DEF=house
"the man in the house"

Considering that these phrases are parallel in the top two cases, I don't see why they would need to be differentiated in the final one. One less thing to worry about.