Showing posts with label natlang. Show all posts

Thursday, March 9, 2023

A Happy Song (alt: A Song of Hope and Joy)

 Just to prove that I'm not just sadposting and still have some linguistics content, here's a project I've been working on. And by working on, I mean put together in a couple hours this evening, instead of leaving it for next week like I originally planned to. 

So once I finish "Fishing for Birds" my next project is an EP called "Scandals from the Karaoke Booth" which is a bunch of covers. I did the first one/the opener a long time ago, a cover of Sparklehorse's "Chaos of the Galaxy/Happy Man." Which tells you a lot about what this project is like. Most people cover just Happy Man because it's a rocking song. But while it can be played alone, thematically it works best in conjunction with Chaos of the Galaxy. I took this a step further by focusing almost entirely on Chaos of the Galaxy with just hints of Happy Man, making it a cover but very much my own interpretation of it. 

That's not the point though. The next song I've been working on is "Lagu Bahagia" by Sisir Tanah. Now, I can sing this just fine in Indonesian but that doesn't fit the spirit of the album. So I decided to do my own translation of it. This blogpost is about the theory behind that (and will probably make it to reddit at some point). 

 

Thursday, September 15, 2022

On tubers and how to translate them

I've been playing a lot of Xenoblade recently and have many thoughts about it, as can be seen on my reddit profile. As much as I like it, there's something extremely immersion breaking for me: taro is referred to as potatoes.

If I remember correctly, spongy spuds are first introduced around when the party's food gets stolen. At the very least I saw a bag full of them at some point in Maktha Wildwood and I recognized them right away as taro. Which was cool because taro is not a commonly seen plant, especially for westerners. It's only later on we get to the problematic part. See, Zeon's ascension quest is about growing crops, specifically "spongy spuds" for his colony. The name spongy spud itself is fine, it's a fictional world after all. However, they are also referred to as potatoes, taters and other less ambiguous names. This is despite both the tubers and the plants being clearly modeled after taro. Even the advice to harvest after the leaves start wilting is a taro trait (though potatoes do have similar advice. I wonder if cassava does as well, it might be a general root crop thing). So that was pretty frustrating.

Now, localizing taro as potato isn't necessarily a bad thing. After all, most English speakers aren't going to be that familiar with taro. But it does become a lot worse when you're localizing something that has images along with the text, since even if you don't know what taro is it is pretty obvious that those aren't potatoes. It reminds me of the "jelly donut-onigiri" controversy from the Pokemon anime way back when. The idea of turning onigiri into jelly donuts to make it more relevant to the audience isn't a bad thing. Doing that when it is clearly referencing an image which is not a jelly donut is an issue. (Funnily enough, onigiri does play a minor role in Xenoblade 3 and its name is not translated). 

Anyway, I actually went to the Japanese version of the game to see what the original text called them. Spud, potato, tater, etc all seem to be used as translations for the same word imo "tuber". So while the original text doesn't seem to explicitly label it as taro (as far as I could tell), it doesn't explicitly call it potato either. Localizing this to a bunch of words for more variety is reasonable enough but again, the translators should've looked at what it was referring to before making some of these translations. 

You know what the worst part is? I seem to be the only person to notice this and care enough to complain! At least, I haven't encountered anyone else yet who was like "yep, that's clearly taro."

-------------------------------------

In other farming related video game news Harvestella looks pretty awesome but I hope it doesn't neglect farming too much. And someone needs to make a farming sim that caters towards my desire for complex agronomy and agricultural markets while still maintaining the sort of whimsy often found in these games (that is, I want more realism but don't want to play John Deere Combine Simulator 2022: Deluxe Edition). Maybe I'll rant about that some day.

Friday, May 29, 2020

Random (Real) Languages (take 1)

Gonna play a game with myself, for myself. I'll choose one language from each of Glottolog's families (the first dozen or so at least) that I haven't studied before but think would be cool to learn and explain why. Oh and while I won't go too weird, I'll try to keep my picks kinda off the beaten path, though I am going to try to minimize outside sources. If I can't think of a qualifying language then maybe that means I've gone too deep. Clicking on the Glottolog links is allowed though, if I need a memory jog. I'm using Glottolog because it is a little more conservative with its groupings, which makes this more fun.

Atlantic-Congo

There's literally hundreds here, of course. I think I'd go with Fula/Pula(a)r/Fulfulde, though I don't know which dialect. It has a number of cool features. First of all, by being "Atlantic" (and specifically Senegambian) it is hip. But it has a robust class system so I get to play with that (unlike say Yoruba). I'd choose one of the dialects with consonant mutation, since that's fun. And unlike most Atlantic-Congo languages, I wouldn't have to worry about tones.

Going beyond linguistic reasons, I've recently become quite interested in the Sahel and its peoples. As one of the biggest and most widespread groups in the region, knowing some Fula might be useful in learning about them and working with them (especially since working in dev and agriculture might very well bring me to their region). And it sets me apart from all those numerous Hausa speakers, though there's dubious value in that.

Austronesian

Ahh yes, the one I could go on for ages about. And not just cause it has so many languages, but because I have a long history with it and lots of uses for it. And they're super cool structurally. Hell, I could do a whole post just on MP branches. Anyway, my choice for today though is pretty basic: Malagasy. First of all, it could be useful (for many of the same reasons as Fula, actually). But also it is just a cool language in and of itself. It has a (simplified) Philippine voice system to work with. Its phonology is whack, with all those voiceless vowels and stuff. And it has a visibility distinction in its deictic system. Plus Madagascar would just be cool to visit.

Indo-European

It's pretty lame that some people would have a lot of difficulty with this one. For me it's easy: Ossetian. I find most IE languages boring, but Indo-Iranian is one of the better branches. Ossetian is a Scythian language, which is pretty dope. Not to mention, it has ejectives now, making it objectively one of the cooler IE languages. Add in the awesome ethnic flag and at least there is something here. Sure it is the most useless of the languages I've mentioned so far but not everything can be useful. Not sure about any particular grammar things of Ossetian but I'd guess it's pretty typically Iranian.

Trans-Himalayan 

I don't like using Sino-Tibetan even though that's what Glottolog uses. Anyway there's quite a few possibilities here. For this round I'll draft Burmese. It has some interesting sounds in its inventory and would be a decent introduction to tones, for one. It is also one of the more widely spoken non-Sinitic languages in the family, so there is some use. Plus for my line of work (once again) it is a fairly useful language to know and area to be acquainted with. As far as grammar is concerned, it is highly isolating which is fun for someone used to more inflection, yannow. Maybe anyway.

Afro-Asiatic

Another hard one. In this case, I will go with Beja. Yeah sure if I wanted Cushitic something like Somali or Oromo would make more sense. But I gotta say that I am just fascinated by Beja and how there are relatively few sources (even for Cushitic languages, which are hard to find sources for, to say the least). I can't even say much about what's special with Beja, just that I would like to learn it. Tbf, it's not even the most hip Cushitic language (that would probably be Agaw or Iraqw or if you're really a conlanging trend follower, Dahalo). But they are desert pastoralists, which seems to be my thing.

(Nuclear) Trans-New Guinea

As always, I try to think of a language before I look at a list. In this case, I am still able to do that. The language I choose is Nduga. This might seem like a really random one. Well, it is. But in this case, I have some personal reasons for it. I happen to know a guy from there, which is a good enough reason to learn a language. It is also undergoing a refugee crisis, which goes back to my whole background, in a way. Linguistically, I can't say much about it, there aren't many resources. I hear though that it is a very typical Dani language, and therefore typical for TNG. That means it has features like switch reference, lots of serial verbs, clause chaining and all those other things that make Papuan languages fun. But in the end, there just simply isn't much more I am able to say about it right now.

Pama-Nyungan

This is going to be difficult. I simply can't remember many PN languages, let alone their names. Arrernte though, that has some weird features. Of course, that makes it really trendy to choose, but I can't be picky right now. The most interesting thing about Arrernte is that it might have an underlying VC structure, which is more or less unique. It has a vertical (ish, at least) vowel system as well. As a PN language, it is non-configurational (iirc) which is cool as well. Not to mention some fun cases and noun extensions. At the very least it would be different (though as far as I am aware, classically PN). And it might be one of the more useful Aboriginal languages, because it is used in Alice Springs, not that I'd have any intention of living there. All that makes a decent case I guess.

Otomanguean

This, like PN, is one where I have to choose based on memory and then see what I can do. So I can choose Zapotec. Unfortunately, I can't say much about it. Like other OM languages it is fairly isolated and has really funky tones. Both of those are cool. It's fairly isolating, which you might notice is something I'm interested in (weird, I know). And yeah, that's actually all I can really say here.

Austroasiatic

Finally one I have opinions about again. There's a lot I could choose from here. This round, I think I will go with Khasi, but it wasn't easy. Khasi was one of the first MK (using that so that AA can go to Afro-Asiatic) languages I looked into, plus Northeast India is cool. If I recall correctly, it's a bit more agglutinative than most non-Munda MK while still having that fairly analytic charm. It has noun incorporation that doesn't look much like incorporation on the surface (like even less so than Fijian, I think). Not nearly as insane as Aslian languages, but still cool. And a bit more useful.

Tai-Kadai

While the responsible answer would be Thai, Lao, Shan or Isan, I'm gonna say (Paha) Buyang. Why? Because it is an essential language in putting together the Austro-Tai hypothesis, which I've slowly come to think is pretty cool. It also has an insane consonant inventory (like well over 50 consonants I think). This might be the smallest and most endangered language I've chosen so far. Even more so than the very ill-defined Zapotec and possibly even Arrernte.

Also I'm now in the families with less than 100 languages so I will probably stop soon.

Dravidian

Brahui, because it is the only northern one I can name off the top of my head. Also Pakistan. Dravidian languages in general are pretty cool, with lots of participles and stuff, so I don't think I would mind learning any of them.

I'm gonna end here because I have 0 opinions on Arawakan and just do some honorable mentions.

  • Mande: Manding because let's be real, Mandinka/Bambara/Dyula/Malinke are all very similar and more or less mutually intelligible. So that opens up a big portion of West Africa and the Sahel without having to learn French.
  • Nilotic: Dinka, specifically Dinka Bor. This one was hard because there are a lot of Nilotic languages that are interesting but Dinka (Bor) has weird vowels, weird alignment and weird grammar in general. Like seriously, it supposedly has a voicing system like and three contrastive vowel lengths. This is totally a language worth knowing. 
  • Mayan: Kaqchikel because that's what my grandfather speaks. You know, I had a chance to take a semester of K'iche' but I didn't end up signing up. Regrets man, regrets.
  • Timor-Alor-Pantar: I feel like I've read just a bit too much on Abui grammar, so let's go south and say Fataluku. Which I think is actually really different from Abui. Decently useful in Timor though.
And that's all folks (for now).


Saturday, September 2, 2017

Isolating polylang?

I've already complained about polysynthesis before. I still think it's a stupid term. Anyway, here's one (very strict) definition I've seen for it:

1) polypersonal agreement
2) noun incorporation
3) extensive derivational synthesis
4) pervasive head-marking
5) verb-marking more than noun-marking

There's nothing about the phonological coherence in this one. Which, if I understand correctly, allows for the mythical "isolating polysynthetic language". Now this is a concept I've heard about before, in the back corners of internet forums and the like. I never understood how it was possible.

Then I met Abui. The author describes it as polysynthetic. Yet it sure doesn't look it. I think the most morphemes I've seen in a (phonological?) word is 5 and most have 2-3. Yet its serial verbs allow for very complex verb phrases. While it's not isolating by any means (it's squarely in the agglutinating camp), it does show the diversity of "polysynthetic" languages and how the term really doesn't do them justice. I'm sure if the average amateur (con)-linguist looked at it (even with glosses) they'd probably not label it as polysynthetic. Yet the author of this grammar was confident in doing so and I haven't seen anything disputing this. (Another fun one that I keep seeing brought up as polysynthetic (including by experts in the field like Michael Fortescue) with no discussion as to why it is classified as such. From my skimmings of the grammar, it sure doesn't look it).

In other news, I'm looking forward to the release of the Oxford Handbook of Polysynthesis which comes out in a few months.

Sunday, July 16, 2017

Aslian Language Dump

Here's a dump of links on Aslian languages, since people keep asking me about them. (This is originally taken from a reddit post)

Sadly there aren't tons (and the grammars in the side bar are kind of lacking :/ ). But here's some
The big pile o grammars:
  • Temiar
  • Semai
What is good though is that the Temiar grammar has lots of links to new(er) materials, so that's gonna be the bulk of what follows (you might notice a common author):
General Information

Specific Languages

Friday, July 14, 2017

The Problem of Polysynthesis

I hold many controversial opinions. One of them is that polysynthetic is a bad term, not just because of its vagueness but because of what it signals, especially in conlanging. Anyway, this argument with some people got out of hand (#selfexaminationhurts) (I said some dumb things too) so I never really got to explain why it is bad beyond the vagueness.

Here's the first thing I never sent and then I'll follow up with some other ideas I've had since then:

"Anyway, my point is that even now, the languages we choose to label as polysynthetic (especially taking the large number of morphemes approach) tend to fall on minority and especially disenfranchised groups. This wouldn't necessarily be a problem if there was actually an agreed on definition for polysynthesis. But there isn't, because whenever someone tries to come up with something, other people get angry because their language gets excluded (the biggest example of this being Baker and his exclusion of Inuit languages) or because a language they don't consider polysynthetic is included. So we are left with a category that means "lots of morphemes and it feels that way". Which then brings us back to the point that "feels that way", for whatever reason, closely aligns with "languages spoken by minority groups". So we have a category that (like all morphological typologies, mind you) doesn't really tell us much of anything about the languages classified in it, except that 1) they have long words with multiple morphemes; 2) they are not placed with the other languages for some reason.

And that's the crux of it. The category doesn't tell us anything that synthetic (here being agglutinative and fusional) doesn't already tell us. Yet people defend it so viciously and want their language to be in the category. Why?"

Well, a big part of it is what I call "fetishization of the exotic" (and I am guilty of it too). Polysynthesis is seen as something cool, so you want your language in it (especially for conlanging). It is seen as cool because it is different from IE (and especially English), therefore something you want to be. And that's where the underlying "racism" (for lack of a better term) comes in. It doesn't mean that the linguists/enthusiasts are being racist, but they are, because of the way the term has been used, perpetuating stereotypes and signaling certain ideas (namely primitivity/noble savage/north americanness) through the use of the term "polysynthesis". It is the "exotic" that really binds the different types of polysynthesis together, more so than head-marking, polypersonal agreement or noun-incorporation.

Why is this important? Well, the category "polysynthesis" hurts conlanging and reduces its diversity. How? First of all, since there are few if any actual tendencies that hold for polysynthesis, it isn't signaling features for the most part. Instead it signals that you want your lang to be North American-like, especially in a Salishanesque way. This is fine and all, but it further reduces the number of languages people learn about and makes them think that polylangs actually have many binding features. It also means that they are less likely to learn about features not found in those languages. For example, I did an informal survey on switch reference (with a bunch of polylang enthusiasts) and none of us could think of any conlang with a switch reference system (other than my own, in progress one). Why? I think a lot of it has to do with the fact that the primary references for polysynthesis don't have it, even though it is very common in polylangs in other parts of the world (like New Guinea) and even in the United States! In all this reduces the diversity of conlangs (I've seen one papualang (excluding my own) and none based on Australian langs, for instance) because people have an incomplete view of what "polysynthesis" really is and don't realize it.

Fetishization of the exotic aside, polysynthesis would be an okay term if it could be well defined, people agreed on a definition, didn't try so hard to fit every language into it and recognized its limits and unreasonability. It would be fine if the community used a wider variety (not just of Native American langs) of languages to act as references, showing the diversity in the term and maybe counteracting some (though not all) of the underlying marks/stereotypes within the term. But it doesn't and we don't have the self-awareness nor desire in the community to fix this. So I'm stuck ranting about it on a blog. Well the next time the inevitable "how do I polylang" or "I never see polylangs (cue 15 polylangs)" post comes up, I can link this as I try to raise awareness :p

Wednesday, February 22, 2017

Praat

I'm never going to get the hang of Praat. Good thing I'm not going into phonetics. Acoustics (and transcriptions in general) is just too hard. Here's the chart I came out with for English though:

Sunday, February 19, 2017

Intensification in Indonesian?

Random question of the day. Has anyone done a study/published a paper on the Indonesian infix -w- (alternatively -u-)? Like in panas vs pwanas and bagus vs bwagus. I think it comes from Javanese, but I can't find any literature about its use in Javanese either.
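Just to make the pattern concrete, here's a minimal sketch in Python of the rule as I understand it from those two pairs: the -w- goes right before the first vowel of the word. The function name `intensify`, the vowel list, and leaving vowel-initial words untouched are all my own assumptions for illustration, not anything taken from a published description.

```python
# Hypothetical sketch: insert the intensifying infix before the first vowel.
# Based only on the pairs panas/pwanas and bagus/bwagus; everything else
# (vowel-initial words, the vowel inventory used here) is an assumption.
VOWELS = set("aeiou")


def intensify(word: str, infix: str = "w") -> str:
    """Return the word with the infix placed after its initial consonant(s)."""
    for i, ch in enumerate(word):
        if ch.lower() in VOWELS:
            if i == 0:
                # Assumption: no known behaviour for vowel-initial words,
                # so leave them unchanged.
                return word
            return word[:i] + infix + word[i:]
    # No vowel found at all: leave the word unchanged.
    return word


print(intensify("panas"))  # pwanas
print(intensify("bagus"))  # bwagus
```

Whether real speakers apply this to any adjective, or only to a handful of common ones, is exactly the open question.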

I heard it all the time, especially in East Java, and occasionally I see it written too, but I can't seem to find anything actually about it. Like it is a feature that everyone seems to know and understand when someone else uses it, but no one officially recognizes it as a real thing.

Am I just crazy thinking that this is an actual productive infixation? Is my analysis of it being an intensifier wrong? Do people actually understand it whenever it is used, or only in lexicalized words? And most importantly, why has no one studied this? It seems like surely some Malayicist would have done a paper on it by now. If not, I guess I have a possible research paper to write, if I had time and was actually in a linguistics program.

Thursday, February 9, 2017

Adventures in Indonesian translation pt 1

What I perceived as a very poor translation on Facebook has led me to start submitting translations for Google again (its interface is more user-friendly than Facebook's. Also, it lets me do Indonesian to English, which they might need more than English to Indonesian. But I like to do both) and so far I've been submitting translations from Indonesian into English.

For those that don't know, the way this works is I am given a phrase or sentence (with no context) and then asked to give a translation. Not having context can make it pretty hard, and man do I get some funny things. What follows are some funny ones I've gotten, or ones that provide good translation notes.

  • "Hidup bebas di dalam air laut dan tawar."
    • I can't tell if this is some sort of saying or word of wisdom or if it is information about a fish. Is it an advertisement offering a way to free yourself from the perils of water? I eventually decided it was probably about a fish.
  • "Berita tertangkapnya tuyul tersebut membuat banyak orang penasaran."
    • Tuyul, a small spirit of the familiar sort. Where does Google get these to translate? Tuyul can't even be translated into English without a translator's footnote, imo. Penasaran is an interesting word too. Here (and most other places) it is being used like an adjective, even though the wordform itself is a noun. I translated it as curious (since that's how I almost always see it being used, and it makes sense in this case), but my dictionary says that it literally means "angered" or "anxious to find out something". I guess the second one could mean curious, but why not just say that? Tertangkapnya is also an interesting word, being a nominalization. I have an article about -nya nominalizations somewhere, I should probably read it again sometime.
  • "Einstein dengan teori relativitas khusus dan umumnya."
    • This, as far as I can tell, is a fragment in English and Indonesian. It's easy to translate (you probably don't need to know Indonesian to translate it), but it NEEDS MORE CONTEXT
  • "Begitu mereka terjerumus, adalah masalah besar di kemudian hari."
    • Terjerumus appears to be a new word for me. I think in this case it is definitely being used like "to fall into sin". Also, this appears to be a case where "adalah" means "ada-lah" not "is".
  • "Saya membeli kertas, pena, dan tinta."
    • This is the one where I realized I was translating most things into the past tense. Probably accurate, but context is really important for translations and even more so for Indonesian, where tense is so context based.
  • "Bonus dihitung dan diberikan secara harian."
    • I realized I've been doing a similar sort of thing with things that could be singular or plural. I think in this case it is plural. Honestly, I've gotten pretty bad at marking plurals in English sometimes, it just doesn't seem important anymore.
  • "Saya raba seluruh bagian tubuh yang sensitif"
    • This is the second translation that I think it pulled from a porn site. I'm translating these as unerotically as possible. "I groped all the sensitive body parts".
  • "Apa sih penyebab tubuh kita bisa merasakan gatal?"
    • I don't get many translations that use particles like "sih". Kind of hard to translate, but not too bad overall, though I did a pretty liberal translation on this one.
  • "Cara menghilangkan jerawat yang aman adalah secara tradisional."
    • I'm looking at this one and seeing reasonable translations. First I thought it would be "a safe and traditional way to remove acne", but then I noticed the adalah. The best translation would probably be "A safe way to remove acne is traditionally" and play on the fact that English does allow (I think) adverbs in the predicate like that.
  • "Tak pernah secuilpun kudengar kabar tentang dirinya."
    • Who even uses language like this? I guess I should try to preserve the formality of it. Trying to decide if I change the word order or preserve it for poetic sake.
  • "Kamu wonderwoman, yang membuatku ngerasa jadi superman"
    • I don't want the context on this one. Hopefully it's a song or a poem.


Other random translation note. -nya and dia are really difficult because I never know whether to translate them as he or she, or if a straight singular they is best. I NEED CONTEXT TO TRANSLATE. Google's advice? "If you feel you need more context (like gender or formality), go ahead and translate as best you can". This is why Google Translate (and all machine translations) suck. Machines can't understand the context and pragmatics of a statement.

Well that's enough for tonight. Translation is a really fun exercise.