philena ([personal profile] philena) wrote 2012-01-29 09:36 pm

More electronic toys

It seems that I failed to write an entry bragging about learning Perl (it would have been a thematic companion to my "I run Linux now!" bragging). So, briefly: I learned Perl this summer! I'm hardly a whiz and there's a whole bunch that I don't know how to do, but I'm good enough now that I can use it for real things. Specifically, I learned it for a project that required the ability to sift through large amounts of text (in this case, a corpus of spoken Dutch), pull out specific words (in this case, everything tagged as a diminutive) and their contexts (was the preceding word a definite determiner or not?), and then match them to another file that had information about those words (what is the gender of the base word?*). I'm sure it was not particularly elegant, but it got the job done and I'm waiting to hear back from a conference to which I submitted the abstract that resulted from the analysis of the output of all this scripting.
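A minimal sketch of that kind of task, written in Python rather than Perl, might look like the following. It is not the actual script: the token format (word, tag, lemma), the tag name `DIM`, and the little gender lexicon are all made up for illustration and are not taken from the real corpus or lookup file.

```python
# Hypothetical data layout: each token is a (word, tag, lemma) triple.
# The tag "DIM" and the lexicon below are invented for this example.
DEFINITE_DETERMINERS = {"de", "het"}


def find_diminutives(tokens, gender_by_lemma):
    """Return one record per diminutive: the word itself, whether the
    preceding word is a definite determiner, and the gender of the base
    word (None if the base word is not in the lexicon)."""
    records = []
    for i, (word, tag, lemma) in enumerate(tokens):
        if tag != "DIM":
            continue
        # The first token in the text has no preceding word.
        previous = tokens[i - 1][0].lower() if i > 0 else None
        records.append({
            "word": word,
            "after_definite": previous in DEFINITE_DETERMINERS,
            "base_gender": gender_by_lemma.get(lemma),
        })
    return records


tokens = [
    ("het", "DET", "het"),
    ("hondje", "DIM", "hond"),
    ("ziet", "V", "zien"),
    ("een", "DET", "een"),
    ("huisje", "DIM", "huis"),
]
gender_by_lemma = {"hond": "common", "huis": "neuter"}

for record in find_diminutives(tokens, gender_by_lemma):
    print(record)
```

Keeping the gender information in a dictionary keyed by base word is what makes the "match them to another file" step a single lookup per diminutive.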

That, however, is not the point of this entry. The point of this entry is some more bragging! I'm now learning Python in addition to Perl! I'm not entirely sure yet what Python can do that Perl cannot, but the computational methods in linguistics class that is being offered here this semester uses Python as its language of choice, and there's certainly no reason not to be able to add more languages to my growing toolkit of computery-things. Class has only lasted about two weeks so far, and I certainly would be happier if it moved a smidge faster, but even at the reasonably slow pace (which assumes no prior programming experience) I've learned enough to vastly prefer the ability of Python user-defined functions that allow you to specify the number and type of arguments over the inability of Perl user-defined subroutines to do anything other than squash all the arguments into one long list, which requires the meat of the subroutine to do some pretty fancy manipulations in order to do the right thing to the right argument. I have also learned enough about programming in general to adore PyScripter! Our professor mentioned it as an alternative to the built-in IDLE, and I must say it is vastly superior. Not only does it not freeze when I run certain scripts as IDLE does, it also has all sorts of nice bells and whistles! For example, a regular expression test pane! When I worked in Perl, my approach to regexes was to cross my fingers and hope, but there's no need for that here!
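To make the point about named arguments concrete, here is a small sketch using the standard `re` module. The function name `count_matches`, the pattern, and the sample sentence are invented for this example. Each argument has its own name in the function header, so nothing has to be unpacked from a single list, and calling the function with too few arguments raises a `TypeError`.

```python
import re


def count_matches(pattern, text, ignore_case=False):
    """Count the non-overlapping matches of `pattern` in `text`.

    Two required arguments and one optional one, each with its own name.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return len(re.findall(pattern, text, flags))


sentence = "Het hondje en het Huisje"
# Words that start with "h" and end in "-je" (a rough, made-up pattern).
pattern = r"\bh\w+je\b"

print(count_matches(pattern, sentence))
print(count_matches(pattern, sentence, ignore_case=True))
```

Trying a pattern on a short sample string like this is also a cheap stand-in for a regular expression test pane.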

I have lots more to say on the difference between Perl and Python here, but I will refrain out of courtesy to the majority of my readers (Hi, Mommy!) who, I suspect, cared enough to expand the cut, but now have glazed eyes and are wishing I'd talk about something else.

*In fact, getting that information from the second file also required a bit of Perl scripting to adjust the file from its extremely unhelpful format into a format that was searchable, with all the forms of the same word indexed by the same number.

After my prospectus hearing last December, my committee gave me a Task to accomplish this semester. More specifically, my dissertation project depends on certain constructions being more or less probable, but the previous research that suggests that these constructions are more or less probable has been conducted on extremely unbalanced corpora of Russian prose (e.g., the works of four Russian literary authors from 1850-1950, plus three years of newspaper articles), and yielded only a few hundred observations*. So my Task is to design a norming study that systematically manipulates the factors that these corpus studies suggested were important in order to (a) confirm the existence of the proposed effects, and (b) get a better sense of the effects' magnitude. I will then run the norming study on the same population that I hope to run my actual experiment on, so that when I run the actual experiment this summer or next fall I will have a better set of baseline measurements than the ones currently available from the literature.

This weekend, I finally came up with enough sentences to design a set of stimuli lists that will have enough systematic manipulation of factor values to provide useful data. Then I met with a friend who speaks Russian and he helped me change the sentences I came up with to better sentences that do not sound at best like Soviet realist poetry, as he kindly put it, and at worst like "what is this even supposed to mean?". I was pleased that fewer than half the sentences needed heavy revision.

I might also take this moment to circle back to the electronic theme and mention that I am very pleased that my advisor gave me an extremely nice big-screen second monitor to use with my computer in my cubicle in the psycholinguistics lab. Without it, designing the sentences would have been much more tedious. I was aiming for a Latin Square design, which requires a lot of different items to be systematically cycled through a lot of different conditions and then put into a set of different lists so that one subject sees the sentence about leaves growing on the ficus in Condition 1a, while a different subject sees that sentence in Condition 2a, and a third sees it in Condition 2b, and so on. When I tell you that my conditions yield a 2 x 2 x 3 x 6 design, you'll understand that this made for a very elaborate spreadsheet, and I have never appreciated the use of a big-screen double-monitor set-up as much as I did this week.
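For readers who would rather see the rotation than the spreadsheet, here is a minimal Python sketch of a Latin Square assignment. It assumes one item per condition and a simple cyclic rotation, which may not match the actual stimulus lists. The function name `latin_square_lists` and the numeric stand-ins for factor levels are made up for illustration.

```python
from itertools import product


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by cyclic rotation.

    In list number `offset`, item `i` appears in condition
    (i + offset) mod n, so across all lists every item is seen in
    every condition exactly once.
    """
    n = len(conditions)
    return [
        [(item, conditions[(item + offset) % n]) for item in range(n_items)]
        for offset in range(n)
    ]


# A 2 x 2 x 3 x 6 design: each condition is one combination of levels.
factor_levels = [2, 2, 3, 6]
conditions = list(product(*(range(k) for k in factor_levels)))

lists = latin_square_lists(len(conditions), conditions)
print(len(conditions), "conditions,", len(lists), "lists")
```

With 72 conditions, each list holds 72 items and no subject sees the same sentence twice.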

*For comparison, the corpus study on Dutch that I described above had about 600,000 observations. Granted, it is extremely large, but it is still the case that 373 is unusually small.

Mr. Philena and I went to see The Gondoliers today! It's the first time I've seen the play since I was in the chorus in college, and it was really lovely. It's certainly not one of the best Gilbert and Sullivans, but the singing and staging and music and acting were excellent, and even mediocre Gilbert and Sullivan is reliably good. The venue was in a sort of ritzy community that was easily accessible by public transportation, and we had a nice stroll around the neighborhood, which, in addition to the expected clothing boutiques and haircutteries, boasted no fewer than three piano stores, two of them on the same block, and a store dedicated to cupcakes, which we naturally patronized. It's the kind of neighborhood that is very good for taking one's parents to: lots of nice restaurants (and piano stores), and a very large arts center that puts on productions of Gilbert and Sullivan and George Bernard Shaw plays. We're hoping to go there to see Arms and the Man sometime next month, and I hope we'll like it as well as we did today's play.

It is possible that one reason I'm so eager to see more good theater is that Mr. Philena and I have been extremely unlucky with movies recently. I happen to really like the old Jack Lemmon and Cary Grant and Hitchcock type movies, as well as British costume dramas (Downton Abbey), but Mr. Philena has gotten bored with them (and I can see why). Mr. Philena really likes nature documentaries, but we've seen the entirety of Planet Earth, Life in the Undergrowth, Life of Birds, Life of Mammals, and Life in the Freezer, and as much as I find David Attenborough charming, I've gotten a bit bored with them (and I hope one can see why). There are some movies that I don't even try to suggest we watch together (e.g., science fiction thrillers and action flicks), and Mr. Philena likewise doesn't expect me to join him watching biopics about musicians. Sometimes we find really, really good non-classic movies that we both adore (The Lives of Others and Clueless, in particular), but it seems that more often when we watch new movies or famous movies we're disappointed (Adaptation) or so dissatisfied that we simply turn them off partway through (Crash, Magnolia*, 2001: A Space Odyssey**). The most recent disappointment was Fast Times at Ridgemont High. It had been described to me as a sort of Clueless-type thing, only earlier and not quite as good, but worth seeing as a cultural icon. And it may well be a cultural icon, but I was certainly not expecting all the sex, and there was nothing clever or subtle about any of the rest of it. Mr. Philena and I could easily predict a lot of the dialogue, and about the time the girl who has been having lots of sex reveals that *gasp* she's pregnant! we simply couldn't take it any more and turned it off again. This is getting very tiresome. I know that it is possible to find good television and good movies that we both like. We've done it before. I just wish we could do it again.

*It is possible that there was some confusion between Magnolia and Steel Magnolias, which led to the lack of success of this movie.

**I must confess that we fall among the crowd who cannot see past the incredible boringness to understand the brilliance. We did give it a fair try, though--we watched the whole monkey sequence and a fair bit of the space-faring bit before we simply couldn't take another lengthy shuttle-docking scene and turned it off.

[personal profile] mithrilian 2012-01-30 09:25 am (UTC)
If you like G&S, you may also like the works of the Hungarian composers Franz Lehar and Imre Kalman.

Kalman, especially, is well known and loved in Russia. I'd recommend an old b/w Soviet film, "Mister X".