rockym93 dot net

archive · tags · feed

A New Herp

20 April 201304:58AMcode

The word derp and its derivatives are really useful, mostly content-free filler. I could go on a whole tangent about the sociolinguistic implications of that, but that's not what I'm here to do today. Today I come before you to revolutionise the way you derp, with the power of software. More specifically, I'm going to write a script which will generate every possible single-syllable variant of the words ending in 'derp'.

Why? Because I was bored. Let's begin.

We're looking for single syllables, which means no vowels, so the logical place to start is with a list of consonants.

c = ['b','g','d','p','k','t','v','z','f','s','sh','r','l','n','m']

The obvious problem here though is that you can't just mash consonants together and get something pronouncible. Not only is there an upper limit of three- ish in a row, but there's also some kind of rules about what can follow what. We have to split it into categories of some sort.

vs = ['b','g','d']
us = ['p','k','t']
vf = ['v','z']
uf = ['f','s','sh']
l = ['r','l']
n = ['n','m']

Of course, right about the time I was writing that, I realised that's not good enough. These are basically just dumb lists, which means that I'm going to have to write a function (or more likely, a whole set of functions) that somehow know what to do with a list of (say) voiced fricatives, and what to put after them, and how many of them you can have in a row. That's probably a valid approach, and it'd lead to some mildly interesting insights into English phonology, but I happen to think that there's a better way.

Basically, I need to implement the sonority sequencing principle. This is a generalisation about syllable structure which says, basically, that the loudest part goes in the middle. And loudness (well, sonority) is something that you can reduce to a numeric value. And once you have numeric value, you can work out a way to put it in sequence.

class soundtype():
    """A represents a class of sounds and what they can appear after."""
    def __init__(self, sonority, wordedgey, membersounds):
        self.sonority = sonority
        self.sounds = membersounds
        self.wordedgey = wordedgey

This is the data structure I've ended up using. It holds a set of sounds, a sonority value, and a true/false falue about whether or not these particular sounds are allowed to appear out of sequence at word edges (which is something that, for example, 's' does in 'sprints'.)

def derpify(allsounds,already="",depth=1):
    if depth <= 3:
        for set in allsounds:
            next = []
            for test in allsounds:
                if test.sonority > set.sonority:
                    next.append(test)
            if next == [] and not set.wordedgey:
                for test in allsounds:
                    if test.wordedgey:
                        next.append(test)
            for sound in set.sounds:
                print sound + already + "erp"
                if next:
                    derpify(next, sound + already, depth + 1)

This is the function. It takes a bunch of soundtypes, makes a list of what can come before them for each of them, and then passes that list and the string so far to itself. The list gets smaller every time. When it runs out, it checks the edgeyness, and adds an edgey sound if it hasn't already. It will do this at most three times, because that's the upper limit on consonant clusters for English.

l = [
soundtype(2,False,['y','w','h']),   #glides
soundtype(3,False,['l','r']), #liquids
soundtype(4,False,['m','n']), #nasals
soundtype(5,False,['z','v']), #voiced fricatives
soundtype(5,False,['f',]), #voiceless fricatives
soundtype(5,True,['sh','s']), #voiceless fricatives which can appear at a word edge
soundtype(6,False,['b','g','d','j']), #voiced stops
soundtype(6,False,['p','k','t','ch']) #voiceless stops
]

derpify(l)

That's the data I fed in. And that's what I got out. It's not half bad. I'd say it produces everything you can pronounce, but not quite everything it produces is pronouncible. There are a couple of rules about things like putting voiced and unvoiced things next to each other which need to be taken into account, but by this point I'm losing interest and it's really late at night.

There's also easily room to modify this to produce every pronouncible syllable in English. Just add vowels, and a mechanism to reverse the sonority sequence on the other side of the syllable. Again though, I'm not really interested. If you are, feel free to play around with the code. Or just peruse the list, to find an interesting derplike for your next conversation.

...

I can't believe this is what I do on Friday goddamn night.

< 1992-2013 Bee Grade Movie >