9.10.15

Zipf Fun

Yo,

So, I've been rather unproductive, I feel, but at the same time, I think I have done a lot. Maybe at the same time though, I need rest and now I'm not doing much because of that need? I don't know. Whenever I end up feeling like this I get guilty for not really doing anything. Idk.

I've probably also been watching too much YouTube. It's a mixture of my regular subscriptions, cake/dessert DIYs, and super smart channels like Vsauce, Veritasium, MinuteEarth, etc. So, I feel like I'm learning a bunch, but I'm probably not really learning that much because it's kinda in one ear and out the other?

The other day I was watching this super neat Vsauce video about the Zipf Mystery, which in essence says that in language, all language, we tend to use some words way more than others. What's crazy is that if you rank the words by how often they're used, the frequency follows a power law: the word at rank x is used about 1/x as often as the 1st word. So the 2nd word shows up about half as often as the 1st, the 3rd about a third as often, and so on, which comes out as a roughly straight line if you plot it on log scales. And it's been shown that this is the case in books and even in an author's complete works. It's crazy!

I'm really interested in both linguistics and in science, so this greatly appeals to me. Additionally, I just so happen to have a compilation of my works, aka, a blog. So I can actually test this. haha. So, allow me to do so. I realize I have many many many blog posts, so this may take a bit of time. But I also don't want to waste my time, so I'm gonna try to do this as fast as possible. xP
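For anyone who wants to try this on their own posts, here's a minimal sketch of the counting step in Python. It's not necessarily how I did it: the function name `rank_words`, the regex for what counts as a "word", and the sample sentence are all just made up for illustration.

```python
import re
from collections import Counter


def rank_words(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Assumption: a "word" is any run of letters or apostrophes,
    compared case-insensitively. Other tokenizers will give
    slightly different counts.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # most_common() sorts by count, highest first
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical sample text, just to show the shape of the output
sample = "the cat and the dog and the bird"
for rank, word, count in rank_words(sample):
    print(rank, word, count)
```

To run it on a whole blog you'd paste all the post text into one string (or read it from a file) and pass that in instead of `sample`.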

Ttyl.

D.Fa

PS. This post follows a somewhat Zipf-like distribution! 289 words, 161 unique words. On the left is the simple frequency vs. word; on the right is the frequency vs. word rank on log scales, with the red line being an ideal Zipf distribution if the 1st word had 50 repetitions. This is a tiny sample though.
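The red "ideal" line is just 50 divided by the rank. Here's a rough sketch of how that line, and a quick check of how Zipf-like a set of counts is, could be computed. The helper names `ideal_zipf` and `log_log_slope` are hypothetical, and the least-squares slope is one simple way to compare, not the only one.

```python
import math


def ideal_zipf(top_count, n_ranks):
    """Expected count at each rank if count = top_count / rank."""
    return [top_count / rank for rank in range(1, n_ranks + 1)]


def log_log_slope(counts):
    """Least-squares slope of log(count) vs log(rank).

    Assumes counts are positive, ordered by rank (rank 1 first),
    and that there are at least two of them.
    An ideal Zipf distribution gives a slope of about -1.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# 50 repetitions for the 1st word and 161 unique words, as in this post
ideal = ideal_zipf(50, 161)
print(ideal[:5])
print(log_log_slope(ideal))
```

Feeding real ranked counts into `log_log_slope` and seeing how close the result is to -1 gives a rough number for "somewhat Zipf-like", though with a sample this small the long tail of words used only once will drag it around.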
