R

Tidy collostructions

tl;dr In this post I look at the family of collexeme analysis methods originated by Gries and Stefanowitsch. Since their implementations use a lot of Base R and rely heavily on vectors, there is a hurdle to overcome if you are used to the rectangles of tidy data. I first give an overview of what the method tries to do, and then at the end show the …

Guanguan goes the Chinese Word Segmentation (II)

tl;dr This double blog is first about the opening line of the Book of Odes, and later about how to deal with Chinese word segmentation, and my current implementation of it. So if you’re only interested in the computational part, look at the next one. If, on the other hand, you want to know more about my views on the translation of guān guā…

Guanguan goes the Chinese Word Segmentation (I)

tl;dr This double blog is first about the opening line of the Book of Odes, and later about how to deal with Chinese word segmentation, and my current implementation of it. So if you’re only interested in the computational part, look at the next one. If, on the other hand, you want to know more about my views on the translation of guān guā…

Mapping the terminology for ideophones

Goal The goal of this short update is to use the R package lingtypology (click here for the tutorial) to create a map showing which terminology relating to ideophones we use for which languages. Now, I know that the data isn’t complete yet; it is an ongoing cataloguing project. You can find more recent versions of this map on my GitHub …