coding

Setting up a clean project: the best practice I use

I’ve been settling in at my new job at the Language Development Lab at HKU, where I started in person in March 2021. I will soon post an update on my time in Hong Kong since then, but this post is written with future me in mind.

Tidy collostructions

tl;dr: In this post I look at the family of collexeme analysis methods originated by Gries and Stefanowitsch. Since they use a lot of base R, and love

Guanguan goes the Chinese Word Segmentation (II)

tl;dr: This double blog is first about the opening line of the Book of Odes, and later about how to deal with Chinese word segmentation, and my current implementation

Guanguan goes the Chinese Word Segmentation (I)

tl;dr: This double blog is first about the opening line of the Book of Odes, and later about how to deal with Chinese word segmentation, and my current implementation