Data packages for current and future me

tl;dr I show why it is worthwhile to put my Chinese-related datasets in packages and how I went about it. Introduction I don’t know if I’m very late to

Guanguan goes the Chinese Word Segmentation (II)

tl;dr This double blog post is first about the opening line of the Book of Odes, and later about how to deal with Chinese word segmentation, and my current implementation

Bridging phonology, meaning, and written form across time: introducing CHIDEOD, a database of Chinese literary ideophones

With an open source database of ideophones we can address a multitude of issues regarding Chinese ideophones.