Dr Samantha Pendleton
Clinical Informatician
Engineer of data, ontologies, and clusters. Thrower of pots, controllers, and eggs.
Jabberwocky
Jabberwocky is a Natural Language Processing (NLP) toolkit for those nonsensical ontologies [1]. It is available as open source on GitHub.
To avoid duplicating information, all information is provided in the links:
I started development in 2019, and Jabberwocky v1.0 was published in 2020 [1].
Version 2.0 (2021) improved the annotation script with a Phrase Matcher [2] so that both key terms and phrases work. Furthermore, high-level functions for stop word removal and text cleaning were introduced, along with a new plotting feature: a pretty word cloud! After a few years of a PhD and starting a job, I came back to Jabberwocky in 2024, and version 3.0 was a complete revamp of the repository.
Shortly after, version 3.1 was released! It included a new plotting feature, and the TF-IDF [3] statistical method was improved with the ability to use n-grams, so rankings can expand to unigrams, bigrams, trigrams, and more.
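To give a feel for what ranking n-grams with TF-IDF means, here is a minimal pure-Python sketch. It is not Jabberwocky's own code: the function names `ngrams` and `tfidf`, the whitespace tokenisation, and the unsmoothed `log(N / df)` weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=2):
    """Return all contiguous n-grams with n_min <= n <= n_max, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_min=1, n_max=2):
    """Score every n-gram of every document (hypothetical helper, plain TF-IDF)."""
    # Naive tokenisation: lowercase and split on whitespace.
    grams_per_doc = [ngrams(d.lower().split(), n_min, n_max) for d in docs]
    # Document frequency: in how many documents does each n-gram occur?
    df = Counter(g for grams in grams_per_doc for g in set(grams))
    scores = []
    for grams in grams_per_doc:
        tf = Counter(grams)
        total = len(grams) or 1  # avoid dividing by zero for empty documents
        scores.append({
            g: (count / total) * math.log(len(docs) / df[g])
            for g, count in tf.items()
        })
    return scores


corpus = ["the slithy toves", "the mome raths"]
for doc_scores in tfidf(corpus, n_min=1, n_max=2):
    # Highest-scoring n-grams first.
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1])[:3])
```

With `n_max=1` only single words are ranked; raising it lets phrases such as "slithy toves" compete with single words in the same ranking. An n-gram that appears in every document scores zero under this weighting.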
In 2025, the v3.1.1 updates were inspired by an old project, cyannotator: users can now request an HTML output of the corpus with key terms highlighted.
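The idea of an HTML output with highlighted key terms can be sketched in a few lines of standard-library Python. This is only an illustration under my own assumptions, not how Jabberwocky or cyannotator do it: the function name `highlight_terms`, the use of the `<mark>` tag, and the case-insensitive whole-word matching are all hypothetical choices.

```python
import html
import re


def highlight_terms(text, terms):
    """Return an HTML fragment with each key term wrapped in <mark> (hypothetical helper)."""
    if not terms:
        return html.escape(text)
    # Try longer terms first so a phrase wins over a shorter term inside it.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    parts, last = [], 0
    for match in pattern.finditer(text):
        # Escape the untouched text so characters such as '<' stay literal.
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(highlight_terms("Beware the Jabberwock, my son!", ["jabberwock"]))
```

Escaping the text before wrapping matches keeps any markup-like characters in the corpus from being interpreted as HTML. The `\b` word boundaries assume that terms start and end with a letter or digit.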
As of 2026, v4.0 has been released, with a complete change of the log scripts and error handling, conversion of Excel to OWL, and more.
References
[1] Pendleton, Samantha C., and Georgios V. Gkoutos. “Jabberwocky: an ontology-aware toolkit for manipulating text.” Journal of Open Source Software 5.51 (2020): 2168.
[2] Honnibal, Matthew, et al. “spaCy: Industrial-strength natural language processing in Python.” (2020).