April 2nd, 2014

Open Source Project Brings 11th Century Kannada Verses Online

Vachana sahitya is a form of rhythmic writing in Kannada poetry that evolved in the 11th century C.E. and flourished in the 12th century as part of the Lingayatha movement. More than 259 Vachanakaras (Vachana writers) have compiled over 11,000 vachanas. 21,000 of these verses, which the Government of Karnataka, a state in southwestern India, published in a 15-volume set, “Samagra Vachana Samputa,” have been digitized. Two Wikimedians, along with Kannada linguist and author O. L. Nagabhushana Swamy, are handling the Unicode conversion and corrections and writing the preface for these verses. The entire work is now available as a standalone project called Vachana Sanchaya and is ready to enrich Kannada WikiSource.

Palm Leaf Vachanas

Palm leaf of 11th and 12th Century with Vachana poems in Kannada language

This project started a year ago, when Kannada Wikimedian Omshivaprakash was trying to help Professor O. L. Nagabhushana Swamy and Kannada author and publisher Vasudhendra easily access the vachanas (verses) of Vachana Sanchaya. Swamy had trouble using the publicly available Vachana content because the data was stored in ASCII, which made searching the text a huge problem. Pavithra Hanchagaiah began helping to collect information about vachanas and to bring them into Unicode, writing scripts that customized open source software to convert the Kannada fonts from ASCII into Unicode.
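To give a feel for what such a conversion script involves, here is a minimal sketch in Python. It assumes the legacy text stores each Kannada glyph as one or more ASCII characters and that a lookup table maps those sequences to Unicode. The table below is a hypothetical toy, not the real mapping of any Kannada font, and the function name is our own invention for illustration rather than the project's actual script.

```python
# Hypothetical toy table: these ASCII keys are NOT the real glyph codes of
# any Kannada font. A real converter needs the full table for its font.
LEGACY_TO_UNICODE = {
    "n": "\u0ca8",          # KANNADA LETTER NA
    "d": "\u0ca6",          # KANNADA LETTER DA
    "di": "\u0ca6\u0cbf",   # DA + VOWEL SIGN I
}


def legacy_to_unicode(text, table=LEGACY_TO_UNICODE):
    """Convert font-encoded ASCII to Unicode, preferring the longest match."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "di" wins over "d" alone.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(legacy_to_unicode("ndi"))
```

Matching the longest sequence first matters because legacy fonts often build one syllable from several key codes. Characters that pass through unchanged are a useful signal of where a table is incomplete, which is the kind of conversion error that then has to be fixed by hand or by a follow-up script.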

Kannada Language project

Pavithra Hanchagaiah and Omshivaprakash H L

After further discussions, it was decided to get thousands of vachanas into a database, making them easily searchable with an index. This required us to build a platform on which this could be done. The fruits of our labors will help linguistic researchers and students as well as the public at large, anybody who’s interested in reading and studying Vachana literature.

With this idea, Omshivaprakash started designing the model and his colleague Devaraju started building it. In the meantime, Pavithra was running various scripts to fix errors in the conversion of the ASCII text to Unicode, confirming that the data was ready to be consumed by the modules developed for the concordance. We spent weekends and holidays executing this project from home and would sync up once in a while online.

With constant feedback and guidance from Mr. Swamy and Vasudhendra, we learned how researchers use a concordance of a text and what would make their research easier. Omshivaprakash worked on the architecture of the platform, determined the infrastructure requirements and managed the entire project. Free and open source software technologies were used to keep the platform running. Pavithra provided critical hacks for digitization and offered valuable input through suggestions, feedback and Q&A.

Working system

At present, the system has around 200,000 unique words in the repository. It was an extensive learning process, as we used our free time to solve real-world problems. Moreover, this was a body of Kannada-language work that needed prompt attention. Vachana Sanchaya is meant to be more than just a repository of the text online; it’s meant to be a tool for researchers.

For example, when a user searches for a word on our system, he or she can see which Vachanakaras have used that word and in which vachanas. To improve readability, the searched text string is highlighted in each vachana that is displayed. To repeat the search for a specific Vachanakara, the user needs only to click on that writer's name on the graph provided on the result page. We have used the MediaWiki jquery-ime input tool architecture, which lets the user enter Kannada text directly in Unicode for a search.
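The search behavior described above can be sketched as a small concordance index in Python. This is an illustration under stated assumptions, not the platform's actual code: the records, the author labels and all function names are hypothetical placeholders, and the sample texts are just search words strung together, not real vachanas.

```python
from collections import defaultdict

# Placeholder records (id, author, text). These are NOT real vachanas.
VACHANAS = [
    (1, "author_a", "ಸತ್ಯ ಕರ್ಮ"),
    (2, "author_b", "ನದಿ, ಸತ್ಯ"),
    (3, "author_a", "ನದಿ ನದಿ"),
]

PUNCT = ".,;:!?\"'()[]"


def tokenize(text):
    """Split on whitespace and trim punctuation. A regex \\w+ is avoided
    because it would split Kannada words at vowel signs."""
    words = (w.strip(PUNCT) for w in text.split())
    return [w for w in words if w]


def build_index(records):
    """Map each word to the authors who used it and the ids where it occurs."""
    index = defaultdict(lambda: defaultdict(set))
    for vid, author, text in records:
        for word in tokenize(text):
            index[word][author].add(vid)
    return index


def search(index, word, author=None):
    """Return {author: [ids]} for a word, optionally limited to one author."""
    hits = index.get(word, {})
    if author is not None:
        return {author: sorted(hits.get(author, ()))}
    return {name: sorted(ids) for name, ids in hits.items()}


def highlight(text, word, mark="**"):
    """Wrap every occurrence of the searched string in a marker."""
    return text.replace(word, f"{mark}{word}{mark}")


index = build_index(VACHANAS)
print(search(index, "ಸತ್ಯ"))
print(highlight("ನದಿ, ಸತ್ಯ", "ನದಿ"))
```

Passing `author` to `search` mirrors clicking a Vachanakara's name to repeat the search for that writer only. The highlighter matches the raw string, so it also marks the word where it appears inside a longer word; a production system would need to decide whether that is the desired behavior.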

Public Response

We are glad to see people accessing vachanas from our Facebook, Twitter and Google+ channels. Thousands read them every day, and they have become part of many people’s daily routine. There have been more than 50,000 page views on social networks and 500,000 page views on our site in the first few months after our platform’s public launch. Some of the most commonly searched Kannada words are “ಕರ್ಮ” (Karma en: Work/Deed), “ಸತ್ಯ” (Sathya en: Truthfulness) and “ನದಿ” (Nadi en: River).

ಆಂಗೀರಸ, ಪುಲಸ್ತ್ಯ, ಪುಲಹ, ಶಾಂತ,
ದಕ್ಷ, ವಸಿಷ್ಠ, ವಾಮದೇವ, ನವಬ್ರಹ್ಮ, ಕೌಶಿಕ, ಶೌನಕ, ಸ್ವಯಂಭು, ಸ್ವಾರೋಚಿಷ, ಉತ್ತಮ, ತಾಮಸ, ರೈವತ, ಚಾಕ್ಷಷ, ವೈವಸ್ವತ, ಸೂರ್ಯಸಾವರ್ಣಿ, ಚಂದ್ರಸಾವರ್ಣಿ, ಬ್ರಹ್ಮಸಾವರ್ಣಿ, ಇಂದ್ರ ಸಾವರ್ಣಿ ಇವರು ಇಪ್ಪತ್ತು ಮಂದಿ ಪ್ರಪಂಚ ನಿರ್ಮಾಣ ಸಹಾಯ[ದ]ವರು. ಹತ್ತೊಂಬತ್ತು ಎಂದರೆ ಪುಣ್ಯನದಿಗಳು. ಅದು ಎಂತೆಂದಡೆ: ಗ್ರಂಥ

— An example of a vachana from the Vachana Sanchaya project.

Plans for the future

Our system is designed so that new features can be added easily. We have a review desk where researchers can help review the content. Later we will add references to the vachanas from various research works on this literature. The content is available to the public through an OpenData API and will be distributed in the public domain through WikiSource once the review work is complete. This will open up the system to students, developers, researchers and anyone interested in building linguistic tools for Kannada and other Indic languages.

This system will evolve so it can be used for other literature projects. Vachana Sahitya will further help us to initiate Natural Language Processing (NLP) projects if more researchers get together to tag the words, build a glossary, etc. We can also add various language tools, such as a spell checker and a grammar checker, through crowd-sourced development. The forthcoming projects under “Kannada Sanchaya” are Sarvagnana Vachanagalu and Dāsa Sanchaya, which are already in the pipeline. Our idea is to extend this platform to include works from antiquity (Vyasa, for example) to the early 20th century (e.g., Muddanna) and possibly even contemporary literature that’s available in the public domain.
