This post is an account of implementing an Input Method Engine (IME) for translation on top of the Intelligent Input Bus (iBus), an input method framework available on modern Linux desktop environments.
Most of my work in the last few years has been in the vicinity of building a library for on-device machine translation. In one exchange over GitHub, I was tasked with implementing the underlying library requirements for Outbound Translation. In essence, the user enters text in a source language they know, and a software layer intercepts it to provide the content in the target language.
This problem immediately resonated with me. At some point during my internship at NAVER LABS Europe, when I stayed in France, I had to contact La Poste customer care. Calling them up was out of the question, because my ability to process spoken French over the phone was poor. Somehow I managed to find an accessibility chat interface where I could “chat” with customer support using text. What bothered me as a user was that the medium was French, and I was sitting with multiple tabs of Google Translate, manually copying and translating the content I received, then translating the content I wanted to send out from English to French.
I have been typing my mother tongue, Malayalam, which uses a non-Latin alphabet and has a script of its own, on an English (US) keyboard since high school. The technology that enabled me to do this at the time was Swanalekha. Learning the InScript keyboard looked like hard work, and typing out odd combinations of Latin letters to produce the intended Malayalam letter felt easier to learn. The technology is enabled by iBus, which one can find today in modern Linux-based operating systems with tight integration with the GNOME desktop. It enjoys widespread use for inputting non-Latin scripts by means of a Latin keyboard layout (e.g. English (US)), while also providing software layers for non-Latin keyboards.
In a 1:1 with Kenneth Heafield, I brought up the idea of taking the entire outbound translation concept from browser-only to system-wide using the Intelligent Input Bus (iBus). While the know-how for connecting iBus to the library was not straightforward, the possibility of a solution using both was acknowledged. However, the Bergamot Project was more focused on being cross-platform and in the browser (on operating systems beyond Linux), so being a Linux-specific keyboard layer in the operating system turned out to be a downside of the pitch.
My counterparties at Mozilla colossally delayed the extension’s implementation of Outbound Translation, leaving me plenty of idle time, some of which could be redirected towards this idea. Since the tools were decided, the solution was not all that complicated. Lemonade would be implemented in C++ as an engine that implements the iBus interface and connects to the bergamot-translator C++ library. Together with the ecosystem (models, fast-nmt engine) built by the Bergamot Project, lemonade manages to run the translations completely locally, providing the privacy benefits the Bergamot Project set out to achieve. For the purposes of building an IME, I found ibus-libzhuyin, which I could modify and connect to the Bergamot C++ library to reach a minimum viable product. Non-traditional applications like ibus-typing-booster further improved my confidence in using iBus.
The user interacts with iBus through two elements: (1) a panel available system-wide and (2) an input UI in the vicinity of the text area where the user intends to input the translated text.
Panel The panel, often available in the top-right corner in GNOME, lets the user choose from the available input methods.
It also lets the user switch the source language and the target language, and configure a verify option. For now, xx->xx is configured to be a passthrough.
Input UI The input UI inverts the traditional usage. The existing implementation shows the translation as pre-edit text, which is the underlined text that has not yet been entered into the target text area. A commit action by the user inserts the pre-edit text into the text area. The candidate list is used to show the text entered by the user before committing.
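To make the inversion concrete, here is a minimal sketch of the flow in plain Python. It is a toy model and not the iBus API or lemonade's actual code: the class `InvertedInputSession`, its methods, and the `toy_translate` lookup are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class InvertedInputSession:
    """Toy model of the inverted input UI (hypothetical, not the iBus API)."""

    translate: Callable[[str], str]  # assumed translation hook
    source: str = ""  # what the user has typed so far
    committed: List[str] = field(default_factory=list)  # text already in the text area

    @property
    def preedit(self) -> str:
        # The pre-edit (underlined) text is the translation, not the raw input.
        return self.translate(self.source) if self.source else ""

    @property
    def candidates(self) -> List[str]:
        # The candidate list shows the source text the user typed.
        return [self.source] if self.source else []

    def key_in(self, text: str) -> None:
        # Every keystroke extends the source buffer; the pre-edit follows it.
        self.source += text

    def commit(self) -> None:
        # Committing inserts the pre-edit text and clears the source buffer.
        if self.source:
            self.committed.append(self.preedit)
            self.source = ""


def toy_translate(text: str) -> str:
    # Stand-in for a real engine: a tiny lookup that falls back to the input.
    return {"hello": "bonjour", "thank you": "merci"}.get(text, text)


session = InvertedInputSession(translate=toy_translate)
session.key_in("hel")
session.key_in("lo")
print("pre-edit:", session.preedit, "| candidates:", session.candidates)
session.commit()
print("committed:", session.committed)
```

Note that the source buffer is discarded on commit in this sketch, so the original input cannot be recovered once the translation has been inserted.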
While we only have en->yy models internally, a pivoting feature allows for xx->yy by using the models involving English in sequence.
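Pivoting and the passthrough can be sketched in a few lines of plain Python. This is an illustration under assumptions, not how bergamot-translator implements it: `make_translator`, the `(source, target)` model registry, and the word-lookup "models" are hypothetical stand-ins for real translation models.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[str], str]


def word_lookup(table: Dict[str, str]) -> Translator:
    # Toy "model": translate word by word, leaving unknown words untouched.
    return lambda text: " ".join(table.get(word, word) for word in text.split())


def make_translator(
    models: Dict[Tuple[str, str], Translator],
    source: str,
    target: str,
    pivot: str = "en",
) -> Translator:
    if source == target:
        # xx->xx: hand the text back unchanged.
        return lambda text: text
    if (source, target) in models:
        # A direct model exists, so no pivoting is needed.
        return models[(source, target)]
    first = models.get((source, pivot))
    second = models.get((pivot, target))
    if first is None or second is None:
        raise KeyError(f"no route from {source} to {target} via {pivot}")
    # xx->yy: run source->pivot, then pivot->target, in sequence.
    return lambda text: second(first(text))


models = {
    ("fr", "en"): word_lookup({"bonjour": "hello", "monde": "world"}),
    ("en", "de"): word_lookup({"hello": "hallo", "world": "welt"}),
}

fr_to_de = make_translator(models, "fr", "de")
print(fr_to_de("bonjour monde"))
```

Errors made in the first hop are fed into the second, so a pivoted translation is generally no better than either of the two models it chains.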
Verifying machine-generated translations
Okay, so now the user is potentially trusting a machine-learning system to intercept and translate the content being typed in. Lost in translation is a thing, sometimes with dangerous consequences, including getting detained. How does one boost the user’s confidence in the translated text?
Zouhar et al. (2021) studied this as part of the Bergamot Project and came up with the UI recommendation of additionally showing the backtranslated content to the user. In this setting, we take the translated text and translate it back to the source language, which the user understands. If the backtranslated text matches what the user wrote, we can be more confident that the text being sent is correct.
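A rough sketch of the backtranslation check follows. The automatic similarity score and the 0.8 threshold are my own assumptions for illustration, since the recommendation above is about showing the backtranslation to the user rather than scoring it. `verify_by_backtranslation` and the lookup-table translators are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import Callable, NamedTuple


class Verification(NamedTuple):
    translated: str       # text that would be sent out
    backtranslated: str   # the same text translated back for the user to read
    score: float          # similarity between input and backtranslation, 0..1
    consistent: bool      # True when score reaches the (assumed) threshold


def normalize(text: str) -> str:
    # Ignore case and spacing differences when comparing.
    return " ".join(text.lower().split())


def verify_by_backtranslation(
    text: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> Verification:
    translated = forward(text)
    backtranslated = backward(translated)
    score = SequenceMatcher(
        None, normalize(text), normalize(backtranslated)
    ).ratio()
    return Verification(translated, backtranslated, score, score >= threshold)


# Toy translators standing in for real en->fr and fr->en models.
en_fr = {"good morning": "bonjour"}
fr_en = {"bonjour": "good morning"}

result = verify_by_backtranslation(
    "Good morning",
    forward=lambda s: en_fr.get(normalize(s), s),
    backward=lambda s: fr_en.get(normalize(s), s),
)
print(result)
```

A character-level ratio is a crude proxy. A good translation can come back as a paraphrase and score low, so the backtranslated text itself is what should be shown to the user.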
In the screencast below, I use lemonade (system-level) for outbound translation in conjunction with a local-translation browser extension, jelmervdl/firefox-translations.
Lemonade also works in other applications, in pretty much any text area, by intercepting the keyboard and input method. See the different controls being configured in a word processor.
I am grateful to Kenneth and the UI group in Bergamot for the attention and feedback they gave to what is a hobby-horse side project. The feedback I have received so far notes that this is particularly useful in a chat setting, but limited when richer editing requirements are involved.
I managed a few interactions with the UI (Research) team, and I perceived a lot of the problems they highlighted to be quite hard.
UI Limitations The inability to control the UI elements (primarily because of my lack of understanding of iBus) impedes complex UI mechanisms. For example, once the pre-edit text is committed, there is no way to backtrack to the original source text that the user entered. This was deemed useful for complex editing workflows, which are quite common on the web. It is, however, a constraint that comes from sticking to the iBus-specified interface. I am keeping open the option of a web-based input method to gain more development capabilities.
Flickering There is increased instability in the text during translation. This is a necessary evil, as translations are not monotonic in nature and larger contexts lead to drastically different word orderings in the translated text. The modified translation is perhaps more suitable than an incremental one. This problem is shared by interactive translation research and some speech translation work, which requires stability as translations of transcribed speech progress.
Edit workflows Queries often arose about being able to edit and verify at the word level rather than the sentence level after an initial draft translation was committed. Word-level editing appears to be quite hard, especially when the signals we have during inference are faint.
In the current setting, the implementation uses hardcoded file paths and relies on the bergamot Python package to fetch and inventory the models. A better position would be to eventually connect to translateLocally. translateLocally’s vision is perhaps to be a cross-platform GUI application, and I’m trying to convince the authors to separate out the application library so lemonade can pick it up. The pursuit of a Native Messaging extension in translateLocally brings its features closer to what lemonade requires.
This could also be done using the Python interface that iBus allows for, but at the time the Python package for bergamot was not very mature.
The lemonade source currently sits in jerinphilip/lemonade under a permissive license. It has a lot of rough edges, which I expect to smooth out as free time permits. If you’d like to help out with development, please feel free to drop by the GitHub issues/discussions or even contact me via email.
Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, and Lisa Yankovskaya. 2021. Backtranslation feedback improves user confidence in MT, not quality. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 151–161, Online, June. Association for Computational Linguistics.