Choosing Java as your language for a Machine Learning project: are we crazy?
Most people are stunned when they realize that the Xatkit bot engine is written in Java. True, the vast majority of AI / Machine Learning projects are written in Python. But this doesn’t mean that you have to go with Python when starting your own project. And don’t worry, this is not a post about language wars. I’m not claiming that Java is better than Python (or the other way round, for that matter). I’m just explaining our language choice and suggesting that you take many aspects into account when choosing the base language for your next project.
Let’s see why Java is a good choice for Machine Learning projects, or at least as good a choice as many others:
- Machine Learning is only a small part of your project. Most of your code will NOT be about ML tasks but about data input/output, user interface, interaction with external services and so on, so the language needs to be good at all these things as well. This is especially true for chatbots, which, to begin with, need to interact with different user input platforms.
- There are ML libraries available for every major language. So there is always a way to train and execute your neural networks outside the Python world. For instance, in Xatkit, we reuse Stanford’s CoreNLP models in some of our language processors. And, if needed, there is always the option to wrap the ML model code in a Python server (I like the simplicity of Flask for this) and consume it via API calls to this server.
- Java is heavily used in the enterprise world. So while core ML fans may frown at our language choice, enterprise users may see Java as a benefit: they already know how to manage and deploy Java-based applications, but they may not have the same experience with Python or other languages.
- We are Java “experts”. We are much more productive coding in Java than in any other language. Of course, we could become proficient in Python if we put in the time, but time is precious and it made sense to stick to the language we were already using in other projects.
- Xatkit is a model-based tool. By model, I refer here to software design models, not ML ones. And in the modeling ecosystem, Java is still the boss. In particular, Xatkit reuses some EMF (Eclipse Modeling Framework) libraries, mostly to do some reflection on the bot definition at runtime. For sure, there are other ways to accomplish the same goal, but you can see this as a legacy decision made before Xatkit embraced Fluent APIs for the bot definition.
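To make the "wrap the model in a Python server" option from the list above a bit more concrete, here is a minimal sketch of what the Java side could look like. It is not Xatkit's actual code. It assumes a hypothetical Flask app listening on `http://localhost:5000/predict` that accepts a JSON body with a single `text` field and returns its prediction as the response body; the endpoint, port, field name and the `ModelServerClient` class are all made up for illustration. It only uses the JDK's built-in HTTP client (Java 11+).

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class ModelServerClient {

    // Hypothetical endpoint of a Flask app wrapping the ML model (an assumption).
    public static final String DEFAULT_ENDPOINT = "http://localhost:5000/predict";

    /** Escapes the characters that would break a JSON string literal. */
    public static String jsonEscape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body; the "text" field name is an assumed contract. */
    public static String buildPayload(String utterance) {
        return "{\"text\": \"" + jsonEscape(utterance) + "\"}";
    }

    /** Builds a POST request carrying the user utterance as JSON. */
    public static HttpRequest buildRequest(String endpoint, String utterance) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .timeout(Duration.ofSeconds(5))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildPayload(utterance)))
                .build();
    }

    /** Sends the utterance to the model server and returns the raw response body. */
    public static String predict(HttpClient client, String endpoint, String utterance)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(endpoint, utterance), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Model server returned HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            System.out.println(predict(client, DEFAULT_ENDPOINT, "I want to book a flight"));
        } catch (IOException e) {
            // No server running locally: report it instead of crashing.
            System.out.println("Could not reach the model server: " + e);
        }
    }
}
```

The rest of the bot never sees Python: it just calls `predict(...)` like any other Java method. In a real project you would probably parse the response with a JSON library rather than passing the raw string around.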
As you can see, maybe Java should not be your first option when getting started with AI technologies if there is really no constraint at all on your language choice. Otherwise, the choice of a language is more of a social/team/organization decision that should take into account many other aspects (team knowledge, organization architecture, integration needs and so on). We see developers arguing non-stop about why language A is better than language B, but for most projects, even those that include some kind of intelligent component, any major language will work, and that choice will NOT be the deciding factor in the project’s success.
So, forgive me if we continue developing bots in Java 🙂