How to Build Production-Ready NLP Solutions

The vast majority of NLP solutions developed in the workplace just don’t scale! By scale, we mean handling real-world use cases, coping with large amounts of data, and being easy to deploy in a production environment. Some of these approaches work only on extremely narrow use cases; others have a tough time generating results in a timely manner.

Below are three tips to help ensure that your NLP systems, such as text classifiers, topic extraction algorithms, and synonym generators, can handle production loads.

1. KISS please!

KISS: keep it simple, stupid. When choosing techniques for solving NLP problems, pick techniques and pipelines that are easy to understand and maintain over complex ones that only you understand, and sometimes only partially. In many NLP applications, you will typically notice one of two things: (1) deep pre-processing layers, or (2) complex neural network architectures that are hard to grasp, let alone train, maintain, and improve iteratively.

The first question to ask yourself is whether you need all those layers of pre-processing. Do you really need part-of-speech tagging, chunking, entity resolution, lemmatization, and so on? What if you strip out a few layers? How does this affect the performance of your models? In many applications where you have access to massive amounts of data, you can let the evidence in the data guide your model.
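One way to answer these questions is to measure them. The sketch below is a minimal ablation loop, not a real pipeline: the three pre-processing steps, the token-count classifier, and the toy data are all hypothetical stand-ins (`crude_stem` in particular is a placeholder for a real lemmatizer). It scores the model with every step enabled and then with each step removed in turn.

```python
import re
from collections import Counter

# Hypothetical pre-processing steps; swap in your own.
def lowercase(tokens):
    return [t.lower() for t in tokens]

def strip_punctuation(tokens):
    cleaned = (re.sub(r"\W", "", t) for t in tokens)
    return [t for t in cleaned if t]

def crude_stem(tokens):
    # Placeholder for lemmatization: drop a trailing "s" from longer words.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]

STEPS = {
    "lowercase": lowercase,
    "strip_punctuation": strip_punctuation,
    "crude_stem": crude_stem,
}

def preprocess(text, steps):
    tokens = text.split()
    for name in steps:
        tokens = STEPS[name](tokens)
    return tokens

def train(examples, steps):
    """Count tokens per label (a deliberately tiny bag-of-words model)."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(preprocess(text, steps))
    return counts

def predict(counts, text, steps):
    tokens = preprocess(text, steps)
    # Pick the label whose training tokens overlap most with the input.
    return max(sorted(counts), key=lambda label: sum(counts[label][t] for t in tokens))

def accuracy(train_set, test_set, steps):
    counts = train(train_set, steps)
    hits = sum(predict(counts, text, steps) == label for text, label in test_set)
    return hits / len(test_set)

def ablation(train_set, test_set, steps):
    """Score the full pipeline, then the pipeline with each step left out."""
    results = {"all steps": accuracy(train_set, test_set, steps)}
    for dropped in steps:
        kept = [s for s in steps if s != dropped]
        results[f"without {dropped}"] = accuracy(train_set, test_set, kept)
    return results

# Toy data, for illustration only.
TRAIN = [
    ("The cats sat on the mat.", "pets"),
    ("Dogs chase cats!", "pets"),
    ("Stocks fell sharply today.", "finance"),
    ("Markets and stocks rallied.", "finance"),
]
TEST = [("a cat and a dog", "pets"), ("the stock market", "finance")]

results = ablation(TRAIN, TEST, ["lowercase", "strip_punctuation", "crude_stem"])
for setting, score in results.items():
    print(f"{setting}: {score:.2f}")
```

The numbers from a toy set like this mean very little. What matters is the loop: if dropping a layer does not move your metric on real data, that layer is a candidate for removal.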

When it comes to Deep Learning, use it wisely. Not all problems benefit from Deep Learning, and for the problems that do, use architectures that are easy to understand and improve on. For example, for a programming language classification task, I used just a two-layer Artificial Neural Network and realized big wins in training speed and accuracy. In addition, adding a new programming language is pretty seamless as long as you have data to feed into the model. I could have complicated the model to gain some social currency by using a really complex RNN architecture straight from a research paper. But I started simple just to see how far this would get me, and now I’m at the point where I can ask: what’s the need to add more complexity?


2. When in doubt, use a time-tested approach

With every NLP or text mining problem, your options are plenty. There will always be more than one way to accomplish the same task. For example, in finding similar documents, you could use a simple bag-of-words approach and compute document similarities using the resulting tf-idf vectors. Alternatively, you could do something fancier by generating an embedding for each document and computing similarities using the document embeddings. Which should you use? It depends on several things:

a. Which of these methods has a better track record in practice? (Hint: tf-idf is used all the time for information retrieval, and it’s super fast. How about the latter?)

b. Which of these do I understand better? Remember, the better you understand something, the better your chance of tuning it and getting it to work the way you expect.

c. Do I have the necessary tools/data to implement either of these?
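To show how little machinery the bag-of-words option needs, here is a minimal sketch of tf-idf document similarity using only the standard library. The whitespace tokenizer, the smoothed idf formula, and the sample documents are assumptions chosen for illustration; a production system would more likely use an existing, well-tested implementation.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and split on whitespace, nothing fancier.
    return text.lower().split()

def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed idf, one common variant among several.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(query_index, vectors):
    """Index of the document closest to vectors[query_index], excluding itself."""
    scores = [
        (cosine(vectors[query_index], v), i)
        for i, v in enumerate(vectors)
        if i != query_index
    ]
    return max(scores)[1]

docs = [
    "the cat sat on the mat",
    "a cat lay on a mat",
    "stock prices fell on monday",
]
vectors = tfidf_vectors(docs)
print(most_similar(0, vectors))
```

Every step here can be inspected and tuned by hand, which is exactly the property question (b) asks about.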

Some of these questions can be answered with a literature search. You could also reach out to experts who have worked on similar problems for a recommendation. As you get more projects under your belt, intuition kicks in and you develop a strong sense of what is going to work and what is not.


3. Think about cost and speed

Have you ever thought about what it would take to deploy your model in a production environment? What are your data dependencies? How long does your model take to run? How long does it take to predict or generate results?

Also, what are the memory and computation requirements of your approach when you scale up to the real number of data points it would be handling? All of this has a direct impact on whether you can afford your proposed approach within your budget, and on whether it can handle a production load. If your model is GPU-bound, make sure that you can afford the cost of serving such a model.

The earlier you think about cost and scalability, the higher your chance of getting your models deployed. In my projects, I always measure the time it takes to train, classify, and process different loads to approximate how well the solutions I am developing would hold up in a production environment.
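This kind of instrumentation can stay very light. The sketch below assumes a hypothetical workload (`count_tokens`) and a hypothetical input generator (`make_docs`); replace them with your own training or prediction function. It records the best wall-clock time and the peak Python-level memory at several input sizes. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory used inside native libraries or on a GPU needs separate tooling.

```python
import time
import tracemalloc
from collections import Counter

def time_call(fn, *args, repeats=3):
    """Best-of-N wall-clock seconds for fn(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)

def peak_memory(fn, *args):
    """Peak bytes allocated by Python while fn(*args) runs."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

def profile_loads(fn, make_input, sizes):
    """Measure fn at each input size; timing and memory are separate runs."""
    report = []
    for n in sizes:
        data = make_input(n)
        report.append({
            "size": n,
            "seconds": time_call(fn, data),
            "peak_bytes": peak_memory(fn, data),
        })
    return report

# Hypothetical workload and input generator, for illustration only.
def count_tokens(docs):
    counts = Counter()
    for doc in docs:
        counts.update(doc.split())
    return counts

def make_docs(n):
    return [f"document number {i} about text mining" for i in range(n)]

report = profile_loads(count_tokens, make_docs, sizes=[100, 1_000, 10_000])
for row in report:
    print(row)
```

Timing and memory are measured in separate runs because tracing allocations slows the code down. Watching how the numbers grow as the size increases gives an early hint of whether the approach will hold up at your real data volume.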

In summary, the prototypes that you develop don’t have to be throw-away prototypes. They can be the start of a really powerful production-level solution if you plan ahead. Think about your endpoint and how the output from your model will be consumed, and don’t over-complicate your solution. You will not go wrong if you KISS and pick a technique that fits the problem instead of forcing your problem to fit your chosen technique!

