How We Improved AI Text Classification Accuracy By 60%

Domain: AI in Legal
Services: NLP Consulting

LegalForce Trademarkia is one of the biggest trademark search engines. They were looking to use AI to improve the text classification of trademark applications into 45 primary categories and 40,000 sub-categories.

These categories are typically entered manually by attorneys, a very slow process, which makes the task an ideal candidate for automation with Artificial Intelligence and Natural Language Processing (NLP) techniques.

Trademarkia Search Engine

Problematic Classification

Unfortunately, Trademarkia’s existing AI text classification system, which used a linear classifier, performed poorly: trademarks were grossly misclassified at the primary category level, and the system could not produce a sensible classification at the sub-category level.

Our Solution

1. Understanding the problem

Our first step was to understand the exact solution used by our client, the size of data involved, and problems in their dataset such as sparsity issues. Upon diagnosis, we realized that a traditional classification approach was not the best way to tackle the problem due to issues in the data and the massive number of sub-categories.
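One way to picture the sparsity diagnosis is to count how many labeled examples each sub-category has. The sketch below is only an illustration under stated assumptions: the function name `label_sparsity`, the `min_examples` threshold, and the toy labels are all hypothetical and are not taken from the client's data or pipeline.

```python
from collections import Counter


def label_sparsity(labels, min_examples=5):
    """Summarize how many labels have fewer than `min_examples` examples.

    `labels` is one sub-category label per training example.
    The threshold of 5 is an arbitrary, hypothetical choice.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    rare = [label for label, count in counts.items() if count < min_examples]
    return {
        "num_labels": len(counts),
        "num_rare": len(rare),
        "rare_fraction": len(rare) / len(counts),
    }


# Hypothetical toy labels: one well-covered sub-category and two sparse ones.
toy_labels = ["game software"] * 6 + ["footwear"] * 2 + ["legal services"]
report = label_sparsity(toy_labels)
print(report)
```

When a large share of labels falls under the threshold, a classifier has very little to learn from for those labels, which is the kind of signal that can motivate a retrieval-style approach instead.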

2. Solution Development

Once we determined the problem, we were able to design and develop an alternative solution leveraging a highly efficient information retrieval approach using Python, Gensim, and Elasticsearch.

During development, we preprocessed the data and developed a full pipeline (client IP) in which any valid user input results in a logical categorization at both the primary category and sub-category levels.
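Because the actual pipeline is client IP, the following is not that implementation. It is a minimal, standard-library-only sketch of the general idea of classification by retrieval: index labeled descriptions, retrieve the ones most similar to the input, and let them vote on the category. The TF-IDF weighting, the value of `k`, the function names, and the toy examples are all assumptions made for illustration; in a real system the indexing and similarity search would typically be handled by tools such as Gensim or Elasticsearch.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled examples: (description, primary category, sub-category).
EXAMPLES = [
    ("computer software for accounting", 9, "accounting software"),
    ("downloadable mobile game software", 9, "game software"),
    ("t-shirts and hooded sweatshirts", 25, "shirts"),
    ("running shoes and sandals", 25, "footwear"),
    ("legal services and trademark filing", 45, "legal services"),
]


def tokenize(text):
    return text.lower().split()


def normalize(vec):
    """Scale a sparse vector to unit length so dot product equals cosine."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec


def build_index(examples):
    """Build smoothed IDF weights and unit-length TF-IDF vectors."""
    docs = [Counter(tokenize(text)) for text, _, _ in examples]
    doc_freq = Counter(term for doc in docs for term in doc)
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [normalize({t: tf * idf[t] for t, tf in d.items()}) for d in docs]
    return idf, vectors


def classify(query, examples, idf, vectors, k=3):
    """Return (primary, sub-category) voted by the k most similar examples."""
    counts = Counter(tokenize(query))
    q = normalize({t: tf * idf[t] for t, tf in counts.items() if t in idf})
    scored = sorted(
        ((sum(w * vec.get(t, 0.0) for t, w in q.items()), i)
         for i, vec in enumerate(vectors)),
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for score, i in scored:
        if score > 0:
            votes[examples[i][1]] += score
    if not votes:
        return None, None  # nothing in the index resembles the query
    primary = max(votes, key=votes.get)
    # Sub-category comes from the best match inside the winning primary.
    sub = next(examples[i][2] for score, i in scored
               if score > 0 and examples[i][1] == primary)
    return primary, sub


idf, vectors = build_index(EXAMPLES)
print(classify("accounting software for small business", EXAMPLES, idf, vectors))
```

A retrieval design like this does not need a separate model output for each of the 40,000 sub-categories: a sub-category only has to appear in at least one indexed example to be returnable, which is one reason such an approach can cope better with sparse labels.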

3. Evaluation & Delivery

In addition to the quantitative evaluation of our approach, to ensure that the results made sense and met the needs of our client, we manually evaluated ~100 test cases. Finally, the full solution was delivered to our client for integration.
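For readers who want to see what a quantitative comparison of this kind can look like, here is a small sketch that computes accuracy for a baseline and a new system and reports both the absolute and the relative change. The labels and predictions are hypothetical toy values, not the client's results, and the function names are assumptions for illustration.

```python
def accuracy(predicted, expected):
    """Fraction of test cases where the predicted label equals the expected one."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("need two non-empty lists of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def relative_improvement(baseline, new):
    """Change in accuracy expressed as a fraction of the baseline accuracy."""
    if baseline == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (new - baseline) / baseline


# Hypothetical toy data: expected primary categories and two sets of predictions.
expected = [9, 25, 45, 9, 25]
baseline_pred = [9, 9, 9, 9, 9]
new_pred = [9, 25, 45, 9, 9]

base_acc = accuracy(baseline_pred, expected)
new_acc = accuracy(new_pred, expected)
print("absolute change:", new_acc - base_acc)
print("relative change:", relative_improvement(base_acc, new_acc))
```

Reporting both figures avoids ambiguity, since an "improvement" percentage can be read either as percentage points gained or as a gain relative to the baseline.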


As a result of our partnership, LegalForce Trademarkia saw a ~60% improvement in text classification accuracy at the primary category level and can now classify automatically at the sub-category level, which was not possible before. Also, because our algorithm was clear and efficient, LegalForce Trademarkia was able to easily integrate the pipeline into their workflow.

From Trademarkia’s CTO…

We are a Silicon Valley startup in the legal technology field. We were looking for a Natural Language Processing expert for an important feature in our project, and we found Kavita through her blog.

We are lucky to have her help our project. She is very knowledgeable in data mining, proposing different options, trying different models, and helping us find out the appropriate approach.

Kavita has a great personality, is easy to work with, and always replies on time. I enjoy working with her and highly recommend her.


Need help on a similar problem? Get in touch to speak to our experts.
