Case study 1

Project Description

Background

We are in charge of developing and maintaining a system for our client from the United States specializing in patent prosecution. The system processes decisions relating to patents and is intended for users who want to become well-informed patent prosecutors or stay up to date with patent appeal trends. When the client contacted us, the system already existed, but it needed an update: the performance of its components had to be improved, new functionalities developed, and the look of the web application refreshed. It has become a long-term cooperation, and the system is still under active development – we add new functionalities and improvements on a regular basis.

The system processes PTAB (Patent Trial and Appeal Board) decisions daily. In this project, we use Big Data, Natural Language Processing and Machine Learning.

Challenges and solutions

The first challenge in this project was to separate the data processed by the components of the system from the user data and metadata. We decided to use two separate databases: MongoDB stores the documents processed by the system, while MySQL stores the user management data and the metadata.
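The split can be illustrated with a minimal sketch. The connection details, database names, collections, tables and fields below are hypothetical placeholders, since the case study does not disclose them.

```python
# Minimal sketch of the two-database split: MongoDB for processed decision
# documents, MySQL for user-management data and metadata.
# All names and credentials are hypothetical.
from pymongo import MongoClient
import mysql.connector

# MongoDB holds the processed decision documents.
mongo = MongoClient("mongodb://localhost:27017")
decisions = mongo["patents"]["decisions"]  # hypothetical database/collection names

decisions.insert_one({
    "decision_id": "PTAB-2023-000123",   # hypothetical identifier
    "text": "...",
    "issues": ["obviousness"],
    "outcomes": ["affirmed"],
})

# MySQL holds user-management data and metadata.
mysql_conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="users"
)
cur = mysql_conn.cursor()
cur.execute("SELECT id, email, digest_frequency FROM users WHERE active = 1")
for user_id, email, frequency in cur.fetchall():
    print(user_id, email, frequency)
```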

Another challenge was managing data ingestion. The documents come from external data sources, and from each source we need only specific fields and values. We created a dedicated component for downloading and processing the decisions. This module downloads the decisions, extracts the necessary fields and passes the documents to the Machine Learning module, which retrieves the issues (the grounds on which a decision is made) and the outcomes (the results of the decision).
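The following sketch shows the shape of such an ingestion component. The endpoint URL, the field names and the extract_issues_and_outcomes() callback are assumptions made for illustration, not the project's real interfaces.

```python
# Sketch of the ingestion component: download decisions from an external
# data source, keep only the fields the system needs, and hand the text
# over to the extraction (ML) module.
import requests

DATASOURCE_URL = "https://example.com/api/decisions"  # placeholder endpoint

def fetch_decisions(since_date: str) -> list[dict]:
    """Download raw decision records published on or after since_date."""
    response = requests.get(DATASOURCE_URL, params={"from": since_date}, timeout=30)
    response.raise_for_status()
    return response.json()

def select_fields(raw: dict) -> dict:
    """Keep only the fields the rest of the pipeline relies on."""
    return {
        "decision_id": raw.get("id"),
        "date": raw.get("decision_date"),
        "text": raw.get("document_text", ""),
    }

def ingest(since_date: str, extract_issues_and_outcomes) -> list[dict]:
    """Download, trim and enrich decisions; return the processed documents."""
    processed = []
    for raw in fetch_decisions(since_date):
        doc = select_fields(raw)
        # The ML module enriches the document with issues and outcomes.
        doc.update(extract_issues_and_outcomes(doc["text"]))
        processed.append(doc)
    return processed
```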

As mentioned above, the Machine Learning module is used to extract issues and outcomes from the extensive texts of the decisions. These texts have similar structures, but the issues and outcomes appear in different places in the documents, and the sentences in which they occur do not follow any regular pattern. Additionally, the number of possible issues and outcomes is vast. We therefore combined Machine Learning with regular expressions to make sure these values are obtained correctly. Developing and improving this module required a lot of work and testing.
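A simplified version of such a hybrid extractor could look as follows: regular expressions catch explicitly worded outcomes, while a sentence classifier handles phrasing that follows no fixed pattern. The patterns, labels and toy training sentences below are made up for the sketch and do not reflect the project's actual models or taxonomy.

```python
# Illustrative hybrid extractor combining regular expressions with a toy
# scikit-learn sentence classifier.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

OUTCOME_PATTERNS = {
    "affirmed": re.compile(r"\bdecision .{0,40}is affirmed\b", re.IGNORECASE),
    "reversed": re.compile(r"\bdecision .{0,40}is reversed\b", re.IGNORECASE),
}

# Toy training data standing in for manually labelled sentences.
train_sentences = [
    "We sustain the rejection of claims 1-10 under 35 U.S.C. 103.",
    "The rejection of claims 1-5 is not sustained.",
]
train_labels = ["issue:obviousness", "no_issue"]

sentence_classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
sentence_classifier.fit(train_sentences, train_labels)

def extract_issues_and_outcomes(text: str) -> dict:
    """Return the issues and outcomes found in a decision text."""
    outcomes = [name for name, pat in OUTCOME_PATTERNS.items() if pat.search(text)]
    sentences = re.split(r"(?<=[.!?])\s+", text)
    predictions = sentence_classifier.predict(sentences)
    issues = sorted({p for p in predictions if p.startswith("issue:")})
    return {"issues": issues, "outcomes": outcomes}
```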

The next challenge was to refresh the web application. The client wanted numerous custom filters and functionalities allowing the users to do complex research on the data. We achieved these goals by combining technologies appropriate for the given functionalities.
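One way such filters can be combined into a query over the decision collection is sketched below. The field names and filter options are hypothetical; the real application exposes many more filters.

```python
# Sketch of translating user-selected filters into a MongoDB query for
# the research views. Field names are illustrative.
from datetime import datetime
from pymongo import MongoClient

decisions = MongoClient("mongodb://localhost:27017")["patents"]["decisions"]

def build_query(issues=None, outcomes=None, date_from=None, date_to=None):
    """Combine the selected filters into a single MongoDB query document."""
    query = {}
    if issues:
        query["issues"] = {"$in": issues}
    if outcomes:
        query["outcomes"] = {"$in": outcomes}
    if date_from or date_to:
        date_filter = {}
        if date_from:
            date_filter["$gte"] = date_from
        if date_to:
            date_filter["$lte"] = date_to
        query["date"] = date_filter
    return query

# Example: decisions involving obviousness issued since the start of 2023.
query = build_query(issues=["obviousness"], date_from=datetime(2023, 1, 1))
results = decisions.find(query).sort("date", -1).limit(20)
```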

The client also wanted a functionality that sends emails notifying the users of the system about new decisions, including the details of those decisions. The emails can be sent on a daily, weekly, biweekly or monthly basis, and there are also custom emails for which the users select the time range they are interested in. These schedules are driven by cron jobs.
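A notification job of this kind, invoked for example by a daily cron entry, could look roughly like the sketch below. The query fields, SMTP settings and message format are assumptions made for illustration.

```python
# Sketch of a daily digest job that a cron entry could invoke once per day:
# collect the decisions from the last 24 hours and email a short summary.
import smtplib
from datetime import datetime, timedelta
from email.message import EmailMessage
from pymongo import MongoClient

decisions = MongoClient("mongodb://localhost:27017")["patents"]["decisions"]

def send_daily_digest(recipient: str) -> None:
    since = datetime.utcnow() - timedelta(days=1)
    new_decisions = list(decisions.find({"date": {"$gte": since}}))
    if not new_decisions:
        return  # nothing new to report

    body = "\n".join(
        f"{doc['decision_id']}: {', '.join(doc.get('outcomes', []))}"
        for doc in new_decisions
    )
    msg = EmailMessage()
    msg["Subject"] = f"{len(new_decisions)} new PTAB decisions"
    msg["From"] = "notifications@example.com"  # placeholder sender
    msg["To"] = recipient
    msg.set_content(body)

    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```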

Finally, the client needed an API that the users could access directly. We used Swagger to build and document the API for this system.
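As a minimal illustration of a Swagger-documented endpoint, the sketch below assumes a Flask stack with the flasgger extension; the framework and routes actually used in the project are not stated in the case study.

```python
# Sketch of one Swagger-documented endpoint using Flask and flasgger.
from flask import Flask, jsonify
from flasgger import Swagger

app = Flask(__name__)
Swagger(app)  # serves the interactive Swagger UI at /apidocs

@app.route("/decisions/<decision_id>")
def get_decision(decision_id):
    """Return a single PTAB decision.
    ---
    parameters:
      - name: decision_id
        in: path
        type: string
        required: true
    responses:
      200:
        description: The decision with its extracted issues and outcomes.
    """
    # Placeholder response; the real endpoint would read from MongoDB.
    return jsonify({"decision_id": decision_id, "issues": [], "outcomes": []})

if __name__ == "__main__":
    app.run()
```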