Document Search: Financial Institutions Algorithm Search

Since the advent of Google, search has been a cornerstone of the information age. With the touch of a button, it is possible to access almost any publicly indexed document. The drawback of conventional search engines such as Google and Bing is that they cannot be used on proprietary data sets.

In financial institutions, a compliance team usually monitors and oversees other teams to ensure that they are not in breach of any regulations. In this case, the client wanted to save time by making it easier for its compliance team to find the right documentation. The client had an internal database of PDF files containing information about whether or not a statement or comment should be flagged.

Traders would use certain keywords and phrases that corresponded to a specific set of stocks. Using them could amount to insider trading, and the job of the compliance team was to closely monitor the traders' Bloomberg chat logs to ensure that this was not happening. There was a very high number of false flags, which meant that the team had to spend time identifying the relevant document to check whether each flag was real.

DataSpartan was asked to solve the search part of the problem by creating a tool that would save the compliance officers time by letting them search their internal databases more quickly for the relevant information. A custom interface was built using Django, with PDF previews of key words and phrases so that the officers could check documents by eye for relevance.
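
To illustrate the idea of a keyword preview, here is a minimal sketch in Python. It is not the tool that was delivered: it assumes the text has already been extracted from each PDF, and the names `Preview`, `find_previews`, and `context`, as well as the sample file names and phrases, are hypothetical. It simply returns a short snippet of surrounding text for every keyword hit, which is the kind of excerpt an officer could scan by eye.

```python
import re
from dataclasses import dataclass


@dataclass
class Preview:
    document: str  # name of the document the hit was found in
    keyword: str   # keyword or phrase that matched
    snippet: str   # text surrounding the match


def find_previews(documents, keywords, context=40):
    """Return a short preview around every whole-word keyword hit.

    `documents` maps a document name to its already-extracted text
    (an assumption: PDF text extraction is out of scope here).
    `context` is the number of characters kept on each side of a hit.
    """
    previews = []
    for name, text in documents.items():
        for keyword in keywords:
            # Case-insensitive match that must not sit inside a longer word.
            pattern = re.compile(
                r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE
            )
            for match in pattern.finditer(text):
                start = max(0, match.start() - context)
                end = min(len(text), match.end() + context)
                # Collapse line breaks and repeated spaces in the excerpt.
                snippet = " ".join(text[start:end].split())
                previews.append(Preview(name, keyword, snippet))
    return previews


if __name__ == "__main__":
    sample = {
        "policy_a.pdf": "Traders must not discuss the blue horizon deal in chat.",
        "policy_b.pdf": "Nothing relevant here.",
    }
    for hit in find_previews(sample, ["Blue Horizon"], context=15):
        print(f"{hit.document}: ...{hit.snippet}...")
```

A linear scan like this is only practical for small collections. A larger document store would more likely sit behind a proper search index, with a snippet step of this kind applied to the top results.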

Our Result
This is the first component of a larger system, which involves a document recommendation engine that explicitly highlights the usage of the keywords in the document. The component is currently being integrated into the client's servers and is projected to save each officer 3 hours of work each month. Full integration of the larger system is projected to save 10 hours of work each month.


Posted on August 23, 2021