For business-to-business enterprises, client classification means assigning each customer to a specific category or bucket. In this scenario, DataSpartan worked with a crowdfunding startup that wanted to identify whether a business on its platform was likely to hit its crowdfunding target. The client also wanted to know which variables mattered most in increasing the likelihood of a successful raise.
Much of the onboarding process was manual, and it was extremely time-consuming for the startup to fully onboard a client only for that client to fail to complete a raise. Standardised data was collected on each company's size, social media following, financial performance, team, industry, competition and exit potential. At the time, 56% of the companies going through the funnel had a successful raise.
The first part of the problem was identifying which variables were most likely to influence the outcome. Historical data was extracted, cleaned and standardised for this analysis, and three variables were singled out as the main drivers of the probability of a successful raise. A supervised learning model was then built to classify leads by their likelihood of conversion. Python was used for this part of the analysis, with libraries such as NumPy and SciPy providing fast computation and rapid model prototyping.
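To make the modelling step more concrete, the following is a minimal sketch of one way such a classifier could be prototyped with NumPy and SciPy alone. It is not DataSpartan's actual model: the choice of logistic regression, the feature names (`followers`, `revenue`, `team_size`, `competition`), the synthetic data and the regularisation strength are all assumptions made for illustration. On standardised inputs, the size of each fitted weight gives a rough indication of how strongly that variable influences the predicted outcome.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable sigmoid


def standardise(X):
    """Scale each column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant columns unscaled
    return (X - mu) / sigma


def fit_logistic(X, y, l2=1.0):
    """Fit an L2-regularised logistic regression; returns (intercept, weights)."""
    n, d = X.shape

    def loss(params):
        b, w = params[0], params[1:]
        z = X @ w + b
        # Negative log-likelihood, using logaddexp to avoid overflow.
        nll = np.sum(np.logaddexp(0.0, z) - y * z)
        return nll / n + 0.5 * l2 * np.dot(w, w) / n

    def grad(params):
        b, w = params[0], params[1:]
        err = expit(X @ w + b) - y
        return np.concatenate(([err.sum() / n], X.T @ err / n + l2 * w / n))

    result = minimize(loss, np.zeros(d + 1), jac=grad, method="L-BFGS-B")
    return result.x[0], result.x[1:]


# Synthetic stand-in for the historical data (hypothetical features and effects).
rng = np.random.default_rng(0)
feature_names = ["followers", "revenue", "team_size", "competition"]
X_raw = rng.normal(size=(500, 4))
assumed_effects = np.array([2.0, 1.0, 0.0, -0.5])
y = (rng.random(500) < expit(X_raw @ assumed_effects + 0.25)).astype(float)

X = standardise(X_raw)
intercept, weights = fit_logistic(X, y)

# Rank variables by the absolute size of their standardised weight.
ranking = sorted(zip(feature_names, weights), key=lambda kv: abs(kv[1]), reverse=True)

# Predicted probability of a successful raise for each company.
probabilities = expit(X @ weights + intercept)
```

Ranking by absolute weight is only meaningful because the inputs were standardised first; on raw data, the weights would reflect each variable's units as much as its influence.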
These results have been incorporated into the client's lead pipeline as a qualifying questionnaire that filters out unlikely leads and pre-categorises clients. This has resulted in a 4% decrease in the number of man-hours used and a 6% increase in the number of companies that successfully raise. Further work is under way to automate some of the manual processes, such as KYC (Know Your Client) document verification, so that the client can scale its business efficiently. Work is also being done to improve the data pipeline and to ensure that the collected data is stored in a format suitable for analysis.
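As a rough illustration of how a qualifying step might filter and pre-categorise leads, the sketch below buckets each lead by a predicted probability of a successful raise. The thresholds (0.3 and 0.7), the bucket labels and the example companies are hypothetical; the source does not describe how the questionnaire scores or categorises leads.

```python
def categorise(probability, reject_below=0.3, fast_track_at=0.7):
    """Map a predicted success probability to a bucket (thresholds are assumed)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < reject_below:
        return "unlikely"
    if probability >= fast_track_at:
        return "likely"
    return "needs review"


def triage(leads, **thresholds):
    """Group (name, probability) pairs into buckets by predicted likelihood."""
    buckets = {"unlikely": [], "needs review": [], "likely": []}
    for name, probability in leads:
        buckets[categorise(probability, **thresholds)].append(name)
    return buckets


# Hypothetical leads with made-up probabilities.
example_leads = [("Company A", 0.82), ("Company B", 0.45), ("Company C", 0.12)]
buckets = triage(example_leads)
```

In practice the thresholds would be tuned against historical outcomes, trading the onboarding effort saved against the risk of turning away a company that would have raised successfully.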