Voice of the Industry

Fraud techniques: why we chose machine learning to tackle fraud

Friday 21 April 2017 08:37 CET | Editor: Melisande Mual | Voice of the industry

Gerry Carr, Ravelin: Machine learning is a rapidly expanding discipline with a lot of innovation and experimentation around various techniques

We are not a machine learning company

Ravelin is a fraud detection company. When we set up the company in late 2014 to tackle the fraud issue, we had a clean slate: none of us were machine learning aficionados. We were not a bunch of academics with a machine learning hammer looking for a problem to crack. Instead, we were considering the following question: how do we help businesses get accurate fraud decisions at the moment they are useful, and scale that solution to thousands of transactions per day, week or month?

After scouring the landscape, we concluded that the only way to tackle the speed, accuracy and scale issues all at once was through machine learning. It still is. With a ruleset we could probably solve the problem adequately for one client, but it would not be feasible to replicate that for multiple clients. The economics would not make sense for us or our clients, because a ruleset per client would be too expensive to build and maintain.

So we settled on machine learning as the core technology, supplemented with some other approaches.

Which machine learning techniques Ravelin uses to detect fraud

Machine learning is a rapidly expanding discipline with a lot of innovation and experimentation around various techniques, some old and some new. The falling cost and wider availability of computing power have made this experimentation affordable, and the benefits are coming to the market now.

At Ravelin, we prioritise techniques that make it possible for us to explain how a fraud decision was arrived at (inspectability). Our customers tell us that they want to know why and how fraud is happening, so simply delivering a decision from a ‘black box’ is not a valid approach for us. Therefore, we choose from the following approaches and blend them together to provide a probabilistic fraud score.

Customer dashboard image

Logistic regression: This is a statistical technique where a merchant’s good transactions are compared with its chargebacks to fit a model that estimates how likely a new transaction is to become a chargeback. For very large merchants these models are specific to their customer base, but more often general models will apply.
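
As a rough illustration of the idea, and not Ravelin's implementation, the sketch below fits a logistic regression by plain gradient descent on a handful of made-up transactions. The two features (order value in hundreds, and whether the account is under a day old), the data and the function names are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def chargeback_probability(weights, bias, x):
    """Estimated probability that transaction x ends in a chargeback."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical features: [order value in hundreds, 1 if account is under a day old]
transactions = [
    [0.2, 0], [0.5, 0], [0.3, 0], [0.8, 0], [0.4, 1],  # good transactions
    [2.5, 1], [3.0, 1], [1.8, 1], [2.2, 0], [2.9, 1],  # chargebacks
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

weights, bias = train_logistic(transactions, labels)
low_risk = chargeback_probability(weights, bias, [0.3, 0])
high_risk = chargeback_probability(weights, bias, [2.8, 1])
```

The output is a probability rather than a yes/no answer, which is what allows it to be blended with other models into a single score. The learned weights can also be read directly: a large positive weight on a feature means that feature pushes the score towards fraud.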

Decision trees and random forests: Decision trees are a mature family of machine learning algorithms used to automate the creation of rules for classification tasks. A tree is essentially a set of rules which we have trained using examples of fraud that our clients are facing. The creation of a tree ignores irrelevant features and does not require extensive normalisation of our data. By following the list of rules triggered by a certain customer, we can inspect trees and understand why certain decisions are made.
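
To make the inspection step concrete, here is a minimal sketch that walks a small tree and records each rule that fires. The tree is written by hand as a stand-in for a trained one, and its feature names and thresholds are invented for illustration.

```python
# A tiny tree as nested dicts. In practice the splits would be learned
# from labelled examples; these thresholds are made up.
TREE = {
    "feature": "order_value", "threshold": 150.0,
    "below": {"label": "allow"},
    "above": {
        "feature": "account_age_days", "threshold": 1.0,
        "below": {"label": "fraud"},
        "above": {"label": "allow"},
    },
}

def classify_with_reasons(tree, transaction):
    """Walk the tree and record each rule that fires along the way."""
    reasons = []
    node = tree
    while "label" not in node:
        value = transaction[node["feature"]]
        if value <= node["threshold"]:
            reasons.append(f"{node['feature']} <= {node['threshold']}")
            node = node["below"]
        else:
            reasons.append(f"{node['feature']} > {node['threshold']}")
            node = node["above"]
    return node["label"], reasons

label, reasons = classify_with_reasons(
    TREE, {"order_value": 420.0, "account_age_days": 0.5}
)
# label is "fraud"; reasons lists the two rules that led there
```

Each split compares a single raw feature with a threshold, so order values and account ages can stay in their own units. That is why trees need so little normalisation.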

Random Forest is a technique of using an ensemble of multiple decision trees to improve the performance of the classification. It allows us to smooth any errors which might exist in a single tree and increase our overall performance and accuracy while maintaining our ability to interpret the results and provide explainable scores to our users.
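
The ensemble idea can be sketched in a few lines. To keep the example short it uses one-split trees (decision stumps) trained on bootstrap samples, which is a simplification of a full random forest. The features, data and function names are hypothetical.

```python
import random

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        errors = sum((r[feature] > threshold) != bool(y)
                     for r, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return feature, best[1]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def fraud_score(forest, row):
    """Fraction of stumps voting 'fraud': a score between 0 and 1."""
    votes = [row[feature] > threshold for feature, threshold in forest]
    return sum(votes) / len(votes)

# Hypothetical features: [order value, orders placed in the last hour]
rows = [[10.0, 1], [25.0, 1], [40.0, 2], [60.0, 1], [35.0, 2],
        [300.0, 6], [450.0, 9], [600.0, 7], [380.0, 5], [520.0, 8]]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
clearly_fraud = fraud_score(forest, [900.0, 12])
clearly_good = fraud_score(forest, [5.0, 0])
```

Because every stump sees a slightly different sample, a poor threshold in one of them is outvoted by the rest. Each vote still comes from a readable rule, so the score remains explainable.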

What techniques do we use in addition to machine learning?

Rules: Yes, we use rules. We do not use rules to produce fraud scores, but we do use them to implement policy. Say you never allow a transaction over a certain value to go through for first-time buyers. That is a rule, and it can be implemented to override the fraud score even if we believe the transaction is good. Rules, used sparingly, are a very useful way to ensure company policy is adhered to.
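
A policy override of this kind might look like the sketch below. The function name, the field names, the 0.7 score threshold and the 500.0 value limit are all assumptions for illustration.

```python
def decide(fraud_score, transaction, score_threshold=0.7, first_order_limit=500.0):
    """Return (decision, reason). Policy rules are checked before the score."""
    # Policy rule: block large orders from first-time buyers whatever the score.
    if transaction["is_first_order"] and transaction["order_value"] > first_order_limit:
        return "block", "policy: first-time buyer over value limit"
    # Otherwise the model's score decides.
    if fraud_score >= score_threshold:
        return "block", "model: fraud score above threshold"
    return "allow", "model: fraud score below threshold"

# The model thinks this order is fine (score 0.05), but policy still blocks it.
decision, reason = decide(0.05, {"is_first_order": True, "order_value": 800.0})
```

Returning the reason alongside the decision keeps it clear whether a block came from policy or from the model.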

Reporting: A common concern with using machine learning approaches is that the reasons for a decision are hard to discern. We select techniques that allow us to explain how a decision/score was arrived at. This gives users the desired level of control.

Graph networks: Machine learning models score on actions, behaviour and activity. They are not designed to spot connections. So a seemingly obvious connection (say a shared card between two accounts) might not be detected by a model. To counter this, we enhance our models with graph networks which map out all sorts of connections in user data and have proven to be at least as effective at stopping fraud for our clients as the machine learning models.
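
The shared-card case can be sketched as a graph search. This example links accounts to the cards they have used and then finds every account reachable from a given one. The account and card names are invented, and a production system would track many more kinds of connection.

```python
from collections import defaultdict, deque

def build_graph(account_cards):
    """Link each account to the cards it has used, in both directions."""
    graph = defaultdict(set)
    for account, cards in account_cards.items():
        for card in cards:
            graph[("account", account)].add(("card", card))
            graph[("card", card)].add(("account", account))
    return graph

def connected_accounts(graph, start_account):
    """Breadth-first search for every account reachable via shared cards."""
    start = ("account", start_account)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(name for kind, name in seen
                  if kind == "account" and name != start_account)

# Hypothetical data: alice and bob share card_1, bob and carol share card_2.
graph = build_graph({
    "alice": ["card_1"],
    "bob": ["card_1", "card_2"],
    "carol": ["card_2"],
    "dave": ["card_9"],
})
linked = connected_accounts(graph, "alice")
# linked is ["bob", "carol"]; dave shares nothing and is not connected
```

If alice is a known fraudster, carol is flagged even though the two never used the same card. A model that scores each account on its own behaviour would not see that link.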

Graph Network image


About Gerry Carr

Gerry is CMO of Ravelin, which provides fraud protection for online businesses. He joined Ravelin at its inception to help define and articulate a product vision for the changing face of fraud in ecommerce. Prior to Ravelin, Gerry led the product marketing functions for products as diverse as Ubuntu and Sage CRM.

 

About Ravelin

Ravelin prevents fraud and protects margins for online businesses. Companies all over the world are accepting more transactions with fewer chargebacks thanks to our unique machine learning-based approach to fraud prevention. By automating standard fraud tasks, fraud teams can spend time focusing on the root causes of fraud instead of day-to-day review of transactions.



Keywords: machine learning, fraud detection, Ravelin, Random Forest, security, behaviour analysis, fraud prevention
Countries: World




