Artificial Intelligence, Machine Learning and Deep Learning

Artificial Intelligence (AI) is the theory and development of computer systems that can perform tasks that have traditionally required human intelligence. AI is a vast field, of which machine learning is a subdomain. Machine learning can be described as a method of designing sequences of actions to solve a problem, known as algorithms, which are optimized automatically through experience and with limited or no human intervention. These methods can be used to find patterns in large sets of data (big data analytics) from increasingly diverse and innovative sources. The figure below provides an overview.

Since interest in the field first emerged in the 1950s, successively smaller subsets of artificial intelligence (first machine learning, then deep learning, itself a subset of machine learning) have created ever larger disruptions. The simplest way to picture their relationship is as concentric circles: AI, the idea that came first, is the outermost circle; machine learning sits inside it; and deep learning, which is driving the current expansion of AI, fits inside both.

Machine learning, put plainly, is the use of algorithms to parse data, learn from it, and then make a determination or forecast about something in the world. Rather than manually coding software routines with a defined sequence of instructions to achieve a particular goal, the system is “trained” using large sets of data and algorithms that give it the ability to learn how to perform the task. Machine learning grew out of the early AI community, and the algorithmic approaches developed over the decades include decision tree learning, inductive logic programming, clustering, reinforcement learning, and Bayesian networks, among other techniques. Machine learning has several categories, which are classified by the amount of human guidance needed to label the input training data.
These categories are:

Supervised Learning: The algorithm is given a set of training data in which some of the entries are labeled. For example, some data points in a data set of financial transactions may carry labels that identify them as fraudulent or genuine. Over the course of training, the algorithm “learns” a general rule of classification, which it then uses to predict the labels for the remaining entries in the data set.

Unsupervised Learning: In this category, the input data fed to the algorithm has no labels. The algorithm is asked to detect patterns in the data by identifying groups of observations that share similar underlying features. For instance, an unsupervised machine learning algorithm could be set up to look for securities with features similar to those of an illiquid security that is difficult to value. If it finds a suitable cluster for the illiquid security, the valuations of the other securities in the cluster can be used to help estimate its value.

Reinforcement Learning: This technique falls somewhere between supervised and unsupervised learning. The algorithm is given an unlabeled set of data, chooses an action for each data point, and receives feedback (possibly from a human) that helps it learn. Reinforcement learning is particularly useful in self-driving cars, game playing, and robotics.

Deep Learning: This is a type of machine learning that uses algorithms that work in layers, inspired by the structure and function of the human brain. Deep learning algorithms and their structures, called artificial neural networks, can be used for supervised, unsupervised, or reinforcement learning.

Neural network algorithms have existed for decades. Deep learning algorithms expand on neural networks by working with multiple layers.
This helps solve complex problems with a higher degree of abstraction. We also now have large amounts of data and greater processing power (through GPUs). Together, these factors are driving the use of deep learning algorithms. While neural networks have long been used as non-linear classifiers, the real potential of deep learning algorithms lies in automatic feature engineering. Automatic feature engineering, however, requires extensive labeled data. For fraud detection, there is therefore a need to use deep learning alongside other machine learning techniques.

Fraud prevention is a type of anomaly detection: its aim is to detect transactions that do not behave in the normal manner, i.e. anomalies. A variety of traditional machine learning techniques, such as logistic regression, decision trees, random forests, neural networks, and clustering, can be used to identify anomalies.

Use Cases in Finance

There is a wide variety of applications for AI and machine learning within the financial system, such as:

Trading signals: Machine learning can increase the efficiency and reduce the costs of investment firms by swiftly parsing news from multiple sources and making trading decisions based on more sources of data than a person could handle. However, these methods are susceptible to false news, which can distort the outcome and lead to incorrect trading decisions.

Sentiment Indicators: Social media data analytics companies use AI and machine learning methods to discern “sentiment indicators” among customers. Similarly, banks, hedge funds, high-frequency traders and social trading platforms can find investor sentiment indicators extremely useful.

Fraud Detection: Companies can also use machine learning methods for monitoring credit reports and mitigating risk.
Financial institutions also use AI for anti-money laundering and combating the financing of terrorism (AML/CFT) and for fraud detection, increasing productivity while reducing the costs and risks of complying with regulations.

Financial Fraud

Financial fraud is deception aimed at appropriating other people’s financial assets, which could include their bank accounts, credit cards, mortgages, loans, credit information, or even identities. Malicious parties use this information to obtain new loans or credit cards, or to steal identities. The most common methods of stealing credit and financial information are the following:

Carding or card skimming: By attaching magnetic stripe readers to automated teller machines (ATMs), swindlers can capture card data. This data can be sold on the dark web and used to make fraudulent purchases on the internet.

Malware, spyware or viruses: By infecting users’ computing devices with malicious software such as malware, spyware or viruses, attackers can steal personal and financial information, causing substantial damage to users.

Phishing: This strategy involves creating a replica of a trusted website (such as that of a bank or credit card company) and directing users to it, either by sending genuine-looking emails or by placing links in web search results. Customers who land on these websites unwittingly enter their credentials and confidential information, which the website records and steals.

Mobile viruses: Similar to malware on desktops and laptops, specialized mobile viruses help thieves steal confidential data stored on mobile phones.

Consequences of Fraud

Credit information stolen from online sources can then be used to commit fraud online, most commonly by making online payments. Online payment fraud, at its simplest, involves a fraudster making unauthorized purchases online using someone else’s fraudulently obtained credit card number.
For instance, the fraudster could buy an expensive product such as a watch from an e-commerce website for $1,000 and then resell it on a different online marketplace like eBay for $200. The original cardholder will eventually notice the unauthorized transaction and file a dispute (also known as a “chargeback”) with the bank.

If the cardholder’s bank decides that the transaction was fraudulent, the cardholder is reimbursed, but the business is left to cover the cost of the fraudulent charge. This cost is not just the price of the product sold, but also any additional charges incurred as a result of the dispute. When a business is targeted by fraudsters, these expenses can accumulate and significantly affect its finances.

False negatives (fraud that is not detected and prevented before the dispute happens) are not the only way in which fraud can have a real financial impact on a business. False positives (genuine transactions that are blocked by a fraud detection system) are also expensive. For example, when a customer genuinely tries to buy a product but is blocked because of a false positive, the business loses the gross profit on the sale and takes a hit to its reputation. The diagram below explains this.

Detecting Fraud

Systems that detect fraud vary depending on the organization developing them, the algorithms used, and the type of fraud being detected. Despite this, the main underlying principles are similar. The primary task of such a system is the early identification of any variations or irregularities in user actions or in the existing process flow.
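As a rough illustration of what identifying irregularities in user actions can mean in practice, the sketch below compares new transaction amounts against one user's purchase history. The z-score rule, the threshold of 3.0, the function name and the sample amounts are all assumptions made for this example; real systems weigh many more signals than the amount alone.

```python
from statistics import mean, stdev


def flag_unusual_amounts(history, new_amounts, threshold=3.0):
    """Return the new amounts that deviate strongly from a user's history.

    `history` holds past transaction amounts for a single user. The z-score
    rule and the default threshold are illustrative assumptions only.
    """
    if len(history) < 2:
        return []  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Every past amount was identical, so any different amount stands out.
        return [a for a in new_amounts if a != baseline]
    return [a for a in new_amounts if abs(a - baseline) / spread > threshold]


# Hypothetical purchase history and incoming transactions for one user.
past_purchases = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5]
incoming = [44.0, 1000.0, 39.0]

unusual = flag_unusual_amounts(past_purchases, incoming)
print(unusual)
```

With these sample values only the $1,000 purchase is reported, since the other two amounts sit close to the user's usual spending.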
Every organization that aims to detect fraud should build a customized system, following steps such as:

Studying and examining expected standard user behaviors
Identifying which patterns of behavior are atypical or deviate from the norm
Establishing the scenarios in which notifications should be raised

Using these guidelines, every organization puts in place a process to review and detect fraudulent transactions. This process could rely on manual review, on rules-based software that selects the transactions to be manually reviewed, or on software that can detect fraud almost entirely by itself with very little human intervention. A 2016 survey from Cybersource, one of the leading rules-based fraud detection solutions, revealed that over 25% of all transactions across the companies surveyed were manually reviewed. While manual review can be effective, it is not an efficient way to detect and prevent fraud, and it might end up being costlier than the fraud itself.

Many e-commerce businesses depend on rules-based systems to detect fraudulent transactions. Rules-based systems rely heavily on manual review; because e-commerce businesses used to have a window of time to review transactions before orders were shipped, this process was acceptable. Today, with same-day delivery, that window has shrunk significantly, and it is no longer feasible to review each transaction manually. Machine learning-based systems make this process significantly faster by using the business’s live data as well as historical data to identify fraudulent transactions based on the patterns of behaviour exhibited by both fraudulent and genuine customers. As companies amass ever larger data sets about their customers, the rules in rules-based systems must become increasingly complex to remain effective, which makes them almost impossible to maintain.
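To make the idea of a rules-based screen concrete, here is a minimal sketch in Python. The `Transaction` fields, the three rules and their cut-off values are hypothetical and chosen only for illustration; they are not taken from Cybersource or any other product.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float
    country: str           # country where the order was placed
    card_country: str      # country where the card was issued
    orders_last_hour: int  # recent orders made with the same card


# Hypothetical rules: each pairs a human-readable reason with a predicate.
RULES = [
    ("high amount", lambda t: t.amount > 500),
    ("country mismatch", lambda t: t.country != t.card_country),
    ("rapid repeat orders", lambda t: t.orders_last_hour >= 3),
]


def review_reasons(txn):
    """Return the names of all rules that the transaction triggers."""
    return [name for name, rule in RULES if rule(txn)]


def needs_manual_review(txn):
    """A transaction goes to manual review if any rule fires."""
    return bool(review_reasons(txn))


routine = Transaction(amount=35.0, country="US", card_country="US", orders_last_hour=1)
suspicious = Transaction(amount=1000.0, country="US", card_country="GB", orders_last_hour=4)

print(review_reasons(routine))
print(review_reasons(suspicious))
```

Every new signal a business wants to consider means another hand-written entry in `RULES`, plus a decision about how it interacts with the existing ones, which hints at why such systems become hard to maintain as the data grows.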
Machine learning is well suited to parsing very large data sets and weighing many data signals simultaneously. The historical data contains chargeback information, which serves as excellent training data for the system to detect fraud in the live data.

It is essential to choose appropriate models to apply to the historical data set in order to optimise the resulting levels of recall and precision. Recall is the proportion of fraudulent transactions that the models correctly flagged, and precision is the proportion of flagged transactions that were actually fraudulent.

Classification of Data Mining and Machine Learning techniques

Data mining techniques for detecting financial accounting fraud can be classified into the following major categories:

Regression Models: Regression-based models are usually based on logistic regression, stepwise logistic regression, multi-criteria decision making methods and exponential generalized beta two (EGB2). They are mainly used for insurance and corporate fraud.

Neural Networks: These are nonlinear statistical data modeling tools that are modeled on the functioning of the human brain and use a group of interconnected nodes. Neural networks are used extensively in classification and clustering. They have the advantages of being adaptive and able to generate robust models, and their classification process can be modified if new training weights are set. They are primarily used for credit card, automobile insurance and corporate fraud.

Bayesian Belief Network (BBN): This represents a set of random variables and their conditional independencies using a directed acyclic graph (DAG), in which nodes represent random variables and missing edges encode conditional independencies between the variables.
BBNs are actively used in developing models for automobile insurance, credit card, and corporate fraud detection.

Decision Trees: A decision tree (DT) is a tree-structured decision support tool in which each node represents a test on an attribute and each branch represents a possible outcome of that test. DTs are commonly used for detecting credit card, automobile insurance, and corporate fraud.

Naïve Bayes: Naïve Bayes is a simple probabilistic classifier based on Bayes’ rule of conditional probability.

Nearest Neighbour Method: This is a similarity-based classification approach. It is used in detecting automobile insurance claims fraud and in identifying defaults by credit card clients.

Fuzzy Logic and Genetic Algorithm: Fuzzy logic is a mathematical technique used to formalize subjective reasoning and to assign data to a particular group based on the degree of possibility that the data belongs to that group. Genetic algorithms are used in classifier systems to represent and model auditors’ decision behavior in a fraud setting.

Effectiveness of Machine Learning

According to IBM, machine learning techniques help make fraud detection faster and more efficient.

Modern businesses like PayPal are increasingly adopting AI and machine learning solutions to detect fraud. PayPal has a fraud rate of 0.32% of revenue, which is significantly lower than the industry average of 1.32%.
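When judging how effective a fraud model is, the recall and precision measures defined earlier are the usual yardsticks. The short sketch below computes both from a made-up set of ten transactions; the function name and the sample labels are assumptions for illustration and do not come from any real data set.

```python
def precision_recall(actual_fraud, flagged):
    """Compute (precision, recall) from two equal-length lists of booleans.

    actual_fraud[i] is True if transaction i was really fraudulent;
    flagged[i] is True if the model flagged transaction i.
    """
    pairs = list(zip(actual_fraud, flagged))
    true_pos = sum(1 for actual, flag in pairs if actual and flag)
    false_pos = sum(1 for actual, flag in pairs if not actual and flag)
    false_neg = sum(1 for actual, flag in pairs if actual and not flag)

    # Guard against division by zero when nothing is flagged or nothing is fraud.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical outcome: 5 real frauds, 4 flags raised, 3 of them correct.
actual_fraud = [True, True, True, True, True, False, False, False, False, False]
flagged = [True, True, True, False, False, True, False, False, False, False]

precision, recall = precision_recall(actual_fraud, flagged)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four flagged transactions are fraudulent (precision 0.75) and three of the five frauds are caught (recall 0.60). The one wrongly flagged transaction is a false positive and the two missed frauds are false negatives, the two kinds of costly error discussed under Consequences of Fraud.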