
A Gentle Introduction to Neural Networks

Published
  • January 01 1999, 1:00am EST

Most of the data mining literature about neural networks is geared toward programmers and technically sophisticated readers. However, I believe that many readers would appreciate a non-mathematical approach to get the flavor of neural networks. This article provides a gentle introduction to neural networks from a more general perspective.

Background

With the spread of computers and "electronic brains" in the late 1950s, researchers looked for ways to mimic the brain more closely. One effort was called Perceptrons. Perceptrons were the initial attempt to build a thinking machine by connecting many computational units in a way similar to how neurons in the brain were believed to be interconnected. Because of the expense of computers and the lack of good algorithms for the concept, Perceptrons were limited and support for them dwindled.

In the early 1980s, there was renewed interest in mimicking the brain. Research that had been called "Perceptron Theory" was then called "Neural Networks" because of the proposed similarity with neurons in the brain. One reason for the current interest in neural networks is the fact that they are dynamic, adaptive systems. Conventional computer systems are static: they do not change appreciably from their initial state. Neural networks are dynamic: they have many interacting parts called neurons, computational units, or nodes, which can change their connections to learn from incoming data. Flexible connections allow a neural network to evolve and adapt to changing input and a changing environment.

Two advances helped bring about renewed neural network interest in the 1980s. First, powerful computers became cheap and plentiful. Second, new algorithms were developed to implement the neural networks and to represent the knowledge used to teach these networks.

Teaching Neural Networks

Neural networks have an advantage over other applications because they can learn from data and discover patterns, while other systems need to be programmed explicitly. Neural networks are given examples of data or metadata as input, and they produce an output. They learn by being corrected, based on comparisons to a predefined, standard output. No explicit rules or other prior knowledge are supplied. The neural network models its own rules, knowledge and relationships for a given set of data by using a statistical approach to iteratively match its outputs to a consistent set of patterns.
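
The idea of learning by correction can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any particular product: the single-node model, the function names, the learning rate and the toy training data (the logical AND of two inputs) are all assumptions chosen for brevity.

```python
def predict(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(examples, learning_rate=0.1, epochs=50):
    """Learn weights by comparing each output with the desired output."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, desired in examples:
            # error is 0 when the output was right, +1 or -1 when it was wrong
            error = desired - predict(weights, bias, inputs)
            # A zero error leaves the weights unchanged; otherwise nudge them.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: (inputs, desired output) for logical AND.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_EXAMPLES)
learned = [predict(weights, bias, inputs) for inputs, _ in AND_EXAMPLES]
```

No rule for AND is ever written down. The weights simply drift until the outputs stop being corrected.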

How They Work

Imagine a grassy field with many bushes between two buildings. People will cross this field along different paths, but the shorter path, or the path of least resistance, will get the most traffic and will eventually be the best defined. This is the general idea behind the learning process used by neural networks. If a neural network were given a model of this example, it would use statistical-type algorithms to assign weights to each of the various possible paths. Just as more people will travel the shorter path and give it more wear, the neural network will assign a greater weight to the shorter path, if that is the goal. Although more complicated, neural network-based data mining is very similar.

A neural network has an input layer, processing layers and an output layer. The information in a neural network is distributed throughout the processing layers. Each processing layer is made of many nodes. Each node simulates a neuron by its interconnection to other nodes. The collective outputs form the final output from the neural network. The neural network is taught by correcting the false or undesired outputs from a given input. Training results in a different set of nodes being used and in an adjustment in the weights of the interconnections to other nodes. The weights and adjustments are similar to a least mean squares fit of a line to a number of points. In this analogy, each point represents a node, and the fitted line represents the collective, final output. Just as statistical analysis reveals underlying patterns in a collection of data, a neural network locates consistent patterns in a collection of data, based on a set of predefined criteria.
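
The comparison with a least mean squares fit can be made concrete with a short sketch. Here a line is fitted to a handful of points by making many small corrections to two weights, a slope and an intercept, much as training adjusts the weights of a network. The points, the learning rate and the number of passes are assumptions picked for the illustration.

```python
def fit_line_lms(points, learning_rate=0.01, passes=2000):
    """Fit y = slope * x + intercept by repeated small corrections."""
    slope, intercept = 0.0, 0.0
    for _ in range(passes):
        for x, y in points:
            # How far the current line is from this point.
            error = y - (slope * x + intercept)
            # Move both weights a little in the direction that shrinks the error.
            slope += learning_rate * error * x
            intercept += learning_rate * error
    return slope, intercept


# Hypothetical points that lie exactly on the line y = 2x + 1.
POINTS = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

slope, intercept = fit_line_lms(POINTS)
```

With these points the slope and intercept settle very close to 2 and 1, the line the data was drawn from.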

The input layer is connected to the raw data or to the metadata. The middle (or hidden) layer is used to consolidate the results of the input layer, and the output layer provides the results of the consensus propagated from the previous layers. A rough analogy is provided by comparison with a legal trial. The two lawyers gather information from the external world much the same as nodes in the input layer. The twelve-person jury consolidates the information from the lawyers, similarly to the middle layer. And the judge hands down the final sentence, somewhat like the output layer. In practice, there are usually many more nodes in the input layer than in the middle layer, and the number of nodes in the output layer usually corresponds to the number of possible outcomes.
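
The flow from input layer to middle layer to output layer can be sketched as follows. The layer sizes (four input nodes, two middle nodes, two output nodes), the hand-picked weights and the sigmoid squashing function are all assumptions made for the example. The article itself does not name a particular function or network size.

```python
import math


def sigmoid(value):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-value))


def layer_output(inputs, weight_rows, biases):
    """Compute one layer: each node takes a weighted sum of all its inputs."""
    return [
        sigmoid(bias + sum(w * x for w, x in zip(row, inputs)))
        for row, bias in zip(weight_rows, biases)
    ]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Pass raw inputs through the middle layer, then the output layer."""
    hidden = layer_output(inputs, hidden_weights, hidden_biases)
    return layer_output(hidden, output_weights, output_biases)


# Hypothetical weights: one row per node, one weight per incoming connection.
HIDDEN_WEIGHTS = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, -0.5, 0.2]]
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]
OUTPUT_BIASES = [0.0, 0.0]

scores = forward([1.0, 0.0, 1.0, 0.0], HIDDEN_WEIGHTS, HIDDEN_BIASES,
                 OUTPUT_WEIGHTS, OUTPUT_BIASES)
outcome = scores.index(max(scores))  # position of the strongest output node
```

Each output node stands for one possible outcome, and the node with the highest score is taken as the network's answer.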

Neural networks have much potential because they are flexible, much the same as the method of least mean squares is flexible. The method of least mean squares can be used in many ways to fit a line to a set of points. Similarly, the nodes in a neural network can be connected to recognize patterns in business data. This is the essence of data mining.

Prognosis for the Future

The data mining capabilities of neural networks can be better than statistical approaches for some business functions. The pattern matching ability of neural networks has been applied to credit evaluation, finance, strategic analysis and decision support systems.

A large credit card company used a neural network to appraise credit risks in evaluating loan applications. To train the neural network, the company used a number of previous applications, along with the approval or disapproval decision for each and the eventual success or failure of the loan. The company was quite satisfied with the neural network's success rate of greater than 95 percent, which was equal to or better than that of most loan officers.

Wall Street's so-called "rocket science" uses neural networks for data mining. Financial rocket scientists create stock trading strategies by using neural networks and statistical methods to perform data mining on the massive amount of corporate information available to the general public.

People who deal with large amounts of data for strategic analysis frequently rely on their experiences and intuition to act as a type of pattern recognition. For example, a chess master does not really try to look at all possibilities. He looks for patterns in the general flow of movement and strategies. When he identifies a pattern, he tries to extend that sequence to create a logical advantage. Successful generals managing a battlefield situation may function in much the same way as a chess master. Just like generals, CEOs also make decisions based on past experience, intuition and pattern recognition. Because neural networks can use data mining to identify strategic patterns, strategic analysis is an area where they can be leveraged to build a competitive advantage.

An advanced application of neural networks is to connect them to knowledge-based decision support systems. The neural networks perform the pattern analysis and data mining. The patterns are passed to a knowledge-based decision support system which uses business rules to suggest strategic actions. This idea can be used to build automated "what-if" analyses, similar to spreadsheet analyses, but using strategies rather than numbers. Using carefully created data models and knowledge representations, a "what-if" analysis system can discover more strategic scenarios over a weekend than a systems analyst might develop in three months. The system won't replace the experts, but it can automate a portion of their expertise, resulting in a system that leverages their time and analyses for a competitive advantage. This marriage of the two technologies holds much potential for realizing the competitive advantages of data mining and decision support systems.
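
The hand-off from pattern analysis to business rules might look something like the sketch below. Everything in it is hypothetical: the pattern names, the scores, the threshold and the rules are invented for illustration, and the `detect_patterns` function is only a stand-in for a trained neural network.

```python
def detect_patterns(scores, threshold=0.7):
    """Stand-in for the neural network: keep patterns scored at or above a threshold."""
    return [name for name, score in scores.items() if score >= threshold]


def suggest_actions(patterns, rules):
    """Knowledge-based step: look up a suggested action for each known pattern."""
    return [rules[pattern] for pattern in patterns if pattern in rules]


# Hypothetical business rules: pattern name -> suggested strategic action.
BUSINESS_RULES = {
    "rising_demand": "increase inventory",
    "customer_churn": "launch retention offer",
}

# Hypothetical pattern scores, as a trained network might report them.
PATTERN_SCORES = {"rising_demand": 0.91, "customer_churn": 0.35, "seasonal_dip": 0.78}

patterns = detect_patterns(PATTERN_SCORES)
actions = suggest_actions(patterns, BUSINESS_RULES)
```

A "what-if" analysis would rerun these two steps over many alternative sets of inputs and collect the suggested actions for an expert to review.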
