This article suggests a new approach to analyzing enterprise systems using the "small-world" property of complex systems. Based on recent research, this property is found in a diversity of situations and leads to several surprising conclusions and tactics for managing complex systems. Part 2 of this article will outline a methodology, called Associative Link Analysis, which exploits this approach within the enterprise data warehousing environment.
Business intelligence (BI) professionals, particularly those of us involved with data warehousing (DW), are missing a huge opportunity for exploiting enterprise data. We have selected and projected our data, aggregated and pivoted, and correlated and mined. However, we have not "linked" and "hubbed" our data.
The analysis of enterprise data usually adopts a reductionist approach by focusing on only one piece of the overall enterprise system at a time. Missing is the ability to see and understand the entire system. Also missing is the ability to visualize and analyze the dynamic interactions between elements of that system.
In many physical and social systems, the important characteristic is that the system is composed of a loosely coupled network of interacting autonomous elements. It is not a homogeneous mass. The whole system behaves quite differently from its individual elements. For instance, analyzing the characteristics of individual customers may not provide insight about their purchasing behavior if they interact in their purchasing decisions (such as sharing opinions within a discussion group).
The key to understanding enterprise systems is to understand why elements seem so closely interwoven when there are so many of those elements. It is a big world out there, but some worlds seem quite small at times.
The Small-World Property
The small-world (SW) property is something that we all experience in social settings. At your next party, ask a stranger where he/she grew up, went to college or took his/her last vacation. In the next ten minutes, I predict that the conversation will be filled with people, places and events that the two of you have in common. It seems like a "small world." However, why do we often experience this SW feeling amid the enormous complexity of our society?
Stanley Milgram, a social psychologist at Harvard, conducted a simple experiment in 1967. He sent packets to hundreds of randomly selected people in Nebraska and Kansas with instructions to get the packet to a target individual in Boston, identified by name, occupation and city. There was one important constraint: each recipient had to get the packet to the target by forwarding it through a chain of individuals they knew on a first-name basis. Milgram tracked the activity by postcards (included in the packets) returned by the participants. Of the successful contacts, there was an average of six people in the path of the packet exchange. Hence, he concluded that there were six degrees of separation between any two persons within the world. Albeit based on tenuous results, this idea of separation has stuck in our collective minds.1
Recent work has shown that there is something special and universal about the SW property that goes far beyond social networks. In the last four years, there has been a surge of articles and books on analyzing complex systems by the linkages (i.e., associations, relationships or connections) that form their internal structure (see Suggested Readings). From this work, diverse applications are emerging in Internet security, drug development, the SARS epidemic, terrorism prevention, stock market meltdowns and the spread of HIV. See a sampling in Figure 1.
As an example of a social network, Figure 2 shows the informal personal interactions occurring in a company. This is a map showing the combined company after a merger. The red nodes are executives from the acquiring firm, while the green nodes are executives from the acquired company. The dark gray nodes are new hires after the merger. A link is drawn if the executives exchanged information related to the merger. Notice that, except for the people in the central core, executives mainly interact with other executives from their original company. This insight prompted the company to intermix executives to better integrate the organization.
Figure 2: Corporate Structure After a Merger
Reprinted with permission from Valdis Krebs of orgnet.com (http://www.orgnet.com/merger.html)
The SW property involves an amazing efficiency in connecting elements of a complex system while maintaining the natural clustering among similar elements. In uniformly linked structures, there is connectivity efficiency within clusters of similar elements, but there is a penalty to connect elements across clusters.2 In randomly linked structures, there is efficiency in connecting diverse elements, but there is a penalty in connecting similar elements.3
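The trade-off between the two extremes can be checked numerically. The sketch below is not from the article: it assumes a ring of 200 nodes, each linked to its six nearest neighbors, as the "uniform" structure, and redirects each link to a random node with probability `p` (the rewiring scheme of the Watts-Strogatz model). The function names and parameter values are illustrative choices.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Uniformly linked structure: each node links to its k nearest neighbors (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, p, rng):
    """Return a copy in which each link is redirected to a random node with probability p."""
    n = len(adj)
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    edges = [(i, j) for i in adj for j in adj[i] if i < j]
    for i, j in edges:
        if rng.random() < p:
            candidates = [t for t in range(n) if t != i and t not in new[i]]
            if candidates:
                t = rng.choice(candidates)
                new[i].discard(j)
                new[j].discard(i)
                new[i].add(t)
                new[t].add(i)
    return new

def average_path_length(adj):
    """Mean number of hops between reachable pairs (the 'degrees of separation')."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from this source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering(adj):
    """Average share of a node's neighbor pairs that are themselves linked."""
    scores = []
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        if k < 2:
            scores.append(0.0)
            continue
        linked = sum(1 for a in range(k) for b in range(a + 1, k) if nb[b] in adj[nb[a]])
        scores.append(2 * linked / (k * (k - 1)))
    return sum(scores) / len(scores)

rng = random.Random(42)  # fixed seed so the run is repeatable
lattice = ring_lattice(200, 6)
results = {}
for p in (0.0, 0.05, 1.0):  # uniform, in between, fully random
    g = rewire(lattice, p, rng)
    results[p] = (average_path_length(g), clustering(g))
    print(f"p={p:<4} path length={results[p][0]:.2f} clustering={results[p][1]:.2f}")
```

With `p = 0` the lattice is highly clustered (0.6) but an average pair is about 17 hops apart. With `p = 1` the paths are short but the clustering is largely gone. A small `p` in between tends to shorten the paths sharply while keeping most of the clustering.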
SW structures in real systems live between the extremes of uniform and random. As Buchanan states, "Too much order and familiarity is just as bad as too much disorder and novelty. We instead need to strike some delicate balance between the two."4 So, what is the trick inherent in SW systems?
Aristocratic Versus Egalitarian
Current work indicates that there are two approaches for creating SW structures:
Aristocratic approach dominated by a few elite hubs.
Egalitarian approach with equality among nodes but strengthened with weak links between disparate clusters.5
First, the aristocratic approach favors a special subset of nodes called hubs that have an unusually large number of links to other nodes. It is the hubs that determine the behavior of the entire system.
The insight into the special nature of hubs came from analyzing the World Wide Web. Dr. Albert-László Barabási, a physics professor at the University of Notre Dame, collected Web pages visited by the university community and followed the URL links from those pages. When he plotted the distribution of links per node, he was surprised by the results. Most Web pages had a couple of links, but a few Web pages possessed large numbers of links. One would normally assume that the number of links per node would follow a normal distribution, a bell-shaped curve around some average. In other words, if Web pages had eight links on average, it would be very unlikely for a Web page to have significantly more or fewer than eight links. However, there were a small number of Web pages that had thousands of links to other Web pages, and there were some Web pages to which thousands of other pages pointed.
Let's use a common example to illustrate this point. There are probably millions of people who consider the popular singer Norah Jones to be their friend (in a special sense).6 If she were killed in a car accident, there would be millions who would grieve. Conversely, there are millions of people who have only one friend ... or even fewer. This is quite a disparity.
We can illustrate this disparity in Figure 3. If we plot the number of nodes versus the number of links per node for some network, we might get the chart on the left. It is the typical normal distribution where a typical node has the average number of links. Outside the bell (i.e., three or four standard deviations from the average), the probability that a node would have that many links is very close to zero. An example of a normal distribution would be the height of the adult population. Most people are between five and six feet tall. However, more importantly, no one would be less than two feet or greater than eight feet.
Figure 3: Random Versus Scale-Free Distributions
In many real systems, we are surprised to find the chart in the middle (as Barabási did). There is no bell shape; there is no average in any meaningful sense. It looks like a ski slope! There are many nodes with a small number of links, and there are a few nodes with a large number of links. This is a steeply falling, long-tailed curve that follows a power law rather than an exponential. In Figure 3, the chart at the right uses logarithmic scales to show this power-law distribution clearly: on those scales it becomes a straight line. Barabási called this attribute "scale-free" because the distribution has no characteristic scale.
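In symbols, the ski-slope shape can be written as a power law. The notation here is an assumption, not taken from the article: $P(k)$ is the fraction of nodes with $k$ links, $\gamma$ is a positive exponent that differs from network to network, and $c$ and $a$ are arbitrary positive constants.

```latex
\begin{align}
P(k) &= c\,k^{-\gamma} \\
\log P(k) &= \log c - \gamma \log k \\
P(a k) &= c\,(a k)^{-\gamma} = a^{-\gamma}\,P(k)
\end{align}
```

The second line is why the logarithmic chart shows a straight line, with slope $-\gamma$. The third line is one way to read "scale-free": stretching the horizontal axis by any factor $a$ multiplies the curve by a constant but leaves its shape unchanged.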
By their nature, scale-free systems foster a disparity within their internal structure. Norah Jones accumulates more friends in a day than I do in a lifetime. Scale-free systems favor the rich nodes, and the rich get richer over time. Hence, scale-free systems with their elite hubs are the "aristocratic" approach to SW structures.
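The "rich get richer" dynamic can be sketched as a growth rule often called preferential attachment. The code below is an illustration under stated assumptions, not the article's method: each new node adds exactly one link, and it picks its target with probability proportional to the target's current number of links. The function name, the seed and the network size of 5,000 are arbitrary choices.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, rng):
    """Grow a network one node at a time; each newcomer links to one existing
    node chosen in proportion to that node's current number of links."""
    # Every link adds both of its endpoints to this list, so a uniform draw
    # from the list picks a node with probability proportional to its degree.
    endpoints = [0, 1]  # start from a single link between nodes 0 and 1
    degree = Counter({0: 1, 1: 1})
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints += [new, target]
        degree[new] += 1
        degree[target] += 1
    return degree

rng = random.Random(7)  # fixed seed so the run is repeatable
degree = preferential_attachment(5000, rng)

counts = Counter(degree.values())  # number of links -> how many nodes have it
leaf_share = counts[1] / len(degree)
max_degree = max(degree.values())

print(f"nodes with a single link: {leaf_share:.0%}")
print(f"links held by the biggest hub: {max_degree}")
for k in (1, 2, 4, 8, 16, 32):
    print(f"  nodes with exactly {k:>2} links: {counts.get(k, 0)}")
```

Most nodes end up with a single link while a handful of early nodes accumulate far more, which is the ski-slope pattern rather than a bell curve.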
Second, the egalitarian approach favors more equality among nodes. There is a rich clustering among nodes having similar features or functions, and there is an obvious absence of nodes acting as hubs that exploit most of the links.
When friendship involves a real handshake, the dynamics change. How many people would Norah Jones consider as a real friend worthy of a sincere handshake or loving hug? Undoubtedly, the number is considerably less than the millions who consider her a special friend. In social systems, there is a physical limit to the number of social links (friendships or other associations) that a person can support. Likewise, in many types of complex systems, the number of links supported by a node becomes limited because of the high cost of supporting a large number of links.
What gives egalitarian structures the special SW property? The answer is "weak links": links between nodes in dissimilar clusters.7 For instance, most of your friends know each other. Your friends typically have a lot of history, events and associations in common. However, you also have a few friends who are not that familiar with your other friends. Perhaps you met these friends on vacation, had a great time together and now just exchange holiday greetings. This friendship is a weak link, but it supplies a special strength in your social network. Studies have shown that if you are searching for a new job, your weak-linked friends will be more valuable than your close-linked friends.
The recent spread of the SARS virus is another illustration of weak links. When a person carrying the virus boards an airplane, the person is leaving the cluster of people with whom he/she normally associates, and the person now infects entirely different clusters. Thus, the disease spreads in unexpected ways.
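One way to spot candidate weak links in data is to ask how many acquaintances the two ends of a link share. The sketch below is a heuristic chosen for illustration, not a definition from the article: it uses a "neighborhood overlap" score and treats a link with zero overlap as a weak link. The names and friendships are invented.

```python
from itertools import combinations

def build_graph(links):
    """Turn a list of (a, b) friendships into an adjacency map."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def neighborhood_overlap(adj, a, b):
    """Share of a's and b's other acquaintances that the two have in common."""
    common = adj[a] & adj[b]
    either = (adj[a] | adj[b]) - {a, b}
    return len(common) / len(either) if either else 0.0

# Hypothetical data: two tight circles of friends joined by one vacation friendship.
home = ["Ann", "Bob", "Cat", "Dan"]
away = ["Eve", "Fay", "Gus", "Hal"]
links = list(combinations(home, 2)) + list(combinations(away, 2)) + [("Ann", "Eve")]
adj = build_graph(links)

overlap = {(a, b): neighborhood_overlap(adj, a, b) for a, b in links}
weak_links = [pair for pair, score in overlap.items() if score == 0.0]

print("weak links:", weak_links)
```

In this toy network, the only link whose ends share no acquaintances is the one between Ann and Eve, which is also the only link tying the two circles together.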
Egalitarian structures derive their strength from weak links, not big hubs. The dynamics are driven by a different cost mechanism than aristocratic structures. As hubs become increasingly expensive to form, they will play a lesser role. For egalitarian structures, nodes will cluster based on some similarity, with an occasional weak link tying the entire structure together.
Over the coming years, it will be exciting to see how this aristocratic-egalitarian (A-E) distinction unfolds. Hopefully, we will see the emergence of a unifying theory that will integrate the A-E approaches, thus giving us a better understanding of the SW property.
When we apply the A-E distinction to studying complex systems, we often find surprises that foster humility. In particular, sudden nonlinear fluctuations in the behavior of SW systems are a common trait. Small and seemingly insignificant changes trigger "tipping points," exhibiting fluctuations out of proportion to the initial change.
For example, disease epidemics explode when their infection connectivity among people achieves a SW level. A small change in a treatment program may impact the spread by a 50 percent decrease or a 50 percent increase. The same principle has been applied to marketing in which a vendor wants their ideas to spread among their customer base.8 The buzz that occurs from a successful marketing campaign is the SW property kicking into gear.
Studies have shown that tipping points are intimately related to the SW property. In other words, tipping points occur when a world suddenly flips between a large (anonymous) structure and a small (intimate) one. The mechanism that allows you to connect with an old schoolmate at a social gathering is the same mechanism that allows a virus to spread through a population. Unfortunately, the mechanism behaves unpredictably around the tipping point, which forces us to understand why and when tipping points occur within complex systems.
Part 2 of this article applies the concepts of the small-world property, the aristocratic-egalitarian distinction and tipping points to the BI/DW area. In particular, it suggests a new methodology, called Associative Link Analysis, that exploits these concepts to better understand enterprise systems and to turn that understanding into practical applications.
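A simple way to see a tipping point of this kind is to link nodes purely at random and watch the largest connected cluster. This is a textbook random-graph illustration, not the article's analysis. The function name, the network size and the link densities are assumptions chosen for the demonstration, and the sketch allows the occasional duplicate or self-link for simplicity.

```python
import random

def largest_cluster_share(n, mean_links, rng):
    """Link n nodes at random until the average node has `mean_links` links,
    then return the share of nodes in the largest connected cluster."""
    parent = list(range(n))  # union-find forest tracking clusters
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    n_links = int(mean_links * n / 2)  # each link serves two nodes
    for _ in range(n_links):
        a, b = rng.randrange(n), rng.randrange(n)
        ra, rb = find(a), find(b)
        if ra != rb:  # merge the smaller cluster into the larger one
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
    return max(size[find(x)] for x in range(n)) / n

rng = random.Random(1)  # fixed seed so the run is repeatable
shares = {m: largest_cluster_share(5000, m, rng) for m in (0.5, 0.9, 1.1, 2.0)}
for mean_links, share in shares.items():
    print(f"average links per node {mean_links}: largest cluster holds {share:.1%}")
```

Below about one link per node, the largest cluster stays a tiny fraction of the population. Above it, a single cluster spanning a large share of the nodes appears, even though each step only adds a few more links.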
For additional references, see http://www.bolder.com/ALA/.