Saturday, December 31, 2016
Welcome
The emphasis on this blog, however, is mainly critical of neoclassical and mainstream economics. I have been alternating numerical counter-examples with less mathematical posts. In any case, I have been documenting demonstrations of errors in mainstream economics. My chief inspiration here is the Cambridge-Italian economist Piero Sraffa.
In general, this blog is abstract, and I think I steer clear of commenting on practical politics of the day.
I've also started posting recipes for my own purposes. When I just follow a recipe in a cookbook, I'll only post a reminder that I like the recipe.
Comments Policy: I'm quite lax on enforcing any comments policy. I prefer those who post as anonymous (that is, without logging in) to sign their posts at least with a pseudonym. This will make conversations easier to conduct.
Wednesday, April 13, 2016
Math Is Power
A common type of post in this blog is the presentation of concrete numerical examples in economics. Sometimes I present examples to illustrate some principle. But usually I try to find examples that are counter-intuitive or perverse, at least from the perspective of economics as mainstream economists often misteach it.
Voting games provide an arena where one can find surprising results in political science. I am thinking specifically of power indices. In this post, I try to explain two of the most widely used power indices by means of an example.
2.0 Me and My Aunt: A Voting GameFor purposes of exposition, I consider a specific game, called Me and My Aunt. There are four players in this version of the game, represented by elements of the set:
P = The set of players = {0, 1, 2, 3}
Out of respect, the first player gets two votes, while all other players get a vote each (Table 1). A coalition, S, is a set of players. That is, a coalition is a subset of P. A coalition passes a resolution if it has a majority of votes. Since there are four players, one of who has two votes, the total number of votes is five. So a majority, in this game of weighted voting, is three votes.
Players | Votes |
0 (Aunt) | 2 |
1 (Me) | 1 |
2 | 1 |
3 | 1 |
One needs to specify the payoff to each coalition to complete the definition, in characteristic function form, of this game. The characteristic function, v(S) maps the set of all subsets of P to the set {0, 1}. If the players in S have three or more votes,v(S) is 1. Otherwise, it is 0. That is, a winning coalition gains a payoff of one to share among its members.
3.0 The Penrose-Banzhaf Power IndexPower for a player, in this mathematical approach, is the ability to be the decisive member of a coalition. If, for a large number of coalitions, you being in or out of a coalition determines whether or not that coalition can pass a resolution, you have a lot of power. Correspondingly, if the members of most coalitions do not care whether you join, because your presence has no influence on whether or not they can put their agenda into effect, you have little power.
The Penrose-Banzhaf power index is one (of many) attempts to quantify this idea. Table 2 lists all 16 coalitions for the voting game under consideration. (The number of coalitions is the sum of a row in Pascal's triangle.) The second column in Table 2 specifies the value for the characteristic function for that coalition. Equivalently, the third column notes which eight coalitions are winning coalitions, and which eight are losing. The last two columns are useful for tallying up counts needed for the Penrose-Banzhaf index.
Coalition | Characteristic Function | Winning or Losing | Player | |
Aunt (0) | Me (1) | |||
{} | v( {} ) = 0 | Losing | 0 | 0 |
{0} | v( {0} ) = 0 | Losing | 0 | 0 |
{1} | v( {1} ) = 0 | Losing | 0 | 0 |
{2} | v( {2} ) = 0 | Losing | 0 | 0 |
{3} | v( {3} ) = 0 | Losing | 0 | 0 |
{0, 1} | v( {0, 1} ) = 1 | Winning | 1 | 1 |
{0, 2} | v( {0, 2} ) = 1 | Winning | 1 | 0 |
{0, 3} | v( {0, 3} ) = 1 | Winning | 1 | 0 |
{1, 2} | v( {1, 2} ) = 0 | Losing | 0 | 0 |
{1, 3} | v( {1, 3} ) = 0 | Losing | 0 | 0 |
{2, 3} | v( {2, 3} ) = 0 | Losing | 0 | 0 |
{0, 1, 2} | v( {0, 1, 2} ) = 1 | Winning | 1 | 0 |
{0, 1, 3} | v( {0, 1, 3} ) = 1 | Winning | 1 | 0 |
{0, 2, 3} | v( {0, 2, 3} ) = 1 | Winning | 1 | 0 |
{1, 2, 3} | v( {1, 2, 3} ) = 1 | Winning | 0 | 1 |
{0, 1, 2, 3} | v( {0, 1, 2, 3} ) = 1 | Winning | 0 | 0 |
The Penrose-Banzhaf index, ψ(i) is calculated for each player i. It is defined, for a given player, to be the ratio of the number of winning coalitions in which that player is decisive to the total number of coalitions, winning or losing. A player is decisive for a coalition if:
- The coalition is a winning coalition.
- The removal of the player from the coalition converts it to a losing coalition.
From the table above, one can see that player 0 is decisive for six coalitions, while player 1 is decisive for only two coalitions. Hence, the Penrose-Banzhaf index for "my aunt" is:
ψ(0) = 6/16 = 3/8
By symmetry, the index values for players 2 and 3 are the same as the value for player 1:
ψ(1) = ψ(2) = ψ(3) = 2/16 = 1/8
More than one player can be decisive for a winning coalition. No need exists for the Penrose-Banzhaf index to sum up to one. How much one's vote is weighted does not bear a simple relationship to how much power one has. Also note that the definition of this power index is not confined to simple majority games. Power indices can be calculated for voting games in which a super-majority is required to pass a measure. For example, in the United States Senate, 60 senators are needed to end a filibuster.
4.0 The Shapley-Shubik Power IndexThe Shapley-Shubik power index is an application of the calculation of the Shapley value to voting games. The Shapley value applies to cooperative games, in general. For its use as a measure of power in voting games, it matters in which order players enter a coalition. Accordingly, Table 3 lists all 24 permutations of all four players in the voting game being analyzed.
Permutation | Player | |
Aunt (0) | Me (1) | |
(0, 1, 2, 3) | v( {0} ) - v( {} ) = 0 | v( {0, 1} ) - v( {0} ) = 1 |
(0, 1, 3, 2) | v( {0} ) - v( {} ) = 0 | v( {0, 1} ) - v( {0} ) = 1 |
(0, 2, 1, 3) | v( {0} ) - v( {} ) = 0 | v( {0, 1, 2} ) - v( {0, 2} ) = 0 |
(0, 2, 3, 1) | v( {0} ) - v( {} ) = 0 | v( {0, 1, 2, 3} ) - v( {0, 2, 3} ) = 0 |
(0, 3, 1, 2) | v( {0} ) - v( {} ) = 0 | v( {0, 1, 3} ) - v( {0, 3} ) = 0 |
(0, 3, 2, 1) | v( {0} ) - v( {} ) = 0 | v( {0, 1, 2, 3} ) - v( {0, 2, 3} ) = 0 |
(1, 0, 2, 3) | v( {0, 1} ) - v( {1} ) = 1 | v( {1} ) - v( {} ) = 0 |
(1, 0, 3, 2) | v( {0, 1} ) - v( {1} ) = 1 | v( {1} ) - v( {} ) = 0 |
(1, 2, 0, 3) | v( {0, 1, 2} ) - v( {1, 2} ) = 1 | v( {1} ) - v( {} ) = 0 |
(1, 2, 3, 0) | v( {0, 1, 2, 3} ) - v( {1, 2, 3} ) = 0 | v( {1} ) - v( {} ) = 0 |
(1, 3, 0, 2) | v( {0, 1, 3} ) - v( {1, 3} ) = 1 | v( {1} ) - v( {} ) = 0 |
(1, 3, 2, 0) | v( {0, 1, 2, 3} ) - v( {1, 2, 3} ) = 0 | v( {1} ) - v( {} ) = 0 |
(2, 0, 1, 3) | v( {0, 2} ) - v( {2} ) = 1 | v( {0, 1, 2} ) - v( {0, 2} ) = 0 |
(2, 0, 3, 1) | v( {0, 2} ) - v( {2} ) = 1 | v( {0, 1, 2, 3} ) - v( {0, 2, 3} ) = 0 |
(2, 1, 0, 3) | v( {0, 1, 2} ) - v( {1, 2} ) = 1 | v( {1, 2} ) - v( {2} ) = 0 |
(2, 1, 3, 0) | v( {0, 1, 2, 3} ) - v( {1, 2, 3} ) = 0 | v( {1, 2} ) - v( {2} ) = 0 |
(2, 3, 0, 1) | v( {0, 2, 3} ) - v( {2, 3} ) = 1 | v( {0, 1, 2, 3} ) - v( {0, 2, 3} ) = 0 |
(2, 3, 1, 0) | v( {0, 1, 2, 3} ) - v( {1, 2, 3} ) = 0 | v( {1, 2, 3} ) - v( {2, 3} ) = 1 |
(3, 0, 1, 2) | v( {0, 3} ) - v( {3} ) = 1 | v( {0, 1, 3} ) - v( {0, 3} ) = 0 |
(3, 0, 2, 1) | v( {0, 3} ) - v( {3} ) = 1 | v( {0, 1, 2, 3} ) - v( {0, 2, 3} ) = 0 |
(3, 1, 0, 2) | v( {0, 1, 3} ) - v( {1, 3} ) = 1 | v( {1, 3} ) - v( {3} ) = 0 |
(3, 1, 2, 0) | v( {0, 1, 2, 3} ) - v( {1, 2, 3} ) = 0 | v( {1, 3} ) - v( {3} ) = 0 |
(3, 2, 0, 1) | v( {0, 2, 3} ) - v( {2, 3} ) = 1 | v( {0, 1, 2, 3} ) - v( {0, 2, 3} ) = 0 |
(3, 2, 1, 0) | v( {0, 1, 2, 3} ) - v( {1, 2, 3} ) = 0 | v( {1, 2, 3} ) - v( {2, 3} ) = 1 |
Table 3 shows some initially confusing calculations in the last two columns, where each of these columns is defined for a given player. Suppose a player and a permutation are defined. For that permutation, let the set S_{π, i} contain those players in the permutation π to the left of the given player i. The difference, in the last two columns, is the following, for i equal to 0 and to 1, respectively:
v(S_{π, i} ∪ {i}) - v(S_{π, i})
The Shapley-Shubik power index, for a player, is the ratio of a sum to the number of permutations of players. And that sum is calculated for each player, as the sum over all permutations, of the above difference in the value of the value of the characteristic function.
If I understand correctly, given a permutation, the above difference can only take on values of 0 or 1. And it will only be 1 for one player, where that player determines whether the formation of the coalition in the order given will be a winning coalition. As a consequence, the Shapley-Shubik power index is guaranteed to sum over players to unity. In this case, power is a fixed amount, with each player being measured as having a defined proportion of that power.
5.0 Both Power IndicesThe above has stepped through the calculation of two power indices, for all players, in a given game. Table 4 lists their values, as well as a normalization of the Penrose-Banzhauf power index such that the sum of the power, over all players, is unity. (I gather that the Penrose-Banzhauf index and the normalized index do not have the same properties.) As one might expect from the definition of the game, "my aunt" has more power than "me" in this game.
Player | Penrose-Banzhaf Power Index | Shapley-Shubik Power Index | |
Index | Normalized | ||
0 | 6/16 = 3/8 | 6/12 = 1/2 | 12/24 = 1/2 |
1 | 2/16 = 1/8 | 2/12 = 1/6 | 4/24 = 1/6 |
2 | 2/16 = 1/8 | 2/12 = 1/6 | 4/24 = 1/6 |
3 | 2/16 = 1/8 | 2/12 = 1/6 | 4/24 = 1/6 |
In many voting games, the normalized Penrose-Banzhauf and Shapley-Shubik power indices are not identical for all players. In fact, suppose the rules for the above variation of Me and my Aunt voting game are varied. Suppose now that four votes - a supermajority - are needed to carry a motion. The normalized Penrose-Banzhaf index for player 0 becomes 1/3, while each of the other players have a normalized Penrose-Banzhaf index of 2/9. Interestingly enough, the Shapley-Shubik indices for the players do not change, if I have calculated correctly. But the values assigned to rows in Table 3 do sometimes vary. Anyways, that one tweak of the rules results in different power indices, depending on which method one adopts. A more interesting example would be one in which the rankings vary among power indices.
Other power indices, albeit less common, do exist. Which one is most widely applicable? I would think that mainstream economists, given game theory and marginalism, would tend to prefer the Shapley-Shubik power index. Felsenthal and Machover (2004) seem to be widely recognized experts on measures of voting power, and they have come to prefer the Penrose-Banzhaf index over the Shapley-Shubik index.
6.0 Where To Go From HereI have described above a couple of power indices in voting games. As I understand it, many have tried to write down reasonable axioms that characterize power indices. One challenge is to specify a set of axioms such that your preferred power index is the only one that satisfies them. But, as I understand it, some sets of reasonable axioms are open insofar as more than one power index would satisfy them. I seem to recall a theorem that one could create a power index for a reasonable set of axioms such that whichever player you want in a voting game is the most powerful. Apparently, a connection can be drawn between a power index and a voting procedure. And Donald Saari boasts that he could create an apparently fair voting procedure that would result in whatever candidate you like being elected.
I gather that many examples of voting games have been presented in which apparently paradoxical or perverse results arise. And these do not seem to be merely theoretical results. Can I find some such examples? Perhaps, I should look here at some of Daron Acemoglu's work.
I am aware of three types of examples to look for. One is that of a dummy. A dummy is a player that, under the weights and the rule for how many votes are needed for passage, can never be decisive in a coalition. Whether this player drops out or joins a coalition can never change whether or not a resolution is passed, even though the player has a positive weight. A second odd possibility arises as the consequence of adding a new member to the electorate:
"...power of a weighted voting body may increase, rather than decrease, when new members are added to the original body." -- Steven J. Brams and Paul J. Affuso (1976).
A third odd possibility apparently can arise on a council when one district annexes another. Suppose, the district annexing the other consequently increases the weight of its vote accordingly. One might think a greater weight leads to more power. But, in certain cases, the normalized Penrose-Banzhaf index can decrease.
The above calculations for the Penrose-Banzhaf and Shapley-Shubik power indices treat all coalitions or permutations, respectively, as equally likely to arise. Empirically, this does not seem to be true. And this has an impact on how one might measure power. For example, since voting is unweighted on the Supreme Court of the United States, all justices might be thought to be equally powerful. But, because of the formation of well-defined blocks, Anthony Kennedy was often described as being particularly powerful in deciding court decisions, at least when Antonin Scalia was still alive. So empirically, one might include some assessment of the affinities of the players for one another and, thus, some influence on the probabilities of each coalition forming. This will have consequences on the calculation of power indices. But why stop there? In the United States these days, politicians only seem to represent the most wealthy.
Update: This page, from the University of Warwick, has links to utilities for calculating various power indices.
References- Steven J. Brams and Paul J. Affuso (1976). Power and Size: A New Paradox, Theory and Decision. V. 7, Iss. 1 (Feb.): pp. 29-56.
- Dan S. Felsenthal and Moshé Machover (2004). Voting Power Measurement: A Story of Misreinvention, London Scool of Economics and Political Science
- Andrew Gelman, Jonathan N. Katz, and Joseph Bafumi (2004). Standard Voting Power Indexes Do Not Work: An Empirical Analysis, B. J. Pol. S.. V. 34: pp. 657-674.
- Guillermo Owen (1971) Political Games, Naval Research Logistics Quarterly. V. 18, Iss. 3 (Sep.): pp. 345-355.
- Donald G. Saari and Katri K. Sieberg (1999). Some Surprising Properties of Power Indices.
Monday, April 11, 2016
Inane Responses To The Cambridge Capital Controversy
I consider the following views, if unqualified and without caveats, just silly:
- The Cambridge Capital Controversy (CCC) was only attacking aggregate neoclassical theory.
- The CCC is just a General Equilibrium argument, and it has been subsumed by General Equilibrium Theory. (Citing Mas Colell (1989) here does not help.)
- The CCC does not have anything to say about partial, microeconomic models.
- Perverse results, such as reswitching and capital-reversing, only arise in the special case of Leontief production functions. If you adopt widely used forms for production functions, the perverse results go away.
- It is an empirical question whether non-perverse results follow from neoclassical assumptions. And nobody has ever found empirical examples of capital-reversing or reswitching.
- Mainstream economists have moved on since the 1960s, and their models these days are not susceptible to the Cambridge critique.
I would think that one could not get such ideas published in any respectable journal. On the other hand, Paul Romer did get his ignorance about Joan Robinson into the American Economic Review
References- Andreu Mas-Colell (1989). Capital theory paradoxes: Anything goes. In Joan Robinson and Modern Economic Theory (ed. by George R. Feiwel), Macmillan.
Saturday, March 19, 2016
Post Keynesianism From The Outside
Post Keynesian economics has been under development for about three quarters of a century now. Academics in countries around the world have made contributions to the theory and to its application. And they have participated in many common practices of academics, including economists^{1}.
Post Keynesians have written and published papers in peer-reviewed journals. Over this time-span, such journals include widely referenced mainstream journals, such as the American Economic Review, the Economic Journal, the Journal of Economic Literature, the Journal of Political Economy, and the Quarterly Journal of Economics^{2}. Lately, certain specialized journals have proven more sympathetic to publishing Post Keynesians. Such journals include, for example, the Cambridge Journal of Economics, the Journal of Post Keynesian Economics, Kyklos, and the Review of Political Economy. The list suggests two other activities: the founding and editing of journals. As I understand it, Joan Robinson, among other economists now thought of as Post Keynesian, participated in the founding of the Review of Economic Studies, while the Review of Keynesian Economics is a more recent academic journal with an analogous start. The Banca Nazionale del Lavoro Quarterly Review, the Canadian Journal of Economics, and Metroeconomica are some journals, while not being specifically heterodox, I guess, had Post Keynesians as editors for some time^{3, 4}.
Participation in professional societies, as officers, as organizers of conferences and conference sessions, and as presenters at conferences, provides another typical venue for academic activities. Naturally, Post Keynesians have performed such activities. For example, John Kenneth Galbraith was president of the American Economic Society, and the annual meeting of the Allied Social Sciences Association (ASSA), held in conjunction with the American Economics Association, regularly holds sessions dedicated to Post Keynesian topics. Recently, heterodox economics have become interested in pluralism and how their concerns overlap. These concerns have been reflected in much work in many professional societies relating to heterodox economics.
I began this article with journal publications because economics has become less focused on books and more focused on journal publications during the period in which Post Keynesianism grew. But during this period, Post Keynesians have also made original contributions in books published by prestige university and academic presses. I think, for example, of presses associated with Cambridge, Columbia, and Harvard, to pick some examples at random^{5}.
After decades of work, a need will arise to introduce others to it. And Post Keynesians have addressed this need with anthologies of classic papers, introductory works for other economists, and textbooks. One can also find Post Keynesians editing, or participating in the development, of standard reference works^{6}.
On a more local level, Post Keynesian economists have participated in the governance of economic departments around the world^{7}. And they have provided governments with advice many a time, from within and without^{8}.
I have deliberately not written about substantial Post Keynesian ideas in this post. If one is aware of the history mentioned in this post, even if one had never been exposed to Post Keynesian ideas, one must conclude Post Keynesian theory is much like any other set of academic ideas. One would have difficulty in seeing how academics could justify dismissing these ideas without engaging with them Likewise, one might wonder how, perhaps, those aspiring to be professional economists might not even be exposed to Post Keynesianism in gaining a post graduate degree. Yet, apparently, such a happenstance seems to be not at all rare among mainstream economists.
Footnotes- I recognize my post is biased towards the English language. It is also quite impressionistic and selective. I am taking Sraffians as Post Keynesians for the purposes of this post.
- Major contributions to the Cambridge Capital Controversy are to be found in these journals.
- For example, Paolo Sylos Labini for the Banca Nazionale del Lavoro Quarterly Review, Athanasios Asimakopulos for the Canadian Journal of Economics, and Neri Salvadori for Metroeconomica.
- For what it is worth, I am published in the Manchester School.
- How would one characterize Edward Elgar and Routledge, for example?
- The first edition of the New Palgrave is an obvious example.
- Economics at Cambridge University is an obvious case. Albert Eichner chairing the Rutgers economics department is another case.
- Examples include Nicholas Kaldor's work with the Radcliffe Committee, John Kenneth Galbraith giving advice to John F. Kennedy, the advocacy of Tax-based Income Policies (TIPs) in the 1970s to fight stagflation, and policy suggestions associated with Modern Monetary Theory (MMT).
Wednesday, March 02, 2016
Romer And Romer Stumble
A debate has recently arisen about Gerald Friedman's analysis of Bernie Sanders' proposed economic program. In a welcome turn of events, two defenders of the establishment, Christina and David Romer, finally offer some substance, instance of just relying on their authority as Very Serious People.
In this post, I ignore most of the substance of the argument. I want to focus on three errors I find in this passage:
"Potentially more worrisome are the extensive interventions in the labor market. The experiences of many European countries from the 1970s to today show that an overly regulated labor market can have severe consequences for normal unemployment. There are strong arguments for raising the minimum wage; and over the range observed historically in the United States, the short-run employment effects of moderate increases appear negligible. But doubling the minimum wage nationwide, adding new requirements for employer-funded paid vacations and sick leave, and increasing payroll taxes substantially would take us into uncharted waters. Obviously, these changes would not bring the United States all the way to levels of labor market regulation of many European countries in the 1970s. But they are large enough that one can reasonably fear that they could have a noticeable impact on capacity growth." -- Christina D. Romer and David H. Romer, Senator Sander's Proposed Policies and Economic Growth (5 February 2016) p. 10-11.
First, the reference to "interventions in the labor market" and an "overly regulated labor market" imposes a false dichotomy. An unregulated labor market cannot exist. Certainly this is so in an advanced capitalist economy. Possible choices are among sets of regulations and norms, not among intervention or not. Calling one set of regulations an example of government non-intervention is to disguise taking a side under obfuscatory verbiage.
Second, Romer and Romer presuppose a consensus about the empirical effects of different regulations on the labor market in Europe and the United States that I do not think exists. If I wanted to find empirically based arguments countering Romer and Romer's claim, I would look through back issues of the Cambridge Journal of Economics. Perhaps at least one of these articles might be helpful.
Third, Romer and Romer suggest that, given the set of regulations they like to think of as government non-intervention, markets for labor and goods would have a tendency to clear. Otherwise, economic growth would be jeopardized. No theoretical foundation exists for thinking so.
Even the best mainstream economists seem incapable of writing ten pages without spouting ideological claptrap and propagating silly errors exposed more than half a century ago. Something seems terribly wrong with economics profession.
Monday, February 29, 2016
Conservatism According to Corey Robin
I have been re-reading Corey Robin's The Reactionary Mind. According to this book, defending arbitrary hierarchies is the first priority among conservatives. They believe:
- Workers should obey their masters.
- Wives should obey their husbands.
- Downtrodden ethnic groups should obey socially privileged ethnic groups.
- The laity should obey priests.
- The non-affluent should show proper deference towards those with great wealth (who could never be malefactors).
These hierarchies have implications for daily lives, not just political rule. For the right, liberty is liberty for the rulers to do as they will, not for those who suffer what they must.
I deliberately do not write about slavery. According to Robin, conservatism is literally reactionary. Conservatives defend hierarchies that are currently threatened or recently overthrown. They focus on restoring what was recently lost. Maybe this has something to do with widespread fear and resentment on the right.
Conservatives often do not have admiration for the rulers of the ancient regime. If those rulers were willing to do what needed to be done to preserve their power, the threats would never have gotten so far, and losses would not have been suffered. The conservative, unlike his popular and complimentary image, is willing to make radical changes so as to reconstruct society as it once was. This seems to go along with the awareness of some contemporary neoliberals that market societies are not natural formations, but must be constructed and maintained by state power. But is this aping of the left consistent with the conservative's encouragement of anti-intellectualism and stupidity? Perhaps the idea is that only an elite need understand the goal, while widespread ignorance among the masses can only help the cause.
The hierarchies that conservatives seek to defend or restore are not meritocracies, in the sense that those on top are expected to have superior intellect, wisdom, or morals. Rulers should demonstrate their fitness to rule by seizing what they can, in war or business. Maybe this has something to do with why many conservatives endorse the supposed "free market", without worrying about externalities, information asymmetries, transaction costs, or market power. One can also see here an echo of Friedrich Nietzsche's overman.
Much of the above comes from the introduction and first couple of chapters of Robin's book. Much of the rest consists of case studies of particular thinkers and polemicists.
Reference- Corey Robin (2013). The Reactionary Mind: Conservatism from Edmund Burke to Sarah Palin, Oxford University Press.
Wednesday, February 17, 2016
Classification of Finite Simple Groups: A Proved Theorem?
Figure 1: Lattice Diagram for Group of Symmetries of the Square |
"I shall now mention something I obviously do not understand." - Ian Hacking (2014, p. 18)1.0 Introduction
This has nothing to do with economics. It is my attempt to get my mind around a place where I can get a glimmer of some exciting mathematics being done in my lifetime.
Mathematicians have stated a theorem for classifying finite simple groups. Whether they have proven this theorem is an intriguing question in the philosophy of mathematics.
A finite simple group is a group with a finite number of elements and no proper normal subgroup. This definition contains several technical terms. In this post, I try to explain these terms and the setting of the theorem for classifying simple groups. This preamble raises several questions:
- What is a group? A proper subgroup? A normal subgroup?
- How can a finite, non-simple group be factored into a composition of simple groups?
I try to clarify the answers to these questions by means of a lengthy example. You can probably find this better expressed elsewhere. In working this out, I relied heavily on Fraleigh's textbook, which is the only book in the references that I have read, albeit mostly in the second edition.
2.0 The Group of Symmetries of the SquareA group is a generalization, in some sense, of a multiplication table. Formally, it is a set with a binary operation, in which the binary operation satisfies three axioms. A finite group is a group in which the set contains a finite number of elements.
To illustrate, I consider the set of symmetries of the square (Figure 2). These eight elements of the set are like the numbers along the top and left side of a multiplication table. Each element is an operation that can be performed on a square, leaving the square superimposed on itself. Each operation is described in the right column of Figure 2. The third column provides a picture of the operation. The four vertices of the square are numbered so that one can see the result of the operation. The second column specifies each operation as a permutation of the numbered vertices. The first row in each permutation lists the vertices, while the second row shows which of the original vertices ends up in the place of each vertex. The first column introduces a notation for naming each operation. The remainder of this post is expressed in this notation.
Figure 2: Elements of a Group |
The group operation, *, is function composition. Let a and b be elements of the set {ρ_{0}, ρ_{1}, ρ_{2}, ρ_{0}, μ_{0}, μ_{1}, σ_{0}, σ_{1}}. The product a*b is defined to be the single operation that is equivalent to first performing the operation a on the square and then performing the operation b on the result. (Many textbooks define functional composition from right-to-left, instead.) Table 1 is the multiplication table for this group, under these definitions. For example, rotating a square 90 degrees clockwise twice is equivalent to rotating the square clockwise through 180 degrees. Thus:
ρ_{1} * ρ_{1} = ρ_{2}
* | ρ_{0} | ρ_{1} | ρ_{2} | ρ_{3} | μ_{0} | μ_{1} | σ_{0} | σ_{1} |
ρ_{0} | ρ_{0} | ρ_{1} | ρ_{2} | ρ_{3} | μ_{0} | μ_{1} | σ_{0} | σ_{1} |
ρ_{1} | ρ_{1} | ρ_{2} | ρ_{3} | ρ_{0} | σ_{0} | σ_{1} | μ_{1} | μ_{0} |
ρ_{2} | ρ_{2} | ρ_{3} | ρ_{0} | ρ_{1} | μ_{1} | μ_{0} | σ_{1} | σ_{0} |
ρ_{3} | ρ_{3} | ρ_{0} | ρ_{1} | ρ_{2} | σ_{1} | σ_{0} | μ_{0} | μ_{1} |
μ_{0} | μ_{0} | σ_{1} | μ_{1} | σ_{0} | ρ_{0} | ρ_{2} | ρ_{3} | ρ_{1} |
μ_{1} | μ_{1} | σ_{0} | μ_{0} | σ_{1} | ρ_{2} | ρ_{0} | ρ_{1} | ρ_{3} |
σ_{0} | σ_{0} | μ_{0} | σ_{1} | μ_{1} | ρ_{1} | ρ_{3} | ρ_{0} | ρ_{2} |
σ_{1} | σ_{1} | μ_{1} | σ_{0} | μ_{0} | ρ_{3} | ρ_{1} | ρ_{2} | ρ_{0} |
A group is defined by the following three axioms:
- The binary operation in the group is associative. That is, for all a, b, and c in the group:
(a * b) * c = a * (b * c)
- The group contains an identity element. There exists an element e in the group such that for all a in the group:
e * a = a * e = a
- Every element of the group has an inverse. For all a in the group, there exists an element a^{-1} in the group such that:
a * a^{-1} = a^{-1} * a = e
Associativity is tedious to check for D_{4}. Associativity implies that one can drop parenthesis below. ρ_{0} is the identity element. Every row and column in the multiplication table for D_{4} contains ρ_{0}; thus, every element has an inverse.
An Abelian group is one in which the binary operation is commutative. The group of symmetries of the square is not Abelian. For an Abelian group, the multiplication table is symmetric across the principal diagonal; it does not matter to the result in which order one performs the operation for two arguments. The following two equations illustrates that D_{4} is not Abelian:
μ_{0}*ρ_{1} = σ_{1}
ρ_{1}*μ_{0} = σ_{0}
In words, flipping a square around its horizontal axis of symmetry and then rotating it ninety degrees clockwise is not equivalent to rotating it ninety degrees clockwise and then then reflecting it across that axis. The result of the first composition of operations is equivalent to reflecting the square across the diagonal axis of symmetry running from the south west to the north east. The second composition of operations is equivalent to flipping the square across the other diagonal.
One can also set up equations in a group, for example:
ρ_{1}*ρ_{2}*x = μ_{0}
Then x must be σ_{0}. Solving a Rubik's cube is analogous to solving such an equation.
3.0 Proper and Improper SubgroupsSome rows and columns in Table 1 can stand alone as a group. The entries in these restricted row and columns all appear as headings in the rows and columns. These entries form a subgroup of the original group. One-fourth of the table in the upper left of Table 1 provides an example. {ρ_{0}, ρ_{1}, ρ_{2}, ρ_{3}} is a subgroup of D_{4} (Table 2).
* | ρ_{0} | ρ_{1} | ρ_{2} | ρ_{3} |
ρ_{0} | ρ_{0} | ρ_{1} | ρ_{2} | ρ_{3} |
ρ_{1} | ρ_{1} | ρ_{2} | ρ_{3} | ρ_{0} |
ρ_{2} | ρ_{2} | ρ_{3} | ρ_{0} | ρ_{1} |
ρ_{3} | ρ_{3} | ρ_{0} | ρ_{1} | ρ_{2} |
The group D_{4} has ten subgroups, as shown in the Lattice Diagram in Figure 1 above. Subgroups have been defined such that, for any group G, the group G is a subgroup of itself. Another trivial case, the one-element group consisting of the identity element, also provides a subgroup of G. These two subgroups are known as improper subgroups. All other subgroups are proper subgroups.
One can make a couple of observations about subgroups. The binary operation in the group is the same as the binary operation in the subgroup. The property of associativity carries over from the group to the subgroup. Since a subgroup is a group, it must contain an identity element. And that identity element must also be the identity element for the group containing the subgroup. Thus, every subgroup of D_{4} contains ρ_{0}. Likewise, for every element of a subgroup, the subgroup must also contain its inverse. Finally, the number of elements in a subgroup must evenly divide the number of elements in the group.
I have shown above how the eight elements of D_{4} can be defined in terms of permutations. As a matter of fact, the set of permutations of (1, 2, ..., n) form a group under the operation of function composition. This permutation group is designated as S_{n}, and it contains n! elements. Thus, S_{4} contains 24 (= 4x3x2x1) elements. Not only can one find all the subgroups of D_{4}, one can extend the group such that D_{4} is a subgroup of that extended group.
4.0 Isomorphic GroupsIn a group, the order of rows and columns in the multiplication table are of no matter. Likewise, the names of the elements are irrelevant to the structure of the group. Two groups are isomorphic if the multiplication table for one group can be mapped into the multiplication table for another group by reordering and renaming the elements of, say, the first group. As an example, consider the groups {ρ_{0}, ρ_{2}, μ_{0}, μ_{1}} and {ρ_{0}, ρ_{2}, σ_{0}, σ_{1}}. They each have the same number of elements, which is necessary for an isomorphism. Table 3 defines the group operation for the first group. Suppose that, in Table 3, μ_{0} is renamed σ_{0}, and μ_{1} is renamed σ_{1} throughout. The resulting table will match the operation for the second group. Thus, the two groups are isomorphic.
* | ρ_{0} | ρ_{2} | μ_{0} | μ_{1} |
ρ_{0} | ρ_{0} | ρ_{2} | μ_{0} | μ_{1} |
ρ_{2} | ρ_{2} | ρ_{0} | μ_{1} | μ_{0} |
μ_{0} | μ_{0} | μ_{1} | ρ_{0} | ρ_{2} |
μ_{1} | μ_{1} | μ_{0} | ρ_{2} | ρ_{0} |
The groups in Tables 2 and 3 are NOT isomorphic. They each contain four elements. Each element, however, in the group in Table 3 is its own inverse. This is an algebraic property, preserved no matter how the elements of the group are renamed. And the group in Table 2 does not have this property. As a matter of fact, only two groups containing four elements exist, up to an isomorphism. In other words, any group with four elements is isomorphic to either the group in Table 2 or to the group in Table 3.
Furthermore, only one group, up to isomorphism, contains two elements. Its operation is defined by Table 4. All the subgroups of D_{4} containing two elements are isomorphic to this group and, ipso facto, to each other. The text colors of the subgroups in the lattice diagram (Figure 1) express these isomorphisms.
* | 0 | 1 |
0 | 0 | 1 |
1 | 1 | 0 |
Certain additional patterns are apparent in Table 1. I have already pointed out that the first four rows and columns constitute the subgroup with the operation shown in Table 2. Notice that none of the entries in the last four columns for the first four rows are in this subgroup. Likewise, none of the entries in the first four columns for the last four rows are in this subgroup. On the other hand, the entries in the remaining rows and columns in the lower right are all in this subgroup. Can you see that these observations reveal the pattern expressed in Table 4? Mathematicians express this by saying that the factor group D_{4}/{ρ_{0}, ρ_{1}, ρ_{2}, ρ_{3}} is isomorphic to the group with two elements.
A subgroup is normal if it can be used to divide up the rows and columns in the multiplication table for the group like this. For another example, consider the subgroup {ρ_{0}, ρ_{2}}. Table 5 shows a reordering of the rows and columns in Table 1 to facilitate the calculation of the factor group for this subgroup. Consider dividing this grid up into 16 blocks of two rows and two columns each. Each block will contain two elements of the group D_{4}, and which element is paired with each element does not vary among these blocks.
* | ρ_{0} | ρ_{2} | ρ_{1} | ρ_{3} | μ_{0} | μ_{1} | σ_{0} | σ_{1} |
ρ_{0} | ρ_{0} | ρ_{2} | ρ_{1} | ρ_{3} | μ_{0} | μ_{1} | σ_{0} | σ_{1} |
ρ_{2} | ρ_{2} | ρ_{0} | ρ_{3} | ρ_{1} | μ_{1} | μ_{0} | σ_{1} | σ_{0} |
ρ_{1} | ρ_{1} | ρ_{3} | ρ_{2} | ρ_{0} | σ_{0} | σ_{1} | μ_{1} | μ_{0} |
ρ_{3} | ρ_{3} | ρ_{1} | ρ_{0} | ρ_{2} | σ_{1} | σ_{0} | μ_{0} | μ_{1} |
μ_{0} | μ_{0} | μ_{1} | σ_{1} | σ_{0} | ρ_{0} | ρ_{2} | ρ_{3} | ρ_{1} |
μ_{1} | μ_{1} | μ_{0} | σ_{0} | σ_{1} | ρ_{2} | ρ_{0} | ρ_{1} | ρ_{3} |
σ_{0} | σ_{0} | σ_{1} | μ_{0} | μ_{1} | ρ_{1} | ρ_{3} | ρ_{0} | ρ_{2} |
σ_{1} | σ_{1} | σ_{0} | μ_{1} | μ_{0} | ρ_{3} | ρ_{1} | ρ_{2} | ρ_{0} |
These observations can be formalized by the function defined in Table 6. For an element a of D_{4}, let f(a) denote the map defined in Table 6. To find the value of this function, locate a in the first column. Whether this value is 0, 1, 2, or 3 is determined by the corresponding entry in the second column. For all a and b in D_{4}:
f(a * b) = f(a) o f(b)
A map from one group to another with this property is a homomorphism. An isomorphism is a homomorphism, but a homomorphism is a more general concept. Homomorphisms do not need to leave the number of elements in the group invariant.
Elements of D_{4} | Image |
ρ_{0}, ρ_{2} | 0 |
ρ_{1}, ρ_{3} | 1 |
μ_{0}, μ_{1} | 2 |
σ_{0}, σ_{1} | 3 |
The factor group D_{4}/{ρ_{0}, ρ_{2}} is easily calculated. Replace each element of D_{4} in Table 5 by its image under the homomorphism in Table 6. Collapse each pair of rows and columns. One ends up with Table 7, where I have renamed the group operation, as above. The factor group D_{4}/{ρ_{0}, ρ_{2}} is isomorphic to the group with four elements with the operation shown in Table 3 above. The number of elements in a factor group is the quotient of the number of elements in the original group and the number of elements in the subgroup used to form the factor group.
o | 0 | 1 | 2 | 3 |
0 | 0 | 1 | 2 | 3 |
1 | 1 | 0 | 3 | 2 |
2 | 2 | 3 | 0 | 1 |
3 | 3 | 2 | 1 | 0 |
The two improper subgroups for any group are normal and yield trivial factor groups. The factor group D_{4}/D_{4} is isomorphic to the one-element group whose only member is the identity element. The factor group D_{4}/{ρ_{0}} is isomorphic to D_{4}. The factor groups for improper subgroups provide no information about the structure of a group.
6.0 A Subgroup that is Not NormalNot all subgroups are normal. The subgroup {ρ_{0}, μ_{0}}, for example, is not a normal subgroup of D_{4}. Table 8 proposes a map from the elements of the group to the first four natural numbers. And Table 9 illustrates another reordering of the rows and columns in Table 1, with the entries replaced by the natural numbers to which they map. If one confines oneself to the first two columns, each pair of rows could be collapsed into one, with the label from the row taken from the map. But this process breaks down for the next two and the last two columns.
Elements of D_{4} | Image |
ρ_{0}, μ_{0} | 0 |
ρ_{1}, σ_{0} | 1 |
ρ_{2}, μ_{1} | 2 |
ρ_{3}, σ_{1} | 3 |
* | ρ_{0} | μ_{0} | ρ_{1} | σ_{0} | ρ_{2} | μ_{1} | ρ_{3} | σ_{1} |
ρ_{0} | 0 | 0 | 1 | 1 | 2 | 2 | 3 | 3 |
μ_{0} | 0 | 0 | 3 | 3 | 2 | 2 | 1 | 1 |
ρ_{1} | 1 | 1 | 2 | 2 | 3 | 3 | 0 | 0 |
σ_{0} | 1 | 1 | 0 | 0 | 3 | 3 | 2 | 2 |
ρ_{2} | 2 | 2 | 3 | 3 | 0 | 0 | 1 | 1 |
μ_{1} | 2 | 2 | 1 | 1 | 0 | 0 | 3 | 3 |
ρ_{3} | 3 | 3 | 0 | 0 | 1 | 1 | 2 | 2 |
σ_{1} | 3 | 3 | 2 | 2 | 1 | 1 | 0 | 0 |
Suppose a subgroup contains n elements. To determine if the subgroup is normal, it is sufficient to examine the first n rows and the first n columns in the reordered table. This capability follows from a theorem about what are known as left and right cosets for a subgroup.
The permuation group S_{4} provides another example of a subgroup that is not normal. By my calculations, D_{4} is NOT a normal subgroup of S_{4}.
7.0 The Composition Series of a GroupAt this point, I have completed my explanation of the lattice diagram at the top of this post, including circles, text colors, and boxes. I draw from these results to illustrate how a non-simple group, namely D_{4}, can be expressed as a composition of factor groups.
Table 10 lists twelve series of subgroups of the group of symmetries of the square. Each series has the following properties:
- The leftmost group in the series is the one-element group containing the identity element.
- The rightmost group is D_{4}.
- Each group in the series (except D_{4}) is a proper normal subgroup of the group immediately to the right of it in the series.
A series with these properties is known as a subnormal series of the group D_{4}. If every group in the series is also a normal subgroup of D_{4}, the series is a normal series of the group D_{4}. By the last property in the bulleted list, one can calculate a factor group for each pair of immediately successive groups in the series.
Number Factor Groups | Series | Normal Series |
1 | {ρ_{0}} < D_{4} | Yes |
2 | {ρ_{0}} < {ρ_{0}, ρ_{1}, ρ_{2}, ρ_{3}} < D_{4} | Yes |
2 | {ρ_{0}} < {ρ_{0}, ρ_{2}, μ_{0}, μ_{1}} < D_{4} | Yes |
{ρ_{0}} < {ρ_{0}, ρ_{2}, σ_{0}, σ_{1}} < D_{4} | Yes | |
{ρ_{0}} < {ρ_{0}, ρ_{2}} < D_{4} | Yes | |
3 | {ρ_{0}} < {ρ_{0}, ρ_{2}} < {ρ_{0}, ρ_{1}, ρ_{2}, ρ_{3}} < D_{4} | Yes |
{ρ_{0}} < {ρ_{0}, ρ_{2}} < {ρ_{0}, ρ_{2}, μ_{0}, μ_{1}} < D_{4} | Yes | |
{ρ_{0}} < {ρ_{0}, ρ_{2}} < {ρ_{0}, ρ_{2}, σ_{0}, σ_{1}} < D_{4} | Yes | |
{ρ_{0}} < {ρ_{0}, μ_{0}} < {ρ_{0}, ρ_{2}, μ_{0}, μ_{1}} < D_{4} | No | |
{ρ_{0}} < {ρ_{0}, μ_{1}} < {ρ_{0}, ρ_{2}, μ_{0}, μ_{1}} < D_{4} | No | |
{ρ_{0}} < {ρ_{0}, σ_{0}} < {ρ_{0}, ρ_{2}, σ_{0}, σ_{1}} < D_{4} | No | |
{ρ_{0}} < {ρ_{0}, σ_{1}} < {ρ_{0}, ρ_{2}, σ_{0}, σ_{1}} < D_{4} | No |
The definition of an isomorphism for a subnormal series builds on the definition of isomorphism for groups. Consider the factor groups arising in each series from successive pairs of subgroups in each series. Two series are isomorphic if they contain the same of number of factor groups, in this sense, and these factor groups are isomorphic. The order in which the factor groups arise can vary among isomorphic subnormal series.
I have collected isomorphic series together, in Table 10, by means of horizontal lines in the first column. The series with one factor group is not isomorphic to any other series. The first series shown with two factor groups is not isomorphic to the other three series with two factor groups. And those three series are isomorphic to one another. All of the series with three factor groups are isomorphic to one another.
The series with three factor groups have another property. All factor groups in these series with three factor groups are simple groups. That is, they contain no proper normal subgroups. A subnormal series of a group in which all factor groups formed by the series are simple is known as a composition series. By the Jordan-Hölder Theorem, all compositions series for a group are isomorphic series. This theorem justifies one in speaking of THE composition series for a group. Finding the factor groups in a the composition series for a group is somewhat analogous to factoring a natural number. Note that D_{4} contains eight elements and each of the three factor groups in the composition series contain two elements. Furthermore,
8 = 2^{3}
For a natural number, the prime factors can be combined to yield the original number. Here the analogy apparently breaks down. The factor groups in a composition series for a group constrain the structure of the group, but two non-isomorphic groups can have the same composition series. But still, mathematicians have solved various problems in group theory for finite non-simple groups by use of the classification of finite simple groups.
Composition series apparently have an application in solving polynomial equations. The composition series for the permutation group S_{5} contains a factor group that is non-Abelian. This is connected with the insolvability of the quintic. There are formulas for zeros for cubic and fourth order polynomial, analogous to the quadratic formula. But there is no such formulas for poynomials of the fifth degree and higher.
8.0 Classification of Finite Simple GroupsAt this point, I have explained how finite simple groups arise as factor groups for the composition series of any finite group. I hope that this gives some hint of why the following theorem is of interest.
Theorem: Each finite simple group is one of the following, up to an isomorphism:
- A group of prime order.
- An alternating group.
- A Lie group.
- One of 26 sporadic groups not otherwise classified.
I am aware that this this theorem uses technical terms I still have not explained, including one that I simply do not understand myself.
The sporadic groups are finite simple groups that do not fall into the other categories, although, I gather, some sporadic groups are related to one another.The sporadic group with the largest number of elements is called the Monster group. It has 808,017,424,794,512,875,886,459,904,961,710,757,005,754,368,000,000,000 elements.
9.0 History of the TheoremIn 1972, Daniel Gorenstein proposed that mathematicians could complete a classification of all simple groups. By the early 1980s, mathematicians had stated the theorem and those specialists who had pursued Gorenstein's program believed they had proven it. The proof, however, was scattered among (tens of?) thousands of pages in hundreds(?) of papers in many mathematics journals. No one person had probably ever understood the proof or read it in its entirety.
The proof, however, was discovered even then to be incomplete. Steve Smith and Michael Aschbacher worked on closing this gap, relating to quasithin groups. They succeeded by 2004.
Meanwhile, a number of mathematicians have been trying to simplify the proof and to restate it in one location. The ambition of these mathematicians is to produce a "second generation" proof of only a couple thousand pages.
Has a theorem been proven if only one or two mathematicians have read the proof in its entirety? How about if nobody has, which would have been the case in the 1980s if the proof had indeed been valid? Certainly, the proof of the classification theorem is not surveyable, in Wittgenstein's sense. Do mathematical results need to be established by a social process? If so, how can such social processes be characterized?
Appendix: Terms Defined or Illustrated AboveAbelian group, Associativity, Composition Series, Factor Group, Finite Group, Group, Homomorphism, Identity Element, Improper Subgroup, Inverse, Isomorphic Groups, Isomorphic Subnormal Series, Lattice Diagram, Normal Series, Normal Subgroup, Permutation Group, Proper Subgroup, Subgroup, Subnormal Series.
References- Michael Aschbacher (2004). The Status of the Classification of the Finite Simple Groups, Notices of the AMS, V. 51, No. 7 (Aug.): pp. 736-740.
- Michael Aschbacher, Richard Lyons, Stephen D. Smith, and Ronald Solomon (2011). The Classification of Finite Simple Groups: Groups of Characteristic 2 Type, American Mathematical Society.
- Nicolas Bourbaki (1943). Elements of Mathematics: Algebra I: Chapters 1-3.
- J. H. Conway and S. P. Norton (1979). Monstrous Moonshine, Bulletin of the London Mathematical Society, V. 11, no. 3: pp. 308-339.
- John B. Fraleigh (2002). A First Course in Abstract Algebra, 7th Edition, Pearson.
- Daniel Gorenstein, Richard Lyons, and Ronald Solomon (1994). The Classification of the Finite Simple Groups, American Mathematical Society.
- Ian Hacking (2014). Why is there Philosophy of Mathematics at all?, Cambridge University Press.
- Daniel Kunkle and Gene Cooperman (2007). Twenty-Six Moves Suffice for Rubik's Cube, ISSAC'07, 29 Jul. - 1 Aug., Waterloo, Canada.
- Tomas Rokicki (2008). Twenty Five Moves Suffice for Rubik's Cube.