Nothin' Matters and What if it Did?: May 2011

Tuesday 31 May 2011

San Francisco Circumcision Ban

San Francisco is going to put a circumcision ban on the ballot in November. Unlike other recent cases of gov't meddling in San Francisco, e.g., the "happy meal" restrictions, I think this is a sensible law to consider. It's the kind of situation in which the government really can play a useful protective role in a potentially physically abusive situation. It's hard to think of circumcision as abuse mostly because it's so common, but given the intense pain that it causes and the lack of any clear medical justification, a very strong case can be made that it is, in fact, abusive. Of course, some would argue that there are medical benefits associated with circumcision, but, as noted in the linked article, these data are unclear and not adequate to have resulted in medical associations recommending it.

Sometimes there are good health reasons for circumcision but the actual bill provides for medical exceptions: "A surgical operation is not a violation of this section if the operation is necessary to the physical health of the person on whom it is performed because of a clear, compelling, and immediate medical need with no less-destructive alternative treatment available, and is performed by a person licensed in the place of its performance as a medical practitioner"

Some will object that this violates freedom of religion. But it's hard to take such arguments seriously. We don't allow other kinds of abusive actions in the name of freedom of religion. Nobody gets to cane their children and point to Bible passages about sparing the rod and spoiling the child. Religious freedom doesn't trump freedom of children to be kept free of intense unnecessary pain. We don't let religious practice trump child welfare considerations in the case of female circumcision.

I'm not sure that the ban is a good idea, but I think that whether or not it's a good idea turns on the question of whether or not clear medical benefits exist, not the fact that it's common or that it's an important religious practice. Without the existence of demonstrable medical benefits, those factors should only motivate the need to protect children from well-meaning parents who might have their sons circumcised for the wrong reason.

Wednesday 25 May 2011

Conservatives and Relativism

I remember that conservatives used to portray liberals as peddlers of relativism. But it seems to me that conservatives have become much less staunch defenders of objective truth, resting content with their own version of relativism. A few examples that I've recently encountered: (1) It's interesting to observe O'Reilly in his 2007 interview with Dawkins. O'Reilly doesn't attempt to offer any evidence for his religious beliefs; he holds them simply "because they help him as a person" and because "they're true for me". Dawkins takes him to task for his namby-pamby milquetoast relativism (starts around 2:20). (2) I was struck by a similar impression in reading the article "Transgender Clownfish Controversy". A conservative organization objects to the teaching of the possibility of more than two genders because this "does not represent the values of the majority of families in Oakland", as if the truth and what we're to teach were defined by parental values rather than objective facts. I suppose it has all gone downhill ever since the Bush administration plunged into Marxist "reality construction" with their infamous "We're an empire now, and when we act, we create our own reality" (see this old post).

Thursday 19 May 2011

Ben Stein's Elitism Defence of Strauss Kahn

Ben Stein has written what I think is a really atrocious call for "perspective" on the Strauss-Kahn case, "Presumed Innocent Anyone". Despite the title, it's not an article about a failure to presume legal innocence, it's just a set of odd attempts to cast aspersions on the allegations in the case based, it appears, on the fact that Strauss-Kahn is rich and powerful and his accuser is a hotel maid. Most of his points are laughable but I'll pick out a few of my favourites:

1) "Can anyone tell me any economists who have been convicted of violent sex crimes? Can anyone tell me of any heads of nonprofit international economic entities who have ever been charged and convicted of violent sexual crimes?" He's likely innocent because he's an economist or because he's head of the IMF. By the same argument, I suppose, we needn't have bothered attempting to prosecute OJ Simpson (can anyone tell me of any sports broadcasters who'd been charged and convicted of violent murders?) or Martha Stewart (can anyone tell me of any home decorating gurus convicted of insider trading) or the Menendez brothers (can anyone tell me of any wealthy suburban kids who'd been charged and convicted of violent patricide and matricide?)

2) "The prosecutors say that Mr. Strauss-Kahn "forced" the complainant to have oral and other sex with him. How? Did he have a gun? Did he have a knife? He's a short fat old man." Given the account of the struggling that went on and the fact that this is alleged to have occurred in a large suite, this point doesn't make much sense. But note that he's 62, not 82, and 5'7", about 3" taller than the average US female, and his pictures suggest he's not obese or anything, but quite burly. We know nothing of the size of this maid, but I wouldn't be surprised to learn that she weighed less and was shorter. Also, note that the charge is attempted rape, not rape, due in part, by the account given, to the fact that the maid was partially successful in fighting him off.

3) "People accuse other people of crimes all of the time. What do we know about the complainant besides that she is a hotel maid? I love and admire hotel maids. They have incredibly hard jobs and they do them uncomplainingly. I am sure she is a fine woman. On the other hand, I have had hotel maids that were complete lunatics, stealing airline tickets from me, stealing money from me, throwing away important papers, stealing medications from me. How do we know that this woman's word was good enough to put Mr. Strauss-Kahn straight into a horrific jail?" We should likely ignore her, or give him some super duper benefit of the doubt, because she's a maid and there have been instances of maids behaving poorly? I would add that history is replete with examples of rich, powerful people behaving poorly, but I don't think that this constitutes extra evidence of Strauss Kahn's guilt, any more than examples of hotel maids, other than the accuser, behaving poorly gives good reasons to think him innocent.

4) But then Ben Stein stops with all the subtlety and lays his cards on the table: this is not a case of rape, it's really a case of the poor unjustly attacking the rich. How do we know this? Because news articles have mentioned that Strauss Kahn was staying in a $3000/night hotel room. "In what possible way is the price of the hotel room relevant except in every way: this is a case about the hatred of the have-nots for the haves, and that's what it's all about." (emphasis added)

ETA: Jon Stewart comments on same.

Saturday 7 May 2011

2011 Canada Election Results under AV

A few weeks ago, I tried to do some analysis of how the outcome of the 2008 election might have changed under alternative vote. Today I ran the same process using data from the 2011 election (I used preliminary results where the verified results weren't yet ready) and voter second choice data (see slide 5 in the gallery section) from a slightly more recent survey than the one I used in the 2008 analysis. The Procedure and Politics blog did a similar analysis recently; I'm rerunning the most recently updated numbers using the script developed earlier.
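The methodology itself is described in the earlier 2008 post and is not reproduced here. Purely as an illustration, the sketch below shows one way a single riding could be recounted from first-choice totals plus party-level second-choice shares. The function name, the sample numbers, and the simplification that every ballot held by an eliminated party transfers according to that party's second-choice shares (with the remainder treated as exhausted) are all assumptions; this is not the actual script.

```python
def av_winner(first_choice, second_choice):
    """Return the winner of one riding under a simplified alternative vote.

    first_choice:  dict mapping party -> first-preference votes.
    second_choice: dict mapping party -> {other party: fraction of that
                   party's voters whose next preference goes there}.
                   Fractions that point at already-eliminated parties,
                   or that are missing, are treated as exhausted ballots.
    """
    tallies = {party: float(votes) for party, votes in first_choice.items()}
    remaining = set(tallies)

    while True:
        live_total = sum(tallies[p] for p in remaining)
        leader = max(remaining, key=lambda p: tallies[p])
        # A party wins once it holds more than half of the live ballots,
        # or when it is the only party left.
        if len(remaining) == 1 or tallies[leader] > live_total / 2:
            return leader

        # Eliminate the weakest party and pass its ballots along.
        loser = min(remaining, key=lambda p: tallies[p])
        remaining.remove(loser)
        for party, share in second_choice.get(loser, {}).items():
            if party in remaining:
                tallies[party] += tallies[loser] * share
        tallies[loser] = 0.0


# Hypothetical riding: the plurality winner loses once transfers are counted.
riding = {"Consv": 40, "Lib": 35, "NDP": 25}
transfers = {"NDP": {"Lib": 0.6, "Consv": 0.1}}
print(av_winner(riding, transfers))
```

In this made-up riding the Conservatives lead on first preferences, but the NDP transfers push the Liberals past half of the live ballots, which is the kind of flip listed in the table of changed ridings.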

Here were the results in this updated simulation. See the earlier 2008 analysis for a description of the methodology.

Party	Original	In AV Sim	% Seats won	% Seats AV	Popular Vote
Conservatives	166	144	0.54	0.47	0.4
NDP	103	114	0.33	0.37	0.31
Lib	34	48	0.11	0.16	0.19
BQ	4	1	0.01	0.003	0.06
Green	1	1	0.003	0.003	0.04

In this scenario, the Conservatives do not attain a majority; they get 22 fewer seats. (I'm reading slightly different reports on final numbers of seats, so there may be a difference of a seat or two based on validated final results.) The Liberals pick up 14 and the NDP picks up 11. The BQ drops down to a single seat. Also note that while the percentage of seats won more closely mirrors popular vote for the Conservatives and the Liberals in the AV case, that's not true for the NDP, whose seat share moves further above its popular vote (0.37 vs. 0.31, compared with 0.33 under FPTP), or for the BQ, whose seat percentage more closely ties with popular vote in FPTP. The BQ case is a good reminder that AV isn't necessarily of much utility to some smaller fringe parties with heavily concentrated support in relatively diverse ridings.
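The seat-share comparison can be checked directly from the table. The sketch below recomputes each party's share of seats under both systems and how far each share sits from the popular vote. The numbers are copied from the table above; the variable and function names are just illustrative.

```python
# party: (seats won, seats in AV simulation, popular vote share), from the table
results = {
    "Conservatives": (166, 144, 0.40),
    "NDP": (103, 114, 0.31),
    "Lib": (34, 48, 0.19),
    "BQ": (4, 1, 0.06),
    "Green": (1, 1, 0.04),
}

# Total seats, derived from the "Original" column rather than hard-coded.
total_seats = sum(won for won, _, _ in results.values())


def seat_share_gaps(results):
    """For each party, return (gap under FPTP, gap under AV), where a gap
    is the absolute difference between seat share and popular vote share."""
    gaps = {}
    for party, (won, av, vote) in results.items():
        fptp_gap = abs(won / total_seats - vote)
        av_gap = abs(av / total_seats - vote)
        gaps[party] = (fptp_gap, av_gap)
    return gaps


for party, (fptp_gap, av_gap) in seat_share_gaps(results).items():
    closer = "AV" if av_gap < fptp_gap else "FPTP" if fptp_gap < av_gap else "tie"
    print(f"{party:14s} FPTP gap {fptp_gap:.3f}  AV gap {av_gap:.3f}  closer: {closer}")
```

A smaller gap means the seat share tracks the popular vote more closely under that system.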

The ridings that changed in the calculations I ran:

District Number	Name	Winning Party	Recalculated
10004	Labrador	Consv	Lib
12009	South Shore--St. Margaret's	Consv	NDP
13007	Moncton--Riverview--Dieppe	Consv	Lib
24002	Ahuntsic	BQ	NDP
24036	Lotbinière--Chutes-de-la-Chaudière	Consv	NDP
24054	Bas-Richelieu--Nicolet--Becancour	BQ	NDP
24055	Richmond--Arthabaska	BQ	NDP
24075	Westmount--Ville-Marie	Lib	NDP
35006	Bramalea--Gore--Malton	Consv	NDP
35016	Don Valley East	Consv	Lib
35017	Don Valley West	Consv	Lib
35022	Etobicoke Centre	Consv	Lib
35023	Etobicoke--Lakeshore	Consv	Lib
35039	Kitchener--Waterloo	Consv	Lib
35043	London North Centre	Consv	Lib
35048	Mississauga East--Cooksville	Consv	Lib
35057	Nipissing--Timiskaming	Consv	Lib
35072	Pickering--Scarborough East	Consv	Lib
35079	Sault Ste. Marie	Consv	NDP
35081	Scarborough Centre	Consv	Lib
35100	Willowdale	Consv	Lib
46005	Elmwood--Transcona	Consv	NDP
46014	Winnipeg South Centre	Consv	Lib
47005	Palliser	Consv	NDP
59031	Vancouver Island North	Consv	NDP
60001	Yukon	Consv	Lib

Data is available here: link (see the "WithAV" tab for the numbers w/ the changes)

Sunday 1 May 2011

A proposed criterion for what counts as semantic in the Semantic Web

Last year Richard MacManus introduced the Modigliani test as a "semantic web tipping point". His argument was, "The tipping point for the long-awaited Semantic Web may be when you can query a set of data about someone not too famous, and get a long list of structured results in return." This would represent significant progress in the implementation of semantic web technology and, to be sure, structured data would be very helpful in realizing an effective implementation of the semantic web. Being able to integrate this information from disparate sources is exactly the sort of thing that the implementation of the semantic web is supposed to help us realize.

However, this example as a semantic web tipping point gives pause because realizing it doesn't, to my mind, necessarily require too much in terms of semantics. Similarly, many alleged examples of semantic search leave me unsatisfied. Consider a recent discussion entitled "Exploring the Semantics of Yahoo Direct Search" -- the discussion points to the categorization of results and the ability to auto-complete queries and/or hint at the results they might produce. But none of these things seem particularly semantic or at least not necessarily semantic.

Of course, "semantics" is notoriously vague and under-specified in the context of semantic web and semantic search discussions. It certainly doesn't mean the same thing as what we're discussing when we consider, say, the semantics of first order logic. Rather, it usually means something vaguer like implementing the concepts that tokens denote rather than the tokens themselves in search and in representation. But even given this vague notion, I think that there's a relatively clear criterion, (one that, I might add, many extant alleged semantic search and semantic web implementations fail to meet), that will allow us to pass over the linked data vs. RDF debate and jump to the very crux of the matter with respect to semantics. I propose that a representation and query system is semantic to the extent that it's able to identify correct or useful query responses despite the fact that some terms in the query or salient query disjunct are not present in the response. That, to my mind, is a fair and interesting test of the extent to which the system is implementing the concepts that tokens represent rather than just the tokens themselves.
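To make the criterion concrete, here is a minimal sketch of how one might measure it for a single query and response. It only reports which query terms are literally absent from a response; whether the response is actually correct or useful still has to be judged separately, since token matching cannot decide that. The function names and the crude tokenizer are assumptions for illustration.

```python
import re


def tokens(text):
    """Lower-cased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def absent_query_terms(query, response):
    """Query terms that do not literally occur in the response."""
    return tokens(query) - tokens(response)


def absent_fraction(query, response):
    """Share of query terms missing from the response (0.0 to 1.0).
    For a response already judged relevant, a higher value means the
    match owes more to concepts and less to shared tokens."""
    query_terms = tokens(query)
    if not query_terms:
        return 0.0
    return len(absent_query_terms(query, response)) / len(query_terms)


print(absent_query_terms("heart disease drug", "Aspirin after myocardial infarction"))
print(absent_fraction("car parts", "Cheap auto parts"))
```

A relevant response with a high absent fraction is the interesting case: the system found it without relying on the query's own tokens.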

The condition is, I realize, neither necessary nor completely sufficient* to establish the presence of the implementation of semantics, but it is a fairly strong and reliable indicator. It's difficult to realize this condition without being able to do some reasoning about the concepts in the query. As such, I would suggest there are really a series of "tipping points" or at least types of queries and responses that realize this condition that would suggest we're well on the way to realizing a truly semantic web. Here's a set of example queries, or query types, and the kinds of results they'd need to meet the criterion.

Synonyms: To my mind, the simplest search implementation that would meet the criterion and have a legitimate claim at implementing semantics is any system that will recognize synonyms. Google already implements a simple form of synonym matching. Synonym recognition doesn't require extensive understanding of meaning but it does require some sort of semantic model for supplementing search. An example: searches for 'car parts' that return results containing phrases like 'automotive parts' or 'auto parts'.
Subclass: Search engine users have high expectations of search engines in terms of natural language understanding. However, they tend to be very forgiving of the fact that state of the art search engines are for the most part completely incapable of doing any sort of subtype reasoning, although natural language questions do involve this. Why shouldn't we expect a query on, say, "graph traversal algorithm AND scripting language" to be able to identify documents discussing a depth first search algorithm written in Perl? Querying over subtypes is a far better test for the implementation of semantics and has the potential to make the querying system far more powerful as an information retrieval tool. Simple examples include queries such as 'heart disease drug' and getting results without 'heart disease' or "drug" but containing instead, for example, 'Myocardial infarction' and "aspirin". Or we might imagine queries for 'vegetable side dishes for poultry' returning documents lacking those terms but returning references to green bean casserole recipes to accompany turkey. Of course, it's worth noting that such semantic search tools exist already and don't require the maturity of the more formal semantic web to be realized. Consider for example, a search for 'heart disease drug' in Search Medica or a search for 'meat with vegetables' in Yummly.
Instances: We can also imagine a search system that allowed us to search 'NHL team bankruptcy' and returned documents about, say, the Buffalo Sabres financial plight of some years ago even if the document failed to contain the phrase 'NHL team', i.e., based on recognition that 'Buffalo Sabres' is an instance of NHL team. Or, why shouldn't we expect a search tool to allow us to query "SCOTUS judges Harvard" and be able to retrieve documents containing references to Harvard and particular SCOTUS judges?
Parthood: Another useful kind of subsumption reasoning is the recognition of parthood. This would be particularly useful for queries referring to geographical entities, e.g., in travel queries, "find airports in Northeastern USA". Other examples include a search for 'baseball teams in Southern United States' that recognized that references to, say, Alabama, are relevant or queries on 'cancer treatment in Canada' that recognized references to British Columbia as potentially salient.
Negation reasoning: Another particularly useful test for semantics is the ability to do actual conceptual negation in a query. For example, I often like to search for soup recipes that don't include meat. However, searches for 'soup NOT meat' typically only exclude results containing the token "meat"; it would be far more useful if they also left out results mentioning "chicken", "beef", etc.
Common sense/rule following: In a recent article about the ITA Software acquisition, a Google VP, Jeff Huber, asked "How cool would it be if you could type ‘flights to somewhere sunny for under $500 in May’ into Google and get not just a set of links but also flight times, fares and a link to sites where you can actually buy tickets quickly and easily?" This, I would argue, is an excellent "tipping point" query for the semantic web. While linked data is required for such a query, it wouldn't be sufficient. Recognizing which locations satisfied 'somewhere sunny' would indeed be indicative that the system is implementing semantics.
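
Several of the query types above (synonyms, subclasses, instances, parthood and conceptual negation) can be sketched with one small mechanism: expanding each query term to everything it subsumes before matching. The sketch below is a toy under stated assumptions. The `RELATED` table is hand-built and hypothetical, standing in for whatever thesaurus or ontology a real system would use, and the function names are made up for illustration. The "somewhere sunny" kind of query is out of scope here.

```python
import re

# Hypothetical hand-built relations: each term maps to terms it subsumes.
RELATED = {
    "car": {"auto", "automotive"},               # synonyms
    "heart disease": {"myocardial infarction"},  # subclass
    "drug": {"aspirin"},                         # subclass
    "nhl team": {"buffalo sabres"},              # instance
    "canada": {"british columbia"},              # part-of
    "meat": {"poultry", "beef"},                 # subclass
    "poultry": {"chicken", "turkey"},            # subclass of a subclass
}


def expand(term):
    """The term plus everything reachable from it through RELATED."""
    seen, stack = set(), [term.lower()]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(RELATED.get(current, ()))
    return seen


def mentions(text, term):
    """True if the text contains the term or anything it subsumes,
    matched as whole words or phrases."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
        for phrase in expand(term)
    )


def search(docs, include, exclude=()):
    """Documents that mention every included concept and no excluded one."""
    return [
        doc for doc in docs
        if all(mentions(doc, term) for term in include)
        and not any(mentions(doc, term) for term in exclude)
    ]


recipes = ["Chicken noodle soup", "Lentil soup", "Beef stew"]
print(search(recipes, ["soup"], exclude=["meat"]))
```

Because the exclusion is applied to the expanded concept rather than the token, the 'soup NOT meat' query drops the chicken recipe even though the word "meat" never appears in it.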

There are, of course, lots of improvements to be realized in search that don't meet the criterion I've spelled out. Improvements in ranking and categorizing and search suggestion and result extraction may, in fact, be of as much utility as improvements that implement these kinds of semantics. I just wish we'd stop using "semantics" for any kind of addition of structured data to documents or results.

*To the sufficiency question, there are some query response systems that meet the criterion I propose but which fail to be "semantic" in any reasonable sense of the term. Matches on stemming variants of the query terms probably aren't good examples. Nor are search tools that allow us to constrain dates and values, e.g., a Craigslist search that allows one to specify a maximum price (or age) or a newspaper archive search allowing me to constrain dates. Any satisfaction of the criterion that is realized almost solely via arithmetic probably doesn't count.
