The two alternatives to the monasterisation of the World Wide Web

Saint Michael’s Abbey, in the Susa Valley, Piedmont. Source: Wikipedia.

In medieval Europe, information was physically concentrated in a few secluded libraries and archives. Powerful institutions managed them and regulated who could access what. The library of the fictional abbey described in Umberto Eco’s The Name of the Rose is located in a fortified tower, and only the librarian knows how to navigate its mysteries. Monasteries played an essential role in preserving written information and creating new intelligence from that knowledge. But because written information was a scarce resource, the keys to the libraries also brought authority and power. Similarly, Internet companies are amassing information within their fortified walls. In so doing, they provide services that we now see as essential, but they also contravene the two core principles of the Internet: openness and decentralisation.


Monday, 7 May 2018

Local participation and not unemployment explains the M5S result in the South

The abundance of economic data and the scarcity of social data with a comparable level of granularity are a problem for the quantitative analysis of social phenomena. I argue that this fundamental problem has misled both the analysis of the electoral results of the Five Star Movement (M5S) and their interpretation. In this article, I provide statistical evidence suggesting that, in the South, unemployment is not associated with the exceptional increase in support for the M5S and that local participation is a stronger predictor of support than most demographic variables.

What happened

The 2018 Italian general elections (elections, since both the Chamber of Deputies and the Senate were renewed) saw

  1. a significant increase in the number of votes for two parties, the Five Star Movement (M5S) and the League (formerly Northern League),

  2. an increase in the importance of geography as an explanatory dimension for the distribution of votes.

The following two maps show where the M5S and the League increased their electoral support between 2013 and 2018. (Electoral data always refer to the election of the Chamber of Deputies.)

Vote difference: 2018-2013 (a few communes have not reported all the results, notably Rome)


The geographic pattern is quite simple. The M5S has increased its support in the South and maintained its votes in the North, while the League has significantly strengthened its support in the North and has also collected votes in the South, where it previously had virtually no support. The third and fourth most voted parties, the Democratic Party (PD) and Berlusconi’s Forza Italia (FI), have lost votes almost everywhere. If we map the results of the four parties side by side on the same scale, the PD and FI almost fade into the background.

Votes in the 2018 General elections

Yet major metropolitan areas do not always follow the national trend. While Naples unambiguously voted M5S, in Turin, Milan and Rome the Democratic Party was the most voted party in the wealthiest districts.

Votes in the 2018 General elections (Clock-wise from top-left: Turin, Milan, Naples, Rome)

The density of the distribution of results at the commune and sub-commune level in the macro regions indicates that, while the M5S dominates electorally in the South and in the two major islands, the League is the most popular party in the North.

Distribution of votes at commune or sub-commune level

The territoriality of the results, especially along the North-South dimension, makes the analysis especially complicated. This is because the strong result of the League in the North and of the M5S in the South might simplistically suggest that immigration (which is much stronger in the North) explains the League’s result in the North, and that unemployment and poverty (stronger in the South) explain the M5S’s result in the South. This reading is especially attractive since immigration and the M5S proposal to introduce a guaranteed minimum income dominated the campaign.


Tuesday, 20 March 2018

2018 Italian general election: Details on my simulation

This article describes the simulation behind the app that you find here

This simulation of the results for the 2018 general election is based on the results of the last two national elections (the 2013 Italian parliamentary election and the 2014 European Parliament election) and on national polls conducted until 16 February 2018. The simulation rests on one assumption, which is reasonable but not necessarily realistic: the relative territorial strength of parties is stable. It follows from this assumption that if the national support for a party (as measured by national voting intention polls) varies, it varies consistently and proportionally everywhere. A rising tide lifts all boats, and vice versa. The assumption has some empirical justification. If we compare the difference from the national support (in percentage) for each district in 2013 and 2014, we see a significant correlation, especially for the major parties.
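The proportionality assumption can be sketched in a few lines of Python. This is only a minimal illustration and not the code behind the app: the function name, the party labels and all the figures are hypothetical, and the final renormalisation step is an added assumption so that the projected shares sum to one.

```python
def project_district_shares(prev_district, prev_national, poll_national):
    """Scale each party's past district share by its national swing ratio."""
    projected = {}
    for party, share in prev_district.items():
        # Relative territorial strength: district share over national share.
        strength = share / prev_national[party]
        # Assumed stable, so it is applied to the current national poll.
        projected[party] = strength * poll_national[party]
    # Renormalise so the projected shares sum to 1 (an added assumption).
    total = sum(projected.values())
    return {party: value / total for party, value in projected.items()}


# Hypothetical figures, for illustration only.
prev_national = {"A": 0.25, "B": 0.30, "C": 0.45}
poll_national = {"A": 0.30, "B": 0.24, "C": 0.46}
prev_district = {"A": 0.35, "B": 0.20, "C": 0.45}

shares = project_district_shares(prev_district, prev_national, poll_national)
```

In this toy district, party A was already stronger than its national average, so a national rise in the polls lifts it further, while party B shrinks in line with its national decline.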

Votes to party in the 2018 Chamber districts


Tuesday, 27 February 2018

Quick analysis of the Italian referendum results

The 2016 Italian referendum torpedoed the constitutional reform presented by the government led by Matteo Renzi (41). According to the final count, which includes 1.2 million votes cast overseas, the reform was rejected by almost 60% of the voters.

Three parties played a predominant role during the electoral campaign: the ruling Democratic Party (PD), led by the head of government Renzi; the Five Star Movement (M5S), founded and led by Beppe Grillo (68); and the Lega Nord (LN), led by Matteo Salvini (43). The fourth Italian party, Forza Italia, played a minor role for various reasons, including the health of Silvio Berlusconi (80).


Monday, 5 December 2016

What we can learn from the M5S

I have read, and am replying to, Massimo Mantellini’s post (Il M5S, il wifi e il principio di precauzione), which notes with concern how the Movement has brought anti-scientific positions into Parliament, thereby in some way legitimising them; a “toxic, banal and in its own way unassailable way of thinking, which harms the whole country”.

The Five Star Movement, with an electoral base of between 25 and 30% (8.5-10 million people), is necessarily complex in terms of demographic representation and diversity of opinion. Allowing for an abstention rate of 25%, if you are queuing at the supermarket, about two of the 10 people ahead of you vote M5S. Unfortunately, this complexity rarely comes through in journalistic narratives, and those who report the news tend (too) often to prefer caricature (the tinfoil hat or the trip to North Korea, so to speak). But this kind of reporting is wrong: first because it distorts by simplifying, and second because it encourages clownish, grotesque and slovenly behaviour from those who, sitting in crowded institutions, are looking for visibility.


Friday, 22 July 2016

Road to Rome: The organisational and political success of the M5S

The Five Star Movement (M5S) obtained two major victories in the second round of municipal elections on 19 June 2016 in Rome and Turin. Rome attracted the most international attention but it is M5S’ victory in Turin that is likely the most consequential for them and other European anti-establishment parties.

In Rome, a municipality with 2.8 million people and an annual budget of €5 billion, Virginia Raggi (age 37) won twice as many votes as her opponent Roberto Giachetti (age 55). In Turin, a city with a population of 900,000 and an annual budget of €1.69 billion, Chiara Appendino (age 31) outstripped Piero Fassino (age 66) by about 10 percentage points.

Continue reading on Pop Politics Aus

Friday, 8 July 2016

Explicit semantic analysis with R

Explicit semantic analysis (ESA) was proposed by Gabrilovich and Markovitch (2007) to compute a document’s position in a high-dimensional concept space. At its core, the technique compares the terms of the input document with the terms of the documents describing the concepts, and so estimates the relatedness of the input document to each concept. In spatial terms, if I know the relative distance of the input document from meaningful concepts (e.g. ‘car’, ‘Leonardo da Vinci’, ‘poverty’, ‘electricity’), I can infer the meaning of the document relative to these explicitly defined concepts from its position in the concept space.
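The idea can be sketched with a toy example. The sketch below is in plain Python rather than R and is not the implementation discussed in the post: the function names are hypothetical, the three one-line concept descriptions stand in for a real knowledge base (Gabrilovich and Markovitch used Wikipedia articles), and the TF-IDF weighting is one simple choice among several.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer, deliberately naive."""
    return text.lower().split()


def build_term_concept_weights(concepts):
    """Return {term: {concept: tf-idf weight}} from the concept descriptions."""
    counts = {name: Counter(tokenize(text)) for name, text in concepts.items()}
    n_concepts = len(concepts)
    # Document frequency: in how many concept texts each term appears.
    df = Counter(term for c in counts.values() for term in c)
    weights = {}
    for name, c in counts.items():
        for term, tf in c.items():
            idf = math.log(n_concepts / df[term])
            weights.setdefault(term, {})[name] = tf * idf
    return weights


def esa_vector(text, weights, concept_names):
    """Position of a text in concept space: sum of its terms' concept weights."""
    vector = dict.fromkeys(concept_names, 0.0)
    for term, tf in Counter(tokenize(text)).items():
        for name, w in weights.get(term, {}).items():
            vector[name] += tf * w
    return vector


# Toy concept descriptions, for illustration only.
concepts = {
    "car": "car engine wheel road driver fuel",
    "electricity": "electricity current voltage wire power energy",
    "poverty": "poverty income hunger inequality deprivation",
}
weights = build_term_concept_weights(concepts)
vec = esa_vector("the driver stopped the car to buy fuel", weights, list(concepts))
closest = max(vec, key=vec.get)
```

Here the input sentence shares terms only with the ‘car’ description, so ‘car’ is its closest concept. Two documents could then be compared through the cosine similarity of their concept vectors.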


Tuesday, 26 April 2016

Italy’s Five Star Movement – a spectral analysis of its political composition

Talking about the identity and soul of the Five Star Movement (M5S) is not only politically contentious but also practically challenging, because the M5S has been developing along at least three different axes: the vertical top-down axis from Beppe Grillo to his followers (and sympathising voters), the horizontal axis connecting thousands of militants across the country to local, flexible and loosely organised meetups, and finally the cloudy axis linking Internet users through the different online communication platforms pertaining to the Movement. The academic literature and the media have mainly been interested in mapping the provenance of votes. Here I will also try to show some data on the position of the M5S, derived from its 2013 electoral program, and on the political background of both the onsite and the online activists of the Movement.

But let’s start by briefly introducing the trajectory of a movement that vehemently refuses to be called a party or to be associated with any traditional political identity.

Continue reading on the blog of the WZB.

Tuesday, 12 May 2015

NDVI, risk assessment and developing countries

The Normalized Difference Vegetation Index (NDVI) estimates the greenness of plants covering the surface of the Earth by measuring the light reflected by the vegetation into space. The main idea behind the NDVI is that visible and near-infrared light are absorbed in different proportions by healthy and unhealthy plants: a healthy green plant will reflect 50% of the near-infrared light it receives and only 8% of the visible light, while an unhealthy plant will reflect 40% and 30% respectively. NDVI can then be used to quantitatively compare vegetation conditions across time and space (and it is indeed widely used: a Google Scholar search on NDVI produced 60,500 hits).
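As a worked example, the index is the difference between near-infrared and visible reflectance divided by their sum, NDVI = (NIR - VIS) / (NIR + VIS). The short Python sketch below applies this standard formula to the two reflectance profiles mentioned above; the function and variable names are only illustrative.

```python
def ndvi(nir, visible):
    """Normalized difference of near-infrared and visible reflectance."""
    return (nir - visible) / (nir + visible)


# Reflectance fractions taken from the example in the text.
healthy = ndvi(nir=0.50, visible=0.08)    # roughly 0.72
unhealthy = ndvi(nir=0.40, visible=0.30)  # roughly 0.14
```

The index is bounded between -1 and 1, and the gap between the two values is what makes vegetation conditions comparable across places and dates.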


Thursday, 14 February 2013

Eyes on Guatemala

The Economist has published an article on malnutrition in Guatemala. Hunger is not new in the country: with half of its children not eating enough, Guatemala is the sixth-worst country in the world, and in some Maya communities chronic child malnutrition can reach 75% (The Economist says 80%). These figures are astonishing, especially because the problem is not food scarcity.

But this, too, is hardly new. It was 1981 when Amartya Sen published his Poverty and Famines: An Essay on Entitlement and Deprivation, demonstrating that hunger is mostly caused by inequality rather than scarcity. There is no lack of food in Guatemala if you have the money to buy it. As we speak, the 14th Festival Gastronómico Internacional is taking place in Guatemala City, so it seems difficult to talk about a famine or about an emergency (according to the Longman Dictionary an emergency is “an unexpected and dangerous situation that must be dealt with immediately”). The problem is the lack of a functioning state, because a state cannot function with tax revenues estimated at just 10% of GDP.

Democracy is highly unrepresentative in Guatemala. Those who should push for a better redistribution of resources have no voice. National newspapers constantly point the finger at the government (presidency, parliament, judiciary) in an impressive campaign of delegitimation. The Rosenberg tape was just part of it. I’m not defending the government, but saying that criticising it and attempting to systematically destroy its credibility are not quite the same thing. While the headlines cover crime, corruption and hunger, the real battle within the country is over tax reform, a battle that so far every government has badly lost.

Friday, 28 August 2009


Twitter: frbailo



