The success of the Lega, the media and the migration “crises”

The rise of Salvini and the Lega is perhaps the most significant event of 2018 for Italian politics. In January 2018, before the March elections, Salvini's Lega was polling at around 12-13%. By the end of 2018 the Lega was estimated at above 30%: a gain of almost 20 percentage points in 12 months.

Fig. 1. The rise of the Lega (30-day moving average of polls)


Sunday, 8 December 2019

Explicit semantic analysis with R

Explicit semantic analysis (ESA) was proposed by Gabrilovich and Markovitch (2007) to compute a document's position in a high-dimensional concept space. At its core, the technique compares the terms of the input document with the terms of the documents that describe the concepts, estimating the relatedness of the input document to each concept. In spatial terms, if I know the relative distance of the input document from meaningful concepts (e.g. ‘car’, ‘Leonardo da Vinci’, ‘poverty’, ‘electricity’), I can infer the meaning of the document relative to those explicitly defined concepts from its position in the concept space.
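To make the idea concrete, here is a minimal sketch in Python (not the R code of the post) that places a document in a tiny concept space. Everything in it is an assumption made for illustration: the three concept descriptions are invented toy texts standing in for the encyclopedia articles ESA normally relies on, the names `CONCEPTS`, `build_idf`, `tfidf_vector` and `esa_vector` are hypothetical, and relatedness is approximated with the cosine similarity between TF-IDF vectors, which is a simplification of the weighting described by Gabrilovich and Markovitch.

```python
import math
from collections import Counter

# Hypothetical toy concept descriptions; a real ESA model would use
# a large collection of articles, one per concept.
CONCEPTS = {
    "car": "car engine wheel road driver vehicle fuel",
    "electricity": "electricity current voltage power grid energy wire",
    "poverty": "poverty income poor inequality welfare hunger",
}


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def build_idf(concept_docs):
    """Inverse document frequency of every term found in the concept texts."""
    n_docs = len(concept_docs)
    doc_freq = Counter()
    for text in concept_docs.values():
        doc_freq.update(set(tokenize(text)))
    return {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms unknown to the concept space are dropped."""
    counts = Counter(term for term in tokenize(text) if term in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def esa_vector(document, concept_docs=CONCEPTS):
    """Relatedness of the document to each concept: its concept-space position."""
    idf = build_idf(concept_docs)
    doc_vec = tfidf_vector(document, idf)
    return {
        name: cosine(doc_vec, tfidf_vector(text, idf))
        for name, text in concept_docs.items()
    }


scores = esa_vector("the driver stopped the car because the engine ran out of fuel")
for concept, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{concept}: {score:.3f}")
```

The resulting dictionary has one coordinate per concept, so two documents could then be compared by comparing their concept vectors rather than their raw terms.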


Tuesday, 26 April 2016


Twitter: frbailo



