Explicit semantic analysis with R

Explicit semantic analysis (ESA) was proposed by Gabrilovich and Markovitch (2007) to compute a document's position in a high-dimensional concept space. At its core, the technique compares the terms of the input document with the terms of the documents that describe each concept, and so estimates the relatedness of the input document to each concept. In spatial terms, if I know the relative distance of the input document from meaningful concepts (e.g. ‘car’, ‘Leonardo da Vinci’, ‘poverty’, ‘electricity’), I can infer the meaning of the document relative to those explicitly defined concepts from its position in the concept space.
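To make the idea concrete, here is a minimal sketch in base R. It is a toy illustration under my own assumptions, not the implementation developed in the full post: the three concept descriptions and the input document are made up, the helper names (`tokenize`, `count_terms`, `cosine`) are hypothetical, and I score relatedness with TF-IDF weights and cosine similarity, which is one common choice rather than the only one.

```r
# Toy concept descriptions (made-up stand-ins for real concept documents).
concepts <- c(
  car         = "car engine wheel road driver engine",
  electricity = "electricity current voltage wire power",
  poverty     = "poverty income inequality welfare poor"
)
# Made-up input document to be placed in the concept space.
doc <- "the engine of the car needs power from a wire"

# Lower-case a string and split it into alphabetic tokens.
tokenize <- function(x) strsplit(tolower(x), "[^a-z]+")[[1]]

# The vocabulary is every term used in the concept descriptions.
tokens <- lapply(concepts, tokenize)
vocab  <- sort(unique(unlist(tokens)))

# Count how often each vocabulary term occurs in a token vector;
# terms outside the vocabulary are ignored.
count_terms <- function(tk) tabulate(match(tk, vocab), nbins = length(vocab))

# Term-frequency matrix: one row per concept, one column per term.
tf <- t(vapply(tokens, count_terms, integer(length(vocab))))
colnames(tf) <- vocab

# Inverse document frequency: terms shared by many concepts weigh less.
idf   <- log(nrow(tf) / colSums(tf > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Weight the input document on the same vocabulary.
doc_vec <- count_terms(tokenize(doc)) * idf

# Cosine similarity between two weighted term vectors.
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Relatedness of the document to each concept: its coordinates
# in the concept space.
esa <- apply(tfidf, 1, cosine, b = doc_vec)
print(round(esa, 3))
```

The resulting named vector `esa` holds one score per concept. In this toy case the document shares terms with the ‘car’ and ‘electricity’ descriptions and none with ‘poverty’, so its position in the concept space lies near the first two and away from the third.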


Tuesday, 26 April 2016
