While we're on the subject of maybe-meaningful data-mining output, let me share with you some semi-refined ore from the dataset of real-estate listings that I mentioned the other day. I've collected all the (non-foreclosure) listings for 8 cities from trulia.com — about 50,000 listings altogether — and extracted the descriptions, e.g. Truly,"A Diamond in the [...]