A Commonplace Book



I decided to absorb the data into a database. The first draft of the code I wrote to do so informed me that it would take 25 days of computing processing to complete. That was too long. Also I was out of hard drive space. So I went to a store and bought a computer, a big, boxy, unfashionable PC with a 4-GHz quad-core processor and ten terabytes of extra hard-drive space, installed Linux on it, and got the most recent version of the PostgreSQL database.

With the help of that machine and quite a few database tricks to massage and extract the data, I got 25 days down to one, with searchable titles, descriptions, and reviews. Seven days of programming and one day of absorption to beat one day of programming and 25 days of absorption: a pretty familiar set of trade-offs. You're always trying to balance your time against the computer's, but there's also the challenge of the thing. I probably should have just let it run for four weeks.

-- Paul Ford. "Does Amazon's Data Speak for Itself?" New Republic (Feb 17, 2016). https://newrepublic.com/article/129026/amazons-data-speak-itself