The landscape

For textual information, Wikipedia has done a tremendous job of creating a vibrant public commons. In the software world, the Free Software Foundation and others have had similar success in creating a public commons for source code. What is missing today, though, is a public commons for data.

As things presently stand, data is typically siloed inside proprietary websites and applications, inaccessible to other users and unavailable for other purposes. Facebook, Twitter, eBay, Craigslist, Match.com, and Yelp are all examples of sites that are little more than a user interface around user-contributed data: data that they take from the public sphere, lock up, and then refuse to contribute back.

We seek to rectify this by providing a queryable, real-time repository for open data using a data-as-a-service model. Some examples of the broad range of data under consideration:

  • classified, realty, job, and personal ads
  • customer reviews of products and businesses
  • geographic, scientific, and economic data
  • app data for open source applications

Our approach

Our solution has two components:
  • the FreeTable database - a database designed for programmers, with an SQL-like interface on the public Internet and all the ownership, permissions, revision control, and abuse protections such an environment requires. Data can be contributed to the FreeTable database by anyone, or at least initially, by any programmer. Data is stored in the standard table, record, field format. Rather than trying to carefully design the database tables we support, we allow people to create any database table on our system, and then see which tables prove popular.
  • the FreeTable browser - an easy-to-use web interface to the FreeTable database. This is primarily a debugging tool. Most end-user interaction with FreeTable is expected to happen transparently through third-party apps accessing the FreeTable database.
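
To make the table, record, field model concrete, here is a minimal sketch of a contributor creating a table, adding records, and querying them through an SQL-like interface. It is not FreeTable's actual API: Python's built-in sqlite3 module stands in for the FreeTable database, and the table name, field names, and the average_ratings helper are all hypothetical.

```python
import sqlite3

# Stand-in only: an in-memory SQLite database plays the role of the
# FreeTable database. Table and field names below are hypothetical.
conn = sqlite3.connect(":memory:")

# A contributor defines whatever table they need; nothing is designed up front.
conn.execute(
    "CREATE TABLE business_reviews ("
    "  business TEXT NOT NULL,"
    "  rating INTEGER NOT NULL,"
    "  comment TEXT)"
)

# Each row is a record; each column is a field.
records = [
    ("Corner Cafe", 5, "Great espresso"),
    ("Corner Cafe", 3, "Slow service"),
    ("Book Nook", 4, "Good selection"),
]
conn.executemany("INSERT INTO business_reviews VALUES (?, ?, ?)", records)
conn.commit()


def average_ratings(connection):
    """Return (business, average rating) pairs, sorted by business name."""
    cursor = connection.execute(
        "SELECT business, AVG(rating) FROM business_reviews "
        "GROUP BY business ORDER BY business"
    )
    return cursor.fetchall()


print(average_ratings(conn))
```

A third-party app would issue queries of this kind on behalf of its users, which is why most end users would never see the database directly.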

Our commitment to Open

  • The FreeTable data is distributed under Open Knowledge Definition compliant licenses.
  • All table data and metadata can be exported/forked.
  • The FreeTable software is distributed under the GNU Affero GPL.
  • The FreeTable software is built on top of a free platform: Python, MySQL, and Ubuntu.
  • Minimal use is made of Amazon EC2 specific cloud functionality.
  • In the event FreeTable decides to cease operations, at least one month's notice will be given, during which any data can be exported from FreeTable.

Mission critical

FreeTable has been designed from the ground up to support mission-critical applications, such as customer-facing websites.
  • Reliability. Our architecture supports redundant hardware and failover operation, although we don't presently have the customer demand to justify deploying this feature.
  • System recovery. For disaster recovery we take daily point-in-time snapshots and store them at Amazon S3. We have the ability to restore a backend and all the tables it hosts from one of the snapshots.
  • User error recovery. To cope with an accidental "drop table" command, we have the ability to restore a single table from the daily point-in-time snapshots.
  • Network monitoring. All nodes in our system are monitored, and in the event of a failure, notification is sent by email, SMS, and phone.
  • Capacity. By running on virtual hardware at Amazon EC2, FreeTable has the ability to grow and rapidly swap out faulty hardware.
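
The user error recovery case above can be illustrated with a small sketch of restoring one table from a snapshot after an accidental "drop table". This is not FreeTable's recovery tooling, which runs on MySQL with snapshots held at Amazon S3. Here Python's built-in sqlite3 module stands in for the backend, a second attached in-memory database stands in for a daily snapshot, and the ads table and both helper functions are hypothetical.

```python
import sqlite3

# Stand-ins only: "main" plays the live backend, and the attached
# "snapshot" database plays a daily point-in-time snapshot.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS snapshot")

conn.execute("CREATE TABLE main.ads (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO main.ads (title) VALUES (?)",
    [("Bike for sale",), ("Room to rent",)],
)
conn.commit()


def take_snapshot(connection, table):
    """Copy one table's rows into the snapshot database.

    The table name is interpolated into the SQL, so it must be trusted.
    """
    connection.execute(f'DROP TABLE IF EXISTS snapshot."{table}"')
    connection.execute(
        f'CREATE TABLE snapshot."{table}" AS SELECT * FROM main."{table}"'
    )


def restore_table(connection, table):
    """Recreate a single live table from its snapshot copy.

    CREATE TABLE ... AS SELECT copies rows but not constraints or
    indexes, so a real restore would also have to recreate those.
    """
    connection.execute(f'DROP TABLE IF EXISTS main."{table}"')
    connection.execute(
        f'CREATE TABLE main."{table}" AS SELECT * FROM snapshot."{table}"'
    )


take_snapshot(conn, "ads")
conn.execute("DROP TABLE main.ads")  # the accidental "drop table"
restore_table(conn, "ads")

restored = conn.execute("SELECT id, title FROM main.ads ORDER BY id").fetchall()
print(restored)
```

Only the dropped table is touched during the restore; every other table on the backend is left as it was.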

Please contact us if you are interested in purchasing a Service Level Agreement (SLA) for FreeTable that provides a guaranteed uptime or dedicated hardware that provides lower latency.

Where are we?

FreeTable has been under development since December 2009, and has been running continuously internally since December 2010. Only small changes to the underlying table schema have been needed since then, and all of them were applied to the live running instance. FreeTable has proven itself stable.

In September 2011, FreeTable was opened for alpha release, with the primary goal being to start gathering useful public data sets.