Contents
- Overview
- Interfaces
- Data queries
- Function queries
- Special column types
- Special queries
- Information schema
Overview
The goal of the FreeTable project is to create a public commons for open data.freetable.org helps achieve this goal by providing a queryable real time repository for all but the largest of shared datasets. Think of freetable.org as a database with an SQL like interface that has been placed on the public Internet, and with all the ownership, permissions, revision control, and abuse protections such an environment requires.
FQL
The FreeTable query language is called FQL. It is SQL inspired, but receives and returns results as JSON.To ease implementation, and to make it easier to re-architect the system in the future, where clauses and much of the rest of the query language are greatly restricted compared to SQL. Notably, as espoused by the NoSQL movement, joins are not supported.
Concepts
Names. Names of tables, columns, and so on are case insensitive, and default to lower case. Spaces or other punctuation may be contained in a name, but the name must then be enclosed by double quotes.Namespaces. Tables are organized into namespaces. Normal users are only able to create tables in the "global" and "<username>" namespaces, where <username> is the user's user name. The special namespace "local" can be used to refer to the <username> namespace of the current user. If a namespace isn't specified "local" is used.
Tables. Data is stored in named tables. Table names comprise an optional namespace name and period, followed by sequence of one or more period separated parts. Each part starts with a letter, followed by letters, digits, and single underscores. Tables are composed of fields arranged in rows and columns.
Types. Within each table data is stored in records arranged into rows. Records or rows have a number of named column values or fields. Each field stores a single value. The type of each column can be boolean, integer, float, string (Unicode), array, object, json, email, or secret. Their valid values are as specified by JSON. json is a catchall type that can hold objects of any type. Email is a special JSON object type for securely storing email addresses. Secret is a special JSON object type for securely storing passwords. We don't support binary data, as this is of little use for the sorts of datasets we envision.
Storage. Each record is stored internally as a JSON object. In addition some columns of an object may be stored in "memory". Such columns are said to have memory storage and are eligible for use in where clauses, indexes, and aggregates. boolean, integer, float, and string columns are eligible for memory storage, and for these columns memory storage is the default. It is recommended that memory storage be disabled for large text columns that are not used in where clauses, indexes, or aggregates. This will save table space. Columns that do not have memory storage are referred to as having json storage.
Columns. All columns are required columns. Required columns are specified by the column specifications of the table. (We used to have the notion of optional columns of json type and json storage, that corresponded to non-required column names, but it was too easy to mistype a column name and end up creating an optional column).
Versioning. Each record is versioned. By default only the current version of an record is returned by select, although it is possible to return non-current records, and to change the current version of an record to any preceding or succeeding version.
Curation. Tables may be curated, in which case any insert/update/delete does not have any effect on the current versions of records in the table immediately, but only after the table curators have accepted any changes.
Users. A user of our service is represent by a username string. It may denote either an individual or a company. A username must start with a letter, followed by a sequence of letters, digits, single underscores and hyphens. The username "anonymous" is reserved for records inserted/updated/modified by non-authenticated users. For most datasets non-authenticated users will only be able to select, and at most possibly insert.
Subusers. A user name containing a hyphen denotes a subuser. The string prior to the hyphen is the parent user. Parent users have the same permission over records created by subusers as they do over records created by themselves. Subusers provide a convenient framework for FreeTable accounts that provide services to multiple end users to manage records created by different end users, such as being able to list the records created by a given end user, or only allowing end users to modify or delete their own records.
Groups. A user may belong to one or more groups. A group name begins "group-" and the remainder must comprise an letter followed by a sequence of letters, digits, and underscores. The group "group-root" is reserved for users that have full permissions to the system.
Stewardship. Each table has a set of curators. Each record within a table has a contact. The table curators and record contact may be a username or a group name.
Permissions. Each table has a set of permissions that describe what different classes of users may do to the table's contents. These permissions are expressed as a series of character triples "siu", indicating select permission, insert permission, and update/delete permission, or "---" for no permission. They are arranged in order of describing permissions for the table curators, record contact, authenticated user, and non-authenticated user. No semantics are currently assigned for select or insert by the record contact, so a full set of permissions on a table range from "siu--usiusiu" to "------------". Only the table curators and group-root may make modifications to the table meta-data.
Talk table
A table named <table>_talk shall exist for each regular table that hosts notes and discussion on the table and its values. Table names ending in "_talk" are thus reserved.These tables define the following fields:
- id integer not null. Row id of row being commented on. Or zero for comments on the entire table.
- comment string not null. Comment on the table or row.
A table or row may have multiple different comments each concerned with a different issue.
Interfaces
HTTP/AJAX interface
FreeTable supports a HTTP based interface, in which results are returned as JSON. This can be programmed against directly in most programming languages, or using either a cross domain proxy or dynamic script tag in client side JavaScript.POST requests are required for all queries other than "select". "select" may use either GET or POST. The URLs for accessing FreeTable are:
http://db.freetable.org/ft
https://db.freetable.org/ft
The parameters must be appropriately URL encoded for a GET or
multipart/form-data or application/x-www-form-urlencoded encoded for a
POST. The following parameters must be supplied:
- query=<query> - The FQL query/command to be performed. The maximum length for a query in the current implementation of FreeTable is 200Mbytes.
- args=<args> - Any positional parameter arguments for to the query expressed as a JSON array. Arguments substitute for "?" in the query.
- user=<user> - The db access user used to access the site.
- password=<db_access_password> - The db access password used to access the site.
Example of a complete query:
http://db.freetable.org/ft?query=select+*+from+testuser.test
Queries and results should be utf8 encoded Unicode.
The results returned by a query are a JSON object. For example:
{"error": null, "result": [
{"x": "a b", "y": 1.3e-3, "z": "fish", "_id": 1, "_rev": 12, "_license_notices": "", "_license_ids": ["Public_Domain"],
"_submitter_ip": "127.0.0.1", "_submitter": "testuser", "_deleted": false, "_submit_time": 1264630161.690778},
{"x": "a b", "y": 2.4e-3, "z": "fish\n\u0d06\nfly", "_id": 2, "_rev": 9, "_license_notices": "", "_license_ids": ["Public_Domain"],
"_submitter_ip": "127.0.0.1", "_submitter": "testuser", "_deleted": false, "_submit_time": 1264630162.783294}, ... ],
"stats": {"bytes_in": 117, "bytes_out": 17865, "compute_units": 0.000278, "estimated_price": 0.000036125}}
The fields of this object are:
- error. string or null. An error message from the query, or null if none.
- result. json. Only present on select queries. A JSON array of objects matching the query.
- message. string. Optional. Any informational message associated with the execution of the query.
- last_insert_id. integer. Only present on insert queries. The value of _id for the last object inserted by the insert statement.
- row_count. integer. Only present on update/delete queries. The number of rows matching the where clause of the update/delete query.
- stats. object. Statistics on performance of the query.
Contains the following fields:
- bytes_in. integer. Required. The size of the input query parameter. This may be fine tuned to be more encompassing of the entire inbound data transfer in the future.
- bytes_out. integer. Required. The size of any output result. This may be fine tuned to be more encompassing of the entire outbound data transfer in the future.
- compute_units. float. Required. The amount of compute units taken up executing the query.
- estimated_price. float. Required. The cost of performing the query in US dollars, before any discounts or free quota are applied.
Data queries
Where
A where clause comprises a sequence of terms joined with "and".Terms:
- Identity operators: <column>.
- Unary operators: not <term>, <column> is null, <column> is not null.
- Binary operators: <column> <op> <json_literal>, where <op> is =, <, >, <=, >=.
- In: <column> in (<json_literal>, ...).
- <json_literal>: null, true, false, <integer>, <floating point>, "<string>".
One annoyance is single quoted string literals are not supported, but this is a requirement of the JSON spec that we dare not break.
Select
Example:
select * from testuser.test where x="a b" and y>1.2e-3 order by x desc, y limit 10, 5
select * from testuser.test where fulltext("rubber grommets", title, description)
select synchronous a, b from testuser.test
select count(*) from testuser.test
select min(a) from testuser.test where b < 12
Fulltext queries are used to search and rank the occurrence of keywords
within fields. "fulltext()" can not be combined with any other terms.
Parameters to a fulltext query beyond the first are used to constrain
the search to specific fields of the fulltext index. Only "*" can be
selected in conjunction with fulltext. A table can have at most one
fulltext index. There is up to a 10 second latency from when an
operation is performed until when the results of that operation are
reflected in the fulltext index. Creating a boolean field called
"fulltext" in a fulltext indexed table can be used to selectively
enable or disable the fulltext indexing of individual rows.
Specifying synchronous causes the select to occur after any prior insert/update/delete commands. Without synchronous the select may be performed against a nearby read only slave database. This provides significantly lower latency but worse consistency.
The "*" form should normally be preferred over the identifier list form. It trades off increased network bandwidth for reduced server CPU time marshaling and unmarshaling the JSON results. This is normally a win.
The where clause is optional.
For "*" or an identifier list, returns a JSON object in which "result" maps to a JSON array of result objects, one for each row.
For "count(*)" returns a JSON array containing a single element, a JSON object mapping from "count" to the number of matching rows.
For "max"/"min"/"sum" returns a JSON array containing a single element, a JSON object mapping from "min"/"max"/"sum" to the maximum, minimum, or sum.
Insert
Example:
insert into testuser.test values [
{"_license_ids": ["Public_Domain"], "a": 4, "b": 7, "c": 2},
{"_license_ids": ["Public_Domain"], "a": 5, "b": 9}]
insert into testuser.test value
{"_license_ids": ["Public_Domain"], "a": 6, "b": 8}
Values to be inserted must be valid JSON. Amongst other things this means newlines must be encoded as the two character sequence \n.
Even when inserting a single value, that individual JSON value must be composed into a JSON array using [{...}].
Columns not inserted into will take on their default values.
Specifying a value for the _license_ids column is required.
Update
Example:
update testuser.test set {"a": null, "b": 2} where a=6
update testuser.test set a = null, b = 2 where a=6
Values to be updated must be valid JSON. Amongst other things this means newlines must be encoded as the two character sequence \n.
Only the columns to be updated need to be specified.
The where clause is optional.
Values returned from select shall be suitable for use in update. This means that any reserved identifiers returned by select, namely _id, _rev _license_ids, and so on, shall either be updated (_license_ids), or if they are not allowed to be updated confirmed that they are unchanged (_id), or discarded (all other reserved identifiers).
Delete
Example:
delete from testuser.test where _id=1
The where clause is optional.
A deleted record remains in the database, it is just inaccessible to ordinary queries.
Purge
Example:
purge from testuser.test where _submit_time < 12345
The where clause is optional.
Purged records are permanently removed from the database. Only revisions that are non-current are removed by the purge command.
Reserved identifiers
Within the query language column names beginning with "_" are reserved for accessing FreeTable specific functionality.Within each record the following reserved fields are defined:
- _id integer not null. An integer identifier that uniquely identifies this row within the table. _id's form an increasing sequence based upon the order in which values are inserted. Zero will never be a valid _id. Immutable for a particular record. Returned by select. Discarded by insert/update.
- _rev integer not null. An integer identifier that uniquely identifies a particular version of a row for a given _id. _rev's form an increasing sequence based upon the order in which they were made. Zero will never be a valid _rev. Immutable for a particular revision. Returned by select. Replaced by insert/update.
- _current boolean not null. This value indicates if this is this record represents the current record. Returned by select all_columns. Error on insert.
- _deleted boolean not null. This boolean indicates that a particular row has been deleted. Will always be false unless selecting non-current records, in which case it may be true. Immutable for a particular revision. Returned by select. Discarded by insert/update.
- _visible boolean not null. A boolean indicating if this revision should be visible to select statements issued by users other than those in group-root, and the table curators. Used to block access to potentially defamatory content. Returned by select all_columns. Error on insert.
- _review_state string not null. This value indicates that this record is 'pending' review, or has been 'accepted', or 'rejected', by the table curators. Returned by select all_columns. Error on insert.
- _contact string not null. The steward username of this particular revision. This is the party that would be contacted if the record appeared in error, or to discuss the record. Returned by select all columns. Error on insert.
- _submitter string not null. The username of the submitter of the record. Immutable for a particular revision. Returned by select. Discarded by insert/update.
- _submitter_ip string not null. The IP address of the submitter of the record. Immutable for a particular revision. Returned by select. Discarded by insert/update.
- _submit_time float not null. The time the record was submitted, expressed as seconds since 1970. Immutable for a particular revision. Returned by select. Discarded by insert/update.
- _license_ids array of string not null. A sequence of brief textual identifiers representing the licensing terms under which a revision is provided according to the submitter. The presence of multiple licenses indicates the end user has a choice. This field is of json storage type and so may not be used in constructing queries. Returned by select. Required by insert. Updated by update.
- _license_notices string not null. Any copyright notices, attribution notices, or other notices required under the licenses under which the record is made available. These are notices applying to the record by itself. Table licensing notices are recorded separately. Plain text. This field is of json storage type and so may not be used in constructing queries. Returned by select. Optional on insert. Updated by update.
The above semantics have been chosen in part so that it is possible to use the output value from a select statement as the input value for an insert statement.
Function queries
Functions possibly perform some computation and return a result. Functions are expressed using select statements and results are returned as JSON arrays. Thus the query:
select current_time()
might return:
{"error": null, "result": [{"current_time": 1278391881.696197}],
"stats": {"bytes_in": 21, "bytes_out": 19, "compute_units": 0.007688, "estimated_price": 0.000007705}}
current_time()
Returns the current FreeTable system time as seconds since 1970 UTC.FreeTable system time is not guaranteed to be strictly monotonic. This is because the current_time() function in FreeTable is potentially implemented by multiple hosts, and while the clocks of each host are synchronized, this synchronization is not perfect.
ip_in_range(ip)
The specified select table has string memory storage columns named literally start_ip and end_ip. ip is an IPv4 or IPv6 address string. This routine returns a single JSON row object having start_ip <= ip <= end_ip, or null if no such row exists.Traditional SQL implementations are inefficient at looking up such things as IP addresses in blacklists. We make use of ip_in_range() to identify and optimize such lookups.
For this function to work an ip_range index must be set up on the table on index rows precisely named start_ip and end_ip. Example:
create index range ip_range on user.ip_address_map(start_ip, end_ip)
[...]
select ip_in_range("10.0.0.100") from user.ip_address_map
Might result in:
{"error": null,
"result": [{..., "start_ip": "10.0.0.1", "end_ip": "10.0.0.255", "comment": "Internal subnet"}],
"stats": {"bytes_in": 69, "bytes_out": 392, "compute_units": 0.004744, "estimated_price": 0.000005597}}
table_size(table_name)
Example:
select table_size("user.table")
Returns a object in which "result" maps to a single element array
containing a mapping from "table_size" to the size of the table in
bytes. This size includes any space allocated for the table and
normal indexes, but does not currently include space consumed by range
or fulltext indexes. The size returned may be up to 10 seconds out of
date.
sponsor_size([sponsor])
Example:
select sponsor_size("username")
Returns the aggregate space currently consumed by tables sponsored by
sponsor. If sponsor is omitted the name of the user performing this
command will be used for sponsor.
Special column types
A column of type email is very similar to a JSON object and has the following instantiatable fields:
- email string not null. The email address (required).
- count integer not null. The number of messages that can be sent to a single forwardable instance (default: 1).
- duration float not null. The time in seconds a forwardable instance will be active (default: 86400).
- price float not null. The price to obtain a forwardable instance (default: US$ 0.01).
- pre_subject. An optional string to prepend to the subject line of any messages being forwarded (default: null).
Normal selects on an email column will reveal the count, duration, and price field values.
Selects can also take an argument object containing a price limit (select <col>({"price": <price_limit>}), and an optional count, and duration. If the count and duration constraints are met, and the price limit is less than or equal to the aggregate price for the field in all the rows matched by the select, then the select will return those fields as before but with an additional email forwarding email address, and bill the account the aggregate price. Otherwise an error will be returned. The forwarding email address can be used to contact the chosen party, with limits on the number of messages that can be sent and for how long the address is valid.
Updates to an email column will update the individual email type field values specified. To mark an email address as no longer valid, set the entire field, not just the email sub-field to null. For this reason making email columns non-null is normally inappropriate.
Secret
The purpose of the secret column type is to securely store passwords. The password is stored as an inaccessible hash (Sha-512) and only the database can perform hash comparisons. Because of the ease of implementation the secret type also supports pay-per-view content. This goes against the philosophy of FreeTable, and this functionality might be removed in the future if a good use isn't found for it.A column of type secret is very similar to a JSON object and has the following instantiatable fields:
- secret string. The secret to be hashed (default: null).
- success_price not null. The price to be charged on successful lookup (default: 0.00).
- failure_price not null. The price to be charged on unsuccessful lookup (default: 0.00).
- success_data. The data to be returned on successful lookup (default: null).
- failure_data. The data to be returned on unsuccessful lookup (default: null).
Selects can also take an argument object containing a price limit (select <col>({"price": <price_limit>}), and optionally a secret to be hashed. If the price limit is less than or equal to the aggregate price of the larger of the success_price and failure_price for the field in all the rows matched by the select, then the select will return those fields as before, but with an additional success boolean indicating whether the secret hash matched the stored value, and data, a data field containing the decrypted data for the indicated status. Additionally the account will be billed the relevant aggregate price. Otherwise an error will be returned.
Updates to a secret column will update the individual secret type field values specified.
Special queries
Table commands
Example:
create table [ if not exists ] <table> [ ( <column_decl>, ... ) ] [ <table_option>, ... ]
<column_decl>: <column> <datatype> [ <column_specifier> ] ...
<datatype>: boolean | integer | float | string | array | object | json | secret | email
<column_specifier>: { [ not ] null } | { storage { memory | json } } | { first | last | after <column2> } |
{ set default <literal> | drop default } | { comment "<comment>" }
<table_option>: curator=[<sponsor>] | curated=true | notify=[<sponsor>] | notify_curate=true |
history=true | visible=true |perm="siu--usiusiu" | range_table=false | advertise=false |
license_ids_required=[["Public_Domain"]] | license_ids=[] | license_notices="" |
table_title="" | table_comment=""
drop table [ if exists ] <table>
rename table <table> to <new_table_name>
These commands create, drop, or rename a named table and its corresponding talk table.
Memory storage is the default for boolean, integer, float, and string, and json storage is the default for array, object and json.
The column position specifier is primarily used by the alter table commands. The column position specifier has no effect on the JSON returned by select statements. It is only used in conjunction with any possible dataset browser application.
If no default value has been specified the default is null for a nullable column, or an appropriately zero value for a non-null column. If the default is explicit dropped then no default value exists, and an error will be returned on attempting to insert a record without specifying a value for the column.
The option values shown for <table_options> are the default values used if that option isn't specified. The meaning of the various options is as documented in the section on information_schema.tables, with the exception that "curators" corresponds to the documentation for "curator".
As shown above by default a created table is curated. Records inserted in a newly created table are by default visible, although this doesn't mean you will be able to retrieve them by default. If the table is curated the records need to be accepted and made current by the curators. Typically non-curated tables will have more restrictive permissions so as to prevent spam.
Creating a table also creates a corresponding <table>_talk table that is used for discussion of the table and the objects it contains.
Thought must be given to any potentially unknown users of a table before renaming or dropping a table.
Unlike deleting rows from a table dropping a table is irreversible.
Only group-root and the table curators can drop a table.
Table alias commands
Table alias commands create alternative names under which a canonically named table may be accessed. They are useful after a table has been renamed to direct traffic from the old name to the new name. Multiple levels of alias are not supported.Example:
create alias <old_name> to <new_name>
drop alias <old_name>
For the create alias command <old_name> must not already exist
as a table or an alias.
Aliases are not deleted when the table they point to is deleted, and instead must be deleted manually.
Column commands
Only the table curators and group-root can perform column commands.Example:
alter table <table> add column <column_decl>
Example:
alter table <table> modify column <column> <datatype> [ <column_specifier> ] ...
Example:
alter table <table> change column <old_column> <new_column> <datatype> [ <column_specifier> ] ...
Example:
alter table <table> drop column <column>
"change column" is like "modify column" except that the column is also renamed.
The column must not exist in any indexes.
These commands are irreversible.
These commands do not respect history. That is they make direct modifications to the relevant data values, and do not save the previous values.
Index commands
Example:
alter table <table> add index <index_name> (<column>, ...)
alter table <table> drop index <index_name>
Alternate syntax example:
create index <index_name> [<index_type>] on <table> (<column>, ...)
drop index <index_name> on <table>
Indexes can be created by the table curators or group-root on any columns with memory storage class. Allowed <index_type> values are "normal", the default, "range", and "fulltext". There will be an implicit leading hidden index on _current added to every normal index. Selects then specify an value for this _current field behind the scenes.
Non-current revisions
Example:
select * from <table> all_rows where <expr>
By default select only returns current non-deleted rows. The select ... all_rows command can be used to see all past and future potential revisions of an object. It is most commonly used with curated tables to examine a proposed new revision, but can also be used to examine the revision history.
Hidden columns
Example:
select all_columns from <table> [all_rows] where <expr>
By default the values of _current, _visible, _review_state, and _contact are not returned by select statements. The reason for this is the value of these fields are not static, but are liable to change over the life of the revision, and so it is easiest to store them external to the JSON. Updates to regular fields produce a new revision, while updates to hidden columns produce the same revision but with updated values for the hidden fields. The value of these hidden fields can be retrieved using the select all_columns command. This command requires more compute resources than a regular select since it involves unpacking and repacking the JSON result with the hidden field values. It will normally be used with an all_rows qualifier to include non-current revisions.
Example:
update hidden_columns <table> [all_rows] set {"_current": true, "_review_state": "accepted"} where <expr>
The update hidden_columns command can be used by the table curators and group-root to update the value of hidden_columns on a revision. The values being set must be limited to hidden_columns. It will normally be used with an all_rows qualifier to include non-current revisions. When updating _current any other existing current row with the same _id will be reset.
Group management commands
Example:
create group "<groupname>" owner "<user_or_group_name>"
Creates the indicated user group. Group names must be off the form
"group-<name>".
Example:
alter group "<groupname>" owner "<user_or_group_name>"
Changes the owner of the specified group.
Example:
drop group "<groupname>"
Deletes the specified group.
Example:
alter group "<groupname>" add member "<username>"
Adds "<username>" to group "<groupname>".
Example:
alter group "<groupname>" delete member "<username>"
Deletes "<username>" from group "<groupname>".
Account management commands
A user has a website access password, manipulated using the create/alter account commands, and a set of zero or more FreeTable database access authorizations, manipulated using the create/drop auth commands.Accounts need to be first created, and then verified before they can be used. Accounts can't be dropped. This is because the account name may be used by other system components, such as resource usage, and so were an account to be dropped and re-created the old usage data would erroneously get bound to the new account.
Example:
create account "<username>" identified by "<password>" email "<email>"
Create an account for the indicated user using the indicated website
access password, email address, and account verification status of
false. A single username may have only one account. This command
creates both the information_schema.account table entry and the
information_schema.account_private table entry for the new user
account. Set "<password>" to null to disable web logins to the
account. Set "<email>" to null to for no email address.
Website access passwords must be eight or more alphanumeric characters in length, contain at least one letter, and at least one digit, and are case insensitive.
Only group-account may execute this command.
Example:
alter account "<username>" [ identified by "<password>" ] [ email "<email>" ] [ verified ]
Alters the website access password, email address, or verification
status associated with the account. Set "<password>" to null to
disable web logins to the account. Set "<email>" to null to for
no email address.
Only group-account and user <username> may execute this command. Only group-account may employ the verified sub-command.
Example:
create auth "<username>" identified by "<password>" [ description "<description>" ]
Add a db access authorization for the indicated user using the
indicated password. A single user may have many db access passwords
shared by different sets of machines. Description is an optional
textual description which may be used to keep track of the different
passwords.
Only group-account may execute this command.
Db access passwords are not the same as the website access password.
Example:
drop auth <auth_id>
Delete the indicated db access authorization. <auth_id> is
contained in the information_schema.auth table.
Only group-account and the user to which the auth corresponds may execute this command.
Performance tuning commands
The query log keeps track of slow queries in order to facilitate performance tuning.The following commands can only be performed by the table curator and group-root.
show query log for <table>
This command displays information from the query log for
<table>.
reset query log for <table>
This clears the query log for <table>.
Administrative commands
Example:
show tables
This commmand is a synonym for:
select table_name from information_schema.table_info where table_namespace="<username>"
and displays all of the tables in the current user's namespace.
Example:
describe <table>
This command is a synonym for:
select * from information_schema.columns where table_name="<table>"
and displays information regarding the columns of a particular table.
Example:
rebuild directory table for database "<db_name>"
rebuild info table for { all | table "<table>" }
rebuild { normal | range | fulltext } indexes for { database "<db_name>" | table "<table>" }
These commands may only be executed by group-root and are for internal
use only.
Information schema
The information schema tables provide access to meta-information about the the database. They can both be selected, and insert/update/deleted.The information schema commands are important because some things, such as making a table be non-curated can only be done by manipulating the information schema.
information_schema.tables table
information_schema.tables records information about specific tables.Example:
update information_schema.tables set {"curated": false, "perm": "siu--usius--"} where table_name='testuser.test'
Only the table curators and group-root can update/delete the values in this table.
For this table, all commands, including insert, must include a where table_name="<table>" clause.
The following fields are defined:
- version integer not null. The version number of the public and private schema describing the layout of the table. Only group-root can change the version.
- state string not null. This field stores whether the table is "online", "offline", or "read_only". "offline" and "read_only" are used while the table is being maintained or migrated. Only group-root can change the state. Changes to this value may take up to 10 seconds to take effect.
- size_limit integer not null. This field stores the maximum amount of memory that can be taken up by the table in bytes. A size_limit of -1 means unlimited. Only group-root can change the size_limit.
- row_size_limit integer not null. This field stores the maximum amount of memory that can be taken up by the json storage for a single row of the table in bytes. A size_limit of -1 means unlimited. Only group-root can change the row_size_limit.
- sponsor string not null. The user fiscally responsible for the table. Only group-root can change the sponsor.
- curator array not null. The user or group names of the curators of the table.
- curated boolean not null. This boolean indicates whether the table is curated, and changes made to it other than by the table curators need to be accepted by the table curators, or if changes made to it happen immediately.
- notify array not null. The user or group names of the parties to be notified when a table notification event occurs.
- notify_curate boolean not null. Whether to provide notice of newly pending curated data.
- notify_curate_last float not null. Time stamp of the last curation notification.
- history boolean not null. This boolean indicates whether old revisions of each row are kept or immediately deleted.
- visible boolean not null. This boolean indicates whether newly inserted physical records should be marked as visible.
- perm string not null. Permissions associated with the table. Changes to this value may take up to 10 seconds to take effect.
- range_table boolean not null. Whether this table has string memory storage columns named start_ip and end_ip and supports the ip_in_range() functions. This value can only be changed by creating or dropping an ip_range index on the table. Changes to this value may take up to 10 seconds to take effect.
- advertise boolean not null. Whether to publicize the existence of this table. Note, a table can be discovered whether or not it is officially publicized.
- license_ids_required array not null. A list of lists of textual
identifiers indicating the minimum licensing terms required for new
physical records. There is a choice between sub-lists, each of which
lists a set of required identifiers. For instance [["L1"], ["L2",
"L3"]] denotes new physical records must be licensed under L1 or (L2
and L3). At the time of writing valid values are:
- Public_Domain - public domain
- CC-by-3.0 - Creative Commons Attribution License 3.0
- CC-by-sa-3.0 - Creative Commons Attribution Share Alike License 3.0
- CC-by-nd-3.0 - Creative Commons Attribution No Derivatives License 3.0
- ODC-DbCL-1.0 - Open Data Commons Database Contents License 1.0
- Other_OKD_Compliant - Other Open Knowledge Definition compliant
- license_ids array not null. A list of textual identifiers
indicating the database licensing terms of the table as a whole.
Multiple values indicate choice. At the time of writing valid values
are:
- Public_Domain - public domain
- ODC-By-1.0 - Open Data Commons Attribution License 1.0
- ODC-ODbL-1.0 - Open Data Commons Open Database License 1.0
- Other_OKD_Compliant - Other Open Knowledge Definition compliant
- license_notices string not null storage json. Any copyright notices, attribution notices or other notices required under the licenses under which the table is made available. These are notices applying to the table itself. Record licensing notices are recorded separately. Plain text.
- table_title string not null storage json. A brief textual title for the table. Plain text.
- table_comment string not null storage json. A textual description of the table. Plain text.
information_schema.columns table
information_schema.columns records information about the columns of a specific table.Example:
insert [{...}] into information_schema.columns where table_name='<table>'
Only the table curators and group-root can update/delete the values in this table.
For this table, all commands, including insert, must include a where table_name="<table>" clause.
The following fields are defined:
- column_name string not null. The name of the column.
- ordinal_position integer not null. The position of this column in the table starting at 1.
- column_default string. The default value for the column expressed as a JSON string. Null if no default, and a value is required.
- is_nullable boolean not null. Boolean indicating whether the column is nullable.
- data_type string not null. The type of the column. Allowed values: boolean, integer, float, string, array, object (hash/dict/associative array), json (arbitrarily typed data).
- storage string not null. The storage class for the column: "memory" (stored in the table space and JSON), or "json" (stored only in JSON). Memory columns can be used in where clauses, indexes, and in aggregates such as min/max/sum.
- column_comment string not null. A brief textual description of the column. Plain text.
information_schema.indexes table
This table defines the indexes on a table. Normally the alter table add/drop index commands will be used instead of accessing this table directly. Only the table curators and group-root can update/delete the values in this table.For this table, all commands, including insert, must include a where table_name="<table>" clause.
The following fields are defined:
- index_name string not null. The name of the index. Local to this table.
- column_names string not null. A comma separated list of the column names making up the index. An implicit hidden index on _current is always present and never needs to be specified.
- index_type string not null. Currently always has the value "normal".
- index_comment string not null. A brief textual description of the index. Plain text.
information_schema.table_db table
Internal use only and subject to change.information_schema.db_host table
Internal use only and subject to change.information_schema.curator_info
Internal use only and subject to change.information_schema.table_info table
Because the information_schema.tables table requires a where table_name="<table>" clause, there is no way to query it to discover all the tables in the system. That is the purpose of the information_schema.table_info table.- table_name string not null. The full name of the table.
- table_title string not null storage json. A brief textual title for the table. Plain text.
- table_comment string not null storage json. A textual description of the table. Plain text.
information_schema.alias table
This table defines alias names for other tables in the system. It is read only, except for users in group-root.- alias string not null. Alias under which this table is known.
- actual_name string not null. Actual name for this table.
information_schema.account table
This table defines per user account information.- user string not null. The username for this account.
- email_address email. Email address of the user if any.
information_schema.account_restricted table
This table defines per user account information. It is only readable by group-account.Internal use only and subject to change.
information_schema.account_private table
This table defines per user account information, such as the login account password. It is only accessible to users in group-root.Internal use only and subject to change.
Updates and deletes may take up to 60 seconds to take effect. Inserts are instantaneous.
information_schema.auth table
This table defines the website accessible components of the db access passwords associated with a user. It is readable by group-account, but only writable by users in group-root.Internal use only and subject to change.
information_schema.auth_private table
This table defines the db access passwords associated with a user. It is only accessible to users in group-root.The columns in this table and are for internal use only and subject to change. Inserts should be performed using the create auth command.
Updates and deletes may take up to 60 seconds to take effect. Inserts are instantaneous.
information_schema.resource
This table records periodic resource usage. Logically separate tables are kept for resource usage by non-anonymous users, anonymous users, and resource usage associated with commands operating on tables. These are all stored in a the same physical table.Internal use only and subject to change.
This table is read only, except for users in group-bill.
information_schema.resource table
This table records periodic table storage resource usage for each table.Internal use only and subject to change.
This table is read only, except for users in group-bill.
information_schema.bill_info table
Internal use only and subject to change.information_schema.group
This table defines ownership of groups, which defines which users are permitted to alter or drop the group. It is read only except by users in group-root.- group string not null. The name of the group.
- owner string not null. The user or group name of the owner of the group.
information_schema.group_membership table
This table defines which users belong to which groups. It is read only, except for users in group-root. Changes to this table can take up to 10 seconds to take effect.The following fields are defined:
- group string not null. The name of the group.
- member string not null. The user name of the member of the group.
information_schema.ip_blacklist table
This table is used to prevent abuse by blacklisting or limiting access to the system by certain IP address ranges. This table is intended for internal use, not as a general purpose IP address list blacklist.- start_ip string not null. IP start of blacklisted range. e.g. "10.0.0.1".
- end_ip string not null. IP end of blacklisted range (inclusive).
- comment string not null storage json. Notes on this range.
information_schema.email table
To prevent spamming, all email addresses in the system should be consolidated in this table that is selectable only by group-root, insertable by non-anonymous users, and updatable by the record contact. On insertion of an email address the last_insert_id field should be stored in a field of the relevant table, typically named, email_id. The system then requires the solving of a CAPTCHA as a function of email_id before providing a generated email address that can be used for forwarding a single email.- email string not null. Email address.
- expire float. Expiration time of this email address or null if none.
information_schema.query
In order to help with performance tuning, the system maintains a cache of the 100 slowest running transactions for each table.- user string. User from which the query was received.
- ip_address string not null. IP address from which the query was received.
- start float not null. Time the query was received.
- query string not null. The query executed.
- table_name string. The table against which the query executed.
- compute_units float not null. Normalized compute time taken up by the query.
Insertions, updates, and deletions to rows in this table may take up to 60 seconds to have an impact on the system query caching functionality.
Copyright (C) 2010 Gordon Irlam. Creative Commons Attribution-Share Alike 3.0 United States License. http://creativecommons.org/licenses/by-sa/3.0/us/