MLS - Aggregated database

This database contains all the databases of libre software projects generated with MailingListStat tool.

The tables schema is as shown:

 MLS Aggregated DB

Tables Descriptions

* projects This table contains general information about projects downloaded using the MLStats tool.

Name Data Type Description Key
project_IDINTEGER Project unique identifierPK
name VARCHAR(255) Project name
dbname VARCHAR(255) Database name

* datasource Table to store information about the retrieval process such as the tool executed, its version, the creation date and other.

Name Data Type Description Key
datasource_ID INTEGERDatasource unique identifierPK
project_ID INTEGER Project identifier
tool VARCHAR(255)Name of the tool
tool_version VARCHAR(255)Tool version
datasource VARCHAR(255) Location of the data sources
datasource_info TEXT Access parameters to the data sources
creation_date DATETIMEDate of creation of the database
last_modificationDATETIMEDate of the last modification of the database

* mailing_lists_messages This table contains the relationship between the table projects, mailing list and messages. Exist messages that belong to differents projects. It con- trols and stores data.

Name Data Type Description Key
datasource_IDINTEGERDatasource identifierPK
mailing_list_urlVARCHAR(255)Mailing list url identifierPK
message_IDVARCHAR(255)Message identifierPK
mailing_listINTEGERMailing list identifier

* compressed files This table contains a register for each archive file that has been down- loaded, or tried to download.

Name Data Type Description Key
datasource_IDINTEGERDatasource identifierPK
url VARCHAR URL of the file PK
mailing_list_url VARCHAR URL of the web archives of the mailing list where this file belongs to FK
status ENUM Either visited, new or failed
last_analysis DATETIME Date and time of the last analysis of this time

* mailing lists This table contains a register for each different mailing list analysed.

Name Data Type Description Key
datasource_ID INTEGERDatasource unique identifierPK
mailing_list_url VARCHAR(255) URL of the archives web page PK
mailing_list_name VARCHAR(255) Name of the mailing list, as it appears in the headers of the messages
project_name VARCHAR(255) Name of the software project were this list belongs to. Taken from the email address of the mailing list
last_analysis DATETIME Date and time of the last analysis performed on this mailing list

* mailing lists people This table joins the table mailing lists and people, making possible to search for people grouping by different mailing lists.

Name Data Type Description Key
datasource_IDINTEGERDatasource identifierPK
people_ID INTEGER People unique identifier PK
mailing_list_url VARCHAR(255) URL of the mailing list archives web page PK

* messages This table contains a register for each message in the mailing list archives. It contains all the information in the headers plus the message itself.

Name Data Type Description Key
message_id VARCHAR(255) Unique identifier assigned by the mailing list manager PK
mailing_list_url VARCHAR(255) URL of the archives web page of the mailing list FK
mailing_list VARCHAR(255) Name and address of the mailing list
first_date DATETIME Local date written in the message by the original sender
first_date_tz INTEGER Time zone of the above date
arrival_date DATETIME Local time of the server that received the message
arrival_date_tz INTEGER Time zone of the above date
subject VARCHAR(255) Subject of the message
message_body TEXT Main text of the message
mail_path TEXT Mail path
is_response_of VARCHAR(255) If this message is a reply of another, this is the id of the original message FK

* messages people This is a table establish the relationship between email addresses and messages.

Name Data Type Description Key
message_id VARCHAR(255) Id of the message where that person appears PK
people_ID VARCHAR(255) People unique identifier PK
type_of_recipient ENUM Either To, Cc or Bcc PK

* people This table contains a register for each one of the people who has written a message to the mailing list, or at least appears as destination in a message that has been sent to the mailing list.

Name Data Type Description Key
people_ID INTEGER People unique identifier PK
email_address VARCHAR(255) Email address of the person
name VARCHAR(255) Name (if appears in the header)
user_name VARCHAR(255) The first part (before the @) of the email address
domain_name VARCHAR(255) The second part (after the @) of the email address
top_level_domain VARCHAR(255) Top level domain of the email address (.com, .org, .es, etc)
 
aggregated_databases_for_mls.txt · Last modified: 2009/12/10 16:36 by lilitovar
 
Except where otherwise noted, content on this wiki is licensed under the following license:CC Attribution-Share Alike 3.0 Unported
Recent changes RSS feed Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki