MLS - MailingListStats


MailingListStats parses e-mail messages from mbox files and stores the results in an SQL database. Most of the stored data is extracted from the message headers: the senders and receivers, the subject, and the send and receive dates. The content of each message is stored as well. It also stores information about each mailing list, such as the number of received messages and the people who write to the list.
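As an illustration of the kind of header extraction described above, here is a minimal sketch using Python's standard `email` package. It is not the tool's own code: the sample message is invented, the dictionary keys simply borrow column names from the messages table, and storing the time zone as an offset in seconds is an assumption.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Invented sample message, used only for illustration.
RAW_MESSAGE = """\
From: Alice Example <alice@example.org>
To: dev-list@example.org
Cc: Bob Example <bob@example.com>
Subject: Release plan
Date: Thu, 10 Dec 2009 19:39:00 +0100
Message-ID: <1234@example.org>

Body of the message.
"""


def extract_fields(raw):
    """Pull senders, recipients, subject, date and body out of one message."""
    msg = message_from_string(raw)
    sent = parsedate_to_datetime(msg["Date"])  # timezone-aware datetime
    recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
    return {
        "message_id": msg["Message-ID"],
        "subject": msg["Subject"],
        "senders": getaddresses(msg.get_all("From", [])),
        "recipients": getaddresses(recipients),
        # Local date as written by the sender, without the zone.
        "first_date": sent.replace(tzinfo=None),
        # Assumption: the zone is kept as an offset from UTC in seconds.
        "first_date_tz": int(sent.utcoffset().total_seconds()),
        "message_body": msg.get_payload(),
    }


fields = extract_fields(RAW_MESSAGE)
```

Each entry in `senders` and `recipients` is a `(name, address)` pair, which maps naturally onto the name and email_address columns of the people table.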

The database schema is shown in the following figure:

[Figure: MailingListStats schema]

Tables Descriptions

All the fields of these tables are also described, in a socio-economic context, in MLS_METRICS.

* compressed files This table contains a record for each archive file that has been downloaded, or whose download has been attempted.

| Name | Data Type | Description | Key |
| url | VARCHAR | URL of the file | PK |
| mailing_list_url | VARCHAR | URL of the web archives of the mailing list to which this file belongs | FK |
| status | ENUM | Either visited, new or failed | |
| last_analysis | DATETIME | Date and time of the last analysis of this file | |
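The status column lends itself to finding archive files that still need attention. The sketch below is only an illustration with Python's built-in `sqlite3` module: the underscored table name `compressed_files`, the sample URLs, and the use of a CHECK constraint in place of ENUM (which SQLite lacks) are all assumptions, and treating new and failed files as pending is one possible reading of the status values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed table name; the CHECK constraint emulates the ENUM column.
conn.execute("""
    CREATE TABLE compressed_files (
        url VARCHAR(255) PRIMARY KEY,
        mailing_list_url VARCHAR(255) NOT NULL,
        status TEXT CHECK (status IN ('visited', 'new', 'failed')),
        last_analysis DATETIME
    )
""")

# Invented sample rows.
LIST_URL = "http://lists.example.org/dev"
conn.executemany(
    "INSERT INTO compressed_files VALUES (?, ?, ?, ?)",
    [
        (LIST_URL + "/2009-10.txt.gz", LIST_URL, "visited", "2009-12-01 10:00:00"),
        (LIST_URL + "/2009-11.txt.gz", LIST_URL, "failed", "2009-12-01 10:05:00"),
        (LIST_URL + "/2009-12.txt.gz", LIST_URL, "new", None),
    ],
)

# Files that were never analysed or whose download failed.
pending = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM compressed_files "
        "WHERE status IN ('new', 'failed') ORDER BY url"
    )
]
```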

* mailing lists This table contains a record for each mailing list analysed.

| Name | Data Type | Description | Key |
| mailing_list_url | VARCHAR(255) | URL of the archives web page | PK |
| mailing_list_name | VARCHAR(255) | Name of the mailing list, as it appears in the headers of the messages | |
| project_name | VARCHAR(255) | Name of the software project to which this list belongs. Taken from the email address of the mailing list | |
| last_analysis | DATETIME | Date and time of the last analysis performed on this mailing list | |

* mailing lists people This table joins the tables mailing lists and people, making it possible to search for people grouped by mailing list.

| Name | Data Type | Description | Key |
| people_ID | INTEGER | People unique identifier | PK |
| mailing_list_url | VARCHAR(255) | URL of the mailing list archives web page | PK |
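A grouped search of this kind might look like the following sketch, which counts the people seen on each mailing list. It uses Python's built-in `sqlite3` module with an in-memory database; the underscored table names (`mailing_lists_people`, `people`) and the sample rows are assumptions made for the example, and only the columns needed for the query are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced versions of the two tables; table names are assumed.
conn.executescript("""
CREATE TABLE people (
    people_ID INTEGER PRIMARY KEY,
    email_address VARCHAR(255)
);
CREATE TABLE mailing_lists_people (
    people_ID INTEGER,
    mailing_list_url VARCHAR(255),
    PRIMARY KEY (people_ID, mailing_list_url)
);
INSERT INTO people VALUES (1, 'alice@example.org'), (2, 'bob@example.com');
INSERT INTO mailing_lists_people VALUES
    (1, 'http://lists.example.org/dev'),
    (2, 'http://lists.example.org/dev'),
    (1, 'http://lists.example.org/users');
""")

# Number of distinct people associated with each mailing list.
rows = conn.execute("""
    SELECT mlp.mailing_list_url, COUNT(*) AS people_count
    FROM mailing_lists_people AS mlp
    JOIN people AS p ON p.people_ID = mlp.people_ID
    GROUP BY mlp.mailing_list_url
    ORDER BY mlp.mailing_list_url
""").fetchall()
```

Because the primary key covers both columns, a person appears at most once per list, so `COUNT(*)` already counts distinct people.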

* messages This table contains a record for each message in the mailing list archives. It contains all the information in the headers plus the message itself.

| Name | Data Type | Description | Key |
| message_id | VARCHAR(255) | Unique identifier assigned by the mailing list manager | PK |
| mailing_list_url | VARCHAR(255) | URL of the archives web page of the mailing list | FK |
| mailing_list | VARCHAR(255) | Name and address of the mailing list | |
| first_date | DATETIME | Local date written in the message by the original sender | |
| first_date_tz | INTEGER | Time zone of the above date | |
| arrival_date | DATETIME | Local time of the server that received the message | |
| arrival_date_tz | INTEGER | Time zone of the above date | |
| subject | VARCHAR(255) | Subject of the message | |
| message_body | TEXT | Main text of the message | |
| mail_path | TEXT | Mail path | |
| is_response_of | VARCHAR(255) | If this message is a reply to another message, this is the id of the original message | FK |
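Since is_response_of points at the message being replied to, following it repeatedly leads to the message that started a thread. The helper below is a hypothetical sketch of that idea, not part of the tool; it works on a plain dictionary mapping each message_id to its is_response_of value (`None` when the message is not a reply), and the message ids are invented.

```python
def thread_root(message_id, reply_of):
    """Follow is_response_of links upwards until a message with no parent.

    reply_of maps message_id -> is_response_of (None if not a reply).
    If a parent is missing from the mapping, the missing id is returned,
    since it is the earliest message known for the thread.
    """
    seen = set()
    current = message_id
    # The 'seen' set guards against accidental cycles in the data.
    while reply_of.get(current) is not None and current not in seen:
        seen.add(current)
        current = reply_of[current]
    return current


# Invented example: <3> replies to <2>, which replies to <1>.
reply_of = {"<1@example.org>": None,
            "<2@example.org>": "<1@example.org>",
            "<3@example.org>": "<2@example.org>"}
root = thread_root("<3@example.org>", reply_of)
```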

* messages people This table establishes the relationship between email addresses and messages.

| Name | Data Type | Description | Key |
| message_id | VARCHAR(255) | Id of the message where that person appears | PK |
| people_ID | VARCHAR(255) | People unique identifier | PK |
| type_of_recipient | ENUM | Either To, Cc or Bcc | PK |

* people This table contains a record for each person who has written a message to the mailing list, or who at least appears as a recipient of a message sent to the mailing list.

| Name | Data Type | Description | Key |
| people_ID | INTEGER | People unique identifier | PK |
| email_address | VARCHAR(255) | Email address of the person | |
| name | VARCHAR(255) | Name (if it appears in the header) | |
| user_name | VARCHAR(255) | The first part (before the @) of the email address | |
| domain_name | VARCHAR(255) | The second part (after the @) of the email address | |
| top_level_domain | VARCHAR(255) | Top level domain of the email address (.com, .org, .es, etc.) | |
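The last three columns can all be derived from email_address. A minimal sketch of that split follows; `split_address` is a hypothetical helper, and returning the top level domain without its leading dot is an assumption, since the description does not say how it is stored.

```python
def split_address(email_address):
    """Split an address into (user_name, domain_name, top_level_domain)."""
    if "@" not in email_address:
        raise ValueError("not an email address: %r" % email_address)
    # Split on the last '@' so that the domain is always the final part.
    user_name, _, domain_name = email_address.rpartition("@")
    # Assumption: the top level domain is stored without the leading dot.
    top_level_domain = domain_name.rpartition(".")[2]
    return user_name, domain_name, top_level_domain


parts = split_address("alice@lists.example.org")
```

Here `parts` is `("alice", "lists.example.org", "org")`.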

* tool info Table to store information about the retrieval process such as the tool executed, its version and the creation date.

| Name | Data Type | Description | Key |
| project | VARCHAR(255) | Project name | PK |
| tool | VARCHAR(255) | Name of the tool | |
| tool_version | VARCHAR(255) | Tool version | |
| datasource | VARCHAR(255) | Location of the datasources | |
| datasource_info | TEXT | Access parameters to the datasources | |
| created_date | DATETIME | Date of the database creation | |
| last_modification | TIMESTAMP | Date of the last modification of the database | |
 
mls.txt · Last modified: 2009/12/10 19:39 by matinte
 
Except where otherwise noted, content on this wiki is licensed under the following license:CC Attribution-Share Alike 3.0 Unported