* MLS MailingListStats
MalingListStats is capable of parsing electronic mail messages from different mboxes and stores the results in a SQL database. The data stored is basically extracted from the headers of the massages and specifically corresponds to the senders and receivers, the subjects, the send and receive dates and the content of the messages. It also stores information related to the mailing list such as the number of received messages and the people that write to the list.
The tables schema is as shown:
All the fields of these tables are also described in a socio-economic context in MLS_METRICS
* compressed files This table contains a register for each archive file that has been down- loaded, or tried to download.
| Name | Data Type | Description | Key |
|---|---|---|---|
| url | VARCHAR | URL of the file | PK |
| mailing_list_url | VARCHAR | URL of the web archives of the mailing list where this file belongs to | FK |
| status | ENUM | Either visited, new or failed | |
| last_analysis | DATETIME | Date and time of the last analysis of this time |
* mailing lists This table contains a register for each different mailing list analysed.
| Name | Data Type | Description | Key |
|---|---|---|---|
| mailing_list_url | VARCHAR(255) | URL of the archives web page | PK |
| mailing_list_name | VARCHAR(255) | Name of the mailing list, as it appears in the headers of the messages | |
| project_name | VARCHAR(255) | Name of the software project were this list belongs to. Taken from the email address of the mailing list | |
| last_analysis | DATETIME | Date and time of the last analysis performed on this mailing list |
* mailing lists people This table joins the table mailing lists and people, making possible to search for people grouping by different mailing lists.
| Name | Data Type | Description | Key |
|---|---|---|---|
| people_ID | INTEGER | People unique identifier | PK |
| mailing_list_url | VARCHAR(255) | URL of the mailing list archives web page | PK |
* messages This table contains a register for each message in the mailing list archives. It contains all the information in the headers plus the message itself.
| Name | Data Type | Description | Key |
|---|---|---|---|
| message_id | VARCHAR(255) | Unique identifier assigned by the mailing list manager | PK |
| mailing_list_url | VARCHAR(255) | URL of the archives web page of the mailing list | FK |
| mailing_list | VARCHAR(255) | Name and address of the mailing list | |
| first_date | DATETIME | Local date written in the message by the original sender | |
| first_date_tz | INTEGER | Time zone of the above date | |
| arrival_date | DATETIME | Local time of the server that received the message | |
| arrival_date_tz | INTEGER | Time zone of the above date | |
| subject | VARCHAR(255) | Subject of the message | |
| message_body | TEXT | Main text of the message | |
| mail_path | TEXT | Mail path | |
| is_response_of | VARCHAR(255) | If this message is a reply of another, this is the id of the original message | FK |
* messages people This is a table establish the relationship between email addresses and messages.
| Name | Data Type | Description | Key |
|---|---|---|---|
| message_id | VARCHAR(255) | Id of the message where that person appears | PK |
| people_ID | VARCHAR(255) | People unique identifier | PK |
| type_of_recipient | ENUM | Either To, Cc or Bcc | PK |
* people This table contains a register for each one of the people who has written a message to the mailing list, or at least appears as destination in a message that has been sent to the mailing list.
| Name | Data Type | Description | Key |
|---|---|---|---|
| people_ID | INTEGER | People unique identifier | PK |
| email_address | VARCHAR(255) | Email address of the person | |
| name | VARCHAR(255) | Name (if appears in the header) | |
| user_name | VARCHAR(255) | The first part (before the @) of the email address | |
| domain_name | VARCHAR(255) | The second part (after the @) of the email address | |
| top_level_domain | VARCHAR(255) | Top level domain of the email address (.com, .org, .es, etc) |
* tool info Table to store information about the retrieval process such as the tool executed, its version and the creation date.
| Name | Data Type | Description | Key |
|---|---|---|---|
| project | VARCHAR(255) | Project name | PK |
| tool | VARCHAR(255) | Name of the tool | |
| tool_version | VARCHAR(255) | Tool version | |
| datasource | VARCHAR(255) | Location of the datasources | |
| datasource_info | TEXT | Access parameters to the datasources | |
| created_date | DATETIME | Date of the database creation | |
| last_modification | TIMESTAMP | Date of the last modification of the database |