This database contains all the databases of libre software projects generated with MailingListStat tool.
The tables schema is as shown:
* projects This table contains general information about projects downloaded using the MLStats tool.
| Name | Data Type | Description | Key |
|---|---|---|---|
| project_ID | INTEGER | Project unique identifier | PK |
| name | VARCHAR(255) | Project name | |
| dbname | VARCHAR(255) | Database name |
* datasource Table to store information about the retrieval process such as the tool executed, its version, the creation date and other.
| Name | Data Type | Description | Key |
|---|---|---|---|
| datasource_ID | INTEGER | Datasource unique identifier | PK |
| project_ID | INTEGER | Project identifier | |
| tool | VARCHAR(255) | Name of the tool | |
| tool_version | VARCHAR(255) | Tool version | |
| datasource | VARCHAR(255) | Location of the data sources | |
| datasource_info | TEXT | Access parameters to the data sources | |
| creation_date | DATETIME | Date of creation of the database | |
| last_modification | DATETIME | Date of the last modification of the database |
* mailing_lists_messages This table contains the relationship between the table projects, mailing list and messages. Exist messages that belong to differents projects. It con- trols and stores data.
| Name | Data Type | Description | Key |
|---|---|---|---|
| datasource_ID | INTEGER | Datasource identifier | PK |
| mailing_list_url | VARCHAR(255) | Mailing list url identifier | PK |
| message_ID | VARCHAR(255) | Message identifier | PK |
| mailing_list | INTEGER | Mailing list identifier |
* compressed files This table contains a register for each archive file that has been down- loaded, or tried to download.
| Name | Data Type | Description | Key |
|---|---|---|---|
| datasource_ID | INTEGER | Datasource identifier | PK |
| url | VARCHAR | URL of the file | PK |
| mailing_list_url | VARCHAR | URL of the web archives of the mailing list where this file belongs to | FK |
| status | ENUM | Either visited, new or failed | |
| last_analysis | DATETIME | Date and time of the last analysis of this time | |
* mailing lists This table contains a register for each different mailing list analysed.
| Name | Data Type | Description | Key |
|---|---|---|---|
| datasource_ID | INTEGER | Datasource unique identifier | PK |
| mailing_list_url | VARCHAR(255) | URL of the archives web page | PK |
| mailing_list_name | VARCHAR(255) | Name of the mailing list, as it appears in the headers of the messages | |
| project_name | VARCHAR(255) | Name of the software project were this list belongs to. Taken from the email address of the mailing list | |
| last_analysis | DATETIME | Date and time of the last analysis performed on this mailing list |
* mailing lists people This table joins the table mailing lists and people, making possible to search for people grouping by different mailing lists.
| Name | Data Type | Description | Key |
|---|---|---|---|
| datasource_ID | INTEGER | Datasource identifier | PK |
| people_ID | INTEGER | People unique identifier | PK |
| mailing_list_url | VARCHAR(255) | URL of the mailing list archives web page | PK |
* messages This table contains a register for each message in the mailing list archives. It contains all the information in the headers plus the message itself.
| Name | Data Type | Description | Key |
|---|---|---|---|
| message_id | VARCHAR(255) | Unique identifier assigned by the mailing list manager | PK |
| mailing_list_url | VARCHAR(255) | URL of the archives web page of the mailing list | FK |
| mailing_list | VARCHAR(255) | Name and address of the mailing list | |
| first_date | DATETIME | Local date written in the message by the original sender | |
| first_date_tz | INTEGER | Time zone of the above date | |
| arrival_date | DATETIME | Local time of the server that received the message | |
| arrival_date_tz | INTEGER | Time zone of the above date | |
| subject | VARCHAR(255) | Subject of the message | |
| message_body | TEXT | Main text of the message | |
| mail_path | TEXT | Mail path | |
| is_response_of | VARCHAR(255) | If this message is a reply of another, this is the id of the original message | FK |
* messages people This is a table establish the relationship between email addresses and messages.
| Name | Data Type | Description | Key |
|---|---|---|---|
| message_id | VARCHAR(255) | Id of the message where that person appears | PK |
| people_ID | VARCHAR(255) | People unique identifier | PK |
| type_of_recipient | ENUM | Either To, Cc or Bcc | PK |
* people This table contains a register for each one of the people who has written a message to the mailing list, or at least appears as destination in a message that has been sent to the mailing list.
| Name | Data Type | Description | Key |
|---|---|---|---|
| people_ID | INTEGER | People unique identifier | PK |
| email_address | VARCHAR(255) | Email address of the person | |
| name | VARCHAR(255) | Name (if appears in the header) | |
| user_name | VARCHAR(255) | The first part (before the @) of the email address | |
| domain_name | VARCHAR(255) | The second part (after the @) of the email address | |
| top_level_domain | VARCHAR(255) | Top level domain of the email address (.com, .org, .es, etc) |