Canadian Hansard Live Database

R
SQL
A ready to use SQL database that we can consult with dplyr.
Author

Mauricio “Pachá” Vargas S.

Published

July 27, 2023

Motivation

The original LiPad site provides a SQL dump that requires you to configure PostreSQL on your own laptop or server.

Because some classmates asked me about it, I decided to host the database to simplify its access.

Description

Databases.pacha.dev provides a complete the PostgreSQL database of the Canadian Hansard dataset, including supplementary data. It covers the Canadian Hansard from 1901 to 2019 and doesn’t incldue modifications to LiPad contents, it only saves you the step of configuring PostgreSQL.

Usage

You can adapt this code for R. Please see my post about the census to get an idea of other functions you can use with SQL.

library(dplyr)
library(RPostgres)

con <- dbConnect(
  Postgres(),
  user = Sys.getenv("dbedu_usr"),
  password = Sys.getenv("dbedu_pwd"),
  dbname = "canadianhansard",
  host = "databases.pacha.dev"
)

dbListTables(con)
 [1] "andablog_entryimage"         "contactme_contact_msg"      
 [3] "dilipadsite_datenav"         "dilipadsite_constituency"   
 [5] "dilipadsite_basehansard"     "auth_group"                 
 [7] "auth_group_permissions"      "auth_user_user_permissions" 
 [9] "auth_user_groups"            "better500s_caughterror"     
[11] "dilipadsite_basehansard_bak" "dilipadsite_blogger"        
[13] "auth_user"                   "auth_permission"            
[15] "dilipadsite_datenav_bak"     "dilipadsite_datepickle"     
[17] "dilipadsite_parlsess"        "dilipadsite_member"         
[19] "django_admin_log"            "django_migrations"          
[21] "django_content_type"         "dilipadsite_position"       
[23] "django_session"              "downtime_period"            
[25] "taggit_templatetags_amodel"  "taggit_taggeditem"          
[27] "taggit_tag"                  "andablog_entry"             
[29] "django_site"                 "dilipadsite_party"          
tbl(con, "dilipadsite_basehansard") %>% glimpse()
Rows: ??
Columns: 15
Database: postgres  [teacher@databases.pacha.dev:5432/canadianhansard]
$ basepk          <int> 2691572, 2691576, 514587, 14, 21, 4043, 514588, 3, 4, …
$ hid             <chr> "ca.proc.d.1970-11-13.28.1.89", "ca.proc.d.1970-11-13.…
$ speechdate      <date> 1970-11-13, 1970-11-13, 1918-05-20, 1901-02-06, 1901-…
$ pid             <chr> "7f5ea10f-f4c5-4350-b707-6c6522149e5f", "intervention"…
$ opid            <chr> "ca.m.3511", "", "ca.m.4097", "", "", "", "ca.m.4942",…
$ speakeroldname  <chr> "Mr. Cafik:", "", "Sir GEORGE FOSTER:", "", "", "", "M…
$ speakerposition <chr> "", "", "", "topic", "stagedirection", "", "", "topic"…
$ maintopic       <chr> "GOVERNMENT ORDERS", "GOVERNMENT ORDERS", "SUPPLY.-", …
$ subtopic        <chr> "PUBLIC ORDER (TEMPORARY MEASURES) ACT, 1970 PROVISION…
$ speechtext      <chr> "Let me point out to the hon. member for Matane that I…
$ speakerparty    <chr> "Liberal", "", "Conservative (1867-1942)", "", "", "",…
$ speakerriding   <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", ""…
$ speakername     <chr> "Norman Augustine Cafik", "An hon. Member:", "George G…
$ speakerurl      <chr> "http://www.parl.gc.ca/parlinfo/Files/Parliamentarian.…
$ subsubtopic     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
dbDisconnect(con)

Citing

For citation purposes, a full description of the project is available in the following article:

Beelen, K., Thijm, T. A., Cochrane, C., Halvemaan, K., Hirst, G., Kimmins, M., Lijbrink, S., Marx, M., Naderi, N., Rheault, L., Polyanovsky, R., and Whyte, T. (2017). “Digitization of the Canadian Parliamentary Debates.” Canadian Journal of Political Science, 50(3), 849–864. https://doi.org/10.1017/S0008423916001165.

Contributing

If you find this database to be useful, please consider donating on Buy me a Coffee. Additionally, I provide R and Python training services in English and Spanish language.