
Data from a randomized evaluation of student tracking in Kenyan primary schools, covering student test scores, demographic information, and class characteristics. The dataset includes 5,795 observations with endline and long-term follow-up test scores.

Usage

ddk2011

Format

A data frame with 5,795 observations and 76 variables.

pupilid

ID of student

schoolid

ID of primary school

district

District

bungoma

Indicator if school is located in Bungoma District

bungoma_num

Bungoma indicator as numeric variable

division

Division

zone

Zone

tracking

Indicator if school is sampled for tracking

tracking_num

Tracking indicator as numeric variable

sbm

Indicator if school is sampled for School-Based Management

sbm_num

SBM indicator as numeric variable

girl

Sex of student (1 if female, 0 otherwise)

girl_num

Girl indicator as numeric variable

agetest

Age of student at time of test

etpteacher

Indicator if student is assigned to a contract teacher

etpteacher_num

ETP teacher indicator as numeric variable

lowstream

Indicator if student is assigned to lower-ability section (in tracking schools)

lowstream_num

Lowstream indicator as numeric variable

stream_meanpercentile

Mean standardized percentile of classmates at baseline

sdstream_std_mark

Standard deviation of baseline scores within stream (including own score)

meanstream_std_mark

Mean of baseline scores within stream (including own score)

bottomhalf

Indicator if student is in the bottom half of initial distribution

bottomhalf_num

Bottomhalf indicator as numeric variable

tophalf

Indicator if student is in the top half of initial distribution

tophalf_num

Tophalf indicator as numeric variable

bottomquarter

Indicator if student is in the bottom quarter of initial distribution

bottomquarter_num

Bottomquarter indicator as numeric variable

secondquarter

Indicator if student is in the second quarter of initial distribution

secondquarter_num

Secondquarter indicator as numeric variable

thirdquarter

Indicator if student is in the third quarter of initial distribution

thirdquarter_num

Thirdquarter indicator as numeric variable

topquarter

Indicator if student is in the top quarter of initial distribution

topquarter_num

Topquarter indicator as numeric variable

std_mark

Student's standardized mark in baseline exam

percentile

Student's percentile in initial distribution

realpercentile

Student's percentile in initial distribution (integer values)

quantile5p

Student's quantile group at baseline when the initial distribution is split into 20 groups (5-percentile bins)

attrition

Indicator if student was absent for endline test (Fall 2006)

attrition_num

Attrition indicator as numeric variable

wordscore

Endline score on word recognition (max: 24)

sentscore

Endline score on sentence recognition (max: 40)

letterscore

Endline score on letter recognition (max: 70)

spellscore

Endline score on spelling (max: 10)

sentscore24

Rescaled endline score on sentence recognition (0-24 scale)

letterscore24

Rescaled endline score on letter recognition (0-24 scale)

spellscore24

Rescaled endline score on spelling (0-24 scale)

litscore

Total endline score on literacy test

additions_score

Endline score on additions section

substractions_score

Endline score on subtractions section

multiplications_score

Endline score on multiplications section

mathscoreraw

Total endline score on math test

totalscore

Total endline score

rmeanstream_std_baselinemark

Peers' mean score at baseline, excluding own score

rsdstream_std_baselinemark

Peers' standard deviation in baseline score, excluding own score

rmeanstream_std_total

Peers' mean total score at endline (Fall 2006)

rsdstream_std_total

Peers' standard deviation in total score at endline

rmeanstream_std_math

Peers' mean math score at endline

rsdstream_std_math

Peers' standard deviation in math score at endline

rmeanstream_std_lit

Peers' mean literacy score at endline

rsdstream_std_lit

Peers' standard deviation in literacy score at endline

r2_attrition

Indicator if student was absent at long-term follow-up test (Fall 2007)

r2_attrition_num

R2 attrition indicator as numeric variable

r2_age

Age of student at long-term follow-up test

r2_wordscore

Score on word recognition at long-term follow-up

r2_sentscore

Score on sentence recognition at long-term follow-up

r2_letterscore

Score on letter recognition at long-term follow-up

r2_spellscore

Score on spelling at long-term follow-up

r2_sentscore24

Rescaled score on sentence recognition at long-term follow-up

r2_letterscore24

Rescaled score on letter recognition at long-term follow-up

r2_spellscore24

Rescaled score on spelling at long-term follow-up

r2_litscore

Total literacy score at long-term follow-up

r2_mathscoreraw

Total math score at long-term follow-up

r2_additions_score

Score on additions section at long-term follow-up

r2_substractions_score

Score on subtractions section at long-term follow-up

r2_multiplications_score

Score on multiplications section at long-term follow-up

r2_totalscore

Total score at long-term follow-up
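
Several of the derived variables above follow simple formulas. The sketch below is a rough illustration on toy data, not the package's or the authors' code. It assumes that the 0-24 rescaling is linear (raw score times 24, divided by the section maximum) and that the peer statistics are leave-one-out values computed within a stream. The `toy` data frame, its `stream` column, the output columns `peer_mean` and `peer_sd`, and the helpers `rescale24()`, `loo_mean()`, and `loo_sd()` are hypothetical names introduced here.

```r
# Toy data: hypothetical values, not taken from ddk2011.
# `stream` is an assumed class identifier; ddk2011 lists no such column.
toy <- data.frame(
  stream      = c("A", "A", "A", "B", "B", "B"),
  std_mark    = c(-1.0, 0.0, 1.0, 0.5, 1.5, 2.5),
  sentscore   = c(20, 40, 30, 10, 0, 40),   # max: 40
  letterscore = c(35, 70, 14, 7, 0, 70),    # max: 70
  spellscore  = c(5, 10, 0, 2, 4, 10)       # max: 10
)

# Assumed linear rescaling of a raw score onto a 0-24 scale.
rescale24 <- function(score, max_score) score * 24 / max_score

toy$sentscore24   <- rescale24(toy$sentscore, 40)
toy$letterscore24 <- rescale24(toy$letterscore, 70)
toy$spellscore24  <- rescale24(toy$spellscore, 10)

# Leave-one-out mean: group sum minus own value, divided by group size minus 1.
loo_mean <- function(x, g) {
  (ave(x, g, FUN = sum) - x) / (ave(x, g, FUN = length) - 1)
}

# Leave-one-out standard deviation: sd of the group with own value dropped.
loo_sd <- function(x, g) {
  ave(x, g, FUN = function(v) {
    vapply(seq_along(v), function(i) sd(v[-i]), numeric(1))
  })
}

toy$peer_mean <- loo_mean(toy$std_mark, toy$stream)
toy$peer_sd   <- loo_sd(toy$std_mark, toy$stream)

print(toy)
```

With these toy values, a sentence score of 20 out of 40 maps to 12 on the 0-24 scale, and the first student in stream A has a peer mean of 0.5, the average of the other two baseline marks (0 and 1).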

Source

Duflo, E., Dupas, P., & Kremer, M. (2011). "Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya." American Economic Review, 101(5), 1739-1774. Data available at https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/16787