Opportunity Methodology

Opportunity Index

The following presents the methodology and indicators for the Connecticut opportunity analysis. To download the data, go here (for Z-score indexed data) or here (for original base data). For more on the use of Opportunity Mapping in Connecticut, see the websites for the Open Communities Alliance, the Kirwan Institute and the Connecticut Fair Housing Center.

What is opportunity?

For this analysis, opportunity is defined as environmental conditions or resources that are conducive to healthier, vibrant communities and are associated with success in life, defined in a variety of ways. Indicators could either be impediments to opportunity (which are analyzed as negative neighborhood factors, e.g., high neighborhood poverty) or conduits to opportunity (which are analyzed as positive factors, e.g., access to an abundance of jobs). 

To map opportunity in the region, we use variables that are indicative of high and low opportunity. High opportunity indicators include the availability of sustainable employment, high-performing schools, a safe environment, and safe neighborhoods. A central requirement of indicator selection is a clear connection between the indicator and opportunity. 

Data Sources

Spatial distribution of opportunity is based on indicators categorized under three sub areas of opportunity: 

  • Educational
  • Economic
  • Neighborhood/Housing quality 

The comprehensive opportunity map represents the combined score based on these three sub-areas. 

The 2009 Opportunity Index for Connecticut was created by the Kirwan Institute working in partnership with the CT Fair Housing Center, and was based on 10 variables from public data sources; details can be found in appendix B and C here.

This updated 2014 Opportunity Index for Connecticut was created by the CT Open Communities Alliance and contributors, and is based on 12 variables from similar public data sources, as described below.

Census data for neighborhood variables

The index uses census tracts as a proxy for 'neighborhoods,' which restricts the census variables to those reported as 5-year estimates from the American Community Survey. The acs.R package uses the Census API to download data by tract for the entire state for each of these seven variables. For this project, the 2008-2012 5-year estimates are reported, but the script could be updated for new years as data become available.

To ensure that all of the variables are oriented in the same 'direction' (more homeownership is 'good,' while more poverty is 'bad'), the public assistance, poverty, unemployment and vacancy rates are converted to their complements (i.e., 1 - rate), so that a higher value always indicates more opportunity.
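The reorientation step can be sketched as follows. This is a minimal illustration in Python rather than the R script used for the index; the function name `orient_positive` and the sample rates are hypothetical.

```python
def orient_positive(rate):
    """Return the complement (1 - rate) of a rate given as a proportion."""
    if rate is None:
        return None  # keep missing values missing
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"rate must be a proportion between 0 and 1, got {rate}")
    return 1.0 - rate


# Hypothetical tract-level rates, for illustration only.
tract_rates = {"poverty": 0.12, "unemployment": 0.08, "vacancy": 0.05}

# After conversion, a higher value means 'better' for every variable.
oriented = {name: orient_positive(rate) for name, rate in tract_rates.items()}
```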


Town data: test scores

The variables for math and reading test scores and job growth aren't publicly available at the neighborhood level. 

Math and reading scores for Connecticut are reported by the State Department of Education at the school and district level. Since many children do not attend neighborhood schools, even if data were readily available by neighborhood, they may not accurately represent the academic performance of students residing in that neighborhood. As a proxy, the index uses the average scale scores for the local school district in each town. Average scale scores take into account the performance of all students, not just those crossing a particular threshold. The Connecticut index uses 3rd grade reading and math scores as a standard milestone indicator for education.

A few smaller districts do not have 2013 reports for math and reading test scores, so the most recent year available was used instead. Cornwall and Union did not have data for any of the past seven years and thus don't report values for this variable. Scores for regional school districts were manually assigned to each town in that region, using the assignment here.

Town data: economic climate and crime

"Economic climate" was defined for the original Opportunity Index as "the change in jobs within 5 miles from 2005 to 2008," using data from ESRI Business Analyst. To avoid relying on proprietary data sources like Business Analyst, this index uses the Quarterly Census of Employment and Wages (QCEW) series from the Bureau of Labor Statistics. Data from this series are a direct census of employment from wage records, reported by town. The index uses 2009 to 2012 as the timeframe, the most recent available at the time of this update.

As in the prior Opportunity Index, the job change data have some outlier values, particularly for small towns (for example, Barkhamsted, where employment nearly doubled, from 616 to 1,145 people, over the three years). These are noticeable in the summary stats reported below, but their effect should be minimized when combined with the other index components, which are largely uncorrelated with this measure of economic climate.
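As a worked check of the Barkhamsted figures, the job change can be computed as a simple relative change. This is a sketch that assumes the index measures change as (end - start) / start; the function name is hypothetical.

```python
def pct_change(start, end):
    """Relative change from start to end, expressed as a proportion."""
    if start == 0:
        raise ValueError("start must be non-zero")
    return (end - start) / start


# Employment figures for Barkhamsted quoted in the text (2009 and 2012).
barkhamsted_change = pct_change(616, 1145)
```

The result is about 0.859, roughly in line with the maximum of 0.858 shown for this variable in the summary statistics.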

Crime rates are reported by local authorities to the Uniform Crime Reports database. Rates are calculated as the number of incidents in a town divided by the current population of the town. For this index, the 2010 crime rates are used, as the most recent readily available for the state.

Employment Access and Diversity Indices

The final two variables provide new measures for access to employment and the diversity of local job markets. Data for both of these indices are drawn from the Location Affordability Index (LAI). LAI values are reported at the block group level for metro areas in Connecticut, but for the Opportunity Index the metro-level results are combined and aggregated at the census tract level in order to combine with the other variables. 
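One way to roll block group values up to tracts is sketched below. It relies on the Census GEOID convention that the first 11 digits of a 12-digit block group GEOID identify the parent tract. The use of an unweighted mean is an assumption made for illustration (the text does not say how values were aggregated), and the GEOIDs and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_to_tract(block_group_values):
    """Average block group values within each census tract.

    Keys are 12-digit block group GEOIDs; the first 11 digits
    (state + county + tract) identify the parent tract.
    """
    by_tract = defaultdict(list)
    for geoid, value in block_group_values.items():
        if value is not None:  # skip missing block groups
            by_tract[geoid[:11]].append(value)
    # Assumption: unweighted mean; a population-weighted mean is another option.
    return {tract: mean(values) for tract, values in by_tract.items()}


# Hypothetical block group GEOIDs and index values.
sample_block_groups = {
    "090035001001": 20000.0,
    "090035001002": 30000.0,
    "090035002001": 12000.0,
}
tract_values = aggregate_to_tract(sample_block_groups)
```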

The methodology for calculating access to employment and jobs diversity is described more fully in the LAI documentation.

The employment access index replaces the average commute time variable from the previous Opportunity Mapping effort in Connecticut. The jobs access index measures potential access to jobs - indicating opportunity - rather than the actual commute times experienced by currently employed residents. The index is calculated as the number of jobs in a block group, divided by the squared distance to that block group - jobs that are closer to a given neighborhood are thus weighted more highly than jobs that are distant from that neighborhood. 
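The distance weighting described above resembles a gravity-style measure. The sketch below assumes that contributions (jobs divided by squared distance) are summed over all destination block groups and that a zero-distance destination is skipped; both choices, along with the coordinates and field names, are illustrative assumptions rather than the LAI implementation.

```python
import math


def employment_access(origin_xy, destinations):
    """Sum of jobs / distance^2 over destination block groups."""
    total = 0.0
    for dest in destinations:
        distance = math.dist(origin_xy, dest["xy"])
        if distance == 0:
            continue  # assumption: ignore the zero-distance case
        total += dest["jobs"] / distance ** 2
    return total


# Hypothetical destinations; coordinates are in arbitrary distance units.
destinations = [
    {"xy": (1.0, 0.0), "jobs": 100},  # 1 unit away: contributes 100
    {"xy": (0.0, 2.0), "jobs": 400},  # 2 units away: contributes 400 / 4 = 100
]
access = employment_access((0.0, 0.0), destinations)
```

In this example, four times as many jobs at twice the distance contribute the same amount to the index, which is the sense in which nearby jobs are weighted more heavily.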

The jobs diversity index looks at the correlation between 20 major job sectors - areas with higher concentration in a few sectors are reported as having lower diversity. For instance, in Connecticut, parts of Fairfield County with a high concentration of employment in finance and insurance are reported with relatively low levels of job diversity.

Results for components of the Opportunity Index

Below are summary stats for the components of the index:

Variable                                   Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's
% adults with college degree              0.142    0.518    0.666    0.646    0.780    1.000        6
% not receiving public assistance         0.202    0.850    0.944    0.884    0.974    1.000        8
% not in poverty                          0.000    0.860    0.936    0.890    0.967    1.000        8
% employed                                0.615    0.890    0.926    0.910    0.945    1.000        8
% living in owner-occupied housing        0.000    0.526    0.793    0.692    0.907    1.000        8
% housing that is not vacant              0.000    0.885    0.930    0.912    0.959    1.000        7
Employment access index                    3928    11162    19149    23248    29170   113840        6
Job diversity index                        1719     2219     2349     2468     2567     5053        6
% change in jobs (2009-12)               -0.367   -0.011    0.013    0.012    0.034    0.858        6
3rd grade math, avg. scale scores           212      239      255      254      271      298        7
3rd grade reading, avg. scale scores        208      228      240      241      257      279        8
Lack of crime (1 - rate)                  0.929    0.967    0.979    0.975    0.988    0.996        5


To visualize the results, we can map each variable for the state. Several variables, such as poverty, public assistance and unemployment, show similar patterns across tracts, while job growth and employment access are less similar. In each case, the darker shades of orange highlight areas doing 'better' on that variable.

[Figure: maps of each index component across the state]

The patterns in each map correspond to the distribution of values for these indicators across the state. 

Another way to see the same patterns is to plot the distribution for the components across the 833 census tracts in Connecticut. For instance, the map of employment access shows many areas of relatively low access to employment, with concentrations of higher access to jobs along the Metro North corridor and around Hartford and I-91. That concentration is reflected in the relatively unequal distribution plotted below. 

One can see that most variables do not have 'bell-curve' shaped distributions. Rather, several are skewed, which reflects the general concentration of poverty, public assistance and related variables in a small set of neighborhoods within the state.


[Figure: distribution of each index component across census tracts]


Calculating z-scores for the index

The distribution of values shown above is important because it directly influences how the index is calculated. The Opportunity Index uses z-scores to scale the component variables and calculate the index. 

Z-scores are a way to standardize data by reporting how many standard deviations an observation is from the average value. The interpretation of the z-scores depends on how the data are distributed. If data are distributed normally ('bell-curve' style), the z-scores can tell us roughly how much of the data is below or above a certain z-score. You can then also compare z-scores for different bell-curve-shaped data sets - the z-scores mean the same thing if the underlying distributions have the same shape. 
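In symbols, the z-score of an observation x is z = (x - mean) / sd. A minimal sketch of the calculation follows. It uses the sample standard deviation (n - 1 denominator), which is what R's sd() computes, but whether the index used the sample or population form is an assumption here. The function name and the handling of missing values are also illustrative.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values; missing entries (None) stay missing."""
    present = [v for v in values if v is not None]
    mu = mean(present)
    sigma = stdev(present)  # sample standard deviation (n - 1 denominator)
    return [None if v is None else (v - mu) / sigma for v in values]


# Hypothetical values: mean is 2 and sample standard deviation is 1.
example = z_scores([1.0, 2.0, 3.0])
```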

If the data are not normally distributed - if, for instance, they are skewed or there are multiple modes in the data - then the z-scores can be harder to interpret. And it's also harder to compare the z-scores across variables - a z-score of 2 for poverty doesn't mean the same thing as a z-score of 2 for reading test scores if they don't have the same-shaped distribution. 

This matters since the opportunity index is calculated using the average z-scores across all of the variables. The OECD guide to composite indicators notes that using z-scores means that "indicators with extreme values thus have a greater effect on the composite indicator." That can be an issue in a state with a high degree of inequality and concentration of poverty. If the variables have different distributions, then the z-scores will have different ranges and the z-scores won't have the same interpretation or influence on the final index values. 

The charts below show the standardized results for each variable, with z-scores between -2 and +2 standard deviations shown. Variables like poverty, public assistance and unemployment tend to have similar shapes and are skewed negative (left-skewed): there are many above-average tracts, but a long tail of tracts with below-average scores on these variables.


[Figure: distribution of z-scores for each index component]


We can then calculate the opportunity index as the average of the z-scores of the individual variables. The map below shows the updated index for the state. 
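The averaging step can be sketched as below. It assumes equal weights for all variables and, where a tract is missing a value, averages over the variables that are available; the treatment of missing values is an assumption, since the text does not describe it. Variable names and z-scores are hypothetical.

```python
from statistics import mean


def opportunity_index(z_by_variable):
    """Average z-scores across variables for each tract.

    z_by_variable maps a variable name to a list of z-scores,
    one per tract, all in the same tract order.
    """
    n_tracts = len(next(iter(z_by_variable.values())))
    index = []
    for i in range(n_tracts):
        row = [zs[i] for zs in z_by_variable.values() if zs[i] is not None]
        # Assumption: average over available variables; None if all are missing.
        index.append(mean(row) if row else None)
    return index


# Hypothetical z-scores for three tracts and two variables.
sample_z = {
    "not_in_poverty": [1.0, -1.0, None],
    "reading_scores": [0.0, 1.0, None],
}
sample_index = opportunity_index(sample_z)
```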

[Figure: map of the updated Opportunity Index for the state]

The Kirwan Institute mapping uses quintiles to color the maps, which means 1/5th of the tracts will fall into each color category. 

Another way to see this is to plot the distribution of the index values for the tracts, including the breakpoints. The chart below shows the breakpoints using the quintiles. (Again, the overall distribution is skewed.)


[Figure: distribution of index values with quintile breakpoints]

Using quintiles means that roughly 20 percent of the population will always live in high opportunity areas (since census tracts have roughly similar population).
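The quintile assignment can be sketched as follows. The choice of the "inclusive" quantile method and the rule that a value equal to a breakpoint falls in the higher category are assumptions made for illustration; the index values are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def quintile_labels(index_values):
    """Return (labels, cuts): a category 1-5 per value and the 4 breakpoints."""
    # "inclusive" treats the values as the full population of tracts.
    cuts = quantiles(index_values, n=5, method="inclusive")
    labels = [bisect_right(cuts, v) + 1 for v in index_values]
    return labels, cuts


# Hypothetical index values for ten tracts.
sample_values = [float(v) for v in range(1, 11)]
labels, cuts = quintile_labels(sample_values)
```

With ten evenly spaced values, each of the five categories receives two tracts, which illustrates why one fifth of tracts always lands in each color category regardless of how the index values are distributed.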

What is driving the Opportunity Index?

With a composite index, it helps to see if specific variables are playing more of a role in determining the final index values. 

As a start, the correlation matrix below shows that several of the variables (poverty, public assistance, etc.) are correlated with each other. Job change (economic climate) has almost no correlation with any of the other variables.


Variable                                      (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)
(1) % adults with college degree             1.00   0.77   0.65   0.69   0.66   0.30  -0.34  -0.23   0.05   0.69   0.70   0.53
(2) % not receiving public assistance        0.77   1.00   0.88   0.80   0.81   0.45  -0.60  -0.29   0.02   0.66   0.64   0.67
(3) % not in poverty                         0.65   0.88   1.00   0.73   0.84   0.56  -0.63  -0.35   0.02   0.64   0.62   0.65
(4) % employed                               0.69   0.80   0.73   1.00   0.66   0.35  -0.52  -0.20  -0.03   0.54   0.53   0.54
(5) % living in owner-occupied housing       0.66   0.81   0.84   0.66   1.00   0.49  -0.65  -0.35  -0.00   0.66   0.64   0.64
(6) % housing that is not vacant             0.30   0.45   0.56   0.35   0.49   1.00  -0.24  -0.27  -0.02   0.32   0.29   0.29
(7) Employment access index                 -0.34  -0.60  -0.63  -0.52  -0.65  -0.24   1.00   0.15   0.09  -0.48  -0.46  -0.67
(8) Job diversity index                     -0.23  -0.29  -0.35  -0.20  -0.35  -0.27   0.15   1.00  -0.05  -0.37  -0.36  -0.41
(9) % change in jobs (2009-12)               0.05   0.02   0.02  -0.03  -0.00  -0.02   0.09  -0.05   1.00   0.03   0.05   0.00
(10) 3rd grade math, avg. scale scores       0.69   0.66   0.64   0.54   0.66   0.32  -0.48  -0.37   0.03   1.00   0.96   0.77
(11) 3rd grade reading, avg. scale scores    0.70   0.64   0.62   0.53   0.64   0.29  -0.46  -0.36   0.05   0.96   1.00   0.76
(12) Lack of crime (1 - rate)                0.53   0.67   0.65   0.54   0.64   0.29  -0.67  -0.41   0.00   0.77   0.76   1.00
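Each entry in a matrix like this is a pairwise correlation coefficient. The sketch below computes a Pearson correlation for one pair of variables, dropping tracts where either value is missing; the use of Pearson correlation and pairwise-complete observations is an assumption, as the text does not name the method. The sample data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation using only pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    return cov / (spread_x * spread_y)


# Hypothetical values: a perfectly increasing and a perfectly decreasing pair.
r_up = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_down = pearson([1.0, 2.0, None, 3.0], [3.0, 2.0, 5.0, 1.0])
```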

Principal components analysis is another way to see the key factors that determine the final index. A principal components analysis of the index data shows that the first principal component dominates the results - explaining 56 percent of the overall variance in the data (first bar in the graph, first column in the table).

  PC1  PC2  PC3  PC4  PC5  PC6  PC7  PC8  PC9  PC10  PC11  PC12 
Standard deviation  2.5833  1.0526  0.9919  0.9760  0.8644  0.7534  0.5534  0.4528  0.4384  0.3793  0.2920  0.1875 
Proportion of Variance  0.5561  0.0923  0.0820  0.0794  0.0623  0.0473  0.0255  0.0171  0.0160  0.0120  0.0071  0.0029 
Cumulative Proportion  0.5561  0.6484  0.7304  0.8098  0.8721  0.9194  0.9449  0.9620  0.9780  0.9900  0.9971  1.0000 

[Figure: proportion of variance explained by each principal component]
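The "Proportion of Variance" row follows directly from the standard deviations: each component's variance (its standard deviation squared) is divided by the total variance. The sketch below reproduces the first entries from the reported standard deviations; the variable names are illustrative.

```python
# Standard deviations of the 12 principal components, from the table above.
sdev = [2.5833, 1.0526, 0.9919, 0.9760, 0.8644, 0.7534,
        0.5534, 0.4528, 0.4384, 0.3793, 0.2920, 0.1875]

variances = [s ** 2 for s in sdev]
total_variance = sum(variances)  # close to 12, one unit per standardized variable

proportion = [v / total_variance for v in variances]

# Running total of the proportions, matching the "Cumulative Proportion" row.
cumulative = []
running = 0.0
for p in proportion:
    running += p
    cumulative.append(running)
```

The first proportion comes out at about 0.556, matching the 56 percent quoted in the text.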

We can look at the weights for each of the variables in the first principal component in the chart below. This shows that job growth has little influence on the first component (weight close to 0), while job diversity and access to employment offset some of the other variables (positive weight). Poverty, public assistance and owner-occupied housing have the strongest weights. In other words, access to jobs and job diversity are counterbalanced by areas with high poverty and low home-ownership - which roughly matches the patterns in the maps of the index components above. 

[Figure: weights of each variable in the first principal component]

Not surprisingly, many of the same variables have very skewed distributions across Connecticut neighborhoods, and hence a more extreme range of z-scores to factor into the overall index.

Current as of 2/23/15

Methodology by Scott Gaul

  • Open Communities Alliance
  • 75 Charter Oak Avenue
  • Suite 1-210
  • Hartford, CT 06106
  • Phone: 860-610-6040