Using Standards to Make Big Data Analytics Work

Joshua Bleiberg and Darrell M. West

Big data refers to a family of statistical techniques that uncover patterns in large data sets.  On their own, correlations between individual data points have little substantive meaning.  To get actionable results, analytics designers must develop a theory of how students learn and map which data points allow for inferences about those skills.  Standards like the Common Core make big data analytics work because they support more rigorous models of student learning and enable larger data systems.  The resulting efficiency also lowers barriers to entry, which encourages greater competition and frees analytics designers to innovate.

Standards can help prevent overfitting in big data.  Overfitting occurs when analytics designers tweak a model repeatedly to fit the data and begin to interpret noise or randomness as truth.  The best strategy to prevent overfitting is to develop a theory of student learning, grounded in the best available research, before examining the data itself.  This helps ensure that substantive arguments support the model rather than a deceptive pattern in the data.  Standards can serve as the base of such a model because they specify both the skills students should master and the appropriate sequence for acquiring those competencies.  Much time and effort has gone into ensuring that Common Core skills are ones students will need to succeed.  Users of big data can therefore have confidence that analytics models based on the Common Core reflect signal in the data rather than noise.
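The overfitting risk described above can be illustrated with a small simulation.  The sketch below uses entirely hypothetical data (noisy points around a simple linear relationship, standing in for, say, study time and test scores) and compares a simple theory-driven model against an over-flexible one tweaked to chase every point.  The flexible model matches the data it was fit on more closely, precisely because it has absorbed the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a true linear relationship plus random noise.
x_train = np.linspace(0, 10, 20)
y_train = 2.0 * x_train + 5.0 + rng.normal(0, 3.0, size=x_train.size)
# Held-out points drawn from the same underlying relationship.
x_test = np.linspace(0.25, 9.75, 20)
y_test = 2.0 * x_test + 5.0 + rng.normal(0, 3.0, size=x_test.size)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train RMSE, test RMSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_rmse, test_rmse

simple_train, simple_test = fit_and_score(1)   # theory-driven: linear
complex_train, complex_test = fit_and_score(8)  # tweaked to fit every wiggle

# The flexible model always fits its own training data at least as well...
assert complex_train < simple_train
# ...but the fit it gains comes from modeling noise, not signal.
print(f"linear model:   train RMSE {simple_train:.2f}, test RMSE {simple_test:.2f}")
print(f"degree-8 model: train RMSE {complex_train:.2f}, test RMSE {complex_test:.2f}")
```

A model grounded in a prior theory, like the linear fit here, tends to hold up better on data it has not seen; the degree-8 model's extra flexibility buys a closer fit to the training points at the cost of interpreting randomness as structure.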

National standards like the Common Core allow analytics systems to make better inferences about detailed subgroups of students.  The Common Core includes only two assessments, which, assuming national adoption, would greatly reduce the number of tests.  It is technically easier to link data from separate states if they use the same test or an assessment aligned to the Common Core.  The size of the database has practical implications for the usefulness of the analytics.  Big data in education uses data sets much smaller than those in sectors like finance.  Larger numbers of students improve the accuracy of predictions about specific groups.  Even a state with millions of students may have only a few thousand who, for example, have repeated a grade and have a specific learning disability.  A larger sample can make an outsized difference in how big data helps these students.
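The sample-size point can be made concrete with a back-of-the-envelope calculation.  The numbers below are hypothetical (a subgroup of 2,000 students in one state versus 50,000 pooled across states through a shared assessment), but the underlying statistics are standard: the margin of error on an estimated rate shrinks with the square root of the sample size, so pooling a rare subgroup nationally tightens every estimate about that group.

```python
import math

# Hypothetical subgroup sizes: one state vs. a multi-state data system
# linked through a common assessment.
state_n = 2_000      # e.g. students who repeated a grade AND have a given disability
national_n = 50_000  # the same subgroup pooled across states

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an estimated proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

p = 0.30  # assumed rate of some outcome, e.g. passing a benchmark
state_moe = margin_of_error(p, state_n)
national_moe = margin_of_error(p, national_n)

# Error shrinks with 1/sqrt(n): 25x the students gives 5x tighter estimates.
assert abs(state_moe / national_moe - math.sqrt(national_n / state_n)) < 1e-9
print(f"state subgroup:    estimate {p:.0%} +/- {state_moe:.1%}")
print(f"national subgroup: estimate {p:.0%} +/- {national_moe:.1%}")
```

With 2,000 students the estimate carries roughly a two-percentage-point margin of error; pooled to 50,000 it falls below half a point, which is the difference between a vague signal and a prediction precise enough to act on for a specific group.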

Standards lower the barriers to entry for startups seeking to enter the personalized learning market.  National standards reduce the resources necessary to develop big data tools that are usable nationwide.  If each state has its own standards, then analytics creators need to develop 50 different tools.  Companies end up spending more time building redundant products than innovating.  The resource-intensive nature of developing big data analytics also scares off entrepreneurs considering new startups in the sector.

Taken together, standards can greatly improve the utility of big data analytics.  They help prevent overfitting, which leads to inaccurate predictions.  They enable larger data systems that improve predictions for specific groups of students.  Finally, they encourage new entrants into the field of big data and free them to innovate.  The Common Core will usher in the next generation of big data tools and transform classrooms across the country.

Interested in learning how standards impact education?  Read the recently released paper from Joshua Bleiberg and Darrell West.



Joshua Bleiberg

PhD student, Vanderbilt University. Former Research Analyst, The Brookings Institution.