LinkedIn Learning course: The Data Science of Educational Management and Policy, with Barton Poulson

- [Instructor] Baseball was the first major sport to go through a data revolution, as was so vividly described in Michael Lewis' book Moneyball and the movie of the same name. The reason baseball was first is because it has a long tradition, over 100 years, of counting everything. So, they had the data all wrapped up in nice, complete databases that allowed analysts to look for hidden value and hopefully get a competitive edge, and it's been enormously effective. Just ask the Boston Red Sox who, after an 86-year drought, have won the World Series three times since they adopted this data-intensive approach in late 2002. But what about education, and how does this apply to teaching, learning, and educational management? Well, data in education is kind of a different thing. There is some passive data from attendance and graduation rates. By passive, I mean you don't have to do anything extra to get the data, it's just there in the records, and that's the way baseball data is. They play the game, and it simply shows up in the records. On the other hand, in education, most data comes from standardized testing, and the problem is that test preparation and test administration are time-consuming, to put it mildly, and expensive. The average American student, for example, from pre-K to 12th grade will take about 112 mandatory standardized tests, and they're not evenly distributed across grades. When a student is in a testing year, they can spend up to 100 hours of class time preparing for the tests and another 50 taking them, which together is about 15% of the roughly 1,000 hours in the classroom each year. In addition, the cost associated with test administration can be as high as $800 per student per year, not counting the cost of lost instruction time. And so, yeah, this is a very active process to gather the data in education, and really, it's kind of intrusive. Now, it wouldn't be such a problem if it made a really big difference, like the data did for the Red Sox.
But more testing doesn't generally seem to be associated with more learning, so there's a really big mismatch there, and it lets you know that data in education is an area that really is ripe for some significant innovation, and data science can be one method that brings some new energy and utility to analysis in education. There are a few reasons for this. Number one, data science allows researchers to use more diverse data. Now, big data and data science are different things, but you may know that big data has the three Vs: volume, velocity, and variety, where variety refers to the different kinds of data. Not just a nice structured database, but free text, images, audio, and video. Data science methods allow educational researchers to bring in an enormous quantity of diverse datasets. Number two, it allows researchers to use a lot more passive data. Again, that means data that already exists and doesn't take any extra effort to collect. So, not just attendance records, but, say, a teacher's open remarks on exam grades, or things that people post online, or information about how much time students spend on the computer in a particular class, or how many times they have to repeat a quiz question. That's all data that can be incorporated if you have the right methods for using it in the analysis, and that's what data science makes possible. Number three, because data science has a very strong association with the business world in terms of e-commerce and commercial social media marketing, it has a tradition of a very strong focus on prediction. Trying not simply to describe what's happening or to know why it's happening, but really, what's going to happen next and what do we need to do? And also, a strong focus on ROI, or return on investment.
Now, I'm not saying that we're talking about financial investments in education, though there is an element of that, and that's part of the accountability equation of education. Mostly I'm asking, are you getting the most return for the time and the energy that you put into teaching or planning or managing? Data science has this focus on prediction and high-impact activities that could be very helpful in education. Also, because data science is focused on prediction, and because it's traditionally focused on individual consumers and trying to get them to take individual actions, it tends to take a very nuanced and individualized approach. You can bring in a lot of context information, you can bring in a lot of historical information about a single individual, and you can make recommendations and predictions that are specific to that one individual or to a micro segment, and that gives a lot of added utility in educational research. Finally, data science methods are designed to be updated rapidly and to do so at scale, at large volumes. This is very different from a standard research project that can take a year or two to conduct. The idea here is that you can update it possibly even every day. And as things change for a student or for a classroom or for a school district, the evaluations, the programs, and the recommendations can be updated in near real time. Now, I wanna mention a little bit about the actual practice of data science. Please remember, the focus of this course is not technical. I'm not here to show you how to actually conduct these things; we have other courses available for that. Instead, this is conceptual. It's an overview of what is possible in data science. I will mention that there are some common methods used in data science, in education and anywhere else. They start with standard, familiar regression models that are used in a lot of fields.
They're very powerful, they're very flexible, and I would always recommend that a person try using them. But there's a lot more that you can do than that. So, for instance, some of the best predictive research comes out of the use of Bayesian models, or models that use information from previous sources to get what's called a prior probability, which is then explicitly integrated into the new model to get updated probabilities. Also in data science, there are techniques like decision trees, and an ensemble or collection of decision trees that's called a random forest. These are nonparametric methods that work very, very differently from, say, regression models, but are also easy to interpret and can give really precise segments for decision-making and prediction. If you wanna get more sophisticated, it's possible to use neural networks and their variation, deep learning models, which have been enormously influential in data science recently. All of these fall under the general rubric of machine learning and artificial intelligence, but that doesn't mean that you always have to fire up a whole server farm and do something huge. Again, a regression model, which can be run on a single computer, is an excellent start. These are some of the methods that I will be referring to in this course, but again, I'm speaking conceptually here, so I just want you to be aware that these things exist, and when I talk about data science methods, this is often what I'm referring to. Taken all together, the idea here is that, as we apply data science to education, maybe we can bring the success and the creativity of the Moneyball approach to education. We can use data science to help plan curriculum, to allocate classrooms, to create schedules, to track the progress and engagement of students, to predict problems and intervene before they become serious issues, and to conduct more context-sensitive evaluations of educational programs.
And hopefully, all of this will allow students to spend more time learning, less time testing, allow teachers to spend more class time on activities that will have the highest learning impact, and allow schools to have more flexibility in working proactively to meet their community's needs. And in that way, we can bring the extraordinary success of the Moneyball approach from baseball to education.
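
To make the Bayesian idea from this video a little more concrete, here is a minimal sketch in Python of how a prior probability can be combined with new data to get an updated probability. It is an illustration only, not the method used in the course: the `BetaBelief` class, the choice of a Beta-Binomial model, and all of the numbers (a 70% on-time submission rate from earlier records, and 2 of 10 assignments submitted on time) are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about a success rate between 0 and 1.

    Hypothetical helper for illustration; not part of any library.
    """
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        """Expected success rate under the current belief."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "BetaBelief":
        """Conjugate Beta-Binomial update: add observed counts to the prior."""
        if successes < 0 or failures < 0:
            raise ValueError("counts must be non-negative")
        return BetaBelief(self.alpha + successes, self.beta + failures)


# Assumed prior: earlier records suggest roughly a 70% on-time submission
# rate, weighted as if it were based on 10 earlier observations.
prior = BetaBelief(alpha=7.0, beta=3.0)

# Assumed new data: this student submitted 2 of 10 assignments on time.
posterior = prior.update(successes=2, failures=8)

print(f"prior mean:     {prior.mean:.2f}")
print(f"posterior mean: {posterior.mean:.2f}")
```

The updated estimate sits between the prior expectation and the rate seen in the new data, and it moves further toward the data as more observations come in. How much weight to give the prior (here, 10 pseudo-observations) is a judgment call that this sketch simply assumes.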

- [Instructor] I was recently talking to a friend of mine who mentioned that a local performing arts company he knew must be doing very well if they were able to continue wasting money advertising with billboards. You see, billboards are inherently one-size-fits-all, or one message to everyone. And any modern marketer worth their fees can tell you that you just can't do that anymore. You need to adapt your message and your medium to your target as closely as you can. Microtargeting is the most effective way to reach your audience and engage them continually. On the other hand, curriculum planning in schools tends to work a little differently. There's a legacy there of one size fits all, where one curriculum is, at least, assumed to be able to reach all of the students. It also assumes that the students are functionally the same, that the same methods and the same messages will reach people in the same way. That may work if your students are incredibly uniform. Or maybe your students are so smart and so good that they'll do well even with a poorly adapted and designed curriculum. Of course, that's not what we're looking for, but it has a tradition in the United States. So, for instance, we've had national standards for education in the United States. No Child Left Behind was a federal law from 2002 to 2015, which brought some pretty rigid national standards and often strict punishments for underperforming teachers and schools. Now, No Child Left Behind was credited with some important improvements, but it also faced a constant barrage of criticism, and it led to some very creative cheating, not on the part of students, but on the part of teachers and schools, who felt that they would get unfairly punished. So it was a problematic system, and part of the problem was the one-size-fits-all approach. For instance, students in Special Education were expected to be held to the same standards as students in mainstream classrooms, which was unrealistic.
So, not surprisingly, No Child Left Behind didn't last forever, and it was replaced with the federal Every Student Succeeds Act, or ESSA, which was implemented in the 2017-2018 school year. One of the important things about this act is that it allowed a lot more flexibility and involvement from both states and local school boards. So this was an attempt to take what was a one-size-fits-all approach and give it at least some ability to customize and adapt. Now, in terms of data science, this is where you get the idea of microtargeting in curriculum design, borrowing a term from e-commerce. The idea here is that more data is available about students and schools than ever before, and it comes from more sources. If you're thoughtful about the way you gather data, you can have some incredible resources available to you. And when you do this, it allows you to set more nuanced goals. It allows you to adapt the content delivery, and it allows you to change the way that you do assessment and tailor it to the individual students. It's data science methods that allow this level of individualization, the same way that it has been done in social media marketing and in e-commerce. Those methods that have been used so effectively there can be adapted to curriculum design as well. So, let's take one example of this, and that's individual learning plans. Now, an individual learning plan, or ILP, is a collaborative document or contract that is based on a student's unique abilities and their own personalized goals, and it's a wonderful thing to do, because it allows you to adapt things to that student in particular. On the other hand, it involves a lot of people: the teacher, the student, the parents, the case workers, the therapists, the school counselors, and potentially other people. Not surprisingly, that makes it very time-intensive. It takes a lot of time to create the ILP, and revising it becomes a really gargantuan task.
And so, one important question is, given how useful ILPs are, is there some way that data science can be used to help with ILP creation at scale? It could handle the ILP in whole, or even just elements of it, to make some of the data gathering and processing easier. And this is where we also get to the question of applying data science. There are a few things that you can do with data science in curriculum planning. Number one, use more diverse data to assess both student performance and interest. Not just learning as defined by various measures, but even simple things like attendance and assignment submission. If you think about it, so many major developments in data science have been brought about by using these peripheral data sources to get important insights. Really, all you have to think about is how informative the metadata from cell phones is, even if that metadata is used for other purposes. You can tell a lot from it. In the same way, you can use some of this peripheral and passive information about student engagement, especially through a digital medium, to assess their learning and involvement. Second, you can use predictive analytics models to identify students who, for example, are at risk of getting overwhelmed and possibly falling behind, or students who are at risk of dropping out. That then lets you design an intervention early, assess its implementation, and adapt it as necessary. And then there are some wonderful examples of colleges and universities using data science for the simple task of scheduling rooms. I'm in charge of scheduling classes in my university department, and I gotta tell you, doing it manually is a nightmare.
Analytics offers the opportunity to make better use of the space that's available, because so many students do learn better in a face-to-face situation. Other universities have also used analytics productively to help students plan their courses for each semester and over the length of their time at the university. This can make things much more efficient and reduce some of the headaches and frustrations that are inherent in both designing a curriculum for the school as a whole and implementing it for the individual students. Now, there are a few advantages to using data science methods in curriculum design. Number one, again borrowing from the way that data science is used in e-commerce and social media marketing, is the ability to get constant passive feedback and iterate on the curriculum plan. One of the critical things here is the idea of passive feedback. That means you don't actually have to ask somebody to do an assessment. I have teenage children who, in middle school and high school, do an unbelievable amount of assessment. It's time-consuming and it's stressful. Passive feedback is where you use data that already exists, like, was the assignment selected, or how much time did they spend looking at a particular page? There's no extra effort required of the student or the teacher to gather this information, but it can be used to get iterative, or try-it-revise-it, curriculum plans for those individual students. One fascinating development in data science is genetic algorithms, or the class of evolutionary algorithms more generally, which allow you to create and test previously unknown, novel variations to address possible omissions and bias. This is used in engineering to come up with solutions that people would never have thought of on their own. It's also a way of getting you out of your own rut and helping you find things that you may have ignored completely, especially when you're dealing with a student who's in an unusual situation.
Sometimes having this automated thinking outside the box through a genetic algorithm can be critical. Also, data science models like random forests can address nonlinear growth. Not all cognitive and emotional development is in a straight line. Instead, there tend to be jumps and plateaus, jumps and plateaus, and maybe decreases. Many common methods in data science, random forests are one, neural networks are another, allow you to capture some of this nonlinear growth. Also, they're better able to identify exceptional cases: people who are doing very well, people who are doing very poorly, people who are outliers in terms of their combinations of characteristics. These exceptional cases are not well captured by standard linear methods like linear regression, which is very flexible, but data science methods can give you more insight into these cases that fall on the periphery. Finally, Bayesian methods, which include prior probabilities, can actually help you include other information, like a teacher's personal evaluation or an interview with the student or the parents. You can use that information as the prior probability in a model that is used to design the intervention and the curriculum for the student. And taken together, what all of these let you do is escape intellectual silos. The idea is that education and curriculum planning have long traditions, they've been around for hundreds, even thousands of years, but most educators are not familiar with data science or with the methods used in fields like e-commerce and social media marketing, where there's been enormous creativity and growth in capturing people's attention, getting them involved, and taking them through to the next step in whatever it is you're trying to do. These same methods can be used in education and curriculum planning. The idea is that data science methods from these digital fields can enrich educational planning and the execution and evaluation of courses and entire curricula.
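
As a rough illustration of the point about jumps and plateaus, the Python sketch below fits a straight line and a single-split decision tree (a "stump") to the same made-up scores and compares their errors. Everything in it is an assumption for the example: the weekly scores are invented, and `fit_line` and `fit_stump` are hypothetical helper names. A single stump is far simpler than a random forest, which averages many deeper trees built on resampled data, but it shows why tree-based methods can follow a jump that a line smooths over.

```python
from statistics import mean


def sse(actual, predicted):
    """Sum of squared errors between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_stump(xs, ys):
    """One-split regression tree: pick the threshold with the lowest SSE."""
    best = None
    ordered = sorted(set(xs))
    # Candidate thresholds are midpoints between neighboring x values.
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sse(left, [left_mean] * len(left))
                 + sse(right, [right_mean] * len(right)))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


# Made-up scores: a plateau, a jump between weeks 4 and 5, another plateau.
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [50, 51, 50, 51, 70, 71, 70, 71]

line = fit_line(weeks, scores)
stump = fit_stump(weeks, scores)

line_error = sse(scores, [line(w) for w in weeks])
stump_error = sse(scores, [stump(w) for w in weeks])
print(f"linear SSE: {line_error:.1f}")
print(f"stump SSE:  {stump_error:.1f}")
```

On this toy data the stump's error comes out much lower than the line's, because it can represent "flat, then a jump, then flat" directly. With real student data the comparison would depend on how growth actually unfolds, so this is a sketch of the idea rather than evidence for it.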
