TSDI: Unit 4 Expert Panel

Video Transcript

Hello, I'm Hollylynne Lee and I'm here again with my good friends and colleagues, Susan, Webster and Chris. In this unit we're really diving into a resource called Census at School, which, here in the United States at least, is housed by the American Statistical Association. There are also international sites, housed in various locations, that collect data from students, and teachers have access to all of the data that have been collected over the years. So have any of you used the Census at School resource, and how have you used it in your classes?


Well, I haven't actually used it in my classes, but I'm very familiar with it as a resource. It's an interesting data resource in my opinion. So, at least here in the States, I've done a few webinars with the American Statistical Association on using Census at School, and I have some ideas about that. But one of the things that you understand very quickly when you start working with Census at School is that data can be overwhelming, right? I mean, you can get a lot of information of some questionable value sometimes. So as I've looked at this data, particularly the quantitative variables that are in the data, it's...it really reinforces that idea of cleaning the data. It's...that's something that's very real that you have to do almost always in practice. So if you just try to do a histogram of some of these variables, you'll see very quickly that there are some values that don't even make sense, that are completely obscuring your ability to look at the rest of the data, and you have to start the filtering process, and you have to sort of slowly filter this down until you can get something that's reasonable.


So for example, I know one of the variables is the number of text messages you sent yesterday, something like that, and I have seen that data set where there is an answer of 10,000, you know, a student put in 10,000. That's...in a day...I don't even think that's physically possible.


I don't know, there's some very.... 


Yeah, so that's probably somebody with a sense of humor. 


Right. 


Mistypes or mistyped, yes.


And they're in no way typical of anyone else, right? So, as an outlier, you don't want to let that person have an impact on what you're saying about the rest of the data.


Right. 


And that's where that filtering process becomes very, very important. 
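The filtering step the panelists describe can be sketched in a few lines of Python. This is only a minimal illustration, not the panel's own workflow: the responses below are invented rather than taken from Census at School, and the cutoff of 1,000 messages (MAX_PLAUSIBLE) is an assumed plausibility limit that a class would need to justify for itself.

```python
import statistics

# Hypothetical "text messages sent yesterday" responses.
# These numbers are invented for illustration, not real survey data.
texts_sent = [0, 5, 12, 20, 25, 30, 45, 60, 80, 150, 10000]

# Assumed plausibility cutoff; choosing it is a judgment call.
MAX_PLAUSIBLE = 1000


def filter_plausible(values, upper=MAX_PLAUSIBLE):
    """Split values into those kept and those flagged as implausible."""
    kept = [v for v in values if 0 <= v <= upper]
    flagged = [v for v in values if not (0 <= v <= upper)]
    return kept, flagged


kept, flagged = filter_plausible(texts_sent)

# Compare the mean with and without the flagged responses.
print("flagged responses:", flagged)
print("mean before filtering:", statistics.mean(texts_sent))
print("mean after filtering:", statistics.mean(kept))
```

Keeping the flagged values in their own list, rather than silently dropping them, leaves room for the discussion of when deleting an observation is actually defensible.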


Yeah, I agree. 


I've used Census at School with a course that I teach at the University of Georgia that is for high school teachers, and I usually have a mixture of pre-service teachers as well as in-service teachers in this class. One of the main reasons I expose them to this data set is what Webster just indicated: it's not a clean data set. It is an accessible data set that I think they could maybe develop some tasks from or use in their classrooms, because it has relevant questions for their students, but at the same time, the beauty of using that data set is that it helps students see that in the real world this is the way most data sets look. They're not clean, they're messy, and you need to learn how to clean them up, how to filter things. When is it permissible to delete an observation of 10,000 text messages, and when is it not permissible to delete an observation? So I think that's one of the nice things about this data set: it gives real-world exposure to how data sets look.


Yeah, so I think we teach a very similar course, and so, one of the ways that I use it in my course for teachers is they...we first look at the survey that all the students take, and so they get to understand the kind of questions that are asked and where the data comes from. So we talk about all the different measurement issues, and we're able to think about the kind of variables that you're going to get from this question: what type of output are we going to get, what kind of variation might you expect, might you anticipate? So there's some anticipation there. And then I get them engaged in, so if we download a sample of this data set and we have access to all these different variables and you look through these different survey questions, what questions might you pose? And we go back to this variable. So we're going to do a statistical investigation where we use some subset of this data: what questions are you interested in, and can we pose a good statistical question that this data set would allow us to answer? And so, that's one way that I get my students engaged, and they work in groups, they pose questions, and then, they're...they're...each group is actually working to answer a different question. So they're not all working on the same question, and then they actually have to do their investigation and report their results and share that with the class, and kind of convince their classmates that they have done a thorough enough job of the investigation in supporting their argument. So, that's one way that I've used it with teachers.


One other way: oftentimes in statistics, when we show comparing groups, for example, we always find those examples where there is a significant difference, right? Well, if you work with the Census at School data, you'll see that in the real world, yeah, significant differences don't happen all that often, and I think that's actually valuable, you know, so a student shouldn't expect, you know, "My p-value should be low; it's not, so there's something wrong." So, you could easily use that data to show many examples of where things aren't statistically significant, and maybe find one or two where they are.
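A group comparison that turns out not to be significant can be illustrated with a small randomization (permutation) test. The sketch below is a hedged example under stated assumptions: the two groups of heights are invented, the function name permutation_p_value is hypothetical, and the 2,000 shuffles and fixed seed are arbitrary choices made only so the run is repeatable.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Invented heights (cm) for two hypothetical groups of students.
heights_a = [160, 165, 158, 170, 162, 168, 159, 171]
heights_b = [161, 166, 157, 169, 163, 167, 160, 172]

p_value = permutation_p_value(heights_a, heights_b)
print("estimated p-value:", p_value)
```

With groups this similar, the shuffled differences are usually at least as large as the observed one, so the estimated p-value is large. Nothing has gone wrong in that case; the data simply do not show a difference.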

Throughout this MOOC we have all mentioned technology. We're all very avid users of technology, and we really see it as playing a central role in what we do in statistics. So, when we're using technology to analyze our data and to help us interpret our results, what does it bring to the table? What kinds of affordances and constraints does technology present for us when we're in this phase of a statistical investigation?


I'll start with a couple of observations because I always go back. I have been around quite a while, and many, many years ago, the first piece of software I ever used for statistics was something called "Cricket Graph". Did any of you ever know that? I love Cricket Graph, it was the simplest graphing software I'd ever seen, and I think I was writing curriculum then, and then later I taught a course where people at Bolt, Beranek and Newman were developing a piece of software and we tried that one too. But the thing that struck me when I used it, as simplistic as it was, was that you were learning how to make bar graphs, but I was learning about the structure of graphs in a way I had never learned about when I did it by hand. What's the scale, are you going to do 2D, 3D, I mean, all these different decisions you make, and then I started thinking about, well, the impact on learning. And at that time, I was also getting into software for mathematics too, and my feeling was that the software built a level of depth and comprehension of understanding in ways that you could not do without using technology. And so, now, I helped...I was on the advisory committee for the development of TinkerPlots. I think that's a brilliant piece of software in the sense that kids have data objects they mess around with, and they move them and they put them into different shapes, they express their representations any way they want, and they have to make sense of it. And I've always felt it's probably one of the most brilliant pieces of software, particularly for upper elementary and middle grades, in terms of kids making sense of what data is all about before you ever even go into the Fathoms or other kinds of analysis.


That's right, they can create very non-standard and informal visualizations. 


And they do, they do these amazing interpretations of things like that as well. And what they're learning is a whole different way to think about data analysis and statistics at a much deeper level, what I call the "visceral gut level" that they will never lose, and that then translates into the more abstract representations you might use with other kinds of software, at some point, but they need the intuitive building too, and we've never been able to do that before. Technology is our only real way to do that so. 


Well, I'm going to date my age now. 


Okay. 


So, I'm finishing... 


We've all got the gray hair showing, so you're among friends.


So, I'm in my 35th year of teaching at the collegiate level, and I often say that when I think back about my first few years of teaching the Intro Stat course, I am so embarrassed at the curriculum that I delivered to my students. It was definitely more the computational, procedural type of curriculum; that was the way I always taught, that was...but it was in the 1980s that I became exposed to statistical software, and my first really big experience with software was with Minitab, when I was on faculty at the University of Florida. And this is when we still had line commands, you know, which I still find myself, when I use Minitab, like, can I go back and use my line commands instead of pull-down menus? But I think the beauty that I found with Minitab, and this was really a turning point in how I taught my courses, as well as using Statistics, 1st Edition by David Freedman, I have to say, that is a fantastic book.


That's a good book, yes. 


I tell my teachers that you should have that book on your bookshelves, Statistics, 1st Edition by David Freedman, but with the ability to simulate that I found with Minitab. And so, this whole idea of teaching statistical topics such as sampling distributions, or trying to develop the idea of what we mean by margin of error with, you know, confidence intervals, or p-value, through...even though it was a bit clunky with Minitab, it still changed the way I taught. And I think back, I think to where we are now, and it's just truly amazing, I mean, the ability of our software for just visual displays, as you mentioned, the art part of statistics, but also the ability to simulate, for students to just see what's going on. It's not...it's not just something magical; they can actually experience it. And I think from a K-12 level, technology has allowed us to truly introduce statistics at the K-12 level because we're able to teach sophisticated concepts like sampling distributions, like p-values, like margins of error, at the high school level.
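The kind of simulation mentioned here, building a sampling distribution by drawing many samples, can be sketched as follows. This is a minimal Python illustration, not a reconstruction of the Minitab activities being described: the population is made up of random integers, and the sample size of 25, the 1,000 repetitions, and the function name simulate_sample_means are all assumptions chosen for the example.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Invented population: 10,000 values between 0 and 100.
population = [rng.randint(0, 100) for _ in range(10_000)]


def simulate_sample_means(population, sample_size, n_samples, rng):
    """Draw many random samples and record the mean of each one."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


sample_means = simulate_sample_means(population, sample_size=25, n_samples=1000, rng=rng)

# The sample means cluster around the population mean,
# with much less spread than the individual values have.
print("population mean:", statistics.mean(population))
print("mean of sample means:", statistics.mean(sample_means))
print("population SD:", statistics.stdev(population))
print("SD of sample means:", statistics.stdev(sample_means))
```

Plotting sample_means as a dot plot or histogram is what lets students see the sampling distribution take shape instead of accepting it as a formula.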


Right, in very intuitive ways. 


In very intuitive ways and as the Common Core is promoting, it's all through simulation, through randomization tasks. It's not the traditional probability distribution based inference that we're still teaching in our college intro courses. So technology, basically, has changed the way we teach. It's allowed us to teach courses to where we can help students become statistically literate now. We're not teaching them procedures any more. 


Yeah, yeah. 


So as you might guess I have very strong opinions about technology. 


I am sure, I know you do. 


I'm similar to Chris. I think back to when I was teaching 20 years ago and what I teach today, and it's very different, the approach is very different. I suspect many people are in that boat, but, you know, I think about the class I teach now and statistics in general, and it's impossible to teach it without technology. I mean, it's crazy, I mean, and...


And to take it as a given, it's a given. 


You don't use a calculator, okay, nobody does that. You know, that's one of my pet peeves: statistics is not something where you should think of pulling out a calculator, a handheld calculator, just to try to solve a problem. But so, you know, as my ideas about technology have developed over the years, I would say that I see really two ways that technology makes a big difference. You know, so I use StatCrunch in my courses, of course, and, you know, to me it's a learning tool in that I can put data in front of the students very quickly, okay, and have them looking at the data, thinking about the data. They can generate results quickly, you know, whether they be formula types of results or graphs. It allows us to look past the formulas and those types of things and focus more on interpreting what's there. Okay, so that's one way that software, in general, can help out. Then the other way: I think technology's had a big impact on the way that we teach certain concepts, in that we can develop these sort of interactive demonstrations that are focused, like on sampling distributions. So in StatCrunch we have an entire Applets menu, right, and really, that's the education side of the package, where you can pull one of those out and use that to teach statistical concepts. And that's something that we could never have, that did not exist prior to my...part of the 1990s, you know; when I first started teaching it wasn't available. And now I think they're incorporated almost everywhere.


Right, right. 


So that's another way, so just that working with data and being able to focus on interpretation rather than formulas and things of that nature. And then those specific things, there's certain concepts that I can't imagine teaching without, you know. And so, addressing that is important as well. 


And one of the things that I love whenever I use different software packages, no matter what it is, is the ability to actually...when I'm creating, I've got my data set, it might be represented in a table, it might be represented in data cards if I'm using something like TinkerPlots or Fathom, and I...when I create a graphical representation, I might create several graphical representations and I can...all of those are internally linked together. And so, if I select a data point in one particular graphical representation, I see where that case is represented in a different representation, and I see where it is in the table, and so I can go find out more about that case, and that might lead to more interesting questions. So the ability to kind of link among representations, I have found to be extremely powerful in the way that I teach my students and my teachers to explore their data, and it helps to kind of get them...they might think that they're still focusing on the question that they originally posed, but then new questions start popping up because they're able to actually think about it in different ways. And then I also really think that, when thinking about the conceptual understanding, the ability to be able to change a data point and see how that affects different measures, you know, we...computing that if I was doing it by hand, what's the point, you know, I mean, it would just take forever. But if I can change a data point and I can see how that affects the mean or the median, or I remove an outlier and I see how things shift, that's a real powerful learning tool, I think, for our students. So I think we've brought up some nice points.
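The idea of changing one data point and watching the mean and median respond can be shown in a few lines of Python. This is only a sketch with invented values, and the helper name summarize is hypothetical; packages like the ones mentioned here do this interactively by dragging a point.

```python
import statistics


def summarize(values):
    """Return the mean and median of a list of numbers."""
    return statistics.mean(values), statistics.median(values)


# Invented data set for illustration.
scores = [2, 4, 4, 5, 6, 7, 9]

# Change a single data point: replace the largest value with an extreme one.
changed = scores[:-1] + [90]

mean_before, median_before = summarize(scores)
mean_after, median_after = summarize(changed)

print("before: mean =", mean_before, "median =", median_before)
print("after:  mean =", mean_after, "median =", median_after)
```

In this example the mean is pulled toward the extreme value while the median stays where it was, which is the contrast students see when they move a point on screen.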


I think it's also important for teachers, because one of the things I hear a lot, especially from K-12 teachers, but it's true at the college level as well, is sometimes they say, "Well, I don't have access to these software packages, they cost money, I don't have the resources to get those." But there are so many wonderful free resources available on the internet, and Webster mentioned applets. I have found that using applets is one of the things I'm going more and more toward within my own classrooms, because then you don't have to rely on software. Students don't have to pay for it, and you don't have to have them in the computer lab. So trying to utilize resources that they can easily just access from their own laptops or their computers, it's much easier to do that now.


Yeah, yeah, and throughout this MOOC, we've actually introduced you to a lot. One of our core things is that we wanted to give you access to lots of different free tools that you can easily integrate into your classroom. So, thank you.