Reverse Chronological Order
What?!
I checked the date: October 15th. That's when I realized it had been a year. I was registered to take the November 2017 LSAT and had to request my refund by around this time last year...
Data Science
So it's finally over! I can come out of the Kaggle hole in which I've been hiding for the last month or so. It's been quite a little ride. When we left off, I was struggling to engineer some new features and was looking at ways to deal with the size of the data set, such as using an Easy Ensemble (which failed miserably). Since then, I had several little breakthroughs.
First, I started using a much better validation method...
Data Science
At long last, I decided to enter my first Kaggle contest. For the uninitiated, Kaggle hosts predictive data science competitions. For example, Zillow recently had a contest on Kaggle to improve their pricing algorithm. Prizes for the competitions can be pretty substantial (the Zillow prize pool was $1.2 million!).
As you can read about in my analysis of their survey, Kaggle is seen as a great resource for learning the tools of data science...
Data Science
"Baby steps" are how we get from A to B. We do the hard work of learning the details, spending hours on hiccups and chasing rabbits down holes. But I find it difficult to really post about baby steps. You all don't need to know about every little aspect of my data science education, and I don't have time to write about them. Reporting falls prey to the law of diminishing returns.
But the pressure to avoid reporting baby steps can overshoot the mark, leading to a desire to only post polished, new material...
Data Science
I'm still not in a place to produce original, quality analysis of my own, so I thought I'd teach you all about what is probably the most common pitfall in data science: over-fitting.
In very broad strokes, machine learning consists of splitting your data set into two chunks: a training set and a test set. Then you take whatever model you are attempting to use, whether it's linear regression, k-nearest-neighbors, a random forest, etc., and "train" it on the training set. This involves fitting the model's parameters so that they minimize whichever error function you're using...
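To make the train/test idea concrete, here is a minimal sketch in plain Python. It is not code from the original post: the data is made up (noisy samples of a straight line), and the helper names `make_data`, `knn_predict`, and `mse` are hypothetical. It uses k-nearest-neighbors, one of the models mentioned above, and compares the error on the training set with the error on the held-out test set.

```python
import random


def make_data(n, rng):
    """Noisy samples of y = 2x + 1 (a made-up relationship, for illustration only)."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 2.0)
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Predict y at x by averaging the y-values of the k closest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, points, k):
    """Mean squared error of the k-NN predictions on `points`."""
    total = sum((knn_predict(train, x, k) - y) ** 2 for x, y in points)
    return total / len(points)


rng = random.Random(0)           # fixed seed so the split is repeatable
data = make_data(200, rng)
rng.shuffle(data)
train, test = data[:150], data[150:]   # 75% training set, 25% test set

for k in (1, 10):
    print(f"k={k:2d}  train MSE={mse(train, train, k):.2f}  "
          f"test MSE={mse(train, test, k):.2f}")
```

With `k=1` every training point is its own nearest neighbor, so the training error is exactly zero: the model has simply memorized the noise. The test error is where that memorization shows up, which is why the held-out chunk is the number to watch.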
Data Science
To the reader,
The best way to read this report is on my GitHub page here. You can also play around with the code yourself by forking (copying) the kernel on Kaggle itself here. I tried various ways to get the report to render correctly here on WordPress, but to no avail. If anyone has some CSS magic that could make the window below bigger (i.e...
Data Science
I realized last night that I have yet to really articulate why I am making this switch from teaching to data science. There's more to it than just not being fully satisfied with teaching and needing a job to earn a buck. This post is more for me than it is for you, but I thought I'd share.
Those of you who know me well know that I spent last fall studying for the LSAT. I was fairly convinced for about a year and a half that I wanted to become a lawyer. I was taken with the idea of using my mind to answer tough questions and to convince others of my argument's merit, all while fighting for the most vulnerable...
Data Science
Hello world! My name is Chad Gardner. I am a former AP Physics teacher, with an educational background in astronomy, philosophy, and religion(?!). I am currently a stay-at-home dad, spending what spare time I can muster learning Python for data science. I hope to use this site as a place to dump my brain, share my progress, keep myself accountable, and all those other reasons people start blogs. Eventually, this will morph into a portfolio filled with beautiful insights, graphs, and stories from the world of data...
Data Science
I have had a difficult time figuring out when to stop learning and to start trying to work on something of my own. I feel a pretty strong urge to make something original. I downloaded years of data from the Florida Department of Education, but that whole thing seemed so daunting. Also, I don't feel like I know enough yet to really discover anything...
Data Science
I've been toying with all of this for about a month now. I have so many bookmarks for blogs, podcasts, free courses, paid courses, and on and on. I've checked out books from the library, bought others on Amazon, and downloaded open source texts. I have to admit that I'm a bit daunted. There are a million places to begin, and I have plenty of work to do before even becoming mildly employable...
Data Science