A Reminder on the Danger of Small Sample Sizes


We are now 1 week into the season, so I think it’s safe to say we have a good idea of how the rest of the Royals season will go. We can reasonably assume that:

– Salvador Perez is going to win the American League MVP award.

– The only way Jason Vargas won’t win the Cy Young award is if too many voters opt for Bruce Chen.

– Mike Moustakas will never get another hit, ever.

– The Royals’ bullpen will blow leads at least half the time.

– And, the Royals will hit fewer home runs than any other American League team.

OK, so that last one will probably be true, but you get the point. Now that a few games have been played, we’ve reached the point in the season when fans try to attach significance to statistically insignificant data. It’s hard not to, since these few games are all we have to go on. The only numbers we see come from a small handful of games, so the natural reaction is to – for lack of a better word – react to that set of data. I wrote a bit about this last year as well. We want to form an opinion of players and teams using the information available, and honestly, there’s nothing really wrong with that. Where I begin to take issue with this kind of early evaluation is when we start changing our opinion of a player after seeing 20 plate appearances.
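To put a rough number on how little 20 plate appearances can tell us, here is a minimal sketch. It assumes a hypothetical hitter whose true on-base rate is .320 (a made-up figure, not any real player's number) and treats every plate appearance as an independent coin flip, which is a simplification. The function name is mine, purely for illustration.

```python
import math


def observed_rate_sd(true_rate: float, plate_appearances: int) -> float:
    """Standard deviation of an observed rate under a simple binomial model."""
    return math.sqrt(true_rate * (1 - true_rate) / plate_appearances)


# Hypothetical talent level, chosen only for illustration.
TRUE_OBP = 0.320

for pa in (20, 600):
    sd = observed_rate_sd(TRUE_OBP, pa)
    # A rough "two standard deviations either side" band around the true rate.
    low = max(TRUE_OBP - 2 * sd, 0.0)
    high = min(TRUE_OBP + 2 * sd, 1.0)
    print(f"{pa:>3} PA: sd = {sd:.3f}, rough range = {low:.3f} to {high:.3f}")
```

Under those assumptions, the spread after 20 plate appearances is about .104, against roughly .019 over 600. In other words, the same hitter can look like a star or a bust after one week without anything about him having changed.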

Believe me, this isn’t meant to sound like an “All is well!” kind of post. I don’t want to tell anyone how to be a fan, and if worrying is your thing, go right ahead. There are absolutely reasons to worry about this team. However, the concerns I have with the Royals today are no different from the concerns I had 2 weeks ago. Nothing I’ve seen in the Royals’ first 5 games has changed my opinion on what this team is. I still would take this bullpen over almost any other bullpen in the league. I still think Moustakas can be a serviceable hitter. I still think having Bruce Chen in the rotation is a mistake. Those beliefs may turn out to be hilariously silly, but what would be even sillier is allowing the smallest of sample sizes to affect an opinion originally formed using much, much more information.

The bullpen has struggled quite a bit early on, but it’s filled with talented guys who’ve had success at the big league level. Moustakas has clearly changed his approach, so I’ll focus on the process instead of the results there. And despite Chen’s domination of a hapless White Sox lineup, he’s still a guy with a tiny margin for error, and counting on his home run rate to stay lower than it’s been for most of his career is a risky proposition.

Again, you can call me a fool for thinking those things, but I’d be a much bigger fool for doing a 180 after 3% of the season is complete.

Don’t get me wrong. Every game is important, and that’s especially true for a team like the Royals, so they do need to get things going. They can’t afford to scuffle along through April, putting themselves in a deep hole they’ll have to dig out of for the last few months. However, with 97% of the season to go, a lot of things can change. Players adjust. Players get healthy. The BABIP Fairy has violent mood swings. There are a lot of factors involved.

Ridiculous things can happen in small sample sizes. When those small sample sizes come in the middle of the season, it’s easier to tune out any statistical noise and take the rest of the season in context. But when the small samples are the only bits of data we have at the beginning of a season, we tend to treat that data as somehow more meaningful than everything we knew before. Every fan can react in whatever way he or she chooses, but I’m just trying to remind everyone to keep things in perspective. Using small sample sizes can be fun for entertainment purposes (Perez has a 239 wRC+, and Chen’s 9.95 K/9 is the 8th-highest rate in the AL!), but when evaluating a player or team, it’s important to use more information than that before making any declarative statements this early in the season.