Baseball Toaster: Catfish Stew : Must. Bat. Kendall. Ninth.

STOP CASTING POROSITY! An Oakland Athletics blog.

About the Toaster

Baseball Toaster was unplugged on February 4, 2009.

Must. Bat. Kendall. Ninth.

2006-02-21 12:40

by Ken Arneson

Ryan has a post about optimizing the A's lineup over on The Pastime, using PECOTA projections and a formula from Cyril Morong over at Beyond the Boxscore.

Ryan didn't have the programming nerdiness to work through all 362,880 (9!) lineup permutations. But I happened to be cursed with such geekdom, so I wrote a Perl script to churn out the calculations. I ran it twice, once with Frank Thomas in the lineup, and once with Jay Payton in place of Thomas.

Here are the best and worst lineups. The number is runs/162 games.
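The number next to each lineup is a simple weighted sum. As implemented in the script at the end of this post, each batting slot $i$ gets its own weight on the on-base and slugging percentages of whoever bats there, plus a constant. The symbols here are labels chosen for this note, not Morong's notation: $p(i)$ is the player in slot $i$, and $a_i$, $b_i$, $c$ stand for the script's `@obpx`, `@slgx`, and `$constant`.

$$
\text{runs/game} = c + \sum_{i=1}^{9}\bigl(a_i\,\mathrm{OBP}_{p(i)} + b_i\,\mathrm{SLG}_{p(i)}\bigr), \qquad \text{runs/162} = 162 \times \text{runs/game}
$$

As a check, plugging the top Payton lineup (Bradley Johnson Ellis Chavez Swisher Payton Crosby Kotsay Kendall) and the projections from the script into the sum gives

$$
\sum_{i=1}^{9}\bigl(a_i\,\mathrm{OBP}_{p(i)} + b_i\,\mathrm{SLG}_{p(i)}\bigr) = 10.4148, \qquad 10.4148 - 5.261 = 5.1538, \qquad 162 \times 5.1538 = 834.9156,
$$

which the script truncates (rather than rounds) to the 834.91 listed for that lineup.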

Five best lineups with Thomas:

853.45: Bradley Chavez Ellis Thomas Johnson Crosby Swisher Kotsay Kendall
853.44: Bradley Chavez Ellis Thomas Johnson Swisher Crosby Kotsay Kendall
853.13: Bradley Johnson Ellis Thomas Chavez Crosby Swisher Kotsay Kendall
853.12: Bradley Johnson Ellis Thomas Chavez Swisher Crosby Kotsay Kendall
852.90: Ellis Chavez Bradley Thomas Johnson Swisher Crosby Kotsay Kendall

Five best lineups with Payton:

834.91: Bradley Johnson Ellis Chavez Swisher Payton Crosby Kotsay Kendall
834.80: Bradley Johnson Ellis Chavez Crosby Payton Swisher Kotsay Kendall
834.78: Bradley Swisher Ellis Chavez Johnson Payton Crosby Kotsay Kendall
834.63: Bradley Crosby Ellis Chavez Johnson Payton Swisher Kotsay Kendall
834.50: Bradley Chavez Ellis Swisher Johnson Payton Crosby Kotsay Kendall

A few interesting notes:

This formula insists on batting Kotsay eighth and Kendall ninth. The other players switch around a lot at the top of the list, but that configuration is solid. If there is one conclusion to draw from this exercise, this is it.
The A's are about 18.5 runs/year better with Thomas in the lineup than with Payton (853.45 vs. 834.91, comparing the best lineup of each run).
It likes Bradley leading off and Ellis batting third. That's probably not going to happen in real life, but the presumed order with Ellis leading off also works pretty well.
Given that Ellis is probably going to lead off, and Chavez will bat either third, fourth, or fifth, the ideal lineups with that configuration are:
```
With Thomas:  852.58: Ellis Johnson Bradley Thomas Chavez Crosby Swisher Kotsay Kendall
With Payton:  834.36: Ellis Johnson Bradley Chavez Swisher Payton Crosby Kotsay Kendall
```
Both of these put Johnson second behind Ellis, which supports Zachary's preference for Ellis and Johnson at the top of the order.
When Thomas is in the lineup, it tends to like Chavez batting second. When Thomas is out of the lineup, it tends to like Chavez batting cleanup.
Crosby and Swisher are pretty much interchangeable. Swapping them between any two lineup spots produces almost exactly the same result.
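The constrained case in the notes above (Ellis leading off, Chavez batting third, fourth, or fifth) can also be searched directly instead of filtering the full output. Here is a minimal sketch in core Perl, using the Payton projections and slot weights copied from the script at the end of this post. The helper names `score` and `search`, and the recursive way of generating orderings, are illustrative assumptions and not the code that produced the numbers above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Projections and slot weights copied from the script in this post (Payton run).
my @pname = qw(Ellis Bradley Chavez Payton Johnson Crosby Swisher Kotsay Kendall);
my @pobp  = (.351,.355,.354,.322,.353,.346,.347,.332,.333);
my @pslg  = (.426,.447,.479,.432,.462,.453,.455,.414,.338);
my @obpx  = (2.997,2.255,2.141,1.670,2.254,1.346,1.528,1.188,2.550);
my @slgx  = (.931,1.263,.933,1.504,1.146,1.237,1.164,.825,.539);
my $constant = -5.261;

# Runs per 162 games for a lineup given as a list of indexes into @pname.
sub score {
    my @order = @_;
    my $rpg = $constant;
    for my $slot (0 .. $#order) {
        my $p = $order[$slot];
        $rpg += $obpx[$slot] * $pobp[$p] + $slgx[$slot] * $pslg[$p];
    }
    return $rpg * 162;
}

# Hypothetical helper: extend $prefix with every ordering of the players in
# $rest, and remember the best-scoring complete lineup that passes $ok.
sub search {
    my ($ok, $prefix, $rest, $best) = @_;
    if (!@$rest) {
        return unless $ok->(@$prefix);
        my $s = score(@$prefix);
        @$best = ($s, @$prefix) if !@$best || $s > $best->[0];
        return;
    }
    for my $i (0 .. $#$rest) {
        my @left = @$rest;
        my ($p) = splice(@left, $i, 1);
        search($ok, [@$prefix, $p], \@left, $best);
    }
}

# Chavez (index 2) must land in the 3rd, 4th, or 5th slot (indexes 2..4).
my $chavez_3_to_5 = sub {
    my @order = @_;
    return scalar grep { $order[$_] == 2 } 2 .. 4;
};

# Ellis (index 0) is fixed in the leadoff slot; the other eight are permuted.
my @best;
search($chavez_3_to_5, [0], [1 .. 8], \@best);
my ($best_score, @best_order) = @best;

# Truncate to two decimals, the same way the original script does.
printf "%.2f: %s\n", int($best_score * 100) / 100, join(' ', @pname[@best_order]);
```

Fixing Ellis in the first slot cuts the search to 8! = 40,320 orderings. The post reports 834.36 for Ellis Johnson Bradley Chavez Swisher Payton Crosby Kotsay Kendall under this constraint, so that is the figure to compare the output against.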

Now for some fun: the worst lineups...

With Thomas:

816.79: Crosby Kotsay Johnson Kendall Swisher Ellis Bradley Chavez Thomas
816.84: Swisher Kotsay Johnson Kendall Crosby Ellis Bradley Chavez Thomas
816.92: Crosby Kotsay Johnson Kendall Swisher Bradley Ellis Chavez Thomas
816.97: Swisher Kotsay Johnson Kendall Crosby Bradley Ellis Chavez Thomas
817.05: Kotsay Ellis Swisher Kendall Crosby Bradley Johnson Chavez Thomas

With Payton:

799.02: Payton Kotsay Swisher Kendall Crosby Ellis Bradley Johnson Chavez
799.11: Payton Kotsay Crosby Kendall Swisher Ellis Bradley Johnson Chavez
799.15: Payton Kotsay Swisher Kendall Crosby Bradley Ellis Johnson Chavez
799.24: Payton Kotsay Crosby Kendall Swisher Bradley Ellis Johnson Chavez
799.59: Payton Kotsay Swisher Kendall Crosby Ellis Bradley Chavez Johnson

The perl code is below, for those of you with the Unixness for these things...




```perl
#!/usr/bin/perl
use strict;
use warnings;
use Algorithm::Permute;

# put players and their obp/slgs here (same player order in all three arrays)
# this is the Payton run; for the Thomas run, swap in Thomas's name and numbers for Payton's
my @pname = ('Ellis','Bradley','Chavez','Payton','Johnson','Crosby','Swisher','Kotsay','Kendall');
my @pobp = (.351,.355,.354,.322,.353,.346,.347,.332,.333);
my @pslg = (.426,.447,.479,.432,.462,.453,.455,.414,.338);

# formulae from http://www.beyondtheboxscore.com/story/2006/2/12/133645/296
# per-slot weights for OBP and SLG, batting slots 1 through 9
my @obpx = (2.997,2.255,2.141,1.670,2.254,1.346,1.528,1.188,2.550);
my @slgx = (.931,1.263,.933,1.504,1.146,1.237,1.164,.825,.539);
my $constant = -5.261;

my $slots = 9;
my @array = (0..($slots-1));

# permute calls the block once for every ordering of @array
Algorithm::Permute::permute {
        my $lineup = "";
        my $rpg = $constant;    # runs per game
        for (my $i=0; $i<$slots; $i++) {
                $rpg += ($obpx[$i] * $pobp[$array[$i]]) + ($slgx[$i] * $pslg[$array[$i]]);
                $lineup .= $pname[$array[$i]] . " ";
        }
        # runs per 162 games, truncated (not rounded) to two decimal places
        printf "%.2f %s\n", int($rpg*16200)/100, $lineup;
} @array;

# run the program from the command line like this:  ./permute.pl | sort -n >somefilename.txt
```
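To pull the best and worst lineups out of the output file, something like the following could work. It is a sketch that assumes every line has the shape the script prints (a score followed by the nine names). The script name `bestworst.pl` and the helper `best_and_worst` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given lines of the form "score name name ...",
# return the $n highest-scoring lines (best first) and the $n
# lowest-scoring lines (worst first), as two array references.
sub best_and_worst {
    my ($lines, $n) = @_;
    # Sort by the leading number on each line, lowest score first.
    my @sorted = map  { $_->[1] }
                 sort { $a->[0] <=> $b->[0] }
                 map  { [ (split ' ', $_)[0], $_ ] } @$lines;
    $n = @sorted if $n > @sorted;
    my @worst = @sorted[0 .. $n - 1];
    my @best  = reverse @sorted[(@sorted - $n) .. $#sorted];
    return (\@best, \@worst);
}

# Usage:  ./bestworst.pl somefilename.txt
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    chomp(my @lines = grep { /\S/ } <$fh>);
    close($fh);
    my ($best, $worst) = best_and_worst(\@lines, 5);
    print "Best:\n";
    print "$_\n" for @$best;
    print "Worst:\n";
    print "$_\n" for @$worst;
}
```

Because the helper sorts the lines itself, the file does not need to be sorted beforehand, so the `sort -n` step becomes optional.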

Comments

2006-02-21 13:07:45

1. Bob Timmermann

If I can get that program to run from the terminal emulation on my Mac, I will officially become a geek won't I?

2006-02-21 13:16:11

2. Ken Arneson

Yes. Especially if you can figure out how to download and install Algorithm::Permute from cpan.org.

2006-02-21 14:51:16

3. Roman

This formula insists on batting Kotsay eighth and Kendall ninth.

"I'm smarter than any stinkin' formula!"

--Ken Macha

2006-02-21 14:58:08

4. Zachary D Manprin

Cool.

I would really like to see data involving pitches per plate appearance (P/PA). The OBP and SLG statistics are great starting points.

The comments are great for leading up to a few posts this week.

2006-02-21 15:47:50

5. Ken Arneson

Beyond the Boxscore has a new formula based on DH-only leagues, which for some reason really minimizes the value of the #3 spot in the order. Using those numbers puts Kotsay in the #3 slot most of the time.

That's really weird, so I'm starting to question those numbers. Either that, or we need to have a radically different view of batting orders when there's a DH than we're used to.

2006-02-21 15:52:53

6. Ken Arneson

P/PA would be cool, as would having L/R splits. I don't know of any existing projections with L/R splits, but I suppose you could calculate a L/R Marcel projection.

2006-02-21 16:50:53

7. For The Turnstiles

It's pretty easy to see what's going on here. If the model says that the #9 hitter has the smallest effect on run production (which is probably true, but not to the extent that the original version claimed), then you'll certainly want to hide your least productive hitter there. And if it further claims (as in the revised version) that, of the 1-8 slots, slugging matters least for the #3 hitter, then you'll probably want to put Kotsay, with the lowest projected slugging other than Kendall, at #3, especially on a team with such a small range of expected OBPs.

The question is whether to believe the model in the first place. It looks to me like the noise in the data is so great that there isn't much that can be salvaged here. In any case, this seems to be a somewhat perverse way of trying to solve the problem of optimizing a batting order. Simulations are simpler and likely to give more useful results.

The drawback of simulations is that it takes around 100K games with a fixed lineup to get precision on the order of .01 runs/game, so it would be somewhat prohibitive to do it for all possible permutations. But that fact should also tell you why it's so hard to draw any useful conclusions from a few years of historical data.

Something like the following might be interesting, though: run a simulation with a typical lineup (based on league averages for each slot), and then vary OBP/SLG in each slot slightly to get a table like Morong's. The coefficients should look considerably less random than what we have here. Then you could apply Ken's script to any actual group of players. There would be some circularity in logic here (it would generate lineups that are optimal, given the constraint that they look something like the lineups that managers actually use), and this can be seen as either a bug (it might miss a better answer) or a feature (it gives answers that have some chance of actually being implemented).

I'm also looking forward to seeing what mgl/tango have to say about this subject in The Book.

2006-02-22 10:09:02

8. Ken Arneson

Simulations are simpler than a 10-line script?

I get what you're saying, Turnstiles. Still, it seems a waste to disregard real game data, and use simulated data instead. Maybe there really is something going on here that we wouldn't capture in simulation. Maybe a hybrid solution would be better?

2006-03-16 11:43:33

9. Dennis

What season stats did you use? Or are you using career stats? I started writing a simulator in C to churn through the permutations that is similar to the one that salb918 over at beyondtheboxscore.com wrote in MATLAB, but when I plugged in 2005 stats for your above lineup (using Payton), my estimated runs per season clocks in way lower than what you got. I'm double checking to make sure I didn't make any typos in the stats (I'm using PAs, BBs, hits, 2Bs, 3Bs, HRs, SOs), but I wouldn't have expected to be 100 runs lower than what you got.

Comment status: comments have been closed. Baseball Toaster is now out of business.