The Puerco Family of Microprocessors
How the Puerco was born
Building a Microprocessor (or two)
This page is a sort of combination of crazy story and technical
document about a group of CS undergraduates who had the dubious
pleasure of designing and implementing two microprocessors. Our group
consisted of four foolish young CS students:
Ben Sittler, resident genius,
whip-wielder, and inhumanly good debugger.
Valerie Aurora, master state
machine minimizer, draftsman, and writer of really bad VHDL.
Rob Iverson, Xilinx expert,
lightning calculator, and fixer of really bad VHDL.
Shawn Simpson, poster boy
and power supply researcher. Cute, ain't he?
The Evil Taskmaster
The spring semester of 1998, I took CS 331, Computer Architecture,
taught by Victor
Yodaiken. I heard that maybe we'd be doing some work with FPGA's,
field programmable gate arrays. I'd done a lot of EE labs with Altera
chips the previous semester, and while programming FPGA's wasn't
exactly fun, it wasn't that difficult either.
The first day of class, Yodaiken told us that our first project was to
design and implement a microprocessor using an FPGA and a hardware
description language. Our second project was to design and implement a
pipelined microprocessor (pipelined means that
several instructions can run at the same time - sort of). We could work
in groups of two to five people and we had a little less than four
months to do both projects. I nearly fell out of my chair! This was
not at all what I had in mind when I heard we were using FPGA's. At
that point, I knew that processors executed instructions and some of
them were faster than others and not a whole lot more. And that most
processors were designed by huge groups of people working full time
for months on end. It seemed impossible.
Well, there were a number of simplifications which made it possible.
We only had to implement 10 instructions, memory was on chip (no
caches), and it didn't have to be fast. (Our final product had so many
timing problems that it ran at about 1 Hz - we'd hit a key in a DOS
program that sent the clock signal through a parallel port.) Yodaiken
specified the architecture pretty thoroughly (load/store, accumulator
based) and even gave us a (horrible!) state machine to use. You can
see the datapath and the state machine we came up with -- which
actually worked! I would like to point out that our state machine had
only five states, as opposed to the twentysomething states in the
state machine Yodaiken worked out in front of the class (a clever
decoy) and the next smallest state machine in the class, which had
nine states. It's actually a horrible hack. The idea is that the
longest instruction required five states, which meant that five was
the smallest possible number of states required. I then realized that
if it were a Mealy machine, in which you can make several different
transitions with different outputs to the next state, you could
overlap all the instructions onto five states and just have different
transitions depending on the instruction. The result was about 10
lines of VHDL to implement our whole state machine.
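To make the overlapping trick concrete, here is a minimal Python sketch of the idea, not our actual VHDL. Every instruction walks the same five states, and the only thing that differs is the set of outputs attached to each transition. The opcode names, the control-signal names, and the split into two shared steps plus three instruction-specific steps are all made up for illustration.

```python
# Sketch only: opcodes and control-signal names are hypothetical.
# All instructions share the same five states; the outputs depend on
# the (state, opcode) pair, i.e. on the transition being taken.
NUM_STATES = 5

# Outputs shared by every instruction in the first two steps (assumed).
COMMON = {0: ("fetch",), 1: ("decode",)}

# Instruction-specific outputs for the remaining three steps (assumed).
PER_OPCODE = {
    "LOAD":  {2: ("addr_to_mem",), 3: ("mem_read",), 4: ("write_acc",)},
    "STORE": {2: ("addr_to_mem",), 3: ("mem_write",), 4: ()},
    "ADD":   {2: ("addr_to_mem",), 3: ("mem_read",), 4: ("alu_add", "write_acc")},
    "JUMP":  {2: ("write_pc",), 3: (), 4: ()},
}


def step(state, opcode):
    """Return (next_state, outputs) for one clock tick."""
    if state in COMMON:
        outputs = COMMON[state]
    else:
        outputs = PER_OPCODE[opcode][state]
    return (state + 1) % NUM_STATES, outputs


def run_instruction(opcode):
    """Walk one instruction through all five states, collecting outputs."""
    state, trace = 0, []
    for _ in range(NUM_STATES):
        state, outputs = step(state, opcode)
        trace.append(outputs)
    return state, trace
```

Calling `run_instruction("ADD")` walks states 0 through 4 and ends back in state 0. Shorter instructions such as the hypothetical `JUMP` simply emit no outputs on their unused transitions.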
Xilinx sucks!
We had a choice between Altera and Xess boards. Both boards had a
seven segment LED as the main output device. I'd worked with Altera
the semester before and figured that there was a good chance that
Xilinx was better. Wrong! The Xilinx software (as of 1998) is a huge
memory hog! It's a haphazard accretion of poorly written hacks! The
user interface is unintelligible! It took us weeks to figure out how
to do simple basic things! I think the most shocking thing was the
constant parade of messages from the various components of the
program, things like "Joe and Bob's optimization hack, (c) University
of Alabama."
The hardest part was making our CPU fit on the chip. Our first
synthesis run produced something that used up 400% of our CLB's
(configurable logic blocks). We minimized and minimized and threw
out every extraneous bit of logic and went from 32 bits to 9 bits, the
minimum needed to hold our opcode and any useful number of addresses.
Once we got the 9-bit CPU to fit, we began discovering all sorts of
sneaky ways to avoid using CLB's. We discovered that increasing the
"optimization" level in the Xilinx software resulted in exponentially
increasing compilation times in return for maybe 1% reduction in size
- most optimization had to be done at a higher level by hand.
Eventually, we got the whole pipelined 32-bit CPU on the chip, which
made us ecstatically happy. The whole experience was good preparation
for the frustration that must be found in, say, tax law.
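The arithmetic behind the 9-bit word can be sketched in a few lines of Python. Ten instructions need a 4-bit opcode, and in a 9-bit word that would leave 5 bits for an address, or 32 memory locations. The 4/5 split and the field order are assumptions for illustration; only the 9-bit width and the 10 instructions come from the actual project.

```python
WORD_BITS = 9            # from the text
NUM_INSTRUCTIONS = 10    # from the text

# Smallest opcode field that can tell 10 instructions apart.
OPCODE_BITS = (NUM_INSTRUCTIONS - 1).bit_length()
# Whatever is left over is assumed to be the address field.
ADDR_BITS = WORD_BITS - OPCODE_BITS
ADDR_MASK = (1 << ADDR_BITS) - 1


def encode(opcode, addr):
    """Pack an opcode and an address into one word (assumed layout)."""
    if not 0 <= opcode < NUM_INSTRUCTIONS:
        raise ValueError("opcode out of range")
    if not 0 <= addr <= ADDR_MASK:
        raise ValueError("address out of range")
    return (opcode << ADDR_BITS) | addr


def decode(word):
    """Split a word back into (opcode, address)."""
    return word >> ADDR_BITS, word & ADDR_MASK
```

With these assumptions, dropping to 8 bits would leave only 16 addressable locations, which gives a feel for why 9 was about as small as the word could usefully get.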
From Puerco Jr. to Puerco
The entire time we were working on the Puerco Jr., our non-pipelined CPU,
we were dreading the Puerco. Pipelining sounds
hard. It must be more difficult to design a
processor that runs three instructions at the same time than one.
Then we actually started designing it. It was easy! Basically, it
did the same thing every cycle instead of different things each cycle.
Each stage of the pipeline executed the currently loaded instruction
(we used straight line branch prediction). When a branch instruction
finished processing, we'd invalidate wrongly predicted instructions.
The way we handled invalid instructions was to simply add an invalid
bit to each stage. If the instruction was valid, the results got
written out to memory or latched into a register. If it wasn't valid,
the results of that stage were ignored. Piece of cake. And it was
smaller in terms of chip real estate than the Puerco Jr. I think we
got the Puerco Jr. working around the end of April, and the Puerco
working just a week or two after that. You can see our timing diagram for the Puerco if you want.
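Here is a rough Python sketch of the invalid-bit scheme described above. It is a toy model, not the Puerco itself: the three stage names (fetch, decode, execute), the tiny instruction set (`ADDI`, `JUMP`, `HALT`), and the choice to resolve jumps in the last stage are all assumptions for illustration. It also stores a "valid" flag per stage, which is just the inverse of the invalid bit.

```python
# Toy model with assumed stage names and a made-up instruction set.
# Each pipeline slot is (instruction, valid). Fetch always grabs the
# next sequential instruction (straight-line prediction).

def run(program, max_cycles=100):
    acc = 0
    pc = 0
    committed = []                    # instructions whose results were kept
    bubble = (("NOP", 0), False)      # an empty, invalid slot
    decode, execute = bubble, bubble
    for _ in range(max_cycles):
        # Fetch: assume the program continues in a straight line.
        fetch = (program[pc], True) if pc < len(program) else bubble
        pc += 1

        # Execute: results only take effect if the slot is valid.
        (op, arg), valid = execute
        if valid:
            committed.append((op, arg))
            if op == "ADDI":
                acc += arg
            elif op == "JUMP":
                # Wrong guess: mark the two younger instructions invalid
                # and restart fetching from the jump target.
                fetch = (fetch[0], False)
                decode = (decode[0], False)
                pc = arg
            elif op == "HALT":
                break

        # Advance every instruction by one stage.
        execute, decode = decode, fetch
    return acc, committed


program = [
    ("ADDI", 1),
    ("JUMP", 5),
    ("ADDI", 10),      # fetched down the wrong path, then invalidated
    ("ADDI", 100),     # fetched down the wrong path, then invalidated
    ("ADDI", 1000),    # never fetched
    ("ADDI", 2),
    ("HALT", 0),
]
acc, committed = run(program)
```

The two instructions right after the jump still flow through the pipeline, but because their slots are marked invalid their results are dropped, so the accumulator ends up as 1 + 2 = 3.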
Looking back
We spent an average of 20 hours a week per person in the CS lab for
about 10 weeks to get these projects done. On the other hand, we
spent a lot of time reading email and talking to MegaHAL (sadly, the WWW
gateway is gone) while waiting for the design to finish synthesizing.
It was actually a lot like a social club; we'd go there every night
after class and jaw about computers and cuss gratuitously and tell
MegaHAL nasty things about each other. Free and open access to a
white board and several different colors of marker were crucial to our
design process, although it would have helped if the white board
hadn't been behind the computer lab door. Towards the end, Ben started
working about 40 hours a week on the Puerco, without which I don't
think we would have finished. As it was, Ben, Rob, and I got A's in
the class, which frankly shocked the heck out of us.
This is cool!
Here's some really great advertising copy Ben Sittler wrote up.
Absolutely hilarious, at least if you were in CS 331. I would like to
point out the little pigs ("puerco" is Spanish for "pig").
