Multicore Processor on FPGA
Designing and implementing a pipelined, multicore processor with cache coherency remains one of the most difficult yet rewarding projects of my life. Purdue is the only university in the country that offers a course dedicated to such an extensive FPGA project. Prior to this class, I studied digital design on FPGA boards in two semesters of undergraduate ASIC design courses. I also wrote dual-threaded programs in MIPS assembly to test my designs.
My teammate and I recorded the maximum clock frequency and the number of clock cycles used in gate-level simulations, then computed both the Instruction Latency (IL) and the Millions of Instructions Per Second (MIPS) to compare our designs with and without caches. The benchmark program we used throughout testing was merge sort. Its instruction count is 5404 in the single-threaded version and 5421 in the dual-threaded version.
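As a rough illustration of how these metrics can be derived from the recorded numbers, here is a minimal Python sketch. It is not the exact method we used in the course. It assumes that MIPS is the instruction count divided by execution time in microseconds, and that instruction latency is the pipeline depth multiplied by the clock period. The function names, the 5-stage depth, the 50 MHz clock, and the cycle count of 10808 are hypothetical placeholders. Only the instruction count of 5404 comes from the merge sort benchmark above.

```python
def execution_time_s(cycles: int, f_max_hz: float) -> float:
    """Wall-clock run time: total cycles divided by clock frequency."""
    return cycles / f_max_hz


def mips(instruction_count: int, cycles: int, f_max_hz: float) -> float:
    """Millions of instructions per second over the whole run."""
    return instruction_count / (execution_time_s(cycles, f_max_hz) * 1e6)


def cpi(instruction_count: int, cycles: int) -> float:
    """Average clock cycles per instruction."""
    return cycles / instruction_count


def instruction_latency_ns(pipeline_stages: int, f_max_hz: float) -> float:
    """Assumed definition: time for one instruction to traverse every
    pipeline stage, i.e. pipeline depth times the clock period."""
    return pipeline_stages * 1e9 / f_max_hz


if __name__ == "__main__":
    INSTRUCTIONS = 5404   # single-threaded merge sort (from the text)
    CYCLES = 10808        # hypothetical simulated cycle count
    F_MAX_HZ = 50e6       # hypothetical maximum clock frequency
    STAGES = 5            # hypothetical pipeline depth

    print("MIPS:", mips(INSTRUCTIONS, CYCLES, F_MAX_HZ))
    print("CPI:", cpi(INSTRUCTIONS, CYCLES))
    print("IL (ns):", instruction_latency_ns(STAGES, F_MAX_HZ))
```

Running the same functions on the cached and uncached designs, each with its own cycle count and maximum frequency, gives a like-for-like comparison. Caches typically cut the cycle count but can also lower the achievable clock frequency, so both inputs matter.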
Figure 1. Functional block diagram of the pipelined multicore processor with caches.
Figure 2. Results from quantitatively comparing our designs with and without caches.