# TPU
**Repository Path**: magicor/TPU
## Basic Information
- **Project Name**: TPU
- **Description**: https://github.com/charley871103/TPU.git
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-05-27
- **Last Updated**: 2026-05-27
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# AIC2021 Project1 - TPU
contributed by < `E94079029 施丞宥` >
## Project Description
Design a Tensor Processing Unit (TPU) with **4x4** processing elements (PEs) that is capable of computing ```(4*K)*(K*4)``` 8-bit integer matrix multiplication, where ```K``` is limited by the size of the input global buffer.
**Project Constraints**
1. The design must be written in Verilog.
2. The PE array must not exceed **4x4**, and a 2D systolic array architecture is **strictly required** in this project.
3. The data width is 8 bits.
4. The global buffer size is 3 KiB in total.
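
As a software reference for the computation described above, the following is a minimal Python sketch of a ```(4*K)*(K*4)``` integer matrix multiplication. It is not part of the Verilog design. The function names `to_int8` and `matmul_reference` are hypothetical, and treating the 8-bit operands as signed is an assumption, since the project description does not state the signedness.

```python
def to_int8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127].

    Assumption: operands are signed; the README does not say so explicitly.
    """
    value &= 0xFF
    return value - 256 if value >= 128 else value


def matmul_reference(a, b):
    """Multiply a (4 x K) by b (K x 4), given as lists of lists of integers.

    Results are kept at full precision, because the accumulator width of the
    real design is not documented here.
    """
    k = len(b)
    assert all(len(row) == k for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [
        [sum(a[i][t] * b[t][j] for t in range(k)) for j in range(cols)]
        for i in range(len(a))
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6], [7, 8]]       # 4 x K with K = 2
    b = [[1, 0, -1, 2], [0, 1, 1, -2]]         # K x 4
    for row in matmul_reference(a, b):
        print(row)
```

A model like this can serve as a golden reference when comparing against simulation output of the hardware.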
## Systolic array
## Architecture
### TOP
### Data Loader
* Pipelines the input data through an increasing number of registers, which produces the skewed data flow of a systolic array.
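
The register-based skew can be illustrated with a small cycle-level Python model. This is a sketch under stated assumptions rather than a description of the actual RTL: it assumes an output-stationary array in which row `i` of the left matrix is delayed by `i` cycles and column `j` of the right matrix by `j` cycles, and the names `skew` and `systolic_matmul` are hypothetical.

```python
def skew(streams, total_cycles):
    """Delay stream i by i cycles, padding with zeros.

    Models feeding stream i through i register stages (assumed skew scheme).
    """
    return [
        [0] * i + list(s) + [0] * (total_cycles - i - len(s))
        for i, s in enumerate(streams)
    ]


def systolic_matmul(a, b, size=4):
    """Cycle-level model of a size x size output-stationary systolic array.

    a is (size x K), b is (K x size). Each PE holds one accumulator.
    """
    k = len(b)
    cycles = k + 2 * (size - 1)        # last operands reach the bottom-right PE
    a_in = skew(a, cycles)             # row i of A enters from the left
    b_cols = [[b[t][j] for t in range(k)] for j in range(size)]
    b_in = skew(b_cols, cycles)        # column j of B enters from the top
    a_reg = [[0] * size for _ in range(size)]   # operand registers in each PE
    b_reg = [[0] * size for _ in range(size)]
    acc = [[0] * size for _ in range(size)]
    for t in range(cycles):
        # Update from the bottom-right corner so that every PE reads the value
        # its neighbour held in the previous cycle.
        for i in reversed(range(size)):
            for j in reversed(range(size)):
                a_reg[i][j] = a_in[i][t] if j == 0 else a_reg[i][j - 1]
                b_reg[i][j] = b_in[j][t] if i == 0 else b_reg[i - 1][j]
                acc[i][j] += a_reg[i][j] * b_reg[i][j]   # MAC step
    return acc


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6], [7, 8]]
    b = [[1, 0, -1, 2], [0, 1, 1, -2]]
    for row in systolic_matmul(a, b):
        print(row)
```

With this skew, the operands `a[i][s]` and `b[s][j]` meet at PE `(i, j)` in the same cycle, so each PE accumulates one element of the product.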
### MAC Unit
### FSM
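* IDLE : When ```start=1```, the FSM enters BUZY and the MAC computation begins.
* BUZY : Each time one ```4*4``` systolic array computation finishes, the FSM enters OUTP.
* OUTP : The computed results are stored into the output global buffer.
* DONE : After all computations have finished, the FSM enters DONE to indicate that the operation is complete.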
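
The state transitions listed above can be sketched as a next-state function in Python. This is only an illustrative model: the signal names `tile_done` and `all_done` are hypothetical, and the transitions out of OUTP (back to BUZY while work remains, otherwise to DONE) as well as DONE holding its state are assumptions, since the list above does not spell them out.

```python
# State names follow the README, including the spelling BUZY.
IDLE, BUZY, OUTP, DONE = "IDLE", "BUZY", "OUTP", "DONE"


def next_state(state, start, tile_done, all_done):
    """Return the next FSM state.

    start     -- start input, as named in the README
    tile_done -- hypothetical flag: one 4*4 array computation has finished
    all_done  -- hypothetical flag: no further computations remain
    """
    if state == IDLE:
        return BUZY if start else IDLE
    if state == BUZY:
        return OUTP if tile_done else BUZY
    if state == OUTP:
        # Assumption: return to BUZY until every result has been written.
        return DONE if all_done else BUZY
    return DONE  # Assumption: DONE is held once reached.


if __name__ == "__main__":
    state = IDLE
    # Each tuple is (start, tile_done, all_done) for one clock cycle.
    for inputs in [(1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 1)]:
        state = next_state(state, *inputs)
        print(state)
```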
## Goal
- [x] Pass at least test1~3
- [x] Support ```(M*K)*(K*N)```
- [x] Synthesis
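
One common way to support general ```(M*K)*(K*N)``` shapes on a fixed 4x4 array is to split the output into 4x4 tiles and zero-pad the edges. The Python sketch below illustrates that idea only; whether this project uses the same scheme is an assumption, and the names `tile_product` and `tiled_matmul` are hypothetical.

```python
def tile_product(a_tile, b_tile):
    """Multiply a (tile x K) block by a (K x tile) block.

    Stands in for one pass through the fixed-size PE array.
    """
    k = len(b_tile)
    size = len(a_tile)
    return [
        [sum(a_tile[i][t] * b_tile[t][j] for t in range(k)) for j in range(size)]
        for i in range(size)
    ]


def tiled_matmul(a, b, tile=4):
    """Compute (M x K) times (K x N) using only tile x tile output blocks."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # Zero-pad rows and columns that fall outside the matrices.
            a_tile = [a[i] if i < m else [0] * k for i in range(i0, i0 + tile)]
            b_tile = [
                [b[t][j] if j < n else 0 for j in range(j0, j0 + tile)]
                for t in range(k)
            ]
            out = tile_product(a_tile, b_tile)
            # Copy back only the entries that belong to the real result.
            for di in range(tile):
                for dj in range(tile):
                    if i0 + di < m and j0 + dj < n:
                        c[i0 + di][j0 + dj] = out[di][dj]
    return c


if __name__ == "__main__":
    a = [[i + j for j in range(3)] for i in range(5)]   # 5 x 3
    b = [[i - j for j in range(6)] for i in range(3)]   # 3 x 6
    for row in tiled_matmul(a, b):
        print(row)
```

Zero padding leaves the valid outputs unchanged, because the padded rows and columns contribute only zero products.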
## Test Result
### Test1
### Test2
### Test3
### Monster
## Synthesis Result
* Area report
* Timing report
* Cell library
    * tsmc13_neg