From RTL to Hardware Implementation
Table of Contents
- Overview
- Design Flow Diagram
- Simulation (Optional)
- Synthesis
- Implementation
- Bitstream Generation
- FPGA Programming
- Practical Example: Blink LED
- Troubleshooting
- Quick Reference
Overview
The FPGA design flow is the process of transforming your RTL code (Verilog/VHDL) into a working hardware design running on a physical FPGA chip.
Two Main Paths:
- Simulation Path → Software testing (verify logic)
- Implementation Path → Hardware generation (program FPGA)
Why Multiple Steps?
FPGAs are reconfigurable hardware containing millions of programmable elements. The design flow translates your high-level code into low-level configuration data that programs every switch, LUT, and connection in the FPGA fabric.
Design Flow Diagram
┌─────────────────────────────────────────────────────────────┐
│                  RTL DESIGN (Verilog/VHDL)                  │
│                  + CONSTRAINTS (XDC file)                   │
└────────────────────────────┬────────────────────────────────┘
                             │
            ┌────────────────┴────────────────┐
            │                                 │
     ┌──────▼──────────┐             ┌────────▼────────┐
     │   SIMULATION    │             │    SYNTHESIS    │
     │   (Optional)    │             │  (RTL → Gates)  │
     │  - ModelSim     │             │                 │
     │  - Vivado Sim   │             │  Time: 1-3 min  │
     │  - Verify Logic │             └────────┬────────┘
     └─────────────────┘                      │
                                      ┌───────▼─────────┐
                                      │ IMPLEMENTATION  │
                                      │ (Place & Route) │
                                      │                 │
                                      │ Time: 2-10 min  │
                                      └───────┬─────────┘
                                              │
                                      ┌───────▼─────────┐
                                      │    BITSTREAM    │
                                      │   GENERATION    │
                                      │                 │
                                      │ Time: 0.5-2 min │
                                      └───────┬─────────┘
                                              │
                                      ┌───────▼─────────┐
                                      │  PROGRAM FPGA   │
                                      │   (Hardware)    │
                                      │                 │
                                      │ Time: 5-30 sec  │
                                      └─────────────────┘
1. Simulation (Optional)
Purpose
Test your design in software before committing to hardware.
What It Does
- Runs your Verilog/VHDL code in a software simulator
- Executes testbench to verify functionality
- Displays waveforms showing signal behavior over time
- Checks for logical correctness
- No hardware required – purely software-based
Tools
- Vivado Simulator (built into Vivado)
- ModelSim (Siemens EDA, formerly Mentor Graphics)
- QuestaSim (Advanced ModelSim)
- Icarus Verilog (Open source)
- Verilator (Fast, open source)
Process
- Write testbench (stimulus generation)
- Run simulation
- View waveforms
- Verify outputs match expected behavior
- Debug and iterate
When to Use
✅ Use simulation when:
- Developing complex designs
- Need to verify timing relationships
- Want to test corner cases
- Debugging functionality issues
- Want fast iteration without hardware
❌ Can skip for:
- Very simple designs (like blink LED)
- Quick hardware prototypes
- When you’re confident in the logic
Example Output
Time: 0 ns | clk=0, reset=1, led=0
Time: 4 ns | clk=1, reset=1, led=0
Time: 8 ns | clk=0, reset=0, led=0
Time: 2500 ms | clk=1, reset=0, led=1 (toggle!)
Time Required
- Compilation: 10-30 seconds
- Simulation run: Seconds to hours (depends on simulation time)
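To illustrate what "verifying logic in software" means, here is a minimal Python sketch of a cycle-by-cycle model of a counter-and-toggle design such as the blink LED. It is not a substitute for an HDL simulator; the function name `simulate_blink` and the small `counter_max` value are assumptions chosen so the run finishes instantly.

```python
def simulate_blink(counter_max, cycles, reset_cycles=2):
    """Step a software model of the blink counter one clock edge at a time."""
    counter, led = 0, 0
    trace = []
    for cycle in range(cycles):
        reset = cycle < reset_cycles          # hold reset for the first cycles
        if reset:
            counter, led = 0, 0
        elif counter == counter_max - 1:      # terminal count: wrap and toggle
            counter, led = 0, led ^ 1
        else:
            counter += 1
        trace.append((cycle, int(reset), counter, led))
    return trace


trace = simulate_blink(counter_max=4, cycles=12)
for cycle, reset, counter, led in trace:
    print(f"cycle={cycle:2d} reset={reset} counter={counter} led={led}")

# Cycles on which the LED value changed compared with the previous cycle.
toggle_cycles = [
    trace[i][0] for i in range(1, len(trace)) if trace[i][3] != trace[i - 1][3]
]
print("LED toggles at cycles:", toggle_cycles)
```

Shrinking the terminal count this way is a common trick in real testbenches too: simulating hundreds of millions of cycles just to see one toggle is what pushes simulation runs from seconds to hours.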
2. Synthesis
Purpose
Convert your RTL code into a gate-level netlist suitable for FPGAs.
What It Does
Input
- RTL source files (.sv, .v, .vhd)
- Constraint files (.xdc)
- Design parameters
Process
- Parse & Elaborate
- Reads all source files
- Resolves module hierarchy
- Expands parameters and generate statements
- Creates design database
- RTL Optimization
- Removes unused logic
- Simplifies boolean expressions
- Optimizes state machines
- Constant propagation
- Technology Mapping
- Maps RTL constructs to FPGA primitives:
- always_ff blocks → Flip-Flops (FFs)
- Combinational logic → Look-Up Tables (LUTs)
- Arithmetic operators → DSP blocks or LUTs
- Memory arrays → Block RAM (BRAM)
- Constraint Processing
- Reads XDC file
- Identifies clock domains
- Applies timing constraints
- Sets I/O standards
- Netlist Generation
- Creates gate-level netlist
- Lists all components and connections
- Preserves hierarchy (optional)
Output
- Netlist file (.edf, or a .dcp design checkpoint in Vivado; .ngc in the older ISE tools)
- Resource utilization report
- LUTs used
- Flip-flops used
- Block RAM usage
- DSP slices used
- I/O pins used
- Timing estimates (preliminary)
- Warning/error messages
What Synthesis Does to Your Code
Example 1: Simple Register
// Your code:
reg [7:0] data;
always @(posedge clk) begin
data <= input_data;
end
// Synthesis creates:
→ 8 D-type flip-flops
→ Connected to clock network
→ Input pins routed to D inputs
Example 2: Counter
// Your code:
reg [25:0] counter;
always @(posedge clk) begin
if (reset)
counter <= 0;
else
counter <= counter + 1;
end
// Synthesis creates:
→ 26 flip-flops (for counter bits)
→ 26-bit adder (using LUTs)
→ Multiplexer for reset (using LUTs)
→ Reset routing to all flip-flops
Example 3: Comparator
// Your code:
if (counter == COUNTER_MAX - 1)
// do something
// Synthesis creates:
→ 26-bit comparator (using LUTs)
→ Constant value embedded in design
→ Comparison result feeds control logic
Resource Utilization Example
For the blink_led design:
+----------------------------+-------+
| Resource | Used |
+----------------------------+-------+
| LUTs | 35 |
| Flip-Flops | 30 |
| I/O | 3 |
| Clock Buffers | 1 |
+----------------------------+-------+
| Total LUTs Available | 53200 |
| Utilization | 0.07% |
+----------------------------+-------+
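The utilization figure is simply used resources divided by available resources. A small sketch of that arithmetic follows, using the blink_led numbers from this guide; the helper name `utilization_percent` is an assumption for illustration.

```python
def utilization_percent(used, available):
    """Return resource utilization as a percentage."""
    return 100.0 * used / available


lut_pct = utilization_percent(35, 53200)    # LUTs
ff_pct = utilization_percent(30, 106400)    # flip-flops
io_pct = utilization_percent(3, 125)        # I/O pins

print(f"LUTs: {lut_pct:.2f}%  FFs: {ff_pct:.2f}%  I/O: {io_pct:.1f}%")
```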
Typical Time
- Small designs (< 1K LUTs): 1-2 minutes
- Medium designs (1K-10K LUTs): 2-5 minutes
- Large designs (> 10K LUTs): 5-15 minutes
Common Synthesis Errors
- Syntax errors → Fix Verilog code
- Undriven signals → Add assignments
- Multiple drivers → Fix logic conflicts
- Unsupported constructs → Use synthesizable code
- Missing constraints → Add XDC file
3. Implementation
Purpose
Transform the abstract netlist into a physical design by placing components and routing connections on the actual FPGA chip.
What It Does
Implementation consists of several sub-steps:
3.1 Design Optimization (opt_design)
Purpose
Optimize the netlist for the target FPGA device.
Process
- Removes redundant logic
- Simplifies logic paths
- Optimizes for timing or area
- Prepares design for placement
Output
- Optimized netlist
- Initial timing estimates
Time: 10-30 seconds
3.2 Placement (place_design)
Purpose
Assign each logic element to a specific physical location on the FPGA chip.
What Gets Placed
- LUTs → Specific slice locations
- Flip-Flops → Specific slice locations
- Block RAM → Specific BRAM sites
- DSP blocks → Specific DSP48 sites
- I/O buffers → Specific I/O banks
Physical Coordinates
FPGAs use a grid system:
Slice Location: SLICE_X12Y45
                │     │  └─ Y coordinate (row)
                │     └──── X coordinate (column)
                └────────── Resource type (Slice)
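When post-processing placement reports, it can help to split such a site name into its parts. The sketch below assumes names of the form `TYPE_X<col>Y<row>` (for example `SLICE_X12Y45`); the pattern and the `parse_site` helper are illustrative, not part of any Vivado API.

```python
import re

# TYPE_X<column>Y<row>, e.g. SLICE_X12Y45 or RAMB36_X0Y5 (assumed naming scheme).
SITE_PATTERN = re.compile(r"^([A-Z0-9]+)_X(\d+)Y(\d+)$", re.IGNORECASE)


def parse_site(name):
    """Split a site name into (resource type, x column, y row)."""
    match = SITE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a site name: {name!r}")
    kind, x, y = match.groups()
    return kind.upper(), int(x), int(y)


print(parse_site("SLICE_X12Y45"))
print(parse_site("Slice_X45Y67"))
```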
Placement Strategies
- Timing-driven: Places connected logic close together for speed
- Congestion-aware: Spreads logic to avoid routing congestion
- Power-optimized: Minimizes switching activity
Example Placement
For blink_led design:
counter[0] → Slice_X45Y67
counter[1] → Slice_X45Y68
counter[2] → Slice_X45Y69
...
led_reg → Slice_X50Y75 (near I/O)
I/O Pins:
clk input → IOB at H16 (clock-capable pin)
reset input → IOB at D19
led output → IOB at R14
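One textbook way to quantify "placing connected logic close together" is half-perimeter wirelength (HPWL): the width plus height of the bounding box around all cells on a net. The sketch below applies it to the example coordinates above. It is an illustration only; how Vivado's placer actually scores placements is not described in this guide.

```python
def hpwl(points):
    """Half-perimeter wirelength of the bounding box around (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


counter_cells = [(45, 67), (45, 68), (45, 69)]   # counter[0..2]
led_cell = (50, 75)                              # led_reg, near the I/O

print("counter bits only:", hpwl(counter_cells))
print("counter bits + led_reg:", hpwl(counter_cells + [led_cell]))
```

The tightly packed counter bits give a small value, while a net that also reaches the LED register near the I/O spans a much larger box, which is the kind of trade-off a timing-driven placer has to balance.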
Placement Visualization
FPGA Chip (simplified):
Column X45 Column X50
┌────────┐ ┌────────┐
Y70 │ counter│ │ │
Y69 │ counter│ │ led │ ← Placed near output
Y68 │ counter│ │ logic │
Y67 │ counter│ │ │
└────────┘ └────────┘
↓ ↑
└──── routed ──┘
Output
- Placed design with physical coordinates
- Placement utilization report
- Timing estimates (more accurate)
Time: 1-5 minutes
3.3 Physical Optimization (phys_opt_design)
Purpose
Optimize after placement to improve timing.
Process
- Replicates critical logic
- Moves cells slightly for better timing
- Optimizes critical paths
Time: 30 seconds – 2 minutes
3.4 Routing (route_design)
Purpose
Connect all placed components with physical wires on the FPGA.
What Gets Routed
- Logic connections: Between LUTs, FFs, and other elements
- Clock networks: Special low-skew routing for clocks
- Reset networks: Global reset distribution
- I/O connections: From pads to internal logic
Routing Resources
FPGAs have dedicated routing channels:
- Local routing: Short connections within a tile
- Long lines: Faster connections across chip
- Clock routing: Dedicated low-skew clock distribution
- Global routing: Chip-wide signal distribution
Routing Process
- Global routing: Plans approximate paths
- Detailed routing: Assigns specific wires
- Timing optimization: Adjusts routes for speed
- Conflict resolution: Fixes overlapping routes
Example Routing
counter[0] output ─┐
├→ LUT input (adder)
counter[1] output ─┘
adder output ─→ counter[0] input (feedback)
Clock H16 ─→ Clock buffer ─→ Clock tree ─→ All 30 FFs
Routing Congestion
Low congestion (good): High congestion (bad):
┌────┬────┬────┐ ┌────┬────┬────┐
│ ─ │ ─ │ │ │ ─══│══──│──═ │
│ │ ─ │ ─ │ │ ══─│─══─│═── │
│ ─ │ │ │ │ ───│═─══│──══│
└────┴────┴────┘ └────┴────┴────┘
Output
- Fully routed design
- Actual timing report (final, accurate)
- Routing utilization
- DRC (Design Rule Check) report
Time: 1-5 minutes
3.5 Final Optimization (post-route phys_opt_design)
Purpose
Final tweaks to meet timing constraints.
Process
- Makes small adjustments to meet timing
- Fixes any remaining violations
Time: 30 seconds – 1 minute
Implementation Reports
Timing Report
Worst Negative Slack (WNS): 0.234 ns ← GOOD (positive = met timing)
Total Negative Slack (TNS): 0.000 ns
Number of Failing Paths: 0
Clock: sys_clk_pin (8.000 ns period)
Setup: PASS ✓
Hold: PASS ✓
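The summary numbers in a timing report can be read as simple aggregates over per-path slack values: WNS is the worst (smallest) slack, and TNS adds up only the negative ones. The following sketch shows that relationship under the simplifying assumption of one slack value per endpoint; `summarize_slack` is a hypothetical helper, not a Vivado command.

```python
def summarize_slack(slacks_ns):
    """Summarize per-endpoint setup slack values (in ns)."""
    negative = [s for s in slacks_ns if s < 0]
    return {
        "wns": min(slacks_ns),      # worst slack; positive means timing is met
        "tns": sum(negative),       # total of all negative slack; 0 if none
        "failing": len(negative),
        "met": min(slacks_ns) >= 0,
    }


passing = summarize_slack([0.234, 1.5, 3.2])
failing = summarize_slack([-0.25, -0.5, 1.0])
print(passing)
print(failing)
```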
Utilization Report
+----------------------------+-------+----------+
| Resource | Used | Available|
+----------------------------+-------+----------+
| Slice LUTs | 35 | 53200 |
| Slice Registers | 30 | 106400 |
| Block RAM | 0 | 140 |
| DSPs | 0 | 220 |
| I/O | 3 | 125 |
+----------------------------+-------+----------+
Power Estimate
Total On-Chip Power: 1.234 W
Dynamic Power: 0.123 W
Static Power: 1.111 W
Total Implementation Time
- Small designs: 2-5 minutes
- Medium designs: 5-15 minutes
- Large designs: 15-60 minutes
4. Bitstream Generation
Purpose
Create the binary configuration file that programs the FPGA.
What It Does
Input
- Fully placed and routed design
- Bitstream settings (from XDC)
- Configuration options
Process
- Read Design Database
- Extracts all placement information
- Extracts all routing information
- Reads configuration settings
- Generate Configuration Data
- For every LUT: truth table values
- For every FF: initial state
- For every routing switch: on/off state
- For every I/O: drive strength, slew rate, standard
- Apply Bitstream Options
- Compression (reduces file size), e.g. set_property BITSTREAM.GENERAL.COMPRESS TRUE [current_design]
- Encryption (security)
- CRC checking (error detection)
- Startup options
- Generate Binary File
- Creates .bit file (Xilinx format)
- Optional: .bin file (raw binary)
- Optional: .mcs file (for flash)
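Two of the options above, compression and CRC checking, can be illustrated with generic tools. The sketch below uses Python's `zlib` purely as a stand-in: the actual bitstream compression and CRC schemes are specific to the FPGA vendor and are not zlib. The 4096-byte buffer is made-up data representing a mostly unused configuration region.

```python
import zlib

# Stand-in "configuration data": mostly zeros, like an almost-empty device.
data = bytearray(4096)
data[100] = 0x80
original = bytes(data)

# A checksum computed at generation time...
crc_at_generation = zlib.crc32(original)

# ...no longer matches if a single bit flips in transit.
data[2000] ^= 0x01
corrupted = bytes(data)
crc_after_corruption = zlib.crc32(corrupted)
print("CRC matches after corruption:", crc_after_corruption == crc_at_generation)

# Highly repetitive data compresses very well.
compressed = zlib.compress(original)
print(f"{len(original)} bytes -> {len(compressed)} bytes")
```

The same two effects explain why a small design on a large device benefits most from compression, and why a failed CRC check stops configuration instead of starting a corrupted design.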
What’s in the Bitstream?
The bitstream configures every programmable element in the FPGA:
For each LUT:
- 64-bit truth table (for 6-input LUT)
- Function select bits
For each Flip-Flop:
- Initial value (0 or 1)
- Clock enable settings
- Synchronous/asynchronous reset
For each Routing Switch:
- ON or OFF state
- (Millions of switches!)
For each I/O Pin:
- Direction (input/output)
- Drive strength (2mA, 4mA, 8mA, 12mA, etc.)
- Slew rate (FAST or SLOW)
- Pull-up/pull-down
- I/O standard (LVCMOS33, LVDS, etc.)
Clock Networks:
- Clock buffer enable
- Clock routing configuration
Configuration:
- Bank voltage settings
- Configuration mode
- CRC values
Example: Configuring One LUT
LUT Location: Slice_X45Y67, LUT A
Truth Table for: out = a & b & c
Input: abc
000 → 0
001 → 0
010 → 0
011 → 0
100 → 0
101 → 0
110 → 0
111 → 1
Bitstream encodes: INIT = 0b10000000 (0x80), where bit i holds the output for input value i; a 6-input LUT stores 64 such bits in total
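A LUT's contents can be derived mechanically from the boolean function it implements. The sketch below assumes the common convention that bit i of the INIT value holds the output for input combination i, with the first argument as the least significant input; `lut_init` is a hypothetical helper for illustration.

```python
def lut_init(func, num_inputs):
    """Build a LUT INIT value: bit i is func's output for input combination i."""
    init = 0
    for index in range(2 ** num_inputs):
        # Bit k of the index is the value of input k.
        bits = [(index >> k) & 1 for k in range(num_inputs)]
        if func(*bits):
            init |= 1 << index
    return init


and3 = lut_init(lambda a, b, c: a & b & c, 3)
print(f"3-input AND: {and3:#010b} ({and3:#x})")

# The same function in a 6-input LUT, with the three upper inputs unused.
and3_in_lut6 = lut_init(lambda a, b, c, d, e, f: a & b & c, 6)
print(f"in a LUT6:   {and3_in_lut6:#018x}")
```

Only input combination 111 (index 7) produces a 1, so only bit 7 is set. When the function is stored in a 6-input LUT with unused inputs, the same 8-bit pattern repeats across all 64 bits.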
File Sizes
- Uncompressed: 4-6 MB (for Zynq-7020)
- Compressed: 2-3 MB
- Binary format: Similar size
Output Files
project.runs/impl_1/
├── blink_led.bit ← Main bitstream (JTAG programming)
├── blink_led.bin ← Binary format (SD card boot)
├── blink_led.ltx ← Debug probes (if using ILA)
└── blink_led_bd.bmm ← Memory mapping (if using processors)
Bitstream Settings Example
## From your XDC file:
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.COMPRESS TRUE [current_design]
## Other common options:
# set_property BITSTREAM.CONFIG.SPI_BUSWIDTH 4 [current_design]
# set_property BITSTREAM.CONFIG.CONFIGRATE 33 [current_design]
# set_property BITSTREAM.STARTUP.STARTUPCLK CCLK [current_design]
Time Required
- Small designs: 30-60 seconds
- Medium designs: 1-2 minutes
- Large designs: 2-5 minutes
Common Bitstream Errors
- CFGBVS not set → Add to XDC file
- CONFIG_VOLTAGE mismatch → Verify board voltage
- Conflicting properties → Check XDC constraints
- Encrypted bitstream issues → Check security settings
5. FPGA Programming
Purpose
Load the bitstream onto your physical FPGA board to run your design in hardware.
Programming Methods
5.1 JTAG Programming (Volatile)
Most common for development and testing.
How it works:
- Connect JTAG cable (USB) to board
- Power on board
- Use Vivado Hardware Manager
- Select device
- Program bitstream
Characteristics:
- ✅ Fast: 5-30 seconds
- ✅ Easy: Direct from Vivado
- ✅ Flexible: Reprogram instantly
- ❌ Volatile: Lost on power-off
- ❌ Requires cable: JTAG connection needed
Steps in Vivado:
1. Open Hardware Manager
2. Click "Open Target" → "Auto Connect"
3. Right-click on device
4. Select "Program Device"
5. Browse to .bit file
6. Click "Program"
7. Wait for "Device programmed successfully"
Time: 10-30 seconds
5.2 Flash Memory Programming (Non-Volatile)
For permanent designs that persist after power-off.
How it works:
- Generate .bin or .mcs file
- Program external flash memory
- FPGA loads from flash on power-up
Characteristics:
- ✅ Persistent: Survives power-off
- ✅ Standalone: No computer needed after programming
- ❌ Slower: Takes longer to program
- ❌ More steps: Need to generate flash image
Types:
- SPI Flash: Most common (quad-SPI)
- BPI Flash: Faster but less common
Time: 1-5 minutes (programming flash)
5.3 SD Card Boot (Zynq-specific)
For Zynq boards like PYNQ-Z2.
How it works:
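- Generate the bitstream
- Build a boot image named BOOT.BIN (for example with bootgen) that contains the first-stage boot loader (FSBL) and the bitstream
- Copy BOOT.BIN to a FAT32-formatted SD card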
- Insert SD card
- Power on board
Characteristics:
- ✅ No cable needed: Standalone boot
- ✅ Easy updates: Swap SD cards
- ✅ Portable: Move between boards
- ❌ Requires SD card: Extra hardware
- ❌ Boot time: 3-10 seconds on power-up
Time: Instant (after copying to SD card)
FPGA Configuration Process
When a bitstream is sent to the FPGA, the device goes through the following sequence:
1. Configuration Start
↓
2. Clear all configuration memory
↓
3. Load bitstream bit-by-bit
- Configure each LUT
- Set each flip-flop initial state
- Program each routing switch
- Configure each I/O buffer
↓
4. CRC Check (verify data integrity)
↓
5. Start-up sequence
- Release global reset
- Enable clocks
- Initialize flip-flops
↓
6. Design is RUNNING!
Configuration time:
- JTAG: 10-30 seconds
- Flash: 100-500 ms (automatic on power-up)
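The flash figure can be sanity-checked with a rough estimate: configuration time is approximately bitstream size divided by interface throughput. The sketch below assumes a 4 MB (32 Mbit) bitstream, a 33 MHz configuration clock, and a 4-bit-wide SPI bus; all three are illustrative values, and the estimate ignores start-up overhead.

```python
def config_time_ms(bitstream_bits, clock_mhz, bus_width):
    """Estimate configuration time in milliseconds, ignoring overhead."""
    bits_per_second = clock_mhz * 1e6 * bus_width
    return 1000.0 * bitstream_bits / bits_per_second


BITSTREAM_BITS = 32_000_000   # assumed size: about 4 MB uncompressed

quad_spi_ms = config_time_ms(BITSTREAM_BITS, clock_mhz=33, bus_width=4)
single_spi_ms = config_time_ms(BITSTREAM_BITS, clock_mhz=33, bus_width=1)
print(f"x4 SPI at 33 MHz: {quad_spi_ms:.0f} ms")
print(f"x1 SPI at 33 MHz: {single_spi_ms:.0f} ms")
```

Under these assumptions a quad-SPI load lands inside the 100-500 ms range quoted above, while a single-bit bus at the same clock would take roughly four times as long.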
Verification
After programming, verify your design works:
- Visual check: LEDs, displays should respond
- Test inputs: Press buttons, flip switches
- Observe outputs: Check LED behavior
- Use ILA: Internal logic analyzer for debugging
- UART output: Monitor serial debug messages
For Your Blink LED Design
Expected behavior after programming:
- Power on board
- Wait ~2.5 seconds
- LD0 (LED 0) turns ON
- Wait ~2.5 seconds
- LD0 turns OFF
- Repeat forever
To reset:
- Press BTN0 → LED turns off
- Release BTN0 → LED starts blinking again
Practical Example: Blink LED
Let’s trace what happens to your blink_led.sv design through each step.
Original Design
module blink_led (
    input wire clk,   // 125 MHz
    input wire reset,
    output reg led
);
    // Clock cycles between LED toggles: 312,500,000 / 125 MHz = 2.5 s
    parameter COUNTER_MAX = 312500000;
    reg [28:0] counter;

    always @(posedge clk) begin
        if (reset) begin
            counter <= 29'd0;
            led <= 1'b0;
        end else begin
            // begin/end is required: two statements share this branch
            if (counter == COUNTER_MAX - 1) begin
                counter <= 29'd0;
                led <= ~led;
            end else begin
                counter <= counter + 1;
            end
        end
    end
endmodule
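The two constants in this design, the terminal count and the counter width, follow directly from the clock frequency and the desired toggle interval. A short sketch of that arithmetic is shown below; the helper names are assumptions, and the 0.5 s case is just an extra example.

```python
CLK_HZ = 125_000_000          # board clock from the design above
COUNTER_MAX = 312_500_000     # parameter from the design above


def bits_needed(counter_max):
    """Width needed to hold values 0 .. counter_max - 1."""
    return (counter_max - 1).bit_length()


def counter_max_for(interval_s, clk_hz):
    """Terminal count for a given toggle interval."""
    return round(interval_s * clk_hz)


toggle_interval_s = COUNTER_MAX / CLK_HZ
print(f"toggle every {toggle_interval_s} s, full blink period {2 * toggle_interval_s} s")
print(f"counter width: {bits_needed(COUNTER_MAX)} bits")

faster = counter_max_for(0.5, CLK_HZ)
print(f"0.5 s toggle: COUNTER_MAX={faster}, width={bits_needed(faster)} bits")
```

This confirms that a 29-bit register is the narrowest that can hold the terminal count, since 2^28 is smaller than 312,499,999.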
Step 1: Synthesis
What happens:
1. Parse code:
- Found module "blink_led"
- 3 ports: clk (input), reset (input), led (output)
- Parameter COUNTER_MAX = 312500000
2. Analyze logic:
- 29-bit counter (reg [28:0])
- 1-bit LED register
- Comparison: counter == 312499999
- Increment: counter + 1
- Toggle: led <= ~led
3. Map to FPGA resources:
- Counter register → 29 flip-flops
- Adder (+1) → ~15 LUTs
- Comparator (==) → ~15 LUTs
- Toggle logic (~) → 1 LUT
- Mux for reset → ~3 LUTs
- LED register → 1 flip-flop
TOTAL: ~35 LUTs, 30 flip-flops
4. Process constraints (from XDC):
- clk at pin H16 → Clock capable pin
- reset at pin D19 → General I/O
- led at pin R14 → General I/O
- All use LVCMOS33 standard
- Clock period: 8 ns (125 MHz)
Output:
Resource Utilization:
LUTs: 35 / 53200 (0.07%)
FFs: 30 / 106400 (0.03%)
I/O: 3 / 125 (2.4%)
Timing Estimate:
Max Frequency: ~250 MHz (plenty of margin!)
Step 2: Implementation – Placement
What happens:
Physical placement on chip:
Clock Input:
H16 → IBUF → Clock Buffer (BUFG) → Global clock network
Counter Flip-Flops (placed close together):
counter[0] → Slice_X45Y100_FF
counter[1] → Slice_X45Y100_FF (same slice)
counter[2] → Slice_X45Y101_FF
counter[3] → Slice_X45Y101_FF
...
counter[28] → Slice_X45Y114_FF
Adder LUTs (placed near counter):
adder[0] → Slice_X45Y100_LUT
adder[1] → Slice_X45Y100_LUT
...
Comparator LUTs:
cmp_logic → Slice_X46Y105_LUT
LED Register:
led_reg → Slice_X50Y90_FF (near output pin)
I/O Buffers:
reset_IBUF → IOB_X0Y50 (pin D19)
led_OBUF → IOB_X1Y14 (pin R14)
Step 3: Implementation – Routing
What happens:
Clock Routing:
H16 (external pin)
→ IBUF (input buffer)
→ BUFG (global clock buffer)
→ Global clock tree
→ All 30 flip-flops (low skew!)
Reset Routing:
D19 (external pin)
→ IBUF (input buffer)
→ Control set routing
→ All 30 flip-flops
Counter Logic Routing:
counter[0].Q → adder_LUT.I0
counter[1].Q → adder_LUT.I1
...
adder_LUT.O → counter[0].D (feedback)
All counter bits → comparator inputs
comparator.O → toggle_mux → led.D
LED Output Routing:
led_reg.Q
→ Local routing
→ OBUF (output buffer)
→ R14 (external pin)
Routing visualization:
Clock Network (Global)
↓↓↓↓↓↓↓↓↓↓
┌─────────────────────┐
│ Counter FFs (29) │
│ [0][1][2]...[28] │
└────┬──────────┬─────┘
│ │
┌────▼────┐ ┌──▼──────┐
│ Adder │ │ Compare│
│ LUTs │ │ LUTs │
└────┬────┘ └──┬──────┘
│ │
└──→ Mux ←┘
↓
┌───▼────┐
│ LED FF │
└───┬────┘
↓
Output R14
Timing analysis result:
Clock Period: 8.000 ns (125 MHz)
Critical Path (longest):
counter[27].Q → adder → comparator → mux → counter[0].D
Path Delay: 3.245 ns
Setup Time: 0.125 ns
Total: 3.370 ns
Slack: 8.000 - 3.370 = 4.630 ns ✓ PASS
Conclusion: Design meets timing at 125 MHz!
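The slack calculation above also gives an upper bound on the clock frequency this path could support. The sketch below repeats the arithmetic in simplified form; it ignores clock skew and uncertainty, which a real static timing analysis includes, and the function names are illustrative.

```python
def setup_slack_ns(period_ns, path_delay_ns, setup_ns):
    """Setup slack = clock period minus data arrival requirement."""
    return period_ns - (path_delay_ns + setup_ns)


def fmax_mhz(path_delay_ns, setup_ns):
    """Highest clock frequency at which slack is still non-negative."""
    return 1000.0 / (path_delay_ns + setup_ns)


slack = setup_slack_ns(period_ns=8.000, path_delay_ns=3.245, setup_ns=0.125)
fmax = fmax_mhz(path_delay_ns=3.245, setup_ns=0.125)
print(f"slack = {slack:.3f} ns, estimated Fmax = {fmax:.1f} MHz")
```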
Step 4: Bitstream Generation
What’s configured in the bitstream:
Configuration Data Generated:
1. Clock Pin H16:
- IOB configuration: INPUT
- IOSTANDARD: LVCMOS33
- Connected to: BUFG
2. Reset Pin D19:
- IOB configuration: INPUT
- IOSTANDARD: LVCMOS33
- Connected to: control set routing
3. LED Pin R14:
- IOB configuration: OUTPUT
- IOSTANDARD: LVCMOS33
- Drive strength: 12mA
- Slew rate: SLOW
4. Counter Flip-Flops (29):
- Each initialized to: 0
- Clock source: BUFG
- Reset source: reset signal
- D input: From adder/mux
5. LED Flip-Flop:
- Initialized to: 0
- Clock source: BUFG
- Reset source: reset signal
- D input: From toggle logic
6. LUT Configurations:
- Adder LUTs: Truth tables for +1 operation
- Comparator LUTs: Truth tables for == 312499999
- Mux LUTs: Truth tables for reset mux
- Toggle LUT: Truth table for NOT operation
7. Routing Switches:
- ~500 routing switches configured
- Clock tree: All switches to distribute clock
- Signal routing: Connect all logic paths
8. Global Settings:
- CFGBVS: VCCO
- CONFIG_VOLTAGE: 3.3V
- Compression: ENABLED
File generated:
blink_led.bit
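Size: about 4 MB uncompressed for the xc7z020; smaller with compression enabled, especially for a design this small
Contains: roughly 32.4 million bits of configuration data before compression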
Step 5: Programming & Execution
What happens when you program the FPGA:
1. JTAG sends bitstream to FPGA (10 seconds)
2. FPGA configures itself:
- Sets all LUT truth tables
- Initializes all FFs to 0
- Configures all routing switches
- Sets up I/O buffers
3. Start-up sequence:
- Global reset released
- Clock starts running
- Design begins execution
4. Your design runs:
Clock cycle 1:
counter = 0, led = 0
Clock cycle 2:
counter = 1, led = 0
Clock cycle 3:
counter = 2, led = 0
...
Clock cycle 312,500,000:
counter = 312,499,999
Compare = TRUE → counter resets to 0
led toggles: 0 → 1
*** LED TURNS ON! ***
Clock cycle 312,500,001:
counter = 0, led = 1
...
Clock cycle 625,000,000:
counter = 312,499,999
led toggles: 1 → 0
*** LED TURNS OFF! ***
(Repeat forever)
Real-time on board:
T = 0.0s: LED OFF (power on)
T = 2.5s: LED ON ← First toggle
T = 5.0s: LED OFF ← Second toggle
T = 7.5s: LED ON ← Third toggle
T = 10.0s: LED OFF ← Fourth toggle
...continues forever...
Troubleshooting
Synthesis Errors
| Error | Cause | Solution |
|---|---|---|
| “cannot find port ‘xxx’” | XDC port name doesn’t match RTL | Fix port name in XDC or RTL |
| “cannot determine constant value” | Non-constant in parameter | Use localparam or constant |
| “multi-driven net” | Multiple assignments to same signal | Remove duplicate drivers |
| “instantiation of ‘xxx’ failed” | Missing module | Add source file or check spelling |
Implementation Errors
| Error | Cause | Solution |
|---|---|---|
| “timing not met” | Critical path too long | Reduce clock frequency or optimize logic |
| “unroutable design” | Too congested | Use fewer resources or partition design |
| “cannot place IOB” | Pin doesn’t support function | Choose different pin or standard |
| “hold violation” | Clock skew too large | Add delay or use different routing |
Bitstream Errors
| Error | Cause | Solution |
|---|---|---|
| “CFGBVS must be set” | Missing configuration property | Add to XDC: set_property CFGBVS VCCO [current_design] |
| “CONFIG_VOLTAGE required” | Missing voltage setting | Add to XDC: set_property CONFIG_VOLTAGE 3.3 [current_design] |
| “conflicting properties” | Multiple XDC constraints conflict | Review and fix XDC file |
Programming Errors
| Error | Cause | Solution |
|---|---|---|
| “no hardware targets found” | JTAG not connected | Check USB cable and drivers |
| “device not responding” | Power issue or wrong device | Check power, select correct device |
| “programming failed” | Bitstream corruption | Regenerate bitstream |
| “device does not match” | Wrong FPGA part | Check project settings |
Quick Reference
Complete Flow Commands (Vivado TCL)
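# Read sources (non-project flow; the XDC file name is an example)
read_verilog -sv blink_led.sv
read_xdc blink_led.xdc
# Synthesis
synth_design -top blink_led -part xc7z020clg400-1
# Implementation
opt_design
place_design
phys_opt_design
route_design
# Optional post-route physical optimization
phys_opt_design
# Bitstream
write_bitstream -force blink_led.bit
# Programming (Hardware Manager)
open_hw_manager
connect_hw_server
open_hw_target
# Tell the Hardware Manager which bitstream to load, then program
set_property PROGRAM.FILE {blink_led.bit} [get_hw_devices xc7z020_1]
program_hw_devices [get_hw_devices xc7z020_1]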
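For repeatable builds, the build portion of such a script can be generated rather than typed by hand. The sketch below assembles a non-project-mode Tcl script as text; the helper `build_flow_script` and the file names `blink_led.xdc` and `build.tcl` are assumptions for illustration, and nothing here invokes Vivado itself.

```python
def build_flow_script(top, part, sources, constraints, bitfile):
    """Return the text of a Vivado non-project-mode build script."""
    lines = []
    for src in sources:
        # SystemVerilog files need the -sv switch on read_verilog.
        flag = " -sv" if src.endswith(".sv") else ""
        lines.append(f"read_verilog{flag} {src}")
    for xdc in constraints:
        lines.append(f"read_xdc {xdc}")
    lines += [
        f"synth_design -top {top} -part {part}",
        "opt_design",
        "place_design",
        "phys_opt_design",
        "route_design",
        f"write_bitstream -force {bitfile}",
    ]
    return "\n".join(lines) + "\n"


script = build_flow_script(
    top="blink_led",
    part="xc7z020clg400-1",
    sources=["blink_led.sv"],
    constraints=["blink_led.xdc"],   # hypothetical constraint file name
    bitfile="blink_led.bit",
)
print(script)
```

Saved to a file such as `build.tcl`, the result could then be run with `vivado -mode batch -source build.tcl`, assuming Vivado is installed and on the PATH.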
Time Estimates
| Step | Small Design | Medium Design | Large Design |
|---|---|---|---|
| Synthesis | 1-2 min | 2-5 min | 5-15 min |
| Place | 1-2 min | 3-7 min | 10-30 min |
| Route | 1-3 min | 3-10 min | 10-60 min |
| Bitstream | 0.5-1 min | 1-2 min | 2-5 min |
| Programming | 10-30 sec | 10-30 sec | 10-30 sec |
| Total | 4-8 min | 10-25 min | 30-110 min |
Resource Equivalents
| RTL Construct | FPGA Resource |
|---|---|
| reg / logic with @(posedge clk) | Flip-Flop (FF) |
| Combinational assign | Look-Up Table (LUT) |
| if/else statements | Multiplexers (LUTs) |
| Arithmetic +, -, * | LUTs or DSP blocks |
| case statements | Multiplexers (LUTs) |
| reg [7:0] mem [0:255] | Distributed RAM or BRAM |
| State machines | FFs + LUTs |
Key Files
| File | Purpose | When Created |
|---|---|---|
| .sv / .v | RTL source code | You write this |
| .xdc | Constraints (pins, timing) | You write this |
| .dcp | Design checkpoint | After each step |
| .bit | Bitstream for JTAG | After bitstream gen |
| .bin | Binary for SD/Flash | After bitstream gen |
| .ltx | Debug probes | If using ILA |
Summary
Key Takeaways
- Synthesis = RTL → Gates (logical design)
- Implementation = Gates → Physical locations & wires (physical design)
- Bitstream = Physical design → Binary configuration file
- Programming = Load binary into FPGA hardware
Why Each Step Matters
- Synthesis checks if your code is valid and synthesizable
- Placement determines performance (speed, power)
- Routing determines if design is physically realizable
- Bitstream packages everything for the FPGA
- Programming puts your design into action
The Big Picture
Your Code (Abstract)
↓
Synthesis
↓
Logic Gates (Generic)
↓
Implementation
↓
Physical Design (Chip-Specific)
↓
Bitstream
↓
Configuration Data
↓
Programming
↓
Running Hardware! 🎉