International Journal of Reconfigurable Computing

Research Article

A Scalable Unsegmented Multiport Memory for FPGA-Based Systems

Table 1

Analysis of the two multiport memory designs.


	Ports		4	8	16	32	64	128	256
	Memory space		16 KB	32 KB	64 KB	128 KB	256 KB	512 KB	1 MB

Small resource: linked list FIFO and reorder queue depth set to 64	Fully connected multiport memory
	Resource	Registers	4 K	14 K	50 K	190 K	728 K
	Utilization	LUTs	5.7 K	18 K	61 K	241 K	906 K
	Virtex 7	BlockRAM	4	8	16	32	64
	V2000T²	Clock frequency	345 MHz	313 MHz	256 MHz	273 MHz	230 MHz
	Sequential	Throughput	100%	100%	100%	100%	50%
	Sequential	Latency¹	16	20	36	64	128
	Random	Throughput	97%	93%	88%	72%	49%
	Random	Latency¹	66	65	85	97	144
	Congested	Throughput	25%	13%	6%	3%	2%
	Congested	Latency¹	105	230	490	1034	2780
	Segregated	Throughput	100%	100%	100%	100%	100%
	Segregated	Latency¹	16	24	34	63	61
	Omega multiport memory
	Resource	Registers	3 K	9 K	22 K	53 K
	Utilization	LUTs	5 K	11 K	24 K	53 K
	Virtex 7	BlockRAM	4	8	16	32
	V2000T²	Clock frequency	258 MHz	257 MHz	260 MHz	262 MHz
	Sequential	Throughput	100%	100%	100%	100%
	Sequential	Latency¹	17	25	37	56
	Random	Throughput	94%	83%	68%	52%
	Random	Latency¹	72	110	131	193
	Congested	Throughput	25%	13%	6%	3%
	Congested	Latency¹	250	462	786	1046
	Segregated	Throughput	25%	13%	6%	3%
	Segregated	Latency¹	247	461	756	1043

Large resource: linked list FIFO and reorder queue depth set to 512	Fully connected multiport memory
	Resource	Registers	4.2 K	14 K	50 K	191 K	744 K
	Utilization	LUTs	5.3 K	17 K	60 K	241 K	928 K
	Virtex 7	BlockRAM	7	13	25	48	96
	V2000T²	Clock frequency	352 MHz	315 MHz	253 MHz	271 MHz	230 MHz
	Sequential	Throughput	100%	100%	100%	100%	100%
	Sequential	Latency¹	16	20	36	68	100
	Random	Throughput	96%	95%	94%	99%	98%
	Random	Latency¹	134	296	512	553	616
	Congested	Throughput	25%	13%	6%	3%	2%
	Congested	Latency¹	105	231	491	1019	2750
	Segregated	Throughput	100%	100%	100%	100%	100%
	Segregated	Latency¹	16	23	38	61	63
	Omega multiport memory
	Resource	Registers	3.0 K	8.4 K	23 K	53 K	125 K	300 K	677 K
	Utilization	LUTs	7.3 K	16 K	36 K	77 K	163 K	345 K	746 K
	Virtex 7	BlockRAM	9	17	32	65	129	257	513
	V2000T²	Clock frequency	234 MHz	234 MHz	230 MHz	230 MHz	225 MHz	202 MHz	175 MHz
	Sequential	Throughput	100%	100%	100%	100%	100%	100%	100%
	Sequential	Latency¹	17	25	37	57	93	161	293
	Random	Throughput	100%	99%	96%	89%	78%	63%	48%
	Random	Latency¹	312	580	566	751	1182	1397	2072
	Congested	Throughput	25%	13%	6%	3%	2%	1%	0.4%
	Congested	Latency¹	2040	4046	7954	15351	28698	49182	65556
	Segregated	Throughput	25%	13%	6%	3%	2%	1%	0.4%
	Segregated	Latency¹	2037	4039	7953	15331	28593	49174	65388

¹Latency is the number of clock cycles between the end of the benchmark and the arrival of the last response to the last request. In the worst case, several FIFOs queue data that must wait for access to the same bank.
²The Virtex 7 V2000T has 2.4 M registers, 1.2 M LUTs, and 1.3 K RAM blocks.
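
The resource rows are easier to judge as fractions of the device capacity quoted in the footnote, and the congested throughput row appears to follow a simple 1/ports pattern. The following Python sketch works through both pieces of arithmetic. It is an illustration only: the function names are hypothetical, "K" and "M" are assumed to mean 10^3 and 10^6, and the 1/ports model (every port contending for one bank that serves one request per cycle) is an assumption that happens to match the reported numbers, not a formula stated in the table.

```python
# Device capacity from the table footnote (Virtex 7 V2000T).
# Assumption: "M" = 1e6 and "K" = 1e3.
CAPACITY = {"registers": 2_400_000, "luts": 1_200_000, "bram": 1_300}

# Resource rows copied from the "large resource" half of the table.
FULLY_CONNECTED_64 = {"registers": 744_000, "luts": 928_000, "bram": 96}
OMEGA_256 = {"registers": 677_000, "luts": 746_000, "bram": 513}

# Congested throughput (percent) reported for the large-resource Omega design.
REPORTED_CONGESTED = {4: 25, 8: 13, 16: 6, 32: 3, 64: 2, 128: 1, 256: 0.4}


def utilization(used, capacity=CAPACITY):
    """Return the fraction of each device resource that a design consumes."""
    return {name: used[name] / capacity[name] for name in used}


def ideal_congested_throughput(ports):
    """Per-port throughput if all ports share one bank (assumed model)."""
    return 1.0 / ports


if __name__ == "__main__":
    for label, design in [("fully connected, 64 ports", FULLY_CONNECTED_64),
                          ("Omega, 256 ports", OMEGA_256)]:
        shares = utilization(design)
        summary = ", ".join(f"{k} {v:.1%}" for k, v in shares.items())
        print(f"{label}: {summary}")

    for ports, reported in REPORTED_CONGESTED.items():
        model = 100 * ideal_congested_throughput(ports)
        print(f"{ports:>3} ports: model {model:.2f}%, reported {reported}%")
```

Under these assumptions, the 64-port fully connected design already uses roughly three quarters of the device's LUTs, while the 256-port Omega design uses about 62% of the LUTs and about 39% of the RAM blocks.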