Title:
FPGA integrated circuit having embedded sram memory blocks with registered address and data input sections
Document Type and Number:
United States Patent RE39510

Abstract:
A field-programmable gate array device (FPGA) having plural rows and columns of logic function units (VGB's) further includes a plurality of embedded memory blocks, where each memory block is embedded in a corresponding row of logic function units. Each embedded memory block has a registered address port for capturing received address signals in response to further-received, address-validating clock signals. Interconnect resources are provided for conveying the address-validating clock signals to address-changing circuitry so that a next address can be generated safely in conjunction with the capturing by the registered address port of a previous address signal.
Inventors:
Agrawal, Om P. (Los Altos, CA, US)
Chang, Herman M. (Cupertino, CA, US)
Sharpe-Geisler, Bradley A. (San Jose, CA, US)
Nguyen, Bai (San Jose, CA, US)
Application Number:
10/392751
Publication Date:
03/13/2007
Filing Date:
03/20/2003
Assignee:
Lattice Semiconductor Corporation (Hillsboro, OR, US)
Primary Class:
Other Classes:
326/38
International Classes:
G06F7/38; H03K19/177
Field of Search:
326/38, 326/47, 326/41, 326/46, 326/40
US Patent References:
5689195 | Programmable logic array integrated circuit devices | November, 1997 | Cliff et al. | 326/41
5744980 | Flexible, high-performance static RAM architecture for field-programmable gate arrays | April, 1998 | McGowan et al. | 326/40
5787007 | Structure and method for loading RAM data within a programmable logic device | July, 1998 | Bauer | 716/16
5828229 | Programmable logic array integrated circuits | October, 1998 | Cliff et al. | 326/40
6127843 | Dual port SRAM memory for run time use in FPGA integrated circuits | October, 2000 | Agrawal et al. | 326/40
Foreign References:
WO/1998/010517 | March, 1998 | FPGA ARCHITECTURE HAVING RAM BLOCKS WITH PROGRAMMABLE WORD LENGTH AND WIDTH AND DEDICATED ADDRESS AND DATA LINES
Primary Examiner:
Chang, Daniel D.
Claims:
What is claimed is:

1. A field programmable gate array (FPGA) device comprising: (a) a first plurality, P1 of repeated logic units wherein: (a.1) each said logic unit is user-configurable to acquire and process at least a second plurality, P2 of input logic bits and to responsively produce result data having at least a third plurality, P3 of output logic bits, (a.2) said logic units are distributed among a plurality of horizontal rows, with each row of the plurality of rows having a fourth plurality, P4 of said logic units; (b) a fifth plurality, P5 of horizontal interconnect channels (HIC's) correspondingly distributed adjacent to said horizontal rows of logic units, wherein: (b.1) each said horizontal interconnect channel (HIC) includes at least P3 interconnect lines, and (b.2) each said horizontal row of P4 logic units is configurably couplable to at least a corresponding one of the P5 HIC's at least for acquiring input logic bits from the corresponding HIC and for outputting result data to the corresponding HIC; (a.3) wherein each of said logic units can internally process its respective second plurality of input logic bits without using said horizontal interconnect channels or other general interconnect for such internal processing; and (c) an embedded memory subsystem, wherein said embedded memory subsystem includes: (c.1) a sixth plurality, P6 of independently-useable memory blocks, and wherein: (c.1a) each said independently-useable memory block is embedded within one of said rows of logic units and is configurably couplable to the corresponding HIC of said row for transferring storage data by way of the corresponding HIC of that row of P4 logic units; and (c.1b) each of said memory blocks includes at least a first address-capturing register that is programmably couplable to at least one of said HIC's for receiving and capturing in synchronism with a supplied address-strobing signal, an address signal supplied on said at least one HIC; (c.1c) each of said memory blocks includes at least a first data-capturing register for capturing said storage data in synchronism with a supplied first data-strobing signal; and (c.1d) each first address-capturing register is clockable by a first address-strobing signal that is independent of the first data-strobing signal.
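
As an informal illustration of elements (c.1b) through (c.1d) of claim 1, the following Python sketch models one memory block whose address-capturing register and data-capturing register are clocked by separate strobes. The class name, method names, depth (32 words), and width (4 bits) are assumptions made for this sketch and are not taken from the patent; it is a behavioral model only, not a description of the actual circuit.

```python
class MemoryBlock:
    """Hypothetical behavioral model of one embedded memory block."""

    def __init__(self, depth=32, width=4):
        # depth and width are illustrative values, not figures from the claims
        self.cells = [0] * depth
        self.mask = (1 << width) - 1
        self.addr_reg = 0  # first address-capturing register
        self.data_reg = 0  # first data-capturing register

    def strobe_address(self, addr_on_hic):
        """Address-strobing signal: capture the address present on the HIC."""
        self.addr_reg = addr_on_hic % len(self.cells)

    def strobe_data(self, data_on_hic):
        """Data-strobing signal: capture storage data, independently of the address strobe."""
        self.data_reg = data_on_hic & self.mask

    def write(self):
        """Store the captured data at the captured address."""
        self.cells[self.addr_reg] = self.data_reg

    def read(self):
        """Return the word at the captured address."""
        return self.cells[self.addr_reg]


block = MemoryBlock()
block.strobe_address(5)     # address is captured first
block.strobe_data(0b1010)   # data is captured later, on a separate strobe
block.write()
stored = block.read()
```

Because the address is held in its own register, the address lines on the interconnect are free to change once `strobe_address` has occurred, while the data strobe can arrive at a different time.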

2. A FPGA device according to claim 1 wherein: (a.3) said logic units are further distributed among a plurality of vertical columns, with each column of the plurality of columns having a seventh plurality, P7 of said logic units; and (c.1b1) plural ones of said memory blocks are arranged to define one or more columns of embedded memory within said FPGA device with each such column having an eighth plurality, P8 of said memory blocks.

3. A field programmable gate array device according to claim 2 wherein: (c.1c1) each said memory block is organized as a ninth plurality, P9 of addressable sets of storage data bits, where each addressable set of storage data bits includes at least P3 bits that are transferable by way of the corresponding HIC of its corresponding row of P4 logic units, said P3 plurality of bits corresponding to the P3 plurality of output logic bits producible by each said logic unit.

4. A field programmable gate array device according to claim 3 wherein: (c.1c2) each of P2 and P3 is an integer equal to or greater than 4.

5. A field programmable gate array device according to claim 1 wherein: (a.3) groups of said logic units are further wedged together such that no HIC's pass between the wedged together logic units, and such that each group of logic units defines a logic superstructure; and (c.1c2) groups of said memory blocks are also wedged together such that no HIC's pass between the wedged together memory blocks, and such that each group of memory blocks defines a memory superstructure that is configurably-couplable to a corresponding logic superstructure.

6. A field programmable gate array device according to claim 1 wherein said embedded memory subsystem includes: (c.2) at least one special interconnect channel for supplying address signals to the first address-capturing registers of a respective set of said memory blocks.

7. A field programmable gate array device according to claim 6 wherein: (c.1b1) there are at least two of said columns of embedded memory; and (c.2a) there are at least two of said special interconnect channels, and each respective special interconnect channel is for supplying address signals to a respective one of the at least two columns of embedded memory.

8. A field programmable gate array device according to claim 6 wherein: (c.1c3) each said memory block has at least first and second data ports each for outputting storage data; (c.1d) each said memory block has at least first and second address ports each for receiving address signals identifying the storage data to be output by a corresponding one of the at least first and second data ports; (c.1e) each said memory block has in addition to said respective first address-capturing register, a second address-capturing register that is programmably couplable to at least one of said HIC's for receiving and capturing an address signal supplied on said at least one HIC, and said first and second address-capturing registers respectively service the first and second address ports; and (c.2a) the at least one special interconnect channel includes first and second address-carrying components along which independent address signals may be respectively carried for application to respective ones of the first and second address ports of at least two memory blocks.

9. A field programmable gate array device according to claim 6 wherein: (c.1d) each said memory block has a controls-receiving port for programmably acquiring from said at least one special interconnect channel, control signals that control operations of said memory block; and (c.1e) said first address-strobing signal is acquired by said controls-receiving port.

10. In a field programmable gate array device (FPGA) having a user-configurable interconnect network that includes a plurality of horizontal interconnect channels each with a diversified set of long-haul interconnect lines and shorter-haul interconnect lines, an embedded memory subsystem comprising: (a) a plurality of multi-ported memory blocks each arranged adjacent to a horizontal interconnect channel (HIC) of the interconnect network; wherein: (a.1) each multi-ported memory block includes a first, independently-addressable data port and a second, independently-addressable data port; (a.2) each of said first and second, independently-addressable data ports includes a respective address-capturing register that is connectable by user-configurable intercouplings to one or both of the long-haul interconnect lines and the shorter-haul interconnect lines for capturing a respective address signal in synchronism with a supplied, address-strobing signal; and (a.3) each of said first and second, independently-addressable data ports includes a respective, read data-capturing register that is connectable by user-configurable intercouplings to at least the long-haul interconnect lines for capturing respective read data of the port independently of the address-strobing signal and for outputting the captured read data to the long-haul interconnect lines.

11. In an FPGA device having a plurality of variable grain, configurable logic blocks (VGB's) and VGB interconnect resources including lines of diversified continuous lengths for interconnecting said VGB's, an embedded memory subsystem comprising: a special interconnect channel, programmably couplable to said VGB interconnect resources; and a plurality of memory blocks wherein each memory block includes: (a) at least a first address-capturing register that is programmably couplable to said VGB interconnect resources by way of said special interconnect channel for receiving and capturing a respective first address signal supplied by way of said VGB interconnect resources and said special interconnect channel; and (a.1) address-strobing means for strobing the first address-capturing register by way of said VGB interconnect resources and said special interconnect channel, where said address-strobing can occur independently of data-capture strobing for corresponding data.

12. The embedded memory subsystem of claim 11 wherein each memory block further includes: (b) a second address-capturing register that is programmably couplable to said interconnect resources for receiving and capturing a respective second address signal supplied by way of said VGB interconnect resources.

13. The embedded memory subsystem of claim 11 wherein: (a.1) said first address-capturing register is further programmably couplable to said VGB interconnect resources by way of the special interconnect channel for receiving a respective first address clock signal to which the first address-capturing register is responsive.

14. A method for use in an FPGA device having plural variable grain blocks (VGB's), configurable interconnect resources with continuous conductors of diversified lengths, and an embedded memory subsystem comprising a plurality of memory blocks situated for configurable coupling to the diversified interconnect resources, where the memory blocks each have at least one address input port and at least one data port, the address input port having a respective address-capturing register, said method comprising the steps of: (a) outputting a first address signal for conveyance by at least part of said interconnect resources to a respective first address-capturing register of an address input port of a given memory block; (b) outputting a first address-strobing signal for conveyance by at least part of said interconnect resources to the respective first address-capturing register to thereby capture the conveyed first address signal in the respective first address-capturing register of the given memory block; and (d) coupling the first address-strobing signal through delaying logic for thereby invoking a delay in outputting of a next address signal for conveyance by at least part of said interconnect resources to the respective first address-capturing register of the address input port of the given memory block, said invoked delay assuring that the first address signal is captured by the respective first address-capturing register before the outputting of said next address signal.
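
The ordering required by step (d) of claim 14 can be sketched informally as a small event simulation in Python. The function name `simulate`, the time units, and the example addresses are all assumptions made for this sketch; `change_delay` stands in for the delaying logic through which the address-strobing signal is coupled.

```python
def simulate(addresses, strobe_times, change_delay):
    """Return the address captured at each strobe.

    Each strobe at time t captures whatever address is on the interconnect
    at t. The same strobe, delayed by change_delay, lets the address source
    advance to its next address (hypothetical model of step (d)).
    """
    events = []
    for i, t in enumerate(strobe_times):
        events.append((t, 0, "capture", i))                 # capture wins ties
        events.append((t + change_delay, 1, "advance", i))  # next address released
    events.sort()

    index = 0        # position of the address currently driven on the interconnect
    captured = []
    for _, _, kind, _ in events:
        if kind == "capture":
            captured.append(addresses[index])
        else:
            index = min(index + 1, len(addresses) - 1)
    return captured


# Delayed address change: every address is captured before it is replaced.
safe = simulate([3, 7, 11], [0, 10, 20], change_delay=1)

# Address changes before the strobe (negative delay): the first address is lost.
unsafe = simulate([3, 7, 11], [0, 10, 20], change_delay=-1)
```

In this model a non-negative delay yields the captured sequence 3, 7, 11, whereas letting the source change ahead of the strobe yields 7, 11, 11, which is the hazard the invoked delay is meant to rule out.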

15. The method of claim 14 wherein at least one of said step (a) of outputting the first address signal and said step (d) of coupling the first address-strobing signal through delaying logic includes the substep of: (a/d.1) transmitting the first address signal through a configurable sequential output element of a first of said VGB's.

16. The method of claim 15 wherein at least one of said step (a) of outputting the first address signal and said step (d) of coupling the first address-strobing signal through delaying logic includes the further substep of: (a/d.2) sourcing the first address signal from a storage register within a configurable sequential element of said first of said VGB's.

17. The method of claim 16 wherein at least one of said step (a) of outputting the first address signal and said step (d) of coupling the first address-strobing signal through delaying logic includes the further substep of: (a/d.3) applying an address-changing clock signal to the storage register that sources the first address signal, where said address-changing clock signal is derived from the first address-strobing signal.

18. The method of claim 14 wherein said step (a) of outputting the first address signal includes the substeps of: (a.1) transmitting the first address signal through a first of plural tristate drivers, where each of the tristate drivers has an output enabling terminal; (a.2) providing an address-changing control signal that deactivates the output enabling terminal of the first tristate driver, where said address-changing control signal is derived from the first address-strobing signal.

19. A method for configuring an FPGA device having plural variable grain blocks (VGB's), configurable interconnect resources, and an embedded memory subsystem comprising one or more memory blocks situated for configurable coupling via the configurable interconnect resources to the VGB's, where the memory blocks each have at least one registered address input port for receiving and storing supplied address bits, and where the memory blocks each further have at least one registered data output port for storing and outputting retrieved read-data, said method comprising the steps of: (a) defining a first route through said interconnect resources from an address signal sourcing circuit of the FPGA device to the at least one registered address input port; (b) defining a second route through said interconnect resources from an address clock sourcing circuit of the FPGA device to the at least one registered address input port; (c) defining a third route through said interconnect resources from the address clock sourcing circuit to an address-changing circuit of the FPGA device, the third route being configured such that a new address signal can be produced by action of said address-changing circuit substantially at the same time or shortly after an address clock signal of the address clock sourcing circuit clocks the at least one registered address input port, said new address signal being produced so as to not interfere with a current address signal captured by the registered address input port; and (d) defining a fourth route through said interconnect resources from a read clock sourcing circuit of the FPGA device to the at least one registered data output port.

20. A method for producing configuration signals for configuring an FPGA device having plural variable grain blocks (VGB's), configurable interconnect resources, and an embedded memory subsystem comprising one or more memory blocks situated for configurable coupling via the configurable interconnect resources to the VGB's, where the memory blocks each have at least one registered address input port for receiving and storing supplied address bits in response to a supplied address-strobing signal, and where the memory blocks each have at least one registered data output port for storing and outputting retrieved read-data, the storing of the retrieved read-data being in response to a supplied data-strobing signal, said method comprising the steps of: (a) inputting a design definition; (b) searching the input design definition for the presence of one or more memory modules, address-sourcing modules, and data-using modules that will cooperate to perform a memory read or memory write operation; and (c) encouraging the creation in the configured FPGA of a shared signal route that transmits an address-strobing clock signal to the registered address input port and that transmits an address-change allowing signal to one or more of the address-sourcing modules and that transmits a data-strobing signal to one or more of the registered data output ports.
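
Steps (a) through (c) of claim 20 can be pictured, very loosely, as a search over a netlist. The Python sketch below assumes a toy design definition made of a module-kind table and a list of (source, destination) nets; the dictionary layout, the kind labels, the function name `find_memory_groups`, and the `.addr_clk` / `.data_clk` sink names are all hypothetical and do not reflect any actual place-and-route tool.

```python
def find_memory_groups(design):
    """Group each memory module with its address sources and data users.

    design["modules"] maps a module name to a kind string;
    design["nets"] is a list of (source, destination) pairs.
    Both layouts are assumptions made for this sketch.
    """
    kinds = design["modules"]
    groups = []
    for name, kind in kinds.items():
        if kind != "memory":
            continue
        sources = sorted(s for s, d in design["nets"]
                         if d == name and kinds[s] == "address_source")
        users = sorted(d for s, d in design["nets"]
                       if s == name and kinds[d] == "data_user")
        if sources or users:
            groups.append({
                "memory": name,
                "address_sources": sources,
                "data_users": users,
                # candidate sinks for one shared strobe route, per step (c)
                "shared_route_sinks": [name + ".addr_clk", name + ".data_clk"] + sources,
            })
    return groups


design = {
    "modules": {"ram0": "memory", "ctr0": "address_source",
                "alu0": "data_user", "led0": "other"},
    "nets": [("ctr0", "ram0"), ("ram0", "alu0"), ("alu0", "led0")],
}
groups = find_memory_groups(design)
```

A real flow would pass such groups to the router as a preference rather than a hard constraint, which is one reading of the word "encouraging" in step (c).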

21. A field programmable gate array (FPGA) device comprising: (a) a plurality of configurable logic blocks (CLB's); (b) configurable CLB interconnect resources for configurably interconnecting said CLB's; (c) a memory subsystem comprising: (c.1) a plurality of independently-usable memory blocks each having: (c.1a) a shared array of memory cells; (c.1b) a first port unit coupled to the shared array and including a respective first data output port and a first address input port; (c.1c) a second port unit coupled to the shared array and including a respective second data output port and a second address input port, wherein the first and second port units can simultaneously access the shared array of memory cells; (c.1d) first and second address-capturing registers respectively coupled to the first and second address input ports, each address-capturing register having address and clock inputs and an address output; (c.1e) first and second read-data capturing registers respectively coupled to the first and second data output ports, each data capturing register having data and clock inputs and a data output; (c.2) a configurable, first special interconnect channel that is programmably couplable to said CLB interconnect resources, (c.2a) said first special interconnect channel extending adjacent to a respective first group of said memory blocks; (c.2b) said first special interconnect channel being further programmably couplable to the respective clock inputs of the first and second address-capturing registers of said first group of memory blocks such that the respective clock inputs of the first and second address-capturing registers of one or more memory blocks in the first group can be respectively driven by at least a first address-strobing signal which is transmitted by way of the first special interconnect channel; and (c.2c) said first special interconnect channel being further programmably couplable to the respective clock inputs of the first and second read-data capturing registers of said first group of memory blocks such that the respective clock inputs of the first and second read-data capturing registers of one or more memory blocks in the first group can be respectively driven by independent first and second data-strobing signals which are transmitted by way of the first special interconnect channel.
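
The port structure of elements (c.1a) through (c.1e) of claim 21 may be easier to follow with a small behavioral model. The Python sketch below assumes a 32-word shared array and uses invented names (`DualPortBlock`, `address_strobe`, `data_strobe`); it illustrates only that each port has its own address-capturing and read-data capturing register, each with its own strobe.

```python
class DualPortBlock:
    """Hypothetical model of a two-port memory block with a shared cell array."""

    def __init__(self, depth=32):
        self.cells = [0] * depth   # shared array of memory cells
        self.addr_reg = [0, 0]     # first and second address-capturing registers
        self.read_reg = [0, 0]     # first and second read-data capturing registers

    def address_strobe(self, port, address):
        """Clock the address-capturing register of the given port."""
        self.addr_reg[port] = address % len(self.cells)

    def data_strobe(self, port):
        """Clock the read-data capturing register of the given port."""
        self.read_reg[port] = self.cells[self.addr_reg[port]]
        return self.read_reg[port]


ram = DualPortBlock()
ram.cells[3], ram.cells[8] = 0xA, 0x5

ram.address_strobe(0, 3)    # port 0 captures address 3
ram.address_strobe(1, 8)    # port 1 captures address 8 independently
first = ram.data_strobe(0)  # port 0 read data captured
ram.address_strobe(0, 4)    # port 0 moves on; port 1 registers are unaffected
second = ram.data_strobe(1) # port 1 read data captured on its own strobe
```

The read-data register of port 0 still holds the word read from address 3 after that port's address register has advanced, which reflects the independence of the address-strobing and data-strobing signals recited in the claim.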

22. The FPGA device of claim 21 wherein: (c.2a1) said first special interconnect channel is programmably couplable to the respective clock inputs of the first and second address-capturing registers of said first group of memory blocks such that the clock input of the first address-capturing register of one or more memory blocks of the first group can be respectively driven by the first address-strobing signal and such that the clock input of the second address-capturing register of one or more memory blocks of the first group can be respectively driven by a second address-strobing signal which is independent of the first address-strobing signal and which is also transmitted by way of the first special interconnect channel.

23. The FPGA device of claim 21 wherein: (c.2c) the configurable, first special interconnect channel is further programmably couplable to the respective address inputs of the first and second address-capturing registers of said first group of memory blocks such that the respective address inputs of the first and second address capturing registers of one or more memory blocks of the first group can be respectively driven by independent first and second address signals which are transmitted by way of the first special interconnect channel.

24. The FPGA device of claim 21 wherein: (b.1) the configurable CLB interconnect resources include lines of diversified continuous lengths for configurably interconnecting said CLB's.

25. The FPGA device of claim 21 wherein: (a.1) at least a plurality of said CLB's are constituted by variable grain blocks (VGB's) where each said VGB is comprised of at least four Configurable Building Blocks (CBB's) and each CBB can output to adjacent parts of the CLB interconnect resources at least one bit of processed result data, the processed result data bit being a configuration-defined function of at least three input term signals that are acquirable by the CBB from adjacent parts of the CLB interconnect resources.

26. The FPGA device of claim 25 wherein: (a.1a) each said processed result data bit of a given CBB can be programmably defined to be a configuration-defined function of at least six input term signals that are acquirable by the CBB from adjacent parts of the CLB interconnect resources.

27. The FPGA device of claim 25 wherein: (a.1a) each said processed result data bit of a given CBB can be programmably defined to be a configuration-defined function of at least sixteen input term signals that are obtainable from parts of the CLB interconnect resources that neighbor the given CLB.

28. The FPGA device of claim 25 wherein: (a.1a) each said processed result data bit of a given CBB can be programmably defined to be a result of an addition or subtraction operation carried out at least partially within the given CBB.

29. The FPGA device of claim 25 wherein: (b.1) said configurable CLB interconnect resources include continuous lines of diversified length including lines of a first continuous length extending adjacent to at least two VGB's and lines of a second continuous length extending adjacent to at least eight VGB's, the second continuous length being at least twice the first continuous length.

30. The FPGA device of claim 29 wherein: (c.1c1) the second read-data capturing register of each memory block is programmably couplable to at least an adjacent one of the first continuous length lines.

31. The FPGA device of claim 30 wherein: (c.1c2) the second port unit of each memory block is a read-only port unit.

32. The FPGA device of claim 27 wherein: (a.1a) said VGB's are disposed to define vertical columns of VGB's and horizontal rows of VGB's; and (b.1) said configurable CLB interconnect resources include continuous lines of diversified length including lines of a first continuous length extending adjacent to at least two VGB's and lines of a second continuous length extending adjacent to a respective full row or a full column of VGB's, the second continuous length being at least ten times the first continuous length.

33. The FPGA device of claim 32 wherein: (c.1c1) the first and second read-data capturing registers of each memory block are each programmably couplable to a respective at least one adjacent one of the second continuous length lines.

34. The FPGA device of claim 33 wherein: (c.1c2) the second port unit of each memory block is a read-only port unit while the first port unit of each memory block is a read-write port unit.

35. The FPGA device of claim 21 wherein: (c.1b1) said first port unit includes a respective first data input port for receiving write data for writing into said shared array of memory cells; (c.1f) each given one of said memory blocks further includes a respective first write-data capturing register respectively coupled to the first data input port of the given memory block, each write-data capturing register having data and clock inputs and a data output; (c.1f1) the respective clock input of each write-data capturing register in a given one of said memory blocks can be respectively driven by the corresponding first data-strobing signal of the given memory block.

36. The FPGA device of claim 21 wherein: (c.2c) said first special interconnect channel includes a plurality of continuous conductors of respectively diversified lengths including maximum length conductors for broadcasting to the first group of memory blocks common address bits, and including shorter length conductors for conveying other address bits to respective subsets of the first group of memory blocks.

37. The FPGA device of claim 21 wherein: (c.2c) said first special interconnect channel includes a plurality of continuous conductors of respectively diversified lengths including maximum length conductors for broadcasting to the first group of memory blocks common control bits, and including shorter length conductors for conveying other control bits to respective subsets of the first group of memory blocks.

38. The FPGA device of claim 37 wherein: (c.2c1) said first special interconnect channel has global clock lines passing therethrough for broadcasting to the first group of memory blocks programmably acquirable global clock signals.

39. The FPGA device of claim 21 and further comprising: (c.3) a configurable, second special interconnect channel that is programmably couplable to said CLB interconnect resources, (c.3a) said second special interconnect channel extending adjacent to a respective second group of said memory blocks; (c.3b) said second special interconnect channel being further programmably couplable to the respective clock inputs of the first and second address-capturing registers of said second group of memory blocks such that the respective clock inputs of the first and second address-capturing registers of one or more memory blocks in the second group can be respectively driven by at least a second address-strobing signal which is transmitted by way of the second special interconnect channel; and (c.3c) said second special interconnect channel being further programmably couplable to the respective clock inputs of the first and second read-data capturing registers of said second group of memory blocks such that the respective clock inputs of the first and second read-data capturing registers of one or more memory blocks in the second group can be respectively driven by independent third and fourth data-strobing signals which are transmitted by way of the second special interconnect channel.

40. A method of configuring a field programmable gate array (FPGA) device where the FPGA device comprises: (0.1) a plurality of configurable logic blocks (CLB's); (0.2) configurable CLB interconnect resources for configurably interconnecting said CLB's; (0.3) a memory subsystem comprising: (0.31) a plurality of independently-usable memory blocks each having: (0.31a) a shared array of memory cells; (0.31b) a first port unit coupled to the shared array and including a respective first data output port and a first address input port; (0.31c) a second port unit coupled to the shared array and including a respective second data output port and a second address input port; (0.31d) at least one address-capturing register respectively coupled to one of the first and second address input ports, the at least one address-capturing register having address and clock inputs and an address output; (0.31e) at least one read-data capturing register respectively coupled to one of the first and second data output ports, the at least one data capturing register having data and clock inputs and a data output; (0.32) a configurable, special interconnect channel that is programmably couplable to said CLB interconnect resources, (0.32a) said special interconnect channel extending adjacent to said memory blocks; (0.32b) said special interconnect channel being further programmably couplable to the respective clock inputs of the at least one address-capturing registers of said memory blocks such that the respective clock inputs of the at least one address-capturing registers of one or more of the memory blocks can be respectively driven by at least a first address-strobing signal which is transmittable by way of the special interconnect channel; and (0.32c) said special interconnect channel being further programmably couplable to the respective clock inputs of the at least one read-data capturing registers of said memory blocks such that the respective clock inputs of the at least one read-data capturing registers of one or more of the memory blocks can be respectively driven by a data-strobing signal which is transmittable by way of the special interconnect channel; said FPGA configuring method comprising: (a) configuring the special interconnect channel to supply an address-strobing signal to the clock input of an address-capturing register of a given one of said memory blocks; and (b) configuring the special interconnect channel to supply a data-strobing signal to the clock input of a read-data capturing register of the given one of said memory blocks such that the supplied address-strobing and data-strobing signals can be independent of one another.

41. The FPGA configuring method of claim 40 and further comprising: (c) configuring the special interconnect channel to supply an address signal to the data input of an address-capturing register of the given one of said memory blocks.

42. The FPGA configuring method of claim 41 and further comprising: (c) configuring a given one of said CLB's to be responsive to the supplied address-strobing signal and to produce a next and later address signal for the given memory block after said supplied address-strobing signal causes the address-capturing register of the given memory block to capture the earlier-supplied address signal.

43. The FPGA configuring method of claim 41 and further comprising: (d) configuring a given one of said CLB's to be responsive to the supplied address-strobing signal and to produce an output enabling signal that enables memory data to be output onto said CLB interconnect after the supplied data-strobing signal causes a read-data capturing register of the given one of said memory blocks to capture resource memory read data.

44. A field programmable gate array (FPGA) device comprising: (a) a plurality of configurable logic blocks (CLB's); (b) configurable CLB interconnect resources for configurably interconnecting said CLB's; (c) a memory subsystem comprising: (c.1) a plurality of independently-usable memory blocks each having: (c.1a) a shared array of memory cells; (c.1b) a first port unit coupled to the shared array and including a respective first data output port and a first address input port; (c.1c) a second port unit coupled to the shared array and including a respective second data output port and a second address input port, wherein the first and second port units can access respectively addressed parts of the shared array of memory cells; (c.1d) first and second address-capturing registers respectively coupled to the first and second address input ports, each address-capturing register having address and clock inputs and an address output; (c.1e) first and second read-data capturing registers respectively coupled to the first and second data output ports, each data capturing register having data and clock inputs and a data output; (c.2) a configurable, first special interconnect channel that is programmably couplable to said CLB interconnect resources, (c.2a) said first special interconnect channel extending adjacent to a respective first group of said memory blocks; (c.2b) said first special interconnect channel being further programmably couplable to the respective clock inputs of the first and second address-capturing registers of said first group of memory blocks such that the respective clock inputs of the first and second address-capturing registers of one or more memory blocks in the first group can be respectively driven by at least a first address-strobing signal which is transmitted by way of the first special interconnect channel; and (c.2c) said first special interconnect channel being further programmably couplable to the respective clock inputs of the first and second read-data capturing registers of said first group of memory blocks such that the respective clock inputs of the first and second read-data capturing registers of one or more memory blocks in the first group can be respectively driven by one or more data-strobing signals which are independent of the first address-strobing signal thereby allowing read-data-capture and address-capture operations by respective ones of the read-data capturing registers and address-capturing registers to occur at different times, and wherein said one or more data-strobing signals are transmitted by way of the first special interconnect channel.

45. The FPGA device of claim 44 wherein: (c.2a1) said first special interconnect channel is programmably couplable to the respective clock inputs of the first and second address-capturing registers of said first group of memory blocks such that the clock input of the first address-capturing register of one or more memory blocks of the first group can be respectively driven by the first address-strobing signal and such that the clock input of the second address-capturing register of one or more memory blocks of the first group can be respectively driven by a second address-strobing signal which is independent of the first address-strobing signal and which is also transmitted by way of the first special interconnect channel.

46. The FPGA device of claim 44 wherein: (c.2d) the configurable, first special interconnect channel is further programmably couplable to the respective address inputs of the first and second address-capturing registers of said first group of memory blocks such that the respective address inputs of the first and second address capturing registers of one or more memory blocks of the first group can be respectively driven by independent first and second address signals which are transmitted by way of the first special interconnect channel.

47. The FPGA device of claim 44 wherein: (b.1) the configurable CLB interconnect resources include lines of diversified continuous lengths for configurably interconnecting said CLB's; and (c.2d) the configurable, first special interconnect channel is programmably couplable to at least two different length conductors of said CLB interconnect resources.

48. The FPGA device of claim 44 wherein: (a.1) at least a plurality of said CLB's are constituted by variable grain blocks (VGB's) where each said VGB is comprised of at least four Configurable Building Blocks (CBB's) and each CBB can output to adjacent parts of the CLB interconnect resources at least one bit of processed result data, the processed result data bit being a configuration-defined function of at least three input term signals that are acquirable by the CBB from adjacent parts of the CLB interconnect resources.

49. The FPGA device of claim 48 wherein: (a.1a) each said processed result data bit of a given CBB can be programmably defined to be a configuration-defined function of at least six input term signals that are acquirable by the CBB from adjacent parts of the CLB interconnect resources.

50. The FPGA device of claim 48 wherein: (a.1a) each said processed result data bit of a given CBB can be programmably defined to be a configuration-defined function of at least sixteen input term signals that are obtainable from parts of the CLB interconnect resources that neighbor the given CLB.

51. The FPGA device of claim 48 wherein: (a.1a) each said processed result data bit of a given CBB can be programmably defined to be a result of an addition or subtraction operation carried out at least partially within the given CBB.

52. The FPGA device of claim 48 wherein: (b.1) said configurable CLB interconnect resources include continuous lines of diversified length including lines of a first continuous length extending adjacent to at least two VGB's and lines of a second continuous length extending adjacent to at least eight VGB's, the second continuous length being at least twice the first continuous length; and (c.2d) the configurable, first special interconnect channel is programmably couplable to at least two different length conductors of said CLB interconnect resources.

53. The FPGA device of claim 52 wherein: (c.1c1) the second read-data capturing register of each memory block is programmably couplable to at least an adjacent one of the first continuous length lines.

54. The FPGA device of claim 44 wherein: (c.1c1) the second port unit of each memory block is a read-only port unit.

55. The FPGA device of claim 48 wherein: (a.1a) said VGB's are disposed to define vertical columns of VGB's and horizontal rows of VGB's; and (b.1) said configurable CLB interconnect resources include continuous lines of diversified length including lines of a first continuous length extending adjacent to at least two VGB's and lines of a second continuous length extending adjacent to a respective full row or a full column of VGB's, the second continuous length being at least ten times the first continuous length.

56. The FPGA device of claim 55 wherein: (c.1c1) the first and second read-data capturing registers of each memory block are each programmably couplable to a respective at least one adjacent one of the second continuous length lines.

57. The FPGA device of claim 56 wherein: (c.1c2) the second port unit of each memory block is a read-only port unit while the first port unit of each memory block is a read-write port unit.

58. The FPGA device of claim 44 wherein: (c.1b1) said first port unit includes a respective first data input port for receiving write data for writing into a portion of said shared array of memory cells that is addressed by the first address input port; (c.1f) each given one of said memory blocks further includes a respective first write-data capturing register respectively coupled to the first data input port of the given memory block, each write-data capturing register having data and clock inputs and a data output; (c.1f1) the respective clock input of each write-data capturing register in a given one of said memory blocks can be respectively driven by the corresponding first data-strobing signal of the given memory block.

59. The FPGA device of claim 44 wherein: (c.2d) said first special interconnect channel includes a plurality of continuous conductors of respectively diversified lengths including maximum length conductors for broadcasting to the first group of memory blocks common address bits, and including shorter length conductors for conveying other address bits to respective subsets of the first group of memory blocks.

60. The FPGA device of claim 44 wherein: (c.2d) said first special interconnect channel includes a plurality of continuous conductors of respectively diversified lengths including maximum length conductors for broadcasting to the first group of memory blocks common control bits, and including shorter length conductors for conveying other control bits to respective subsets of the first group of memory blocks.

61. The FPGA device of claim 60 wherein: (c.2d1) said first special interconnect channel has global clock lines passing therethrough for broadcasting to the first group of memory blocks programmably acquirable global clock signals, where the global clock signals are also programmably acquirable by said CLB's for synchronizing operations of the CLB's.

62. The FPGA device of claim 44 and further comprising: (c.3) a configurable, second special interconnect channel that is programmably couplable to said CLB interconnect resources, (c.3a) said second special interconnect channel extending adjacent to a respective second group of said memory blocks; (c.3b) said second special interconnect channel being further programmably couplable to the respective clock inputs of the first and second address-capturing registers of said second group of memory blocks such that the respective clock inputs of the first and second address-capturing registers of one or more memory blocks in the second group can be respectively driven by at least a second address-strobing signal which is transmitted by way of the second special interconnect channel; and (c.3c) said second special interconnect channel being further programmably couplable to the respective clock inputs of the first and second read-data capturing registers of said second group of memory blocks such that the respective clock inputs of the first and second read-data capturing registers of one or more memory blocks in the second group can be respectively driven by independent third and fourth data-strobing signals which are transmitted by way of the second special interconnect channel; (c.3d) wherein said first and second special interconnect channels can programmably acquire same or different control signals from the CLB interconnect resources.

63. The FPGA device of claim 44 wherein: (a.1) each of said plurality of CLB's is programmably couplable to the first special interconnect channel by way of at least one tristateable line driver such that different address signals can be injected in time multiplexed fashion from the CLB's to the first special interconnect channel by enabling outputs of different tristateable line drivers at different times; and (a.2) output enable terminals of said tristateable line drivers can be programmably made responsive to said at least first address-strobing signal such that injection of a new and replacing address signal into the first special interconnect channel can be blocked until a previous address signal has been captured by a corresponding one of the address-capturing registers in response to said at least first address-strobing signal.

64. The FPGA device of claim 44 wherein: (c.1f) the data output ports of said first and second read-data capturing registers can respectively couple to the CLB interconnect resources by way of first and second tristateable line drivers, where each tristateable line driver has a respective output enable terminal; (c.1g) the output enable terminals of said tristateable line drivers can be programmably made responsive to said one or more data-strobing signals such that injection of new and replacing data signals through the tristateable line drivers and into corresponding parts of the CLB interconnect resources can be blocked until said replacing data signals have been captured by corresponding ones of the data-capturing registers in response to said one or more data-strobing signals.

65. The FPGA device of claim 44 and further comprising: (d) a plurality of programmably configurable input/output blocks (IOB's) coupled to the CLB interconnect resources and having configurable I/O storage means which can be configured to operate in synchronism with at least the first address-strobing signal.

66. The FPGA device of claim 44 and further comprising: (d) a plurality of programmably configurable input/output blocks (IOB's) coupled to the CLB interconnect resources and having configurable I/O storage means which can be configured to operate in synchronism with said one or more data-strobing signals.

67. A field programmable gate array (FPGA) device comprising: a plurality of configurable logic blocks (CLBs); a memory subsystem comprising a plurality of independently-usable memory blocks each having: an array of memory cells; a first address-capturing register coupled to a first address port of the array, the first address-capturing register having address and clock inputs and an address output; a second address-capturing register coupled to a second address port of the array, the second address-capturing register having address and clock inputs and an address output; a first read-data capturing register coupled to a first data port of the array, the first read-data capturing register having data and clock inputs and a data output; and a second read-data capturing register coupled to a second data port of the array, the second read-data capturing register having data and clock inputs and a data output; and interconnect resources configurable to interconnect the CLBs and the memory subsystem.

68. The FPGA device of claim 67, wherein the first address port and first data port comprise a first port unit and the second address port and second data port comprise a second port unit, and wherein the first and second port units are operable to simultaneously access the array of memory cells.

69. The FPGA device of claim 67, wherein the interconnect resources include a first special interconnect channel programmable to interconnect the CLBs and the memory subsystem.

70. A method of configuring an FPGA device comprising: providing a memory subsystem comprising a plurality of independently-usable memory blocks each having an address-capturing register and a read-data capturing register; configuring interconnect resources to supply an address-strobing signal to the address-capturing register of a memory block; and configuring interconnect resources to supply a data-strobing signal to the read-data capturing register of the memory block, wherein the address-strobing and data-strobing signals can be supplied independent of one another.

71. A field programmable gate array (FPGA) device comprising: a plurality of configurable logic blocks (CLBs); a memory subsystem comprising a plurality of independently-usable memory blocks each having: an array of memory cells; a first address-capturing register coupled to a first address port of the array, the first address-capturing register having address and clock inputs and an address output; and a read-data capturing register coupled to a first data port of the array, the first read-data capturing register having data and clock inputs and a data output; and interconnect resources configurable to interconnect the CLBs and the memory subsystem and configurable to supply an address-strobing signal to the address-capturing register and a data-strobing signal to the read-data capturing register, wherein the address-strobing and data-strobing signals can be supplied independent of one another.

Description:

CROSS REFERENCE TO RELATED APPLICATIONS

The following copending U.S. patent applications are owned by the owner of the present application, and their disclosures are incorporated herein by reference:

    • (A) Ser. No. 08/948,306 filed Oct. 9, 1997 by Om P. Agrawal et al. and originally entitled, “VARIABLE GRAIN ARCHITECTURE FOR FPGA INTEGRATED CIRCUITS”;
    • (B) Ser. No. 08/996,049 filed Dec. 22, 1997 by Om P. Agrawal et al. and originally entitled, DUAL PORT SRAM MEMORY FOR RUN-TIME USE IN FPGA INTEGRATED CIRCUITS;
    • (C) Ser. No. 08/996,361 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, “SYMMETRICAL, EXTENDED AND FAST DIRECT CONNECTIONS BETWEEN VARIABLE GRAIN BLOCKS IN FPGA INTEGRATED CIRCUITS”;
    • (D) Ser. No. 08/995,615 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, “A PROGRAMMABLE INPUT/OUTPUT BLOCK (IOB) IN FPGA INTEGRATED CIRCUITS”;
    • (E) Ser. No. 08/995,614 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, “INPUT/OUTPUT BLOCK (IOB) CONNECTIONS TO MAXL LINES, NOR LINES AND DENDRITES IN FPGA INTEGRATED CIRCUITS”;
    • (F) Ser. No. 08/995,612 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, “FLEXIBLE DIRECT CONNECTIONS BETWEEN INPUT/OUTPUT BLOCKs (IOBS) AND VARIABLE GRAIN BLOCKs (VGBS) IN FPGA INTEGRATED CIRCUITS”;
    • (G) Ser. No. 08/997,221 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, “PROGRAMMABLE CONTROL MULTIPLEXING FOR INPUT/OUTPUT BLOCKs (IOBs) IN FPGA INTEGRATED CIRCUITS”;
    • (H) Ser. No. 09/191,444 filed Nov. 12, 1998 by inventors Bai Nguyen et al and originally entitled, MULTI-PORT SRAM CELL ARRAY HAVING ISOLATION BUFFER IN EACH SRAM CELL FOR PROTECTING SRAM CELL FROM READ NOISE;
    • (I) Ser. No. 09/235,536 filed concurrently herewith by inventors Bai Nguyen et al and entitled, MULTI-PORT SRAM CELL ARRAY HAVING PLURAL WRITE PATHS INCLUDING FOR WRITING THROUGH ADDRESSABLE PORT AND THROUGH SERIAL BOUNDARY SCAN; and
    • (J) Ser. No. 09/008,762 filed Jan. 19, 1998 by inventors Om Agrawal et al and entitled, SYNTHESIS-FRIENDLY FPGA ARCHITECTURE WITH VARIABLE LENGTH AND VARIABLE TIMING INTERCONNECT.

CROSS REFERENCE TO RELATED PATENTS

The disclosures of the following U.S. patents are incorporated herein by reference:

    • (A) U.S. Pat. No. 5,212,652 issued May 18, 1993 to Om Agrawal et al, (filed as Ser. No. 07/394,221 on Aug. 15, 1989) and entitled, PROGRAMMABLE GATE ARRAY WITH IMPROVED INTERCONNECT STRUCTURE;
    • (B) U.S. Pat. No. 5,621,650 issued Apr. 15, 1997 to Om Agrawal et al, and entitled, PROGRAMMABLE LOGIC DEVICE WITH INTERNAL TIME-CONSTANT MULTIPLEXING OF SIGNALS FROM EXTERNAL INTERCONNECT BUSES; and
    • (C) U.S. Pat. No. 5,185,706 issued Feb. 9, 1993 to Om Agrawal et al.

BACKGROUND

1. Field of the Invention

The invention is generally directed to integrated circuits, more specifically to on-chip memory provided for run-time use with on-chip logic circuits. The invention is yet more specifically directed to on-chip memory provided for run-time use within Programmable Logic Devices (PLD's), and even more specifically to a subclass of PLD's known as Field Programmable Gate Arrays (FPGA's).

2. Description of Related Art

Field-Programmable Logic Devices (FPLD's) have continuously evolved to better serve the unique needs of different end-users. From the time of introduction of simple PLD's such as the Advanced Micro Devices 22V10™ Programmable Array Logic device (PAL), the art has branched out in several different directions.

One evolutionary branch of FPLD's has grown along a paradigm known as Complex PLD's or CPLD's. This paradigm is characterized by devices such as the Advanced Micro Devices MACH™ family. Examples of CPLD circuitry are seen in U.S. Pat. Nos. 5,015,884 (issued May 14, 1991 to Om P. Agrawal et al.) and 5,151,623 (issued Sep. 29, 1992 to Om P. Agrawal et al.).

Another evolutionary chain in the art of field programmable logic has branched out along a paradigm known as Field Programmable Gate Arrays or FPGA's. Examples of such devices include the XC2000™ and XC3000™ families of FPGA devices introduced by Xilinx, Inc. of San Jose, Calif. The architectures of these devices are exemplified in U.S. Pat. Nos. 4,642,487; 4,706,216; 4,713,557; and 4,758,985; each of which is originally assigned to Xilinx, Inc.

An FPGA device can be characterized as an integrated circuit that has four major features as follows.

    • (1) A user-accessible, configuration-defining memory means, such as SRAM, EPROM, EEPROM, anti-fused, fused, or other, is provided in the FPGA device so as to be at least once-programmable by device users for defining user-provided configuration instructions. Static Random Access Memory or SRAM is, of course, a form of reprogrammable memory that can be differently programmed many times. Electrically Erasable and reprogrammable ROM or EEPROM is an example of nonvolatile reprogrammable memory. The configuration-defining memory of an FPGA device can be formed of a mixture of different kinds of memory elements if desired (e.g., SRAM and EEPROM).
    • (2) Input/Output Blocks (IOB's) are provided for interconnecting other internal circuit components of the FPGA device with external circuitry. The IOB's may have fixed configurations or they may be configurable in accordance with user-provided configuration instructions stored in the configuration-defining memory means.
    • (3) Configurable Logic Blocks (CLB's) are provided for carrying out user-programmed logic functions as defined by user-provided configuration instructions stored in the configuration-defining memory means. Typically, each of the many CLB's of an FPGA has at least one lookup table (LUT) that is user-configurable to define any desired truth table, to the extent allowed by the address space of the LUT. Each CLB may have other resources such as LUT input signal pre-processing resources and LUT output signal post-processing resources. Although the term ‘CLB’ was adopted by early pioneers of FPGA technology, it is not uncommon to see other names being given to the repeated portion of the FPGA that carries out user-programmed logic functions. The term ‘LAB’ is used, for example, in U.S. Pat. No. 5,260,611 to refer to a repeated unit having a 4-input LUT.
    • (4) An interconnect network is provided for carrying signal traffic within the FPGA device between various CLB's and/or between various IOB's and/or between various IOB's and CLB's. At least part of the interconnect network is typically configurable so as to allow for programmably-defined routing of signals between various CLB's and/or IOB's in accordance with user-defined routing instructions stored in the configuration-defining memory means. Another part of the interconnect network may be hard wired or nonconfigurable such that it does not allow for programmed definition of the path to be taken by respective signals traveling along such hard wired interconnect. A version of hard wired interconnect wherein a given conductor is dedicatedly connected to be always driven by a particular output driver, is sometimes referred to as ‘direct connect’.
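By way of illustration only, the lookup table mentioned in item (3) can be modeled in software as a small memory whose address is formed from the LUT input signals. The following Python sketch is not taken from this disclosure: the names LookupTable and configure_lut, and the choice of treating the first input as the most significant address bit, are assumptions made for the example.

```python
class LookupTable:
    """Toy model of a k-input LUT: the inputs form an address into 2**k stored bits."""

    def __init__(self, num_inputs, truth_bits):
        if len(truth_bits) != 2 ** num_inputs:
            raise ValueError("need exactly 2**num_inputs truth-table bits")
        self.num_inputs = num_inputs
        self.truth_bits = list(truth_bits)

    def evaluate(self, *inputs):
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of input signals")
        # Assumption: the first input is the most significant address bit.
        address = 0
        for bit in inputs:
            address = (address << 1) | (1 if bit else 0)
        return self.truth_bits[address]


def configure_lut(num_inputs, function):
    """Fill the LUT by evaluating `function` once for every input combination."""
    bits = []
    for address in range(2 ** num_inputs):
        inputs = [(address >> (num_inputs - 1 - i)) & 1 for i in range(num_inputs)]
        bits.append(int(bool(function(*inputs))))
    return LookupTable(num_inputs, bits)


# Two different "configurations" of the same 3-input LUT structure.
xor3 = configure_lut(3, lambda a, b, c: a ^ b ^ c)
majority = configure_lut(3, lambda a, b, c: (a + b + c) >= 2)
```

The same structure implements either function; only the stored bits differ, which is the sense in which the truth table is limited by the address space of the LUT (here, eight entries for three inputs).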

In addition to the above-mentioned basic components, it is sometimes desirable to include on-chip reprogrammable memory that is embedded between CLB's and available for run-time use by the CLB's and/or resources of the FPGA for temporarily holding storage data. This embedded run-time memory is to be distinguished from the configuration memory because the latter configuration memory is generally not reprogrammed while the FPGA device is operating in a run-time mode. The embedded run-time memory may be used in speed-critical paths of the implemented design to implement, for example, FIFO or LIFO elements that buffer data words on a first-in/first-out or last-in/first-out basis. Read/write speed, data validating speed, and appropriate interconnecting of such on-chip embedded memory to other resources of the FPGA can limit the ability of a given FPGA architecture to implement certain speed-critical designs.
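As a rough illustration of the FIFO usage mentioned above, a first-in/first-out buffer can be sketched as a block of memory words plus separate write and read address counters. The Python model below is a behavioral sketch only; the class name RunTimeFifo and its methods are hypothetical and do not describe any circuit of this disclosure.

```python
class RunTimeFifo:
    """Behavioral FIFO: a fixed-depth memory with wrapping write/read addresses."""

    def __init__(self, depth):
        self.depth = depth
        self.memory = [0] * depth   # stands in for the embedded run-time memory
        self.write_address = 0      # next location to be written
        self.read_address = 0       # next location to be read
        self.count = 0              # number of words currently buffered

    def push(self, word):
        if self.count == self.depth:
            raise OverflowError("FIFO full")
        self.memory[self.write_address] = word
        self.write_address = (self.write_address + 1) % self.depth
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("FIFO empty")
        word = self.memory[self.read_address]
        self.read_address = (self.read_address + 1) % self.depth
        self.count -= 1
        return word


fifo = RunTimeFifo(depth=4)
for word in (0xA, 0xB, 0xC):
    fifo.push(word)
first_out = fifo.pop()   # the oldest word leaves first
```

Because the write and read addresses advance independently, a memory with separate write and read access paths suits this kind of element, and the speed at which addresses can be presented and validated bounds the FIFO throughput.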

Modern FPGA's tend to be fairly complex. They typically offer a large spectrum of user-configurable options with respect to how each of many CLB's should be configured, how each of many interconnect resources should be configured, and how each of many IOB's should be configured. Rather than determining with pencil and paper how each of the configurable resources of an FPGA device should be programmed, it is common practice to employ a computer and appropriate FPGA-configuring software to automatically generate the configuration instruction signals that will be supplied to an unprogrammed FPGA and that will cause it to implement a specific design.

FPGA-configuring software typically cycles through a series of phases, referred to commonly as ‘partitioning’, ‘placement’, and ‘routing’. This software is sometimes referred to as a ‘place and route’ program. Alternate names may include, ‘synthesis, mapping and optimization tools’.

In the partitioning phase, an original circuit design (which is usually relatively large and complex) is divided into smaller chunks, where each chunk is made sufficiently small to be implemented by a single CLB, the single CLB being a yet-unspecified one of the many CLB's that are available in the yet-unprogrammed FPGA device. Differently designed FPGA's can have differently designed CLB's with respective logic-implementing resources. As such, the maximum size of a partitioned chunk can vary in accordance with the specific FPGA device that is designated to implement the original circuit design. The original circuit design can be specified in terms of a gate level description, or in Hardware Description Language (HDL) form or in other suitable form.

After the partitioning phase is carried out, each resulting chunk is virtually positioned into a specific, chunk-implementing CLB of the designated FPGA during a subsequent placement phase.

In the ensuing routing phase, an attempt is made to algorithmically establish connections between the various chunk-implementing CLB's of the FPGA device, using the interconnect resources of the designated FPGA device. The goal is to reconstruct the original circuit design by reconnecting all the partitioned and placed chunks.

If all goes well in the partitioning, placement, and routing phases, the FPGA configuring software will find a workable ‘solution’ comprised of a specific partitioning of the original circuit, a specific set of CLB placements and a specific set of interconnect usage decisions (routings). It can then deem its mission to be complete and it can use the placement and routing results to generate the configuring code that will be used to correspondingly configure the designated FPGA.

In various instances, however, the FPGA configuring software may find that it cannot complete its mission successfully on a first try. It may find, for example, that the initially-chosen placement strategy prevents the routing phase from completing successfully. This might occur because signal routing resources have been exhausted in one or more congested parts of the designated FPGA device. Some necessary interconnections may not have been completed through those congested parts. Alternatively, all necessary interconnections may have been completed, but the FPGA configuring software may find that simulation-predicted performance of the resulting circuit (the so-configured FPGA) is below an acceptable threshold. For example, signal propagation time may be too large in a speed-critical part of the FPGA-implemented circuit. More specifically, certain synchronization signals may need to propagate from one section of the FPGA to another according to a particular sequence, and architectural constraints of the FPGA device may impede this from happening in an efficient manner insofar as resource utilization is concerned.

Given this, if the initial partitioning, placement and routing phases do not provide an acceptable solution, the FPGA configuring software will try to modify its initial place and route choices so as to remedy the problem. Typically, the software will make iterative modifications to its initial choices until at least a functional place-and-route strategy is found (one where all necessary connections are completed), and more preferably until a place-and-route strategy is found that brings performance of the FPGA-implemented circuit to a near-optimum point. The latter step is at times referred to as ‘optimization’. Modifications attempted by the software may include re-partitionings of the original circuit design as well as repeated iterations of the place and route phases.
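The iterative modification described above can be illustrated, in greatly simplified form, for the placement phase alone. The Python sketch below is not the algorithm of any particular FPGA-configuring program: it assumes a hypothetical 3-by-3 grid of CLB sites, two-terminal nets, a cost equal to total Manhattan wire length, and a simple rule that keeps a random pairwise swap only if it lowers the cost.

```python
import random


def wire_length(placement, nets):
    """Total Manhattan distance over all two-terminal nets."""
    total = 0
    for a, b in nets:
        (ax, ay), (bx, by) = placement[a], placement[b]
        total += abs(ax - bx) + abs(ay - by)
    return total


def improve_placement(placement, nets, iterations=2000, seed=0):
    """Repeatedly swap two chunks; keep the swap only if wire length drops."""
    rng = random.Random(seed)
    current = dict(placement)
    current_cost = wire_length(current, nets)
    chunk_names = list(current)
    for _ in range(iterations):
        a, b = rng.sample(chunk_names, 2)
        current[a], current[b] = current[b], current[a]
        cost = wire_length(current, nets)
        if cost < current_cost:
            current_cost = cost                               # accept the move
        else:
            current[a], current[b] = current[b], current[a]   # undo the move
    return current, current_cost


# Hypothetical example: nine chunks connected in a chain, placed at random.
sites = [(x, y) for x in range(3) for y in range(3)]
chunks = [f"chunk{i}" for i in range(9)]
nets = [(chunks[i], chunks[i + 1]) for i in range(8)]
shuffled = sites[:]
random.Random(1).shuffle(shuffled)
initial = dict(zip(chunks, shuffled))
initial_cost = wire_length(initial, nets)
final, final_cost = improve_placement(initial, nets)
```

Practical tools explore far larger solution spaces, also revisit partitioning and routing, and weigh timing as well as wire length, which is part of why run times can be long.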

There are usually a very large number of possible choices in each of the partitioning, placement, and routing phases. FPGA configuring programs typically try to explore a multitude of promising avenues within a finite amount of time to see what effects each partitioning, placement, and routing move may have on the ultimate outcome. This in a way is analogous to how chess-playing machines explore ramifications of each move to each chess piece on the end-game. Even when relatively powerful, high-speed computers are used, it may take the FPGA configuring software a significant amount of time to find a workable solution. Turnaround time can exceed 8 hours.

In some instances, even after having spent a large amount of time trying to find a solution for a given FPGA-implementation problem, the FPGA configuring software may fail to come up with a workable solution and the time spent becomes lost turn-around time. It may be that, because of packing inefficiencies, the user has chosen too small an FPGA device for implementing too large of an original circuit.

Another possibility is that the internal architecture of the designated FPGA device does not mesh well with the organization and/or timing requirements of the original circuit design.

Organizations of original circuit designs can include portions that may be described as ‘random logic’ (because they have no generally repeating pattern). The organizations can additionally or alternatively include portions that may be described as ‘bus oriented’ (because they carry out nibble-wide, byte-wide, or word-wide, parallel operations). The organizations can yet further include portions that may be described as ‘matrix oriented’ (because they carry out matrix-like operations such as multiplying two, multidimensional vectors). These are just examples of taxonomical descriptions that may be applied to various design organizations. Another example is ‘control logic’ which is less random than fully ‘random logic’ but less regular than ‘bus oriented’ designs. There may be many more taxonomical descriptions. The point being made here is that some FPGA structures may be better suited for implementing random logic while others may be better suited for implementing bus oriented designs or other kinds of designs. In cases where embedded memory is present, the architecture of the embedded memory can play an important role in determining how well a given taxonomically-distinct design is accommodated. Compatibility between the embedded memory architecture and the architecture of intertwined CLB's and interconnect can also play an important role in determining how well a given taxonomically-distinct design is accommodated.

If after a number of tries, the FPGA configuring software fails to find a workable solution, the user may choose to try again with a differently-structured FPGA device. The user may alternatively choose to spread the problem out over a larger number of FPGA devices, or even to switch to another circuit implementing strategy such as CPLD or ASIC (where the latter is an Application Specific hardwired design of an IC). Each of these options invariably consumes extra time and can incur more costs than originally planned for.

FPGA device users usually do not want to suffer through such problems. Instead, they typically want to see a fast turnaround time of no more than, say, 4 hours between the time they complete their original circuit design and the time a first-run FPGA is available to implement and physically test that design. More preferably, they would want to see a fast turnaround time of no more than, say, 30 minutes for successful completion of the FPGA configuring software when executing on an 80486-80686 PC platform (that is, a so commercially specified, IBM compatible personal computer) and implementing a design of 25,000 gates or fewer in a target FPGA device.

FPGA users also usually want the circuit implemented by the FPGA to provide an optimal emulation of the original design in terms of function packing density, cost, speed, power usage, and so forth irrespective of whether the original design is taxonomically describable generally as ‘random logic’, or as ‘bus oriented’, ‘memory oriented’, or as a combination of these, or otherwise.

When multiple FPGA's are required to implement a very large original design, high function packing density and efficient use of FPGA internal resources are desired so that implementation costs can be minimized in terms of both the number of FPGA's that will have to be purchased and the amount of printed circuit board space that will be consumed.

Even when only one FPGA is needed to implement a given design, a relatively high function packing density is still desirable because it usually means that performance speed is being optimized due to reduced wire length. It also usually means that a lower cost member of a family of differently sized FPGA's can be selected or that unused resources of the one FPGA can be reserved for future expansion needs.

In summary, end users want the FPGA configuring software to complete its task quickly and to provide an efficiently-packed, high-speed compilation of the functionalities provided by an original circuit design irrespective of the taxonomic organization of the original design.

In the past, it was thought that attainment of these goals was primarily the responsibility of the computer programmers who designed the FPGA configuring software. It has been shown however, that the architecture or topology of the unprogrammed FPGA can play a significant role in determining how well and how quickly the FPGA configuring software completes the partitioning, placement, and routing tasks.

As indicated above, the architectural layout, implementation, and use of on-chip embedded memory can also play a role in how well the FPGA configuring software is able to complete the partitioning, placement and routing tasks with respect to using embedded memory; and also how well the FPGA-implemented circuit performs in terms of propagating signals into, through and out of the on-chip embedded memory.

SUMMARY OF THE INVENTION

An improved FPGA device in accordance with the invention includes one or more columns of multi-ported SRAM blocks for holding run-time storage data.

In each such SRAM block, at least a first of the multiple ports is a read/write port (Port_1) which can receive first address signals and respond by directing the writing of further-received first data to an address-defined first area of the SRAM block and which can alternatively respond by directing the reading of stored data from an address-defined area of the SRAM block. A second of the multiple ports (Port_2) has at least an independent read-capability such that the second port can receive respective second address signals and can respond independently of the first port by reading stored second data from a respective address-defined area of the SRAM block.

The address signals that drive the multiple ports of each SRAM block generally come from respective signal sources that have changing output states. In accordance with the invention, one or more address-capturing registers are provided for a respective one or more of the multiple ports of each SRAM block for capturing a respective address signal for that port in response to an address-validating strobe signal. The address-validating strobe signal is routable to the respective signal source of the address signal so that the address-validating strobe signal may be used to enable a changing of the output state of the signal source once the respective address signal has been captured by the address-capturing register.
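The capture-then-release handshake described above can be sketched in a few lines of Python. This is only an illustrative behavioral model, not the patented circuit: the class names `RegisteredAddressPort` and `AddressCounter`, the function `run_cycles`, and the choice of a simple counter as the address source are all assumptions made for the example.

```python
class RegisteredAddressPort:
    """Hypothetical model of one address-capturing register."""

    def __init__(self):
        self.captured = None  # address currently held for the SRAM array

    def strobe(self, address_in):
        """Capture whatever address is present when the strobe fires."""
        self.captured = address_in
        return self.captured


class AddressCounter:
    """Hypothetical address source whose output state changes over time."""

    def __init__(self, width_bits=5):
        self.mask = (1 << width_bits) - 1
        self.value = 0

    def advance(self):
        """Change the output state; safe only after the address is captured."""
        self.value = (self.value + 1) & self.mask


def run_cycles(n):
    """Apply n strobes; each strobe captures, then releases the source."""
    port = RegisteredAddressPort()
    source = AddressCounter()
    seen = []
    for _ in range(n):
        seen.append(port.strobe(source.value))  # register captures the address
        source.advance()  # the same strobe enables the source to change
    return seen, port, source
```

After each strobe the register still holds the old address while the source already presents the next one, which is the property the address-validating strobe is meant to guarantee.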

In one embodiment, an address-validating strobe signal of each SRAM block may be coupled by user-configuration from a special SRAM control bus (SVIC) to crossing bidirectional interconnect lines (e.g., tri-stated horizontal longlines) for providing timing synchronization to the respective signal source of the address signal so that the address-validating strobe signal may be used to enable a changing of the output state of the signal source once the respective address signal has been captured by the address-capturing register.

Further in accordance with the invention, one or more data-capturing registers are provided for a respective one or more of the multiple ports of each SRAM block for capturing a respective data signal for that port in response to a data-validating strobe signal.

When data writing is taking place, the data-validating strobe signal is routable to the respective signal source of the data signal so that the data-validating strobe signal may be used to enable a changing of the output state of the signal source once the respective data write signal has been captured by the data-capturing register.

When data reading is taking place, the data-validating strobe signal is routable to respective logic of the data signal destination so that the data-validating strobe signal may be used to indicate to that logic that a valid data output state is present for the respective to-be read data signal which has now been captured by the data-capturing register.

In one embodiment, special, vertical interconnect channels are provided adjacent to embedded SRAM columns for supplying the address-validating strobe signals and data-validating strobe signals to the SRAM blocks as well as additional control signals. The control signals (which include the address-validating and data-validating strobe signals) may be broadcast via special longlines (SMaxL lines) to all SRAM blocks of a given column or localized to groups of SRAM blocks in a given column by using shorter special vertical lines (S4xL lines).

One of the features of embodiments that include the address-capturing registers is that read operations can be performed simultaneously at the multiple ports of each SRAM block using respective, and typically different, address signals for each such port, as well as different interconnect lines for transferring the output data. The data output (data reading) bandwidth of the embedded memory can thereby be maximized, if such maximized bandwidth is desired. Logic circuits can engage in generating next, new address signals even while the SRAM blocks are busy responding to register-captured, old address signals. Such pipelining of operations can help to increase overall system bandwidth.

Another of the features of embodiments that include the data-capturing registers is that the SRAM blocks can begin responding to new address signals even while the destination logic blocks of old data are busy responding to register-captured, old data signals. Such pipelining of operations can help to increase overall system bandwidth.
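The overlap between address generation, memory access, and data consumption can be illustrated with a small cycle-by-cycle sketch. It is a simplified software analogy under stated assumptions: the function name `pipelined_reads`, the use of `None` as an "empty register" marker, and the one-cycle-per-stage timing are hypothetical and are not taken from the patent.

```python
def pipelined_reads(memory, addresses):
    """Model a read path with an address register and a data register.

    In each cycle three things happen together:
      - the destination logic consumes the old data held in data_reg,
      - the memory responds to the old address held in addr_reg,
      - the address source presents a new address, which is then captured.
    """
    addr_reg = None  # address-capturing register (None means empty)
    data_reg = None  # data-capturing register (None means empty)
    consumed = []
    # Two extra cycles with no new address drain the pipeline.
    for next_addr in list(addresses) + [None, None]:
        if data_reg is not None:
            consumed.append(data_reg)
        data_reg = memory[addr_reg] if addr_reg is not None else None
        addr_reg = next_addr
    return consumed
```

For example, `pipelined_reads([10, 11, 12, 13], [2, 0, 3])` returns `[12, 10, 13]`: the data arrive in address order, two cycles after each address was presented, with all three stages busy in the middle cycles.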

Other aspects of the invention will become apparent from the below detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The below detailed description makes reference to the accompanying drawings, in which:

FIGS. 1A and 1B illustrate a first FPGA having an 8×8 matrix of VGB's (Variable Grain Blocks) with an embedded left memory column (LMC) and an embedded right memory column (RMC) in accordance with the invention;

FIG. 2 is a diagram showing the placement of switch boxes along double length, quad length, and octal length lines within normal interconnect channels of another, like FPGA device having a 20×20 matrix of VGB's with embedded LMC and RMC;

FIGS. 3A and 3B illustrate more details of a Right Memory Column (RMC), and in particular of two adjacent memory blocks and of the relation of the memory blocks to an adjacent super-VGB core tile and its horizontal interconnect channels (HIC's);

FIG. 4 illustrates how the 2/4/8xL output lines of respective CBB's (X, Z, W, Y) within a SVGB are configurably couplable to surrounding interconnect channels;

FIG. 5 illustrates how MaxL line drivers of respective SVGB's are coupled to surrounding interconnect channels;

FIG. 6A shows one embodiment of a VGB;

FIG. 6B shows an exemplary CSE (Configurable Sequential Element) having a flip flop that is responsive to a VGB clock signal;

FIG. 7A illustrates how the MaxL line drivers of respective IOB's are coupled to surrounding interconnect channels in one embodiment of the invention;

FIG. 7B illustrates internal components of an exemplary IOB (configurable Input/Output Block) having plural flip flops that are respectively responsive to respective IOB input and output clock signals;

FIG. 7C illustrates an exemplary IOB controls-acquiring multiplexer that may be used for acquiring respective IOB input and output clock signals from neighboring interconnect lines;

FIG. 8 is a further magnified illustration of one embodiment of FIG. 3, showing further details of a Right Memory Column (RMC), and in particular of a given SRAM block in accordance with the invention and its neighboring interconnect channels;

FIG. 9 is a further magnified illustration of one embodiment of FIG. 8, showing further details inside of a given SRAM block;

FIG. 10 is a block diagram of embodiments of FPGA devices, including those that conform with FIG. 9 as one set of alternatives, wherein respective flows may be seen for respective address signals, address-validating strobe signals, memory data signals, and memory data-validating strobe signals of a dual-ported SRAM block;

FIG. 11A is a schematic diagram of an FPGA configuring process wherein a predefined design definition is supplied to an FPGA compiling software module; and

FIG. 11B is a flow chart of FPGA-configuration software that takes advantage of the ability to configurably route respective address-validating strobe signals and data-validating strobe signals in FPGA devices that conform to the present invention.

DETAILED DESCRIPTION

FIG. 1 shows a macroscopic view of an FPGA device 100 in accordance with the invention. The illustrated structure is preferably formed as a monolithic integrated circuit.

The macroscopic view of FIG. 1 is to be understood as being taken at a magnification level that is lower than later-provided, microscopic views. The more microscopic views may reveal greater levels of detail which may not be seen in more macroscopic views. Conversely, the more macroscopic views may reveal gross architectural features which may not be seen in more microscopic views. It is to be understood that for each more macroscopic view, there can be many alternate microscopic views and that the illustration herein of a sample microscopic view does not limit the possible embodiments of the macroscopically viewed entity. Similarly, the illustration herein of a sample macroscopic view does not limit the possible embodiments into which a microscopically viewed embodiment might be included.

FPGA device 100 comprises a regular matrix of super structures defined herein as super-VGB's (SVGB's). In the illustrated embodiment, a dashed box (upper left corner) circumscribes one such super-VGB structure which is referenced as 101. There are four super-VGB's shown in each super row of FIG. 1 and also four super-VGB's shown in each super column. Each super row or column contains plural rows or columns of VGB's. One super column is identified as an example by the braces at 111. Larger matrices with more super-VGB's per super column and/or super row are of course contemplated. FIG. 1 is merely an example.

There is a hierarchy of user-configurable resources within each super-VGB. At a next lower level, each super-VGB is seen to contain four VGB's. In the illustrated embodiment, identifier 102 points to one such VGB within SVGB 101 .

A VGB is a Variable Grain Block that includes its own hierarchy of user-configurable resources. At a next lower level, each VGB is seen to contain four Configurable Building Blocks or CBB's arranged in an L-shaped configuration. In the illustrated embodiment, identifier 103 points to one such CBB within VGB 102.

At a next lower level, each CBB has its own hierarchy of user-configurable resources. Some of these (e.g., a CSE) will be shown in later figures. A more detailed description of the hierarchical resources of the super-VGB's, VGB's, CBB's, and so forth, may be found in the above-cited Ser. No. 08/948,306 filed Oct. 9, 1997 by Om P. Agrawal et al. and originally entitled, VARIABLE GRAIN ARCHITECTURE FOR FPGA INTEGRATED CIRCUITS, whose disclosure is incorporated herein by reference.

It is sufficient for the present to appreciate that each CBB includes a clocked flip flop and that each CBB is capable of producing at least one bit of result data and/or storing one bit of data in its flip flop and/or of outputting the stored and/or result data to adjacent interconnect lines. Each VGB ( 102 ) is in turn, therefore capable of producing and outputting at least 4 such result bits at a time to adjacent interconnect lines. This is referred to as nibble-wide processing. Nibble-wide processing may also be carried out by the four CBB's that line the side of each SVGB (e.g., 101 ).

With respect to the adjacent interconnect lines (AIL's), each SVGB is bounded by two horizontal and two vertical interconnect channels (HIC's and VIC's). An example of a HIC is shown at 150 . A sample VIC is shown at 160 . Each such interconnect channel contains a diverse set of interconnect lines as will be seen later.

The combination of each SVGB (e.g., 101 ) and its surrounding interconnect resources (of which resources, not all are shown in FIG. 1) is referred to as a matrix tile. Matrix tiles are tiled one to the next as seen, with an exception occurring about the vertical sides of the two central, super columns, 115 . Columns 114 (LMC) and 116 (RMC) of embedded memory are provided along the vertical sides of the central pair 115 of super columns. These columns 114 , 116 will be examined in closer detail shortly.

From a more generalized perspective, the tiling of the plural tiles creates pairs of adjacent interconnect channels within the core of the device 100 . An example of a pair of adjacent interconnect channels is seen at HIC's 1 and 2 . The peripheral channels (HIC 0 , HIC 7 , VIC 0 , VIC 7 ) are not so paired. Switch matrix boxes (not shown, see FIG. 2) are provided at the intersections of the respective vertical and horizontal interconnect channels. The switch matrix boxes form part of each matrix tile construct that includes a super-VGB at its center. See area 465 of FIG. 3 .

The left memory column (LMC) 114 is embedded as shown to the left of central columns pair 115 . The right memory column (RMC) 116 is further embedded as shown to the right of the central columns pair 115 . It is contemplated to have alternate embodiments with greater numbers of such embedded memory columns symmetrically distributed in the FPGA device and connected in accordance with the teachings provided herein for the illustrative pair of columns, 114 and 116 . It is also possible to additionally have embedded rows of such embedded memory extending horizontally.

Within the illustrated LMC 114, a first, special, vertical interconnect channel (SVIC) 164 is provided adjacent to respective, left memory blocks ML0 through ML7. Within the illustrated RMC 116, a second, special, vertical interconnect channel (SVIC) 166 is provided adjacent to respective, right memory blocks MR0 through MR7.

As seen, the memory blocks, ML0-ML7 and MR0-MR7, are numbered in accordance with the VGB row they sit in (or the HIC they are closest to) and are further designated as left or right (L or R) depending on whether they are respectively situated in LMC 114 or RMC 116. In one embodiment, each of memory blocks ML0-ML7 and MR0-MR7 is organized to store and retrieve an addressable plurality of nibbles, where a nibble contains 4 data bits. More specifically, in one embodiment, each of memory blocks ML0-ML7 and MR0-MR7 has an internal SRAM array organized as a group of 32 nibbles (32×4=128 bits) where each nibble is individually addressable by five address bits. The nibble-wise organization of the memory blocks ML0-ML7 and MR0-MR7 corresponds to the nibble-wise organization of each VGB (102) and/or to the nibble-wise organization of each group of four CBB's that line the side of each SVGB (101). Thus, there is a data-width match between each embedded memory block and each group of four CBB's or VGB. As will be seen, a similar kind of data-width matching also occurs within the diversified resources of the general interconnect mesh.
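The 32-nibble organization can be made concrete with a small behavioral model. The sketch below is only illustrative: the class name `SramBlock` and its method names are hypothetical, and it models storage alone, with Port_1 as a read/write port and Port_2 as an independent read port as outlined in the Summary, without any of the registers or strobes.

```python
class SramBlock:
    """Hypothetical model of one embedded block: 32 nibbles of 4 bits each."""

    DEPTH = 32  # addressable nibbles, selected by five address bits
    WIDTH = 4   # data bits per nibble

    def __init__(self):
        self.cells = [0] * self.DEPTH

    def _check(self, address):
        if not 0 <= address < self.DEPTH:
            raise ValueError("address must fit in five bits (0..31)")

    def port1_write(self, address, nibble):
        """Write one nibble through the read/write port."""
        self._check(address)
        self.cells[address] = nibble & 0xF  # keep only the 4 data bits

    def port1_read(self, address):
        """Read one nibble through the read/write port."""
        self._check(address)
        return self.cells[address]

    def port2_read(self, address):
        """Read one nibble through the independent second port."""
        self._check(address)
        return self.cells[address]
```

The constants reproduce the arithmetic in the text: 32 × 4 = 128 bits of storage, and the highest address, 31, needs exactly five bits.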

At the periphery of the FPGA device 100 , there are three input/output blocks (IOB's) for each row of VGB's and for each column of VGB's. One such IOB is denoted at 140 . The IOB's in the illustrated embodiment are shown numbered from 1 to 96 . In one embodiment, there are no IOB's directly above and below the LMC 114 and the RMC 116 . In an alternate embodiment, special IOB's such as shown in phantom at 113 are provided at the end of each memory column for driving address and control signals into the corresponding memory column.

Each trio of regular IOB's at the left side (1-24) and the right side (49-72) of the illustrated device 100 may be user-configured to couple data signals to the nearest HIC. Similarly, each trio of regular IOB's on the bottom side (25-48) and top side (73-96) may be user-configured for exchanging input and/or output data signals with lines inside the nearest corresponding VIC. The SIOB's (e.g., 113), if present, may be user-configured to exchange signals with the nearest SVIC (e.g., 164). Irrespective of whether the SIOB's (e.g., 113) are present, data may be input and/or output from points external of the device 100 to/from the embedded memory columns 114, 116 by way of the left side IOB's (1-24) and the right side IOB's (49-72) using longline coupling, as will be seen below. The longline coupling allows signals to move with essentially the same speed and connectivity options from/to either of the left or right side IOB's (1-24, 49-72) respectively to/from either of the left or right side memory columns.
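The IOB numbering above follows directly from three IOB's per VGB row or column on an 8×8 VGB matrix. The short sketch below checks that arithmetic and maps an IOB number to its side of the device; the names `TOTAL_IOBS` and `iob_side` are hypothetical helpers introduced only for illustration.

```python
VGB_ROWS = 8       # VGB rows in the illustrated device of FIG. 1
VGB_COLS = 8       # VGB columns in the illustrated device of FIG. 1
IOBS_PER_LINE = 3  # regular IOB's per VGB row or column, on each side

# Left and right sides serve rows; bottom and top sides serve columns.
TOTAL_IOBS = 2 * IOBS_PER_LINE * VGB_ROWS + 2 * IOBS_PER_LINE * VGB_COLS


def iob_side(number):
    """Return the device side for a regular IOB numbered 1..96."""
    if not 1 <= number <= TOTAL_IOBS:
        raise ValueError("regular IOB numbers run from 1 to 96")
    per_side = IOBS_PER_LINE * VGB_ROWS  # 24 IOB's on each side
    return ("left", "bottom", "right", "top")[(number - 1) // per_side]
```

Left-side and right-side IOB's couple to the nearest HIC, while bottom-side and top-side IOB's couple to the nearest VIC, so the returned side also indicates which kind of channel a given IOB can reach.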

It is sufficient for the present to appreciate that each IOB includes one or more clocked flip flops and that each IOB is capable of receiving at least one bit of external input data from a point outside the FPGA device, and/or outputting at least one bit of external output data to a point outside the FPGA device, and/or storing one bit of input or output data in respective ones of its one or more flip flops, and/or of transferring such external input or output data respectively to or from adjacent interconnect lines. Each set of 24 IOB's that lie adjacent to a corresponding one of the peripheral HIC's and VIC's may therefore transfer in parallel, as many as 24 I/O bits at a time. Such transference may couple to the adjacent one of the peripheral HIC's and VIC's and/or to neighboring VGB's.

Data and/or address and/or control signals may be generated within the FPGA device 100 by its internal VGB's and transmitted to the embedded memory 114 , 116 by way of the peripheral and inner HIC's, as will be seen below.

The VGB's are numbered according to their column and row positions. Accordingly, VGB(0,0) is in the top left corner of the device 100; VGB(7,7) is in the bottom right corner of the device 100; and VGB(1,1) is in the bottom right corner of SVGB 101.
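Because each super-VGB holds a 2×2 group of VGB's, the VGB numbering implies a simple mapping from a VGB's coordinates to its enclosing super-VGB. The sketch below shows one way to express it; the function names and the (super column, super row) index tuples are assumptions for illustration, since the patent identifies super-VGB's by reference numerals rather than by coordinates.

```python
VGBS_PER_SVGB_SIDE = 2  # each super-VGB contains a 2x2 group of VGB's


def svgb_of(col, row):
    """Return the hypothetical (super column, super row) index of VGB(col, row)."""
    return (col // VGBS_PER_SVGB_SIDE, row // VGBS_PER_SVGB_SIDE)


def corner_within_svgb(col, row):
    """Name the corner of its super-VGB that VGB(col, row) occupies."""
    horizontal = "left" if col % VGBS_PER_SVGB_SIDE == 0 else "right"
    vertical = "top" if row % VGBS_PER_SVGB_SIDE == 0 else "bottom"
    return f"{vertical} {horizontal}"
```

With this mapping, VGB(1,1) falls in super-VGB (0,0), the upper-left one referenced as 101, and sits in its bottom right corner, which agrees with the text.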

Each SVGB (101) may have centrally-shared resources. Such centrally-shared resources are represented in FIG. 1 by the diamond-shaped hollow at the center of each illustrated super-VGB (e.g., 101). Longline driving amplifiers (see FIG. 5) correspond with these diamond-shaped hollows and have their respective outputs coupling vertically and horizontally to the adjacent HIC's and VIC's of their respective super-VGB's.

As indicated above, each super-VGB in FIG. 1 has four CBB's along each of its four sides. The four CBB's of each such interconnect-adjacent side of each super-VGB can store a corresponding four bits of result data internally so as to define a nibble of data for output onto the adjacent interconnect lines. At the same time, each VGB contains four CBB's of the L-shaped configuration which can acquire and process a nibble's worth of data. One of these processes is nibble-wide addition within each VGB as will be described below. Another of these processes is implementation of a 4:1 dynamic multiplexer within each CBB. The presentation of CBB's in groups of same number (e.g., 4 per side of a super-VGB and 4 within each VGB) provides for a balanced handling of multi-bit data packets along rows and columns of the FPGA matrix. For example, nibbles may be processed in parallel by one column of CBB's and the results may be efficiently transferred in parallel to an adjacent column of CBB's for further processing. Such nibble-wide handling of data also applies to the embedded memory columns 114 / 116 . As will be seen, nibble-wide data may be transferred between one or more groups of four CBB's each to a corresponding one or more blocks of embedded memory (MLx or MRx) by way of set of 4 equally-long lines in a nearby HIC. Each such set of 4 equally-long lines may be constituted by so-called, double-length lines (2xL lines), quad-length lines (4xL lines), octal-length lines (8xL lines) or maximum length longlines (MaxL lines).

In one particular embodiment of the FPGA device, the basic matrix is 10-by-10 SVGB's, with embedded memory columns 114/116 positioned around the central two super columns 115. (See FIG. 2.) In that particular embodiment, the integrated circuit may be formed on a semiconductor die having an area of about 100,000 square mils or less. The integrated circuit may include four metal layers for forming interconnect. So-called ‘direct connect’ lines and ‘longlines’ of the interconnect are preferably implemented entirely by the metal layers so as to provide for low resistance pathways and thus relatively small RC time constants on such interconnect lines. Logic-implementing transistors of the integrated circuit have drawn channel lengths of 0.35 microns or 0.25 microns or less. Amplifier output transistors and transistors used for interfacing the device to external signals may be larger, however.

As indicated above, the general interconnect channels (e.g., HIC 150 , VIC 160 of FIG. 1) contain a diverse set of interconnect lines. FIG. 2 shows a distribution 200 of different-length horizontal interconnect lines (2xL, 4xL, 8xL) and associated switch boxes of a single horizontal interconnect channel (HIC) 201 , as aligned relative to vertical interconnect channels in an FPGA of the invention. This particular FPGA has a 10×10 matrix of super-VGB's (or a 20×20 matrix of VGB's). The embedded memory columns ( 114 / 116 ) are not fully shown, but are understood to be respectively embedded in one embodiment, between VIC's 7 - 8 and 11 - 12 , as indicated by zig-zag symbols 214 and 216 .

For an alternate embodiment, symbol 214 may be placed between VIC's 6 and 7 while symbol 216 is placed between VIC's 12 and 13 to indicate the alternate placement of the embedded memory columns 114 / 116 between said VIC's in the alternate embodiment. For yet another alternate embodiment, zig-zag symbol 214 may be placed between VIC's 8 and 9 while zig-zag symbol 216 is placed between VIC's 10 and 11 to represent corresponding placement of the embedded memory columns 114 / 116 in the corresponding locations. Of course, asymmetrical placement of the embedded memory columns 114 / 116 relative to the central pair of SVGB columns ( 115 ) is also contemplated. In view of these varying placement possibilities, the below descriptions of which 2xL, 4xL, or 8xL line intersects with corresponding columns 214 / 216 should, of course, be read as corresponding to the illustrated placement of symbols 214 and 216 respectively between VIC's 7 - 8 and VIC's 11 - 12 with corresponding adjustments being made if one of the alternate placements of 214 / 216 is chosen instead.

By way of a general introduction to the subject of interconnect resources, it should be noted that the interconnect mesh of FPGA 100 includes lines having different lengths. It may be said that, without taking into account any length changes created by the imposition of the embedded memory columns 114 / 116 , the horizontally-extending general interconnect channels (HIC's) and vertically-extending general interconnect channels (VIC's) of the FPGA device 100 are provided with essentially same and symmetrically balanced interconnect resources for their respective horizontal (x) and vertical (y) directions. These interconnect resources include a diversified and granulated assortment of MaxL lines, 2xL lines, 4xL lines and 8xL lines as well as corresponding 2xL switch boxes, 4xL switch boxes, and 8xL boxes.

In one embodiment, each general channel, such as the illustrated example in FIG. 2 of HIC 201 (the horizontal interconnect channel), contains at least the following resources: eight double-length (2xL) lines, four quad-length (4xL) lines, four octal-length (8xL) lines, sixteen full-length (MaxL) lines, sixteen direct-connect (DC) lines, eight feedback (FB) lines and two dedicated clock (CLK) lines. Vertical ones of the general interconnect channels (VIC's) may contain an additional global reset (GR) longline. Parts of this total of 58/59 lines may be seen in FIGS. 4 and 5 as having corresponding designations AIL0 through AIL57/58 for respective interconnect lines that are adjacent to corresponding VGB's. Not all of the different kinds of lines are shown in FIG. 2. Note that each of the 2xL, 4xL, 8xL and MaxL line sets includes at least four lines of its own kind for carrying a corresponding nibble's worth of data or address or control signals.
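The 58/59 line total can be checked by summing the listed resources. The tally below is a plain arithmetic sketch; the dictionary name `HIC_LINES` and the variable names are hypothetical, and the per-kind counts are the minimums stated above for one embodiment.

```python
# Lines per general horizontal interconnect channel, by kind.
HIC_LINES = {
    "2xL": 8,    # double-length lines
    "4xL": 4,    # quad-length lines
    "8xL": 4,    # octal-length lines
    "MaxL": 16,  # full-length longlines
    "DC": 16,    # direct-connect lines
    "FB": 8,     # feedback lines
    "CLK": 2,    # dedicated clock lines
}

hic_total = sum(HIC_LINES.values())  # lines in a horizontal channel
vic_total = hic_total + 1            # vertical channels may add one GR longline

# Adjacent interconnect lines are designated AIL0 upward.
hic_designations = [f"AIL{i}" for i in range(hic_total)]
vic_designations = [f"AIL{i}" for i in range(vic_total)]

# Each multi-length line set can carry at least one nibble (4 bits).
nibble_capable = all(HIC_LINES[kind] >= 4 for kind in ("2xL", "4xL", "8xL", "MaxL"))
```

The sums give 58 lines for a HIC and 59 for a VIC, so the last designations are AIL57 and AIL58 respectively.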

In FIG. 2, core channels 1 through 18 are laid out as adjacent pairs of odd and even channels. Peripheral channels 0 and 19 run alone alongside the IOB's (see FIG. 1). Although not shown in FIG. 2, it should be understood that each switch box has both horizontally-directed and vertically-directed ones of the respective 2xL, 4xL, and 8xL lines entering into that respective switch box. (See region 465 of FIG. 3.) A given switchbox (XxSw) may be user-configured to continue a signal along to the next XxL line (e.g., 2xL line) of a same direction and/or to couple the signal to a corresponding same kind of XxL line of an orthogonal direction. A more detailed description of switchboxes for one embodiment may be found in the above-cited, U.S. Ser. No. 09/008,762, filed Jan. 19, 1998 by inventors Om Agrawal et al., whose disclosure is incorporated herein by reference.

Group 202 represents the 2xL lines of HIC 201 and their corresponding switch boxes. For all of the 2xL lines, each such line spans the distance of essentially two adjacent VGB's (or one super-VGB). Most 2xL lines terminate at both ends into corresponding 2x switch boxes (2xSw's). The terminating 2xSw boxes are either both in even-numbered channels or both in odd-numbered channels. Exceptions occur at the periphery where either an odd or even-numbered channel is nonexistent. As seen in the illustrated embodiment 200 , interconnections can be made via switch boxes from the 2xL lines of HIC 201 to any of the odd and even-numbered vertical interconnect channels (VIC's) 0 - 19 .

With respect to the illustrated placement 214 / 216 of embedded memory columns 114 / 116 , note in particular that 2xL line 223 and/or its like (other, similarly oriented 2xL lines) may be used to provide a short-haul, configurable connection from SVGB 253 (the one positioned to the right of VIC # 6 ) to LMC 214 . Similarly, line 224 and its like may be used to provide a short-haul connection from SVGB 254 (the one positioned to the right of VIC # 8 ) to LMC 214 . Line 225 and/or its like may be used to provide a short-haul connection from SVGB 255 to RMC 216 . Line 226 and/or its like may be used to provide a short-haul connection from SVGB 256 to RMC 216 . Such short-haul connections may be useful for quickly transmitting speed-critical signals such as address signals and/or data signals between a nearby SVGB ( 253 - 256 ) and the corresponding embedded memory column 114 or 116 .

Group 204 represents the 4xL lines of HIC 201 and their corresponding switch boxes. Most 4xL lines each span the distance of essentially four, linearly-adjacent VGB's and terminate at both ends into corresponding 4x switch boxes (4xSw's). The terminating 4xSw boxes are either both in even-numbered channels or both in odd-numbered channels. As seen in the illustrated embodiment 200, interconnections can be made via switch boxes from the 4xL lines of HIC 201 to any of the odd and even-numbered vertical interconnect channels (VIC's) 0-19.

With respect to the illustrated placement 214 / 216 of embedded memory columns 114 / 116 , note in particular that 4xL line 242 and/or its like (other, similarly oriented 4xL lines that can provide generally similar coupling) may be used to provide a medium-haul configurable connection between LMC 214 and either one or both of SVGB 252 and SVGB 253 . Line 243 and/or its like may be used to provide a configurable connection of medium-length between LMC 214 and either one or both of SVGB's 253 and 254 . Similarly, line 245 and/or its like may be used to provide medium-length coupling between RMC 216 and either one or both of SVGB's 255 and 256 . Moreover, line 247 and/or its like may be used to configurably provide medium-haul interconnection between RMC 216 and either one or both of SVGB's 257 and 256 . Such medium-haul interconnections may be useful for quickly propagating address signals and/or data signals in comparatively medium-speed applications.

Group 208 represents the 8xL lines of HIC 201 and their corresponding switch boxes. Most 8xL lines (7 out of 12) each span the distance of essentially eight, linearly-adjacent VGB's. A fair number of other 8xL lines (5 out of 12) each span distances less than that of eight, linearly-adjacent VGB's. Each 8xL line terminates at at least one end into a corresponding 8x switch box (8xSw). The terminating 8xSw boxes are available in this embodiment only in the core odd-numbered channels (1, 3, 5, 7, 9, 11, 13, 15 and 17). Thus, in embodiment 200, interconnections can be made via switch boxes from the 8xL lines of HIC 201 to any of the nonperipheral, odd-numbered vertical interconnect channels (VIC's). It is within the contemplation of the invention to have the 8xSw boxes distributed symmetrically in other fashions such that even-numbered channels are also covered.
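The reach of each line kind in the FIG. 2 embodiment can be summarized as a lookup from line kind to the vertical channels where its switch boxes provide coupling. The sketch below restates only what the text says for HIC 201; the function name `vics_reachable` is hypothetical, and the model ignores which individual line terminates at which box.

```python
NUM_VICS = 20  # vertical interconnect channels 0..19 in the 20x20 VGB device


def vics_reachable(line_kind):
    """Return the VIC numbers reachable via switch boxes from HIC 201.

    2xL and 4xL lines: any odd or even channel, 0 through 19.
    8xL lines: only the core (nonperipheral) odd-numbered channels.
    """
    if line_kind in ("2xL", "4xL"):
        return list(range(NUM_VICS))
    if line_kind == "8xL":
        return [v for v in range(1, NUM_VICS - 1) if v % 2 == 1]
    raise ValueError("expected one of '2xL', '4xL', '8xL'")
```

Calling `vics_reachable("8xL")` yields the nine channels listed above, 1 through 17 in steps of two; peripheral channel 19 is odd-numbered but is excluded because it is not a core channel.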

With respect to the illustrated placement 214/216 of embedded memory columns 114/116, note in particular that 8xL line 281 or its like may be used to provide even longer-haul, configurable connection between LMC 214 and any one or more of SVGB's 251-254. (In one embodiment where 214 is placed to the left of VIC 7, 8xL line 280 provides configurable interconnection between LMC 214 and any one or more of SVGB's 250-253.) In the illustrated embodiment, 8xL line 282 may be used to provide 8xL coupling between any two or more of: LMC 214 and SVGB's 252-255. Line 283 may be used to provide 8xL coupling between any two or more of: LMC 214, RMC 216, and SVGB's 253-256. Line 284 may be used to provide 8xL coupling between any two or more of: LMC 214, RMC 216, and SVGB's 254-257. Line 285 may be used to provide 8xL coupling between any two or more of: RMC 216 and SVGB's 255-258. Line 286 may be similarly used to provide 8xL coupling between any two or more of: RMC 216 and SVGB's 256-259. Although the largest of the limited-length lines is 8xL in the embodiment of FIG. 2, it is within the contemplation of the invention to further have 16xL lines, 32xL lines and so forth in arrays with larger numbers of VGB's.

In addition to providing configurable coupling between the intersecting memory channel 214 and/or 216 , each of the corresponding 2xL, 4xL, 8xL and so forth lines may be additionally used for conveying such signals between their respective switchboxes and corresponding components of the intersecting memory channel.

Referring briefly back to FIG. 1, it should be noted that the two central super columns 115 are ideally situated for generating address and control signals and broadcasting the same by way of short-haul connections to the adjacent memory columns 114 and 116 . High-speed data may be similarly conveyed from the memory columns 114 / 116 to the SVGB's of central columns 115 .

Before exploring more details of the architecture of FPGA device 100, it will be useful to briefly define various symbols that may be used within the drawings. Unless otherwise stated, a single line going into a trapezoidal multiplexer symbol is understood to represent an input bus of one or more wires. Each open square box (MIP) along such a bus represents a point for user-configurable acquisition of a signal from a crossing line to the multiplexer input bus. In one embodiment, a PIP (programmable interconnect point) is placed at each MIP-occupied intersection of a crossing line and the multiplexer input bus. Each PIP (which may be represented herein as a hollow circle) is understood to have a single configuration memory bit controlling its state. In the active state the PIP creates a connection between its crossing lines. In the inactive state the PIP leaves an open between the illustrated crossing lines. Each of the crossing lines remains continuous, however, in its respective direction (e.g., x or y).

PIP's (each of which may be represented herein by a hollow circle covering a crossing of two continuous lines) may be implemented in a variety of manners as is well known in the art. In one embodiment pass transistors such as MOSFET's may be used with their source and drain respectively coupled to the two crossing lines while the transistor gate is controlled by a configuration memory bit. In an alternate embodiment, nonvolatilely-programmable floating gate transistors may be used with their source and drain respectively coupled to the crossing lines. The charge on the floating gate of such transistors may represent the configuration memory bit. A dynamic signal or a static turn-on voltage may be applied to the control gate of such a transistor as desired. In yet another alternate embodiment, nonvolatilely-programmable fuses or anti-fuses may be provided as PIP's with their respective ends being connected to the crossing lines. One may have bidirectional PIP's for which signal flow between the crossing lines (e.g., 0 and 1 ) can move in either direction. Where desirable, PIP's can also be implemented with unidirectional signal coupling means such as AND gates, tri-state drivers, and so forth.

An alternate symbol for a group of PIP's is constituted herein by a hollow and tilted ellipse covering a bus such as is seen in FIG. 10 .

Another symbol that may be used herein is a hollow circle with an ‘X’ inside. This represents a POP. POP stands for ‘Programmable Opening Point’. Unless otherwise stated, each POP is understood to have a single configuration memory bit controlling its state. In the active state the POP creates an opening between the colinear lines entering it from opposing sides. In the inactive state the POP leaves closed an implied connection between the colinear lines entering it. Possible implementations of POP's include pass transistors and tri-state drivers. Many other alternatives will be apparent to those skilled in the art.
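The opposite polarities of the PIP and POP symbols can be summarized with a small behavioral model. The following Python sketch is illustrative only: it models each point as a single configuration memory bit, as stated above, and says nothing about the pass-transistor, floating-gate, fuse or tri-state implementations. The class names, the method is_connected, and the helper acquired_lines are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class PIP:
    """Programmable interconnect point at a crossing of two continuous lines."""
    config_bit: bool = False  # single configuration memory bit

    def is_connected(self) -> bool:
        # Active: the crossing lines are connected. Inactive: open between them.
        return self.config_bit


@dataclass
class POP:
    """Programmable opening point between two colinear lines."""
    config_bit: bool = False  # single configuration memory bit

    def is_connected(self) -> bool:
        # Active: an opening is created. Inactive: the implied connection remains.
        return not self.config_bit


def acquired_lines(crossing_lines, pips):
    """Names of crossing lines joined to a multiplexer input bus by active PIPs."""
    return [name for name, pip in zip(crossing_lines, pips) if pip.is_connected()]


# Hypothetical crossing-line names, for illustration only.
print(acquired_lines(["a", "b", "c"], [PIP(False), PIP(True), PIP(False)]))  # ['b']
```

Note that an unprogrammed PIP leaves its lines disconnected, whereas an unprogrammed POP leaves its colinear lines connected.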

Referring now to FIG. 3, this figure provides a mid-scopic view of some components within an exemplary matrix tile 400 that lies adjacent to embedded memory column, RMC 416. Of course, other implementations are possible for the more macroscopic view of FIG. 1.

The mid-scopic view of FIG. 3 shows four VGB's brought tightly together in mirror opposition to one another. The four, so-wedged together VGB's are respectively designated as ( 0 , 0 ), ( 0 , 1 ), ( 1 , 0 ) and ( 1 , 1 ). The four VGB's are also respectively and alternately designated herein as VGB_A, VGB_B, VGB_C, and VGB_D.

Reference number 430 points to VGB_A which is located at relative VGB row and VGB column position ( 0 , 0 ). Some VGB internal structures such as CBB's Y, W, Z, and X are visible in the mid-scopic view of FIG. 3 . An example of a Configurable Building Block (CBB) is indicated by 410 . As seen, the CBB's 410 of each VGB 430 are arranged in an L-shaped organization and placed near adjacent interconnect lines. Further VGB internal structures such as each VGB's common controls developing (Ctrl) section, each VGB's wide-gating supporting section, each VGB's carry-chaining (Fast Carry) section, and each VGB's coupling to a shared circuit 450 of a corresponding super-structure (super-VGB) are also visible in the mid-scopic view of FIG. 3 . VGB local feedback buses such as the L-shaped structure shown at 435 in FIG. 3 allow for high-speed transmission from one CBB to a next within a same VGB, of result signals produced by each CBB.

The mid-scopic view of FIG. 3 additionally shows four interconnect channels surrounding VGB's ( 0 , 0 ) through ( 1 , 1 ). The top and bottom, horizontally extending, interconnect channels (HIC's) are respectively identified as 451 and 452 . The left and right, vertically extending, interconnect channels (VIC's) are respectively identified as 461 and 462 .

Two other interconnect channels that belong to other tiles are partially shown at 453 (HIC2) and 463 (VIC2) so as to better illuminate the contents of switch boxes area 465. Switch boxes area 465 contains an assortment of 2xL switch boxes, 4xL switch boxes and 8xL switch boxes, which may be provided in accordance with FIG. 2.

In addition, a memory-control multiplexer area 467 is provided along each HIC as shown for configurably coupling control signals from the horizontal bus (e.g., HIC 452 ) to special vertical interconnect channel (SVIC) 466 . The illustrated placement of multiplexer area 467 to the right of the switch boxes (SwBoxes) of VIC's 462 and 463 is just one possibility. Multiplexer area 467 may be alternatively placed between or to the left of the respective switch boxes of VIC's 462 and 463 .

In one embodiment (see FIG. 8 ), SVIC 466 has sixteen, special maximum length lines (16 SMaxL lines), thirty-two, special quad length lines (32 S4xL lines), and four special clock lines (SCLK 0 - 3 ). SVIC 466 carries and couples control signals to respective control input buses such as 471 , 481 of corresponding memory blocks such as 470 , 480 .
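For bookkeeping purposes, the line inventory of SVIC 466 in this one embodiment may be tallied as follows. This Python sketch is illustrative only. The line counts are taken from the text above; the dictionary name SVIC_466, the helper svic_line_names, and the per-line index naming for the SMaxL and S4xL groups are assumptions.

```python
# Line groups of SVIC 466 in the one embodiment described in the text.
# Names and index numbering (other than SCLK 0-3) are illustrative assumptions.
SVIC_466 = {"SMaxL": 16, "S4xL": 32, "SCLK": 4}


def svic_line_names(groups=SVIC_466):
    """Expand the group counts into one label per line, e.g. 'SCLK0'."""
    return [f"{kind}{i}" for kind, count in groups.items() for i in range(count)]


names = svic_line_names()
print(len(names))   # 52
print(names[-4:])   # ['SCLK0', 'SCLK1', 'SCLK2', 'SCLK3']
```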

A memory-I/O multiplexer area 468 is further provided along each HIC for configurably coupling memory data signals from and to the horizontal bus (e.g., HIC 452 ) by way of data I/O buses such as 472 , 482 of corresponding memory blocks such as 470 , 480 . Again, the illustrated placement of multiplexer area 468 to the right of the switch boxes (SwBoxes) of VIC's 462 and 463 is just one possibility. Multiplexer area 468 may be alternatively placed between or to the left of the respective switch boxes of VIC's 462 and 463 .

Memory control multiplexer area 477 and memory I/O multiplexer area 478 are the counterparts for the upper HIC 451 of areas 467 and 468 of lower HIC 452 . Although not specifically shown, it is understood that the counterpart, left memory channel (LMC) is preferably arranged in mirror symmetry to the RMC 416 so as to border the left side of its corresponding matrix tile.

As seen broadly in FIG. 3, the group of four VGB's ( 0 , 0 ) through ( 1 , 1 ) are organized in mirror image relationship to one another relative to corresponding vertical and horizontal centerlines (not shown) of the group and even to some extent relative to diagonals (not shown) of the same group. Vertical and horizontal interconnect channels (VIC's and HIC's) do not cut through this mirror-wise opposed congregation of VGB's. As such, the VGB's may be wedged-together tightly.
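The mirror-image organization can be illustrated with a toy layout. The following Python sketch is illustrative only and is not the layout method of the disclosure. It assumes a hypothetical 3-by-3 template in which 'c' marks an L-shaped run of CBB's along two outer edges of a VGB and '.' marks the remaining area, and it builds the group of four by mirroring about the vertical and then the horizontal centerline. The template contents and the function names are assumptions.

```python
def flip_horizontal(tile):
    """Mirror a tile about its vertical centerline (reverse each row)."""
    return [row[::-1] for row in tile]


def flip_vertical(tile):
    """Mirror a tile about its horizontal centerline (reverse the row order)."""
    return tile[::-1]


# Hypothetical template: 'c' cells form an L along the top and left edges.
VGB_TEMPLATE = ["ccc",
                "c..",
                "c.."]


def build_group(template):
    """Assemble a 2x2 group that is mirror-symmetric about both centerlines."""
    top = [left + right for left, right in zip(template, flip_horizontal(template))]
    return top + flip_vertical(top)


group = build_group(VGB_TEMPLATE)
print("\n".join(group))
```

In this toy layout the 'c' cells end up on the outer boundary of the group, next to where the surrounding interconnect channels would run, and no gap is needed between the four tiles.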

Similarly, each pair of embedded memory blocks (e.g., 470 and 480 ), and their respective memory-control multiplexer areas ( 477 and 467 ), and their respective memory-I/O multiplexer areas ( 478 and 468 ) are organized in mirror image relationship to one another as shown. Horizontal interconnect channels (HIC's) do not cut through this mirror-wise opposed congregation of embedded memory constructs. As such, the respective embedded memory constructs of blocks MRx 0 (in an even row, 470 being an example) and MRx 1 (in an odd row, 480 being an example) may be wedged-together tightly. A compact layout may be thereby achieved.

With respect to mirror symmetry among variable grain blocks, VGB ( 0 , 1 ) may be generally formed by flipping a copy of VGB ( 0 , 0 ) horizontally. VGB ( 1 , 1 ) may be similarly formed by flipping a copy of VGB ( 0 , 1 ) vertically. VGB ( 1 , 0 ) may be formed by flipping a copy of VGB ( 1 ,