DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
[0049] A method and apparatus for coordinating memory operations among diversely-located memory components is described. In accordance with an embodiment of the invention, wave-pipelining is implemented for an address bus coupled to a plurality of memory components. The plurality of memory components are configured according to coordinates relating to the address bus propagation delay and the data bus propagation delay. A timing signal associated with address and/or control signals which duplicates the propagation delay of these signals is used to coordinate memory operations. The address bus propagation delay, or common address bus propagation delay, refers to the delay for a signal to travel along an address bus between the memory controller component and a memory component. The data bus propagation delay refers to the delay for a signal to travel along a data bus between the memory controller component and a memory component.
[0050] According to one embodiment of the invention, a memory system includes multiple memory modules providing multiple ranks and multiple slices of memory components. Such a system can be understood with reference to FIG. 27. The memory system of FIG. 27 includes memory module 2703 and memory module 2730. Memory module 2703 includes a rank that includes memory components 2716-2718 and another rank that includes memory components 2744-2746.
[0051] The memory system is organized into slices across the memory controller component and the memory modules. The memory system of FIG. 27 includes a slice 2713 that includes a portion of memory controller 2702 , a portion of memory module 2703 including memory components 2716 and 2744 , and a portion of memory module 2730 including memory components 2731 and 2734 . The memory system of FIG. 27 includes another slice 2714 that includes another portion of memory controller 2702 , another portion of memory module 2703 including memory components 2717 and 2745 , and another portion of memory module 2730 including memory components 2732 and 2735 . The memory system of FIG. 27 further includes yet another slice 2715 that includes yet another portion of memory controller 2702 , yet another portion of memory module 2703 including memory components 2718 and 2746 , and yet another portion of memory module 2730 including memory components 2733 and 2736 .
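As an informal illustration of the slice organization just described, the membership of slices 2713, 2714, and 2715 can be written down as a small lookup table. The sketch below transcribes only the reference numerals given above; the dictionary layout and the names SLICES, components_on_slice, and slice_of are hypothetical conveniences for this illustration and do not appear in the figures.

```python
# Slice membership transcribed from the description of FIG. 27.
# Keys are slice reference numerals; each entry maps a module numeral to the
# memory components of that module that belong to the slice.
# The layout and names are illustrative assumptions, not part of the figures.
SLICES = {
    2713: {2703: [2716, 2744], 2730: [2731, 2734]},
    2714: {2703: [2717, 2745], 2730: [2732, 2735]},
    2715: {2703: [2718, 2746], 2730: [2733, 2736]},
}


def components_on_slice(slice_id):
    """Return every memory component that belongs to the given slice."""
    return [c for comps in SLICES[slice_id].values() for c in comps]


def slice_of(component):
    """Return the slice numeral that contains the given memory component."""
    for slice_id, modules in SLICES.items():
        for comps in modules.values():
            if component in comps:
                return slice_id
    raise KeyError(component)


if __name__ == "__main__":
    print(components_on_slice(2713))
    print(slice_of(2735))
```

Each slice in this table contains two memory components per module, which matches the arrangement in which a slice includes one memory component of each rank.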
[0052] The use of multiple slices and ranks, which may be implemented using multiple modules, allows efficient interconnection of a memory controller and several memory components while avoiding degradation of performance that can occur when a data bus or address bus has a large number of connections to it. With a separate data bus provided for each slice, the number of connections to each data bus can be kept to a reasonable number. The separate data buses can carry different signals independently of each other. A slice can include one or more memory components per module. For example, a slice can include one memory component of each rank. Note that the term slice may be used to refer to the portion of a slice excluding the memory controller. In this manner, the memory controller can be viewed as being coupled to the slices. The use of multiple modules allows memory components to be organized according to their path lengths to a memory controller. Even slight differences in such path lengths can be managed according to the organization of the memory components into ranks. The organization of memory components according to ranks and modules allows address and control signals to be distributed efficiently, for example through the sharing of an address bus within a rank or module.
[0053] In one embodiment, a slice can be understood to include several elements coupled to a data bus. As one example, these elements can include a portion of a memory controller component, one or more memory components on one module, and, optionally, one or more memory components on another module. In one embodiment, a rank can be understood to include several memory components coupled by a common address bus. The common address bus may optionally be coupled to multiple ranks on the module or to multiple modules. The common address bus can connect a memory controller component to each of the slices of a rank in succession, thereby allowing the common address bus to be routed from a first slice of the rank to a second slice of the rank and from the second slice of the rank to a third slice of the rank. Such a configuration can simplify the routing of the common address bus.
[0054] For discussion purposes, a simplified form of a memory system is first discussed in order to illustrate certain concepts, whereas a more complex memory system that includes a plurality of modules and ranks is discussed later in the specification.
[0055] FIG. 1 is a block diagram illustrating a memory system having a single rank of memory components with which an embodiment of the invention may be implemented. Memory system 101 comprises memory controller component 102 and memory module 103. Address clock 104 provides an address clock signal that serves as a timing signal associated with the address and control signals that propagate along address bus 107. Address clock 104 provides its address clock signal along address clock conductor 109, which is coupled to memory controller component 102 and to memory module 103. The address and control signals are sometimes referred to as simply the address signals or the address bus. However, since control signals may be routed according to a topology common to address signals, these terms, when used, should be understood to include address signals and/or control signals.
[0056] Write clock 105 provides a write clock signal that serves as a timing signal associated with the data signals that propagate along data bus 108 during write operations. Write clock 105 provides its write clock signal along write clock conductor 110 , which is coupled to memory controller component 102 and memory module 103 . Read clock 106 provides a read clock signal that serves as a timing signal associated with the data signals that propagate along data bus 108 during read operations. Read clock 106 provides its read clock signal along read clock conductor 111 , which is coupled to memory controller component 102 and memory module 103 .
[0057] Termination component 120 is coupled to data bus 108 near memory controller component 102 . As one example, termination component 120 may be incorporated into memory controller component 102 . Termination component 121 is coupled to data bus 108 near memory module 103 . Termination component 121 is preferably incorporated into memory module 103 . Termination component 123 is coupled to write clock conductor 110 near memory component 116 of memory module 103 . Termination component 123 is preferably incorporated into memory module 103 . Termination component 124 is coupled to read clock conductor 111 near memory controller component 102 . As an example, termination component 124 may be incorporated into memory controller component 102 . Termination component 125 is coupled to read clock conductor 111 near memory component 116 of memory module 103 . Termination component 125 is preferably incorporated into memory module 103 . The termination components may utilize active devices (e.g., transistors or other semiconductor devices) or passive devices (e.g. resistors, capacitors, or inductors). The termination components may utilize an open connection. The termination components may be incorporated in one or more memory controller components or in one or more memory components, or they may be separate components on a module or on a main circuit board.
[0058] Memory module 103 includes a rank 112 of memory components 116, 117, and 118. The memory module 103 is organized so that each memory component corresponds to one slice. Memory component 116 corresponds to slice 113, memory component 117 corresponds to slice 114, and memory component 118 corresponds to slice 115. Although not shown in FIG. 1, the specific circuitry associated with the data bus, write clock and associated conductors, and read clock and associated conductors that are illustrated for slice 113 is replicated for each of the other slices 114 and 115. Thus, although such circuitry has not been illustrated in FIG. 1 for simplicity, it is understood that such dedicated circuitry on a slice-by-slice basis is preferably included in the memory system shown.
[0059] Within memory module 103 , address bus 107 is coupled to each of memory components 116 , 117 , and 118 . Address clock conductor 109 is coupled to each of memory components 116 , 117 , and 118 . At the terminus of address bus 107 within memory module 103 , termination component 119 is coupled to address bus 107 . At the terminus of address clock conductor 109 , termination component 122 is coupled to address clock conductor 109 .
[0060] In the memory system of FIG. 1, each data signal conductor connects one controller data bus node to one memory device data bus node. However, it is possible for each control and address signal conductor to connect one controller address/control bus node to an address/control bus node on each memory component of the memory rank. This is possible for several reasons. First, the control and address signal conductors pass unidirectional signals (the signal wavefront propagates from the controller to the memory devices). It is easier to maintain good signal integrity on a unidirectional signal conductor than on a bidirectional signal conductor (like a data signal conductor). Second, the address and control signals contain the same information for all memory devices. The data signals will be different for all memory devices. Note that there might be some control signals (such as write enable signals) which are different for each memory device; these are treated as unidirectional data signals, and are considered to be part of the data bus for the purposes of this distinction. For example, in some instances, the data bus may include data lines corresponding to a large number of bits, whereas in some applications only a portion of the bits carried by the data bus may be written into the memory for a particular memory operation. For example, a 16-bit data bus may include two bytes of data where during a particular memory operation only one of the two bytes is to be written to a particular memory device. In such an example, additional control signals may be provided along a similar path as that taken by the data signals such that these control signals, which control whether or not the data on the data bit lines is written, traverse the system along a path with a delay generally matched to that of the data such that the use of the control signals in controlling the writing of the data is aptly timed. Third, routing the address and control signals to all the memory devices saves pins on the controller and memory module interface.
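The 16-bit example above, in which only one of two bytes is written, can be illustrated in software. The following is a minimal sketch of the effect of per-byte write enables on a stored 16-bit word. The function name masked_write and its parameters are hypothetical, and the sketch models only the logical result of the masking, not the signal timing.

```python
def masked_write(stored, data, low_enable, high_enable):
    """Merge a 16-bit write datum into a stored 16-bit word.

    low_enable and high_enable stand in for per-byte write enable signals
    that accompany the data (hypothetical names). A byte of `data` replaces
    the corresponding byte of `stored` only when its enable is asserted.
    """
    mask = 0
    if low_enable:
        mask |= 0x00FF
    if high_enable:
        mask |= 0xFF00
    # Keep the unselected bytes of the stored word, take the selected bytes
    # from the write datum.
    return (stored & ~mask & 0xFFFF) | (data & mask)


if __name__ == "__main__":
    # Only the low byte of the write datum is written.
    print(hex(masked_write(0x1234, 0xABCD, True, False)))
```

Because the enables determine which bits of the datum take effect, they need to arrive with the datum they qualify, which is why they are routed with a delay matched to that of the data.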
[0061] As a result, the control and address signals will be propagated on wires that will be longer than the wires used to propagate the data signals. This enables the data signals to use a higher signaling rate than the control and address signals in some cases.
[0062] To avoid impairment of the performance of the memory system, the address and control signals may be wave-pipelined in accordance with an embodiment of the invention. The memory system is configured to meet several conditions conducive to wave-pipelining. First, two or more memory components are organized as a rank. Second, some or all address and control signals are common to all memory components of the rank. Third, the common address and control signals propagate with low distortion (e.g. controlled impedance). Fourth, the common address and control signals propagate with low intersymbol-interference (e.g. single or double termination).
[0063] Wave-pipelining occurs when Tbit<Twire, where the timing parameter Twire is defined to be the time delay for a wavefront produced at the controller to propagate to the termination component at the end of the wire carrying the signal, and the timing parameter Tbit is defined to be the time interval between successive pieces (bits) of information on the wire. Such pieces of information may represent individual bits or multiple bits encoded for simultaneous transmission. Wave-pipelined signals on wires are incident-wave sampled by receivers attached to the wire. This means that sampling will generally take place before the wavefront has reflected from the end of the transmission line (e.g., the wire).
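The condition Tbit<Twire can be checked numerically. The sketch below assumes both parameters are expressed as integers in a common time unit (picoseconds here), and uses hypothetical function names. The estimate of how many bit intervals overlap the wire at once is an added illustration of what the inequality implies, not a quantity defined above.

```python
def is_wave_pipelined(t_bit_ps, t_wire_ps):
    """Return True when the bit interval is shorter than the wire delay."""
    return t_bit_ps < t_wire_ps


def bit_intervals_on_wire(t_bit_ps, t_wire_ps):
    """Estimate how many successive bit intervals fit within the wire delay.

    Uses integer ceiling division; this is an illustrative upper estimate.
    """
    return -(-t_wire_ps // t_bit_ps)


if __name__ == "__main__":
    # Hypothetical values: 500 ps bit interval on a wire with 2000 ps delay.
    print(is_wave_pipelined(500, 2000))
    print(bit_intervals_on_wire(500, 2000))
```

When the estimate exceeds one, a new wavefront is launched before the previous one has reached the termination component, which is the situation in which incident-wave sampling matters.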
[0064] It is possible to extend the applicability of the invention from a single rank to multiple ranks of memory components in several ways. First, multiple ranks of memory components may be implemented on a memory module. Second, multiple memory modules may be implemented in a memory system. Third, data signal conductors may be dedicated, shared, or “chained” to each module. Chaining involves allowing a bus to pass through one module, connecting with the appropriate circuits on that module, whereas when it exits that particular module it may then enter another module or reach termination. Examples of such chaining of conductors are provided and described in additional detail in FIGS. 29, 32 , and 35 below. Fourth, common control and address signal conductors may be dedicated, shared, or chained to each module. Fifth, data signal conductors may be terminated transmission lines or terminated stubs on each module. For this discussion, transmission lines are understood to represent signal lines that have sufficient lengths such that reflections and other transmission line characteristics must be considered and accounted for in order to assure proper signal transmission over the transmission lines. In contrast, terminated stubs are understood to be of such limited length that the parasitic reflections and other transmission line characteristics associated with such stubs can generally be ignored. Sixth, common control and address signal conductors may be terminated transmission lines or terminated stubs on each module. Permitting the shared address and control signals to be wave-pipelined allows their signaling rate to be increased, thereby increasing the performance of the memory system.
[0065] FIG. 2 is a block diagram illustrating clocking details for one slice of a rank of memory components of a memory system such as that illustrated in FIG. 1 in accordance with an embodiment of the invention. The memory controller component 102 includes address transmit block 201 , which is coupled to address bus 107 and address clock conductor 109 . The memory controller component 102 also includes, on a per-slice basis, data transmit block 202 and data receive block 203 , which are coupled to data bus 108 . Data transmit block 202 is coupled to write clock conductor 110 , and data receive block 203 is coupled to read clock conductor 111 .
[0066] Within each memory component, such as memory component 116 , an address receive block 204 , a data receive block 205 , and a data transmit block 206 are provided. The address receive block 204 is coupled to address bus 107 and address clock conductor 109 . The data receive block 205 is coupled to data bus 108 and write clock conductor 110 . The data transmit block 206 is coupled to data bus 108 and read clock conductor 111 .
[0067] A propagation delay 207 , denoted t PD0 , exists along address bus 107 between memory controller component 102 and memory module 103 . A propagation delay 208 , denoted t PD1 , exists along address bus 107 within memory module 103 .
[0068] The basic topology represented in FIG. 2 has several attributes. It includes a memory controller. It includes a single memory module. It includes a single rank of memory components. It includes a sliced data bus (DQ), with each slice of wires connecting the controller to a memory component. It includes a common address and control bus (Addr/Ctrl or AC) connecting the controller to all the memory components. Source synchronous clock signals flow with data, control, and address signals. Control and address signals are unidirectional and flow from controller to memory components. Data signals are bi-directional and may flow from controller to memory components (write operation) or may flow from memory components to controller (read operation). There may be some control signals with the same topology as data signals, but which flow only from controller to memory components. Such signals may be used for masking write data in write operations, for example. These may be treated as unidirectional data signals for the purpose of this discussion. The data, address, control, and clock wires propagate with low distortion (e.g., along controlled impedance conductors). The data, address, control, and clock wires propagate with low inter-symbol interference (e.g., there is a single termination on unidirectional signals and double termination on bi-directional signals). These attributes are listed to maintain clarity. It should be understood that the invention is not constrained to be practiced with these attributes and may be practiced so as to include other system topologies.
[0069] In FIG. 2, there is a two-dimensional coordinate system based on the slice number of the data buses and the memory components (S={0, 1, . . . , N S }) and the module number (M={0, 1}). Here a slice number of ‘0’ and a module number of ‘0’ refer to the controller. This coordinate system allows signals to be named at different positions on a wire. This coordinate system will also allow expansion to topologies with more than one memory rank or memory module.
[0070] FIG. 2 also shows the three clock sources (address clock 104 , which generates the AClk signal, write clock 105 , which generates the WClk signal, and read clock 106 , which generates the RClk signal) which generate the clocking reference signals for the three types of information transfer. These clock sources each drive a clock wire that is parallel to the signal bus with which it is associated. Preferably, the positioning of the clock sources within the system is such that the physical position on the clock line at which the clock source drives the corresponding clock signal is proximal to the related driving point for the bus line such that the propagation of the clock for a particular bus generally tracks the propagation of the related information on the associated bus. For example, the positioning of the address clock (AClk clock 104 ) is preferably close to the physical position where the address signals are driven onto the address bus 107 . In such a configuration, the address clock will experience similar delays as it propagates throughout the circuit as those delays experienced by the address signals propagating along a bus that follows generally the same route as the address clock signal line.
[0071] The clock signal for each bus is related to the maximum bit rate on the signals of the associated bus. This relationship is typically an integer or integer ratio. For example, the maximum data rate may be twice the frequency of the data clock signals. It is also possible that one or two of the clock sources may be “virtual” clock sources; the three clock sources will be in an integral-fraction-ratio (N/M) relationship with respect to one another, and any of them may be synthesized from either of the other two using phase-locked-loop (PLL) techniques to set the frequency and phase. Virtual clock sources represent a means by which the number of actual clock sources within the circuit can be minimized. For example, a WClk clock might be derived from an address clock (AClk) that is received by a memory device such that the memory device is not required to actually receive a WClk clock from an external source. Thus, although the memory device does not actually receive a unique, individually-generated WClk clock, the WClk clock generated from the AClk clock is functionally equivalent. The phase of a synthesized clock signal will be adjusted so it is the same as if it were generated by a clock source in the positions shown.
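The integral-fraction-ratio (N/M) relationship between clock sources can be illustrated with exact rational arithmetic. The sketch below is illustrative only: the function names and the frequencies are hypothetical, and the sketch says nothing about how a phase-locked loop would realize the ratio or set the phase.

```python
from fractions import Fraction


def clock_ratio(f_target_hz, f_source_hz):
    """Return the reduced ratio N/M such that f_target = f_source * N / M."""
    return Fraction(f_target_hz, f_source_hz)


def synthesized_frequency(f_source_hz, n, m):
    """Return the frequency obtained by scaling a source clock by N/M."""
    return Fraction(f_source_hz) * Fraction(n, m)


if __name__ == "__main__":
    # Hypothetical example: a 300 MHz clock derived from a 200 MHz clock.
    ratio = clock_ratio(300_000_000, 200_000_000)
    print(ratio.numerator, ratio.denominator)
    print(synthesized_frequency(200_000_000, ratio.numerator, ratio.denominator))
```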
[0072] Any of the clock signals shown may alternatively be a non-periodic signal (a strobe control signal, for example) which is asserted only when information is present on the associated bus. As was described above with respect to clock sources, the non-periodic signal sources are preferably positioned, in a physical sense, proximal to the appropriate buses to which they correspond such that propagation delays associated with the non-periodic signals generally match those propagation delays of the signals on the buses to which they correspond.
[0073] FIG. 3 is a timing diagram illustrating address and control timing notations used in timing diagrams of other Figures. In FIG. 3, a rising edge 302 of the AClk signal 301 occurs at a time 307 during transmission of address information ACa 305. A rising edge 303 of the AClk signal occurs at a time 308 during transmission of address information ACb 306. Time 308 occurs at a time t CC before the time 309 of the next rising edge 304 of AClk signal 301. The time t CC represents a cycle time of a clock circuit of a memory controller component. Dashed lines in the timing diagrams are used to depict temporal portions of a signal coincident with address information or datum information. For example, the AClk signal 301 includes a temporal portion corresponding to the presence of address information ACa 305 and another temporal portion corresponding to the presence of address information ACb 306. Address information can be transmitted over an address bus as an address signal.
[0074] If one bit per wire occurs per t CC , address bit 311 is transmitted during cycle 310 . If two bits per wire occur per t CC , address bits 313 and 314 are transmitted during cycle 312 . If four bits per wire occur per t CC , address bits 316 , 317 , 318 , and 319 are transmitted during cycle 315 . If eight bits per wire occur per t CC , address bits 321 , 322 , 323 , 324 , 325 , 326 , 327 , and 328 are transmitted during cycle 320 . Note that the drive and sample points for each bit window may be delayed or advanced by an offset (up to one bit time, which is t CC /N AC ), depending upon the driver and sampler circuit techniques used. The parameters N AC and N DQ represent the number of bits per t CC for the address/control and data wires, respectively. In one embodiment, a fixed offset is used. An offset between the drive/sample points and the bit windows should be consistent between the driving component and the sampling component. It is preferable that in a particular system, any offset associated with the drive point for a bus is consistent throughout the entire system. Similarly, any understood sampling offset with respect to the bus should also be consistent. For example, if data is expected to be driven at a point generally corresponding to a rising edge of a related clock signal for one data bus line, that understood offset (or lack thereof) is preferably consistently used for all data lines. Note that the offset associated with driving data onto the bus may be completely different than that associated with sampling data carried by the bus. Thus, continuing with the example above, the sample point for data driven generally coincident with a rising edge may be 180 degrees out of phase with respect to the rising edge such that the valid window of the data is better targeted by the sample point.
[0075] FIG. 4 is a timing diagram illustrating data timing notations used in timing diagrams of other Figures. In FIG. 4 , a rising edge 402 of the WClk signal 401 occurs at a time 407 during transmission of write datum information Da 405 . A rising edge 403 of the WClk signal 401 occurs at a time 408 . A rising edge 404 of the WClk signal 401 occurs at a time 409 during transmission of read datum information Qb 406 . Time 407 is separated from time 408 by a time t CC , and time 408 is separated from time 409 by a time t CC . The time t CC represents the duration of a clock cycle. RClk signal 410 includes rising edge 411 and rising edge 412 . These rising edges may be used as references to clock cycles of RClk signal 410 . For example, transmission of write datum information Da 405 occurs during a clock cycle of RClk signal 410 that includes rising edge 411 , and transmission of read datum information Qb 406 occurs during a clock cycle of RClk signal 410 that includes rising edge 412 . As is apparent to one of ordinary skill in the art, the clock cycle time associated with the address clock may differ from the clock cycle time associated with the read and/or write clocks.
[0076] Write datum information is an element of information being written and can be transmitted over a data bus as a write data signal. Read datum information is an element of information being read and can be transmitted over a data bus as a read data signal. As can be seen, the notation Dx is used to represent write datum information x, while the notation Qy is used to represent read datum information y. Signals, whether address signals, write data signals, read data signals, or other signals can be applied to conductor or bus for a period of time referred to as an element time interval. Such an element time interval can be associated with an event occurring on a conductor or bus that carries a timing signal, where such an event may be referred to as a timing signal event. Examples of such a timing signal include a clock signal, a timing signal derived from another signal or element of information, and any other signal from which timing may be derived. In a memory access operation, the time from when an address signal begins to be applied to an address bus to when a data signal corresponding to that address signal begins to be applied to a data bus can be referred to as an access time interval.
[0077] If one bit per wire occurs per t CC, datum bit 415 is transmitted during cycle 414. If two bits per wire occur per t CC, data bits 417 and 418 are transmitted during cycle 416. If four bits per wire occur per t CC, data bits 420, 421, 422, and 423 are transmitted during cycle 419. If eight bits per wire occur per t CC, data bits 425, 426, 427, 428, 429, 430, 431, and 432 are transmitted during cycle 424. Note that the drive and sample points for each bit window may be delayed or advanced by an offset (up to one bit time, which is t CC /N DQ ), depending upon the driver and sampler circuit techniques used. In one embodiment, a fixed offset is used. An offset between the drive/sample points and the bit windows should be consistent between the driving component and the sampling component. For example, if the data window is assumed to be positioned such that data will be sampled on the rising edge of the appropriate clock signal at the controller, a similar convention should be used at the memory device such that valid data is assumed to be present at the rising edge of the corresponding clock at that position within the circuit as well.
[0078] If one bit per wire occurs per t CC, datum bit 434 is transmitted during cycle 433. If two bits per wire occur per t CC, data bits 436 and 437 are transmitted during cycle 435. If four bits per wire occur per t CC, data bits 439, 440, 441, and 442 are transmitted during cycle 438. If eight bits per wire occur per t CC, data bits 444, 445, 446, 447, 448, 449, 450, and 451 are transmitted during cycle 443. Note that the drive and sample points for each bit window may be delayed or advanced by an offset (up to one bit time, which is t CC /N DQ ), depending upon the driver and sampler circuit techniques used. In one embodiment, a fixed offset is used. An offset between the drive/sample points and the bit windows should be consistent between the driving component and the sampling component. As stated above, it is preferable that in a particular system, any offset associated with the drive point or sampling point for a bus is consistent throughout the entire system.
[0079] The column cycle time of the memory component represents the time interval required to perform successive column access operations (reads or writes). In the example shown, the AClk, RClk, and WClk clock signals are shown with a cycle time equal to the column cycle time. As is apparent to one of ordinary skill in the art, the cycle time of the clock signals used in the system may be different from the column cycle time in other embodiments.
[0080] Alternatively, any of the clocks could have a cycle time that is different than the column cycle time. The appropriate-speed clock for transmitting or receiving signals on a bus can always be synthesized from the clock that is distributed with the bus as long as there is an integer or integral-fraction-ratio between the distributed clock and the synthesized clock. As mentioned earlier, any of the required clocks can be synthesized from any of the distributed clocks from the other buses.
[0081] This discussion will assume a single bit is sampled or driven on each wire during each t CC interval in order to keep the timing diagrams as simple as possible. However, the number of bits that are transmitted on each signal wire during each t CC interval can be varied. The parameters N AC and N DQ represent the number of bits per t CC for the address/control and data wires, respectively. The distributed or synthesized clock is multiplied up to create the appropriate clock edges for driving and sampling the multiple bits per t CC. Note that the drive and sample points for each bit window may be delayed or advanced by an offset (up to one bit time, which is t CC /N AC or t CC /N DQ ), depending upon the driver and sampler circuit techniques used. In one embodiment, a fixed offset is used. An offset between the drive/sample points and the bit windows should be consistent between the driving component and the sampling component. Once again, as stated above, it is preferable that in a particular system, any offset associated with the drive point or sampling point for a bus is consistent throughout the entire system.
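The relationship between the cycle time t CC, the number of bits per cycle (N AC or N DQ), and the bit time can be sketched as follows. The function names are hypothetical, times are in arbitrary units, and exact fractions are used so that the bit boundaries carry no rounding error. The offset parameter models the fixed drive or sample offset mentioned above and is limited to one bit time in magnitude.

```python
from fractions import Fraction


def bit_time(t_cc, n_bits):
    """Return the duration of one bit window: t_CC divided by N."""
    return Fraction(t_cc) / n_bits


def drive_points(t_cc, n_bits, offset=0):
    """Return the drive (or sample) points within one t_CC interval.

    `offset` delays (positive) or advances (negative) every point by the
    same fixed amount, up to one bit time in magnitude.
    """
    bt = bit_time(t_cc, n_bits)
    if abs(Fraction(offset)) > bt:
        raise ValueError("offset may not exceed one bit time")
    return [k * bt + offset for k in range(n_bits)]


if __name__ == "__main__":
    # Hypothetical example: t_CC of 10 time units, four bits per wire per t_CC.
    print(bit_time(10, 4))
    print(drive_points(10, 4))
```

Applying the same offset value at both the driving component and the sampling component corresponds to the consistency requirement stated above.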
[0082] FIG. 5 is a timing diagram illustrating timing of signals communicated over the address and control bus (Addr/Ctrl or AC S,M ) in accordance with an embodiment of the invention. This bus is accompanied by a clock signal AClk S,M which sees essentially the same wire path as the bus. The subscripts (S,M) indicate the bus or clock signal at a particular module M or a particular slice S. The controller is defined to be slice zero.
[0083] The waveform for AClk clock signal 501 depicts the timing of the AClk clock signal at the memory controller component. A rising edge 502 of AClk clock signal 501 occurs at time 510 and is associated with the transmission of address information ACa 518 . A rising edge 503 of AClk clock signal 501 occurs at time 511 and is associated with the transmission of address information ACb 519 .
[0084] The waveform for AClk clock signal 520 depicts the timing of the AClk clock signal at a memory component located at slice one. The AClk signal 520 is delayed by a delay of t PD0 from signal 501. For example, the rising edge 523 of signal 520 is delayed by a delay of t PD0 from edge 502 of signal 501. The address information ACa 537 is associated with the rising edge 523 of signal 520. The address information ACb 538 is associated with the rising edge 525 of signal 520.
[0085] The waveform for AClk clock signal 539 depicts the timing of the AClk clock signal at the memory component located at slice N S . The AClk signal 539 is delayed by a delay of t PD1 from signal 520 . For example, the rising edge 541 of signal 539 is delayed by a delay of t PD1 from edge 523 of signal 520 . The address information ACa 548 is associated with the rising edge 541 of signal 539 . The address information ACb 549 is associated with the rising edge 542 of signal 539 .
[0086] The clock signal AClk is shown with a cycle time that corresponds to the column cycle time. As previously mentioned, it could also have a shorter cycle time as long as the frequency and phase are constrained to allow the controller and memory components to generate the necessary timing points for sampling and driving the information on the bus. Likewise, the bus is shown with a single bit per wire transmitted per t CC interval. As previously mentioned, more than one bit could be transferred in each t CC interval since the controller and memory components are able to generate the necessary timing points for sampling and driving the information on the bus. Note that the actual drive point for the bus (the point at which data signals, address signals, and/or control signals are applied to the bus) may have an offset from what is shown (relative to the rising and falling edges of the clock); this will depend upon the design of the transmit and receive circuits in the controller and memory components. In one embodiment, a fixed offset is used. An offset between the drive/sample points and the bit windows should be consistent between the driving component and the sampling component. As stated above, it is preferable that in a particular system, any offset associated with the drive point or sampling point for a bus is consistent throughout the entire system.
[0087] It should be noted in FIG. 5 that there is a delay t PD0 in the clock AClk S,M and bus AC S,M as they propagate from the controller to the first slice. As indicated, AClk signal 520 is shifted in time and space from AClk signal 501 . Also note that there is a second delay t PD1 in the clock AClk S,M and bus AC S,M as they propagate from the first slice to the last slice N S . There will be a delay of t PD1 /(N S −1) as the clock and bus travel between each slice. Note that this calculation assumes generally equal spacing between the slices, and, if such physical characteristics are not present in the system, the delay will not conform to this formula. Thus, as indicated, AClk signal 539 is shifted in time and space from AClk signal 520 . As a result, the N S memory components will each be sampling the address and control bus at slightly different points in time.
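As a hedged illustration of the per-slice delay described above, the following sketch computes when the AClk clock and AC bus arrive at each slice, relative to the controller. The function name and the numeric values are hypothetical, and the sketch relies on the same equal-spacing assumption as the formula t PD1 /(N S −1).

```python
def aclk_arrival_delays(t_pd0, t_pd1, n_slices):
    """Delay of AClk/AC at slices 1..N_S relative to the controller.

    t_pd0    -- delay from the controller to the first slice
    t_pd1    -- delay from the first slice to the last slice
    n_slices -- number of slices, N_S
    Assumes generally equal spacing between slices, as the text does.
    """
    if n_slices < 1:
        raise ValueError("n_slices must be at least 1")
    if n_slices == 1:
        return [t_pd0]
    step = t_pd1 / (n_slices - 1)  # delay between adjacent slices
    return [t_pd0 + (s - 1) * step for s in range(1, n_slices + 1)]


# Hypothetical values: t_PD0 = 1.0, t_PD1 = 1.5, N_S = 4 (arbitrary units).
print(aclk_arrival_delays(1.0, 1.5, 4))
```

The first entry equals t PD0 and the last equals t PD0 + t PD1, so each memory component samples the bus at a slightly different point in time.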
[0088] FIG. 6 is a timing diagram illustrating timing of signals communicated over the data bus (DQ S,M ) in accordance with an embodiment of the invention. This bus is accompanied by two clock signals RClk S,M and WClk S,M which see essentially the same wire path as the bus. The subscripts (S,M) indicate the bus or clock signal at a particular module M and a particular slice S. The controller is defined to be module zero. The two clocks travel in opposite directions. WClk S,M accompanies the write data which is transmitted by the controller and received by the memory components. RClk S,M accompanies the read data which is transmitted by the memory components and received by the controller. In the example described, read data (denoted by “Q”) and write data (denoted by “D”) do not simultaneously occupy the data bus. Note that in other embodiments, this may not be the case where additional circuitry is provided to allow for additive signaling such that multiple waveforms carried over the same conductor can be distinguished and resolved.
[0089] The waveform of WClk clock signal 601 depicts the timing of the WClk clock signal at the memory controller component. Rising edge 602 occurs at time 610 and is associated with write datum information Da 618 , which is present at slice one of module zero. Rising edge 607 occurs at time 615 , and is associated with write datum information Dd 621 , which is present at slice one of module zero. Rising edge 608 occurs at time 616 , and is associated with write datum De 622 , which is present at slice one of module zero.
[0090] The waveform of RClk clock signal 623 depicts the timing of the RClk clock signal at the memory controller component (at module zero). Rising edge 626 is associated with read datum information Qb 619 , which is present at the memory controller component (at slice one of module zero). Rising edge is associated with read datum information Qc 620 , which is present at the memory controller component (at slice one of module zero).
[0091] The waveform of WClk clock signal 632 depicts the timing of the WClk clock signal at the memory component at slice one of module one. Rising edge 635 is associated with write datum information Da 649 , which is present at slice one of module one. Rising edge 645 is associated with write datum information Dd 652 , which is present at slice one of module one. Rising edge 647 is associated with write datum information De 653 , which is present at slice one of module one.
[0092] The waveform of RClk clock signal 654 depicts the timing of the RClk clock signal at the memory component of slice one of module one. Rising edge 658 is associated with read datum information Qb 650 , which is present at slice one of module one. Rising edge 660 is associated with read datum information Qc 651 , which is present at slice one of module one.
[0093] The clock signals are shown with a cycle time that corresponds to t CC . As previously mentioned, they could also have a shorter cycle time as long as the frequency and phase are constrained to allow the controller and memory components to generate the necessary timing points for sampling and driving the information on the bus. Likewise, the bus is shown with a single bit per wire. As previously mentioned, more than one bit could be transferred in each t CC interval since the controller and memory components are able to generate the necessary timing points for sampling and driving the information on the bus. Note that the actual drive point for the bus may have an offset from what is shown (relative to the rising and falling edges of the clock)—this will depend upon the design of the transmit and receive circuits in the controller and memory components. In one embodiment, a fixed offset is used. An offset between the drive/sample points and the bit windows should be consistent between the driving component and the sampling component.
[0094] It should be noted in FIG. 6 that there is a delay t PD2 in the clock WClk S,M and bus DQ S,M (with the write data) as they propagate from the controller to the slices of the first module. Thus, WClk clock signal 632 is shifted in time and space from WClk clock signal 601 . Also note that there is an approximately equal delay t PD2 in the clock RClk S,M and bus DQ S,M (with the read data) as they propagate from the slices of the first module to the controller. Thus, RClk clock signal 623 is shifted in time and space from RClk clock signal 654 .
[0095] As a result, the controller and the memory components must have their transmit logic coordinated so that they do not attempt to drive write data and read data at the same time. The example in FIG. 6 shows a sequence in which there are write-read-read-write-write transfers. It can be seen that read-read and write-write transfers may be made in successive t CC intervals, since the data in both intervals is traveling in the same direction. However, gaps (bubbles) are inserted at the write-read and read-write transitions so that a driver only turns on when the data driven in the previous interval is no longer on the bus (it has been absorbed by the termination components at either end of the bus wires).
[0096] In FIG. 6 , the read clock RClk S,M and the write clock WClk S,M are in phase at each memory component (however the relative phase of these clocks at each memory component will be different from the other memory components—this will be shown later when the overall system timing is discussed). Note that this choice of phase matching is one of several possible alternatives that could have been used. Some of the other alternatives will be described later.
[0097] As a result of matching the read and write clocks at each memory component (slice), the t CC intervals with read data Qb 650 will appear to immediately follow the t CC intervals with write data Da 649 at the memory components (bottom of FIG. 6 ), but there will be a gap of 2*t PD2 between the read data interval Qb 619 and write data interval Da 618 at the controller (top of FIG. 6 ). There will be a second gap of (2*t CC −2*t PD2 ) between the read data Qc 620 and the write data Dd 621 at the controller. There will be a gap of (2*t CC ) between the read data Qc 651 and the write data Dd 652 at the memory components. Note that the gaps sum to 2*t CC at the memory components and likewise at the controller.
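The gap arithmetic above can be checked with a short sketch. It is only an illustration under stated assumptions: the slot positions in `schedule` are hypothetical and are chosen so that the memory component sees the write-read-read-write-write sequence with a 2*t CC read-to-write gap, and the function name and numeric values are not taken from the figures. Write data is assumed to leave the controller t PD2 before it reaches the memory component, and read data is assumed to reach the controller t PD2 after it leaves the memory component.

```python
def turnaround_gaps(t_cc, t_pd2):
    """Return (controller_gaps, memory_gaps) at each direction change.

    Each list holds the idle time on the data bus between consecutive
    transfers of different direction, in schedule order.
    """
    # (kind, slot index at the memory component); "D" = write, "Q" = read.
    # Hypothetical slots: Da, Qb, Qc back to back, then Dd after 2*t_CC.
    schedule = [("D", 0), ("Q", 1), ("Q", 2), ("D", 5), ("D", 6)]

    def intervals(shift):
        # Occupancy interval (kind, start, end) of each transfer.
        return [(k, s * t_cc + shift[k], (s + 1) * t_cc + shift[k])
                for k, s in schedule]

    def gaps(ivals):
        # Gap between neighbours only where the direction changes.
        return [b[1] - a[2] for a, b in zip(ivals, ivals[1:]) if a[0] != b[0]]

    memory = intervals({"D": 0, "Q": 0})
    controller = intervals({"D": -t_pd2, "Q": +t_pd2})
    return gaps(controller), gaps(memory)


# Hypothetical values: t_CC = 10, t_PD2 = 2 (arbitrary time units).
ctrl, mem = turnaround_gaps(10, 2)
print(ctrl, mem)
```

For these values the controller sees gaps of 2*t PD2 and 2*t CC − 2*t PD2, the memory component sees gaps of zero and 2*t CC, and both pairs sum to 2*t CC.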
[0098] The overall system timing will be described next. The example system phase aligns the AClk S,M , RClk S,M , and WClk S,M clocks at each memory component (the slice number varies from one through N S , and the module number is fixed at one). This has the benefit of allowing each memory component to operate in a single clock domain, avoiding any domain crossing issues. Because the address and control clock AClk S,M flows past each memory component, the clock domain of each memory slice will be offset slightly from the adjacent slices. The cost of this phasing decision is that the controller must adjust the read and write clocks for each slice to different phase values—this means there will be 1+(2*N S ) clock domains in the controller, and crossing between these domains efficiently becomes very important. Other phase constraints are possible and will be discussed later.
[0099] FIG. 7 is a timing diagram illustrating system timing at a memory controller component in accordance with an embodiment of the invention. As before, the controller sends a write-read-read-write sequence of operations on the control and address bus AClk S0,M1 . The Da write datum information is sent on the WClk S1,M0 and WClk SNs,M0 buses so that it will preferably arrive at the memory component of each slice one cycle after the address and control information ACa. This is done by making the phase of the WClk S1,M0 clock generally equivalent to (t PD0 −t PD2 ) relative to the phase of the AClk S0,M1 clock (positive means later, negative means earlier). This will cause them to be in phase at the memory component of the first slice of the first module. Likewise, the phase of the WClk SNs,M0 clock is adjusted to be generally equivalent to (t PD0 +t PD1 −t PD2 ) relative to the phase of the AClk S0,M1 clock. Note that some tolerance is preferably built into the system such that the phase adjustment of the clock to approximate the propagation delays can vary slightly from the desired adjustment while still allowing for successful system operation.
[0100] In a similar fashion, the phase of the RClk S1,M0 clock is adjusted to be generally equivalent to (t PD0 +t PD2 ) relative to the phase of the AClk S0,M1 clock. This will cause them to be in phase at the memory component of the first slice of the first module. Likewise, the phase of the RClk SNs,M0 clock is adjusted according to the expression (t PD0 +t PD1 +t PD2 ) relative to the phase of the AClk S0,M1 clock to cause the RClk SNs,M0 clock and the AClk S0,M1 clock to be in phase at the memory component of the last slice of the first module.
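A minimal sketch of these controller-side phase adjustments follows. It is derived only from the requirement that WClk and RClk be in phase with AClk at each memory component; the function name and values are hypothetical, and the phases for slices between the first and the last assume equal spacing along the address bus.

```python
def controller_clock_phases(t_pd0, t_pd1, t_pd2, n_slices):
    """Phase of each slice's WClk/RClk at the controller, relative to AClk.

    Positive means later, negative means earlier, as in the text.
    Returns {slice: {"WClk": phase, "RClk": phase}} for slices 1..N_S.
    """
    phases = {}
    for s in range(1, n_slices + 1):
        # Fraction of the first-to-last-slice delay seen by slice s
        # (equal spacing is an assumption of this sketch).
        frac = 0 if n_slices == 1 else (s - 1) / (n_slices - 1)
        ac_delay = t_pd0 + t_pd1 * frac  # AClk arrival at slice s
        phases[s] = {
            # WClk travels t_PD2 to the memory component, so launch it early.
            "WClk": ac_delay - t_pd2,
            # RClk travels t_PD2 back to the controller, so it arrives late.
            "RClk": ac_delay + t_pd2,
        }
    return phases


# Hypothetical values: t_PD0 = 3, t_PD1 = 2, t_PD2 = 1, N_S = 3.
p = controller_clock_phases(3, 2, 1, 3)
print(p[1], p[3])
# One AClk domain plus a WClk and an RClk domain per slice: 1 + 2*N_S.
print(1 + 2 * len(p))
```

For the first slice this gives t PD0 −t PD2 and t PD0 +t PD2 ; for the last slice the t PD1 term is added to both, which matches the (t PD0 +t PD1 ) delay of all three clocks at slice N S .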
[0101] The waveform of AClk clock signal 701 depicts the AClk clock signal at the memory controller component, which is denoted as being at slice zero. Rising edge 702 occurs at time 710 and is associated with address information ACa 718 , which is present at slice zero. Rising edge 703 occurs at time 711 and is associated with address information ACb 719 , which is present at slice zero. Rising edge 704 occurs at time 712 and is associated with address information ACc 720 , which is present at slice zero. Rising edge 707 occurs at time 715 and is associated with address information ACd 721 , which is present at slice zero.
[0102] The waveform of WClk clock signal 722 depicts the WClk clock signal for the memory component at slice one when that WClk clock signal is present at the memory controller component at module zero. Rising edge 724 occurs at time 711 and is associated with write datum information Da 730 , which is present. Rising edge 729 occurs at time 716 and is associated with write datum information Dd 733 , which is present.
[0103] The waveform of RClk clock signal 734 depicts the RClk clock signal for the memory component of slice one when that RClk clock signal is present at the memory controller component at module zero. Rising edge 737 is associated with read datum information Qb 731 , which is present. Rising edge 738 is associated with read datum information Qc 732 , which is present.
[0104] The waveform of WClk clock signal 741 depicts the WClk clock signal for the memory component at slice N S when that WClk clock signal is present at the memory controller component at module zero. Write datum information Da 756 is associated with edge 744 of signal 741 . Write datum information Dd 759 is associated with edge 754 of signal 741 .
[0105] The waveform of RClk clock signal 760 depicts the RClk clock signal for the memory component at slice N S when that RClk clock signal is present at the memory controller component at module zero. Read datum information Qb 757 is associated with edge 764 of signal 760 . Read datum information Qc 758 is associated with edge 766 of signal 760 .
[0106] FIG. 8 is a timing diagram illustrating alignment of clocks AClk S1,M1 , WClk S1,M1 , and RClk S1,M1 at the memory component in slice 1 of rank 1 in accordance with an embodiment of the invention. All three clocks are delayed by t PD0 relative to the AClk S0,M1 clock produced at the controller.
[0107] The waveform of AClk clock signal 801 depicts the AClk clock signal for the memory component at slice one of module one. Address information ACa 822 is associated with edge 802 of signal 801 . Address information ACb 823 is associated with edge 804 of signal 801 . Address information ACc 824 is associated with edge 806 of signal 801 . Address information ACd 825 is associated with edge 812 of signal 801 .
[0108] The waveform of WClk clock signal 826 depicts the WClk clock signal for the memory component at slice one of module one. Write datum information Da 841 is associated with edge 829 of signal 826 . Write datum information Dd 844 is associated with edge 839 of signal 826 .
[0109] The waveform of RClk clock signal 845 depicts the RClk clock signal for the memory component at slice one of module one. Read datum information Qb 842 is associated with edge 850 of signal 845 . Read datum information Qc 843 is associated with edge 852 of signal 845 .
[0110] FIG. 9 is a timing diagram illustrating alignment of clocks AClk SNs,M1 , WClk SNs,M1 , and RClk SNs,M1 at the memory component in slice N S of rank one of module one in accordance with an embodiment of the invention. All three clocks are delayed by (t PD0 +t PD1 ) relative to the AClk S0,M1 clock produced at the controller.
[0111] The waveform of AClk clock signal 901 depicts the AClk clock signal for the memory component at slice N S at module one. Rising edge 902 of signal 901 is associated with address information ACa 917 . Rising edge 903 of signal 901 is associated with address information ACb. Rising edge 904 of signal 901 is associated with address information ACc 919 . Rising edge 907 of signal 901 is associated with address information ACd 920 .
[0112] The waveform of WClk clock signal 921 depicts the WClk clock signal for the memory component at slice N S at module one. Rising edge 923 of signal 921 is associated with write datum information Da 937 . Rising edge 928 of signal 921 is associated with write datum information Dd 940 .
[0113] The waveform of RClk clock signal 929 depicts the RClk clock signal for the memory component at slice N S at module one. Rising edge 932 of signal 929 is associated with read datum information Qb 938 . Rising edge 933 of signal 929 is associated with read datum information Qc 939 .
[0114] Note that in both FIGS. 8 and 9 there is a one t CC cycle delay between the address/control information (ACa 917 of FIG. 9 , for example) and the read or write information that accompanies it (Da 937 of FIG. 9 in this example) when viewed at each memory component. This may be different for other technologies; i.e. there may be a longer access delay. In general, the access delay for the write operation at the memory component should be equal or approximately equal to the access delay for the read operation in order to maximize the utilization of the data bus.
[0115] FIGS. 10 through 18 illustrate the details of an exemplary system which uses address and data timing relationships which are nearly identical to what has been described in FIGS. 5 through 9 . In particular, all three clocks are in-phase on each memory component. This example system has several differences relative to this earlier description, however. First, two bits per wire are applied per t CC interval on the AC bus (address/control bus, or simply address bus). Second, eight bits per wire are applied per t CC interval on the DQ bus. Third, a clock signal accompanies the AC bus, but the read and write clocks for the DQ bus are synthesized from the clock for the AC bus.
[0116] FIG. 10 is a block diagram illustrating further details for one memory rank (one or more slices of memory components) of a memory system such as that illustrated in FIG. 1 in accordance with an embodiment of the invention. The internal blocks of the memory components making up this rank are connected to the external AC or DQ buses. The serialized data on these external buses is converted to or from parallel form on internal buses which connect to the memory core (the arrays of storage cells used to hold information for the system). Note that FIG. 10 shows all 32 bits of the DQ bus connecting to the memory rank—these 32 bits are divided up into multiple, equal-sized slices and each slice of the bus is routed to one memory component. Thus, slices are defined based on portions of the DQ bus routed to separate memory components. The example shown in FIG. 10 illustrates a memory component, or device, that supports the entire set of 32 data bits for a particular example system. In other embodiments, such a system may include two memory devices, where each memory device supports half of the 32 data bits. Thus, each of these memory devices would include the appropriate data transmit blocks, data receive blocks, and apportionment of memory core such that they can individually support the portion of the overall data bus for which they are responsible. Note that the number of data bits need not be 32, but may be varied.
[0117] The AClk signal is the clock which accompanies the AC bus. It is received and is used as a frequency and phase reference for all the clock signals generated by the memory component. The other clocks are ClkM 2 , ClkM 8 , and ClkM. These are, respectively, 2×, 8×, and 1× the frequency of AClk. The rising edges of all clocks are aligned (no phase offset). The frequency and phase adjustment is typically done with some type of phase-locked-loop (PLL) circuit, although other techniques are also possible. A variety of different suitable PLL circuits are well known in the art. The feedback loop includes the skew of the clock drivers needed to distribute the various clocks to the receive and transmit blocks as well as the memory core. The memory core is assumed to operate in the ClkM domain.
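The clock ratios of the example system can be sketched as follows. The function names and the AClk frequency are hypothetical; the sketch assumes that one AClk cycle corresponds to one t CC interval, that ClkM 2 and ClkM 8 supply one rising edge per bit on the AC and DQ wires respectively, and that the DQ bus is 32 wires wide as in the example of FIG. 10 .

```python
def derived_clocks(aclk_hz):
    """Frequencies of the clocks generated from AClk (1x, 2x, 8x)."""
    return {
        "ClkM": 1 * aclk_hz,   # memory core domain
        "ClkM2": 2 * aclk_hz,  # two bits per AC wire per t_CC
        "ClkM8": 8 * aclk_hz,  # eight bits per DQ wire per t_CC
    }


def dq_bits_per_tcc(dq_wires=32, bits_per_wire=8):
    """Total data bits transferred on the DQ bus in one t_CC interval."""
    return dq_wires * bits_per_wire


# Hypothetical AClk frequency of 100 MHz.
print(derived_clocks(100_000_000))
print(dq_bits_per_tcc())
```

With these assumptions the 32-wire DQ bus carries 256 bits per t CC interval, which is the width the internal parallel buses to the memory core would need to absorb in the ClkM domain.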
[0118] Memory component 116 comprises memory core 1001 , PLL 1002 , PLL 1003 , and PLL 1004 . AClk clock signal 109 is received by buffer 1015 , which provides clock signal 1019 to PLLs 1002 , 1003 , and 1004 . Various PLL designs are well known in the art; however, some PLLs implemented in the example embodiments described herein require minor customization to allow for the specific functionality desired. Therefore, in some embodiments described herein, the particular operation of the various blocks within the PLL is described in additional detail. Thus, although some of the PLL constructs included in the example embodiments described herein are not described in extreme detail, it is apparent to one of ordinary skill in the art that the general objectives to be achieved by such PLLs are readily recognizable through a variety of circuits well known to those skilled in the art. PLL 1002 includes phase comparator and voltage controlled oscillator (VCO) 1005 . PLL 1002 provides clock signal ClkM 1024 to memory core 1001 , address/control receive block 204 , data receive block 205 , and data transmit block 206 .
[0119] PLL 1003 comprises prescaler 1009 , phase comparator and VCO 1010 , and divider 1011 . Prescaler 1009 may be implemented as a frequency divider (such as that used to implement divider 1011 ) and provides a compensating delay with no frequency division necessary. Prescaler 1009 provides a signal 1021 to phase comparator and VCO 1010 . The phase comparator in VCO 1010 is represented as a triangle having two inputs and an output. The functionality of the phase comparator 1010 is preferably configured such that it produces an output signal that ensures that the phase of the feedback signal 1023 , which is one of its inputs, is generally phase aligned with a reference signal 1021 . This convention is preferably applicable to similar structures included in other PLLs described herein. Divider 1011 provides a feedback signal 1023 to phase