[0001] The present invention relates to a method and system for visualizing data and data objects. More specifically, the present invention pertains to a method and system for computing forces on data objects in physics-based visualization techniques.
[0002] The analysis of data objects can be facilitated by displaying them in multi-dimensional space. By displaying data objects in this manner, it is easier to visualize the relationships between the data objects while preserving the essential information in the data.
[0003] One technique known in the art for computing the layout of data objects in multi-dimensional space is generally referred to as a physics-based visualization system. Data objects are positioned in “space” according to how strongly the data objects are related. The relationship between data objects is characterized according to their “similarity” or “confidence level” and their “support.” Similarity (confidence level) and support are best defined by way of a market basket example: if 85 percent of the customers who bought a printer also bought paper, and 10 percent of all customers bought both a printer and paper, then similarity is 85 percent and support is 10 percent. Note that similarity is directional; that is, although 85 percent of the customers who bought a printer also bought paper, it is not necessarily true that 85 percent of the customers who bought paper also bought a printer.
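The market basket definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `support` and `similarity` and the sample baskets are assumptions made for this example and do not come from the specification.

```python
from typing import List, Set


def support(baskets: List[Set[str]], a: str, b: str) -> float:
    """Fraction of all baskets that contain both item a and item b."""
    both = sum(1 for basket in baskets if a in basket and b in basket)
    return both / len(baskets)


def similarity(baskets: List[Set[str]], a: str, b: str) -> float:
    """Fraction of the baskets containing a that also contain b.

    This is directional: similarity(a, b) need not equal similarity(b, a).
    """
    with_a = [basket for basket in baskets if a in basket]
    if not with_a:
        return 0.0
    return sum(1 for basket in with_a if b in basket) / len(with_a)


# Hypothetical purchase data: ten customers, one basket each.
baskets = [
    {"printer", "paper"},
    {"printer", "paper"},
    {"printer", "paper"},
    {"printer"},
    {"paper"},
    {"paper"},
    {"paper", "ink"},
    {"ink"},
    {"toner"},
    {"ink", "toner"},
]

printer_to_paper = similarity(baskets, "printer", "paper")
paper_to_printer = similarity(baskets, "paper", "printer")
both_support = support(baskets, "printer", "paper")
```

In this sample, three of the four printer buyers also bought paper, while only three of the six paper buyers bought a printer, so the two directional similarities differ even though the support for the pair is a single value.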
[0004] In multi-dimensional space, the distance between a pair of data objects indicates their degree of support; the closer the objects, the greater their degree of support. The similarity between data objects is represented by a “link” or an “edge” between the objects. A link or edge is essentially a line between two data objects that have a degree of similarity that is greater than zero. The color of the link (edge) can be used to indicate the degree of similarity. Also, the link may include an arrow to show the direction of the association.
[0005] Physics-based visualization systems execute in a known manner to place data objects such that the distance between any two data objects is indicative of the degree of support between those two objects. Much as stars and planets exert forces on each other in astrophysical space, data objects in visualization space can be thought of as exerting “forces” on each other. In essence, these forces should be proportional to the degree of support between each data object and the other data objects in the system. Data objects with higher degrees of support should exert higher forces on each other and thus should be placed closer together.
[0006] Thus, in a physics-based visualization system, one of the key procedures for determining the final placement of data objects is to calculate the forces applied to each data object by the other data objects in the system. To calculate the force for every data object, the simplistic approach is to compute the force for every possible pairing of data objects, and sum the results. However, this would result in a run time on the order of N² for N data objects.
[0007] In astrophysics, techniques known in the art can be used to accelerate the force computations. One such technique, referred to herein as “Barnes-Hut,” is described in J. E. Barnes and P. Hut (1986), “A Hierarchical O(N log N) Force-Calculation Algorithm,” Nature, 324(6270) (pages 446-449), hereby incorporated by reference. Instead of a run time on the order of N², Barnes-Hut achieves a run time on the order of N log N.
[0008] Techniques such as Barnes-Hut generally represent objects that are far away as a single, heavier object. For example, to calculate the force of a distant galaxy on a single star, the force is calculated considering the galaxy as a whole, rather than by calculating the force exerted by each individual star in the galaxy and summing the results.
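The approximation described above can be illustrated with a small Python sketch. It is not an implementation of Barnes-Hut itself (there is no tree and no opening criterion); it only compares the summed force from each member of a distant cluster with the force from a single object of the same total mass placed at the cluster's center of mass. The inverse-square law with a unit constant, the function names, and the sample coordinates are all assumptions made for this example.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]
Body = Tuple[Vec, float]  # (position, mass)


def pair_force(pos: Vec, mass: float, other_pos: Vec, other_mass: float) -> Vec:
    """Inverse-square attraction on the body at pos toward other_pos (constant = 1)."""
    dx = other_pos[0] - pos[0]
    dy = other_pos[1] - pos[1]
    r = math.hypot(dx, dy)
    magnitude = mass * other_mass / r**2
    return (magnitude * dx / r, magnitude * dy / r)


def exact_force(pos: Vec, mass: float, cluster: List[Body]) -> Vec:
    """Sum the force from every member of the cluster individually."""
    fx = fy = 0.0
    for other_pos, other_mass in cluster:
        dfx, dfy = pair_force(pos, mass, other_pos, other_mass)
        fx += dfx
        fy += dfy
    return (fx, fy)


def superobject(cluster: List[Body]) -> Body:
    """Collapse a cluster into one body: total mass at the center of mass."""
    total = sum(m for _, m in cluster)
    cx = sum(p[0] * m for p, m in cluster) / total
    cy = sum(p[1] * m for p, m in cluster) / total
    return ((cx, cy), total)


# A single star at the origin and a small, distant "galaxy" of four bodies.
star_pos, star_mass = (0.0, 0.0), 1.0
galaxy = [((99.0, -1.0), 1.0), ((101.0, 1.0), 1.0),
          ((100.0, 1.0), 1.0), ((100.0, -1.0), 1.0)]

exact = exact_force(star_pos, star_mass, galaxy)
center, total_mass = superobject(galaxy)
approx = pair_force(star_pos, star_mass, center, total_mass)

# Relative error of the single-object approximation.
error = math.hypot(exact[0] - approx[0], exact[1] - approx[1]) / math.hypot(*exact)
```

Because the cluster is small compared with its distance from the star, the relative error stays small, which is why one force evaluation can stand in for four.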
[0009] In astrophysics, the force between objects is a function of the distance between the objects, and so the approximations introduced by Barnes-Hut can be readily applied. However, in physics-based visualization systems, the force between data objects is a function not only of the distance between objects (their degree of support), but also the similarity between objects. Because forces in a physics-based visualization system are dependent on both the distance between data objects and the similarity of the data objects, applying a computationally efficient force computation technique such as Barnes-Hut is problematic. This problem is illustrated by Prior Art
[0010] Prior Art
[0011] When calculating the force exerted on data object
[0012] Accordingly, what is needed is a method and/or system for accelerating calculations of force in physics-based visualization systems. What is also needed is a method and/or system that can satisfy this need and that can account for the similarity between data objects. The present invention provides a novel solution to these needs.
[0013] A method and system thereof for computing forces on data objects in a physics-based visualization system are described. First forces exerted on a data object by the other data objects in a plurality of data objects are determined without considering the similarity between data objects. Second forces exerted on the data object by a portion of the other data objects, each data object in the portion having a degree of similarity to the data object, are determined considering the similarity between data objects. The first forces are adjusted using the second forces to determine a net force on the data object. The net force is thus determined without having to consider similarity between all data objects in the plurality.
[0014] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention:
[0015] PRIOR ART
[0016]
[0017]
[0018]
[0019]
[0020]
[0021]
[0022]
[0023]
[0024]
[0025] Embodiments of the present invention provide a method and system for accelerating calculations of force in physics-based visualization systems. Embodiments of the present invention also provide a method and system that can account for the similarity between data objects.
[0026] Embodiments of the present invention generally pertain to a two-pass procedure including a prediction step and a correction step. In the prediction step, the forces exerted on data objects by other data objects are calculated without considering the degree of similarity between the objects. As such, a technique that accelerates force computations (such as but not limited to Barnes-Hut) can be used in the prediction step.
[0027] In the correction step, forces are calculated for those data object pairs that have a certain (e.g., specified) degree of similarity, using a function that considers the similarity between data objects. These forces are used with the forces calculated in the prediction step to determine the actual forces on the data objects. For those data object pairs that have the specified degree of similarity, the forces calculated in the prediction step are subtracted and the forces calculated in the correction step are added.
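The prediction and correction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the force laws `distance_force` and `full_force` are hypothetical, the prediction step uses direct summation as a stand-in for an accelerated technique such as Barnes-Hut, and `links` is an assumed mapping from a directed pair (i, j) to the similarity used for the force on object i due to object j.

```python
import math
from typing import Dict, List, Tuple

Vec = Tuple[float, float]


def distance_force(p: Vec, q: Vec) -> Vec:
    """Hypothetical distance-only force on p due to q (inverse-square repulsion)."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    r = math.hypot(dx, dy)
    return (dx / r**3, dy / r**3)


def full_force(p: Vec, q: Vec, s: float) -> Vec:
    """Hypothetical force on p due to q that depends on distance and similarity s."""
    fx, fy = distance_force(p, q)
    # Attraction toward q scaled by the similarity; vanishes when s == 0.
    return (fx + s * (q[0] - p[0]), fy + s * (q[1] - p[1]))


def predict(positions: List[Vec]) -> List[Vec]:
    """Prediction step: forces from all other objects, ignoring similarity.

    Direct summation is used here; an accelerated method could replace it.
    """
    forces = []
    for i, p in enumerate(positions):
        fx = fy = 0.0
        for j, q in enumerate(positions):
            if i != j:
                dfx, dfy = distance_force(p, q)
                fx += dfx
                fy += dfy
        forces.append((fx, fy))
    return forces


def correct(positions: List[Vec], predicted: List[Vec],
            links: Dict[Tuple[int, int], float]) -> List[Vec]:
    """Correction step: for linked pairs only, subtract the predicted
    force and add the similarity-aware force."""
    net = [list(f) for f in predicted]
    for (i, j), s in links.items():
        old = distance_force(positions[i], positions[j])
        new = full_force(positions[i], positions[j], s)
        net[i][0] += new[0] - old[0]
        net[i][1] += new[1] - old[1]
    return [(f[0], f[1]) for f in net]


def reference(positions: List[Vec], links: Dict[Tuple[int, int], float]) -> List[Vec]:
    """All-pairs calculation with similarity, used only to check the result."""
    forces = []
    for i, p in enumerate(positions):
        fx = fy = 0.0
        for j, q in enumerate(positions):
            if i != j:
                dfx, dfy = full_force(p, q, links.get((i, j), 0.0))
                fx += dfx
                fy += dfy
        forces.append((fx, fy))
    return forces


positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
links = {(0, 1): 0.85, (1, 0): 0.40, (2, 3): 0.10}

net_forces = correct(positions, predict(positions), links)
expected = reference(positions, links)
```

The correction loop visits only the linked pairs, so its cost grows with the number of links rather than with the number of all object pairs. In this sketch the two-pass result matches the all-pairs reference up to floating-point rounding because `full_force` reduces to `distance_force` when the similarity is zero.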
[0028] In its various embodiments, the present invention can reduce the overall computational effort while striking an appropriate balance with accuracy. Suppose there are N data objects, and M pairs of data objects having the specified degree of similarity. A conventional approach (not utilizing the present invention) requires explicit force calculations for all N data objects and a computational effort on the order of N².
[0029]
[0030] A non-zero degree of similarity between data objects can be indicated using a “link” or an “edge.” In
[0031] In one embodiment, a threshold value can be defined such that pairs of data objects having a degree of similarity less than the threshold value can be considered to be dissimilar (e.g., they can be assumed to have a similarity value of zero). In this embodiment, a link or edge (e.g., link
[0032] In the example of
[0033]
[0034]
[0035]
[0036]
[0037] In step
[0038] In step
[0039]
[0040] In step
[0041] In step
[0042] In the present embodiment, step
[0043] In the present embodiment, if M is the number of links with similarity other than zero (or above the threshold value), then the correction step (step
[0044] Note that the threshold value can be defined to achieve a desired value for M. In effect, the value of the threshold (and hence the value of M) determines the amount by which the estimated forces (from step
[0045] Because the present embodiment of step
[0046]
[0047] In step
[0048] In the present embodiment, the first forces are calculated by assuming that all of the data object pairs have a similarity of zero. In one embodiment, this is accomplished by setting all values of similarity to zero. Alternatively, functions for calculating the first forces as a function of only distance can be derived from functions that are known in the art. In one embodiment, if the function being used to calculate forces includes a similarity term, the similarity term is zeroed out (for example, the coefficient of the similarity term is set equal to zero) to remove this term from the function. Thus, according to the present embodiment of the present invention, the first forces can be calculated using the simplifying assumptions introduced according to Barnes-Hut, FME, or another technique known in the art. That is, by calculating the first forces without considering similarity, the force calculation is amenable to the use of Barnes-Hut, FME, and the like.
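Zeroing out the similarity term can be sketched as below. The force function shown is purely hypothetical (the specification does not give one); the names `force_magnitude`, `k_distance`, and `k_similarity` are assumptions for illustration. The point is only that setting the coefficient of the similarity term to zero leaves a function of distance alone.

```python
from functools import partial


def force_magnitude(distance: float, similarity: float,
                    k_distance: float = 1.0, k_similarity: float = 1.0) -> float:
    """Hypothetical force law with a distance term and a similarity term."""
    distance_term = k_distance / distance**2
    similarity_term = k_similarity * similarity * distance
    return distance_term - similarity_term


# First forces: the same function with the similarity coefficient set to zero,
# so the result no longer depends on the similarity argument.
first_force = partial(force_magnitude, k_similarity=0.0)

with_high_similarity = first_force(2.0, 0.9)
with_zero_similarity = first_force(2.0, 0.0)
```

With the coefficient zeroed, both calls return the same value for the same distance, which is the property that makes the first-force calculation amenable to distance-only acceleration techniques.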
[0049] In step
[0050] In the present embodiment, the second forces are calculated using a function that considers similarity between data objects. In one embodiment, a function that is dependent on both the distance between data objects and their similarity is used. Such functions are known in the art. The function used to calculate the second forces may be the same function as that used in step
[0051] In step
[0052] With reference back to
[0053] Next, according to the present embodiment, second forces dependent on both distance and similarity are calculated for the data object pairs that have a degree of similarity greater than zero (or greater than the threshold). Continuing the example from the preceding paragraph, in which the method of the present invention is illustrated for data objects
[0054] Then, according to the present embodiment, the net forces on data objects
[0055] Thus, in accordance with the present invention, a portion of the data objects can be modeled as superobjects using techniques such as Barnes-Hut, reducing the total number of explicit force calculations that need to be performed and thereby reducing the run time needed to place data objects in a physics-based visualization system. The present invention thus provides a method and system for accelerating calculations of force in physics-based visualization systems while accounting for both the distance and the similarity between data objects.
[0056] The present invention may be utilized in a number of applications. These applications include but are not limited to: market basket analysis of the similarity of products used/purchased by consumers; customer behavior analysis of the similarity of customers; text mining in which the similarity of documents is analyzed; multidimensional scaling algorithms; large graph layouts of abstract, multidimensional graphs; and physical simulations of complex force fields in which force or other phenomena depend on more than one parameter.
[0057]
[0058] System
[0059] An objective of physics-based visualization engine
[0060] Continuing with reference to
[0061]
[0062] In
[0063] Referring now to
[0064] In the embodiment of
[0065] The preferred embodiment of the present invention is thus described. While the present invention has been described in particular embodiments, it should be appreciated that the present invention should not be construed as limited by such embodiments, but rather construed according to the following claims.