Title:
System and method for analyzing data
Document Type and Number:
Kind Code:
A1

Abstract:
A system and method for analyzing data (the “system”) is disclosed. The system can automatically identify patterns of template data points encapsulated in the form of one or more “events.” Calculations and analysis relating to those identified events can be automatically performed at the identified locations of the events. Events are user-defined, and can be defined in reference to multiple channels of data. The system can perform various correlation calculations in comparing events with data points. Upon identifying the location of various events in the various data files, markers can be placed at those file locations. Analysis calculations can then be performed related to the marked data. The system can incorporate the automated time-scaling of patterns, marker sorting heuristics, the adjustment of fit sensitivity based on the size of the pattern, target value weighing, and the calculation of various confidence values relating to the processing of the system.
Inventors:
James, Frederick Earl (Kalamazoo, MI, US)
Application Number:
10/377511
Publication Date:
09/02/2004
Filing Date:
02/28/2003
Primary Class:
International Classes:
(IPC1-7): G06F017/00
Attorney, Agent or Firm:
RADER, FISHMAN & GRAUER PLLC (39533 WOODWARD AVENUE, BLOOMFIELD HILLS, MI, 48304-0610, US)
Claims:

In the claims:



1. A system for analyzing data, comprising: a data subsystem, including a file, said file comprising a plurality of data points; an interface subsystem, including a plurality of input characteristics and a plurality of events, wherein said interface subsystem creates said events from at least one said input characteristic; and an analysis subsystem, including a plurality of markers, wherein said analysis subsystem searches said data subsystem for said data points indicative of said events, and wherein said analysis subsystem places said markers on said data points indicative of said events without human intervention.

2. The system of claim 1, further comprising a plurality of template data types and a plurality of template data points, wherein at least one said event includes said plurality of template data points and said plurality of template data types.

3. The system of claim 1, further comprising a plurality of data types, wherein said file includes said plurality of data types.

4. The system of claim 1, further comprising: a plurality of data types, a plurality of template data types, and a plurality of template data points; wherein at least one said event includes said plurality of template data points; wherein at least one said event includes said plurality of data types; and wherein said file includes said plurality of data types.

5. The system of claim 4, wherein each data type in said plurality of template data types is not represented in said plurality of data types.

6. The system of claim 4, wherein each template data type in said plurality of data types is not represented in said plurality of template data types.

7. The system of claim 1, further comprising a weight factor, wherein said analysis subsystem uses said weight factor to identify at least one said event.

8. The system of claim 1, said analysis subsystem further including a marker location associated with said marker and an analysis, wherein said analysis subsystem generates said analysis from at least one said data point at said marker location.

9. The system of claim 8, wherein said analysis is generated without human intervention.

10. The system of claim 1, said data subsystem further including a plurality of files in a pre-defined and user-defined format, wherein said plurality of data points are stored within said plurality of files.

11. The system of claim 1, said data subsystem further including a data collection module, said data collection module including a sensor for capturing sensor data, wherein said data collection module generates said data points from said sensor data.

12. The system of claim 11, wherein said sensor collects sensor data for a plurality of channels in a substantially simultaneous manner, and wherein each channel is of a different data type.

13. The system of claim 12, wherein one channel in said plurality of channels is a force channel.

14. The system of claim 1, wherein said markers are placed at a plurality of marker locations, wherein said marker locations are identified by a correlation-based matching heuristic, and wherein a confidence value is associated with the identification of each said marker locations.

15. The system of claim 1, wherein said markers are placed at a plurality of marker locations, wherein said marker locations are identified by a correlation-based matching heuristic, and wherein a confidence value is associated with the identification of said marker locations.

16. The system of claim 1, said analysis subsystem including a skill level indicator and a menu comprising of a plurality of menu selections, wherein said skill level indicator is set in accordance with at least one said input characteristic, and wherein said menu selections are selectively disabled depending on said skill level indicator.

17. The system of claim 1, wherein at least one said event is stored in a pattern array, wherein a subset of said plurality of data points are stored in a data array, and wherein a correlation heuristic is applied to said pattern array and said data array to identify a marker location.

18. The system of claim 1, further comprising a confidence value and a threshold value, wherein a confidence value is calculated by said analysis subsystem, and wherein said confidence value is compared to said threshold value.

19. The system of claim 1, further comprising a pattern matching heuristic, wherein said analysis subsystem identifies a marker location with said pattern matching heuristic.

20. The system of claim 1, further comprising a sample size adjustment, wherein said analysis subsystem identifies said marker using said sample size adjustment.

21. The system of claim 1, further comprising a scaling adjustment, wherein said analysis subsystem identifies said marker using said scaling adjustment.

22. A system for analyzing data comprising: a data subsystem, including a plurality of data types and plurality of files comprising a plurality of data points, wherein each said data point is associated with at least one said data type; a pattern subsystem, including a plurality of events, a plurality of input characteristics, and a plurality of template data points, wherein said events are defined from said input plurality of characteristics, and wherein at least two said template data points are associated with each said event; and a search subsystem, including a plurality of markers, wherein said search subsystem identifies a plurality of locations indicative of said events within said data subsystem, and wherein said search subsystem places said markers at said locations.

23. The system of claim 22, further comprising an analysis module including an analysis calculation, wherein said analysis calculation is generated from said data points at said location of said marker.

24. The system of claim 22, said search subsystem further including a search criteria and a search result, wherein said plurality of files are searched for said search criteria, and wherein said search result includes at least two files.

25. The system of claim 24, wherein said search criteria includes a plurality of data types.

26. The system of claim 22, further comprising a reporting tool, a configurable report, and a plurality of locations, wherein said search subsystem automatically generates a plurality of markers at said plurality of locations, and wherein said reporting tool automatically generates said configurable report from said data points at said plurality of locations.

27. The system of claim 22, further comprising a plurality of confidence values and a plurality of locations, wherein at least one said confidence value is generated at each said location.

28. The system of claim 22, further comprising a confidence value, a threshold value, and an analysis module, wherein an analysis module generates said confidence value from at least one marker, and wherein said analysis module compares said confidence value to said threshold value.

29. The system of claim 22, further comprising a fit sensitivity, wherein said fit sensitivity is automatically adjusted for the number of said template data points associated with said event.

30. The system of claim 22, further comprising an analysis calculation, and a marker sort heuristic, wherein said calculation is performed with said marker sort heuristic.

31. The system of claim 22, further comprising a target value weighing heuristic, wherein said system performs said target value weighing heuristic to identify said event.

32. The system of claim 22, further comprising a marker control, wherein said marker control provides for moving said location.

33. A method for locating an event in a data file, comprising: defining a plurality of events in terms of a plurality of template data points; accessing a file of data points; and identifying the location of said events in said file without human intervention by comparing said template data points with said data points.

34. The method of claim 33, further comprising: loading said data points into a data array; storing said template data points into a pattern array; calculating a correlation value between said data array and said pattern array; and placing one or more markers at a file location of the data points.

35. The method of claim 33, further comprising generating a confidence value to represent the degree of confidence in the identification of at least one location of at least one event.

36. The method of claim 35, further comprising comparing said confidence value to a threshold value.

37. The method of claim 33, further comprising performing a user-defined analysis heuristic at said marker location without human intervention.

38. The method of claim 33, further comprising adjusting the time scaling of the template data points stored in said pattern array without human intervention.

39. The method of claim 33, further comprising adjusting a fit sensitivity value used in calculating a correlation value, wherein said fit sensitivity value is based on the number of data points associated with the event.

40. The method of claim 33, further comprising loading an excluded array to exclude template data points and data points from the comparison of template data points and data points.

41. The method of claim 33, further comprising: selecting an operating mode from one of automatic, batch, and manual; and invoking predetermined processing rules in accordance with the selected operating mode.

42. The method of claim 33, further comprising selecting two or more channels from a plurality of channels.

43. The method of claim 33, further comprising setting an always rematch flag to yes.

44. The method of claim 33, further comprising invoking a pattern recognition heuristic.

45. The method of claim 33, further comprising setting a user skill level.

46. The method of claim 33, wherein a number of channels for determining each marker placement is set by the user to a value greater than 3.

47. The method of claim 33, further comprising automatically recalculating a calculation at the marker location using updated pattern information.

48. The method of claim 33, further comprising performing a target value heuristic.

49. The method of claim 33, further comprising sorting a plurality of files based on an overall confidence value.

50. The method of claim 33, further comprising identifying a user expertise level.

51. The method of claim 33, further comprising calculating a weight factor, wherein said weight factor is used to identify the plurality of locations.

52. The method of claim 33, further comprising performing a pattern size adjustment heuristic.

53. The method of claim 33, further comprising sorting said plurality of markers.

Description:

BACKGROUND OF INVENTION

[0001] The invention relates generally to systems and methods for analyzing data. More specifically, the invention relates to systems and methods (collectively “the system”) for automatically identifying an “event” in a data file.

[0002] Advances in information storage technology provide individuals, businesses, universities, think tanks, government agencies, research institutions, hospitals, and other types of organizations and entities with entirely new sets of challenges and opportunities. As the cost of data storage decreases, the amount of data being stored increases. However, the ability to effectively and efficiently analyze data has not kept up with the technology of capturing and storing data. It would be desirable for analysis tools to possess user-friendly interfaces that are easy to use, and yet are also computationally robust and comprehensive. It would be desirable for an analysis tool to be both highly automated and highly configurable.

[0003] The abundance of data provides an as yet untapped opportunity to look for patterns and to perform various statistical analyses relating to those patterns. The undiscovered patterns in existing data could be the source of future insights in engineering, economics, medicine, computer science, business, and other fields. The ability to identify and explore statistical correlations and other data relationships can be the key to effective problem solving and optimization. One significant barrier to such analysis is the inability to find the desired data when it is needed. It would be desirable for a data analysis system to include some type of search tool to better facilitate access to meaningful data. It would be desirable for such a search tool to include the ability to perform statistical correlations in identifying particular events or patterns.

[0004] Pursuing data analysis from the ground up is a difficult task. Persons with subject matter expertise often do not have an educational background in statistics, and persons with a background in statistics often lack sufficient knowledge of the relevant subject matter. It would be desirable if a data analysis system could encapsulate patterns of data in the form of events. Persons with subject matter expertise could then define the events that are of interest from a subject matter perspective, and apply automated statistical tools to those events (e.g. patterns of data). The ability to place “markers” in data files to mark the occurrence of various events, and to perform automated processing based on those markers, would also be desirable.

[0005] The inability of an analysis system to search data files for user-defined events in an automated way impedes the ability to conduct data analysis in a timely and efficient manner. There are many obstacles to such an automated system. The prior art does not appear to provide an analysis tool that provides comprehensive correlation and other statistical tools in a way that promotes both meaningful flexibility and significant automation. When dealing with data in the time domain, data is typically stored at various different scales, so it would be desirable to adjust the scaling of data so that an “apples to apples” comparison can be made. Similarly, statistical fitting heuristics will possess different sensitivities given different sample sizes. Thus, it would be desirable for an analysis system to make corresponding adjustments in an automated fashion. The ultimate use of statistical analysis requires some sense of how strong the end results are. It would be desirable for an automated analysis system to provide some measure of a confidence value in the output of the system.

[0006] The prior art does not appear to disclose or even suggest that an analysis system can be both highly configurable and substantially automated. The goals are in direct conflict with each other, with an apparent zero-sum game being the end result. Those well rooted in data analysis typically sacrifice automation and ease of use to facilitate comprehensive functionality. Those focused on the subject matter at hand typically sacrifice flexibility and comprehensiveness to obtain the goals of ease of use and automation. Persons focused on the subject matter often fail to realize the analytical tools that at least theoretically exist. The possibility of a tool that is easy to use, highly configurable, and computationally comprehensive is not suggested in the art.

SUMMARY OF INVENTION

[0007] The invention is a system and method (collectively “the system”) for automatically identifying the location of an “event” in a data file.

[0008] A user of the system can associate various patterns of data with events, e.g. patterns of data can be labeled on the basis of what the pattern represents. The system can then look for the pattern in other data files, placing various “markers” in the various data files at the locations in which the pattern is found. Various correlation heuristics can be used to evaluate whether the analyzed data resembles the template data used to represent the event. Markers can be automatically placed at such locations. The marker location determinations can incorporate various weight factors set by the user, or by the system itself. Multiple markers can be used to mark a sequence of events, and the accuracy of marker placement can be optimized in the aggregate, across all marker locations.
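By way of illustration only, the marker placement described above might be sketched as follows. The sketch assumes a plain Pearson correlation computed over a sliding window, a fixed threshold of 0.9, and a local-maximum rule for choosing one marker per event; these choices, and the names pearson and place_markers, are hypothetical and are not the correlation heuristics of the figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def place_markers(data, template, threshold=0.9):
    """Return (index, score) pairs where a window of `data` resembles `template`.

    The threshold and the local-maximum rule are illustrative assumptions.
    """
    width = len(template)
    scores = [pearson(data[i:i + width], template)
              for i in range(len(data) - width + 1)]
    markers = []
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else -1.0
        right = scores[i + 1] if i + 1 < len(scores) else -1.0
        # Keep only local maxima above the threshold so one event yields one marker.
        if score >= threshold and score >= left and score > right:
            markers.append((i, score))
    return markers


# Template data for a hypothetical "spike" event.
template = [0.0, 1.0, 3.0, 1.0, 0.0]
# Analysis data containing the event twice, once at double amplitude.
data = [0.1, 0.0, 0.1, 0.0, 2.0, 6.0, 2.0, 0.0,
        0.1, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.1]
markers = place_markers(data, template)
```

Because correlation is insensitive to amplitude, the double-height occurrence in this example is matched as readily as the exact copy. A weight factor of the kind mentioned above could be applied to each score before the threshold comparison.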

[0009] Automated analysis can later be performed at those marker locations. Such automation can take into consideration the type of marker, and other characteristics. In embodiments involving time domain data, the automated time scaling of data can be performed. In some embodiments, markers can be subject to various sorting algorithms and confidence value calculations. Some systems may incorporate the automatic adjustment of fit sensitivity to compensate for differences in the size of data patterns and relevant portions of data.
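As one possible illustration of time scaling, a pattern recorded at one sample interval can be resampled onto the sample interval of the data being searched before any comparison is made. The sketch below assumes uniform sampling and linear interpolation; the function name rescale_pattern and its parameters are hypothetical, and the time scaling heuristic of the system may differ.

```python
def rescale_pattern(pattern, pattern_dt, data_dt):
    """Resample `pattern` (one sample every pattern_dt seconds) onto a grid
    with one sample every data_dt seconds, using linear interpolation."""
    if len(pattern) < 2:
        return list(pattern)
    last = len(pattern) - 1
    duration = last * pattern_dt
    count = int(round(duration / data_dt)) + 1
    resampled = []
    for k in range(count):
        # Position of the k-th output sample, measured in input samples.
        pos = min(k * data_dt / pattern_dt, last)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        resampled.append(pattern[lo] * (1 - frac) + pattern[hi] * frac)
    return resampled


# Pattern sampled every 0.02 s, data sampled every 0.01 s (assumed values).
pattern = [0.0, 2.0, 4.0, 2.0, 0.0]
scaled = rescale_pattern(pattern, pattern_dt=0.02, data_dt=0.01)
```

After rescaling, the pattern spans the same duration but has roughly twice as many points, so a point-by-point comparison with the data compares like with like.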

[0010] The system can be implemented through various combinations of subsystems. A data subsystem can be used to capture and store data, both the sample data to be tested and the pattern data of user-defined events. An interface subsystem can be used to facilitate the configurability of the system, allowing the user to select from many options and to submit data of all types to the data subsystem. The interface subsystem can allow the user to define new events based on data found in the data subsystem. An analysis subsystem can identify sample data in the data subsystem that resembles the pre-defined pattern data (e.g. template data for the event). The analysis subsystem can place markers at file locations where a pattern is detected, and invoke an automated analysis at those marker locations.

BRIEF DESCRIPTION OF DRAWINGS

[0011] The embodiments of a data analysis system and method (collectively the “system”) will be described in detail, with reference to the following figures:

[0012] FIG. 1 is an environmental diagram illustrating an example of some of the elements that can be incorporated into the system.

[0013] FIG. 2 is a block diagram illustrating an example of the elements used by the system to automatically identify event locations in a data file.

[0014] FIG. 3 is a flowchart illustrating one example of a process flow that includes event identification.

[0015] FIG. 4 is a flowchart illustrating one example of a process flow that includes marker placement.

[0016] FIG. 5 is a block diagram illustrating an example of a subsystem-level view of the system.

[0017] FIG. 6 is a block diagram illustrating a different example of a subsystem-level view of the system.

[0018] FIG. 7 is a block diagram illustrating an example of a module-level view of the system.

[0019] FIG. 8 is a block diagram illustrating an example of a component-level view of the system.

[0020] FIG. 9 is a block diagram illustrating an example of the logical operations that can be performed by the analysis component.

[0021] FIG. 10 is a block diagram illustrating an example of the data/control flow of the analysis component.

[0022] FIG. 11 is an equation illustrating one example of a correlation heuristic that can be incorporated into the system.

[0023] FIG. 12 is an equation illustrating one example of a correlation heuristic that can be incorporated into the system.

[0024] FIG. 13 is a flow chart illustrating one example of a marker sort heuristic.

[0025] FIG. 14 is a data chart illustrating one example of evaluation of a marker sort heuristic that can be incorporated into the system.

[0026] FIG. 15 is a line graph corresponding to the example of a data chart in FIG. 14.

[0027] FIG. 16 is a flow chart illustrating an example of the process steps for a user to interact with the system in a manual mode.

[0028] FIG. 17 is a flow chart illustrating an example of a time scaling heuristic that can be incorporated into the system.

[0029] FIG. 18 is a flow chart illustrating an example of a heuristic for adjusting a data configuration.

[0030] FIG. 19 is an example of a file setting interface that can be incorporated into the system.

[0031] FIG. 20 is an example of a data operations interface that can be incorporated into the system.

[0032] FIG. 21 is an example of a pattern edit interface that can be incorporated into the system.

[0033] FIG. 22 is an example of a marker parameters interface that can be incorporated into the system.

[0034] FIG. 23 is an example of a main interface or “home” interface that can be incorporated into the system.

[0035] FIG. 24 is an example of a file format that can be used by the system to display the results of an analysis.

[0036] FIG. 25 is an example of a file search interface that can be incorporated into the system.

[0037] FIG. 26 is an example of a configuration selection interface that can be incorporated into the system.

[0038] FIG. 27 is an example of a file header interface that can be incorporated into the system.

[0039] FIG. 28 is an example of a “pattern-match channels” interface that can be incorporated into the system.

[0040] FIG. 29 is an example of a data operations interface that can be incorporated into the system.

[0041] FIG. 30 is an example of a data normalization interface that can be incorporated into the system.

[0042] FIG. 31 is an example of a marker parameter interface that can be incorporated into the system.

[0043] FIG. 32 is an example of a pattern files interface that can be incorporated into the system.

[0044] FIG. 33 is an example of an analysis selection interface that can be incorporated into the system.

[0045] FIG. 34 is an example of a menu interface that can be incorporated into the system.

[0046] FIG. 35 is an example of a graph displaying data channels that can be incorporated into the system.

[0047] FIG. 36 is an example of a data channels control interface that can be incorporated into the system.

[0048] FIG. 37 is an example of a graph control interface that can be incorporated into the system.

[0049] FIG. 38 is an example of an axis control interface that can be incorporated into the system.

[0050] FIG. 39 is an example of a file statistics interface that can be incorporated into the system.

[0051] FIG. 40 is an example of control buttons that can be incorporated into a “manual” mode of the system.

[0052] FIG. 41 is an example of control buttons that can be incorporated into an “automatic” or “batch” mode of the system.

[0053] FIG. 42 is an example of a markers interface that can be incorporated into the system.

[0054] FIG. 43 is an example of a marker mover control that can be incorporated into the system.

[0055] FIG. 44 is an example of an analysis results display that can be incorporated into the system.

[0056] FIG. 45 is an example of a file documentation interface that can be incorporated into the system.

[0057] FIG. 46 is an example of a comment interface that can be associated with the file documentation interface in FIG. 45.

[0058] FIG. 47 is an example of a file path interface that can be incorporated into the system.

[0059] FIG. 48 is an example of a confidence graph interface that can be incorporated into the system.

DETAILED DESCRIPTION

I. Introduction of Elements

[0060] FIG. 1 is an environmental diagram of a data analysis system and method (collectively the “system”) 100. The system 100 provides a highly configurable and yet highly automated mechanism to analyze data.

[0061] A. Physical Data Source

[0062] A physical data source 102 can be potentially any object, measurement, or phenomenon in the physical world outside the system 100. The physical data source 102 can be the transmission system of a truck, a Petri dish filled with bacteria, unemployment numbers in the state of Michigan, a combustion engine, resistance in a wind tunnel, the radioactivity of a substance, the luminosity of a light source, earthquake frequency and/or severity, or any other characteristic that can be measured or represented in a quantitative manner. The system 100 does not require a physical data source 102 in order to function. Data can be imported to the system 100 from an extended chain of sources. Data can also be created by the system 100 itself. In some embodiments, the analysis of the system 100 is performed on simulated data, and such data has no physical data source 102. In some embodiments, data may be associated with a data type (a type of characteristic described by the numerical value) such as velocity, force, luminosity, revenue, or other attribute, even though the data was not captured from a physical data source 102. Pure numerical data without any data type attributes can also be processed by the system 100.

[0063] B. Sensor

[0064] A sensor 104 is potentially any device capable of capturing measurable characteristics from the physical data source 102. Sensors 104 can be cameras, motion detectors, radar, infrared beams, Geiger counters, and any other device capable of being used to generate data relating to the physical data source 102. Just as the system 100 does not require the existence of a physical data source 102, the existence of the sensor 104 is similarly optional. The system 100 can incorporate many different sensors 104 and many different combinations of sensors 104.

[0065] C. Channels and Data Types

[0066] A sensor 104 can capture data relating to the physical data source 102 through one or more channels 106. Channels 106 are potentially any type of characteristic that is captured from the physical data source 102 through the sensor 104. Force, speed, position, torque, acceleration, kinetic energy, luminosity, heat, friction, blood pressure, lifespan, gestation period, revenues, IQ, and any other type of measurement can be a channel 106 incorporated into the system 100. Some channels 106 are directly measured by the sensor 104 (“direct channels” 106), while other channels 106 are derived from other channels 106 (“derived channels” 106). Kinetic energy is an example of a derived channel 106, because kinetic energy is calculated from the values of mass and velocity.
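The derived channel in the kinetic energy example can be illustrated with the standard relation KE = (1/2)mv². The sketch below assumes a constant mass in kilograms and a velocity channel in meters per second; the function name and the sample values are hypothetical.

```python
def kinetic_energy_channel(mass_kg, velocity_channel):
    """Derive a kinetic energy channel (joules) from a velocity channel (m/s),
    assuming a constant mass: KE = 0.5 * m * v**2."""
    return [0.5 * mass_kg * v ** 2 for v in velocity_channel]


velocity = [0.0, 1.0, 2.0, 3.0]                  # direct channel (assumed values)
energy = kinetic_energy_channel(2.0, velocity)   # derived channel
```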

[0067] The system 100 can accommodate a wide variety of different channels 106. Embodiments of the system 100 that do not involve sensors 104 or physical data sources 102 will not involve channels 106. Other embodiments could utilize a far greater number of channels 106. In an embodiment of the system 100 used for transmission systems, the channels of force, speed, position, and torque are the preferred channels 106.

[0068] Channels 106 are not a required element for the functioning of the system 100. In a one-channel 106 embodiment, it may not matter to the system 100 what type of data the channel 106 represents. In a pure statistical embodiment, there is no channel 106 and the numbers are processed as numbers without any association or relationship to a type of measurement. However, channels 106 can be helpful in an analysis because channels 106 can have relationships with each other. For example, position, velocity, and acceleration are related measures.
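The relationship among position, velocity, and acceleration can be illustrated by deriving the latter two from a position channel. The sketch below assumes uniformly sampled data and a simple forward difference; the function name differentiate and the sample values are hypothetical, and other numerical differentiation schemes could equally be used.

```python
def differentiate(samples, dt):
    """Forward-difference derivative of uniformly sampled data.

    The result has one fewer element than the input.
    """
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


position = [0.0, 1.0, 4.0, 9.0, 16.0]         # sampled every 1.0 s (assumed)
velocity = differentiate(position, 1.0)       # derived from position
acceleration = differentiate(velocity, 1.0)   # derived from velocity
```

In this example the position grows quadratically, so the derived acceleration channel is constant.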

[0069] A single sensor 104 can capture sensor data across multiple different channels 106 in a substantially simultaneous manner. Thus, the different channels 106 may have relationships with each other. For example, a force channel and a torque channel may relate to the same physical data source 102 over the same increments in time.

[0070] Channels 106 correspond to data types. Embodiments of the system 100 that do not involve physical data sources 102 or sensors 104 can process data associated with various data types instead of channels 106. Force, speed, position, torque, acceleration, kinetic energy, luminosity, heat, friction, blood pressure, lifespan, gestation period, revenues, IQ, and any other type of measurement can be a data type associated with a numerical value that is processed by the system 100. In some embodiments of the system 100, data need not be associated with a channel 106 or data type.

[0071] D. Data Collection Component

[0072] A data collection component 107 is typically a grouping of the sensor 104 capturing data relating to the physical data source 102 through one or more channels 106. In some embodiments, the data collection component 107 is a simulation component that generates data from a model of a physical data source 102, but not an actually existing physical data source 102. Some embodiments of the system 100 do not have a data collection component 107.

[0073] E. Data Storage Component

[0074] A data storage component 108 is the aggregation of all data storage mechanisms that can provide data to the system 100. The data storage component 108 can be organized in many different ways. Databases, flat files, object-oriented programming objects, arrays, other types of data structures, and potentially any mechanism or method for storing data can be the data storage component 108 or part of the data storage component 108. The data storage component 108 can reside on CD-ROMs, floppy disks, hard drives, web servers, proxy servers, or potentially any other type of information technology mechanism.

[0075] There are typically two types of data stored by the system 100. One type is template data. Template data can also be referred to as pattern data or baseline data. The other type of data is analysis data. Analysis data can also be referred to as sample data, test data, or simply “data.” In many embodiments, the distinction between analysis data and template data is a purely contextual or situational distinction. Template data can be analysis data from a previous analysis. Once analysis data has been analyzed, it can serve as template data for future analysis. Thus, there need not be a physical distinction between analysis data and template data, or between an analysis data storage 109 component and a template data storage 114 component. Any data that is to be searched in a particular context is data that resides in the analysis data storage 109 component.

[0076] If the data collection component 107 is part of the system 100, it is desirable that the data storage component 108 have an effective way to receive the data of the data collection component 107. A wide variety of different configurations are possible.

[0077] F. Analysis Data Storage

[0078] Analysis data is the data stored in an analysis data storage 109 component. Physically, the analysis data storage 109 component can vary as widely as the data storage component 108. Analysis data is the data being analyzed by the system 100. Analysis data can also be referred to as test data, actual data, physical data, sample data, or simply “data.” The system 100 analyzes analysis data using template data, described below.

[0079] G. Analysis Data Files

[0080] The analysis data storage 109 is made up of one or more files 110 . Files 110 can also be referred to as analysis data files, actual data files, test data files, physical data files, or sample data files. In preferred embodiments, files 110 may be incorporated into a hierarchy of folders and directories. In preferred embodiments, files 110 are in pre-defined and user-desired formats set by the administrator of the system 100 . The uniform formatting of files 110 enhances the opportunity to facilitate the automated processing of the system 100 .

[0081] H. Analysis Data Points

[0082] Each file 110 can be made up of one or more data points 112 . Data points 112 can also be referred to as analysis data points, actual data points, test data points, physical data points, or sample data points. Data points 112 have a numerical value associated with each data point. In a preferred embodiment, data points 112 are also affiliated with at least one channel 106 or data type. A single data file 110 can have data points 112 associated with multiple different channels 106 or multiple different data types.
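The relationship between files, data points, and channels described above can be pictured with a minimal sketch. The class names `DataPoint` and `DataFile`, the file name, and the channel names are hypothetical and chosen only for illustration; the specification does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field

@dataclass
class DataPoint:
    value: float      # numerical value of the data point
    channel: str      # channel or data type the point is affiliated with

@dataclass
class DataFile:
    name: str
    points: list = field(default_factory=list)

    def channels(self):
        """Return the distinct channels present in this file."""
        return sorted({p.channel for p in self.points})

    def values_for(self, channel):
        """Return the values of one channel, in stored order."""
        return [p.value for p in self.points if p.channel == channel]

# Hypothetical file holding points from two channels.
example = DataFile("run_001", [
    DataPoint(1.0, "engine_speed"),
    DataPoint(2.5, "line_pressure"),
    DataPoint(1.2, "engine_speed"),
])
```

Because the distinction between analysis data and template data is contextual, the same structure could hold either kind of data.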

[0083] I. Template Data Storage

[0084] A template data storage 114 component can have as many different physical, logical, and functional variations as the analysis data storage 109 component or the aggregate data storage component 108 . The template data storage 114 component can also be referred to as a pattern data storage 114 component or an event data storage 114 component. In some embodiments, template data storage 114 is a subset of analysis data storage 109 .

[0085] J. Template Data File

[0086] The template data storage 114 is made up of one or more template files 116 . Template data files 116 can also be referred to as template files, event files, event data files, pattern files, and pattern data files. In preferred embodiments, template files 116 are incorporated into a hierarchy of folders and directories. In a preferred embodiment, template files 116 are created and stored in a pre-defined and user-desired format set by the administrator of the system 100 . Adherence to standardized formatting facilitates the opportunity for system 100 automation.

[0087] The distinction between an analysis data file 110 and a template data file 116 is purely contextual in many embodiments. Template data files 116 can be analysis data from a previous analysis. Once analysis data has been analyzed, it can serve as template data for future analysis. Thus, there need not be a physical distinction between analysis data files 110 and template data files 116 . In a preferred embodiment, all data in the data storage component 108 is equally accessible and stored in a non-segregated manner. In some contexts, template data files 116 are simply a subset of data files 110 .

[0088] K. Template Data Points

[0089] Each template file 116 can be made up of one or more template data points 120 . Template data points 120 can also be referred to as pattern data points 120 or event data points 120 . Template data points 120 have a numerical value associated with each data point. In a preferred embodiment, template data points 120 are also affiliated with at least one channel 106 or data type. A single template data file 116 can have template data points 120 associated with multiple different channels 106 or multiple data types.

[0090] The distinction between analysis data points 112 and template data points 120 is, in many embodiments, a purely contextual or situational distinction. Template data points 120 can be analysis data points 112 from a previous analysis. Once analysis data has been analyzed, it can serve as template data for future analysis. Thus, there need not be a physical distinction between analysis data points 112 and template data points 120 . In a preferred embodiment, all data in the data storage component 108 is equally accessible and stored in a non-segregated manner. In some contexts, template data points 120 are simply a subset of data points 112 and can be referred to as data points 112 .

[0091] Template data points 120 and analysis data points 112 can represent time domain data points. Time domain data points can be used to analyze data over time. This can be useful in many different embodiments of the system 100 , including vehicle transmission embodiments where the time domain data relates to a shift in the gears of the transmission of the vehicle.

[0092] L. Events

[0093] An event 118 is a pattern of template data points 120 . Thus, events 118 can also be referred to as patterns 118 . Patterns 118 can have as few as one template data point 120 , or as many template data points 120 as desired. The number of template data points 120 in a pattern 118 can have repercussions with respect to fit sensitivity. Thus, it can be desirable to use different sized patterns 118 in different contexts. Patterns 118 are preferably defined by a user 122 (i.e. patterns 118 are preferably user-defined). A pattern size adjustment heuristic described in greater detail below can be used to adjust for differences in pattern size.

[0094] The system 100 can perform a wide variety of different search and correlation heuristics. In a preferred embodiment, the system 100 calculates a correlation or fit between one or more patterns 118 and the analysis data points 112 . Patterns 118 can be created from past analysis data points 112 , patterns 118 can be derived from data captured from physical data sources 102 , or patterns 118 can be made up by the user 122 without any relationship to a physical data source 102 . Patterns 118 can include template data points 120 relating to multiple channels 106 . A single template file 116 can have multiple patterns 118 .
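As an illustration of calculating a correlation or fit between a pattern 118 and analysis data points 112 , the following is a minimal sketch that assumes a single channel and a Pearson correlation coefficient computed at every offset of the analysis data. The specification does not prescribe a particular correlation formula, and the function names `pearson` and `sliding_fit` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def sliding_fit(data, pattern):
    """Return (offset, correlation) for every pattern-sized window of the data."""
    m = len(pattern)
    return [(i, pearson(data[i:i + m], pattern))
            for i in range(len(data) - m + 1)]

# Made-up analysis data containing one clean occurrence of the pattern.
data = [0, 0, 1, 3, 1, 0, 0, 0, 2, 1, 0]
pattern = [1, 3, 1]
fits = sliding_fit(data, pattern)
best_offset, best_corr = max(fits, key=lambda f: f[1])
```

In this sketch the window starting at offset 2 reproduces the pattern exactly, so it receives the highest correlation and would be the candidate location for a marker 134 .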

[0095] M. Computer System

[0096] The data storage component 108 interfaces with a computer system 128 . The computer system 128 can be a wide variety of different information technology devices and/or networks. Desktop computers, laptop computers, work stations, local area networks (LANs), wide area networks (WANs), servers, web pages, main frame computers, mini computers, and other devices can serve as computer systems 128 for the system 100 . In some embodiments of the system 100 , the computer system 128 and an interface device 124 are the same device. The computer system 128 houses much if not all of the programming logic needed to support the functionality of the system. The computer system 128 may directly house the data storage component 108 , or may merely interface with the data storage component 108 . The computer system 128 can include a wide variety of ancillary, supplemental, or related information technology devices.

[0097] N. Interface Device

[0098] An interface device 124 is any device capable of communicating with the computer system 128 . Personal digital assistants (PDAs), cell phones, satellite pagers, standard telephones with a computer system 128 employing voice recognition technologies, desktop computers, laptop computers, work stations, local area networks (LANs), wide area networks (WANs), servers, web pages, main frame computers, mini computers, and other devices can serve as interface devices 124 for the system 100 . In some embodiments, the computer system 128 and the interface device 124 are the same device.

[0099] O. User

[0100] A user 122 is typically a human being interacting with the system 100 . In some embodiments, the user 122 can be an expert system, a robot, a neural network, or a computer employing artificial intelligence (collectively “intelligence technologies”). Users 122 interact with the system 100 by interacting with the computer system 128 . Users 122 interact with the computer system 128 by interacting with the interface device 124 . The interactions between the user 122 and the interface device 124 take the form of interface characteristics 126 . In preferred embodiments, a skill level is associated with each user 122 . The system 100 preferably utilizes skill level information in enabling or disabling the functionality of the system 100 . Other embodiments may incorporate less sophisticated interfaces for users 122 .

[0101] P. Interface Characteristics

[0102] An interface characteristic 126 is any action by the user 122 using the interface device 124 or any output of the interface device 124 made available to the user 122 . Interface characteristics 126 can be either input characteristics or output characteristics.

[0103] Input characteristics are actions (or omissions) by the user 122 with respect to the interface device 124 . Typed text, a click of the mouse, a selection in a drop down list box, the pressing of a button, the selection of a menu item, the scanning in of a document, speech into a voice recognition technology, or the failure of a user 122 to provide any input are each examples of input characteristics.

[0104] Output characteristics are any actions or changes by the system 100 that are accessible or observable through the interface device 124 . Sounds; visual changes to the screen; printed reports; the recording of data on a CD, floppy drive, or other portable device; or any other form of output are examples of output characteristics.

[0105] The types of interface characteristics 126 that can be used by the system 100 are limited only by the types of characteristics 126 that can be processed by the interface device 124 .

[0106] Q. Search Tool

[0107] The computer system 128 uses a search tool 130 to find any desirable data points 112 (both analysis and template), data files 110 (both analysis and template), and events 118 . The system can incorporate a wide variety of different search tools 130 . The types of search tools 130 that can be used will depend on the types of files 110 and the architecture of the data storage component 108 . The search tool 130 performs an efficient search heuristic. In some embodiments, the search tool 130 can be used to perform a marker sort heuristic that sorts the markers 134 used to mark the location of the pattern 118 within the analysis data storage 109 component. The search tool 130 can thus be used to search for markers 134 as well as to sort them.

[0108] The process of searching and the functionality of the search tool 130 are discussed in greater detail below. The search tool 130 can include the searching functionality found in file management systems. Files 110 can be searched on the basis of title, date of last update, author, a description, a subset of data within the file, or some form of meta data relating to the file 110 . The search tool 130 can also include the functionality of identifying events 118 in the file 110 . Thus, the search tool 130 can perform a variety of pattern searching heuristics, and can include a correlation tool 132 . In a preferred embodiment of the search tool 130 , only the top N % of pattern fits are included in the search result. Although N can be any number from 0 through 100, a number such as 10 can be an effective way to focus on the key data.
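The restriction of a search result to the best N % of pattern fits can be sketched as follows. This is an illustration only: the function name `top_n_percent`, the representation of a fit as an (offset, score) pair, the made-up scores, and the choice to round the retained count up are all assumptions not found in the specification.

```python
import math

def top_n_percent(fits, n_percent):
    """Keep the best-scoring n_percent of (offset, score) fits.

    The retained count is rounded up, so any n_percent above 0 keeps at
    least one fit; n_percent of 0 keeps nothing.
    """
    if not fits or n_percent <= 0:
        return []
    keep = max(1, math.ceil(len(fits) * n_percent / 100))
    return sorted(fits, key=lambda f: f[1], reverse=True)[:keep]

# Ten made-up fits; with N = 10 only the single best fit survives.
fits = [(0, 0.12), (5, 0.97), (9, 0.45), (14, 0.88), (20, 0.30),
        (26, 0.91), (31, 0.05), (37, 0.62), (42, 0.77), (48, 0.19)]
best = top_n_percent(fits, 10)
top_three = top_n_percent(fits, 30)
```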

[0109] R. Correlation Tool

[0110] The computer system 128 can use a correlation tool 132 to compare events 118 to the library of data points 112 making up the analysis data storage 109 component. The correlation tool 132 can perform a correlation-based pattern matching heuristic. Different variations or modifications of correlation heuristics can be incorporated into the system 100 .

[0111] The fit sensitivity of such correlations can be adjusted on the basis of the size of the pattern. A fit sensitivity adjustment heuristic can perform such adjustments automatically. The correlation tool 132 can generate various confidence values associated with the identification of a correlation between an event 118 and data points 112 within the analysis data storage 109 component. The correlation tool 132 can also perform a target value weighing heuristic and an overall confidence value heuristic. Confidence values for each pattern match can be combined into an overall confidence value. Moreover, the confidence values of the correlation tool 132 can be compared to user-defined or system-defined minimum confidence thresholds. Determinations with insufficient confidence can be modified by the system 100 , or flagged for approval by the user 122 . Particular matches and even entire files 110 can be ignored when there is insufficient confidence to merit subsequent analysis.
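The comparison of confidence values against a minimum confidence threshold can be sketched as follows. The specification does not state how individual confidence values are combined into an overall value, so the arithmetic mean used here is purely an assumption, as are the function names, the event labels, and the dictionary layout of a match.

```python
def overall_confidence(values):
    """Combine per-match confidence values (arithmetic mean is an assumption)."""
    return sum(values) / len(values) if values else 0.0

def review_matches(matches, min_confidence):
    """Split matches into accepted ones and ones flagged for user approval."""
    accepted = [m for m in matches if m["confidence"] >= min_confidence]
    flagged = [m for m in matches if m["confidence"] < min_confidence]
    return accepted, flagged

# Hypothetical matches for two events in one file.
matches = [
    {"event": "shift_start", "offset": 120, "confidence": 0.75},
    {"event": "shift_end", "offset": 310, "confidence": 0.25},
]
accepted, flagged = review_matches(matches, min_confidence=0.5)
overall = overall_confidence([m["confidence"] for m in matches])
```

An embodiment could instead discard flagged matches, or ignore the whole file when the overall value falls below its own threshold.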

[0112] The process of correlating data points 112 with events 118 and other statistical processing are discussed in greater detail below.

[0113] S. Markers and Marker Locations

[0114] A marker 134 is an identifying label that the system 100 places at the location within a data file 110 that possesses data points 112 that correlate with a particular pattern 118 . A single marker 134 can cover many data points 112 and those data points 112 can relate to more than one channel 106 of information. In a preferred embodiment, multiple markers 134 are used in tandem and the confidence values associated with marker placement are optimized in the aggregate, for all of the markers 134 . Otherwise, the system 100 could make undesirable tradeoffs with respect to the placement of particular markers 134 .

[0115] Markers 134 can be used to trigger certain types of automated analysis. Markers 134 can encapsulate information relating to the marker 134 , as well as data points 112 at the identified marker location. Thus, in a preferred embodiment, markers 134 “know” what type of markers 134 they are, and they “know” certain characteristics of the underlying pattern 118 , such as the number and types of channels 106 . A fit adjustment heuristic can be used so that large numbers of markers 134 do not reduce the sensitivity because one marker 134 has a bad fit.
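One way to picture a marker 134 that “knows” its own type and the characteristics of its underlying pattern 118 is a small record type. The following is a sketch only; the class name `Marker`, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Marker:
    marker_type: str      # label of the event this marker identifies
    offset: int           # index of the first matched data point in the file
    length: int           # number of data points covered by the marker
    channels: tuple       # channels of the underlying pattern
    confidence: float     # confidence value for this placement

    def covered(self, values):
        """Return the data points at the marker location."""
        return values[self.offset:self.offset + self.length]

# Hypothetical marker covering three points of a single-channel file.
m = Marker("upshift", offset=2, length=3,
           channels=("engine_speed",), confidence=0.93)
segment = m.covered([0, 0, 1, 3, 1, 0])
```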

[0116] The process of setting markers 134 , sorting markers 134 , and performing analysis on the basis of markers 134 is discussed in greater detail below.

[0117] T. Analysis

[0118] Once markers 134 are in their marker locations, various forms of analysis 136 can be created by the system 100 . Analysis 136 can be created through both manual and automated means. The different types of analysis that can be performed are discussed in greater detail below.

[0119] U. Notes

[0120] Some embodiments of the system 100 provide users 122 with the ability to document a note 138 directly into the system 100 such that the note 138 documents markers 134 , files 110 , or particular activities and/or configurations. Notes 138 are described in greater detail below.

[0121] V. Reports

[0122] The system 100 can be configured to create a wide variety of different reports 140 . Some reports 140 are standardized, others are customized, and still others are ad-hoc, one-time-only reports. Reports 140 are discussed in greater detail below.

[0123] W. Pattern Array

[0124] In order to compare analysis data points 112 with template data points 120 , many embodiments of the system 100 will load the template data points 120 into a pattern array 142 (it can also be referred to as an event array) to facilitate processing. Pattern arrays 142 are discussed in greater detail below.

[0125] X. Data Array

[0126] In order to compare analysis data points 112 with template data points 120 , many embodiments of the system 100 will load the analysis data points 112 into an analysis array 144 (it can also be referred to as a data array, an actual data array, a physical data array, a test data array, a fit array, or a sample event array) to facilitate processing. Data arrays 144 are discussed in greater detail below.

[0127] Y. Confidence Value

[0128] A confidence value 146 can be associated with any correlation calculation, marker location, or even aggregate marker locations associated with a file 110 or folder. A minimum confidence threshold can be used to automatically eliminate markers and even entire files with aggregate confidence values 146 that are lower than the applicable minimum confidence threshold. In other embodiments, the comparison of threshold value to confidence value 146 can be used to trigger other activities, such as the seeking of user 122 approval, the ability of users 122 to manually move the markers 134 , or other processing options. The various minimum confidence thresholds can be user-defined, or set by the system 100 . The minimum confidence threshold can also be part of a marker configuration, as described below.

II. System Overview

[0129] FIG. 2 is a block diagram illustrating an example of the elements used by the system 100 to automatically identify event 118 locations in a data file 110 . Data points 112 in the data file 110 on the left side of FIG. 2 are evaluated for matches to the template data points 120 encapsulated in the one or more events 118 . Markers 134 can then be placed at those data point locations where matches to the one or more events 118 are found. The file 110 on the right side of FIG. 2 is an example of a file 110 with markers 134 at the locations of the pattern matches. Although not disclosed in FIG. 2 , the system 100 can identify event locations in multiple files 110 at the same time.

[0130] The process of automatically placing markers 134 without human intervention can utilize the inputs of one or more events 118 and a pattern tool 150 . In multiple event 118 embodiments, the system 100 can also incorporate the inputs of an optimization tool 154 and a marker configuration 152 .

[0131] The pattern tool 150 uses one or more events 118 as an input, and identifies marker 134 locations by performing a pattern matching heuristic on the file 110 . A pattern matching heuristic is potentially any process by which a pattern 118 can be identified. A wide variety of different pattern matching heuristics can be incorporated into the system 100 . In a preferred embodiment, the pattern matching heuristic is some type of correlation heuristic. In some embodiments, the pattern tool 150 is part of the search tool 130 . A subset of pattern tools 150 are correlation tools 132 .

[0132] Marker configurations 152 can be used by the system 100 when there are multiple markers 134 being fitted to the data 112 in a file 110 . The marker configuration 152 typically includes the number of markers 134 and labels or names for the particular events 118 . The marker configuration 152 can also include minimum confidence value thresholds, predefined types of subsequent analysis, a marker sequence (e.g. the order of the markers 134 ), variations with respect to data types and channels 106 , search criteria, a “top N %” selection, and other potentially relevant or useful characteristics.

[0133] An optimization tool 154 can be used to optimize marker 134 placements in situations involving multiple markers 134 . Thus, the marker configuration is a desirable input for the optimization tool. The optimization heuristic performed by the optimization tool 154 is described in greater detail below. The optimization tool 154 allows the system 100 to optimize the aggregate placement of markers 134 . Otherwise, the system 100 may make undesirable tradeoffs, such as improving marker 1 by X %, while reducing the desirability of one or more other markers 134 by a factor that exceeds X %.

[0134] If a weight adjustment is incorporated into the system 100 , the optimization tool 154 should take into account the weight adjustment in optimizing marker 134 placement. The optimization tool 154 can be part of the search tool 130 , the correlation tool 132 , or the pattern tool 150 .
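The tradeoff that aggregate optimization is meant to avoid can be illustrated with a small exhaustive search. This is a minimal sketch, not the optimization heuristic of the system 100 : it assumes that each marker 134 has a short list of candidate (offset, confidence) placements, that the marker sequence requires strictly increasing offsets, and that the aggregate score is the plain sum of confidences. The function name `optimize_markers` and the sample values are hypothetical.

```python
from itertools import product

def optimize_markers(candidates):
    """Choose one (offset, confidence) per marker, maximizing total confidence.

    candidates holds one list of (offset, confidence) tuples per marker,
    given in marker-sequence order.
    """
    best_combo, best_total = None, float("-inf")
    for combo in product(*candidates):
        offsets = [offset for offset, _ in combo]
        # Enforce the marker sequence: offsets must strictly increase.
        if any(a >= b for a, b in zip(offsets, offsets[1:])):
            continue
        total = sum(conf for _, conf in combo)
        if total > best_total:
            best_combo, best_total = combo, total
    return best_combo, best_total

candidates = [
    [(10, 0.90), (40, 0.95)],   # candidate placements for marker 1
    [(30, 0.85), (50, 0.40)],   # candidate placements for marker 2
]
placement, total = optimize_markers(candidates)
```

Taking the individually best placement for marker 1 (offset 40) would force marker 2 to its weak candidate at offset 50. The aggregate search instead accepts a slightly lower confidence for marker 1 in exchange for a much better placement of marker 2. An exhaustive search grows quickly with the number of markers, so a practical embodiment would likely need a more efficient heuristic.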

[0135] In the example illustrated in FIG. 2 , there are two rows of data points 112 , with each row associated with a different channel 106 and data type. In a multiple data type or channel 106 embodiment, markers 134 are preferably placed on a row-by-row basis and not a point-by-point basis. However, in alternative embodiments, the marker configuration 152 is sufficiently flexible to permit point-by-point marker 134 locations.

[0136] The system 100 can incorporate a wide variety of different heuristics in performing the functionality disclosed in FIG. 2 . A wide variety of different pattern matching heuristics can be incorporated into the system 100 . In some embodiments, each template data point 120 in the event 118 is used in the heuristic. In other embodiments, some template data points 120 can be excluded in a desire to ignore and exclude “white noise.” If the number of analysis points 112 is not equal to the number of template data points 120 , the extra data points can be removed from consideration. Pattern matching heuristics can be usefully invoked even if the channels 106 and data types making up the events 118 do not match the channels 106 and data types in the analysis data points 112 .

[0137] In preferred embodiments, the pattern matching heuristic(s) incorporated into the system 100 are some type of a correlation heuristic. A wide variety of different correlation calculations and correlation heuristics can be incorporated into the system 100 . An automated marker sort heuristic can be performed by the system 100 in order to better evaluate the “fit” of the selected marker 134 locations. Confidence values 146 can be generated for each individual marker location, as well as for the aggregate placement of all markers 134 .

[0138] In comparing analysis data 112 to template data 120 , the system 100 can invoke an automated time scaling heuristic. For time-domain data, the system 100 can adjust the time units/scale of either the analysis data 112 or the template data 120 so that the time units/scale of the analysis data 112 matches that of the template data 120 . The automated time scaling heuristic utilizes a scaling adjustment so that the system 100 is comparing “apples to apples” or “oranges to oranges.”
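A time scaling adjustment of this kind can be sketched as a resampling step. The following illustration assumes uniformly spaced samples and uses linear interpolation; both are assumptions, since the specification does not describe how the scaling adjustment is computed. The function name `resample` and the sample intervals are hypothetical.

```python
def resample(values, src_dt, dst_dt):
    """Linearly interpolate samples spaced src_dt apart onto a dst_dt grid."""
    if len(values) < 2:
        return list(values)
    duration = (len(values) - 1) * src_dt
    # The small epsilon guards against floating-point rounding in the division.
    count = int(duration / dst_dt + 1e-9) + 1
    out = []
    for k in range(count):
        pos = k * dst_dt / src_dt              # fractional index into values
        i = min(int(pos), len(values) - 2)     # left neighbor, clamped at the end
        frac = pos - i
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

# Template sampled every 10 ms, rescaled to match analysis data sampled every 5 ms.
pattern_10ms = [0.0, 2.0, 4.0]
pattern_5ms = resample(pattern_10ms, src_dt=10, dst_dt=5)
```

After resampling, the pattern and the analysis data share the same time step and can be compared point by point.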

[0139] In an effort to create characteristics for pattern matching that are of greater or particular interest to the user 122 , the system 100 can invoke a weighing heuristic. Different characteristics of the data points 112 can be weighed differently in determining marker locations. The parameters of the weighing heuristic can be part of the marker configuration 152 . Multiple weight factors can be applied to a single sequence of events 118 . Multiple weight factors can be applied to a single event 118 .
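One simple form a weighing heuristic could take is a weighted average of per-channel fit scores. This is a sketch under stated assumptions: the specification does not define the weighing formula, and the function name `weighted_fit`, the channel names, and the default weight of 1.0 are hypothetical.

```python
def weighted_fit(channel_scores, weights):
    """Weighted average of per-channel fit scores.

    Channels without an explicit weight default to 1.0 (an assumption).
    """
    total_weight = sum(weights.get(ch, 1.0) for ch in channel_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights.get(ch, 1.0)
                       for ch, score in channel_scores.items())
    return weighted_sum / total_weight

# Hypothetical per-channel fit scores and user-chosen weights.
scores = {"engine_speed": 0.5, "line_pressure": 1.0}
weights = {"engine_speed": 3.0, "line_pressure": 1.0}
combined = weighted_fit(scores, weights)
```

Here the weaker fit on the heavily weighted channel pulls the combined score down to 0.625, below the unweighted average of 0.75.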

[0140] The heuristics and processes discussed above and below can be invoked and performed by various subsystems and modules making up the system 100 .

III. System-Level Processing

[0141] The system 100 is highly configurable, and can incorporate a wide variety of different process flows for automatically identifying events 118 in one or more data files 110 .

[0142] A. Process Flow 1

[0143] FIG. 3 is a flow chart illustrating one example of the process flow that can be implemented by the system 100 .

[0144] At 160 , a user 122 defines an event 118 using template data points 120 . Template data points 120 can be imported from a file or a network. Template data points 120 can be inputted by the user 122 through the interface device 124 . Template data points 120 can be analysis data points 112 from a prior analysis. Events 118 can include a wide variety of different data types and channels 106 . Events 118 can also be organized into sequences or configurations of events 118 that correspond to the marker sequences or configurations 152 discussed above.

[0145] At 162 , the user 122 accesses the analysis data storage 109 component to identify potential matches of one or more events 118 . Various tools and heuristics can be used by the system 100 to access files 110 . In some embodiments, after a file is accessed, the relevant data points 112 are loaded into a data array 144 and the template data points 120 into a pattern array 142 . The data array 144 and pattern array 142 can be used to calculate a correlation value between the analysis data points 112 and template data points 120 . The analysis data points 112 selected by the user 122 can be associated with many different channels 106 and data types, or can be defined as pure numbers not associated with any data type.

[0146] At 164 , the one or more events 118 in the data file 110 are located. This step can be performed in a fully automated manner. In some embodiments involving minimum confidence values or other threshold values, certain contexts can be flagged for the purpose of manual intervention by the user 122 .

[0147] Markers 134 can be placed at the appropriate locations within the file 110 or files 110 containing the data points 112 . Numerous heuristics discussed above and below can be incorporated into the step at 164 . A weighting heuristic can incorporate particular adjustment weights to the pattern matching process. Events 118 in the time domain can be scaled automatically and without human intervention, to match the time scale of the analysis data points 112 , or vice versa. Various marker sort heuristics can be used to evaluate the fit of the markers 134 with the analysis data 112 , on a marker-by-marker basis as well as an overall aggregate basis. Confidence values 146 can also be calculated on a marker-by-marker or on an aggregate basis. A fit sensitivity heuristic can be used to adjust the fit sensitivity for the number of data points 112 associated with the event 118 . Various analysis heuristics can be automatically invoked at particular marker 134 locations. Such analysis heuristics can differentiate markers 134 on the basis of the characteristics of the marker configuration 152 . Other examples of relevant heuristics are described both above and below.

[0148] It may be desirable for a user to exclude certain data points 112 or certain template data points 120 for the step at 164 . For such embodiments, an excluded array can be loaded with the points to be excluded. Treating the excluded array as a matrix in linear algebra, linear algebra calculations could be used to remove the excluded array from the data array 144 and/or pattern array 142 .
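The effect of an excluded array can be sketched with a simple index mask. This is a simplification standing in for the linear algebra calculation mentioned above, and it assumes that the data window and the pattern have the same length and that exclusions are given as positions within that window. The function name `apply_exclusions` and the sample values are hypothetical.

```python
def apply_exclusions(data_array, pattern_array, excluded_indices):
    """Drop the same positions from both arrays so remaining points stay aligned."""
    excluded = set(excluded_indices)
    keep = [i for i in range(len(pattern_array)) if i not in excluded]
    return [data_array[i] for i in keep], [pattern_array[i] for i in keep]

# Position 1 of the window holds a spike the user wants ignored.
data_window = [1.0, 9.9, 3.0, 4.0]
pattern = [1.0, 2.0, 3.0, 4.0]
data_kept, pattern_kept = apply_exclusions(data_window, pattern,
                                           excluded_indices=[1])
```

The reduced arrays can then be passed to whatever pattern matching heuristic is in use.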

[0149] The processing at 160 , 162 , and 164 can take place in several different modes of operation. In a preferred embodiment, the processing mode can be either manual, automatic, or batch. These modes are described in greater detail below. Different processing rules can be associated with different operational modes. System 100 processing can also include the setting of a user skill level. In some embodiments, the user skill level can be associated with their login ID. In other embodiments, skill level can be self-selected. In still other embodiments, the system 100 can base skill level on the user's experience with the system 100 . Alternative embodiments may involve the objective evaluation of the user's skill by the system 100 .

[0150] B. Process Flow 2

[0151] FIG. 4 is a flowchart illustrating one example of a process flow that includes the placement of markers 134 at marker locations. In a preferred embodiment, multiple markers 134 are placed automatically by the system 100 without human intervention after the event 118 is selected and a determination to place markers is made.

[0152] At 166 , analysis data points 112 are accessed. In some embodiments, this step is performed after the step of event definition at 168 . The search tool 130 can be used to select the directories, folders, and files 110 to be accessed within the analysis data storage 109 component. In some embodiments, the analysis data points 112 are loaded into the data array 144 .

[0153] At 168 , one or more events 118 are either selected and/or defined on the system 100 . In some embodiments, the marker configuration 152 is also received by the interface device 124 at 166 . The type of pattern matching heuristic can preferably be selected at 168 , if the user 122 is free to choose one of multiple approaches. Pattern data points 120 can be loaded into the pattern array 142 for the purposes of comparing the data to the analysis data points 112 in the data array 144 . If desirable, the user 122 can be given the option of allowing the system 100 to adjust the subsequent pattern matching heuristic for the size of the pattern 118 . A fit sensitivity value for use in a fit sensitivity heuristic can be incorporated into the marker configuration 152 .

[0154] At 170 , the system 100 determines whether the data points 112 subject to the search match the event 118 or multiple events 118 defined at 168 . A wide variety of pattern matching heuristics can be used at 170 . In a preferred embodiment, some type of correlation heuristic is used. With multiple event 118 embodiments, all events 118 are preferably at least tentatively identified at 170 . Each event identification can be associated with a confidence value 146 calculated by the system 100 . An overall confidence value 146 can be calculated for determining the reliability of the aggregate event 118 identifications. The various confidence values 146 can be compared to threshold values set by the users 122 or by the system 100 . The template data and analysis data being compared at 170 need not possess the same number or types of channels 106 , although identical channels 106 can be desirable in certain circumstances. Thus, the number of channels 106 in the analysis data points 112 may be only two, while nine channels 106 exist within the definition of the event 118 . In some embodiments of the system 100 , the user 122 can set an “always rematch” flag to yes. In such embodiments, the system 100 can automatically trigger the pattern recognition to invoke the process at step 170 , and continue with the processing in FIG. 4 .

[0155] At 172 , the system 100 optimizes the placement of markers 134 at the tentatively identified marker locations. An optimization heuristic can be used to optimize the overall fit of the various markers 134 . Different weights can be attributed to different factors involved in the optimization process.

[0156] At 174 , the markers 134 are actually placed at the final optimized marker locations. The system 100 can generate various analysis 136 and reports 140 using the data encapsulated at the marker locations. The analysis 136 can include manual processes defined by the user 122 . Analysis 136 can also be generated automatically in accordance with pre-defined processing rules incorporated into the system 100 . Such processing can take into consideration the specific characteristics of the marker configuration 152 and the data points 112 encapsulated in the identified event 118 .

IV. Subsystem-Level View

[0157] FIG. 5 is a block diagram illustrating an example of a subsystem-level view of the system 100 . As shown in the diagram, each subsystem can directly interact with any other subsystem.

[0158] A. Interface Subsystem

[0159] An interface subsystem 200 is responsible for interactions between the user 122 and the system 100 . A wide variety of interface devices 124 can be used to perform the functions of the interface subsystem 200 . In preferred embodiments, the interface subsystem 200 includes a user-friendly graphical user interface that can be exercised through a platform-independent browser over the Internet. The interface subsystem 200 can also be referred to as an input subsystem, because the interface subsystem 200 is responsible for capturing all input characteristics.

[0160] The interface subsystem 200 can capture a wide variety of input characteristics, as identified above. The interface subsystem 200 can be used to create and define events 118 by selecting already existing data points 112 (including template data points 120 ), or by simply entering in new data points and data characteristics. Data can be downloaded or imported from other sources, using the interface subsystem 200 . Any activity that is guided, controlled, influenced, accessible, or viewable by the user 122 occurs through the interface subsystem 200 . Events 118 can include multiple data points 120 which span multiple channels 106 and multiple data types across multiple files 110 .

[0161] In some embodiments, users 122 are classified as beginners, intermediate users, or advanced users. In some such embodiments, the skill level is simply selected by the user 122 . In other embodiments, skill level can be associated with log-in information, while in still other embodiments, skill level could be objectively determined through experience with the system 100 . In all embodiments, the skill level determination will be based on one or more input characteristics. The skill level indicator can be used as the basis for enabling and/or disabling certain functionality and menu options.
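
The gating of menu options by skill level can be illustrated with a minimal sketch. The option names and the minimum level assigned to each option are hypothetical and are not taken from the disclosure; only the three skill classifications come from the text above.

```python
from enum import IntEnum


class SkillLevel(IntEnum):
    """The three user classifications named in the text."""
    BEGINNER = 1
    INTERMEDIATE = 2
    ADVANCED = 3


# Hypothetical menu options, each mapped to the minimum skill level
# assumed to be required before the option is enabled.
MENU_OPTIONS = {
    "open_file": SkillLevel.BEGINNER,
    "run_search": SkillLevel.BEGINNER,
    "edit_event": SkillLevel.INTERMEDIATE,
    "adjust_fit_sensitivity": SkillLevel.ADVANCED,
}


def enabled_options(level):
    """Return the names of the menu options enabled for a given skill level."""
    return sorted(name for name, minimum in MENU_OPTIONS.items() if level >= minimum)
```

Under these assumptions, a beginner would see only the two basic options, while an advanced user would see all four.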

[0162] The interface subsystem 200 can invoke other subsystems such that those other subsystems function automatically and without human intervention.

[0163] B. Data Subsystem

[0164] A data subsystem 202 is the subsystem that includes the data storage component 108 and the data collection component 107 . Thus, the data subsystem 202 can include a potentially voluminous number of data files 110 , data points 112 , template files 116 , template data points 120 , and events 118 . Multiple data collection components 107 , with each data collection component 107 involving multiple sensors 104 , channels 106 , and physical data sources 102 , can be incorporated into the data subsystem 202 . The data subsystem 202 can also be referred to as a content subsystem, because the data subsystem 202 includes all of the data content of the system 100 .

[0165] In a preferred embodiment, all files 110 (including template files 116 ) are in some type of pre-defined format. Such pre-defined formats can be user-created. Events 118 within the data subsystem 202 can have as few as one template data point 120 and as many as hundreds or even thousands of template data points 120 , depending on the type of data and the power of the computer system 128 . Events 118 can relate to and be defined in accordance with many different channels 106 because the various template data points 120 associated with and making up the event 118 can be associated with different channels 106 .

[0166] Various markers 134 can be placed in various marker locations within the data subsystem 202 . However, an analysis subsystem 204 can control the determinations of marker locations and the calculations and processing based on those locations.

[0167] C. Analysis Subsystem

[0168] An analysis subsystem 204 is responsible for: performing any search heuristics; performing correlation heuristics; identifying matches between the event 118 and the analysis data; determining where markers 134 should be placed; and performing processing utilizing the markers 134 and marker locations.

[0169] The analysis subsystem 204 searches the data subsystem 202 for matches, e.g. analysis data points 112 indicative of an event 118 . Markers 134 can be placed on such locations of analysis data points 112 , so that additional calculations and processing can be automatically invoked without human intervention.

[0170] Marker locations can be identified by performing a correlation-based matching heuristic, or other types of correlation heuristics. In a preferred embodiment, a confidence value 146 is associated with each marker 134 . In some embodiments, an aggregate confidence value can be associated with a file 110 or even a folder.

[0171] In order to compare data points 112 and evaluate or even find matches (e.g. matches are “events” 118 in the analysis data), the analysis subsystem 204 can load the pattern or event data points 120 into the pattern array 142 and load the analysis data points 112 into the analysis data array 144 . The analysis subsystem 204 can correlate data points 112 relating to multiple channels 106 in finding and evaluating matches.

[0172] Comparing and correlating the data can be easier when the data is stored in arrays. Only subsets of the analysis data and template data should be loaded into arrays at any one time. Typically, no more than a single analysis file 110 and a single template file 116 should be loaded into arrays. The correlation functionality of the analysis subsystem 204 can be performed by a correlation module discussed in greater detail below.
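
The comparison of a pattern array against an analysis data array can be sketched as follows. This is only an illustrative example under stated assumptions: the disclosure refers to a correlation-based matching heuristic without fixing a formula, so the use of the Pearson correlation coefficient, the single-channel data, the function names, and the 0.9 threshold are all assumptions made for this sketch. The correlation of each window is used here as a stand-in for the confidence value associated with a candidate marker location.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant, since the
    coefficient is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def find_marker_locations(pattern_array, analysis_array, threshold=0.9):
    """Slide the pattern across the analysis data.

    Returns a list of (start_index, confidence) pairs for every window
    whose correlation with the pattern meets the assumed threshold.
    """
    width = len(pattern_array)
    matches = []
    for start in range(len(analysis_array) - width + 1):
        window = analysis_array[start:start + width]
        confidence = pearson(pattern_array, window)
        if confidence >= threshold:
            matches.append((start, confidence))
    return matches
```

For example, calling find_marker_locations([0, 1, 3, 1, 0], [5, 5, 5, 0, 1, 3, 1, 0, 5, 5]) reports a single candidate at index 3, where the pattern occurs exactly. A multi-channel event could be handled by repeating the comparison per channel and combining the results, although the manner of combination is not specified in this section.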

[0173] D. Pattern Subsystem

[0174] FIG. 6 illustrates an example of a subsystem view exemplifying a different example of a subsystem-level configuration. A pattern subsystem 203 can be incorporated into the system 100 . The pattern subsystem 203 is responsible for storing all template data points 120 , events 118 , template data files 116 , and the template data storage 114 component. As discussed above, events 118 can be defined from input characteristics. In some embodiments, the pattern subsystem 203 can be responsible for the configuring and invoking of the various pattern matching heuristics that can be incorporated into the system 100 . In other embodiments, a search subsystem 205 is responsible for performing pattern matching, and the pattern subsystem 203 is limited to the creation, modification, and analysis of the events 118 .

[0175] E. Search Subsystem

[0176] A search subsystem 205 can be used to perform any of the functions of the search tool 130 discussed above. The search subsystem 205 can be used to identify files 110 in the data storage component 109 that are desired for pattern matching. A search result can be based on a wide variety of search criteria, including a correlation with the template data points 120 in the event 118 . Pattern matching can also be performed by the search subsystem 205 . The search subsystem 205 can identify multiple locations in one or more data files 110 indicative of events 118 across multiple data channels 106 or data types.

V. Module-Level View

[0177] FIG. 7 is a block diagram illustrating an example of a module-level view of the system 100 . As is indicated in the diagram, any module can directly interact with any other module making up the system 100 . The system 100 need not possess each of these modules to function. In many embodiments, system functionality can be aggregated together in a lesser number of modules. For example, the functionality of the analysis module, patterns module, and correlations module could be combined into a single analysis module.

[0178] A. Data Collection Module

[0179] A data collection module 206 is part of the data subsystem 202 , and includes the data collection component 107 , if such a component exists for the particular system 100 . The data collection module 206 can include many different types of sensors 104 , with each sensor 104 capable of capturing a wide variety of different sensor data through a wide variety of different channels 106 . The data collection module 206 can be responsible for generating data points 112 and data files 110 from sensor data. Each data point 112 can be associated with at least one channel 106 .

[0180] In various vehicle transmission embodiments of the system 100 , the sensor data can be transmission shift data from multiple shifts captured in a simultaneous or substantially simultaneous manner. Sensor data originating from different transmissions can also be included.

[0181] Data collection is described in greater detail above and below.

[0182] B. Pattern Module

[0183] A pattern module 208 can be responsible for creating, modifying, and storing events 118 (e.g. patterns of template data points 120 ). Input characteristics received through the interface subsystem 200 can be sent to the pattern module 208 in order to create, define, modify, or delete events 118 . The pattern module 208 can be used to populate the pattern array 142 with template data points 120 .

[0184] Pattern 118 processing is described in greater detail below.

[0185] C. Search Module

[0186] A search module 210 can include various tools for navigating the data storage component 108 , identifying particular files 110 , and for identifying patterns 118 in particular files 110 . The search module 210 can include both search tools 130 and correlation tools 132 .

[0187] 1. Search Tools

[0188] Search tools 130 in the search module 210 can be used to navigate through the data subsystem 202 using various search criteria and other methodologies. A single search result can include more than one file 110 , and even more than one folder. The scope of the search can be configured by the search module 210 . For example, the search could be limited to a particular folder or directory, or the search could include the entire data storage component 108 . The search tools 130 of the search module 210 can perform various file management functions within the data storage component 108 . Search tool 130 functionality is described both above and below.

[0189] 2. Correlation Tools

[0190] In some embodiments, various correlation tools 132 within the search module 210 can perform various correlation heuristics to identify events 118 in analysis data. In some embodiments, the search module 210 loads the data array 144 , and compares the data array 144 with the template data array 142 . In such embodiments, the search module 210 also generates the applicable confidence values 146 .

[0191] In a vehicle transmission embodiment, the search criteria or data types of the correlation tools 132 can include a shift time, an engagement bump, and a shift impulse, as well as other characteristics relating to transmission performance. Correlation tool 132 functionality is described both above and below.

[0192] D. Marker Module

[0193] A marker module 212 can be used to place, monitor, modify, sort, and move markers 134 . The marker module 212 can place a marker 134 at a marker location within the analysis data storage 109 component. In some embodiments, the calculation of the confidence value 146 is performed by the marker module 212 . The results of a marker sort heuristic can be used by the analysis subsystem 204 to perform the analysis calculations generated by that subsystem.

[0194] In a preferred embodiment, the marker module 212 includes a marker interface (part of the interface subsystem 200 ) which can be used by users 122 to accept or reject marker locations. Markers 134 can also be manually moved by users 122 . Different weight adjustments can be applied to different characteristics in evaluating the desirability of a particular marker 134 placement.
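
The weighting of different characteristics in evaluating a marker placement can be sketched as a weighted average. The characteristic names ("shape_fit", "amplitude_fit"), the score range of 0 to 1, and the use of a normalized weighted sum are hypothetical choices made for illustration; the disclosure does not specify the scoring formula.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Marker:
    location: int                     # index into the analysis data
    scores: Dict[str, float]          # per-characteristic scores, assumed in [0, 1]
    accepted: Optional[bool] = None   # None until a user accepts or rejects it


def desirability(marker, weights):
    """Weighted average of the marker's characteristic scores.

    Characteristics missing from the marker are scored as 0.0.
    """
    total_weight = sum(weights.values())
    weighted = sum(w * marker.scores.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


def sort_markers(markers, weights):
    """Order markers from most to least desirable under the given weights."""
    return sorted(markers, key=lambda m: desirability(m, weights), reverse=True)
```

Changing the weights changes the ordering: a marker with a strong shape match but a weak amplitude match ranks first when shape is weighted heavily, and can fall behind when the two characteristics are weighted equally.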

[0195] Marker 134 functionality is described in greater detail below.

[0196] E. Analysis Module

[0197] An analysis module 214 can be used to perform correlation calculations, generate confidence values 146 , and generate subsequent calculations and processing at marker locations. The primary responsibility of the analysis module 214 is to perform analysis calculations at marker locations. Those calculations are performed using the characteristics of the data points 112 at those marker locations.

[0198] A reporting tool within the analysis module 214 can be used to automatically generate certain types of reports based on the characteristics of the marker 134 , such as the channels 106 . In a preferred embodiment, the reports of the report tool are highly configurable and highly automated, allowing for manual intervention as desired by the user 122 .

[0199] The analysis module 214 can include a “track changes” tool for determining that all the data files 110 meet one or more selected criteria. The analysis module 214 can be configured to automatically load information from the first data file 110 included in the search results. The software would then compare these items with the settings in each new data file. If a difference occurred, a warning message would be displayed. In Auto or Batch modes, the file would be rejected from the analysis. Other modules within the system 100 could be used to house the track changes utility tool.
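
The track changes comparison can be sketched as follows. The representation of each file's settings as a dictionary, the setting names in the usage note, and the function name are assumptions made for illustration. The sketch also assumes that outside of Auto or Batch modes a differing file remains in the analysis after the warning is produced, which the text above does not state explicitly.

```python
def track_changes(settings_by_file, tracked_keys, batch_mode=False):
    """Compare each file's tracked settings with those of the first file.

    settings_by_file maps a file name to a dict of its settings, in
    search-result order. Returns (accepted_files, warnings).
    """
    names = list(settings_by_file)
    if not names:
        return [], []
    reference_name = names[0]
    reference = settings_by_file[reference_name]
    accepted = [reference_name]
    warnings = []
    for name in names[1:]:
        current = settings_by_file[name]
        differing = [k for k in tracked_keys if current.get(k) != reference.get(k)]
        if differing:
            warnings.append(
                f"{name}: differs from {reference_name} in {', '.join(differing)}"
            )
            if batch_mode:
                continue  # Auto or Batch mode: reject the file from the analysis
        accepted.append(name)
    return accepted, warnings
```

For instance, given three files in which only the third has a different (hypothetical) "sample_rate" setting, the batch-mode call accepts the first two files and returns one warning naming the third file and the differing setting.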

[0200] Analysis 136 functionality is described in greater detail below.

[0201] F. Correlation Module

[0202] A correlation module 216 can be used to exclusively isolate the performance of correlation heuristics to a single module. The various correlation heuristics can be performed between pattern arrays 142 and data arrays