Title:
Audiovisual information management system with identification prescriptions
Document Type and Number:
United States Patent 7178107

Abstract:
A method of using a system with at least one of audio, image, and a video comprising a plurality of frames comprising the steps of providing a usage preferences description where the usage preferences description includes at least one of a browsing preferences description, a filtering preferences description, a search preferences description, and a device preferences description.

Inventors:
Sezan, Muhammed Ibrahim (Camas, WA, US)
Van Beek, Petrus (Vancouver, WA, US)
Application Number:
10/977718
Publication Date:
02/13/2007
Filing Date:
10/28/2004
Assignee:
Sharp Laboratories of America, Inc. (Camas, WA, US)
Primary Class:
Other Classes:
715/716, 725/46, 725/47, 715/738, 715/733, 715/744
International Classes:
G06F3/00; G06F3/14; H04N7/173
Field of Search:
725/53, 725/46, 715/744-747, 715/741, 725/47, 725/44, 725/37, 715/738, 715/719, 715/716, 715/733, 715/727
US Patent References:
4321635 | Apparatus for selective retrieval of information streams or items | March, 1982 | Tsuyuguchi
4324402 | Electronic baseball game | April, 1982 | Klose
5012334 | Video image bank for storing and retrieving video image sequences | April, 1991 | Etra
5241671 | Multimedia search system using a plurality of entry path means which indicate interrelatedness of information | August, 1993 | Reed et al.
5288069 | Talking football | February, 1994 | Matsumoto
D348251 | Menu control panel for a universal remote control unit | June, 1994 | Hendricks
D354059 | Remote control unit | January, 1995 | Hendricks
5404316 | Desktop digital video processing system | April, 1995 | Klingler et al.
5410344 | Apparatus and method of selecting video programs based on viewers' preferences | April, 1995 | Graves et al.
5434678 | Seamless transmission of non-sequential video segments | July, 1995 | Abecassis
5444499 | Audio video apparatus with intelligence for learning a history of user control | August, 1995 | Saitoh
5483278 | System and method for finding a movie of interest in a large movie database | January, 1996 | Strubbe et al.
D368263 | Remote control unit | March, 1996 | Hendricks
5550965 | Method and system for operating a data processor to index primary data in real time with iconic table of contents | August, 1996 | Gabbe et al.
5559549 | Television program delivery system | September, 1996 | Hendricks et al.
5589945 | Computer-themed playing system | December, 1996 | Abecassis
5600364 | Network controller for cable television delivery systems | February, 1997 | Hendricks et al.
5600573 | Operations center with video storage for a television program packaging and delivery system | February, 1997 | Hendricks et al.
5600781 | Method and apparatus for creating a portable personalized operating environment | February, 1997 | Root et al.
5610653 | Method and system for automatically tracking a zoomed video image | March, 1997 | Abecassis
5634849 | Content-on-demand interactive video method and apparatus | June, 1997 | Abecassis
D381991 | Remote control unit | August, 1997 | Hendricks
5659350 | Operations center for a television program packaging and delivery system | August, 1997 | Hendricks et al.
5664046 | Autoconfigurable video system | September, 1997 | Abecassis
5682195 | Digital cable headend for cable television delivery system | October, 1997 | Hendricks et al.
5684918 | System for integrating video and communications | November, 1997 | Abecassis
5696869 | Variable-content-video provider system | December, 1997 | Abecassis
5710884 | System for automatically updating personal profile server with updates to additional user information gathered from monitoring user's electronic consuming habits generated on computer during use | January, 1998 | Dedrick
5717814 | Variable-content video retriever | February, 1998 | Abecassis
5724472 | Content map for seamlessly skipping a retrieval of a segment of a video | March, 1998 | Abecassis
5727129 | Network system for profiling and actively facilitating user activities | March, 1998 | Barrett et al.
5734853 | Set top terminal for cable television delivery systems | March, 1998 | Hendricks et al.
5758257 | System and method for scheduling broadcast of and access to video programs and other data using customer profiles | May, 1998 | Herz et al.
5758259 | Automated selective programming guide | May, 1998 | Lawler
5761881 | Process and apparatus for wrapping paper rolls | June, 1998 | Wall
5774357 | Human factored interface incorporating adaptive pattern recognition based controller apparatus | June, 1998 | Hoffberg et al.
5794210 | Attention brokerage | August, 1998 | Goldhaber et al.
5797001 | Broadcast interactive multimedia system | August, 1998 | Augenbraun et al.
5798785 | Terminal for suggesting programs offered on a television program delivery system | August, 1998 | Hendricks et al.
5809426 | Arrangement in mobile telecommunications systems for providing synchronization of transmitters of base stations | September, 1998 | Radojevic et al.
5822537 | Multimedia networked system detecting congestion by monitoring buffers' threshold and compensating by reducing video transmittal rate then reducing audio playback rate | October, 1998 | Katseff et al.
5835087 | System for generation of object profiles for a system for customized electronic identification of desirable objects | November, 1998 | Herz et al.
D402310 | Electronic book | December, 1998 | Hendricks
5848396 | Method and apparatus for determining behavioral profile of a computer user | December, 1998 | Gerace
5861881 | Interactive computer system for providing an interactive presentation with personalized video, audio and graphics responses for multiple viewers | January, 1999 | Freeman et al.
5867386 | Morphological pattern recognition based controller system | February, 1999 | Hoffberg et al.
5875108 | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system | February, 1999 | Hoffberg et al.
5878222 | Method and apparatus for controlling video/audio and channel selection for a communication signal based on channel data indicative of channel contents of a signal | March, 1999 | Harrison
5892536 | Systems and methods for computer enhanced broadcast monitoring | April, 1999 | Logan et al.
5900867 | Self identifying remote control device having a television receiver for use in a computer | May, 1999 | Schindler et al.
5901246 | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system | May, 1999 | Hoffberg et al.
5903454 | Human-factored interface corporating adaptive pattern recognition based controller apparatus | May, 1999 | Hoffberg et al.
5907324 | Method for saving and accessing desktop conference characteristics with a persistent conference object | May, 1999 | Larson et al.
5913013 | Seamless transmission of non-sequential video segments | June, 1999 | Abecassis
5913030 | Method and system for client/server communications with user information revealed as a function of willingness to reveal and whether the information is required | June, 1999 | Lotspiech et al.
5920477 | Human factored interface incorporating adaptive pattern recognition based controller apparatus | July, 1999 | Hoffberg et al.
5926624 | Digital information library and delivery system with logic for generating files targeted to the playback device | July, 1999 | Katz et al.
5930783 | Semantic and cognition based image retrieval | July, 1999 | Li et al.
5945988 | Method and apparatus for automatically determining and dynamically updating user preferences in an entertainment system | August, 1999 | Williams et al.
5956026 | Method for hierarchical summarization and browsing of digital video | September, 1999 | Ratakonda
5958006 | Method and apparatus for communicating summarized data | September, 1999 | Eggleston et al.
5973683 | Dynamic regulation of television viewing content based on viewer profile and viewing history | October, 1999 | Cragun et al.
5977964 | Method and apparatus for automatically configuring a system based on a user's monitored system interaction and preferred system access times | November, 1999 | Williams et al. | 715/721
5986690 | Electronic book selection and delivery system | November, 1999 | Hendricks
5986692 | Systems and methods for computer enhanced broadcast monitoring | November, 1999 | Logan et al.
5987211 | Seamless transmission of non-sequential video segments | November, 1999 | Abecassis
5990927 | Advanced set top terminal for cable television delivery systems | November, 1999 | Hendricks et al.
5995094 | User-oriented multimedia presentation system for multiple presentation items that each behave as an agent | November, 1999 | Eggen et al.
6002833 | Disc storing a variable-content-video and a user interface | December, 1999 | Abecassis
6006265 | Hyperlinks resolution at and by a special network server in order to enable diverse sophisticated hyperlinking upon a digital network | December, 1999 | Rangan et al.
6011895 | Keyword responsive variable content video program | January, 2000 | Abecassis
6020883 | System and method for scheduling broadcast of and access to video programs and other data using customer profiles | February, 2000 | Herz et al.
6029195 | System for customized electronic identification of desirable objects | February, 2000 | Herz
6038367 | Playing a Video Responsive to a comparison of two sets of Content Preferences | March, 2000 | Abecassis
6049821 | Proxy host computer and method for accessing and retrieving information between a browser and a proxy | April, 2000 | Theriault et al.
6052554 | Television program delivery system | April, 2000 | Hendricks et al.
6064385 | Systems with user preference setting schemes | May, 2000 | Sturgeon et al.
6067401 | Playing a version of and from within a video by means of downloaded segment information | May, 2000 | Abecassis
6070167 | Hierarchical method and system for object-based audiovisual descriptive tagging of images for information retrieval, editing, and manipulation | May, 2000 | Qian et al.
6072934 | Video previewing method and apparatus | June, 2000 | Abecassis
6076166 | Personalizing hospital intranet web sites | June, 2000 | Moshfeghi et al.
6078928 | Site-specific interest profiling system | June, 2000 | Schnase et al.
6081750 | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system | June, 2000 | Hendricks et al.
6088455 | Methods and apparatus for selectively reproducing segments of broadcast programming | July, 2000 | Logan et al.
6088722 | System and method for scheduling broadcast of and access to video programs and other data using customer profiles | July, 2000 | Herz et al.
6091886 | Video viewing responsive to content and time restrictions | July, 2000 | Abecassis
RE36801 | Time delayed digital video system using concurrent recording and playback | August, 2000 | Logan et al.
6128624 | Collection and integration of internet and electronic commerce data in a database during web browsing | October, 2000 | Papierniak et al. | 707/104.1
6133909 | Method and apparatus for searching a guide using program characteristics | October, 2000 | Schein et al.
6137486 | Image display control device for restricting display of video data viewed on a television in accordance with a restrict level of the video data | October, 2000 | Yoshida et al.
6151444 | Motion picture including within a duplication of frames | November, 2000 | Abecassis
D435561 | Replay bar icon for a display | December, 2000 | Pettigrew et al.
6160989 | Network controller for cable television delivery systems | December, 2000 | Hendricks et al.
6177931 | Systems and methods for displaying and recording control interface with television programs, video, advertising information and program scheduling information | January, 2001 | Alexander et al.
6181335 | Card for a set top terminal | January, 2001 | Hendricks et al.
6185625 | Scaling proxy server sending to the client a graphical user interface for establishing object encoding preferences after receiving the client's request for the object | February, 2001 | Tso et al.
6198767 | Apparatus for color component compression | March, 2001 | Greenfield et al.
6201536 | Network manager for cable television system headends | March, 2001 | Hendricks et al.
6208805 | Inhibiting a control function from interfering with a playing of a video | March, 2001 | Abecassis
6215526 | Analog video tagging and encoding system | April, 2001 | Barton et al.
6226678 | Method and apparatus for dynamically defining data communication utilities | May, 2001 | Mattaway
6230501 | Ergonomic systems and methods providing intelligent adaptive surfaces and temperature control | May, 2001 | Bailey et al.
6233389 | Multimedia time warping system | May, 2001 | Barton et al.
6236395 | Audiovisual information management system | May, 2001 | Sezan et al.
6252544 | Mobile communication device | June, 2001 | Hoffberg
6269216 | Intermittently arranged frames for instantaneously shifting among video segments | July, 2001 | Abecassis
6286141 | Personal editing system | September, 2001 | Browne et al.
6289165 | System for and a method of playing interleaved presentation segments | September, 2001 | Abecassis
6298482 | System for two-way digital multimedia broadcast and interactive services | October, 2001 | Seidman et al.
6304715 | Disc having a code for preventing an interference with a playing of a video segment | October, 2001 | Abecassis
6317718 | System, method and article of manufacture for location-based filtering for shopping agent in the physical world | November, 2001 | Fano
6317881 | Method and apparatus for collecting and providing viewer feedback to a broadcast | November, 2001 | Shah-Nazaroff et al.
6370504 | Speech recognition on MPEG/Audio encoded files | April, 2002 | Zick et al.
6405371 | Navigating through television programs | June, 2002 | Oosterhout et al.
6412008 | System and method for cooperative client/server customization of web pages | June, 2002 | Fields et al.
6426761 | Information presentation system for a graphical user interface | July, 2002 | Kanevsky et al.
6426974 | Image conversion apparatus for transforming compressed image data of different resolutions wherein side information is scaled | July, 2002 | Takahashi et al.
6446261 | Set top device for targeted electronic insertion of indicia into video | September, 2002 | Rosser | 725/34
6487390 | System and method for interactive on-demand information | November, 2002 | Virine et al.
6522342 | Graphical tuning bar for a multi-program data stream | February, 2003 | Gagnon et al.
6530082 | Configurable monitoring of program viewership and usage of interactive applications | March, 2003 | Del Sesto et al.
6535639 | Automatic video summarization using a measure of shot importance and a frame-packing method | March, 2003 | Uchihachi et al.
6543053 | Interactive video-on-demand system | April, 2003 | Li et al.
6546555 | System for hypervideo filtering based on end-user payment interest and capability | April, 2003 | Hjelsvold
6553178 | Advertisement subsidized video-on-demand system | April, 2003 | Abecassis
6571279 | Location enhanced information delivery system | May, 2003 | Herz et al.
6578075 | Methods and arrangements for distributing services and/or programs in a network environment | June, 2003 | Nieminen et al.
6581207 | Information filtering system and method | June, 2003 | Sumita et al.
6587127 | Content player method and server with user profile | July, 2003 | Leeke et al. | 715/765
6593936 | Synthetic audiovisual description scheme, method and system for MPEG-7 | July, 2003 | Huang et al.
6594699 | System for capability based multimedia streaming over a network | July, 2003 | Sahai et al.
6611876 | Method for establishing optimal intermediate caching points by grouping program elements in a software system | August, 2003 | Barrett et al.
6678659 | System and method of voice information dissemination over a network using semantic representation | January, 2004 | Van Kommer
6766362 | Providing a network-based personalized newspaper with personalized content and layout | July, 2004 | Miyasaka et al.
6868440 | Multi-level skimming of multimedia content using playlists | March, 2005 | Gupta et al.
20010030664 | Method and apparatus for configuring icon interactivity | October, 2001 | Shulman et al.
20020026345 | Targeted delivery of informational content with privacy protection | February, 2002 | Juels
20020097165 | METHOD FOR REMOTELY CONTROLLING A PLURALITY OF APPARATUS USING A SINGLE REMOTE CONTROL DEVICE | July, 2002 | Hulme
20020133412 | SYSTEM FOR MANAGEMENT OF TRANSACTIONS ON NETWORKS | September, 2002 | Oliver et al.
20030105682 | User interface and methods for recommending items to users | June, 2003 | Dicker et al.
20050102202 | Content personalization based on actions performed during browsing sessions | May, 2005 | Linden et al.
Foreign References:
JP08125957 | May, 1996 | DIGITAL STILL CAMERA
JP09322154 | December, 1997 | MONITOR VIDEO DEVICE
JP11032267 | February, 1999
JP11261908 | September, 1999 | SUPPORT SYSTEM FOR SELECTING PROGRAM AND/OR INFORMATION
JP2000013755 | January, 2000 | BROADCASTING SYSTEM
JP2001036861 | February, 2001
JP2002503896 | February, 2002
WO/1994/014284 | June, 1994 | REPROGRAMMABLE TERMINAL FOR SUGGESTING PROGRAMS OFFERED ON A TELEVISION PROGRAM DELIVERY SYSTEM
WO/1999/004143 | January, 1999 | HOUSING FOR AN ENGAGEABLE AND DISENGAGEABLE BUCKET TAPPET,
WO/1999/012194 | March, 1999 | TEMPERATURE ADJUSTING METHOD AND ALIGNER TO WHICH THIS METHOD IS APPLIED
WO/1999/065237 | December, 1999 | TELEVISION PROGRAM RECORDING WITH USER PREFERENCE DETERMINATION
Other References:
Henry Lieberman et al, “Let's Browse: A Collaborative Web Browsing Agent”, Jan. 1999, CPP Conference Paper, Proceeding of IUI 99. Intelligent User Interfaces, p. 65-8.
“MPEG-7 Requirements Document,” ISO/IEC JTC1/SC29/WG11 (N2859), Jul. 1999.
“MPEG-7 Requirements for Description of Users,” ISO/IEC JTC1/SC29/WG11 (M4601), Mar. 1999.
“MPEG-7 Description Schemes for Consumer Video,” ISO/IEC JTC1/SC29/WG11 (P429), Feb. 1999.
“MPEG-7 Media/Meta Dss (V 2),” ISO/IEC JTC1/SC29/WG11, Aug. 1999.
“MPEG-7 Description Schemes (V 5),” ISO/IEC JTC1/SC29/WG11 (N2844), Jul. 1999.
“XML Schema Part I: Structures, W3C Working Draft May 6, 1999,” www.w3.org/1999/05/06-xmlschema-1/.
“Composite Capability/Preference Profiles (CC/PP): A User Side Framework for Content Negotiation,” W3C Note Jul. 27, 1999, www.w3.org/TR/1999-NOTE-CCPP-19990727.
“Proposal for User Preference Descriptions in MPEG-7,” ISO/IEC JTC1/SC29/WG11 (M5222), Oct. 1999.
“User Preference Descriptions for MPEG-7,” ISO/IEC JTC1/SC29/WG11 (Mxxxx), Dec. 1999.
Lewis, “UbiNet: The Ubiquitous Internet Will Be Wireless,” Computer Magazine, pp. 128-130, Oct. 1999.
“MPEG-7 Generic AV Description Schemes (V0.7),” ISO/IEC JTC1/SC29/WG11 (N2966), Oct. 1999.
“MPEG-7 Description Definition Language Document,” ISO/IEC JTC1/SC29/WG11 (N2997), Oct. 1999.
Ehrmantraut, Harder, Wittig & Steinmatz, “The Personal Electronic Program Guide—Towards the Pre-selection of Individual TV Programs,” Proceedings of the International Conference on Information and Knowledge Management CIKM, ACM, New York, NY, US, Nov. 12, 1996, pp. 243-250.
LG Corporate Institute of Technology, “Specification of the Usage History DS,” Noordwijkerhout, Mar. 2000, ISO/IEC JTC1/SC29/WG11/M5748.
LG Electronics Institute of Technology, “Proposal of Usage History DS,” Beijing, Jul. 2000, ISO/IEC JTC1/SC29/WG11 M6259.
“Customer Profile Exchange (CPExchange) Specification”, Version 1.0, Oct. 20, 2000.
“XML Schema Part 2: Datatypes,” W3C Recommendation May 2, 2001, http://www.w3.org/TR/xmlschema-2/, pp. 1-136.
Primary Examiner:
Bautista X, Lucila
Attorney, Agent or Firm:
Chernoff, Vilhauer, McClung & Stenzel
Parent Case Data:

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 09/541,447, filed Mar. 31, 2000, now abandoned, which claims the benefit of U.S. Provisional Patent Application No. 60/154,388, filed Sep. 16, 1999.

Claims:
The invention claimed is:

1. A method of using a system with at least one element chosen from audio, image, and a video comprising a plurality of frames comprising the steps of: (a) providing a plurality of usage preferences descriptions where each of said usage preference descriptions includes at least a browsing preferences description, a filtering preferences description, a search preferences description, and a device preferences description where, (i) said browsing preferences description relates to a user's viewing preferences for the presentation structure by which the content of said at least one element chosen from audio, image, and a video is to be presented to said user; (ii) said filtering preferences descriptions and said search preferences descriptions relate to at least one of (1) content preferences of said at least one of audio, image, and video, (2) classification preferences of said at least one of audio, image, and video, (3) keyword preferences of said at least one of audio, image, and video, and (4) creation preferences of said at least one of audio, image, and video; and (iii) said device preferences description contains descriptors describing at least one of preferred audio and video rendering settings of the presentation device that relates to user's preferences regarding presentation characteristics of the presentation device, said at least one of audio and video rendering settings of said presentation device relating to how the user consumes said at least one of audio and video content; and (b) providing a plurality of user identification descriptions, each of which identifies at least one of said usage preference descriptions.

2. The method of claim 1 wherein said user identification description identifies a corresponding set of at least two of said usage preference descriptions.

3. The method of claim 1 wherein each of said multiple user identification descriptions identifies an overlapping set of at least one of said usage preferences descriptions.

4. The method of claim 1 wherein said usage preferences description includes at least said filtering preferences description, said search preferences description, and said device preferences description.

5. The method of claim 1 where each said browsing preference description is specific to a user-selective one or more content types of said at least one element chosen from audio, image, and a video.

6. The method of claim 1 where said browsing preference specifies a presentation structure selected from a list comprising at least two of: (a) thumbnail view; (b) slide view; (c) key frame view; (d) highlight view; (e) close-up view; (f) frame view; (g) shot view; (h) event view; and (i) alternate view.

Description:

BACKGROUND OF THE INVENTION

The present invention relates to a system for managing audiovisual information, and in particular to a system for audiovisual information browsing, filtering, searching, archiving, and personalization.

Video cassette recorders (VCRs) may record video programs in response to pressing a record button or may be programmed to record video programs based on the time of day. However, the viewer must program the VCR based on information from a television guide to identify relevant programs to record. After recording, the viewer scans through the entire video tape to select relevant portions of the program for viewing using the functionality provided by the VCR, such as fast forward and fast reverse. Unfortunately, the searching and viewing is based on a linear search, which may require significant time to locate the desired portions of the program(s) and fast forward to the desired portion of the tape. In addition, it is time consuming to program the VCR in light of the television guide to record desired programs. Also, unless the viewer recognizes the programs from the television guide as desirable it is unlikely that the viewer will select such programs to be recorded.

RePlayTV and TiVo have developed hard disk based systems that receive, record, and play television broadcasts in a manner similar to a VCR. The systems may be programmed with the viewer's viewing preferences. The systems use a telephone line interface to receive scheduling information similar to that available from a television guide. Based upon the system programming and the scheduling information, the system automatically records programs that may be of potential interest to the viewer. Unfortunately, viewing the recorded programs occurs in a linear manner and may require substantial time. In addition, each system must be programmed for an individual's preference, likely in a different manner.

Freeman et al., U.S. Pat. No. 5,861,881, disclose an interactive computer system where subscribers can receive individualized content.

With all the aforementioned systems, each individual viewer is required to program the device according to his particular viewing preferences. Unfortunately, each different type of device has different capabilities and limitations which limit the selections of the viewer. In addition, each device includes a different interface which the viewer may be unfamiliar with. Further, if the operator's manual is inadvertently misplaced it may be difficult for the viewer to efficiently program the device.

BRIEF SUMMARY OF THE INVENTION

The present invention overcomes the aforementioned drawbacks of the prior art by providing a method of using a system with at least one of audio, image, and a video comprising a plurality of frames comprising the steps of providing a usage preferences description scheme where the usage preferences description scheme includes at least one of a browsing preferences description scheme, a filtering preferences description scheme, a search preferences description scheme, and a device preferences description scheme. The browsing preferences description scheme relates to a user's viewing preferences. The filtering and search preferences description schemes relate to at least one of (1) content preferences of the at least one of audio, image, and video, (2) classification preferences of the at least one of audio, image, and video, (3) keyword preferences of the at least one of audio, image, and video, and (4) creation preferences of the at least one of audio, image, and video. The device preferences description scheme relates to the user's preferences regarding presentation characteristics.

A usage history description scheme is provided where the usage history description scheme includes at least one of a browsing history description scheme, a filtering history description scheme, a search history description scheme, and a device usage history description scheme. The browsing history description scheme relates to a user's viewing preferences. The filtering and search history description schemes relate to at least one of (1) content usage history of the at least one of audio, image, and video, (2) classification usage history of the at least one of audio, image, and video, (3) keyword usage history of the at least one of audio, image, and video, and (4) creation usage history of the at least one of audio, image, and video. The device usage history description scheme relates to the user's preferences regarding presentation characteristics. The usage preferences description scheme and the usage history description scheme are used to enhance system functionality.
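By way of illustration only, the four-part structure summarized above might be represented as nested records. The following Python sketch is an assumption, not the disclosed implementation: the names ContentCriteria, UsageDescription, and has_at_least_one_part are hypothetical, and the same record shape is reused for both the preferences and the history purely for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentCriteria:
    """Hypothetical shape shared by the filtering and search descriptions."""
    content: List[str] = field(default_factory=list)
    classification: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    creation: List[str] = field(default_factory=list)


@dataclass
class UsageDescription:
    """Four optional parts: browsing, filtering, search, and device."""
    browsing: Optional[List[str]] = None         # e.g. preferred views
    filtering: Optional[ContentCriteria] = None
    search: Optional[ContentCriteria] = None
    device: Optional[Dict[str, str]] = None      # presentation characteristics

    def has_at_least_one_part(self) -> bool:
        # The summary requires "at least one of" the four parts.
        parts = (self.browsing, self.filtering, self.search, self.device)
        return any(part is not None for part in parts)


# Example values are invented for illustration.
preferences = UsageDescription(
    browsing=["highlight view"],
    filtering=ContentCriteria(keywords=["basketball"]),
)
history = UsageDescription(device={"volume": "low"})
```

Keeping every part optional mirrors the "at least one of" wording; a stricter variant could reject an empty description at construction time.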

The foregoing and other objectives, features and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIG. 1 is an exemplary embodiment of a program, a system, and a user, with associated description schemes, of an audiovisual system of the present invention.

FIG. 2 is an exemplary embodiment of the audiovisual system, including an analysis module, of FIG. 1.

FIG. 3 is an exemplary embodiment of the analysis module of FIG. 2.

FIG. 4 is an illustration of a thumbnail view (category) for the audiovisual system.

FIG. 5 is an illustration of a thumbnail view (channel) for the audiovisual system.

FIG. 6 is an illustration of a text view (channel) for the audiovisual system.

FIG. 7 is an illustration of a frame view for the audiovisual system.

FIG. 8 is an illustration of a shot view for the audiovisual system.

FIG. 9 is an illustration of a key frame view for the audiovisual system.

FIG. 10 is an illustration of a highlight view for the audiovisual system.

FIG. 11 is an illustration of an event view for the audiovisual system.

FIG. 12 is an illustration of a character/object view for the audiovisual system.

FIG. 13 is an alternative embodiment of a program description scheme including a syntactic structure description scheme, a semantic structure description scheme, a visualization description scheme, and a meta information description scheme.

FIG. 14 is an exemplary embodiment of the visualization description scheme of FIG. 13.

FIG. 15 is an exemplary embodiment of the meta information description scheme of FIG. 13.

FIG. 16 is an exemplary embodiment of a segment description scheme for the syntactic structure description scheme of FIG. 13.

FIG. 17 is an exemplary embodiment of a region description scheme for the syntactic structure description scheme of FIG. 13.

FIG. 18 is an exemplary embodiment of a segment/region relation description scheme for the syntactic structure description scheme of FIG. 13.

FIG. 19 is an exemplary embodiment of an event description scheme for the semantic structure description scheme of FIG. 13.

FIG. 20 is an exemplary embodiment of an object description scheme for the semantic structure description scheme of FIG. 13.

FIG. 21 is an exemplary embodiment of an event/object relation graph description scheme for the syntactic structure description scheme of FIG. 13.

FIG. 22 is an exemplary embodiment of a user preference description scheme.

FIG. 23 is an exemplary embodiment of the interrelationship between a usage history description scheme, an agent, and the usage preference description scheme of FIG. 22.

FIG. 24 is an exemplary embodiment of the interrelationship between audio and/or video programs together with their descriptors, user identification, and the usage preference description scheme of FIG. 22.

FIG. 25 is an exemplary embodiment of a usage preference description scheme of FIG. 22.

FIG. 26 is an exemplary embodiment of the interrelationship between the usage description schemes and MPEG-7 description schemes.

FIG. 27 is an exemplary embodiment of a usage history description scheme of FIG. 22.

FIG. 28 is an exemplary system incorporating the user history description scheme.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Many households today have many sources of audio and video information, such as multiple television sets, multiple VCRs, a home stereo, a home entertainment center, cable television, satellite television, Internet broadcasts, the World Wide Web, data services, specialized Internet services, portable radio devices, and a stereo in each of their vehicles. For each of these devices, a different interface is normally used to obtain, select, record, and play the video and/or audio content. For example, a VCR permits the selection of the recording times but the user has to correlate the television guide with the desired recording times. Another example is the user selecting a preferred set of preselected radio stations for his home stereo and also presumably selecting the same set of preselected stations for each of the user's vehicles. If another household member desires a different set of preselected stereo selections, each audio device would need to be reprogrammed at substantial inconvenience.

The present inventors came to the realization that users of visual information and listeners to audio information, such as, for example, radio, audio tapes, video tapes, movies, and news, desire to be entertained and informed in more than merely one uniform manner. In other words, the audiovisual information presented to a particular user should be in a format and include content suited to their particular viewing preferences. In addition, the format should be dependent on the content of the particular audiovisual information. The amount of information presented to a user or a listener should be limited to only the amount of detail desired by the particular user at the particular time. For example, with the ever-increasing demands on the user's time, the user may desire to watch only 10 minutes of, or merely the highlights of, a basketball game. In addition, the present inventors came to the realization that the necessity of programming multiple audio and visual devices with their particular viewing preferences is a burdensome task, especially when presented with unfamiliar recording devices when traveling. When traveling, users desire to easily configure unfamiliar devices, such as audiovisual devices in a hotel room, with their viewing and listening preferences in an efficient manner.

The present inventors came to the further realization that a convenient technique of merely recording the desired audio and video information is not sufficient because the presentation of the information should be in a manner that is time efficient, especially in light of the limited time frequently available for the presentation of such information. In addition, the user should be able to access only that portion of all of the available information that the user is interested in, while skipping the remainder of the information.

A user is not capable of watching or otherwise listening to the vast potential amount of information available through all, or even a small portion of, the sources of audio and video information. In addition, with the increasing information potentially available, the user is not likely even aware of the potential content of information that he may be interested in. In light of the vast amount of audio, image, and video information, the present inventors came to the realization that a system that records and presents to the user audio and video information based upon the user's prior viewing and listening habits, preferences, and personal characteristics, generally referred to as user information, is desirable. In addition, the system may present such information based on the capabilities of the system devices. This permits the system to record desirable information and to customize itself automatically to the user and/or listener. It is to be understood that the terms user, viewer, and/or listener may be used interchangeably for any type of content. Also, the user information should be portable between and usable by different devices so that other devices may likewise be configured automatically to the particular user's preferences upon receiving the viewing information.

In light of the foregoing realizations and motivations, the present inventors analyzed a typical audio and video presentation environment to determine the significant portions of the typical audiovisual environment. First, referring to FIG. 1 the video, image, and/or audio information 10 is provided or otherwise made available to a user and/or a (device) system. Second, the video, image, and/or audio information is presented to the user from the system 12 (device), such as a television set or a radio. Third, the user interacts both with the system (device) 12 to view the information 10 in a desirable manner and has preferences to define which audio, image, and/or video information is obtained in accordance with the user information 14 . After the proper identification of the different major aspects of an audiovisual system the present inventors then realized that information is needed to describe the informational content of each portion of the audiovisual system 16 .

With three portions of the audiovisual presentation system 16 identified, the functionality of each portion is identified together with its interrelationship to the other portions. To define the necessary interrelationships, a set of description schemes containing data describing each portion is defined. The description schemes include data that is auxiliary to the programs 10 , the system 12 , and the user 14 , to store a set of information, ranging from human readable text to encoded data, that can be used in enabling browsing, filtering, searching, archiving, and personalization. By providing a separate description scheme describing the program(s) 10 , the user 14 , and the system 12 , the three portions (program, user, and system) may be combined together to provide an interactivity not previously achievable. In addition, different programs 10 , different users 14 , and different systems 12 may be combined together in any combination, while still maintaining full compatibility and functionality. It is to be understood that the description scheme may contain the data itself or include links to the data, as desired.

A program description scheme 18 related to the video, still image, and/or audio information 10 preferably includes two sets of information, namely, program views and program profiles. The program views define logical structures of the frames of a video that define how the video frames are potentially to be viewed suitable for efficient browsing. For example the program views may contain a set of fields that contain data for the identification of key frames, segment definitions between shots, highlight definitions, video summary definitions, different lengths of highlights, thumbnail set of frames, individual shots or scenes, representative frame of the video, grouping of different events, and a close-up view. The program view descriptions may contain thumbnail, slide, key frame, highlights, and close-up views so that users can filter and search not only at the program level but also within a particular program. The description scheme also enables users to access information in varying detail amounts by supporting, for example, a key frame view as a part of a program view providing multiple levels of summary ranging from coarse to fine. The program profiles define distinctive characteristics of the content of the program, such as actors, stars, rating, director, release date, time stamps, keyword identification, trigger profile, still profile, event profile, character profile, object profile, color profile, texture profile, shape profile, motion profile, and categories. The program profiles are especially suitable to facilitate filtering and searching of the audio and video information. The description scheme enables users to have the provision of discovering interesting programs that they may be unaware of by providing a user description scheme. 
The user description scheme provides information to a software agent that in turn performs a search and filtering on behalf of the user by possibly using the system description scheme and the program description scheme information. It is to be understood that in one of the embodiments of the invention merely the program description scheme is included.
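
One way to picture the two sets of information in a program description scheme is sketched below in Python. This is a minimal illustration rather than a format defined by the description: the class name `ProgramDescription`, the dictionary keys, and the frame numbers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProgramDescription:
    """Hypothetical container for one program's description scheme."""
    # Program views: logical structures over the frames, used for browsing.
    views: Dict[str, object] = field(default_factory=dict)
    # Program profiles: characteristics of the content, used for filtering and searching.
    profiles: Dict[str, object] = field(default_factory=dict)

    def key_frames(self, level: int) -> List[int]:
        """Return the key-frame numbers for a summary level (0 is the coarsest)."""
        levels = self.views.get("key_frame_view", [])
        if not levels:
            return []
        # Asking for more detail than is stored returns the finest stored level.
        return levels[min(level, len(levels) - 1)]


game = ProgramDescription(
    views={
        # Invented frame numbers; each finer level adds frames to the previous one.
        "key_frame_view": [
            [0, 5400],
            [0, 1800, 5400, 9000],
            [0, 900, 1800, 3600, 5400, 9000],
        ],
        # Highlight segments as (first_frame, last_frame) pairs.
        "highlight_view": [(1200, 3000), (7200, 9900)],
    },
    profiles={"categories": ["sports", "basketball"], "keywords": ["game"]},
)
```

Calling `game.key_frames(0)` gives the coarsest summary and larger levels add detail, which mirrors the coarse-to-fine key frame view mentioned above.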

Program views contained in the program description scheme are a feature that supports a functionality such as close-up view. In the close-up view, a certain image object, e.g., a famous basketball player such as Michael Jordan, can be viewed up close by playing back a close-up sequence that is separate from the original program. An alternative view can be incorporated in a straightforward manner. The character profile, on the other hand, may contain the spatio-temporal position and size of a rectangular region around the character of interest. This region can be enlarged by the presentation engine, or the presentation engine may darken the area outside the region to focus the user's attention on the characters spanning a certain number of frames. Information within the program description scheme may contain data about the initial size or location of the region, movement of the region from one frame to another, and duration in terms of the number of frames featuring the region. The character profile also provides for including text annotation and audio annotation about the character as well as web page information, and any other suitable information. Such character profiles may include the audio annotation which is separate from and in addition to the associated audio track of the video.
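
The region data listed above (initial location and size, movement from frame to frame, and duration in frames) can be sketched as a small record. The following Python is only an illustration under stated assumptions: the names `CharacterRegion`, `box_at`, and `enlarge` are hypothetical, and the region is assumed to move by a constant amount per frame, which the description does not require.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class CharacterRegion:
    """Hypothetical record for a rectangular region that follows a character."""
    first_frame: int
    num_frames: int      # duration, in frames
    x: float             # top-left corner in the first frame
    y: float
    width: float
    height: float
    dx: float = 0.0      # assumed constant movement per frame
    dy: float = 0.0

    def box_at(self, frame: int) -> Box:
        """Return the region in the given frame."""
        offset = frame - self.first_frame
        if not 0 <= offset < self.num_frames:
            raise ValueError("region is not present in this frame")
        return (self.x + self.dx * offset, self.y + self.dy * offset,
                self.width, self.height)


def enlarge(box: Box, factor: float) -> Box:
    """Scale a box about its center, as a presentation engine might for emphasis."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    return (cx - w * factor / 2, cy - h * factor / 2, w * factor, h * factor)


region = CharacterRegion(first_frame=100, num_frames=50,
                         x=10, y=20, width=40, height=80, dx=2, dy=0)
box = region.box_at(110)
```

A presentation engine could pass `box` to `enlarge` to zoom in, or use it as the boundary outside of which the picture is darkened.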

The program description scheme may likewise contain similar information regarding audio (such as radio broadcasts) and images (such as analog or digital photographs or a frame of a video).

The user description scheme 20 preferably includes the user's personal preferences, and information regarding the user's viewing history such as, for example, browsing history, filtering history, searching history, and device setting history. The user's personal preferences include information regarding particular programs and categorizations of programs that the user prefers to view. The user description scheme may also include personal information about the particular user, such as demographic and geographic information, e.g., zip code and age. The explicit definition of the particular programs or attributes related thereto permits the system 16 to select those programs from the information contained within the available program description schemes 18 that may be of interest to the user. Frequently, the user does not desire to learn to program the device nor desire to explicitly program the device. In addition, the user description scheme 20 may not be sufficiently robust to include explicit definitions describing all desirable programs for a particular user. In such a case, the capability of the user description scheme 20 to adapt to the viewing habits of the user to accommodate different viewing characteristics not explicitly provided for or otherwise difficult to describe is useful. In such a case, the user description scheme 20 may be augmented or any technique can be used to compare the information contained in the user description scheme 20 to the available information contained in the program description scheme 18 to make selections. The user description scheme provides a technique for holding user preferences ranging from program categories to program views, as well as usage history. User description scheme information is persistent but can be updated by the user or by an intelligent software agent on behalf of the user at any arbitrary time. It may also be disabled by the user, at any time, if the user decides to do so. 
In addition, the user description scheme is modular and portable so that users can carry or port it from one device to another, such as with a handheld electronic device or smart card or transported over a network connecting multiple devices. When user description scheme is standardized among different manufacturers or products, user preferences become portable. For example, a user can personalize the television receiver in a hotel room permitting users to access information they prefer at any time and anywhere. In a sense, the user description scheme is persistent and timeless based. In addition, selected information within the program description scheme may be encrypted since at least part of the information may be deemed to be private (e.g., demographics). A user description scheme may be associated with an audiovisual program broadcast and compared with a particular user's description scheme of the receiver to readily determine whether or not the program's intended audience profile matches that of the user. It is to be understood that in one of the embodiments of the invention merely the user description scheme is included.
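
The portability and audience-matching ideas above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the user description scheme is carried as JSON text and that an intended-audience profile lists an age range and categories; the function names and field names are hypothetical, and encryption of private fields is left out.

```python
import json


def export_user_ds(ds: dict) -> str:
    """Serialize the user description so it can be carried on a card or sent over a network."""
    return json.dumps(ds, sort_keys=True)


def import_user_ds(payload: str) -> dict:
    """Load a user description on the receiving device."""
    return json.loads(payload)


def matches_audience(audience: dict, ds: dict) -> bool:
    """Compare a broadcast's intended-audience profile with the receiver's user description."""
    low, high = audience.get("age_range", (0, 200))
    in_age_range = low <= ds["demographics"]["age"] <= high
    shared = set(audience.get("categories", [])) & set(ds["preferences"]["categories"])
    return in_age_range and bool(shared)


# Invented sample values.
home_ds = {
    "preferences": {"categories": ["news", "basketball"]},
    "demographics": {"age": 34},
}
hotel_ds = import_user_ds(export_user_ds(home_ds))  # same preferences on another device
```

Because the exported text is self-contained, any device that understands the same field names can rebuild the preferences, which is the sense in which a standardized scheme makes preferences portable.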

The system description scheme 22 preferably manages the individual programs and other data. The management may include maintaining lists of programs, categories, channels, users, videos, audio, and images. The management may include the capabilities of a device for providing the audio, video, and/or images. Such capabilities may include, for example, screen size, stereo, AC3, DTS, color, black/white, etc. The management may also include relationships between any one or more of the user, the audio, and the images in relation to one or more of a program description scheme(s) and a user description scheme(s). In a similar manner the management may include relationships between one or more of the program description scheme(s) and user description scheme(s). It is to be understood that in one of the embodiments of the invention merely the system description scheme is included.

The descriptors of the program description scheme and the user description scheme should overlap, at least partially, so that potential desirability of the program can be determined by comparing descriptors representative of the same information. For example, the program and user description scheme may include the same set of categories and actors. The program description scheme has no knowledge of the user description scheme, and vice versa, so that each description scheme is not dependent on the other for its existence. It is not necessary for the description schemes to be fully populated. It is also beneficial not to include the program description scheme with the user description scheme because there will likely be thousands of programs with associated description schemes which, if combined with the user description scheme, would result in an unnecessarily large user description scheme. It is desirable to keep the user description scheme small so that it is more readily portable. Accordingly, a system including only the program description scheme and the user description scheme would be beneficial.
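
Comparing overlapping descriptors can be sketched as follows. The scoring rule here (the fraction of shared fields on which the two schemes have at least one value in common) is an assumption chosen for illustration, as are the function name and sample values; the description does not prescribe a particular comparison.

```python
def desirability(program_ds: dict, user_ds: dict) -> float:
    """Fraction of shared descriptor fields on which the two schemes agree."""
    # Only fields present in both schemes can be compared.
    shared_fields = program_ds.keys() & user_ds.keys()
    if not shared_fields:
        return 0.0
    hits = sum(1 for f in shared_fields if set(program_ds[f]) & set(user_ds[f]))
    return hits / len(shared_fields)


# Neither scheme refers to the other, and neither is fully populated.
program_ds = {"categories": ["sports", "basketball"], "actors": [], "rating": ["TV-G"]}
user_ds = {"categories": ["basketball", "news"], "actors": ["A. Actor"],
           "radio_presets": ["jazz"]}

score = desirability(program_ds, user_ds)
```

Fields that appear in only one scheme, such as `rating` or `radio_presets` here, are simply ignored, so each scheme can evolve independently.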

The user description scheme and the system description scheme should include at least partially overlapping fields. With overlapping fields the system can capture the desired information, which would otherwise not be recognized as desirable. The system description scheme preferably includes a list of users and available programs. Based on the master list of available programs, and associated program description scheme, the system can match the desired programs. It is also beneficial not to include the system description scheme with the user description scheme because there will likely be thousands of programs stored in the system description schemes which, if combined with the user description scheme, would result in an unnecessarily large user description scheme. It is desirable to keep the user description scheme small so that it is more readily portable. For example, the user description scheme may include radio station preselected frequencies and/or types of stations, while the system description scheme includes the radio stations available in particular cities. When traveling to a different city the user description scheme together with the system description scheme will permit reprogramming the radio stations. Accordingly, a system including only the system description scheme and the user description scheme would be beneficial.
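
The radio example can be made concrete with a small Python sketch. The data layout, the function name `reprogram_presets`, the placeholder city names, and the frequencies are all invented for illustration and do not describe real stations.

```python
def reprogram_presets(user_ds: dict, system_ds: dict, city: str) -> list:
    """Pick one local frequency for each station type the user prefers."""
    local_stations = system_ds["stations"].get(city, [])
    presets = []
    for wanted_type in user_ds["station_types"]:
        for station in local_stations:
            if station["type"] == wanted_type:
                presets.append(station["frequency"])
                break  # take the first local station of the wanted type
    return presets


# The small, portable part: what kinds of stations the user likes.
user_ds = {"station_types": ["news", "jazz", "classical"]}

# The device-side part: which stations exist in which city.
system_ds = {"stations": {
    "City A": [{"frequency": 91.5, "type": "news"}, {"frequency": 89.1, "type": "jazz"}],
    "City B": [{"frequency": 94.9, "type": "news"}, {"frequency": 98.1, "type": "classical"}],
}}
```

The user description stays the same in both cities; only the system description differs, so the presets can be rebuilt on arrival without any input from the user.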

The program description scheme and the system description scheme should include at least partially overlapping fields. With the overlapping fields, the system description scheme will be capable of storing the information contained within the program description scheme, so that the information is properly indexed. With proper indexing, the system is capable of matching such information with the user information, if available, for obtaining and recording suitable programs. If the program description scheme and the system description scheme were not overlapping then no information would be extracted from the programs and stored. System capabilities specified within the system description scheme of a particular viewing system can be correlated with a program description scheme to determine the views that can be supported by the viewing system. For instance, if the viewing device is not capable of playing back video, its system description scheme may describe its viewing capabilities as limited to keyframe view and slide view only. Program description scheme of a particular program and system description scheme of the viewing system are utilized to present the appropriate views to the viewing system. Thus, a server of programs serves the appropriate views according to a particular viewing system's capabilities, which may be communicated over a network or communication channel connecting the server with user's viewing device. It is preferred to maintain the program description scheme separate from the system description scheme because the content providers repackage the content and description schemes in different styles, times, and formats. Preferably, the program description scheme is associated with the program, even if displayed at a different time. Accordingly, a system including only the system description scheme and the program description scheme would be beneficial.
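
Correlating device capabilities with the views a program offers amounts to an intersection, as in the following Python sketch. The key names `views` and `supported_views` and the view labels are assumptions made for the example.

```python
def servable_views(program_ds: dict, system_ds: dict) -> list:
    """Views the program offers that the device reports it can present, in program order."""
    supported = set(system_ds["supported_views"])
    return [view for view in program_ds["views"] if view in supported]


program_ds = {"views": ["thumbnail", "key_frame", "slide", "highlight", "close_up"]}

# A device that cannot play back video describes itself as limited to still views.
still_device_ds = {"supported_views": ["key_frame", "slide"]}

views_for_device = servable_views(program_ds, still_device_ds)
```

A server holding `program_ds` could run this check after receiving the device's system description over the network and send only the views in the result.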

By preferably maintaining the independence of each of the three description schemes while having fields that correlate the same information, the programs 10, the users 14, and the system 12 may be interchanged with one another while maintaining the functionality of the entire system 16. Referring to FIG. 2, the audio, visual, or audiovisual program 38 is received by the system 16. The program 38 may originate at any suitable source, such as, for example, broadcast television, cable television, satellite television, digital television, Internet broadcasts, world wide web, digital video discs, still images, video cameras, laser discs, magnetic media, computer hard drive, video tape, audio tape, data services, radio broadcasts, and microwave communications. The program description stream may originate from any suitable source, such as, for example, PSIP/DVB-SI information in digital television broadcasts, specialized digital television data services, specialized Internet services, world wide web, data files, data over the telephone, and memory, such as computer memory. The program, user, and/or system description scheme may be transported over a network (communication channel). For example, the system description scheme may be transported to the source to provide the source with views or other capabilities that the device is capable of using. In response, the source provides the device with image, audio, and/or video content customized or otherwise suitable for the particular device. The system 16 may include any device(s) suitable to receive any one or more of such programs 38. An audiovisual program analysis module 42 performs an analysis of the received programs 38 to extract and provide program related information (descriptors) to the description scheme (DS) generation module 44. 
The program related information may be extracted from the data stream including the program 38 or obtained from any other source, such as for example data transferred over a telephone line, data already transferred to the system 16 in the past, or data from an associated file. The program related information preferably includes data defining both the program views and the program profiles available for the particular program 38 . The analysis module 42 performs an analysis of the programs 38 using information obtained from (i) automatic audio-video analysis methods on the basis of low-level features that are extracted from the program(s), (ii) event detection techniques, (iii) data that is available (or extractable) from data sources or electronic program guides (EPGs, DVB-SI, and PSIP), and (iv) user information obtained from the user description scheme 20 to provide data defining the program description scheme.

The selection of a particular program analysis technique depends on the amount of readily available data and the user preferences. For example, if a user prefers to watch a 5 minute video highlight of a particular program, such as a basketball game, the analysis module 42 may invoke a knowledge based system 90 (FIG. 3) to determine the highlights that form the best 5 minute summary. The knowledge based system 90 may invoke a commercial filter 92 to remove commercials and a slow motion detector 54 to assist in creating the video summary. The analysis module 42 may also invoke other modules to bring information together (e.g., textual information) to author particular program views. For example, if the program 38 is a home video where there is no further information available then the analysis module 42 may create a key-frame summary by identifying key-frames of a multi-level summary and passing the information to be used to generate the program views, and in particular a key frame view, to the description scheme. Referring also to FIG. 3, the analysis module 42 may also include other sub-modules, such as for example, a de-mux/decoder 60 , a data and service content analyzer 62 , a text processing and text summary generator 64 , a close caption analyzer 66 , a title frame generator 68 , an analysis manager 70 , an audiovisual analysis and feature extractor 72 , an event detector 74 , a key-frame summarizer 76 , and a highlight summarizer 78 .
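
The way the choice of analysis technique depends on available data and user preferences can be sketched as a simple planning function. The step names below echo the sub-modules named above, but the function `plan_analysis`, its inputs, and the exact rules are hypothetical and shown only to illustrate the decision.

```python
def plan_analysis(available_metadata: dict, user_prefs: dict) -> list:
    """Return an ordered list of analysis steps for one program."""
    steps = []
    minutes = user_prefs.get("highlight_minutes")
    if minutes:
        # A requested highlight calls for commercial removal and slow-motion cues.
        steps += ["commercial_filter", "slow_motion_detector",
                  f"highlight_summarizer({minutes} min)"]
    if available_metadata:
        # Textual information can be brought in to author program views.
        steps.append("text_summary_generator")
    else:
        # No further information available, e.g. a home video: use key frames.
        steps.append("key_frame_summarizer")
    return steps


home_video_plan = plan_analysis({}, {})
game_plan = plan_analysis({"epg": "basketball game"}, {"highlight_minutes": 5})
```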

The generation module 44 receives the system information 46 for the system description scheme. The system information 46 preferably includes data for the system description scheme 22 generated by the generation module 44 . The generation module 44 also receives user information 48 including data for the user description scheme. The user information 48 preferably includes data for the user description scheme generated within the generation module 44 . The user input 48 may include, for example, meta information to be included in the program and system description scheme. The user description scheme (or corresponding information) is provided to the analysis module 42 for selective analysis of the program(s) 38 . For example, the user description scheme may be suitable for triggering the highlight generation functionality for a particular program and thus generating the preferred views and storing associated data in the program description scheme. The generation module 44 and the analysis module 42 provide data to a data storage unit 50 . The storage unit 50 may be any storage device, such as memory or magnetic media.

A search, filtering, and browsing (SFB) module 52 implements the description scheme technique by parsing and extracting information contained within the description scheme. The SFB module 52 may perform filtering, searching, and browsing of the programs 38, on the basis of the information contained in the description schemes. An intelligent software agent is preferably included within the SFB module 52 that gathers and provides user specific information to the generation module 44 to be used in authoring and updating the user description scheme (through the generation module 44). In this manner, desirable content may be provided to the user through a display 80. The selections of the desired program(s) to be retrieved, stored, and/or viewed may be programmed, at least in part, through a graphical user interface 82. The graphical user interface may also include or be connected to a presentation engine for presenting the information to the user through the graphical user interface.
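
A minimal Python sketch of an SFB-style module follows. It assumes the description schemes are plain dictionaries; the class name `SFBModule`, its methods, and the sample programs are illustrative and not taken from the description. The `record_viewing` method stands in for the agent step that feeds user-specific information back into the user description scheme.

```python
class SFBModule:
    """Hypothetical search, filtering, and browsing helper over description schemes."""

    def __init__(self, program_descriptions: dict, user_ds: dict):
        self.programs = program_descriptions  # title -> program description
        self.user_ds = user_ds

    def filter(self) -> list:
        """Titles whose categories overlap the user's preferred categories."""
        liked = set(self.user_ds["categories"])
        return sorted(t for t, d in self.programs.items() if liked & set(d["categories"]))

    def search(self, keyword: str) -> list:
        """Titles carrying the keyword, compared case-insensitively."""
        kw = keyword.lower()
        return sorted(t for t, d in self.programs.items()
                      if any(kw == k.lower() for k in d["keywords"]))

    def record_viewing(self, title: str) -> None:
        """Agent step: add what was watched to the user's usage history."""
        self.user_ds.setdefault("usage_history", []).append(title)


programs = {
    "Bulls game": {"categories": ["sports", "basketball"], "keywords": ["Chicago", "Bulls"]},
    "Trial update": {"categories": ["news"], "keywords": ["Microsoft", "trial"]},
    "Cooking hour": {"categories": ["cooking"], "keywords": ["pasta"]},
}
user_ds = {"categories": ["basketball", "news"]}
sfb = SFBModule(programs, user_ds)
```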

The intelligent management and consumption of audiovisual information using the multi-part description stream device provides a next-generation device suitable for the modern era of information overload. The device responds to changing lifestyles of individuals and families, and allows everyone to obtain the information they desire anytime and anywhere they want.

An example of the use of the device may be as follows. A user comes home from work late Friday evening being happy the work week is finally over. The user desires to catch up with the events of the world and then watch ABC's 20/20 show later that evening. It is now 9 PM and the 20/20 show will start in an hour at 10 PM. The user is interested in the sporting events of the week, and all the news about the Microsoft case with the Department of Justice. The user description scheme may include a profile indicating a desire that the particular user wants to obtain all available information regarding the Microsoft trial and selected sporting events for particular teams. In addition, the system description scheme and program description scheme provide information regarding the content of the available information that may selectively be obtained and recorded. The system, in an autonomous manner, periodically obtains and records the audiovisual information that may be of interest to the user during the past week based on the three description schemes. The device most likely has recorded more than one hour of audiovisual information so the information needs to be condensed in some manner. The user starts interacting with the system with a pointer or voice commands to indicate a desire to view recorded sporting programs. On the display, the user is presented with a list of recorded sporting events including Basketball and Soccer. Apparently the user's favorite Football team did not play that week because it was not recorded. The user is interested in basketball games and indicates a desire to view games. A set of title frames is presented on the display that captures an important moment of each game. The user selects the Chicago Bulls game and indicates a desire to view a 5 minute highlight of the game. The system automatically generates highlights. 
The highlights may be generated by audio or video analysis, or the program description scheme includes data indicating the frames that are presented for a 5 minute highlight. The system may have also recorded web-based textual information regarding the particular Chicago Bulls game which may be selected by the user for viewing. If desired, the summarized information may be recorded onto a storage device, such as a DVD with a label. The stored information may also include an index code so that it can be located at a later time. After viewing the sporting events the user may decide to read the news about the Microsoft trial. It is now 9:50 PM and the user is done viewing the news. In fact, the user has selected to delete all the recorded news items after viewing them. The user then remembers to do one last thing before 10 PM in the evening. The next day, the user desires to watch the VHS tape that he received from his brother that day, containing footage about his brother's new baby girl and his vacation to Peru last summer. The user wants to watch the whole 2-hour tape but he is anxious to see what the baby looks like and also the new stadium built in Lima, which was not there last time he visited Peru. The user plans to take a quick look at a visual summary of the tape, browse, and perhaps watch a few segments for a couple of minutes, before the user takes his daughter to her piano lesson at 10 AM the next morning. The user plugs the tape into his VCR, which is connected to the system, and invokes the summarization functionality of the system to scan the tape and prepare a summary. The user can then view the summary the next morning to quickly discover the baby's looks, and play back segments between the key-frames of the summary to catch a glimpse of the crying baby. The system may also record the tape content onto the system hard drive (or storage device) so the video summary can be viewed quickly. 
It is now 10:10 PM, and it seems that the user is 10 minutes late for viewing 20/20. Fortunately, the system, based on the three description schemes, has already been recording 20/20 since 10 PM. Now the user can start watching the recorded portion of 20/20 as the recording of 20/20 proceeds. The user will be done viewing 20/20 at 11:10 PM.

The average consumer has an ever increasing number of multimedia devices, such as a home audio system, a car stereo, several home television sets, web browsers, etc. The user currently has to customize each of the devices for optimal viewing and/or listening preferences. By storing the user preferences on a removable storage device, such as a smart card, the user may insert the card including the user preferences into such media devices for automatic customization. This results in the desired programs being automatically recorded on the VCR, and setting of the radio stations for the car stereo and home audio system. In this manner the user only has to specify his preferences at most once, on a single device, and subsequently the descriptors are automatically uploaded into devices by the removable storage device. The user description scheme may also be loaded into other devices using a wired or wireless network connection, e.g., that of a home network. Alternatively, the system can store the user history and create entries in the user description scheme based on the user's audio and video viewing habits. In this manner, the user would never need to program the viewing information to obtain desired information. In a sense, the user description scheme enables modeling of the user by providing a central storage for the user's listening, viewing, and browsing preferences, and the user's behavior. This enables devices to be quickly personalized, and enables other components, such as intelligent agents, to communicate on the basis of a standardized description format, and to make smart inferences regarding the user's preferences.
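
Creating entries in the user description scheme from stored history could look like the following Python sketch. The rule used here (keep any category viewed at least `min_views` times) is an assumption chosen for simplicity, and the function name and sample history are invented.

```python
from collections import Counter


def infer_categories(usage_history: list, min_views: int = 2) -> list:
    """Categories seen at least min_views times, most frequently viewed first."""
    counts = Counter(cat for entry in usage_history for cat in entry["categories"])
    return [cat for cat, n in counts.most_common() if n >= min_views]


usage_history = [
    {"title": "Game 1", "categories": ["sports"]},
    {"title": "Evening news", "categories": ["news"]},
    {"title": "Game 2", "categories": ["sports"]},
    {"title": "Morning news", "categories": ["news"]},
    {"title": "Game 3", "categories": ["sports"]},
    {"title": "Baking", "categories": ["cooking"]},
]

# The user never states a preference; it is derived from what was watched.
user_ds = {"categories": infer_categories(usage_history)}
```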

Many different realizations and applications can be readily derived from FIGS. 2 and 3 by appropriately organizing and utilizing their different parts, or by adding peripherals and extensions as needed. In its most general form, FIG. 2 depicts an audiovisual searching, filtering, browsing, and/or recording appliance that is personalizable. The list of more specific applications/implementations given below is not exhaustive but covers a range.

The user description scheme is a major enabler for personalizable audiovisual appliances. If the structure (syntax and semantics) of the description schemes is known amongst multiple appliances, the user can carry (or otherwise transfer) the information contained within his user description scheme from one appliance to another, perhaps via a smart card (where these appliances support a smart card interface), in order to personalize them. Personalization can range from device settings, such as display contrast and volume control, to settings of television channels, radio stations, web stations, web sites, geographic information, and demographic information such as age, zip code, etc. Appliances that can be personalized may access content from different sources. They may be connected to the web, terrestrial or cable broadcast, etc., and they may also access multiple or different types of single media such as video, music, etc.

For example, one can personalize the car stereo using a smart card removed from the home system and plugged into the car stereo system to be able to tune to favorite stations at certain times. As another example, one can also personalize television viewing, for example, by plugging the smart card into a remote control that in turn will autonomously command the television receiving system to present to the user information about current and future programs that fit the user's preferences. Different members of the household can instantly personalize the viewing experience by inserting their own smart card into the family remote. In the absence of such a remote, this same type of personalization can be achieved by plugging the smart card directly into the television system. The remote may likewise control audio systems. In another implementation, the television receiving system holds user description schemes for multiple users in local storage and identifies different users (or groups of users) by using an appropriate input interface, for example an interface using user-voice identification technology. It is noted that in a networked system the user description scheme may be transported over the network.

The user description scheme is generated by direct user input, and by using software that watches the user to determine his/her usage pattern and usage history. The user description scheme can be updated in a dynamic fashion by the user or automatically. A well defined and structured description scheme design allows different devices to interoperate with each other. A modular design also provides portability.

The description scheme adds new functionality to that of the current VCR. An advanced VCR system can learn from the user via direct input of preferences, or by watching the usage pattern and history of the user. The user description scheme holds the user's preferences and usage history. An intelligent agent can then consult the user description scheme and obtain information that it needs for acting on behalf of the user. Through the intelligent agent, the system acts on behalf of the user to discover programs that fit the taste of the user, alert the user about such programs, and/or record them autonomously. An agent can also manage the storage in the system according to the user description scheme, i.e., prioritizing the deletion of programs (or alerting the user for transfer to removable media), or determining their compression factor (which directly impacts their visual quality) according to the user's preferences and history.
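
Storage management by an agent can be sketched as an ordering over recordings. The rule below (delete programs outside the user's preferred categories first, and older recordings before newer ones) is only one plausible policy and is an assumption of this example, as are the function name and the sample data.

```python
def deletion_order(recordings: list, user_ds: dict) -> list:
    """Titles ordered from first-to-delete to last-to-delete."""
    liked = set(user_ds["categories"])

    def keep_score(rec: dict) -> tuple:
        # Preferred programs sort later; within a group, older ones sort earlier.
        preferred = 1 if liked & set(rec["categories"]) else 0
        return (preferred, rec["recorded_day"])

    return [rec["title"] for rec in sorted(recordings, key=keep_score)]


recordings = [
    {"title": "Monday news", "categories": ["news"], "recorded_day": 1},
    {"title": "Wednesday game", "categories": ["basketball"], "recorded_day": 3},
    {"title": "Tuesday cooking", "categories": ["cooking"], "recorded_day": 2},
    {"title": "Monday game", "categories": ["basketball"], "recorded_day": 1},
]
user_ds = {"categories": ["basketball"]}

order = deletion_order(recordings, user_ds)
```

The same score could drive other decisions mentioned above, such as applying stronger compression to the recordings nearest the front of the list.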

The program description scheme and the system description scheme work in collaboration with the user description scheme in achieving some tasks. In addition, the program description scheme and system description scheme in an advanced VCR or other system will enable the user to browse, search, and filter audiovisual programs. Browsing in the system offers capabilities that are well beyond fast forwarding and rewinding. For instance, the user can view a thumbnail view of different categories of programs stored in the system. The user then may choose frame view, shot view, key frame view, or highlight view, depending on their availability and user's preference. These views can be readily invoked using the relevant information in the program description scheme, especially in program views. The user at any time can start viewing the program either in parts, or in its entirety.

In this application, the program description scheme may be readily available from many sources, such as: (i) from broadcast (carried by EPG defined as a part of ATSC-PSIP (ATSC-Program Service Integration Protocol) in USA or DVB-SI (Digital Video Broadcast-Service Information) in Europe); (ii) from specialized data services (in addition to PSIP/DVB-SI); (iii) from specialized web sites; (iv) from the media storage unit containing the audiovisual content (e.g., DVD); (v) from advanced cameras (discussed later), and/or may be generated (i.e., for programs that are being stored) by the analysis module 42 or by user input 48.

Contents of digital still and video cameras can be stored and managed by a system that implements the description schemes, e.g., a system as shown in FIG. 2. Advanced cameras can store a program description scheme, for instance, in addition to the audiovisual content itself. The program description scheme can be generated either in part or in its entirety on the camera itself via an appropriate user input interface (e.g., speech, visual menu drive, etc.). Users can input the program description scheme information to the camera, especially high-level (or semantic) information that may otherwise be difficult for the system to extract automatically. Some camera settings and parameters (e.g., date and time), as well as quantities computed in the camera (e.g., color histogram to be included in the color profile), can also be used in generating the program description scheme. Once the camera is connected, the system can browse the camera content, or transfer the camera content and its description scheme to the local storage for future use. It is also possible to update or add information to the description scheme generated in the camera.

The IEEE 1394 and HAVi standard specifications enable this type of “audiovisual content” centric communication among devices. The description scheme APIs can be used in the context of HAVi to browse and/or search the contents of a camera or a DVD that also contains a description scheme associated with its content, i.e., doing more than merely invoking the PLAY API to play back and linearly view the media.

The description schemes may be used in archiving audiovisual programs in a database. The search engine uses the information contained in the program description scheme to retrieve programs on the basis of their content. The program description scheme can also be used in navigating through the contents of the database or the query results. The user description scheme can be used in prioritizing the results of the user query during presentation. It is possible, of course, to make the program description scheme more comprehensive depending on the nature of the particular application.
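By way of illustration, retrieval on the basis of the program description followed by prioritization using the user description may be sketched as below. The dictionary layout, the functions search and prioritize, and the ordered list of preferred categories are assumptions made for this sketch and are not defined by the description schemes themselves.

```python
def search(programs, query_keyword):
    """Content-based retrieval: match the query against each program's keywords."""
    q = query_keyword.lower()
    return [p for p in programs if q in (k.lower() for k in p["keywords"])]


def prioritize(results, preferred_categories):
    """Order results so that the user's preferred categories come first.

    preferred_categories is an ordered list (most preferred first),
    assumed to be taken from the user description scheme. Categories
    not in the list are placed last; ties keep their original order
    because sorted() is stable.
    """
    def rank(p):
        cat = p["category"]
        if cat in preferred_categories:
            return preferred_categories.index(cat)
        return len(preferred_categories)

    return sorted(results, key=rank)


database = [
    {"name": "Evening News", "category": "news", "keywords": ["election", "weather"]},
    {"name": "Cup Final", "category": "sports", "keywords": ["football"]},
    {"name": "Ballot Box", "category": "documentary", "keywords": ["Election", "history"]},
]
hits = search(database, "election")
ranked = prioritize(hits, ["documentary", "news"])
print([p["name"] for p in ranked])
```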

The description scheme fulfills the user's desire to have applications that pay attention and are responsive to their viewing and usage habits, preferences, and personal demographics. The proposed user description scheme directly addresses this desire in its selection of fields and interrelationship to other description schemes. Because the description schemes are modular in nature, the user can port his user description scheme from one device to another in order to “personalize” the device.

The proposed description schemes can be incorporated into current products similar to those from TiVo and Replay TV in order to extend their entertainment and informational value. In particular, the description scheme will enable audiovisual browsing and searching of programs and enable filtering within a particular program by supporting multiple program views such as the highlight view. In addition, the description scheme will handle programs coming from sources other than television broadcasts, which TiVo and Replay TV are not designed to handle. In addition, by standardization of TiVo and Replay TV type devices, other products may be interconnected to such devices to extend their capabilities, such as devices supporting an MPEG-7 description. MPEG-7 is the Moving Pictures Experts Group-7, acting to standardize descriptions and description schemes for audiovisual information. The device may also be extended to be personalized by multiple users, as desired.

Because the description scheme is defined, the intelligent software agents can communicate among themselves to make intelligent inferences regarding the user's preferences. In addition, the development and upgrade of intelligent software agents for browsing and filtering applications can be simplified based on the standardized user description scheme.

The description scheme is multi-modal in the sense that it holds both high level (semantic) and low level features and/or descriptors. For example, an actor name is a high level descriptor and motion model parameters are low level descriptors. High level descriptors are easily readable by humans, while low level descriptors are more easily read by machines and less understandable by humans. The program description scheme can be readily harmonized with existing EPG, PSIP, and DVB-SI information, facilitating search and filtering of broadcast programs. Existing services can be extended in the future by incorporating additional information using the compliant description scheme.

For example, one case may include audiovisual programs that are prerecorded on a medium such as a digital video disc, where the digital video disc also contains a description scheme that has the same syntax and semantics as the description scheme that the FSB module uses. If the FSB module uses a different description scheme, a transcoder (converter) of the description scheme may be employed. The user may want to browse and view the content of the digital video disc. In this case, the user may not need to invoke the analysis module to author a program description. However, the user may want to invoke his or her user description scheme in filtering, searching and browsing the digital video disc content. Other sources of program information may likewise be used in the same manner.
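A transcoder of the kind mentioned above might, in the simplest case, only rename descriptors. The following sketch assumes a hypothetical foreign scheme whose tag names (Programme, Id, Name, Url) differ from those of the scheme used by the FSB module; the mapping table and the function transcode are illustrative assumptions, and a practical converter would generally also need to convert values and structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a foreign scheme's tag names to the
# descriptor names used in this document.
TAG_MAP = {
    "Programme": "ProgramIdentity",
    "Id": "ProgramID",
    "Name": "ProgramName",
    "Url": "SourceLocation",
}


def transcode(xml_text, tag_map):
    """Rename every element whose tag appears in tag_map; leave others as-is."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = tag_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


FOREIGN = "<Programme><Id>42</Id><Name>News</Name><Url>http://example.com/a.mpg</Url></Programme>"
converted = transcode(FOREIGN, TAG_MAP)
print(converted)
```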

It is to be understood that any of the techniques described herein with relation to video are equally applicable to images (such as still image or a frame of a video) and audio (such as radio).

An example of an audiovisual interface suitable for the preferred audiovisual description scheme is shown in FIGS. 4–12. Referring to FIG. 4, selecting the thumbnail function as a function of category provides a display with a set of categories on the left hand side. Selecting a particular category, such as news, provides a set of thumbnail views of different programs that are currently available for viewing. In addition, the different programs may also include programs that will be available at a different time for viewing. The thumbnail views are short video segments that provide an indication of the content of the respective actual program to which each corresponds. Referring to FIG. 5, a thumbnail view of available programs in terms of channels may be displayed, if desired. Referring to FIG. 6, a text view of available programs in terms of channels may be displayed, if desired. Referring to FIG. 7, a frame view of particular programs may be displayed, if desired. A representative frame is displayed in the center of the display with a set of representative frames of different programs in the left hand column. The frequency of the frames may be selected, as desired. Also, a set of frames is displayed on the lower portion of the display representative of different frames during the particular selected program. Referring to FIG. 8, a shot view of particular programs may be displayed, as desired. A representative frame of a shot is displayed in the center of the display with a set of representative frames of different programs in the left hand column. Also, a set of shots is displayed on the lower portion of the display representative of different shots (segments of a program, typically sequential in nature) during the particular selected program. Referring to FIG. 9, a key frame view of particular programs may be displayed, as desired. A representative frame is displayed in the center of the display with a set of representative frames of different programs in the left hand column. Also, a set of key frame views is displayed on the lower portion of the display representative of different key frame portions during the particular selected program. The number of key frames in each key frame view can be adjusted by selecting the level. Referring to FIG. 10, a highlight view may likewise be displayed, as desired. Referring to FIG. 11, an event view may likewise be displayed, as desired. Referring to FIG. 12, a character/object view may likewise be displayed, as desired.

An example of the description schemes is shown below in XML. The description scheme may be implemented in any language and include any of the included descriptions (or more), as desired.

The proposed program description scheme includes three major sections for describing a video program. The first section identifies the described program. The second section defines a number of views which may be useful in browsing applications. The third section defines a number of profiles which may be useful in filtering and search applications. Therefore, the overall structure of the proposed description scheme is as follows:

<?xml version="1.0"?>
<!DOCTYPE MPEG-7 SYSTEM "mpeg-7.dtd">
<ProgramIdentity>
<ProgramID> ... </ProgramID>
<ProgramName> ... </ProgramName>
<SourceLocation> ... </SourceLocation>
</ProgramIdentity>
<ProgramViews>
<ThumbnailView> ... </ThumbnailView>
<SlideView> ... </SlideView>
<FrameView> ... </FrameView>
<ShotView> ... </ShotView>
<KeyFrameView> ... </KeyFrameView>
<HighlightView> ... </HighlightView>
<EventView> ... </EventView>
<CloseUpView> ... </CloseUpView>
<AlternateView> ... </AlternateView>
</ProgramViews>
<ProgramProfiles>
<GeneralProfile> ... </GeneralProfile>
<CategoryProfile> ... </CategoryProfile>
<DateTimeProfile> ... </DateTimeProfile>
<KeywordProfile> ... </KeywordProfile>
<TriggerProfile> ... </TriggerProfile>
<StillProfile> ... </StillProfile>
<EventProfile> ... </EventProfile>
<CharacterProfile> ... </CharacterProfile>
<ObjectProfile> ... </ObjectProfile>
<ColorProfile> ... </ColorProfile>
<TextureProfile> ... </TextureProfile>
<ShapeProfile> ... </ShapeProfile>
<MotionProfile> ... </MotionProfile>
</ProgramProfiles>
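
As an illustration of how an application might read the three sections, the sketch below parses a small instance with Python's standard xml.etree.ElementTree module. The enclosing <Program> root element, the sample values, and the example URL are assumptions added only so that the instance is a well-formed XML document; they are not part of the structure listed above.

```python
import xml.etree.ElementTree as ET

# A small instance of the three-section structure. The <Program> wrapper
# and all values are hypothetical, chosen only for illustration.
DOCUMENT = """
<Program>
<ProgramIdentity>
<ProgramID> prog-001 </ProgramID>
<ProgramName> Evening News </ProgramName>
<SourceLocation> http://example.com/news.mpg </SourceLocation>
</ProgramIdentity>
<ProgramViews>
<FrameView> 0 53999 </FrameView>
</ProgramViews>
<ProgramProfiles>
<CategoryProfile> news </CategoryProfile>
<KeywordProfile> election weather </KeywordProfile>
</ProgramProfiles>
</Program>
""".strip()

root = ET.fromstring(DOCUMENT)

# The three major sections, in document order.
sections = [child.tag for child in root]

# Identification descriptors as a plain dictionary.
identity = {e.tag: e.text.strip() for e in root.find("ProgramIdentity")}

# Keywords usable for filtering and search.
keywords = root.find("ProgramProfiles/KeywordProfile").text.split()

print(sections, identity, keywords)
```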

Program Identity

Program ID

  • <ProgramID> program-id </ProgramID>

The descriptor <ProgramID> contains a number or a string to identify a program.

Program Name

  • <ProgramName> program-name </ProgramName>

The descriptor <ProgramName> specifies the name of a program.

Source Location

  • <SourceLocation> source-url </SourceLocation>

The descriptor <SourceLocation> specifies the location of a program in URL format.

Program Views

Thumbnail View

<ThumbnailView>
<Image> thumbnail-image </Image>
</ThumbnailView>

The descriptor <ThumbnailView> specifies an image as the thumbnail representation of a program.

Slide View

  • <SlideView> frame-id . . . </SlideView>

The descriptor <SlideView> specifies a number of frames in a program which may be viewed as snapshots or in a slide show manner.

Frame View

  • <FrameView> start-frame-id end-frame-id </FrameView>

The descriptor <FrameView> specifies the start and end frames of a program. This is the most basic view of a program and any program has a frame view.

Shot View

<ShotView>
<Shot id=""> start-frame-id end-frame-id display-frame-id </Shot>
<Shot id=""> start-frame-id end-frame-id display-frame-id </Shot>
...
</ShotView>

The descriptor <ShotView> specifies a number of shots in a program. The <Shot> descriptor defines the start and end frames of a shot. It may also specify a frame to represent the shot.
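
A shot view of this form may be read and queried as sketched below. The sketch assumes that frame identifiers are integers and that the display frame is optional; the names Shot, parse_shot_view, and shot_at, and the sample values, are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class Shot(NamedTuple):
    shot_id: str
    start: int
    end: int
    display: Optional[int]  # representative frame, if one is given


def parse_shot_view(xml_text):
    """Turn each <Shot> into a Shot tuple (assumes integer frame ids)."""
    shots = []
    for element in ET.fromstring(xml_text).findall("Shot"):
        fields = [int(v) for v in element.text.split()]
        display = fields[2] if len(fields) > 2 else None
        shots.append(Shot(element.get("id"), fields[0], fields[1], display))
    return shots


def shot_at(shots, frame):
    """Return the shot containing the given frame, or None."""
    for shot in shots:
        if shot.start <= frame <= shot.end:
            return shot
    return None


SHOT_VIEW = """
<ShotView>
<Shot id="1"> 0 249 120 </Shot>
<Shot id="2"> 250 899 </Shot>
</ShotView>
""".strip()

shots = parse_shot_view(SHOT_VIEW)
print(shot_at(shots, 300))
```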

Key-Frame View

<KeyFrameView>
<KeyFrames level="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</KeyFrames>
<KeyFrames level="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</KeyFrames>
...
</KeyFrameView>

The descriptor <KeyFrameView> specifies key frames in a program. The key frames may be organized in a hierarchical manner and the hierarchy is captured by the descriptor <KeyFrames> with a level attribute. The clips which are associated with each key frame are defined by the descriptor <Clip>. Here the display frame in each clip is the corresponding key frame.
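
The hierarchy may be used, for example, to show more or fewer key frames on request. The following sketch assumes integer levels in which a higher level holds a finer set of key frames, and integer frame identifiers; the function key_frames and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def key_frames(xml_text, level):
    """Return the key frames (display-frame-id of each clip) at one level.

    Returns an empty list when the requested level is not present.
    """
    root = ET.fromstring(xml_text)
    for group in root.findall("KeyFrames"):
        if group.get("level") == str(level):
            # The third field of each clip is its display (key) frame.
            return [int(clip.text.split()[2]) for clip in group.findall("Clip")]
    return []


KEY_FRAME_VIEW = """
<KeyFrameView>
<KeyFrames level="1">
<Clip id="1"> 0 299 150 </Clip>
<Clip id="2"> 300 599 450 </Clip>
</KeyFrames>
<KeyFrames level="2">
<Clip id="1"> 0 149 75 </Clip>
<Clip id="2"> 150 299 225 </Clip>
<Clip id="3"> 300 449 375 </Clip>
<Clip id="4"> 450 599 525 </Clip>
</KeyFrames>
</KeyFrameView>
""".strip()

coarse = key_frames(KEY_FRAME_VIEW, 1)
fine = key_frames(KEY_FRAME_VIEW, 2)
print(coarse, fine)
```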

Highlight View

<HighlightView>
<Highlight length="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</Highlight>
<Highlight length="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</Highlight>
...
</HighlightView>

The descriptor <HighlightView> specifies clips that form highlights of a program. A program may have different versions of highlights which are tailored to various time lengths. The clips are grouped by highlight version, each of which is specified by the descriptor <Highlight> with a length attribute.
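
One possible use of the length attribute is to select the highlight version that best fits the time the user has available. The sketch below assumes that length is expressed in whole seconds and that frame identifiers are integers, neither of which is fixed by the listing above; the function pick_highlight and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def pick_highlight(xml_text, available_seconds):
    """Pick the longest highlight version that fits the available time.

    Returns (length, [(start_frame, end_frame), ...]) or None when no
    version is short enough.
    """
    root = ET.fromstring(xml_text)
    fitting = [
        h for h in root.findall("Highlight")
        if int(h.get("length")) <= available_seconds
    ]
    if not fitting:
        return None
    best = max(fitting, key=lambda h: int(h.get("length")))
    clips = []
    for clip in best.findall("Clip"):
        start, end = (int(v) for v in clip.text.split()[:2])
        clips.append((start, end))
    return int(best.get("length")), clips


HIGHLIGHT_VIEW = """
<HighlightView>
<Highlight length="60">
<Clip id="1"> 100 400 250 </Clip>
<Clip id="2"> 900 1200 1000 </Clip>
</Highlight>
<Highlight length="300">
<Clip id="1"> 100 1500 250 </Clip>
<Clip id="2"> 3000 5000 4000 </Clip>
<Clip id="3"> 8000 9000 8500 </Clip>
</Highlight>
</HighlightView>
""".strip()

short_version = pick_highlight(HIGHLIGHT_VIEW, 120)
long_version = pick_highlight(HIGHLIGHT_VIEW, 600)
print(short_version, long_version)
```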

Event View

<EventView>
<Events name="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</Events>
<Events name="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</Events>
...
</EventView>

The descriptor <EventView> specifies clips which are related to certain events in a program. The clips are grouped into the corresponding events, which are specified by the descriptor <Events> with a name attribute.

Close-Up View

<CloseUpView>
<Target name="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</Target>
<Target name="">
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
<Clip id=""> start-frame-id end-frame-id display-frame-id </Clip>
...
</Target>
...
</CloseUpView>

The descriptor <CloseUpView> specifies clips which may be zoomed in to certain targets in a program. The clips are grouped into the corresponding targets which are specified by the descriptor <Target> with a name attribute.

Alternate View

<AlternateView>
<AlternateSource id=""> source-url </AlternateSource>
<AlternateSource id=""> source-url </AlternateSource>
...
</AlternateView>

The descriptor <AlternateView> specifies sources which may be shown as alternate views of a program. Each alternate view is specified by the descriptor <AlternateSource> with an id attribute. The location of the source may be specified in URL format.

Program Profiles

General Profile

<GeneralProfile>
<Title> title-text </Title>
<Abstract> abstract-text </Abstract>
<Audio> voice-annotation </Audio>
<Www> web-page-url </Www>
<ClosedCaption> yes/no </ClosedCaption>
<Language> language-name </Language>
<Rating> rating </Rating>
<Length> time </Length>
<Authors> author-name ... </Authors>
<Producers> producer-name ... </Producers>
<Directors> director-name ... </Directors>
<Actors> actor-name ... </Actors>
...
</GeneralProfile>

The descriptor <GeneralProfile> describes the general aspects of a program.

Category Profile

  • <CategoryProfile> category-name . . . </CategoryProfile>

The descriptor <CategoryProfile> specifies the categories under which a program may be classified.

Date-Time Profile

<DateTimeProfile>
<ProductionDate> date </ProductionDate>
<ReleaseDate> date </ReleaseDate>
<RecordingDate> date </RecordingDate>
<RecordingTime> time </RecordingTime>
...
</DateTimeProfile>

The descriptor <DateTimeProfile> specifies various date and time information of a program.

Keyword Profile

  • <KeywordProfile> keyword . . . </KeywordProfile>

The descriptor <KeywordProfile> specifies a number of keywords which may be used to filter or search a program.

Trigger Profile

  • <TriggerProfile> trigger-frame-id . . . </TriggerProfile>

The descriptor <TriggerProfile> specifies a number of frames in a program which may be used to trigger certain actions during playback of the program.
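
The use of trigger frames during playback may be sketched as follows. The playback loop, the callback on_trigger, and the assumption that frame identifiers are consecutive integers are all illustrative; the kind of action that is triggered is not specified here.

```python
import xml.etree.ElementTree as ET


def trigger_frames(xml_text):
    """Read the trigger frame ids (assumed to be integers) into a set."""
    return {int(v) for v in ET.fromstring(xml_text).text.split()}


def play(start_frame, end_frame, triggers, on_trigger):
    """Step through the frames and invoke on_trigger at each trigger frame."""
    for frame in range(start_frame, end_frame + 1):
        # Decoding and display of the frame would take place here.
        if frame in triggers:
            on_trigger(frame)


fired = []
profile = "<TriggerProfile> 100 250 </TriggerProfile>"
play(0, 300, trigger_frames(profile), fired.append)
print(fired)
```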

Still Profile

<StillProfile>
<Still id="">
<HotRegion id="">
<Location> x1 y1 x2 y2 </Location>
<Text> text-annotation </Text>
<Audio> voice-annotation </Audio>
<Www> web-page-url </Www>
</HotRegion>
<HotRegion id="">
<Location> x1 y1 x2 y2 </Location>
<Text> text-annotation </Text>
<Audio> voice-annotation </Audio>
<Www> web-page-url </Www>