[0001] Not Applicable.
[0002] All of the material in this patent application is subject to copyright protection under the copyright laws of the United States and of other countries. As of the first effective filing date of the present application, this material is protected as unpublished material. However, permission to copy this material is hereby granted to the extent that the copyright owner has no objection to the facsimile reproduction by anyone of the patent documentation or patent disclosure, as it appears in the United States Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
[0003] 1. Field of the Invention
[0004] The present invention relates to the field of computerized publication of documents, and more particularly to a method for publishing documents using XML on networks such as the World Wide Web, and to publishing documents for different device types such as computers, PDAs, cell phones and print.
[0005] 2. Description of the Related Art
[0006] Web sites often present content which is constantly changing. Presenting current information to the outside world without requiring an inordinate amount of human effort and computing power is a major technical challenge to Web site designers.
[0007] Multimedia content including text, graphics, video and sound on the Internet needs to be highly adaptive. Recently the World Wide Web Consortium (W3C) adopted the Extensible Markup Language (XML) as a universal format for structured documents and data on the Web. The base specification is XML 1.0, a W3C Recommendation of February 1998. See online URL (www.w3.org) for more information. A content management system based on XML along with the Extensible Stylesheet Language (XSL) enforces separation of content and presentation, thus allowing flexible rendering of the content to multiple device types. Similarly, such a content management system allows maximal reuse of information and data through the composition of XML fragments, and ensures data integrity through the consistent use of information.
[0008] In addition to the availability of XML, new interfaces and devices are emerging, the diversity of users is increasing, machines are acting more and more on users' behalf, and net activities are possible for a wide range of business, leisure, education, and research activities.
[0009] Systems and methods are being developed for generating more flexible content and a capability to manage frequent changes to content. One system for achieving maximum flexibility and reuse is disclosed in the patent application entitled “Method and System for Efficiently Constructing And Consistently Publishing Web Documents” filed on Apr. 4, 1999 with application Ser. No. 09/283,542 with inventors JR Challenger et al. now [Pending] and commonly assigned herewith to International Business Machines. Disclosed is a system and method where the multimedia content is broken down into fragments that can be combined into published documents.
[0010] The use of XML in content management systems introduces the following new challenges:
[0011] 1. A need exists to maintain information about the functional and semantic role of each richly tagged fragment. This information describes what the content is about, who the target audience is, and its relationship to a taxonomy or other fragments. The same mechanism should support efficient searches of particular fragments.
[0012] 2. A need exists for an efficient method to track the effects of changes in a particular richly tagged fragment or style and propagate those changes throughout the information space.
[0013] 3. A need exists for a user interface that shields the content contributor from knowing the underlying syntax and complexities of the XML documents.
[0014] 4. A need exists for finding relevant document fragments on demand, keeping track of the dependencies between document fragments, transforming combinations of those document fragments into viewable pages available to multiple device types, and designing a content creation tool that does not overwhelm the contributor with the details and the complexities of the underlying system.
[0015] Accordingly, a need exists for a system and method that manages and publishes the information content of a Web site, or an Internet information portal, in a way that separates the information from the form and reuses the stored information and enables the presentation in the user interface to be customized for different audiences and target devices and media.
[0016] Other prior art systems/tools that relate to the XML editing include markup languages that use XML to declaratively specify user interfaces, fully functioning editors, and systems that publish XML documents. Bluestone Software's XwingML [for more information refer to URL www.bluestone.com] enables the creation of Java Swing user interfaces without coding. The GUI (Graphical User Interface) is declaratively specified in XML and is translated into working Java code. This approach separates the GUI code from the application logic. Their DTD specifies the entire set of classes and properties for all of Swing components. However, the Bluestone Software's XwingML creates arbitrary interfaces in a declarative fashion rather than creating specific interfaces that reflect the document types for a given publishing environment. Accordingly a need exists for a method and tool to accomplish creating specific interfaces that reflect the document types for a given publishing environment.
[0017] Another prior art editor for XML is XmetaL, from Softquad, [refer to online URL www.xmetal.com] which is a flexible XML editor that supports three views into XML files. These views include raw XML mode, Tags-On mode that provides a WYSIWYG presentation with direct access to elements and attributes, and a full WYSIWYG mode in a word-processor-like environment. The XmetaL tool, although useful, has the problem that separate style sheets need to be used to support the editing process and the publishing process. In addition, one stylesheet may not include all of the elements that would be used on other platforms or for different uses. Accordingly, a content editor is needed that separates content from presentation and supports the reuse of that content in different delivery environments such as PCs, PDAs and phones.
[0018] Still another prior art content editor system is Interwoven [refer to online URL www.interwoven.com] which is a complete publishing system that supports HTML as well as XML. It provides an end to end solution from content creation to promotion and publishing. It also has a templating tool that provides the means to produce form-based pages. However, its support of reusable fragments within the environment is rather limited and the publishing to viewable pages is performed using non-standard methods.
[0019] Accordingly a need exists for a method and tool to accomplish creating and reusing content fragments using standard methods for a given publishing environment.
[0020] The system for end-to-end content publishing using XML with an object dependency graph is based on the following two design principles. First, separation of content and style: information stored in the content management system is independent of how it is going to be presented. The presentation style is encapsulated elsewhere and can be used to customize the look and feel based on the end-user preferences as well as the delivery methods and devices. Second, reusability of information content: common information is encapsulated in fragments and subfragments, and these fragments are insertable in other fragments, thereby avoiding scattering and duplication of information. This enables a user to restrict the edit operations to a limited number of relevant fragments in order to effect global changes. In addition, the present invention provides data consistency and data integrity in the content management.
[0021] The implementation of the system is based on the following:
[0022] 1. Standards-based design: The different components of the system interact through well-defined APIs based on industry standards such as XML, XSL, WebDAV, HTTP and DASL.
[0023] 2. Pervasive use of XML: XML is used not only as the content model but also as the language in which information is transferred between the different parts of the system.
[0024] The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings.
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
[0037]
[0038]
[0039] It is important to note that these embodiments are only examples of the many advantageous uses of the innovative teachings herein. In general, statements made in the specification of the present application do not necessarily limit any of the various claimed inventions. Moreover, some statements may apply to some inventive features but not to others. In general, unless otherwise indicated, singular elements may be in the plural and vice versa with no loss of generality.
[0040] In the drawings, like numerals refer to like parts throughout the several views.
[0041] Exemplary Network—
[0042] Referring to
[0043] A removable computer readable memory medium in the form of a diskette
[0044] Discussion of Hardware and Software Implementation Options
[0045] The present invention, as would be known to one of ordinary skill in the art, could be produced in hardware or software, or in a combination of hardware and software. The system, or method, according to the inventive principles as disclosed in connection with the preferred embodiment, may be produced in a single computer system having separate elements or means for performing the individual functions or steps described or claimed, or one or more elements or means combining the performance of any of the functions or steps disclosed or claimed, or may be arranged in a distributed computer system, interconnected by any suitable means as would be known by one of ordinary skill in the art.
[0046] According to the inventive principles as disclosed in connection with the preferred embodiment, the invention and the inventive principles are not limited to any particular kind of computer system but may be used with any general purpose computer, as would be known to one of ordinary skill in the art, arranged to perform the functions described and the method steps described. The operations of such a computer, as described above, may be according to a computer program contained on a medium for use in the operation or control of the computer, as would be known to one of ordinary skill in the art. The computer medium which may be used to hold or contain the computer program product, may be a fixture of the computer such as an embedded memory or may be on a transportable medium such as a disk, as would be known to one of ordinary skill in the art.
[0047] The invention is not limited to any particular computer program or logic or language, or instruction but may be practiced with any such suitable program, logic or language, or instructions as would be known to one of ordinary skill in the art. Without limiting the principles of the disclosed invention any such computing system can include, inter alia, at least a computer readable medium allowing a computer to read data, instructions, messages or message packets, and other computer readable information from the computer readable medium. The computer readable medium may include non-volatile memory, such as ROM, Flash memory, floppy disk, Disk drive memory, CD-ROM, and other permanent storage. Additionally, a computer readable medium may include, for example, volatile storage such as RAM, buffers, cache memory, and network circuits.
[0048] Furthermore, the computer readable medium may include computer readable information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer to read such computer readable information.
[0049] Overview of Trigger Monitor
[0050] This invention presents a system and method for publishing documents, for example Web documents, efficiently and consistently. This method may be used at a wide variety of Web sites of the World Wide Web. The present invention may be applied to systems outside the Web as well, for example, where compound objects are constructed from fragments. A fragment is an object which is used to construct a compound object. The term “document fragment” or just “fragment” is used throughout this patent to refer to these reusable information objects, which in their simplest form are XML fragments. An object is an entity which can either be published or is used to create something which is publishable. Objects include both fragments and compound objects. A compound object is an object constructed from one or more fragments.
[0051] In generating Web content, publishable Web pages known as servables may be constructed from simpler fragments. A servable is a complete entity which may be published at a Web site. Publishing an object means making it visible to the public or a community of users. Publishing is decoupled from creating or updating an object and generally takes place after the object has been created or updated. It is possible for a servable to embed a fragment which in turn embeds another fragment, etc.
[0052] While fragments significantly increase the capabilities of a Web site, a number of problems may arise which need to be solved, including the following:
[0053] (1) When changes to underlying data occur, how does the system determine all objects affected by the change?
[0054] (2) How does the system determine a correct and efficient order for updating fragments and servables?
[0055] (3) How can a system consistently publish Web pages in the presence of fragments? For an illustrative example, refer to
[0056] A method for solving problem (1) is described in a commonly assigned patent application, U.S. Ser. No. 08/905,114, entitled “Determining How Changes to Underlying Data Affect Cached Objects” by J. Challenger, P. Dantzig, A. Iyengar, and G. Spivak. The current invention solves problems (2) and (3).
[0057] It should be understood that the elements shown in
[0058] Dependence edges may preferably be used to identify the following:
[0059] a. The objects affected by a change to underlying data.
[0060] b. The order in which objects are desired or needed to be updated.
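By way of illustration only, the two uses of dependence edges listed above can be sketched as follows. The sketch assumes that the dependence edges form an acyclic graph and that an edge (x, y) means object y is built from object x. The function name and the sample object names are hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict, deque

def affected_in_update_order(edges, changed):
    """Return every object affected by `changed`, ordered so that each
    object appears after the objects it is built from."""
    graph = defaultdict(list)
    for src, dst in edges:            # edge (src, dst): dst is built from src
        graph[src].append(dst)

    # a. Collect the objects affected by the change (breadth-first search).
    affected = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in affected:
                affected.add(nxt)
                queue.append(nxt)

    # b. Order the affected objects topologically (Kahn's algorithm).
    indegree = {node: 0 for node in affected}
    for node in affected:
        for nxt in graph[node]:
            indegree[nxt] += 1
    ready = deque(sorted(n for n in affected if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order

# Hypothetical example: fragments f1..f3 and servables p1, p2.
edges = [("f1", "f2"), ("f2", "p1"), ("f1", "p2"), ("f3", "p2")]
order = affected_in_update_order(edges, ["f1"])
```

In this example a change to f1 affects f2, p1 and p2 but not f3, and f2 is always updated before the servable p1 that embeds it.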
[0061] In one illustrative example,
[0062] Whenever objects change, the system is notified in block
[0063] In block
[0064] In block
[0065] (1) In a number of environments, publishing documents after the documents are updated may be time-consuming. Incremental publication may make certain documents available sooner than would be the case using the all-at-once approach.
[0066] (2) It is conceivable that some environments may have constraints on the number of documents which can be published atomically. The incremental approach reduces the number of documents which need to be published in single atomic actions.
[0067] Incremental publishing may be more difficult to implement than the all-at-once approach because of the need to satisfy consistency constraints such as the ones described earlier.
[0068] Referring to
[0069] Consistency edges are also used to indicate that two servables both embed a common fragment whose value has changed and thus are to be published concurrently. If c and d both embed a common fragment whose value has changed, then a consistency edge from c to d and d to c should exist.
[0070] It is now explained how to determine whether two servables both embed a common changed fragment. As a node a in S is constructed in the order defined by the topological sort in block
[0071] A directed graph T is now created including servables in S (S is the set of all objects which have changed) and consistency edges. For two servables a and b in S, an edge from a to b exists in T if:
[0072] (1) A hypertext link from b to a exists, or
[0073] (2) a and b both embed a common changed fragment. This is true if comprising-nodes(a) and comprising-nodes(b) have a node in common. In this case, consistency edges exist both from a to b and from b to a.
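As a non-limiting sketch, the two rules above for creating the edges of T can be expressed as follows. The function name, the argument names and the sample data are assumptions made for illustration. The sketch only builds the edge set and does not show any later processing of T.

```python
from itertools import combinations

def build_consistency_graph(servables, links, comprising_nodes):
    """Return the edges of T as a set of (source, target) pairs.

    servables        -- the changed servables (the servables in S)
    links            -- set of (b, a) pairs, one per hypertext link from b to a
    comprising_nodes -- dict: servable -> set of changed fragments it embeds
    """
    servables = set(servables)
    edges = set()
    # Rule (1): a hypertext link from b to a yields an edge from a to b.
    for b, a in links:
        if a in servables and b in servables:
            edges.add((a, b))
    # Rule (2): a common changed fragment yields edges in both directions.
    for a, b in combinations(sorted(servables), 2):
        if comprising_nodes.get(a, set()) & comprising_nodes.get(b, set()):
            edges.add((a, b))
            edges.add((b, a))
    return edges

# Hypothetical example: b links to a; b and c both embed changed fragment f1.
t_edges = build_consistency_graph(
    servables=["a", "b", "c"],
    links={("b", "a")},
    comprising_nodes={"a": {"f2"}, "b": {"f1"}, "c": {"f1"}},
)
```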
[0074] In step
[0075] In step
[0076] An extension of this algorithm may be to use either more or fewer consistency constraints in the method depicted in
[0077] A quick publishing and censoring system and method which may be used is described in “METHOD AND SYSTEM FOR RAPID PUBLISHING AND CENSORING INFORMATION”, Attorney docket number YO999-040(8778-753), filed concurrently herewith, commonly assigned and incorporated herein by reference. A system and method which may be used for publishing Web documents is described in “METHOD AND SYSTEM FOR PUBLISHING DYNAMIC WEB DOCUMENTS”, Attorney docket number YO999-039(8778-754), filed concurrently herewith, commonly assigned and incorporated herein by reference.
[0078] Functional Block Diagram of Various Software Components—
[0079]
[0080] The system consists of the following main components:
[0081] 1. Client editor application GUI
[0082] 2. Dispatcher
[0083] 3. MetaStore Manager
[0084] 4. File system manager
[0085] 5. Content Store Manager
[0086] The communication protocols between the different components are based on industry standards: WebDAV (World Wide Web Distributed Authoring and Versioning), DASL (Distributed Authoring Search Language), and HTTP (Hypertext Transfer Protocol). XML is used not only for creating the multimedia content, but also for system configuration documents at startup and as the language for information exchange between the different parts of the system. Now each of these software components
[0087] Client Editor GUI—
[0088] Client editor application GUI
[0089] Data Model
[0090] As previously described above, the present invention operating on server
[0091] A simple fragment is an XML file that contains only text data and metadata, for example a product specification.
[0092] A compound fragment is a simple fragment that contains a pointer to an accompanying file, such as a video or image file, an XSL style sheet, or a hand- crafted HTML page.
[0093] An index fragment is an automatically updated XML file that indexes any number of servables, for example the five latest press releases.
[0094] A composite fragment is a simple fragment that contains references and imports content from one or more fragments.
[0095] A servable is a composite fragment that contains references to one or more style sheet fragments, which allow it to be transformed into one or more final published pages.
[0096] Each fragment type and servable type has an associated document type definition (DTD), a definition that follows the rules of the Standard Generalized Markup Language (SGML) and describes the structure of the XML document. The DTD specifies both metadata elements and content elements. In another embodiment, schemas specify the definition of the document structure. The DTD must abide by some constraints imposed by the present invention. The root element has a child node, called SYSTEM, that is common to all documents and has the following children:
[0097] FRAGMENTID, CREATOR, MODIFIER, CREATIONTIME, LASTMODIFIEDTIME, PAGETYPE and CONTENTSIZE.
[0098] These elements are shared across all documents and comprise the common metadata used in searches. These elements are not displayed in the interface, since their value can be inferred from the context. Additional metadata, such as KEYWORD and CATEGORY, are provided by common DTD elements to allow functional and semantic categorization of the fragments.
[0099] The metadata elements are used both at author-time and run-time. At author-time the metadata elements are used for categorization of fragments and for efficient searches of subfragments. At run-time, the same metadata elements can be used to perform personalization in a dynamic Web site.
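For illustration, the common metadata could be read out of a fragment as sketched below, using the SYSTEM child elements named above. The function name, the root element PRESSRELEASE and the sample values are hypothetical. Elements that are absent from a document are returned as empty strings, which is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

SYSTEM_FIELDS = ("FRAGMENTID", "CREATOR", "MODIFIER", "CREATIONTIME",
                 "LASTMODIFIEDTIME", "PAGETYPE", "CONTENTSIZE")

def system_metadata(xml_text):
    """Return the common metadata stored under the root's SYSTEM child."""
    root = ET.fromstring(xml_text)
    system = root.find("SYSTEM")
    if system is None:
        raise ValueError("document has no SYSTEM child under the root")
    # findtext returns None for a missing element; normalize to "".
    return {name: (system.findtext(name) or "").strip()
            for name in SYSTEM_FIELDS}

# Hypothetical fragment used only to exercise the function.
doc = ("<PRESSRELEASE><SYSTEM><FRAGMENTID>pr-001</FRAGMENTID>"
       "<CREATOR>jdoe</CREATOR><PAGETYPE>PRESSRELEASE</PAGETYPE></SYSTEM>"
       "<TITLE>Launch</TITLE></PRESSRELEASE>")
meta = system_metadata(doc)
```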
[0100] A fragment can include other fragments as subfragments. This enables the reuse of content. To accomplish inclusion of a subfragment, the entity reference that defines all subfragment types must be included in the DTD. Currently, the declaration of a subfragment contains the SUBFRAGMENTTYPE attribute set to the appropriate document type, as illustrated in the following example:
[0101] <!ENTITY % SUBFRAGMENTTYPES SYSTEM
[0102] "http://server/dtd/subfragmenttypes.txt">
[0103] <!ELEMENT SUBFRAGMENT (#PCDATA)>
[0104] <!ATTLIST SUBFRAGMENT SUBFRAGMENTTYPE
[0105] (%SUBFRAGMENTTYPES;) #FIXED "IMAGEFRAGMENT">
[0106] where server is the name of the server
[0107] This piece of a DTD specifies that a particular type of subfragment, IMAGEFRAGMENT, is needed as content for the element SUBFRAGMENT. The subfragment syntax will be replaced by the XLink syntax once it becomes a W3C Recommendation and XML parsers and XSL transformation engines support the syntax.
[0108] In the present invention, servables always result in one or more final published pages. The DTD of a servable indicates the names of the XSL stylesheets that can be used for layout for that particular type of document.
[0109] Because the servable includes content from subfragments, the stylesheet is written to work on the so-called expanded servable. Before page assembly, a servable is temporarily rewritten to include the content of all its subfragments. Thus the system implements a temporary solution that mimics the XLink functionality by expanding the servable.
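The expansion step may be sketched as follows. This is a minimal illustration and not the embodiment itself. It assumes that the text of each SUBFRAGMENT element holds the identifier of the referenced fragment and that fragments are available from an in-memory dictionary. Both are assumptions, as are the function name and the sample documents.

```python
import xml.etree.ElementTree as ET

def expand(element, store, seen=()):
    """Return a copy of `element` in which every SUBFRAGMENT reference
    contains the (recursively expanded) fragment it names."""
    expanded = ET.Element(element.tag, dict(element.attrib))
    expanded.text, expanded.tail = element.text, element.tail
    if element.tag == "SUBFRAGMENT":
        fragment_id = (element.text or "").strip()   # assumed: text is the id
        if fragment_id in seen:
            raise ValueError("circular subfragment reference: " + fragment_id)
        referenced = ET.fromstring(store[fragment_id])
        expanded.text = None
        expanded.append(expand(referenced, store, seen + (fragment_id,)))
        return expanded
    for child in element:
        expanded.append(expand(child, store, seen))
    return expanded

# Hypothetical fragment store and servable.
store = {"img1": "<IMAGEFRAGMENT><CAPTION>Logo</CAPTION></IMAGEFRAGMENT>"}
servable = ET.fromstring(
    '<PAGE><TITLE>Home</TITLE>'
    '<SUBFRAGMENT SUBFRAGMENTTYPE="IMAGEFRAGMENT">img1</SUBFRAGMENT></PAGE>')
expanded_servable = expand(servable, store)
```

The original servable is left unchanged, which matches the temporary nature of the rewrite: only the expanded copy is handed to the stylesheet.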
[0110] In one embodiment, an IBM DB/2™/UDB database is used to store metadata that can be used either at author-time or run-time. In one embodiment, the mapping of the metadata elements of the XML document to the columns of the relational database is performed using the DB/2 XML Extender package. For each DTD, a Document Access Definition (DAD) is defined that specifies this mapping. The DAD is itself an XML document that conforms to a particular DTD. Each DAD defines the relationship between the hierarchical structure of the XML document and the columns and tables of the relational database. The DB/2 XML Extender package uses the DAD to decompose the input XML document into the columns, or to compose an XML document from selected columns. A second embodiment, which does not rely on a DAD, maps the XML elements into the database columns programmatically.
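The programmatic mapping of the second embodiment might look like the sketch below. SQLite is used here purely as a stand-in so that the example is self-contained; it is not the database of the embodiment. The table name, column names and element paths are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping: column name -> path of the element under the root.
COLUMN_MAP = {
    "fragment_id": "SYSTEM/FRAGMENTID",
    "creator": "SYSTEM/CREATOR",
    "page_type": "SYSTEM/PAGETYPE",
}

def store_metadata(conn, xml_text):
    """Decompose the metadata elements of one document into one table row."""
    root = ET.fromstring(xml_text)
    columns = list(COLUMN_MAP)
    values = [root.findtext(COLUMN_MAP[c], default="") for c in columns]
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f"INSERT INTO metastore ({', '.join(columns)}) VALUES ({placeholders})",
        values,
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metastore (fragment_id TEXT, creator TEXT, page_type TEXT)")
store_metadata(
    conn,
    "<PRESSRELEASE><SYSTEM><FRAGMENTID>pr-001</FRAGMENTID>"
    "<CREATOR>jdoe</CREATOR><PAGETYPE>PRESSRELEASE</PAGETYPE></SYSTEM>"
    "</PRESSRELEASE>",
)
rows = conn.execute(
    "SELECT fragment_id, creator, page_type FROM metastore").fetchall()
```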
[0111] In summary, the addition of a new document type to the system requires the definition of a DTD and the corresponding metastore mapping. If the document is a servable, stylesheets defined in XSL are also required.
[0112] Automated User Interface Creation
[0113] One of the biggest challenges of any publishing system is to remove as much complexity from the users' tasks as possible. When dealing with a relatively new technology like XML/XSL this aspect of the system becomes even more important. By hiding the syntax of XML from the editors and authors, domain experts can take on the role of creating and modifying the content without worrying about the syntax of a particular markup language.
[0114] When using the Content Editor
[0115] Users are assigned roles in the system and each role, in turn, is assigned specific document types. A user assigned to an edit role can only create or modify a document assigned to that role. When the user selects a document type to create or edit, the Content Editor
[0116] DTD to Interface
[0117] In this present invention, the term “interface controls” or “GUI widget” or just “widget” is used to describe an element of a GUI
[0118] The Content Editor
[0119] The present invention uses a number of assumptions in handling DTDs and the automatic creation of the user interface. Most notably, special attributes are used to assist in the transformation of an XML element into an appropriate interface widget. In one embodiment, the interface widgets are created for DTD elements, not for DTD attributes and a special type attribute for these elements enables the transformation into an appropriate interface widget.
[0120] Until XML schemas (see online URL www.w3.org) become widely adopted, there is no standard way to provide data typing for elements in the DTD. The present invention solves this problem by including the attribute DATATYPE whenever an element is to be displayed in the interface. If an element does not contain a DATATYPE attribute, no widget is created in the interface for that element. Child elements, however, may still contain DATATYPE attributes to specify their user interface. In addition, whenever an element has the DATATYPE attribute, it contains a child of type PCDATA. Thus, through typing, the DTD can specify, for example, whether a one-line input, a medium text area or a large text area is required.
[0121] In the partial DTD shown here, TITLE, SHORTDESCRIPTION, and BODY each specify different text input widgets to use.
[0122] <!ELEMENT TITLE (#PCDATA)>
[0123] <!ELEMENT SHORTDESCRIPTION (#PCDATA)>
[0124] <!ELEMENT BODY (#PCDATA)>
[0125] <!ATTLIST TITLE DATATYPE
[0126] (%UITYPES;) #FIXED "STRING">
[0127] <!ATTLIST SHORTDESCRIPTION DATATYPE
[0128] (%UITYPES;) #FIXED "SHORTTEXT">
[0129] <!ATTLIST BODY DATATYPE
[0130] (%UITYPES;) #FIXED "LONGTEXT">
[0131] The external entity UITYPES contains the list of all GUI widgets known to the editor. These data types include:
[0132] DATE—widget accepting only a date entry.
[0133] INTEGER—widget accepting only a numerical entry.
[0134] STRING—a one line text box widget.
[0135] SHORTTEXT—a short multi-line text area widget.
[0136] LONGTEXT—a long multi-line text area widget.
[0137] CHOICE—a drop-down menu that stores user's selection.
[0138] ASSOCLIST—a drop-down menu that stores code corresponding to user's selection.
[0139] BROWSESERVER—a widget enabling directory browsing on the server.
[0140] BROWSELOCAL—a widget enabling directory browsing on the local machine.
[0141] LABEL—a non-editable widget displaying the name of the element.
[0142] In another embodiment, additional types may be used.
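As an illustrative sketch, the selection of a widget from the DATATYPE value can be expressed as a simple table lookup. The widget descriptions restate the list above. The function name and the sample element list are hypothetical, and a real client would construct GUI objects rather than return strings.

```python
WIDGET_FOR_UITYPE = {
    "DATE": "date entry field",
    "INTEGER": "numerical entry field",
    "STRING": "one-line text box",
    "SHORTTEXT": "short multi-line text area",
    "LONGTEXT": "long multi-line text area",
    "CHOICE": "drop-down menu storing the selection",
    "ASSOCLIST": "drop-down menu storing a code for the selection",
    "BROWSESERVER": "server directory browser",
    "BROWSELOCAL": "local directory browser",
    "LABEL": "non-editable label",
}

def widgets_for(elements):
    """elements: (element name, DATATYPE value or None) in document order."""
    widgets = []
    for name, datatype in elements:
        if datatype is None:
            continue          # no DATATYPE attribute: no widget is created
        if datatype not in WIDGET_FOR_UITYPE:
            raise ValueError(f"unknown DATATYPE {datatype!r} on {name}")
        widgets.append((name, WIDGET_FOR_UITYPE[datatype]))
    return widgets

# Hypothetical document type: the root has no DATATYPE, its children do.
widgets = widgets_for([("PRESSRELEASE", None),
                       ("TITLE", "STRING"),
                       ("BODY", "LONGTEXT")])
```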
[0143] A widely used interface widget is the drop-down menu. To accomplish this, the DATATYPE attribute is set to the UITYPE CHOICE, and the CHOICES attribute to a default value from a list of options. The options can be defined as an external entity for reuse across many DTDs. For example,
[0144] <!ENTITY % CATEGORYDEFS SYSTEM
[0145] "http://server/dtd/categorydefs.txt">
[0146] defines an external entity for a set of category choices.
[0147] These choices could be defined as the types of IBM Netfinity™ Servers:
[0148] NONE | Netfinity
[0149] Netfinity
[0150] Netfinity
[0151] The definition for CATEGORY in the DTD might then be:
[0152] <!ATTLIST CATEGORY
[0153] DATATYPE (%UITYPES;) #FIXED "CHOICE"
[0154] CHOICES (%CATEGORYDEFS;) "NONE" #REQUIRED>
[0155] The content editor creation algorithm assumes that if the first word in the set of CHOICES is the string NONE, and the user selects it and the element is optional, the XML element will not appear in the document.
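The NONE convention described above can be sketched as follows. The function name and the choice values "Model A" and "Model B" are placeholders chosen for illustration and do not correspond to actual category definitions.

```python
import xml.etree.ElementTree as ET

def add_choice_element(parent, tag, choices, selected, optional):
    """Append <tag>selected</tag> to parent, unless the NONE rule applies.

    Returns the new element, or None when the element is omitted.
    """
    if selected not in choices:
        raise ValueError(f"{selected!r} is not one of the allowed choices")
    # Rule: first choice is NONE, the user selected it, element is optional.
    if optional and choices[0] == "NONE" and selected == "NONE":
        return None
    child = ET.SubElement(parent, tag)
    child.text = selected
    return child

choices = ["NONE", "Model A", "Model B"]     # placeholder choice values
doc_root = ET.Element("PRODUCT")
omitted = add_choice_element(doc_root, "CATEGORY", choices, "NONE", optional=True)
kept = add_choice_element(doc_root, "CATEGORY", choices, "Model A", optional=True)
```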
[0156] In a DTD, elements can either be required, optional, or occur
[0157] In the present invention, auxiliary lookup tables further expand the definition of the DTD, beyond what the DTD syntax permits. These lookup tables are encoded as XML files which are read by the client GUI into a hash table for fast access to the information. An auxiliary lookup table can store various additional information. In one embodiment, the lookup table stores the DATATYPE values for each DTD element. In another, a lookup table stores all translations of element names and help strings, as well as the labels in the GUI, to a given language. More specifically, when a user logs in and the GUI is initialized, the default language in the user's profile determines which translation lookup table to load. The GUI uses the lookup table to display all labels, DTD element names and help strings in the appropriate language. In yet another embodiment, a lookup table stores a more user friendly display name for DTD elements, to help make the GUI more approachable by a non-technical editor. The auxiliary file could be used for further information not limited to the types of information listed above.
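A lookup table of this kind might be loaded as sketched below. The XML layout (a LOOKUP root with ENTRY children carrying ELEMENT, LABEL and HELP attributes) is an assumed format invented for this example, as are the function names. A Python dictionary plays the role of the hash table.

```python
import xml.etree.ElementTree as ET

def load_lookup_table(xml_text):
    """Read an auxiliary lookup table into a dict keyed by DTD element name."""
    table = {}
    for entry in ET.fromstring(xml_text).iter("ENTRY"):
        table[entry.get("ELEMENT")] = {
            "label": entry.get("LABEL"),
            "help": entry.get("HELP", ""),
        }
    return table

def display_name(table, element_name):
    """Return the translated label, falling back to the raw element name."""
    entry = table.get(element_name)
    return entry["label"] if entry and entry["label"] else element_name

# Hypothetical French translation table.
french = load_lookup_table(
    '<LOOKUP LANGUAGE="fr">'
    '<ENTRY ELEMENT="TITLE" LABEL="Titre" HELP="Titre du document"/>'
    '<ENTRY ELEMENT="BODY" LABEL="Texte"/>'
    '</LOOKUP>')
```

Falling back to the element name when no entry exists is a design choice of this sketch, so that a missing translation never leaves a widget unlabeled.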
[0158] Using the client editor GUI
[0159] Turning to
[0160]
[0161] Because of the strict way that the interface is constructed, each widget knows whether or not it is required and whether or not more elements can be added to an XML instance. If an element in the DTD is required, the widget will be highlighted (e.g. colored brightly) to allow the user to distinguish which fields must be filled in before submission. Therefore, only well-formed and valid documents are submitted to the server.
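A minimal sketch of the check implied above, in which required fields must be filled in before submission, is given below. The field representation and the function name are assumptions made for illustration.

```python
def missing_required(fields):
    """Return the names of required fields that are still empty.

    fields: list of dicts with the keys 'name', 'required' and 'value'.
    """
    return [f["name"] for f in fields
            if f["required"] and not f["value"].strip()]

# Hypothetical form state: BODY is required but has not been filled in.
form = [
    {"name": "TITLE", "required": True, "value": "Launch"},
    {"name": "BODY", "required": True, "value": "   "},
    {"name": "KEYWORD", "required": False, "value": ""},
]
blocking = missing_required(form)   # submission proceeds only if this is empty
```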
[0162] Although the present invention uses existing XML technologies and standards, newer standards, such as XLink and XML Schema, and technologies based on those standards can be leveraged to improve the design and the implementation of the present invention. It should be understood that the use of those technologies is within the true scope and spirit of the present invention.
[0163] Yet another embodiment includes a number of additional features, including automated extraction of keywords, automated translation, and a Web-centric client that requires no installation and can easily be accessed from any browser.
[0164] Object Oriented GUI
[0165] Each Java widget is encapsulated in a set of classes that include additional functionality. This object-oriented approach allows for modular design and future extensions to the set of interface widgets. Inheritance and generic methods are used throughout the class hierarchy for the definition of the interface widgets. Each UITYPE may also provide very specialized functionality. For example, BROWSELOCAL and BROWSESERVER provide a button which, when clicked on, opens a dialog to choose a file on the local system or a directory on the remote server, respectively. This functionality is encapsulated within these particular classes. These widgets are illustrated in
[0166] UITYPE LONGTEXT element tags are also handled specially within the system. The system assumes that UITYPE LONGTEXT tags may be composed of one or more PARAGRAPH tags. Blank lines in the input are interpreted as paragraph separators. When constructing the XML document, these PARAGRAPH tags are automatically composed within the outer UITYPE LONGTEXT tag. This functionality is inherited through the text widget class hierarchy. In general, this functionality can be enabled or disabled as the application requires.
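The paragraph handling described above can be sketched as follows. The function name and the sample input are hypothetical. The sketch treats any line that is empty or contains only whitespace as a blank line.

```python
import re
import xml.etree.ElementTree as ET

def long_text_element(tag, user_input):
    """Build <tag> with one PARAGRAPH child per block of the input,
    where blocks are separated by blank lines."""
    element = ET.Element(tag)
    for block in re.split(r"\n\s*\n", user_input.strip()):
        if block.strip():
            paragraph = ET.SubElement(element, "PARAGRAPH")
            paragraph.text = block.strip()
    return element

body = long_text_element(
    "BODY", "First paragraph.\n\nSecond paragraph,\nstill the second.\n")
paragraphs = [p.text for p in body]
```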
[0167] Process Flow for Client Editor GUI
[0168]
[0169] When launching the GUI interface, the user enters a user name and password. Based on the roles assigned, the user is authorized to create certain types of documents. Only authorized document types appear in the user's GUI. For example, someone outside of accounting would not be authorized to create a bill.
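The role-based filtering of document types may be illustrated as below. The role names, document type names and function name are hypothetical.

```python
def authorized_document_types(user_roles, role_document_types):
    """Return the document types a user may create, given the user's roles."""
    allowed = set()
    for role in user_roles:
        allowed.update(role_document_types.get(role, ()))
    return sorted(allowed)

# Hypothetical role assignments.
role_document_types = {
    "accounting": ["BILL", "INVOICE"],
    "marketing": ["PRESSRELEASE"],
}
visible = authorized_document_types(["marketing"], role_document_types)
```

With these assignments, a user holding only the marketing role is not offered the BILL document type.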
[0170] Get DTD & Parse DTD—
[0171] The process begins with step
Type and Context Information - 1308
Function - For every element in the DTD, the following information is determined: 1) its location in the hierarchy (its XPath); and 2) type information for DTD elements.
Output - Type (e.g., a single line of input, multi-line input, choice element, etc.) and context (XPath) information for each element in the DTD.
Mapping Information for Type and Context - 1310
Function - Given a DTD element, its type and its XPath, the system maps this input information to the GUI values for generating the interface for that element. The system uses the editor's user profile and lookup tables to determine the values. These GUI values include but are not limited to: 1) the type of input widget to display in the interface (e.g., simple 1-line string, multi-line text area, drop-down menu, directory browser for server, directory browser for local machine, etc.); 2) the name to display in the interface, translating the element name to user-friendly text in the user's preferred language using a lookup table; and 3) the value of a help string to be made available in the interface if the user needs it (e.g., as a tooltip) in the user's preferred language.
Input - DTD element name, its type and XPath, and attributes from editor's user profile from 1308.
Output - GUI values to display DTD element.
Generate GUI - 1312
Function - Taking the input information, this step processes the DTD elements in order and recursively, while maintaining hierarchical inclusion, and generates the GUI 702 as a set of interface widgets to be edited by the user. The hierarchy can be represented by indentation within the interface to indicate when one item is included by another. During this recursion, the process maintains a link between the interface widget and the corresponding element in the XML document under creation. If the interface is constructed for an existing XML document, the previously stored content is supplied to be displayed in the widgets. An existing XML document may also contain more than one occurrence of an element. If so, the process adjusts the interface accordingly and adds the elements. Also, the process maintains and displays information about whether an element is required or not in the final document. This information is used in the test in Check in step 1324. If an element can occur more than once in the interface, affordances are placed in the interface (i.e., “+/−” buttons) so that the user can easily repeat or delete repeated elements from the XML document being created/edited.
Input - The GUI values to display DTD elements from 1310. Content from 1314 if editing an existing document.
Output - The interface to display in either a web-based client or standalone Java client, with content if generating from an existing XML document.
Content from Existing XML Document - 1314
Function - This step incorporates the content of an existing document into the GUI being constructed.
Input - XML file from file system 714.
Output - The content to be displayed in the interface.
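The mapping from an element's type and XPath to GUI values can be illustrated with a minimal Java sketch. The class name `GuiValueMapper`, the widget names, the key format of the lookup tables, and the sample entries are all assumptions made for illustration; the disclosure does not specify how the lookup tables are stored.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the type/context to GUI-value mapping step. */
final class GuiValueMapper {
    /** GUI values for one DTD element: widget kind, display name, help string. */
    static final class GuiValues {
        final String widget;
        final String label;
        final String help;
        GuiValues(String widget, String label, String help) {
            this.widget = widget;
            this.label = label;
            this.help = help;
        }
    }

    // Lookup tables (assumed layout): element type -> widget, "language|xpath" -> text.
    private final Map<String, String> widgetByType = new HashMap<>();
    private final Map<String, String> labels = new HashMap<>();
    private final Map<String, String> helps = new HashMap<>();

    void addWidgetRule(String elementType, String widget) {
        widgetByType.put(elementType, widget);
    }

    void addText(String language, String xpath, String label, String help) {
        labels.put(language + "|" + xpath, label);
        helps.put(language + "|" + xpath, help);
    }

    /** Falls back to a single-line widget and the raw element name when no entry exists. */
    GuiValues map(String elementName, String elementType, String xpath, String language) {
        String key = language + "|" + xpath;
        return new GuiValues(
                widgetByType.getOrDefault(elementType, "single-line"),
                labels.getOrDefault(key, elementName),
                helps.getOrDefault(key, ""));
    }

    /** Sample tables; every entry here is invented for the example. */
    static GuiValueMapper demo() {
        GuiValueMapper m = new GuiValueMapper();
        m.addWidgetRule("multi-line", "text-area");
        m.addWidgetRule("choice", "drop-down");
        m.addText("en", "/ARTICLE/SUMMARY", "Summary", "A short abstract of the article");
        return m;
    }
}
```

Calling `GuiValueMapper.demo().map("SUMMARY", "multi-line", "/ARTICLE/SUMMARY", "en")` yields a text-area widget labeled "Summary". An element with no table entry falls back to its raw element name, so the form can still be generated when a translation is missing.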
[0172] Display GUI—
[0173] The results of the user input are then used to generate the GUI
[0174]
[0175] Create XML Document from GUI Widgets—
[0176] Function—The process extracts the content from the GUI widgets and places it into the XML document being constructed. This is accomplished by looping over the hashtable to get each widget and its corresponding XML element, extracting the content from the GUI widget and placing it into the XML element. To do this we encapsulate this information in the interface object with generic GET and SET methods. This allows us to call a standard method, independent of type, on the interface object to get user input and place it into the XML element.
[0177] Input—XML document being created or edited and the hashtable that stores the GUI widgets and their corresponding XML element.
[0178] Output—An XML document that represents the complete document filled in with the content from the GUI widgets
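The extraction loop described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the interface name `InterfaceObject`, the `TextField` widget, and the use of a `LinkedHashMap` in place of the hashtable are hypothetical. Only the standard `org.w3c.dom` API is used for the XML side.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch: copy widget content into the linked XML elements. */
final class FormToXml {
    /** Generic GET and SET methods shared by every widget type (assumed interface). */
    interface InterfaceObject {
        String get();
        void set(String value);
    }

    /** Simplest possible widget: holds one string. */
    static final class TextField implements InterfaceObject {
        private String value = "";
        public String get() { return value; }
        public void set(String value) { this.value = value; }
    }

    /** Loops over the widget-to-element table and writes each value into its element. */
    static void extract(Map<InterfaceObject, Element> links) {
        for (Map.Entry<InterfaceObject, Element> link : links.entrySet()) {
            link.getValue().setTextContent(link.getKey().get());
        }
    }

    /** Builds a one-element document, simulates user input, and returns the stored text. */
    static String demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("ARTICLE");
        doc.appendChild(root);
        Element title = doc.createElement("TITLE");
        root.appendChild(title);

        TextField field = new TextField();
        Map<InterfaceObject, Element> links = new LinkedHashMap<>();
        links.put(field, title);

        field.set("Quarterly results");   // stands in for the editor typing into the form
        extract(links);
        return title.getTextContent();
    }
}
```

Because `extract` only calls `get()`, a drop-down or directory-browser widget could be added by implementing the same interface without changing the loop.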
[0179] Check-In Process
[0180] In step
[0181] Dispatcher—
[0182] The Web application consists of four servlets and three subcomponents. The main servlet is the dispatcher that coordinates the activities of all subsystems and interfaces with the client application. The source and sink servlets allow Trigger Monitor to retrieve fragments from the file system and write assembled pages to it. The admin servlet provides for administration and monitoring functionality. The three subsystems interface with the metastore
[0183] A dispatcher
[0184] MetaStore Manager—
[0185] A MetaStore Manager
[0186] File System Manager—
[0187] The file system
[0188] Content Store Manager—
[0189] A Content Store Manager
[0190] MetaStore—
[0191] The metastore
[0192] The system-generated tags correspond to the child elements of the SYSTEM element defined in every DTD, as described in an earlier section. The non-system generated tags correspond to additional elements in the DTDs that contain the content or are necessary for maintaining the functional and semantic role of the fragments. These tags can be further grouped into two parts: 1) the tags which are used for describing the XML object, such as keywords, categories and publishing information; and 2) the tags which hold the content of the XML object, such as TITLE and SUMMARY.
[0193] In one embodiment, the metastore
[0194] IBM DB/2™ is a relational database, and thus cannot be used directly to store an XML object, because the XML object has a hierarchical data model. A mapping from the XML data model to a set of database tables is needed. In one embodiment, DB/2 XML Extender 7.1 is used to map the XML document elements that correspond to the metatags into a set of pre-defined DB/2 tables. The DB/2 XML Extender is an IBM product developed to support XML-based e-business applications using the IBM universal database (UDB).
[0195] The XML Extender provides two access and storage methods in using DB/2 as an XML repository: XML column and XML collection. The XML collection access method decomposes XML documents into a collection of relational tables or composes XML documents from a collection of relational tables. These are exactly the operations required for the metastore
[0196] A second embodiment consists of a programmatic mapping of the XML elements into the database columns.
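A programmatic mapping of this kind might look like the following Java sketch. The class `MetaTagMapper`, the element-to-column table, the CREATOR element, and the table and column names are hypothetical. The sketch only builds a parameterized INSERT statement and does not open a database connection.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: map selected XML elements to relational columns. */
final class MetaTagMapper {
    /** Returns column -> value for every mapped element found in the document. */
    static Map<String, String> toRow(String xml, Map<String, String> elementToColumn) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (Map.Entry<String, String> mapping : elementToColumn.entrySet()) {
            // Only the first occurrence is stored; repeated elements would need a child table.
            NodeList hits = doc.getElementsByTagName(mapping.getKey());
            if (hits.getLength() > 0) {
                row.put(mapping.getValue(), hits.item(0).getTextContent());
            }
        }
        return row;
    }

    /** Builds a parameterized INSERT; values are bound separately (e.g., via PreparedStatement). */
    static String insertSql(String table, Map<String, String> row) {
        String columns = String.join(", ", row.keySet());
        String marks = String.join(", ", Collections.nCopies(row.size(), "?"));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }
}
```

Keeping the values out of the SQL text and binding them as parameters avoids quoting problems when titles or summaries contain apostrophes.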
[0197] Search
[0198] In a content management system that may hold a very large number of interrelated documents and fragments, efficiently locating a particular fragment or servable becomes one of the major challenges. Locating documents by browsing a directory structure is both inefficient and unreliable. The browsing operation is therefore replaced with a search operation that leverages the meta-information that is stored in the metastore
[0199] The search feature requires implementation at both client and server sides. At the client side
[0200] At the server side
[0201] In order to ensure the scalability of the application, a number of techniques have been used to streamline database access operations. First, a database connection pool is used to maintain a set of active connections, instead of creating a new connection for each access. Second, the search fields are indexed in the database to speed up search operations. Third, the search results are cached to minimize repeated access to the database for the same query from the same client
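The third technique, caching search results per client and query, can be sketched in Java as shown below. The class `SearchCache`, its least-recently-used eviction policy, and the capacity parameter are assumptions. The disclosure states only that results are cached. Connection pooling and database indexing are not shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: cache search results keyed by client and query. */
final class SearchCache {
    private final Function<String, List<String>> database;
    private final Map<String, List<String>> cache;
    private int databaseCalls = 0;

    /** @param maxEntries cached result sets kept before the least recently used is evicted */
    SearchCache(final int maxEntries, Function<String, List<String>> database) {
        this.database = database;
        // An access-ordered LinkedHashMap gives simple LRU eviction.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns cached results when the same client repeats a query; otherwise queries the database. */
    synchronized List<String> search(String clientId, String query) {
        String key = clientId + '\u0000' + query;
        List<String> results = cache.get(key);
        if (results == null) {
            databaseCalls++;
            results = database.apply(query);
            cache.put(key, results);
        }
        return results;
    }

    /** Number of times the underlying database function was actually invoked. */
    synchronized int databaseCalls() {
        return databaseCalls;
    }
}
```

A real deployment would also need to invalidate cached entries when a check-in changes the metastore. That step is omitted here.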
[0202] Fragment Dependency Store—
[0203] The fragment dependency store
[0204] The fragment dependency store
[0205] Several Trigger Monitor stages are chained together to allow for multistage publishing. Trigger Monitor is written in pure Java running in Java Virtual Machine
[0206] In the present invention, several classes have been created for Trigger Monitor to implement three handlers:
[0207] 1. the Extension Parser;
[0208] 2. the Dependency Parser; and
[0209] 3. the Page Assembler.
[0210] Each of these classes is now described.
[0211] Extension Parser
[0212] Within the present invention, Trigger Monitor manages different types of files differently based on their extensions. Servables, simple, compound, and index fragments, stylesheets and multimedia assets are all treated slightly differently in the publishing flow.
[0213] The Extension Parser takes in a name of a fragment, and returns an extension used in the Trigger Monitor configuration files to specify actions to take during the publish process. The appropriate behavior for each type of fragment is defined in the Trigger Monitor configuration files. These behaviors include moving assets to different stages within the system as well as assembling the servables into the expanded mode described in an earlier section and invoking the XSL transformation to create viewable pages.
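As a minimal illustration of the name-to-extension step, the Java sketch below derives an extension from a fragment name and looks up a configured action. The class layout, the action names, and the sample extensions are hypothetical. The actual Trigger Monitor handler interface and configuration file format are not reproduced here.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of an extension parser for fragment names. */
final class ExtensionParser {
    /** Returns the lower-cased text after the last dot of the file name, or "" if none. */
    static String extensionOf(String fragmentName) {
        int slash = Math.max(fragmentName.lastIndexOf('/'), fragmentName.lastIndexOf('\\'));
        int dot = fragmentName.lastIndexOf('.');
        // A dot inside a directory name, or a trailing dot, does not count as an extension.
        if (dot <= slash || dot == fragmentName.length() - 1) {
            return "";
        }
        return fragmentName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Looks up the configured publish action; unknown extensions map to "ignore" (assumed default). */
    static String actionFor(String fragmentName, Map<String, String> actionsByExtension) {
        return actionsByExtension.getOrDefault(extensionOf(fragmentName), "ignore");
    }
}
```

With a configuration that maps `xsl` to a copy action, `actionFor("styles/pda.XSL", config)` returns that action regardless of the case used in the file name.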
[0214] Dependency Parser
[0215] The Dependency Parser analyzes an XML object and updates the ODG maintained by Trigger Monitor accordingly. The ODG maintains the dependencies between fragments. Currently defined are two types of dependencies: composition and style. The composition dependency maintains structural information between fragments and between a complex fragment and its associated asset. The style dependency maintains information about the relationship between servables and stylesheets.
[0216] Dependencies are considered to point from the subfragments to the fragments that include them. In the case of complex fragments, the dependency is from the fragment to the associated assets.
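The direction of the composition dependencies can be illustrated with a small graph sketch in Java. The class `ObjectDependencyGraph` and its methods are hypothetical and far simpler than the ODG maintained by Trigger Monitor. The sketch stores edges from a subfragment to each fragment that includes it and computes which fragments are affected when one fragment changes.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of an object dependency graph for fragments. */
final class ObjectDependencyGraph {
    // Edge "from -> to" means: a change to 'from' requires rebuilding 'to'.
    private final Map<String, Set<String>> dependents = new HashMap<>();

    /** Records that fragment 'to' includes (depends on) subfragment 'from'. */
    void addDependency(String from, String to) {
        dependents.computeIfAbsent(from, k -> new LinkedHashSet<String>()).add(to);
    }

    /** Breadth-first walk returning every fragment reachable from the changed one. */
    Set<String> affectedBy(String changed) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String next : dependents.getOrDefault(node, Collections.<String>emptySet())) {
                if (seen.add(next)) {   // visit each fragment once, so cycles terminate
                    queue.add(next);
                }
            }
        }
        return seen;
    }
}
```

For example, if `byline.xml` is included by `story.xml`, which is in turn included by `frontpage.xml`, a change to `byline.xml` reports both `story.xml` and `frontpage.xml` as affected, in that order.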
[0217] Page Assembler
[0218] In the present invention, Trigger Monitor is configured to invoke the Page Assembler for servables. The Page Assembler assembles the servable into the expanded mode by including the contents of all included subfragments, and then invokes the XSL transformation engine to produce viewable output pages. As discussed in an earlier section, creating an expanded XML as a first step is a method used because there is no final XLink standard and tools that handle XLink constructs are lacking.
[0219] The type of the viewable page, as well as its target device, is determined from the stylesheet. The assembled XML and all the resulting viewable pages are written to one file, which is later split up, and these pages are written to the appropriate directories on the server
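The two-step assembly, expansion followed by XSL transformation, can be sketched with the standard Java XML APIs. The `INCLUDE` element with a `ref` attribute is a hypothetical stand-in for the subfragment reference syntax, which is not specified in this section. The in-memory fragment store and the sample stylesheet are also assumptions. Cyclic inclusions are not detected in this sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: expand subfragment references, then apply a stylesheet. */
final class PageAssembler {
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // XSLT processors expect namespace-aware DOM trees
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Replaces each INCLUDE element with the root of the referenced fragment until none remain. */
    static void expand(Document doc, Map<String, String> store) throws Exception {
        while (true) {
            NodeList includes = doc.getElementsByTagName("INCLUDE");
            if (includes.getLength() == 0) {
                return;
            }
            Element include = (Element) includes.item(0);
            String xml = store.get(include.getAttribute("ref"));
            if (xml == null) {
                throw new IllegalArgumentException("unknown fragment: " + include.getAttribute("ref"));
            }
            Node imported = doc.importNode(parse(xml).getDocumentElement(), true);
            include.getParentNode().replaceChild(imported, include);
        }
    }

    /** Runs the XSL transformation over the expanded document and returns the output page. */
    static String render(Document expanded, String stylesheet) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(expanded), new StreamResult(out));
        return out.toString();
    }

    /** Assembles a sample servable from one subfragment and renders it as plain text. */
    static String demo() {
        Map<String, String> store = new HashMap<>();
        store.put("author.xml", "<AUTHOR><NAME>Ann</NAME></AUTHOR>");
        String servable = "<STORY><TITLE>Launch</TITLE><INCLUDE ref=\"author.xml\"/></STORY>";
        String xsl = "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\"><xsl:value-of select=\"/STORY/TITLE\"/> by "
                + "<xsl:value-of select=\"/STORY/AUTHOR/NAME\"/></xsl:template>"
                + "</xsl:stylesheet>";
        try {
            Document doc = parse(servable);
            expand(doc, store);
            return render(doc, xsl);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the stylesheet sees the expanded document, its XPath expressions can reach into the subfragment content (`/STORY/AUTHOR/NAME`) as if it had been authored inline.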
[0220] Chaining of Trigger Monitor Stages
[0221] Currently, two Trigger Monitor stages are used in the publish process. They share an ODG, and the sink of the first one is the source of the second, creating a publishing chain.
[0222] When a fragment is checked in to the Content store, it is added to the shared ODG, and a publish command is issued to the first handler. Trigger Monitor reads the fragment XML from the source servlet, uses the extension parser to find its extension, and then uses the dependency parser to find dependencies to add to the ODG. The page assembler then pulls in the contents of the fragment's subfragments, and if the fragment is a servable, combines it with its stylesheets to produce the output pages (e.g., HTML files). The servable XMLs, output pages, binary files, and stylesheets—all fragments affected by the check-in—are sent to the servlet specified as the sink of the first handler. When a servable has been approved, a publish command on the servable fragment is issued to the second handler. It is reassembled and recombined with its XSLs, and the resulting XML and output pages are published to the production Web server through a second sink servlet. Binary files (such as images) are also published to the second sink. This is where the Web server pulls the final HTML and image files from.
[0223] Detailed Process Flow—
[0224]
[0225] 1. Information Analysis and Modeling
[0226] 2. Target Audience Analysis
[0227] 3. Target Device Analysis
[0228] 4. Workflow and Role Analysis
[0229] The four inputs above assist in defining how the information on the site should be organized and decomposed into reusable fragments of information. The analysis will directly impact the document templates, stylesheets, and auxiliary lookup tables that get constructed. In addition, this analysis will inform the process of defining the metadata that will be stored in the metadata database
[0230] The end result from this process inputs
[0231] Identify Meta Information, Servables and Fragments—
[0232] Next in process step
Function - Information architects and system designers identify the metatags and document types that will be used throughout an implementation of this process. They determine the fragmentation granularity and the composition of each servable and fragment from subfragments.
Input - The input is the results of the modeling and analysis from the external modules for information analysis, target audience analysis, target device analysis and workflow and role analysis.
Output - The output from this step is information to guide the construction of the metastore 712, the document templates and the stylesheets constructed in steps 1012, 1014 and 1016.
Initialize MetaStore - 1012
Function - A database administrator creates the metadata database(s) 712 and database tables.
Input - Input is a database management tool and the results of step 1010. This includes the type of meta tags to be included in the tables within the metadata database 712.
Output - The metadata database 712 is initialized and made operational. The tables and columns are set up in the database 712 that will allow for the storing and searching of documents within the system.
Create Document Templates - 1014
Function - A domain expert creates document templates that define the structure of the servables and fragments identified in step 1010. In addition, the domain expert creates auxiliary lookup tables for DTDs as well as the DTD-to-database mapping files.
Input - The input is the results of the information modeling and analysis modules (1002-1008) from step 1010.
Output - Multiple document templates (e.g., DTDs or schemas) that define the structure of each document type. These templates describe the structure of each document fragment and servable and how the elements in the document are related, including how many times (1 required, optional, 0 or more, or 1 or more, etc.) the element will appear in the final document. The lookup tables contain more information on each DTD element, such as the type information for each element, help strings, and any translations to more user-friendly names or other languages. The lookup table allows for the GUI to be automatically generated from the DTD. Further files specify the mapping of DTD elements to database tables.
Create Stylesheets - 1016
Function - A designer creates the stylesheets that determine the presentation and layout of the information in each servable for each target audience and target device.
Input - Results of the analysis modules, and results of step 1014.
Output - The output is multiple stylesheets for each servable document for each specified device.
Create/Edit and Compose Content - 1018
Function - Authors and editors create content for the Web site. A more detailed description of this step with sub-steps is given in FIG. 11.
Input - Content creation interface 702, document templates, knowledge about the requirement for new content or about the necessity to edit existing content.
Output - Content files in file system 714, meta information in metastore 712, information about the content dependencies in the object dependency graph.
Preview and Approve Content - 1020
Function - Authors, editors and approvers view the output produced from the content using the selected stylesheets.
Input - XML content and stylesheets along with the viewing interface on client editor 702.
Output - The output is the fully rendered pages on the Web or simulated on various devices (e.g., PalmPilot™) to be reviewed by the appropriate person in the workflow.
Publish - 1022
Function - Approvers and publishers publish the content to the presentation system.
Input - Input consists of the content created in step 1018, stylesheets created in step 1016, and the knowledge that the servables are ready for publishing from step 1020.
Output - Approved output pages are sent to the presentation engine.
[0233] Presentation Engine—
[0234] A presentation engine such as IBM's WebSphere™ platform is used to present the resulting Web page.
[0235] Details of Create/Edit Process Detail Flow—
[0236] The following is a further detail of the process flow
Editor Selects Type of New Document - 1102
Function - The editor selects the type of document to be created from a menu of possible types available for this person in the roles that they are associated with.
Input - A list of the document types that the particular editor can create.
Output - The output is the selection of a particular document type to edit. This may be a fragment or servable document type.
System Dynamically Creates a Blank Form - 1104
Function - The system creates a blank form based on the document template for the particular document type chosen.
Input - The user selection from 1102 and the document type definitions from step 1118.
Output - A form displayed in the client GUI 702 that allows the user to interactively add the content to the form. The form is based on the document template and only allows valid documents to be constructed based on the specification in the document type definition.
Editor Searches and Selects a Document - 1106
Function - The editor searches and selects an existing document using the metastore 712.
Input - The search interface allows the user to specify the constraints of the specific documents they want to retrieve.
Output - The output is the selection of a particular document to retrieve from the file system 714.
System Retrieves the Document - 1108
Function - The system retrieves the document.
Input - The input is the user's selection from step 1106 and the documents already created in the system.
Output - The output is the XML document and its attachments (if any).
System Dynamically Creates a Form and Fills it in - 1110
Function - The system dynamically creates a form similar to the form created in step 1104. But in this case, the system automatically fills it in with the values of the elements from the selected document.
Input - Input is the retrieved document from 1108 and the document definition from 1118.
Output - A form displayed in the client GUI 702, with the fields of the form initialized to the values of the elements of the retrieved document.
Editor Fills in the Form - 1112
Function - The editor fills the form with content for the newly created document.
Input - Input to this step is the form created in step 1104.
Output - The output is the form with all required fields filled in.
Search/Select Sub-Fragments - 1114
Function - The editor searches for subfragments and, if necessary, references them in the document being created/edited.
Input - The search interface is used to find relevant subfragments to be inserted into the document being created/edited.
Output - The output is a reference to a subfragment placed into the form of the current document.
Editor Modifies the Form - 1116
Function - The editor modifies the form of an existing document.
Input - Input to this step is the content and form created in step 1110.
Output - The output is the form with all required fields filled in.
Editor Checks in the Document - 1118
Further details are given in the functional block diagram of
Function - The editor checks in the created document.
Input - Input is the filled-in document in the editor window from either creating a new document 1112 or editing an existing one 1116.
Output - Output is the acknowledgement of the check-in process 1200.
[0237]
Details of Editor Checks in Document 1202
Function - The editor checks in the document to save it in the system.
Input - The form input from either a newly created document 1112 or a modified existing document 1116.
Output - The output is an XML document that conforms to the document template for the specified document type.
Save Document as XML File - 1204
Function - The document is saved in the file system 714.
Input - XML document from step 1202 is provided as input.
Output - The output is the XML file in the file system 714.
Save Attachments - 1206
Function - Any uploaded attachments (e.g., stylesheets, images, etc.) to the XML document are saved in the file system 714.
Input - The input is the content transferred to the server along with the XML document from 1204.
Output - The output is the attachments saved in the file system 714.
Save Meta Information in Metastore - 1208
Function - Meta information from the XML is saved to the metastore database 712. This includes automatically constructed data, such as user and modified time, as well as application-specific meta tags such as category definitions.
Input - The XML file being saved is the input to this step.
Output - The output is the meta data in the appropriate tables within the metastore database 712.
Update ODG - 1210
Function - The function of this step is to update the object dependency graph (ODG) with the various links between fragments. These links are inclusion links (e.g., subfragments included within another fragment) and other links such as stylesheet links (e.g., links between stylesheets and their servables).
Input - Input is the XML file from step 1208 with references to other fragments (e.g., subfragments or stylesheets).
Output - The output is an updated ODG with proper interdependencies between fragments in fragment dependency store.
Generate Preview Pages - 1212
Function - The purpose of this step is to cache the preview pages so they are immediately available when editors/approvers want to preview the servable pages.
Input - The update to the ODG 1210 triggers a publish of the servable pages from the XML file.
Output - The output is the temporary preview files in the file.
[0238] While the invention has been illustrated and described in the preferred embodiments, many modifications and changes therein may be effected by those skilled in the art. It is to be understood that the invention is not limited to the precise construction herein disclosed. Accordingly, the right is reserved to all changes and modifications coming within the true spirit and scope of the invention.