[0001] The present invention relates to a document processing apparatus and method which analyzes a tagged document, e.g., a Hyper Text Markup Language (HTML) document and forms another tagged document containing original sentences and translated sentences. The present invention also relates to a recording medium for recording such a tagged document.
[0002] With the recent proliferation of personal computers and communication apparatus, people have become able to use communication networks represented by the Internet, i.e., Internet protocol (IP) communication networks, and to easily obtain various sorts of information through the networks. The World Wide Web on the Internet generally uses HTML as a language for describing information. Dynamic HTML (DHTML) and Extensible Markup Language (XML) are other languages presently used to form tagged documents.
[0003] Conventionally, to form home pages (also called Web pages) containing the same information described in different languages, e.g., English and Japanese, the process of separately forming each home page is required. That is, the steps of forming sentences separately in each language, pasting common images including graphs and figures, separately setting links from each of the English home page and the Japanese home page, etc., are required.
[0004] Home pages are open to the public on the Internet and can be read by people everywhere in the world. Therefore, people who set up their home pages for various readers to read are making home pages having the same format and contents but having sentences written in different languages.
[0005] The English and Japanese home pages formed as described above need scrupulous attention for maintenance because their English and Japanese sentences after correction must be equivalent in meaning and format.
[0006] To form such English and Japanese home pages, it is necessary to make and manage two kinds of HTML document files for the English home page and the Japanese home page. That is, the number of files to be managed increases in proportion to the number of languages, and management and maintenance become difficult.
[0007] A user who wishes to read document information in an English home page in Japanese may translate the English document information by using Internet translation software. However, if the user wishes to edit the results of the translation displayed as an HTML document, he or she must either give up the idea or translate the document again with different translation software, because direct editing of the translated HTML document is impossible.
[0008] If the user dares to edit the translation-result HTML document, he or she must perform the steps of storing the translation-result HTML document on a local disk, opening the HTML document file stored on the local disk by using HTML document editing software, displaying the HTML document source, directly editing the HTML document source, and storing the results of the editing on the local disk. This process enables editing of the translation results to some effect. However, it is difficult to edit a document in which HTML tags, original sentences, and translated sentences are mixed.
[0009] Further, in a case where an HTML document intended as an object of translation is prepared in advance and, from this document, another HTML document described in a different language is formed by translation processing using Internet translation software, a need may arise to edit the HTML document in the second language formed by the translation processing and, if necessary, the translation-object HTML document, if the author of the HTML document is not satisfied with the results of the translation.
[0010] In this editing, it is difficult to determine document portions to be edited and to confirm the correspondence between original and translated sentences, since the translation-object HTML document and the translation-result HTML document exist in separate files. It is also possible that, through editing, the page configuration (format) of one document will become different from that of the other.
[0011] As described above, the conventional HTML document processing apparatus can be designed to enable translation of an original home page on the Internet using Internet translation software and visual display of original and translated sentences in a juxtaposed form. However, in editing translation results, an HTML document itself cannot be edited. There is a way to directly edit the HTML document source, but editing in such a way is extremely troublesome and not satisfactorily effective.
[0012] In view of the above-described circumstances, an object of the present invention is to provide a document processing apparatus and a recording medium which make it possible to easily form and maintain a home page (Web page data) expressed in two or more languages.
[0013] Another object of the present invention is to provide a document processing apparatus and a recording medium which make it possible to easily edit translated sentences obtained as a result of translation of a tagged document.
[0014] Still another object of the present invention is to provide a document processing apparatus and a recording medium which make it possible to selectively display original sentences in a tagged document and translated sentences obtained as a result of translation of the tagged document.
[0015] A further object of the present invention is to provide a document processing apparatus and a recording medium which make it possible to restore edited translated sentences, obtained as a result of translation, to a tagged document and to use that document.
[0016] A further object of the present invention is to provide a document processing apparatus and a recording medium which make it possible to easily edit original sentences in a tagged document to form a more favorable translation.
[0017] To achieve the above-described objects, according to one aspect of the present invention, there is provided a document processing apparatus comprising a language tag setting unit for setting a language tag designating a kind of language at each of constituent unit positions in an original and those in a translated version corresponding to the original, and a document forming unit for forming a tagged document including the original and the translated version each having the language tag set therein.
[0018] The above-described document forming unit may form a tagged document in which the original and the translated version each having the language tag set therein are described in an original-versus-version form.
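The language tag setting and document forming described above can be illustrated with a minimal sketch. The tag syntax `<lang-en>` / `<lang-ja>` and the function names are hypothetical and are not taken from this specification; the sketch only shows one way of placing a language tag at each constituent unit of the original and of the translated version and emitting them in an original-versus-version form.

```python
from typing import List, Tuple


def lang_tag(code: str) -> str:
    # Hypothetical language-tag syntax; the actual tag is not specified here.
    return f"<lang-{code}>"


def form_bilingual_document(pairs: List[Tuple[str, str]],
                            src: str = "en", dst: str = "ja") -> str:
    """Set a language tag before each original unit and each translated unit,
    and emit each pair together (original first, version second)."""
    parts = ["<html><body>"]
    for original, version in pairs:
        parts.append(
            f"<p>{lang_tag(src)}{original}{lang_tag(dst)}{version}</p>"
        )
    parts.append("</body></html>")
    return "\n".join(parts)


doc = form_bilingual_document(
    [("Hello.", "こんにちは。"), ("Thank you.", "ありがとう。")]
)
```

Keeping each original unit next to its translated unit in a single file is what allows one document to serve both languages.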
[0019] The above-described document processing apparatus further comprises a translation unit for translating the original to make the corresponding translated version.
[0020] The original may be contained in the processing-object tagged document.
[0021] The above-described document processing apparatus further comprises a visibility control tag setting unit for setting a visibility control tag for controlling any document portion so that the document portion is invisible, the visibility control tag setting unit setting the visibility control tag at such a position that one of the original and the translated version is in an invisible state.
[0022] The visibility control tag setting unit may set the visibility control tag at such a position that one of the original and the translated version is set in an invisible state and the language tag is also set in an invisible state.
[0023] The above-described document processing apparatus further comprises a display processing unit for interpreting the tag contained in the tagged document and for enabling a browser for displaying the tagged document to display the formed tagged document on the basis of a result of the interpretation.
[0024] According to another aspect of the present invention, there is provided a document processing apparatus comprising a language tag setting unit for setting a language tag at each of constituent unit positions in a first text described in a first language and those in a second text that is described in a second language and corresponds to the first text, and a document forming unit for forming a tagged document including the first text and the second text each having the language tag set therein.
[0025] The above-described document forming unit may form a tagged document in which the first text and the corresponding second text each having the language tag set therein are described by being related to each other.
[0026] The above-described document processing apparatus further comprises a visibility control tag setting unit for setting a visibility control tag for controlling a text so that the text is invisible, the visibility control tag setting unit setting the visibility control tag at such a position that one of the first text and the second text is in an invisible state.
[0027] The visibility control tag setting unit may set the visibility control tag at such a position that one of the first text and the second text is set in an invisible state and the language tag is also set in an invisible state.
[0028] The above-described document processing apparatus further comprises a display processing unit for interpreting the tag contained in the tagged document and for enabling a browser for displaying the tagged document to display the formed tagged document on the basis of a result of the interpretation.
[0029] According to still another aspect of the present invention, there is provided a document processing apparatus comprising an analysis unit for determining and extracting an original text from a processing-object tagged document, and an editing unit for enabling edit processing of the original text extracted from the tagged document by displaying the original text.
[0030] The analysis unit may determine the original text according to a language tag contained in the processing-object tagged document.
[0031] The above-described document processing apparatus further comprises a translation processing unit for making a translated version by translating the original text edited by the editing unit.
[0032] The above-described document processing apparatus further comprises a restoration unit for restoring, from the original text after the editing and the translated version made by the translation processing unit, a tagged document in the same format as the processing-object tagged document according to tags contained in the processing-object tagged document, the tags including the language tag.
[0033] According to yet another aspect of the present invention, there is provided a document processing apparatus comprising an analysis unit for determining an original text and a translated version corresponding to the original text in a processing-object tagged document, and an editing unit for enabling edit processing of the original text and the translated version by extracting the original text and the translated version from the tagged document and by displaying the original text and the translated version in an original-versus-version form.
[0034] The analysis unit may determine the original text and the translated version according to language tags contained in the processing-object tagged document.
[0035] The above-described document processing apparatus further comprises a translation processing unit for making a translated version by translating the original text edited by the editing unit.
[0036] The above-described document processing apparatus further comprises a restoration unit for restoring, from the original text after the editing and the translated version made by the translation processing unit, a tagged document in the same format as the processing-object tagged document according to tags contained in the processing-object tagged document, the tags including the language tag.
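The analysis, editing, and restoration described above can be sketched as follows. This is only an illustration under stated assumptions: the language tags `<lang-en>` and `<lang-ja>` are hypothetical, each translated unit is assumed to end at the next tag, and edited text is assumed not to contain the character `<`.

```python
import re
from typing import List, Tuple

# Hypothetical language tags; one unit is "<lang-en>original<lang-ja>version",
# with the version ending at the next "<" (assumption for this sketch).
UNIT = re.compile(r"<lang-en>(.*?)<lang-ja>(.*?)(?=<)", re.S)


def extract_pairs(doc: str) -> List[Tuple[str, str]]:
    """Determine original/version pairs according to the language tags."""
    return UNIT.findall(doc)


def restore(doc: str, edited: List[Tuple[str, str]]) -> str:
    """Put edited pairs back in order, leaving all other tags untouched."""
    pending = iter(edited)

    def put_back(match: "re.Match[str]") -> str:
        original, version = next(pending)
        return f"<lang-en>{original}<lang-ja>{version}"

    return UNIT.sub(put_back, doc)


doc = ("<p><lang-en>Hello.<lang-ja>こんにちわ。</p>"
       "<p><lang-en>Bye.<lang-ja>さようなら。</p>")
pairs = extract_pairs(doc)
# Edit only the first translated sentence, then restore the tagged document.
edited = [(pairs[0][0], "こんにちは。"), pairs[1]]
restored = restore(doc, edited)
```

Because only the text between language tags is replaced, the restored document keeps the same format as the processing-object document.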
[0037] According to a further aspect of the present invention, there is provided a recording medium readable by a computer, the recording medium having a program recorded thereon, the program for enabling the computer to execute a step of setting a language tag designating a kind of language at each of constituent unit positions in an original and those in a translated version corresponding to the original, and a step of forming a tagged document including the original and the translated version each having the language tag set therein.
[0038] The program recorded on this recording medium may enable the computer to further execute a step of setting a visibility control tag for controlling any document portion so that the document portion is invisible, the visibility control tag being set at such a position that one of the original and the translated version is in an invisible state.
[0039] According to still a further aspect of the present invention, there is provided a recording medium readable by a computer, the recording medium having a program recorded thereon, the program for enabling the computer to execute a step of setting a language tag designating a kind of language at each of constituent unit positions in a first text described in a first language and those in a second text that is described in a second language and corresponds to the first text, and a step of forming a tagged document including the first text and the second text each having the language tag set therein.
[0040] The program recorded on this recording medium may enable the computer to further execute a step of setting a visibility control tag for controlling a text so that the text is invisible, the visibility control tag being set at such a position that one of the first text and the second text is in an invisible state.
[0041] According to still a further aspect of the present invention, there is provided a recording medium readable by a computer, the recording medium having a program recorded thereon, the program for enabling the computer to execute a step of determining and extracting an original text from a processing-object tagged document, and a step of enabling editing of the text extracted from the tagged document by displaying the original text.
[0042] The program recorded on this recording medium may enable the computer to further execute a step of making a translated version by translating the original text after editing of the original text, and a step of restoring, from the original text after the editing and the translated version made by the translation, a tagged document in the same format as the processing-object tagged document according to tags contained in the processing-object tagged document, the tags including a language tag designating a kind of language.
[0043] According to still a further aspect of the present invention, there is provided a recording medium readable by a computer, the recording medium having a program recorded thereon, the program for enabling the computer to execute a step of determining an original text and a translated version corresponding to the original text in a processing-object tagged document, and a step of enabling editing of the original text and the translated version by extracting the original text and the translated version from the tagged document and by displaying the original text and the translated version in an original-versus-version form.
[0044] The program recorded on this recording medium may enable the computer to further execute a step of forming a translated version by translating the original text after editing of the original text, and a step of restoring, from the original text after the editing and the translated version made by the translation, a tagged document in the same format as the processing-object tagged document according to tags contained in the processing-object tagged document, the tags including a language tag designating a kind of language.
[0045] According to still a further aspect of the present invention, there is provided a recording medium readable by a computer, the recording medium having a tagged document recorded thereon, the recorded tagged document comprising a first language tag designating a first kind of language; a first text following the first language tag, the first text being described in a first language; a second language tag following the first text, the second language tag designating a second kind of language; and a second text following the second language tag, the second text being described in a second language and corresponding to the first text.
[0046] According to still a further aspect of the present invention, there is provided a recording medium readable by a computer, the recording medium having a tagged document recorded thereon, the recorded tagged document comprising an invisibility start tag for setting a following text in an invisible state; a first language tag following the invisibility start tag, the first language tag designating a first kind of language; a first text following the first language tag, the first text being described in a first language; a second language tag following the first text, the second language tag designating a second kind of language; an invisibility end tag following the second language tag, the invisibility end tag canceling the invisible state; and a second text following the invisibility end tag, the second text being described in a second language and corresponding to the first text.
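The tag order recited above can be made concrete with a short sketch. It assumes that HTML comment delimiters serve as the invisibility start and end tags and that `<lang-xx>` is a hypothetical language-tag syntax; neither the function name nor the tag syntax is taken from this specification.

```python
# Assumption: HTML comment delimiters act as invisibility start/end tags.
HIDE_START, HIDE_END = "<!--", "-->"


def unit_showing_second(first_text: str, second_text: str,
                        first: str = "en", second: str = "ja") -> str:
    """Emit: invisibility start tag, first language tag, first text,
    second language tag, invisibility end tag, second text."""
    if "--" in first_text:
        # A double hyphen inside an HTML comment would break the hidden span.
        raise ValueError("hidden text must not contain '--'")
    return (f"{HIDE_START}<lang-{first}>{first_text}"
            f"<lang-{second}>{HIDE_END}{second_text}")


unit = unit_showing_second("Hello.", "こんにちは。")
```

In this layout both language tags and the first text fall inside the invisible span, so only the second text remains visible.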
[0047] According to a further aspect of the present invention, there is provided a document processing method comprising a step of setting a language tag designating a kind of language at each of constituent unit positions in an original and those in a translated version corresponding to the original; and a step of forming a tagged document including the original and the translated version each having the language tag set therein.
[0048] The above-described document processing method further comprises a step of setting a visibility control tag for controlling any document portion so that the document portion is invisible, said visibility control tag being set at such a position that one of the original and the translated version is in an invisible state.
[0049] According to still a further aspect of the present invention, there is provided a document processing method comprising a step of setting a language tag designating a kind of language at each of constituent unit positions in a first text described in a first language and those in a second text that is described in a second language and corresponds to the first text; and a step of forming a tagged document including the first text and the second text each having the language tag set therein.
[0050] The above-described document processing method further comprises a step of setting a visibility control tag for controlling a text so that the text is invisible, said visibility control tag being set at such a position that one of the first text and the second text is in an invisible state.
[0051] According to still a further aspect of the present invention, there is provided a document processing method comprising a step of determining and extracting an original text from a processing-object tagged document; and a step of enabling editing of the original text extracted from the tagged document by displaying the original text.
[0052] The above-described document processing method further comprises a step of making a translated version by translating the original text after editing of the original text; and a step of restoring, from the original text after the edit processing and the translated version formed by said translation, a tagged document in the same format as the processing-object tagged document according to tags contained in the processing-object tagged document, said tags including a language tag designating a kind of language.
[0053] According to still a further aspect of the present invention, there is provided a document processing method comprising a step of determining an original text and a translated version corresponding to the original text in a processing-object tagged document; and a step of enabling edit processing of the original text and the translated version by extracting the original text and the translated version from the tagged document and by displaying the original text and the translated version in an original-versus-version form.
[0054] The above-described document processing method further comprises a step of forming a translated version by translating the original text after editing of the original text; and a step of restoring, from the original text after the edit processing and the translated version made by said translation, a tagged document in the same format as the processing-object tagged document, according to tags contained in the processing-object tagged document, said tags including a language tag designating a kind of language.
[0055] According to the present invention, it is not necessary to form and manage a file with respect to each of a plurality of languages, so that the maintenance can be easily carried out.
[0056] According to the present invention, an original text and a translated version of the original text obtained by translation are displayed in an original-versus-version form, so that the original text and the translated version can be easily edited.
[0057] According to the present invention, an original text in a tagged document and a translated version of the original text obtained by translation can be selectively displayed.
[0058] Further, according to the present invention, an edited translated version obtained as a result of translation can be restored to a tagged document and used.
[0059] Further, according to the present invention, an original text in a tagged document can be easily edited to obtain a more favorable translated version.
[0060] These objects and advantages of the present invention will become more apparent and more readily appreciated from the following detailed description of the presently preferred exemplary embodiments, taken in conjunction with the accompanying drawings of which:
[0061]
[0062]
[0063]
[0064]
[0065]
[0066]
[0067]
[0068]
[0069]
[0070]
[0071]
[0072]
[0073] Embodiments of the present invention will be described with reference to the accompanying drawings.
[0074] [First Embodiment]
[0075] [Configuration of HTML Document Processing Apparatus]
[0076]
[0077] The personal computer has, as is well known, a central processing unit (CPU), a random access memory (RAM), a hard disk, a drive unit for accessing a portable recording medium, such as a floppy disk or a compact disk-read only memory (CD-ROM), to read out a program or data recorded thereon, a communication control unit, such as a modem or a local area network (LAN) board, a display, a keyboard, and a mouse.
[0078] The input section
[0079] The display processing section
[0080] The program of the present invention and the Internet translation software are provided by being recorded on the portable recording medium, and are read out by the drive unit and stored in the hard disk in advance.
[0081] The above-described program and software, stored in an external unit, may be downloaded to the HTML document processing apparatus
[0082] The program of the present invention and the Internet translation software stored in the hard disk are read to the RAM to be executed by the CPU. The program of the present invention and the Internet translation software may be directly read to the RAM without being stored in the hard disk, instead of being temporarily stored in the hard disk and thereafter read to the RAM as described above.
[0083] HTML documents formed by the HTML document processing apparatus
[0084] In the HTML document processing apparatus
[0085] In this translation and display process, the original-versus-version HTML document forming processing section
[0086] [Operation of HTML Document Processing Apparatus]
[0087] The operation of the HTML document processing apparatus
[0088] (Original-Versus-Version HTML Document Forming Processing)
[0089] The operation will be described by way of example with respect to a case where, in the HTML document processing apparatus
[0090] Referring also to
[0091] Next, the user initiates a program for translation in the desired direction by operating the input section
[0092] The user clicks a “translation” button displayed in an initial window (dialogue window) (not shown) of the translation program to make the HTML document processing apparatus
[0093] The original-versus-version HTML document forming processing section
[0094] The original-versus-version HTML document forming processing section
[0095] Next, the original-versus-version HTML document forming processing section
[0096] The original-versus-version HTML document forming processing section
[0097] The original-versus-version HTML document forming processing section
[0098] After the original-versus-version HTML document forming processing performed by the original-versus-version HTML document forming processing section
[0099] (Processing for Selectively Displaying English HTML Document (Original) and Japanese HTML Document (Version))
[0100] In the HTML document processing apparatus, processing described below is performed after the original-versus-version HTML document forming processing based on the above-described sequence of document analysis steps. The processing described below enables selective display of only one of the English HTML document (original) and the Japanese HTML document (translated version) displayed in the upper-lower juxtaposition form on the screen of the display section
[0101] The user determines display of the document in the particular language for selective display of the document, and clicks a button corresponding to the language to be displayed (a “Japanese” button in this example) (S
[0102] The individual-language HTML document conversion processing section
[0103] If the individual-language HTML document conversion processing section
[0104] Thereafter, to enable visual display of only the Japanese sentences in the English original-vs.-Japanese version HTML document, the individual-language HTML document conversion processing section
[0105] The well-known comment tag may be used as the above-mentioned visibility control tag. A sentence or paragraph bracketed by a pair of comment tags is not displayed by the WWW browser
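The effect of bracketing text with comment tags can be checked with a small sketch that uses the Python standard-library `html.parser` module as a stand-in for a browser. The class and function names are illustrative assumptions; the sketch collects only the text that a parser reports as character data, which excludes anything inside a comment.

```python
from html.parser import HTMLParser
from typing import List


class VisibleText(HTMLParser):
    """Collects character data; comment contents are never reported as data."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)

    # handle_comment keeps its default no-op behaviour, so text placed
    # between "<!--" and "-->" is dropped.


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


shown = visible_text("<p><!--Hello.-->こんにちは。</p>")
```

Here the English sentence is bracketed by comment tags and the Japanese sentence is not, so only the Japanese sentence is collected.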
[0106] If the individual-language HTML document conversion processing section
[0107] An original-versus-version HTML document
[0108] If no language tag is detected in above step S
[0109] The individual-language HTML document conversion processing section
[0110] The display processing section
[0111] Further,
[0112] [Second Embodiment]
[0113] [Configuration of HTML Document Processing Apparatus]
[0114]
[0115] The HTML document processing apparatus
[0116] In the HTML document processing apparatus arranged as described above, the automatic translation processing section
[0117] In this translation and display process, the original-versus-version HTML document forming processing section
[0118] The HTML document analysis and conversion processing section
[0119] If an HTML document is translated by Internet translation software, translation results outputted in an original-versus-version form cannot be edited. However, document data prepared by removing HTML tags from the translation results is supplied to the editor capable of editing data in an original-versus-version form to enable editing of the translation results. Also, HTML tags are restored in edit results to enable the edit results to be used as an HTML document.
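The removal and later restoration of HTML tags can be sketched as follows. This is a simplified illustration, not the processing of the embodiment: tags are recognized with a naive regular expression, and the function names are hypothetical. The document is split into tag and text tokens, only the text tokens are handed to an editor, and the edited text is written back into the same slots.

```python
import re
from typing import List

# Naive tag pattern (assumption): anything from "<" to the next ">".
TAG = re.compile(r"(<[^>]+>)")


def split_tokens(html: str) -> List[str]:
    """Split into tag tokens and text tokens, preserving their order."""
    return [tok for tok in TAG.split(html) if tok != ""]


def text_slots(tokens: List[str]) -> List[int]:
    """Indices of tokens that hold editable text rather than tags."""
    return [i for i, tok in enumerate(tokens)
            if not TAG.fullmatch(tok) and tok.strip()]


def restore_tags(tokens: List[str], slots: List[int],
                 edited: List[str]) -> str:
    """Write edited text back into its slots and rejoin with the tags."""
    out = list(tokens)
    for i, text in zip(slots, edited):
        out[i] = text
    return "".join(out)


tokens = split_tokens("<p>Hello.</p><p>こんにちわ。</p>")
slots = text_slots(tokens)
for_editor = [tokens[i] for i in slots]      # tag-free text shown to the user
result = restore_tags(tokens, slots, ["Hello.", "こんにちは。"])
```

Recording the position of each text token is what lets the edited text be returned to a document with the original tag structure.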
[0120] [Operation of HTML Document Processing Apparatus]
[0121] The operation of the HTML document processing apparatus
[0122] (Original-Versus-Version HTML Document Forming Processing)
[0123] Referring to
[0124] (Original-Versus-Version HTML Document Data Conversion Processing and Translated Version Edit Processing)
[0125] In the HTML document processing apparatus
[0126] When a user clicks an “original-versus-version edit processing start” button, the automatic translation processing section
[0127] The HTML document analysis and conversion processing section
[0128] First, the HTML document analysis and conversion processing section
[0129] If the HTML document analysis and conversion processing section
[0130] Thereafter, to enable display and edit of English sentences in the English original-vs.-Japanese version HTML document, the HTML document analysis and conversion processing section
[0131] If the result of above determination in step S
[0132] If no language tag is detected in above step S
[0133] The HTML document analysis and conversion processing section
[0134] When the HTML tag indicating the end of the original-versus-version HTML document is detected, the HTML document analysis and conversion processing section
[0135] The user edits the Japanese sentences according to his or her need on the basis of the English and Japanese paragraphs or sentences visually displayed in the left-right juxtaposition form in the window of the English-Japanese translation editor. The symbol in the translated sentences shown in
[0136] (Processing for Restoration to HTML Document Format)
[0137] In the HTML document processing apparatus
[0138] When the user clicks an “HTML display” button in the English-Japanese translation editor window, the English-Japanese translation editor (original-versus-version editor program)
[0139] First, the original-versus-version HTML document conversion processing section
[0140] If the result of this determination is that the paragraph or sentence read out is English, the original-versus-version HTML document conversion processing section
[0141] The original-versus-version HTML document conversion processing section
[0142] The original-versus-version HTML document conversion processing section
[0143] When the original-versus-version HTML document is displayed on the editor, the English paragraphs or sentences and the positions of the same, and the Japanese paragraphs or sentences and the positions of the same in the original-versus-version HTML document are stored in the respective storage areas in the data storage section
[0144] The formed original-versus-version HTML document can be displayed in the form of an original-versus-version home page by the Internet display tool WWW browser
[0145] [Third Embodiment]
[0146] [Configuration of HTML Document Processing Apparatus]
[0147] An HTML document processing apparatus
[0148] In the HTML document processing apparatus
[0149] In this HTML document processing apparatus
[0150] Also, the original-versus-version HTML document conversion processing section
[0151] [Operation of HTML Document Processing Apparatus]
[0152] (Data Conversion Processing and Original Edit Processing of English HTML Document (Original))
[0153]
[0154] Next, the user initiates a program for translation in the desired direction by operating the input section
[0155] The user clicks an “original-versus-version edit processing start” button in an initial window (dialogue window) (not shown) of the translation program to start the English-Japanese translation editor (original-versus-version editor program)
[0156] The HTML document analysis and conversion processing section
[0157] Next, the HTML document analysis and conversion processing section
[0158] If HTML tag is detected in above step S
[0159] The HTML document analysis and conversion processing section
[0160] When the HTML tag indicating the end of the English HTML document is detected, the HTML document analysis and conversion processing section
[0161] The user performs manual translation on the basis of the English paragraphs or sentences (original) visually displayed in the window of the English-Japanese translation editor, and forms translated sentences (Japanese sentences) by operating the input section
[0162] (Processing for Restoration to HTML Document Format)
[0163] In the HTML document processing apparatus
[0164] When the user clicks an “HTML display” button in the English-Japanese translation editor window, the English-Japanese translation editor (original-versus-version editor program)
[0165] First, the original-versus-version HTML document conversion processing section
[0166] If the result of this determination is that the paragraph or sentence read out is English, the original-versus-version HTML document conversion processing section
[0167] The original-versus-version HTML document conversion processing section
[0168] The original-versus-version HTML document conversion processing section
[0169] The original-versus-version HTML document conversion processing section
[0170] Also in this embodiment, HTML document restoration processing can be performed by using the tags in the original-versus-version HTML document in the same manner as that in Embodiment 2.
[0171] Also, selective display processing (see
[0172] [Examples of Modification]
[0173] In the HTML document processing apparatus
[0174] In each of the HTML document processing apparatuses in the above-described embodiments, an original written in English is translated into Japanese. However, the HTML document processing apparatus of the present invention can also operate in the same manner with respect to other languages, and can similarly process an HTML document having sentences written in three or more languages.
[0175] In the HTML document processing apparatus
[0176] Each of the above-described processes according to the present invention can be applied in association with a computer-readable medium.
[0177] Although only a few embodiments of the present invention have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the preferred embodiments without departing from the novel teachings and advantages of the present invention. Accordingly, all such modifications are intended to be included within the scope of the present invention as defined by the following claims.