Title:
Computer-Implemented System And Method For Call Response Processing
Kind Code:
A1


Abstract:
A computer-implemented system and method for call response processing is provided. A call is monitored between an agent and a customer. A stream of speech with inquiries from the customer is received during the call. At least a portion of a script is received from the agent in response to one of the inquiries from the customer, and a spoken response is provided to the customer based on the script from the agent. A text message is further received from the agent in response to another of the inquiries from the customer. The text message is converted to a spoken response and provided to the customer.



Inventors:
Waalkes, Adam (Sammamish, WA, US)
Sutherland, Alastair (Seattle, WA, US)
Odinak, Gilad (Bellevue, WA, US)
Jiang, Haodong Howard (Issaquah, WA, US)
Application Number:
15/641183
Publication Date:
10/19/2017
Filing Date:
07/03/2017
Assignee:
Intellisist, Inc. (Seattle, WA, US)
International Classes:
H04M3/523; H04M3/22; H04M3/51; H04M3/527
US Patent References:
20170208175    N/A    2017-07-20
7058578    N/A    2006-06-06



Primary Examiner:
DEANE JR, WILLIAM J
Attorney, Agent or Firm:
STEVENS & SHOWALTER LLP (7019 CORPORATE WAY DAYTON OH 45459-4238)
Claims:
What is claimed is:

1. A computer-implemented system for call response processing, comprising: a monitor to monitor a call between an agent and a customer; a speech module to receive a stream of speech with inquiries from the customer; a script module to receive from the agent at least a portion of a script in response to one of the inquiries from the customer and to provide a spoken response to the customer based on the script from the agent; a text module to receive from the agent a text message in response to another of the inquiries from the customer; and a text-to-speech module to convert the text message to a spoken response and to provide the spoken response to the customer.

2. A system according to claim 1, further comprising: a compiler to compile the script and the text message from the agent and the speech from the customer as record of the call.

3. A system according to claim 2, further comprising: a correction module to receive from the agent one of input and corrections to the call record.

4. A system according to claim 1, wherein the call is one of up to four call sessions simultaneously presented to the agent via an agent console.

5. A system according to claim 4, further comprising: an active call module to designate one of the simultaneous calls as active and to receive from the agent instructions for the active call comprising one of playing a recording of a transcribed message to the agent, playing a synthesized message to the customer, and activating at least a portion of one of the scripts.

6. A system according to claim 1, further comprising: a conversion module to convert the speech from the customer to text and to provide the text to the agent.

7. A system according to claim 6, further comprising: a speech identification module to identify one or more utterances in the speech from the agent; and an instruction module to receive instructions from the agent for one of ranking and reordering the identified utterances.

8. A system according to claim 1, wherein the text message from the agent is manually typed.

9. A system according to claim 1, further comprising: a transfer module to transfer the customer to a live agent.

10. A system according to claim 1, further comprising: a database to store a portion of the speech from the customer as parsed data.

11. A computer-implemented method for call response processing, comprising: monitoring a call between an agent and a customer; receiving a stream of speech with inquiries from the customer; receiving from the agent at least a portion of a script in response to one of the inquiries from the customer; providing a spoken response to the customer based on the script from the agent; further receiving from the agent a text message in response to another of the inquiries from the customer; converting the text message to a spoken response; and providing the spoken response to the customer.

12. A method according to claim 11, further comprising: compiling the script and the text message from the agent and the speech from the customer as record of the call.

13. A method according to claim 12, further comprising: receiving from the agent one of input and corrections to the call record.

14. A method according to claim 11, wherein the call is one of up to four call sessions simultaneously presented to the agent via an agent console.

15. A method according to claim 14, further comprising: designating one of the simultaneous calls as active; and receiving from the agent instructions for the active call comprising one of playing a recording of a transcribed message to the agent, playing a synthesized message to the customer, and activating at least a portion of one of the scripts.

16. A method according to claim 11, further comprising: converting the speech from the customer to text; and providing the text to the agent.

17. A method according to claim 16, further comprising: identifying one or more utterances in the speech from the agent; and receiving instructions from the agent for one of ranking and reordering the identified utterances.

18. A method according to claim 11, wherein the text message from the agent is manually typed.

19. A method according to claim 11, further comprising: transferring the customer to a live agent.

20. A method according to claim 11, further comprising: storing a portion of the speech from the customer as parsed data.

Description:

CROSS-REFERENCE TO RELATED APPLICATION

This non-provisional patent application is a continuation of U.S. patent application Ser. No. 13/893,231, filed on May 13, 2013, pending, which is a continuation of U.S. Pat. No. 8,442,209, issued May 14, 2013, which claims priority to U.S. Provisional Patent Application Ser. No. 60/838,102, filed Aug. 15, 2006, the disclosures of which are incorporated by reference.

FIELD

The present invention relates in general to automated call center operation and, in particular, to a system and method for call response processing.

BACKGROUND

Customer call centers, or simply, “call centers,” are often the first point of contact for customers seeking direct assistance from manufacturers and service vendors. Call centers are reachable by telephone, including data network-based telephone services, such as Voice over Internet Protocol (VoIP), and provide customer support and problem resolution. Although World Wide Web- and email-based customer support is becoming increasingly available, call centers still offer a convenient and universally-accessible forum for remote customer assistance.

The timeliness and quality of service provided by call centers are critical to ensuring customer satisfaction, particularly where caller responses are generated through automation. Generally, the expectation level of callers is lower when they are aware that an automated system, rather than a live human agent, is providing assistance. However, customers become less tolerant of delays, particularly when the delays occur before every automated system-generated response. Minimizing delays is crucial, even when caller volume is high.

To properly service callers, agents frequently need to perform troubleshooting or diagnosis to identify the cause of the problem or service required. Fully automated interactive voice response (IVR) based call centers rely on scripted exchanges with callers that are intended to elicit specific information in a predefined sequence, which may not be user-friendly or responsive to urgent needs. In addition, while some of the information may be critical to resolving the situation, other information may not be essential and could either be deferred or omitted entirely.

Fully automated system interfaces generally are not as user friendly as a live agent, as interacting with the automated system through a scripted caller interface may be awkward or uncomfortable to callers. Moreover, callers that are under stress or in a hurry may be unable or unwilling to answer questions in a particular order, preferring to answer questions through regular conversational speech in a free-form format. As a result, an inflexible script could force a loss of information or require repeated questions, which can in turn degrade the caller experience through frustration or loss of confidence.

Therefore, there is a need to accommodate non-sequential and non-linear information gathering by call center agents, which allows caller interactions to progress free form in a more natural and conversational order than might ordinarily be experienced with automated caller response systems.

SUMMARY

Automated call center agents are each able to process several callers at a time through an agent console that includes a graphical user interface. The agent interacts indirectly with each caller session through a separate window or screen view displayed on the agent console in which a running non-linear and non-sequential dialogue between the agent and the caller is displayed. The caller hears questions from the agent as machine-generated dialogue, but is able to respond using natural speech, dual-tone multi-frequency (DTMF) tones, text messages, or other form, which can be free form. If possible, the information essential to determining the proper disposition of each call is first gathered, after which appropriate topics can be discussed or return information provided. The agent can review the conversation at any point during a caller session to collect information that may have been missed or which was applicable to another topic to minimize repetition and facilitate efficient troubleshooting and problem resolution.

A further embodiment provides a computer-implemented system and method for call response processing. A call is monitored between an agent and a customer. A stream of speech with inquiries from the customer is received during the call. At least a portion of a script is received from the agent in response to one of the inquiries from the customer, and a spoken response is provided to the customer based on the script from the agent. A text message is further received from the agent in response to another of the inquiries from the customer. The text message is converted to a spoken response and provided to the customer.

Still other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein are described embodiments of the invention by way of illustrating the best mode contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and the scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a functional block diagram showing an automated call center operational environment.

FIG. 2 is a functional block diagram showing the automated call center of FIG. 1.

FIG. 3 is a process flow diagram showing, by way of example, a typical call sequence, as transacted in the automated call center of FIG. 2.

FIG. 4 is a screen diagram showing, by way of example, a view of an agent console for a live call session.

FIG. 5 is a process flow diagram showing a method for processing out-of-order caller responses during automated call processing, in accordance with one embodiment.

FIG. 6 is a screen diagram showing, by way of further example, a view of an agent console with a plurality of active caller sessions.

FIG. 7 is a screen diagram showing, by way of further example, a portion of a call dialogue that includes out-of-order caller responses.

FIG. 8 is a block diagram showing a system for processing out-of-order caller responses during automated call processing, in accordance with one embodiment.

DETAILED DESCRIPTION

Automated Call Center Operational Environment

Automated call centers are a form of voice-response system, which rely upon programmed control to guide callers through a machine-generated dialogue for receiving support and problem resolution assistance. FIG. 1 is a functional block diagram showing an automated call center operational environment 10. By way of example, a multiplicity of callers call into an automated call center 11, generally through telephonic means, which can include conventional telephone handsets 13a-c through Plain Old Telephone Service (POTS) 12, portable handsets 15a-c through cellular and satellite telephone service 14, VoIP clients 17a-b, and Internet telephony clients 18a-b, as well as other forms and combinations of telephony, message, and voice-based communications. For example, a telephone caller may provide responses to the automated call center 11 through a combination of voice, DTMF tones, or text messaging. Callers can also “call” or interface into the automated call center 11 using conventional network clients 19 through an internetwork 17, including the Internet. Other types of automated call center access are possible.

Except as otherwise stated, as used herein, the terms “caller,” “user,” and “customer” are used interchangeably to refer to a caller to the automated call center 11. Similarly, the terms “agent,” “guide,” and “operator” are used interchangeably to refer to an agent that provides service provisioning to the caller to the automated call center 11. Additionally, although the automated call center 11 is shown as a single point within the automated call center operation environment 10, the automated call center 11 could include two or more logically interconnected but physically separate, including geographically removed, call center operations, which collectively provide a unified automated call center. Other automated call center arrangements and configurations are possible.

Automated Call Center

Generally, an automated call center provides a single dial-in source for support and problem resolution assistance for customers seeking direct help from manufacturers, service vendors, and others. However, automated call centers 11 can also be applied to many other diverse areas of commerce, such as mail order sales and roadside assistance dispatch. FIG. 2 is a functional block diagram showing an automated call center 30 for use in the automated call center operational environment 10 of FIG. 1. The automated call center 30 includes one or more servers 31, 34 and agent consoles 32a-c operatively interconnected over an internal network 33. The agent consoles 32a-c could also be interconnected to the message server 31 over an external network infrastructure 33, such as the Internet or a non-public enterprise data network. The network infrastructure 33 can be either wired or wireless and, in one embodiment, is implemented based on the Transmission Control Protocol/Internet Protocol (TCP/IP) network communications specification, although other types or combinations of networking implementations are possible. Similarly, other network topologies and arrangements are possible.

A message server 31 provides the primary message-based communications infrastructure for the call center operation, such as described in commonly-assigned U.S. Pat. No. 7,292,689, issued Nov. 6, 2007 to Odinak et al., and U.S. Pat. No. 8,170,797, issued May 1, 2012 to Odinak, the disclosures of which are incorporated by reference. During regular operation, the message server 31 executes multiple threads to process multiple simultaneous calls, which are handled by agents executing agent applications on agent consoles 32a-c.

Customer calls are received through a telephony interface 35, which is operatively coupled to the message server 31 to provide access to a telephone voice and data network 36. In one embodiment, the telephony interface connects to the telephone network 36 over a T-1 carrier line, which can provide individual channels of voice or data traffic. Other types of telephone network connections are possible.

The automated call center 30 also includes a recognition server 34 that interfaces directly to the message server 31 as a top-level or root tier of a speech recognition hierarchy. The message server 31 sends streamed audio data for each user call to the recognition server 34, which then performs distributed speech recognition. The message server 31 assigns grammar generation and speech recognition to the recognition server 34. Upon startup, the telephony gateway 35 opens a T-1 carrier device channel for each available T-1 time slot. The telephony gateway 35 initiates a new connection to the message server 31, one connection per T-1 device channel, and the message server 31, in turn, initiates a corresponding new connection to the recognition server 34.

The separate telephony gateway-to-message server and message server-to-main recognizer connections form one concurrent session apiece. When a customer call is answered or connected, the telephony gateway 35 sends a call message to the message server 31. The message server 31 then sends a new call message to the recognition server 34.
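
The startup and call-arrival message flow described above can be sketched as follows. This is an illustrative sketch only: the class names, the default of 24 time slots, and the "call" and "new call" message strings are assumptions made for the example and are not taken from the disclosed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class RecognitionServer:
    """Hypothetical stand-in for recognition server 34."""
    connections: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def connect(self, channel: int) -> None:
        self.connections.add(channel)

    def receive(self, channel: int, message: str) -> None:
        self.log.append((channel, message))


@dataclass
class MessageServer:
    """Hypothetical stand-in for message server 31."""
    recognizer: RecognitionServer
    connections: set = field(default_factory=set)

    def connect(self, channel: int) -> None:
        # Each gateway connection triggers a corresponding recognizer connection.
        self.connections.add(channel)
        self.recognizer.connect(channel)

    def receive(self, channel: int, message: str) -> None:
        # A call message from the gateway is relayed as a new call message.
        if message == "call":
            self.recognizer.receive(channel, "new call")


class TelephonyGateway:
    """Hypothetical stand-in for telephony gateway 35."""

    def __init__(self, server: MessageServer, slots: int = 24) -> None:
        self.server = server
        # One device channel, and one server connection, per time slot.
        for channel in range(slots):
            server.connect(channel)

    def answer(self, channel: int) -> None:
        self.server.receive(channel, "call")


recognizer = RecognitionServer()
server = MessageServer(recognizer)
gateway = TelephonyGateway(server, slots=24)
gateway.answer(3)  # a customer call is connected on channel 3
```

In this sketch each channel index identifies one gateway-to-message server connection and its paired message server-to-recognizer connection, so a call answered on a channel is relayed along that pair.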

The message server 31 also assigns each new call to one of the agent consoles 32a-c through dynamic load balancing, such as further described in commonly-assigned U.S. Provisional Patent application, entitled “System and Method for Balancing Agent Console Load During Automated Call Processing,” Ser. No. 60/838,074, filed Aug. 15, 2006, the disclosure of which is incorporated by reference. Each agent console 32a-c provides the primary means for direct customer interaction by providing service provisioning and related assistance to callers, such as further described below with reference to FIG. 3. The main purpose of each agent console 32a-c is to execute one or more agent applications, which display both user and agent messages and provide menus of actions that can be executed in response to agent commands, including script execution. One or more agent applications execute on each agent console and one or more agent consoles can execute in parallel. Alternatively, multiple instances of agent applications can run on a server machine and can be accessed by agents at agent consoles operating as remote terminals.
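
By way of illustration only, call assignment could be approximated with a least-loaded heuristic. The balancing method itself is described in the referenced application and is not reproduced here; the function name `assign_call`, the tie-breaking rule, and the use of the four-session limit from the embodiment in which up to four sessions are presented to an agent are all assumptions of this sketch.

```python
from typing import Optional


def assign_call(loads: dict[str, int], max_sessions: int = 4) -> Optional[str]:
    """Assign a new call to the console with the fewest sessions.

    `loads` maps a console identifier to its current session count and is
    updated in place. Returns None when every console is at capacity.
    """
    available = {c: n for c, n in loads.items() if n < max_sessions}
    if not available:
        return None
    # Ties are broken by console identifier so the result is deterministic.
    console = min(available, key=lambda c: (available[c], c))
    loads[console] += 1
    return console
```

A caller of this hypothetical function would hold the call, or route it elsewhere, when `None` is returned.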

Each component, including the message server 31, recognition server 34, and agent consoles 32a-c, is implemented as a computer program, procedure or module, or state transition written as source code or flow specification in a conventional programming language, such as the C++ programming language or modeling tool, and presented for execution by a computer system as object or byte code. Alternatively, the components could be directly implemented in hardware, either as integrated circuitry or burned into read-only memory components. The various implementations of the source code and object and byte codes can be held on a computer-readable storage medium.

Exemplary Typical Call Sequence

Callers to the automated call center interface indirectly with an agent, who multitasks through one or more individual caller sessions via an agent console 32a-c. The callers, however, only perceive an automated calling interface heard, for instance, through their telephone or portable handset. Calls are processed through a sequence of phases. FIG. 3 is a process flow diagram showing, by way of example, a typical call sequence 40, as transacted by the automated call center 30 of FIG. 2. Although the form of call processing required may vary by subject matter area and other factors, the same overall sequence of caller-to-agent interchanges will loosely apply.

Initially, upon calling into the automated call center 30, each user receives an initial greeting and, optionally, an informational message 41 providing a synopsis of caller options. When provided, the caller options enable the user to navigate through to a specific topic area for assistance or support. However, the user could simply proceed directly into a customer support scenario 42 with either an agent, which is a live person; with a guide, which is an automated prompt that is under the supervision of a live person; or through automated voice response to enable information collection, problem troubleshooting, and other actions. When present, the guide can listen to and revise caller replies in the background, generally unbeknownst to the caller, such as described in commonly-assigned U.S. Pat. No. 7,292,689, entitled “System and Method for Providing a Message-Based Communications Infrastructure for Automated Call Center Operation,” issued Nov. 6, 2007, the disclosure of which is incorporated by reference. Other guide functions and caller dispositions are possible.

The scenario 42 can be delayed by agent unavailability, caller volume capacity limits, and other factors that can delay providing a response to the caller. Throughout the session, the agent can adjust and customize the flow of the call processing by evaluating caller attributes, such as further described in commonly-assigned U.S. Provisional Patent application, entitled “System and Method for Managing a Dynamic Call Flow During Automated Call Processing,” Ser. No. 60/838,101, filed Aug. 15, 2006, the disclosure of which is incorporated by reference. In addition, the manner in which information is gathered from the user to determine the proper disposition of the call and to assist the caller can be dynamically evaluated and controlled, as further described below beginning with reference to FIG. 5.

As required, service-provisioning 43 is provided to the user, either directly during the course of the call or indirectly through a service request dispatch. The type of service provided could include sending information or products, entering a subscription or enrollment, or other forms of commerce or assistance that can be remotely provided over the telephone. Finally, the call ends in a wrap-up 44, which provides closure to the call and a departing salutation. Other forms and variations of customer call sequences are feasible. Importantly, however, from the perspective of the caller, the experience may appear to be an interaction with an intelligent machine and the caller would be aware that the agent is automated, not human. Accordingly, the caller would have a more relaxed expectation of agent responsiveness since a machine, and not an actual person, is on the line.

Following call termination, the automated call center 30 can perform post-call processing 45, which identifies individual speech utterances in each call that can be analyzed or stored into a database 34. The identified speech utterances can be presented to an agent for manipulation, such as ranking or reordering. In addition, the post-call processing 45 can include performing speech recognition on the speech utterances, identifying speaker characteristics, and marking certain speech utterances for later use. In a further embodiment, the automated call center 30 can also perform in-progress call processing 46, which operates on a stream of speech utterances copied from on-going calls. The same types of processing operations can be performed as on completed calls, but in-progress processing allows an agent to perform real-time call analysis.
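
The agent-directed reordering of identified utterances can be sketched as below. The `Utterance` record, its field names, and the `reorder` helper are hypothetical and serve only to illustrate one way such manipulation might be represented.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    index: int            # position of the utterance in the original call
    speaker: str          # for example "caller" or "agent"
    text: str             # recognized text of the utterance
    marked: bool = False  # flagged for later use


def reorder(utterances: list[Utterance], order: list[int]) -> list[Utterance]:
    """Place utterances in the agent-chosen order.

    Utterances not named in `order` follow in their original call order.
    Unknown or repeated indices in `order` are ignored.
    """
    by_index = {u.index: u for u in utterances}
    wanted = [i for i in dict.fromkeys(order) if i in by_index]
    chosen = [by_index[i] for i in wanted]
    rest = [u for u in utterances if u.index not in set(wanted)]
    return chosen + rest
```

The same helper could be applied to a completed call or to a snapshot of the utterances copied from an on-going call.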

Exemplary Agent Console

Call sessions are displayed on each agent console through a graphical user interface (GUI). FIG. 4 is a screen diagram showing, by way of example, a view of an agent console 50 for a live call session. Multiple call sessions can be simultaneously displayed and managed through the GUI controls.

The particulars concerning the live call session are obtained by the message server 31 and provided to the agent. The session particulars can include, for instance, caller profile 51, address 52, contact information 53, reason for call 54, and credit card data 55. Other types of session particulars are possible.

Operationally, the agent can review a recording of the caller's spoken responses through intuitive playback controls 57 and update the display by operating an “Update” control 56. Additionally, the agent can transfer the caller to a live agent by operating a “Transfer to Live Agent” control 58 and can log out by operating a “Logout” control 59. Other GUI controls are possible.

Out-of-Order Caller Response Processing

The manner in which information is solicited from callers can be provided in part as an open ended dialogue with callers through an agent console 32a-c. FIG. 5 is a process flow diagram showing a method 70 for processing out-of-order caller responses during automated call processing, in accordance with one embodiment. As each agent is indirectly interacting with callers through dialogue that is generated as text-to-speech by the message server 31, allowing calls to be handled in an unstructured sequence can provide agents with extra time in which to multitask, or listen to or review caller messages in near real time.

Before an agent is able to provide assistance to a caller, the agent must first gather any information that is essential to determining the appropriate disposition of the call (operation 71). At a minimum, essential information includes the caller's identity and their reason for calling. Other essential information is possible. Once obtained, the agent can determine the call disposition (operation 72) and begin engaging in automated interaction with the caller (operation 73), such as described above with reference to FIG. 3.

Throughout the call session, the agent can explore various topics by blocks of dialogue during which both conditionally essential and non-essential information on current or previous topics is gathered via information-gathering queries (operations 74 and 75, respectively). Conditionally essential information is essential within the context of the current topic, whereas non-essential information is all other information that is not required to move on to the next topic, including information that is off-topic or extraneous. For instance, credit card information might be essential within the context of shipping products to the caller to ensure payment. By contrast, the types of music that the caller prefers might be non-essential information, yet still be applicable to a separate topic regarding product preferences. The agent is able to review the session dialogue and can input or correct the information provided at any time throughout the call, not just at the moment at which a caller provides a particular answer. In addition, the agent can re-prompt the caller for any information on the current or previous topics that he or she was unable to gather from the caller's initial utterances (operation 76). The agent can move to a next topic in the dialogue (operation 77) and continue with facilitating problem resolution or service provisioning. Other types of call handling sequencing are possible.
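
The distinction between conditionally essential and non-essential information, and the re-prompting of operation 76, can be sketched as follows. The `Topic` and `Session` classes and the item names are hypothetical; the sketch assumes that answers are keyed by item name and may be recorded at any point in the call.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    essential: tuple[str, ...]  # items that are essential within this topic


@dataclass
class Session:
    gathered: dict[str, str] = field(default_factory=dict)

    def record(self, item: str, value: str) -> None:
        """Store an answer whenever it arrives, regardless of the current topic."""
        self.gathered[item] = value

    def missing(self, topic: Topic) -> list[str]:
        """Items for which the agent would still need to re-prompt the caller."""
        return [item for item in topic.essential if item not in self.gathered]

    def can_advance(self, topic: Topic) -> bool:
        """True once every conditionally essential item has been gathered."""
        return not self.missing(topic)


shipping = Topic("shipping", essential=("address", "credit_card"))
session = Session()
session.record("music_preference", "jazz")  # non-essential, kept for a later topic
session.record("address", "12 Main St")     # provided out of order
```

Because answers are stored independently of the topic being discussed, an answer volunteered early remains available when its topic is reached, and calling `record` again for the same item models an agent correction.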

Agent Console Sessions

Each agent console 32a-c implements a graphical user interface (GUI) for the agent. FIG. 6 is a screen diagram showing, by way of further example, a view of an agent console 80 with a plurality of active caller sessions 81-83. Each session 81-83 appears within the GUI and enables the agent to indirectly interact with a different customer calling through, for instance, the telephony interface 35. The agent can accept new session requests from the message server 31 and create a visual session container for each additional session 81-83.

In one embodiment, up to four sessions can be presented to an agent simultaneously. Preferably, the agent can view the contents of all sessions on a single screen or display. One session is designated as the active session and accepts agent commands, such as an instruction to listen to a transcribed user message, play a synthesized agent message to the caller, or activate a script through a menu 84. The agent can switch between active sessions with a single keystroke or pointer click.
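
A minimal sketch of the session handling just described is given below. The `AgentConsole` class, its method names, and the command strings are hypothetical, and the rule that the first accepted session becomes active is an assumption of the example.

```python
from typing import Optional


class AgentConsole:
    MAX_SESSIONS = 4  # limit taken from the embodiment described above

    def __init__(self) -> None:
        self.sessions: list[str] = []
        self.active: Optional[str] = None

    def accept(self, session_id: str) -> bool:
        """Accept a new session request unless the console is full."""
        if len(self.sessions) >= self.MAX_SESSIONS:
            return False
        self.sessions.append(session_id)
        if self.active is None:
            self.active = session_id  # assumed default: first session is active
        return True

    def switch(self, session_id: str) -> None:
        """Designate another presented session as the active session."""
        if session_id not in self.sessions:
            raise KeyError(session_id)
        self.active = session_id

    def command(self, name: str) -> tuple[str, str]:
        """Route an agent command to the active session only."""
        if self.active is None:
            raise RuntimeError("no active session")
        return (self.active, name)
```

Routing every command through the single active session keeps the agent's input unambiguous even though all sessions remain visible at once.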

Each active caller session 81-83 has a scrolling dialogue in a respective visual session container. FIG. 7 is a screen diagram showing, by way of further example, a portion of a call 90 that includes out-of-order caller responses 91. In the scenario, an elementary school teacher has discovered that her car has a flat tire and is requesting assistance through the automated call center 30. She first provides essential information, including her identity and the nature of her problem, which is a flat tire. Her identity need not be by her name. She could provide her license plate number, telephone number, or any other data, which the agent could use to look up and confirm her identity. Similarly, the nature of her problem need not be selected from a list of finite possibilities. Rather, she can simply state, in plain language, her problem, which is converted from speech to text and displayed on the agent console 32a-c for the agent to read and interpret. As a result, the caller receives a higher level of responsiveness and user friendliness, as the behind-the-scenes agent enables her to speak in a more natural and comfortable manner.

With the essential information having been provided, the agent can then “converse” with the caller and determine that she needs a tow truck (excerpt 92). In response, the agent, through the agent console 32a-c, inquires about the caller's personal safety (excerpt 93), to which the caller explains her location and repeats her problem (excerpt 94). Rather than repeat the question about her personal safety, in free form, the agent, again through the agent console 32a-c, adjusts the line of inquiry to determine her whereabouts (excerpt 95). At this point, the agent can dispatch a tow truck to the teacher on the basis of only knowing her problem and location, deferring obtaining any remaining non-essential information until after the tow truck has been dispatched. Thus, even though the caller is interacting with an automated system, she is able to describe her problem using her own words, rather than having to struggle through countless menus and pre-recorded questions and options. Other forms of unstructured non-sequential and interactive dialogue are possible.

Agent Console

Call handling is primarily controlled by the agent console 32a-c for the agent assigned to handling the call session. FIG. 8 is a block diagram showing a system 100 for processing out-of-order caller responses during automated call processing, in accordance with one embodiment. The system 100 operates in accordance with a sequence of process steps, as described above with reference to FIG. 5.

Throughout the call session, the agent receives a stream of messages 109 and generates a stream of responses 110 through his or her agent console 101. The console 101 includes a script engine 102 that the agent can use to execute stored scripts 106 maintained in a storage device 105, which presents “canned” snippets of interactive dialogue to the caller. However, the scripts 106 are short and are generally single or limited questions intended to ask for particular information, yet which can be presented in any order or sequence as desired by the agent. Additionally, the agent need not use the scripts 106 and can instead choose to manually type dialogue to the caller, which is generated as a spoken response by the message server 31.
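
The two response paths, a stored script or manually typed dialogue, both ending in a spoken response, can be sketched as follows. The script identifiers, the script text, and the `text_to_speech` placeholder are hypothetical; a real system would call an actual speech synthesizer, whereas this placeholder merely returns the text as UTF-8 bytes.

```python
from typing import Optional

# Hypothetical short scripts, each a single question, usable in any order.
SCRIPTS = {
    "ask_location": "Where are you located right now?",
    "ask_safety": "Are you in a safe place?",
}


def text_to_speech(text: str) -> bytes:
    """Placeholder for a synthesizer: returns the text as UTF-8 bytes."""
    return text.encode("utf-8")


def respond(script_id: Optional[str] = None,
            typed_text: Optional[str] = None) -> bytes:
    """Produce a spoken response from a stored script or from typed text."""
    if script_id is not None:
        text = SCRIPTS[script_id]
    elif typed_text:
        text = typed_text
    else:
        raise ValueError("either a script or typed text is required")
    return text_to_speech(text)
```

For example, `respond(script_id="ask_location")` and `respond(typed_text="A tow truck is on the way.")` follow the same synthesis path, so the caller hears a consistent machine-generated voice either way.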

The console 101 also includes an evaluator 103 and selector 104, which respectively identify information from the individual caller responses that is stored as parsed data 107 and scenarios 108 of troubleshooting and diagnosis sequences for the agent's use. The parsed data 107 and scenarios 108 are both stored in the storage device 105. Other console functionality is possible.
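
One way to picture the evaluator and selector is sketched below. The keyword and pattern matching, the scenario names, and the step lists are assumptions chosen for brevity; the disclosure does not specify how the evaluator 103 or selector 104 identify information.

```python
import re

# Hypothetical scenarios 108: troubleshooting sequences keyed by problem.
SCENARIOS = {
    "flat tire": ["confirm safety", "get location", "dispatch tow truck"],
    "dead battery": ["confirm safety", "get location", "dispatch jump start"],
}


def evaluate(response: str) -> dict[str, str]:
    """Evaluator sketch: extract parsed data 107 from one caller response."""
    parsed: dict[str, str] = {}
    phone = re.search(r"\b\d{3}-\d{3}-\d{4}\b", response)
    if phone:
        parsed["phone"] = phone.group(0)
    lowered = response.lower()
    for problem in SCENARIOS:
        if problem in lowered:
            parsed["problem"] = problem
            break
    return parsed


def select(parsed: dict[str, str]) -> list[str]:
    """Selector sketch: pick the scenario that matches the parsed problem."""
    return SCENARIOS.get(parsed.get("problem", ""), [])
```

An empty list from `select` would leave the choice of next step entirely to the agent.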

While the invention has been particularly shown and described as referenced to the embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope of the invention.