Going beyond intent classification to dynamic natural language-to-database query language translation.
In our previous post, we talked about the current state of conversational AI and discussed the limitations of both intent classification and today’s machine learning technologies when it comes to database access. We know that new conversational technologies will only be successful in meeting the need for greater data accessibility if they are equipped to handle a level of complexity and flexibility that today’s systems can’t yet support.
The next step forward is going beyond intent classification, and creating a system that can learn to understand meaning, not just intent, in entire human phrases.
Machine learning models that are trained not only to understand individual words, but that are also capable of reasoning through the relationships between those words to decipher the meaning of an entire natural language query, can make using a conversational system an even more human experience.
This means the computer begins to recognize that “Who owes me?” is synonymous with other questions like “Who’s in debt to me?” and that these statements mean the querent is asking for information about outstanding invoices (as opposed to who hasn’t e-transferred their share of the dinner bill, which might be an equally relevant question in a different context).
Typically, database query language statements are used to search for specific information in a database when users are looking for answers to their questions.
Teaching a machine to fully understand meaning and decipher context in natural language isn’t easy. Computers much prefer structured data, like programming language (code), to natural language, which is completely unstructured data: it doesn’t fit nicely into boxes and its rules are somewhat arbitrary.
In this article, we’ll discuss why it’s important to build conversational AI for database access and what ultimately makes a natural language-to-database query language translation system successful.
Translating Natural Language to Database Query Language
To a computer, the meaning behind natural language is irrelevant: it understands structured, precise commands and there’s no room for nuance or ambiguity. Computers run on rules, but the rules of human language are hard for even humans to fully understand or consistently abide by. What computers do understand is how to execute algorithms and code. They can execute actions based on the instructions given in this particular, structured form. To get a computer system to do what you want it to do, you need to speak its language.
Because computers can’t interpret natural language the way humans can, conversational querying requires an intermediary step between the computer receiving the human phrase and the action that follows from that request or command. Put simply, human language needs to be dynamically translated into the computer’s language.
When it comes to finding information in databases, specifically relational databases, computer systems are already equipped with “languages” that enable them to carry out the process of searching through data and returning what the user wants to know.
A relational database can be searched and managed using a database query language like Structured Query Language (SQL). SQL statements are written and run to identify and retrieve data from the tables, columns, and rows in a database, so that a user doesn’t have to scroll through massive volumes of information in a single spreadsheet to find out, to return to our previous example, who owes them money.
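To make this concrete, here is a minimal sketch of the kind of SQL statement that could answer “Who owes me?”. The `invoices` table, its columns, and the sample rows are assumptions made up for illustration; a real accounting database would look different.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (customer TEXT, amount REAL, paid INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        ("Acme", 1200.0, 0),
        ("Birch", 450.0, 1),
        ("Acme", 300.0, 0),
        ("Cedar", 80.0, 0),
    ],
)

# One possible SQL reading of "Who owes me?": total unpaid amount per customer.
WHO_OWES_ME = """
    SELECT customer, SUM(amount) AS balance
    FROM invoices
    WHERE paid = 0
    GROUP BY customer
    ORDER BY balance DESC
"""

who_owes_me = conn.execute(WHO_OWES_ME).fetchall()
print(who_owes_me)  # [('Acme', 1500.0), ('Cedar', 80.0)]
```

Even for a simple question like this one, the person asking would need to know the table layout and the query syntax.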
Even though database query languages exist, humans are required to learn how these languages work in order to access data. Writing queries in a language like SQL is a specialized skill that the average employee doesn’t necessarily possess. This means access to data is often restricted within organizations, and bottlenecks develop as data demands grow well beyond the capacity of those employed to cater to them.
To bridge this skill gap and democratize the information contained in databases, we can make data accessible to more people by employing AI that translates natural language to database query language.
AutoQL dynamically translates natural language questions into database query language statements in real time.
This is where the need for improved Natural Language Understanding (NLU) and advanced machine learning comes in: once the machine can understand the meaning (not just the intent) of a user’s phrase, it should be able to dynamically create a SQL statement that reflects exactly what the user is requesting from the database.
With this kind of AI in place, any user can simply enter a natural language query–or ask a question in their own words–to receive information from their database.
This makes database exploration intuitive and accessible rather than exclusive or restricted to developers or other team members who have been trained in writing database query language. It’s a major step towards the true democratization of data, making it possible for anyone, regardless of skillset, to search and analyze an aggregation of data via natural language queries and commands.
Understanding NL, Understanding SQL
To be effective in meeting the needs of users, a conversational AI system for database access needs to deliver the outcomes that the user is expecting, every time.
Intent classification uses keywords in a user’s query to classify specific intents and then matches these intents to pre-defined SQL statements, so the user has to use the very specific words that the system knows and has been trained to respond to. Even then, the machine may or may not run the SQL statement that most accurately reflects the user’s actual intent, since the system is designed to have only a single SQL option associated with each intent. This can result in a negative user experience, one that is likely to breed mistrust of the system.
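The limitation is easy to see in a toy version of keyword-based intent classification. The sketch below is only an illustration: the intent name, the keyword set, and the SQL string are all hypothetical, and production classifiers are more sophisticated than a set lookup.

```python
from typing import Optional

# Hypothetical intent definitions: each intent has trigger keywords
# and exactly one pre-defined SQL statement.
INTENT_KEYWORDS = {
    "outstanding_invoices": {"owes", "outstanding", "unpaid"},
}
INTENT_SQL = {
    "outstanding_invoices": (
        "SELECT customer, SUM(amount) FROM invoices "
        "WHERE paid = 0 GROUP BY customer"
    ),
}


def classify_intent(question: str) -> Optional[str]:
    """Return the first intent whose keywords appear in the question."""
    words = set(question.lower().replace("?", "").split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


def sql_for(question: str) -> Optional[str]:
    """Look up the single canned SQL statement for the detected intent."""
    intent = classify_intent(question)
    return INTENT_SQL[intent] if intent is not None else None


print(sql_for("Who owes me?"))          # matches the keyword "owes"
print(sql_for("Who's in debt to me?"))  # None: no known keyword present
```

The second question means the same thing as the first, but because it contains none of the expected keywords, this toy system has nothing to run.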
Translating NL to database query language allows for the same kind of flexibility and adaptability to nuance that humans expect from one another when asking for information. There are many ways to ask “Who owes me?”, but in the context in which the question is asked, there’s probably only one very specific desired output.
AutoQL takes into account the context of each query, understanding that in this particular example the user is looking for information about their accounts receivable (AR).
To return an accurate answer, the AI needs to understand how to reconcile all the variables of the natural language query with the limited, but still extensive, variables of database query language. When a user asks: “Who owes me?” the system understands what the entire queried phrase means both in whole and in part, and dynamically generates a corresponding SQL statement based on that understanding. This means multiple unique SQL statements could be generated from natural language queries that vary only slightly in how they are asked by the user.
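As a rough illustration of how small differences in a question can lead to different SQL, the sketch below builds a statement from a structured reading of the question. Everything here is an assumption for illustration: the `QueryMeaning` fields, the `invoices` schema, and the `build_sql` helper are hypothetical, and the hard part (getting from free text to that structured reading) is the job of the learned model and is not shown. This is not a description of how AutoQL works internally.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class QueryMeaning:
    """Hypothetical structured reading of a question about receivables."""
    min_balance: Optional[float] = None  # e.g. "... more than 500?"
    top_n: Optional[int] = None          # e.g. "top 3 ..."


def build_sql(meaning: QueryMeaning) -> Tuple[str, List[object]]:
    """Assemble a parameterized SQL statement from the parsed meaning."""
    sql = (
        "SELECT customer, SUM(amount) AS balance "
        "FROM invoices WHERE paid = 0 GROUP BY customer"
    )
    params: List[object] = []
    if meaning.min_balance is not None:
        sql += " HAVING SUM(amount) > ?"
        params.append(meaning.min_balance)
    sql += " ORDER BY balance DESC"
    if meaning.top_n is not None:
        sql += " LIMIT ?"
        params.append(meaning.top_n)
    return sql, params


# "Who owes me?"
print(build_sql(QueryMeaning()))
# "Who owes me more than 500?"
print(build_sql(QueryMeaning(min_balance=500)))
# "Who are the top 3 customers that owe me?"
print(build_sql(QueryMeaning(top_n=3)))
```

Three closely related questions yield three distinct statements, which is the behavior a single canned query per intent cannot provide.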
Much like a Paris city guide fluent in English and French, the machine can interpret natural language and translate that statement to an equivalent statement in database query language to return a meaningful result.
Importantly, the way the user asked for that information might be unique and be framed in a way that the translator hasn’t experienced before. But this is irrelevant because the system is designed for this purpose exactly: to decipher meaning by reasoning through the natural language input, dynamically generating a relevant SQL statement, and surfacing or returning the desired result.
A system that understands both human language and database query language acts like an intelligent interpreter so that no matter how a human chooses to ask for the information they want to access in their data, the computer can dynamically generate a SQL statement that will retrieve the relevant information that user meant to ask for.
AutoQL Provides the Best Conversational AI for Database Access
As a research-forward company, Chata’s methods for providing the best AI solutions for database access are constantly evolving. We created AutoQL (our play on “automatic query language”) to facilitate the next-generation data access functionality that today’s users demand.
AutoQL is built to facilitate unprecedented user experiences for people who increasingly require seamless access to data to make impactful decisions that benefit the businesses they work for.
The output of every natural language query is returned data that users can rely on for further reporting and analysis. A great way to employ the power of conversational AI is to make it accessible through a robust conversational user interface. We’ve built embeddable frontend components like the chat window-inspired Data Messenger for instant data on demand, and BI-grade Dashboards designed for total flexibility and rapid deployment, enabling users to set up comprehensive overviews of the metrics that matter most, simply by using natural language.
Thanks to our cutting-edge NLP and NLU technologies, automated training data generation techniques, and proprietary machine-learning models that enable dynamic NL to SQL translation, AutoQL is designed to enable conversational data accessibility for even the most complex of enterprise-grade databases.
The technology we’re building is our part to play in transforming the digital landscape of today and working towards a future where humans can interact with computers–and more specifically, their data–as intuitively and seamlessly as they interact with each other.