June 5, 2023 | Blog | 3 minutes

Breaking Boundaries: Language Models Revolutionize Structured Data Analysis

Deepak Manapure

Solution Architect

Language models have been leading the way in advancing natural language processing, allowing for the comprehension and generation of text that closely resembles human language. However, recent progress has broadened their ability to also handle structured data. In this blog post, we will delve into the ways in which language models can be utilized to process and analyse structured data, presenting intriguing opportunities for various practical applications.

Structured data encompasses organized information presented in a predetermined format, such as spreadsheets, databases, or tables. It consists of distinct fields, records, and connections between various entities. In contrast to unstructured data, which comprises free-form text, structured data possesses a predefined schema, enabling straightforward interpretation and analysis using conventional approaches. Applying a language model to structured data necessitates comprehending both the data itself and its underlying schema.
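
To make that last point concrete, here is a minimal Python sketch of how a table's schema and rows might be rendered as plain text that a language model can read. The `employees` table, its columns and values, and the helper names `describe_schema` and `rows_as_text` are all hypothetical, invented for illustration. They are not taken from the demonstration described in this post.

```python
import sqlite3


def describe_schema(conn, table):
    """Return a one-line text description of a table's columns and types."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so it must come from trusted code.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    parts = [f"{name} ({col_type})" for _, name, col_type, *_ in columns]
    return f"Table {table}: " + ", ".join(parts)


def rows_as_text(conn, table):
    """Render every row of a table as pipe-separated text with a header line."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    header = [column[0] for column in cursor.description]
    lines = [" | ".join(header)]
    for row in cursor:
        lines.append(" | ".join(str(value) for value in row))
    return "\n".join(lines)


# Hypothetical sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Finance", 72000.0), (2, "Ravi", "Engineering", 85000.0)],
)

schema_text = describe_schema(conn, "employees")
table_text = rows_as_text(conn, "employees")
print(schema_text)
print(table_text)
```

Passing the schema description alongside the rows gives the model the column names and types it needs to interpret each value, which free-form text does not provide.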

A long-standing challenge in the Data & AI field has been giving business users information from structured data in a form they can readily understand. The first hurdle is structuring the data according to a business domain schema, the primary step in turning data into valuable insights. Relationships and granularity must then be defined so that the domain models can accommodate every anticipated query. Historically, this process has restricted business users to the questions the underlying business model was built to answer, rather than letting them query the data according to their needs.

The greatest advantage of Large Language Models (LLMs) for business users is the freedom to phrase queries in their own words, without being limited to predefined query patterns. To showcase this potential, we have developed a demonstration that uses a basic table. We expect LLMs to be applicable beyond this example, in areas such as data validation and quality assessment, where they could also show business users how well the data adheres to validation rules and so support better decisions.

A relational database can be made available to an LLM through the following steps:

  1. Identify the pertinent data tables: Determine the tables within the existing relational database that hold the data required for LLM analysis.
  2. Extract the data: Retrieve the data from the identified tables by executing SQL queries.
  3. Perform LLM analysis: Use a supported OpenAI library to examine the data retrieved from the selected tables, identifying patterns and relationships.
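
The three steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the code behind our demonstration: the `employees` table, its data, and the helpers `extract_rows`, `build_prompt`, and `answer_question` are hypothetical. The model call is passed in as a plain callable named `ask_llm`, so the sketch does not assume any particular client library or API signature. In practice that callable would wrap whichever LLM client you use.

```python
import sqlite3


def extract_rows(conn, query):
    """Step 2: run a SQL query and return the column names and the rows."""
    cursor = conn.execute(query)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def build_prompt(header, rows, question):
    """Serialize the extracted rows as comma-separated text and append the question."""
    lines = [", ".join(header)]
    lines += [", ".join(str(value) for value in row) for row in rows]
    table = "\n".join(lines)
    return (
        "Answer the question using only the table below.\n\n"
        f"{table}\n\nQuestion: {question}"
    )


def answer_question(conn, query, question, ask_llm):
    """Step 3: hand the serialized data and the question to the model."""
    header, rows = extract_rows(conn, query)
    return ask_llm(build_prompt(header, rows, question))


# Step 1: the relevant table, here a hypothetical one in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "Finance", 72000.0), ("Ravi", "Engineering", 85000.0)],
)

QUERY = "SELECT name, department, salary FROM employees"
question = "Who earns the most?"


def stub_llm(prompt):
    """Stand-in for a real model call (assumption); returns a fixed reply."""
    return "stub answer"


prompt = build_prompt(*extract_rows(conn, QUERY), question)
reply = answer_question(conn, QUERY, question, stub_llm)
print(prompt)
print(reply)
```

Sending whole rows in the prompt only suits small tables. For larger ones, the SQL query in step 2 would need to filter or aggregate the data first.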

As part of our ongoing innovation work, we have developed a demonstration showing how effectively LLMs handle structured data. It lets business users interact with structured data directly, free of the limitations of business models designed in advance around expected query patterns. Users can explore the data and draw on its potential without being bound by those predefined constraints.

Given below is a structured table containing comprehensive employee information.

Here are a few examples of responses generated by LLMs:

In conclusion, LLMs provide a means for natural interaction with structured data. Rather than relying on conventional query languages, we can engage with data directly through conversation. This simplifies interaction and promotes data democratization. LLMs can also help identify patterns and anomalies in data, supporting exploratory data analysis. Ultimately, LLMs make data more accessible and interactive, bridging the gap between users and the wealth of information the data contains.