With data growing exponentially, business professionals use search technologies to find the information they need. However, conventional search is inadequate for finding the structured data in enterprises' transactional systems and data warehouses. Traditional business intelligence (BI) and reporting are the primary methods for locating this data, but their complexity normally requires IT assistance. The next generation of search will enable business users to search and discover the mission-critical structured data they need easily and immediately, without burdening IT.
Structured data discovery is the next wave in the search revolution. It lets business professionals draw on enterprises' ever-expanding data assets to quickly reach the data they need to answer questions, make decisions and solve problems, changing the competitive dynamic through speed and simplicity.
How Structured Data Discovery Differs
It is helpful to briefly review how unstructured data search works. A search engine uses one or more keywords to match items and displays the sorted results in a search results page. From there, users navigate via a link to another page displaying details about the returned data. To perform faster matching, the search engine collects metadata beforehand using a process called indexing. Search engines store the indexed information, but not the full content of each item. The full content remains at the data source, typically a document repository or Web site.
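The indexing step described above can be sketched as a small inverted index. This is a minimal illustration under stated assumptions, not how any particular search engine is built: the `DOCUMENTS` store, the document IDs and the `build_index` and `search` helpers are all hypothetical, and the index keeps only words and document IDs while the full text stays at the source.

```python
from collections import defaultdict

# Hypothetical document store; in practice the full content stays at the data source.
DOCUMENTS = {
    "doc1": "Quarterly shipment report for the western region",
    "doc2": "Customer returns policy and shipment guidelines",
    "doc3": "Western region customer survey",
}

def build_index(documents):
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the sorted IDs of documents that contain every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return []
    # .get() avoids creating empty entries in the defaultdict for unknown words.
    matches = set.intersection(*(index.get(word, set()) for word in keywords))
    return sorted(matches)

index = build_index(DOCUMENTS)
print(search(index, "shipment"))
print(search(index, "western customer"))
```

Matching happens against the index alone; a results page would then link each returned ID back to the item at its source.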
Structured and unstructured data search each present distinct challenges in terms of what is being searched, why search is being used and how search needs to work to solve the business problem at hand. Figure 1 summarizes the major differences between the two approaches.
Structured data search, also known as data discovery, is optimized for discovering data that is typically stored in databases, and it complements BI applications such as reporting and ad hoc query. After the user enters keywords, the search results page displays both the physical and the virtual database entities where those keywords reside.
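As a rough sketch of what "displaying the entities where a keyword resides" could involve, the example below scans every column of every table and view in an in-memory SQLite database and reports where a keyword occurs. The schema, the sample rows and the `find_keyword` helper are assumptions made for illustration; a real data discovery product would rely on prebuilt indexes rather than scanning each column at query time.

```python
import sqlite3

def find_keyword(conn, keyword):
    """Return (entity, column, match_count) for every column containing the keyword."""
    hits = []
    # Tables are the physical entities; views stand in for the virtual ones.
    entities = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view') ORDER BY name")]
    for entity in entities:
        # Column 1 of each PRAGMA table_info row is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{entity}")')]
        for column in columns:
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{entity}" WHERE CAST("{column}" AS TEXT) LIKE ?',
                (f"%{keyword}%",),
            ).fetchone()[0]
            if count:
                hits.append((entity, column, count))
    return hits

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, city TEXT);
    INSERT INTO customers VALUES ('Acme Corp', 'Springfield');
    INSERT INTO customers VALUES ('Globex', 'Shelbyville');
    CREATE TABLE invoices (invoice_id INTEGER, customer_name TEXT, note TEXT);
    INSERT INTO invoices VALUES (1, 'Acme Corp', 'Paid in full');
    INSERT INTO invoices VALUES (2, 'Globex', 'Referred by Acme');
    INSERT INTO invoices VALUES (3, 'Acme Corp', NULL);
""")

for hit in find_keyword(conn, "acme"):
    print(hit)
```

Each hit names the entity and column along with the match count, which is the kind of context a structured results page needs in addition to the matching value itself.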
Metadata about the keyword and the relationships between keywords are as important as the keyword itself. For example, what does the number 2317 mean? Metadata in the form of a column name helps distinguish among 2317 B Street as a ship-to address, 2,317 as the number of units shipped, 23:17 as the shipment transaction time and 2317 as a shipped item's ID number. Like BI, which shows the relationships between disparate data elements (for example, customers, invoices, shipments and returns), data discovery also shows data relationships. However, data discovery avoids manual modeling. Instead, it automates the discovery of both the explicit relationships defined by the existing schema and the implicit relationships found in the data values themselves. The latter capability is enabled by advanced, probabilistic correlation techniques.
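The article does not specify which correlation techniques are used, so the sketch below substitutes a deliberately simple stand-in: a value-containment score that flags a column as a likely reference to another column when nearly all of its distinct values also appear there. The `COLUMNS` samples, the helper names and the 0.9 threshold are illustrative assumptions.

```python
from itertools import permutations

# Hypothetical column samples keyed by (table, column).
COLUMNS = {
    ("customers", "customer_id"): [101, 102, 103, 104],
    ("invoices", "cust_ref"): [101, 101, 103, 104, 104],
    ("shipments", "units"): [40, 2317, 12],
}

def containment(child_values, parent_values):
    """Fraction of distinct child values that also appear in the parent column."""
    child, parent = set(child_values), set(parent_values)
    return len(child & parent) / len(child) if child else 0.0

def candidate_relationships(columns, threshold=0.9):
    """Return (child, parent, score) for cross-table column pairs meeting the threshold."""
    found = []
    for (child_key, child_vals), (parent_key, parent_vals) in permutations(columns.items(), 2):
        if child_key[0] == parent_key[0]:
            continue  # only relate columns that live in different tables
        score = containment(child_vals, parent_vals)
        if score >= threshold:
            found.append((child_key, parent_key, score))
    return found

for child, parent, score in candidate_relationships(COLUMNS):
    print(child, "->", parent, round(score, 2))
```

With this sample data, `invoices.cust_ref` is flagged as referencing `customers.customer_id` even though no schema declares the link. A production approach would also have to weigh data types, value distributions and coincidental overlaps, which a plain containment score ignores.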
Data is typically accessed through the ODBC and JDBC standards, with high-performance federated query algorithms optimizing the required SELECT statements. When viewing the data, users want to see data rows along with appropriate column headers. As such, structured search requires a spreadsheet-style workspace where users can easily add, move and delete rows and columns as they work with their data.
Security for structured data search leverages existing source-, column- and row-level security rules, whether they are defined in corporate directories such as LDAP and Active Directory or as role-based rules in packaged applications.
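To make column- and row-level rules concrete, here is a minimal sketch that filters search results by role before they are shown. The `RULES` table, the role names and the `apply_security` helper are hypothetical; in the setting described above, the rules would be read from a directory service or a packaged application rather than hard-coded.

```python
# Hypothetical rules; real ones would come from LDAP, Active Directory or application roles.
RULES = {
    "sales_rep": {
        "columns": {"customer", "region", "units"},
        "row_filter": lambda row: row["region"] == "West",
    },
    "finance": {
        "columns": {"customer", "region", "units", "revenue"},
        "row_filter": lambda row: True,
    },
}

def apply_security(rows, role):
    """Keep only the rows and columns that the given role is allowed to see."""
    rule = RULES.get(role)
    if rule is None:
        return []  # unknown roles see nothing
    return [
        {col: val for col, val in row.items() if col in rule["columns"]}
        for row in rows
        if rule["row_filter"](row)
    ]

# Hypothetical search results.
ROWS = [
    {"customer": "Acme Corp", "region": "West", "units": 40, "revenue": 1200.0},
    {"customer": "Globex", "region": "East", "units": 12, "revenue": 300.0},
]

print(apply_security(ROWS, "sales_rep"))
print(apply_security(ROWS, "finance"))
```

Denying access by default for unknown roles is a design choice in this sketch, so that a missing rule never exposes data.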
Complementing Traditional BI and Reporting
In its original form, search was applied to BI as a way to complement hierarchical folder navigation or parameter-driven reporting methods. This assumed the data was available in an existing report.









