Managing data as an asset isn’t natural or obvious for most product and service companies, especially when it comes to historical and archived data stores. But thanks to big data analytics, organizations that treat their data as an asset can find significant new value in the information they retain, improve competitiveness and decision making, and change the economics — the cost vs. benefit — of information production, collection and retention.
Media and entertainment companies are accustomed to thinking about digital media (their “data”) as assets, and they routinely consider the full lifecycle of those assets, from creation through archiving and preservation. Disney, for example, takes advantage of the “long tail” value of its movies and television shows to produce revenue well into the future through re-release, on-demand streaming, repurposing clips, and producing new assets, not to mention T-shirts, lunch boxes, stuffed animals and other toys and tchotchkes.
Today, thanks to the ability of big data analytics solutions to rapidly ingest and analyze huge volumes of data, we can see parallels in other industries. In oil and gas, for example, new drilling and extraction techniques make it possible to take advantage of sites that were once considered unprofitable. As a result, companies are using big data analytics to mine decades-old studies in order to find these potential locations without the need for new and expensive research. Similarly, pharmaceutical companies are aggregating and analyzing past drug research to find previously undetected connections or interactions that could directly impact current product research and development.
Other industries should also evaluate the entire lifecycle of their information to determine if a high-value long tail exists. For example, manufacturers often keep all technical documents and product specifications long after a product has stopped being used by customers. By aggregating and analyzing this information, manufacturers may discover a wealth of insight related to customer service, safety issues, lawsuits and product recalls.
Retailers, too, may benefit from the long tail if they go back and analyze years of seasonal and holiday selling patterns, the relationship between regional sales patterns and changing demographics, the connection between product success and supply chain issues and much more. And healthcare organizations can aggregate and analyze patient-related doctor’s notes, insurance correspondence and medical imagery to create historical profiles that augment traditional diagnostic work.
The possibilities are truly limitless.
The questions you need to ask
No matter the industry, actually treating information as an asset, especially the long tail, requires the ability to examine all data stores to identify what is there and how valuable it is. After all, not all information is valuable and not all information should automatically be included in analytics projects. In fact, including dated and irrelevant information in analytics can distort the results and undercut any potential value in business decision making.
To determine the value of the information you have, you need to ask the right questions. For example, when it comes to the information you are currently creating, you may ask questions such as:
- What information that is being created is critical to ongoing business functions?
- What groups and individuals are creating this information, what applications are used to generate it and where is this information stored and in what format?
- What groups and individuals are consuming the information and for what purposes, and how long is the information of value to the various stakeholders?
- What information contains personally identifiable or otherwise sensitive or regulated data?
By contrast, for historical and archived data — the dark data so many companies have no insight into — you need a different strategy. Many organizations I’ve talked to have taken a similar approach, starting at the aggregate level:
- What data stores exist anywhere in the organization?
- What types of files are in these data stores — how many documents, spreadsheets, PDFs, etc.?
- Which of these files contain customer, product or project IDs, and which contain personally identifiable information about customers or employees?
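The second question above, how many files of each type sit in a data store, can be roughed out with a short script. This is a minimal sketch, assuming the data store is reachable as an ordinary directory tree; the function names and the `/mnt/archive` path in the usage note are hypothetical, and a real inventory tool would also need to handle databases, mail stores and other non-file sources.

```python
import os
from collections import Counter
from pathlib import PurePath


def iter_files(root):
    """Yield the path of every file under root (assumed to be a mounted data store)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def count_by_extension(paths):
    """Count files per lower-cased extension; files without one are grouped as '(none)'."""
    counts = Counter()
    for path in paths:
        suffix = PurePath(path).suffix.lower()
        counts[suffix or "(none)"] += 1
    return counts
```

Calling `count_by_extension(iter_files("/mnt/archive")).most_common(10)` would list the ten most common file types in that (hypothetical) store, which is usually enough to decide where to look more closely.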
As you work through this process, you’ll undoubtedly discover more questions to ask, and you may find yourself in an iterative process. One company I know decided to pattern match on the 3-2-4 digit format of Social Security numbers (three digits, two digits, then four digits) and discovered a mountain of files that contained Social Security numbers but were in no way protected as sensitive data!
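A pattern match of this kind might look like the following. This is an illustrative sketch, not the company's actual method: it assumes hyphen-separated numbers in plain-text files, the names `SSN_LIKE`, `find_ssn_like` and `files_with_ssn_like` are made up for the example, and the pattern checks only the format, so it will flag some strings that are not real Social Security numbers. PDFs, spreadsheets and other binary formats would need text extraction first.

```python
import re

# Three digits, two digits, four digits, separated by hyphens.
# The lookarounds reject matches embedded in longer digit/hyphen runs.
SSN_LIKE = re.compile(r"(?<![\d-])\d{3}-\d{2}-\d{4}(?![\d-])")


def find_ssn_like(text):
    """Return every substring of text that has the 3-2-4 format."""
    return SSN_LIKE.findall(text)


def files_with_ssn_like(paths):
    """Return the paths whose text content contains at least one 3-2-4 match."""
    flagged = []
    for path in paths:
        try:
            with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                if SSN_LIKE.search(handle.read()):
                    flagged.append(path)
        except OSError:
            # Unreadable files are skipped here; a real scan should log them.
            continue
    return flagged
```

Any file this flags is a candidate for review rather than a confirmed finding, which fits the iterative approach: the first pass tells you where to ask the next question.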
Taking an iterative approach — discovering what you need to do as you do it — is OK. You just need a very methodical process, and that’s where information governance (IG) comes in.
Information governance and the value of information
An IG program puts in place the people, processes and technology necessary to efficiently locate, identify and assess the value of all data stores. To learn more about IG, you may want to consult the Information Governance Reference Model (IGRM), which provides a framework for linking information duties and value to the data assets that IT manages. To help implement the IGRM, organizations like the CGOC (Compliance, Governance and Oversight Council) offer resources, such as the Information Lifecycle Governance Leader Reference Guide, which lays out the strategies and foundational elements of such a program.
Finally, IG solution vendors offer retention policy and schedule management tools that centralize and maintain an information retention schedule that applies to all information — paper and electronic, records and non-records, structured and unstructured. These solutions can help organizations conduct a sweeping information inventory of what specific information exists, how it is described, where it is kept, who manages it and how long the information has value to the business.
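To make the idea of an information inventory concrete, here is a rough sketch of what a single entry might record and how a retention period could be checked. It is not modeled on any vendor's product: the `InventoryEntry` fields, the whole-year retention period and the `legal_hold` flag are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InventoryEntry:
    description: str        # how the information is described
    location: str           # where it is kept
    owner: str              # who manages it
    created: date           # when the retention clock starts (assumed)
    retention_years: int    # how long it has value to the business
    legal_hold: bool = False  # assumed flag: a hold blocks disposal


def retention_ends(entry):
    """Return the date on which the entry's retention period ends."""
    target_year = entry.created.year + entry.retention_years
    try:
        return entry.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return entry.created.replace(year=target_year, day=28)


def eligible_for_disposal(entry, today):
    """True when the retention period has ended and no legal hold applies."""
    return not entry.legal_hold and today >= retention_ends(entry)
```

In practice the retention period would come from the organization's retention schedule, and the decision to dispose would go through legal and records review instead of a single boolean check.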
Managing data as an asset isn’t yet common in most industries, but those companies that begin the journey will likely find a treasure trove of valuable information that can be used to increase competitiveness, improve decision making and help ensure regulatory and legal compliance. They may even be able to shrink their storage footprint and significantly reduce IT and e-discovery costs by defensibly disposing of information that has no legal, regulatory or business value.
Derek Gascon is executive director of CGOC and program lead for ILG Thought Leadership at IBM. A 20-year veteran of the IT industry, he has led market and product strategies for content management/archiving software and solutions. He has also held executive positions in strategy, marketing and solutions across related technologies including index/search, ECM, MAM, archive and storage with companies such as Dell, Hitachi Data Systems, Convera and StorageTek.