I know that’s a mouthful – “Governed Enterprise Big Data & Analytics Environment”, so let’s just refer to it as a Governed Data Environment.
In recent years, banks have faced a significant number of data- and analytics-intensive regulatory, compliance and competitive demands. Meeting them requires collecting and storing large amounts of detailed, near-real-time and historical (10+ years), structured and unstructured data from both internal bank systems and external sources.
There is also a need for higher data quality, integrity and auditability in analytics, which comes from understanding how data created in the various line-of-business systems makes its way into final reports in an accurate, complete and timely manner. The best way to meet these requirements is through a Governed Data Environment.
For instance, Basel III (now IV), CECL, CCAR, DFAST, the Liquidity Coverage Ratio (LCR) and BSA/KYC/AML are all significant enterprise-wide, data- and analytics-intensive bank initiatives. All require a broader set of historical enterprise data than traditional bank reporting. Stress testing initiatives, such as CCAR and DFAST, require a broader set of attributes about loans and counterparties than traditional loan accounting and reporting do. The new CECL (Current Expected Credit Loss) accounting standard will also require a significant amount of data and analytical modeling.
Capital and liquidity calculations for a bank require significant analytics and data capabilities. Daily reporting of liquidity and capital adequacy requires that data be collected from source systems on a near-real-time or at least daily basis. KYC/AML has real-time or near-real-time collection and analytical requirements for payment, deposit and counterparty data. Internal bank analytics and reporting, such as Portfolio Management and Economic Capital initiatives, also require many more loan attributes and much more counterparty and external data than traditional reporting does.
Traditional data warehouses tend to be best at supporting current (last 1-3 years) reporting and business intelligence requirements, which are important to many bank users, particularly in finance and some areas of risk reporting. These users' data needs tend to be focused on the current year, summarized, finite in nature, repetitive and dimensional. Control of changes in the environment (e.g., for SOX) is important to them, and traditional data warehousing environments serve them well. Compared with Big Data environments, traditional data warehouses tend to hold smaller amounts of data, cost more to maintain, and carry fewer and more summarized data elements (e.g., month-end balances).
Analytical users tend to want access to raw, flattened source system data rather than the dimensional, summarized data in a traditional data warehouse environment. It is not uncommon for a stress testing project to require 10+ years of loan portfolio history plus external data. Analytical users also want large numbers of data attributes about a lending instrument, the ability to link to external data sources (e.g., credit reporting agencies) and the flexibility to change on the fly. All users want high-quality, governed data with well-populated data dictionaries. Given these different requirements, it is not unusual for reporting and analytical users to get into heated conflicts about data environments and what will best meet their needs. Big Data environments are better suited to the needs of the analytical user: they can flexibly and cost-effectively manage large amounts of structured and unstructured data, support analytical modeling, link to external data sources, and support the rapid prototyping that data scientists require.
Big Data environments are also quickly closing the gap on the needs of reporting users, adding dimensional reporting capabilities and a higher level of data governance controls. However, depending on a bank’s requirements, having both Big Data and traditional data warehouse environments as part of the bank’s overall data architecture fabric can be a logical decision.
Data Governance and SOX-level controls are becoming more important to analytical initiatives. Basel III, BCBS 239, and Part 504 (the New York State Department of Financial Services' Transaction Monitoring and Filtering Program requirements covering BSA/AML) are regulatory requirements with key implications for the governance and integrity of data used in bank reporting and analytics.
These regulations require a higher degree of data governance, internal control, and verification of data integrity. However, many banking Big Data environments were implemented quickly and in a somewhat ad hoc manner to meet the urgent needs of stress testing and other regulatory analytical requirements. They were not architected as enterprise-level stores of data, as part of an overall enterprise data strategy, or with the same level of controls (SOX) that would be present in traditional banking transactional and finance systems.
But given the strategic importance of these big data environments in the analytics and reporting which are key to understanding a bank’s solvency, capital adequacy, liquidity and compliance, having strong Data Governance, process & policy controls, ownership and an enterprise data strategy for these environments has become essential in the eyes of the regulators, investors and sophisticated customers.
Anyone can put numbers on a piece of paper and present them as fact to the bank’s Board and regulators. It is quite another thing to be able to prove how those numbers were arrived at and to show the lineage of how the data was aggregated from a variety of internal and external sources to create the analysis. When a bank can achieve this level of control and data transparency, it creates a new level of soundness for the institution in the eyes of not only its regulators but also its increasingly data-savvy investors and customers.
In fact, a lot of attention has been given to Model Risk Management and rightly so, given the importance of quantitative modeling to banks. But I would argue that one of the top risks to banking analytical models is data integrity risk and this is not so well understood or measured, even today.
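To make the idea of proving how a number was arrived at more concrete, here is a minimal Python sketch of lineage tracking. It is only an illustration under stated assumptions: the `LineageLog` class, the step names and the sample loan records are all hypothetical, and a real bank environment would rely on a dedicated metadata or lineage tool rather than hand-written code. The sketch fingerprints each dataset with a hash and records which inputs produced which output, so that a reported figure can be traced back through the steps that built it.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(rows):
    """Return a stable SHA-256 digest for a list of dict records."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class LineageStep:
    name: str     # hypothetical step name, e.g. "join_loans_ratings"
    inputs: list  # fingerprints of the datasets consumed
    output: str   # fingerprint of the dataset produced


@dataclass
class LineageLog:
    steps: list = field(default_factory=list)

    def record(self, name, input_sets, output_rows):
        """Log one transformation and hand the output back unchanged."""
        step = LineageStep(
            name,
            [fingerprint(rows) for rows in input_sets],
            fingerprint(output_rows),
        )
        self.steps.append(step)
        return output_rows

    def trace(self, output_fp):
        """Walk backwards from an output fingerprint to the steps behind it."""
        by_output = {s.output: s for s in self.steps}
        path, frontier, seen = [], [output_fp], set()
        while frontier:
            fp = frontier.pop()
            if fp in seen:
                continue
            seen.add(fp)
            step = by_output.get(fp)
            if step is not None:
                path.append(step.name)
                frontier.extend(step.inputs)
        return path


# Illustrative data only: two tiny "source system" extracts.
loans = [{"loan_id": "L1", "balance": 100.0}, {"loan_id": "L2", "balance": 250.0}]
ratings = [{"loan_id": "L1", "rating": "A"}, {"loan_id": "L2", "rating": "B"}]

log = LineageLog()
rating_by_id = {r["loan_id"]: r["rating"] for r in ratings}
joined = log.record(
    "join_loans_ratings",
    [loans, ratings],
    [{**loan, "rating": rating_by_id[loan["loan_id"]]} for loan in loans],
)
report = log.record(
    "total_balance",
    [joined],
    [{"total_balance": sum(row["balance"] for row in joined)}],
)

lineage_path = log.trace(fingerprint(report))
```

Here `lineage_path` lists the steps that contributed to the reported total, most recent first. Because inputs are identified by content hash, a changed source extract produces a different fingerprint, which is one simple way a data integrity break could be made visible.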
CECL will drive a new level of internal controls into banking Big Data environments. When the accounting standard (ASU 2016-13) was issued, those of us who have worked closely with bank data and analytics environments realized how much data, internal control and analytics work lay ahead for the banks.
Banks will need to aggregate large amounts of data in a SOX-controlled environment to meet the CECL lifetime loan valuation analytics requirements. This will require significant time and investment on the part of many banks. Some banks will be able to leverage the data in existing stress testing environments, such as those used for DFAST or CCAR. In most cases, however, the level of controls in these environments is not SOX compliant, which is required for financial reporting.
There will be a significant amount of work required not only to bring the data together, but also to put in place the controls and Data Governance needed to make these environments ready to support financial reporting. The other issue is that data in these environments will need to be updated more frequently to support periodic financial reporting rather than an annual stress test.
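As one illustration of the kind of control this implies, the following Python sketch reconciles a source system extract against what was loaded into the data environment, using record counts, control totals and key comparisons. It is a simplified example under stated assumptions: the field names `loan_id` and `balance`, the function names and the sample records are hypothetical, and an actual SOX control would also involve evidence retention, sign-off and exception handling.

```python
from decimal import Decimal


def control_totals(rows, amount_field="balance"):
    """Return (record count, summed amount) for a list of dict records."""
    total = sum((Decimal(str(r[amount_field])) for r in rows), Decimal("0"))
    return len(rows), total


def reconcile(source_rows, loaded_rows, key="loan_id", tolerance=Decimal("0.00")):
    """Compare a source extract with the loaded data and report any breaks."""
    src_count, src_total = control_totals(source_rows)
    tgt_count, tgt_total = control_totals(loaded_rows)
    src_ids = {r[key] for r in source_rows}
    tgt_ids = {r[key] for r in loaded_rows}
    return {
        "count_match": src_count == tgt_count,
        "total_match": abs(src_total - tgt_total) <= tolerance,
        "missing_ids": sorted(src_ids - tgt_ids),      # in source, not loaded
        "unexpected_ids": sorted(tgt_ids - src_ids),   # loaded, not in source
    }


# Illustrative data only: one loan was dropped during the load.
source_extract = [
    {"loan_id": "L1", "balance": "100.00"},
    {"loan_id": "L2", "balance": "250.50"},
    {"loan_id": "L3", "balance": "75.25"},
]
loaded = [
    {"loan_id": "L1", "balance": "100.00"},
    {"loan_id": "L2", "balance": "250.50"},
]

result = reconcile(source_extract, loaded)
clean_result = reconcile(source_extract, source_extract)
```

In this example `result` flags both a count break and a total break and names `L3` as the missing loan, while `clean_result` passes every check. Amounts are handled as `Decimal` rather than floating point so that the control totals compare exactly.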
This article has focused on why banks need a Governed Enterprise Big Data & Analytics Environment, mainly from the perspective of regulatory analytics, financial reporting and compliance. However, there are also strong arguments for a Governed Data Environment in support of revenue growth, customer experience, operational efficiency, and cognitive and fintech initiatives. These will be the topics of another set of articles, as will explanations of and deep dives into some of the topics mentioned above (e.g., what a Big Data environment is, Data Governance, unstructured data, CECL and Data Risk).
In conclusion, the significant analytical needs of banks require large amounts of detailed, historical and external data. Together with regulatory and investor requirements to ensure data governance, integrity and quality, this has made a Governed Data Environment essential infrastructure for today’s sound bank.
For More Information
For more information, please contact Michael Andrud, President, FinResults, Inc. (firstname.lastname@example.org).
About the Author: Michael Andrud has significant experience as a senior level banking executive, having been an Executive Vice President at leading financial institutions, including roles as a transformative Enterprise Chief Information Officer and Chief Data Officer. More recently, he was the lead Banking Data & Analytics Partner for IBM’s Global Business Services organization. Michael founded FinResults, Inc., which provides financial institutions with advice and solutions to some of their most complex business problems by leveraging the capabilities of the Cloud, big data, Artificial Intelligence, advanced analytics, and process robotics.