Inside JP Morgan's Data Environment (2024)

JPMorgan Chase is one of the largest banks in the world, with nearly $130 billion in revenue in 2020. With 50,000 IT employees and an annual IT budget of $12 billion, the company invests heavily to ensure its technology gives it a competitive advantage. One example is JPMorgan Chase's data infrastructure, which holds more than 450 petabytes of data serving more than 6,500 applications, including one that processes 3 billion messages a day, according to a presentation at AWS re:Invent 2021.

The bank recognizes the importance of data and shares it widely internally. Yet in a highly regulated industry such as banking, making data too accessible can also lead to disaster.

“To unlock the value of our data, we must solve this paradox,” wrote JPMorgan officials in a 2021 blog post on the AWS site. “We must make data easy to share across the organization, while maintaining appropriate control over it.”

Like any large enterprise, JPMorgan stored a lot of data in relational databases. As an early big data proponent, JPMorgan had also adopted Hadoop widely, using it to build a monolithic on-premises data lake managed by a central data engineering team. While Hadoop continues to play a key role in analytics at JPMorgan, the bank also recognized that embracing the public cloud could decentralize data ownership and encourage data democratization and business innovation.

JPMorgan Chase first created a comprehensive data structure based on the concept of “data products.” These are collections of related data that may or may not map to existing business lines or even IT systems. For instance, one JPMorgan Chase data product includes all the data around wholesale credit risk, such as credit exposure, credit rating, and credit facility, harvested from many different data stores and applications. Another data product is focused on trading and position data, including cash, derivatives, securities, and collateral. Using the term “data product” instead of dataset, repository, or even data asset is meant to shift the mindset by highlighting the goal: enabling data to produce business results rather than accumulate dust in some forgotten database, according to James Reid, JPMorgan CIO for Employee Experience and Corporate Technology, in a July 2021 presentation.

Each data product is curated and owned by a team that includes a business owner, a technical owner, and multiple data engineers. They own and deeply understand their specific data product, its uses, its limitations, and its management requirements. At the same time, giving each data engineering team end-to-end ownership of a domain encourages and empowers it to consolidate any “data puddles” and “data ponds” under its management that feed a JPMorgan Chase data lake, said Reid.

Each data product is stored in its own physically isolated data lake. While most are stored on Amazon S3, some remain in on-premises repositories due to regulatory realities, said Reid.
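To make the ownership model concrete, a data product could be described by a small record like the one below. This is only an illustrative sketch: the `DataProduct` class, its field names, the email addresses, and both storage locations are hypothetical and are not taken from JPMorgan's implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataProduct:
    """Hypothetical descriptor for one data product and its owning team."""

    name: str
    business_owner: str
    technical_owner: str
    data_engineers: List[str] = field(default_factory=list)
    # Either an S3 URI or an on-premises location (both made up for this sketch).
    location: str = ""

    @property
    def in_cloud(self) -> bool:
        """True when the product's data lake lives on Amazon S3."""
        return self.location.startswith("s3://")


credit_risk = DataProduct(
    name="wholesale_credit_risk",
    business_owner="business.owner@example.com",
    technical_owner="technical.owner@example.com",
    data_engineers=["engineer.one@example.com", "engineer.two@example.com"],
    location="s3://example-credit-risk-lake/",
)

positions = DataProduct(
    name="trading_and_positions",
    business_owner="business.owner@example.com",
    technical_owner="technical.owner@example.com",
    location="hdfs://example-onprem-cluster/positions",
)
```

Keeping the owners and the storage location on the same record reflects the idea that one team is accountable for a product end to end, wherever its lake happens to live.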

All of these data lakes are cataloged by AWS Glue, Amazon’s serverless data integration tool. In addition, there are consuming applications used by employees that are physically separated from each other as well as from the data lakes. These separate, but interconnected, domains create JPMorgan’s data mesh.

AWS cloud services interconnect the distributed domains. The AWS Glue Data Catalog enables applications and users to find and query the data they need. This enterprise-wide data catalog is automatically updated as new data is ingested into the data lakes, checked for data quality, and curated by data engineers with domain expertise.
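As a rough illustration of catalog-based discovery, the sketch below lists the tables registered in one Glue database together with their storage locations. It assumes a client that behaves like `boto3.client("glue")` and uses that client's `get_tables` operation; the helper name `iter_catalog_tables` and the database name in the usage note are hypothetical, and nothing here is claimed to be how JPMorgan queries its catalog.

```python
from typing import Any, Dict, Iterator


def iter_catalog_tables(glue_client: Any, database: str) -> Iterator[Dict[str, str]]:
    """Yield the name and storage location of each table in a Glue database.

    `glue_client` is assumed to behave like boto3.client("glue").
    """
    kwargs: Dict[str, Any] = {"DatabaseName": database}
    while True:
        page = glue_client.get_tables(**kwargs)
        for table in page.get("TableList", []):
            yield {
                "name": table["Name"],
                # Not every table carries a storage descriptor, so default to "".
                "location": table.get("StorageDescriptor", {}).get("Location", ""),
            }
        token = page.get("NextToken")
        if not token:
            break
        # Ask for the next page of results on the following iteration.
        kwargs["NextToken"] = token
```

With real credentials, usage might look like `list(iter_catalog_tables(boto3.client("glue"), "wholesale_credit_risk"))`, where the database name is a placeholder.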

The catalog also tracks all data requests and audits the flows from the data lakes to applications. This gives JPMorgan Chase data engineers a single point of visibility into how their data is being used, which is key for JPMorgan Chase to remain compliant with the many regulations it faces. This metadata also helps users find data that is relevant, trustworthy, and that they are entitled to use.

Meanwhile, AWS Lake Formation enables data to be shared securely with approved applications and users. Neither applications nor users are ever allowed to copy or store data. This reduces storage costs and prevents the creation of “dark” data silos that lose freshness and accuracy over time, creating data quality and security problems. And without extra copies of data floating around, it's easier to manage data and enforce policies and access controls.
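One way such in-place sharing might be expressed is a read-only Lake Formation grant on a catalog table. The sketch below builds the arguments for the `grant_permissions` operation of a client that behaves like `boto3.client("lakeformation")`. The helper names, the principal ARN, and the database and table names are all hypothetical. Note that a `SELECT` grant only controls who may query the table; the rule that consumers never copy data is an organizational policy this snippet does not enforce.

```python
from typing import Any, Dict


def build_select_grant(principal_arn: str, database: str, table: str) -> Dict[str, Any]:
    """Build keyword arguments for a read-only Lake Formation table grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        # SELECT only: the consumer can query the table where it lives.
        "Permissions": ["SELECT"],
    }


def grant_read_access(lf_client: Any, principal_arn: str, database: str, table: str) -> None:
    """Apply the grant; `lf_client` is assumed to behave like boto3.client("lakeformation")."""
    lf_client.grant_permissions(**build_select_grant(principal_arn, database, table))


# Example request for a hypothetical consuming application's IAM role.
example_grant = build_select_grant(
    "arn:aws:iam::111122223333:role/example-consumer-app",
    "wholesale_credit_risk",
    "credit_exposure",
)
```

Separating the request builder from the call keeps the grant reviewable and testable before anything is sent to AWS.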

Finally, JPMorgan Chase uses a trio of cloud-based engines to query the data: Amazon Athena, Amazon Redshift Spectrum, and, for non-SQL data processing, Amazon EMR. Machine learning is done via Amazon SageMaker.
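For a sense of what querying a cataloged table through Athena can look like, here is a minimal sketch that submits a SQL statement with the `start_query_execution` operation of a client behaving like `boto3.client("athena")`. The function name, the SQL text, the database name, and the S3 output location in the usage note are assumptions made for illustration, not details from the source.

```python
from typing import Any


def run_athena_query(athena_client: Any, sql: str, database: str, output_location: str) -> str:
    """Submit a SQL query to Athena and return its query execution ID.

    `athena_client` is assumed to behave like boto3.client("athena").
    The call only starts the query; results must be fetched separately.
    """
    response = athena_client.start_query_execution(
        QueryString=sql,
        # The database is resolved through the Glue Data Catalog.
        QueryExecutionContext={"Database": database},
        # Athena writes query results to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

A call might look like `run_athena_query(boto3.client("athena"), "SELECT * FROM credit_exposure LIMIT 10", "wholesale_credit_risk", "s3://example-athena-results/")`, with every name in it a placeholder.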

For JPMorgan Chase, its Amazon cloud-based data mesh satisfies three key technical priorities: high security, high availability, and easy discoverability. And that is supporting the outcomes JPMorgan hopes to achieve with its data: cost savings, business value, and data reuse.

With a framework for instantiating data lakes that uses a data mesh architecture, JP Morgan Chase was able to share data across the enterprise while giving data owners the control and visibility they need to manage their data effectively.

Author: Fredrick Kertzmann
