Why don't banks give access to all your transaction activity?
In the era of big data, I find it surprising that banks and credit card companies only offer access to a ridiculously small number of transactions - often only your last 180 days, if that. The longest I've seen went back 720 days. I suspect they do store everything, but intentionally limit access.
These transactions are text only and take up extremely little space: storing an individual's lifetime worth of transactions would take less than 10MB of data, about the storage required by two MP3 files, and about 250k records per person per lifetime, if we generously assume everyone makes 10 transactions a day. But I'd be happy with only the last 10 years of transactions, so about 1MB per customer. One of the largest banks, JP Morgan Chase, has ~70MM credit card customers. That means 70TB of data for 10 years of records - hardly impressive for a corporation of that size, with billions in net income in 2013, at a time when retail cloud storage cost a commodity price per terabyte per month.
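The back-of-envelope arithmetic above is easy to check in a few lines of Python. The 10-transactions-a-day and 70MM-customer figures are the question's own; the 40 bytes per record is my assumption, derived from the question's ~10MB for ~250k lifetime records:

```python
# Back-of-envelope estimate for 10 years of text-only transaction records.
TXNS_PER_DAY = 10          # generous assumption from the question
BYTES_PER_TXN = 40         # implied by ~10 MB / ~250k lifetime records
CUSTOMERS = 70_000_000     # roughly JP Morgan Chase's credit card base

bytes_per_customer = TXNS_PER_DAY * 365 * 10 * BYTES_PER_TXN
total_tb = bytes_per_customer * CUSTOMERS / 1e12

print(f"per customer: {bytes_per_customer / 1e6:.1f} MB")  # ~1.5 MB
print(f"bank-wide:    {total_tb:.0f} TB")                  # ~102 TB
```

This lands in the same ballpark as the question's 1MB-per-customer, 70TB-bank-wide figures; the point stands either way, since both are trivial at enterprise scale.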
By comparison, would you put up with Gmail or any online email provider keeping only the most recent 120 days worth of email? (And emails do take a lot more space than transactions, are far more numerous, and have to be instantly retrievable.)
Storage requirements for transaction activity are trivial in an era where we're throwing around petabytes and zettabytes.
Is there a sound reason for banks not offering access to all your transactions, other than legacy software on their side?
UPDATE: Somehow I hadn't found this 2011 Quora question asking the same thing.
One reason why they limit it is to protect you. If I hack your account, I get your entire financial history.
I can see a copy of every check you ever wrote, and the account number with every doctor, utility, and credit card. I can also see the account information on the back of the checks your relatives deposited after you sent them birthday money.
I can use the information in those accounts to see where you used to live, which allows me to impersonate you when applying for new credit. If a credit application asks whether I ever lived on Main Street in Anytown, USA, I can confidently say yes.
If the bank only lets you download a window of time, the responsibility is on you to protect the data from before that window. The bank protects it in files isolated from the internet, and ultimately only in archive locations.
Some of the information doesn't exist in electronic form. Data from the 1990s and earlier may not exist in the form you want. Banks have been expanding the window over time: I can see/download a PDF of my monthly statement going back 7 years. Of course, that data can't go directly into Quicken.
Some places do let you get a file that goes back farther, but they charge you for it, and it can only be done by them sending you the file. That prevents you from downloading your entire history every day - which, times 70 million customers, would overwhelm their servers and other infrastructure.
Regarding the amount of data:
My quicken file with the last 10 years of data is about 60 MBytes and growing.
You also assume that the only data is in text files. They also store electronic copies of checks.
You also assume that the data you see on the transaction is the only data that exists regarding the transaction. They have fields that track the people who handled the transaction, the location of the transaction (ATM, teller, mail, scanned...), the information regarding when it was transferred to the other bank, when the funds were released...
If you need access to your data beyond the online availability, you download the transactions and manage the archive yourself. Six months to eighteen months is generally enough time for most people to manage their own archived data.
Big banks have the power to store and retrieve all the data online. But older records are rarely accessed, so why keep them online? Backing up the data would take longer. Queries would take longer. Everything would take longer, just so you can have records that 99% of customers will never access.
"Things are the way they are because they got that way."
- Gerald Weinberg
Banks have been in business for a very long time. Yet much of what we take for granted in terms of technology (capabilities, capacity, and cost) is a relatively recent development.
Banks are often stuck on older platforms (mainframe, for instance) where the cost of redundant online storage far exceeds the commodity price consumers take for granted. Similarly, software enhancements that require back-end changes can be more complicated.
Moreover, unless there's a buck (or billion) to be made, banks just tend to move slowly compared to the rest of the business world. Overcoming "but we've always done it that way" is an incredible hurdle in a large, established organization like a bank — and so things don't generally improve without great effort. I've had friends who've worked inside technology divisions at big banks tell me as much.
A smaller bank with less historical technical debt and organizational overhead might be more likely to fix a problem like this, but I doubt the biggest banks lose any sleep over it.
All the other answers here are correct, but I'll add one more perspective. I am a business architect at one of the world's largest retail banks. Every day I experience the frustration of trying to get large-scale corporate IT to do anything, so I feel that your question is just one facet of the wider question: "why are banks so old and busted?"
While it's true that the cost of online, redundant, performant, secure data storage is significantly higher than you anticipate in the question, it should still be well within the capacity of a large enterprise. The true cost is the cost of change.
Nothing at a bank is a green field development. Everything is a bolt-on to existing systems. Any change brings the risk that existing functionality will be affected, therefore vast schemes of regression testing (largely manually executed) spring up around even the most trivial developments. Costs scale exponentially with the number of platforms affected (often utterly distinct, decades-old, incompatible platforms that have arisen out of historical mergers and acquisitions). Only statutory, revenue-generating and critical maintenance change is approved.
Any form of cost-cutting that increases risk is quickly extinguished. This is because when things go wrong, IT get blamed by their business colleagues. This is because the business colleagues in turn get blamed by the regulators, the media, the customers, and the public at large. Who doesn't cuss their bank when the ATM is unavailable? The bank's IT organization develops a kind of management sclerosis, risk averse in the extreme. Banks can't ship a beta version and patch it later.
This ultra-low-innovation approach is a direct result of market and regulatory forces. If you were happy with a bank account that played fast and loose with your money the way Facebook plays with your data, then banking would be much cheaper, much more innovative, and much riskier.
Conclusion
To get back to your specific question: some banks actually do offer a much longer back catalog of transactions for download (usually only a few key fields of each transaction, though). The ones that don't most likely don't see it as a revenue-generating selling point, so it falls beyond their innovation appetite.
Many good points have been brought up, and I'll just link to them here, for ease.
Source: I work at a credit/debit card transaction processing company on the Database and Processing Software teams.
1. Security
See mhoran_psprep's answer.
2. Tradition
See Chris' answer.
3. System Integrity
Believe it or not, banks don't expose their primary (or secondary) database to end users. They don't expose their fastest / most robust database to end users. By only storing x days of data in that customer-facing database and limiting the range of any one query, any query run against it is much less likely to cause system-wide slowness.
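As a purely illustrative sketch (not any bank's actual code), the kind of guard that caps queries against a customer-facing database might look like this; the 180-day window and 31-day range limits are hypothetical values chosen for the example:

```python
from datetime import date, timedelta

# Hypothetical limits for a customer-facing transaction query service.
MAX_HISTORY = timedelta(days=180)  # only 180 days of data live in this DB
MAX_RANGE = timedelta(days=31)     # any one query spans at most ~a month

def validate_query(start: date, end: date, today: date) -> None:
    """Reject queries that could scan too much of the hot database."""
    if start > end:
        raise ValueError("start date must not be after end date")
    if today - start > MAX_HISTORY:
        raise ValueError("requested data has been archived offline")
    if end - start > MAX_RANGE:
        raise ValueError("date range too wide; narrow the query")

# A recent, narrow query passes; anything older or wider is rejected.
validate_query(date(2024, 5, 1), date(2024, 5, 30), today=date(2024, 6, 15))
```

Bounding both the age and the width of every query keeps the worst-case load on the shared customer-facing database predictable, which is the point the answer is making.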
They most definitely have database archives which are kept offline, and most definitely have an employee-facing database which allows employees to query larger ranges of data.
4. No Added Value
What would a bank have to gain by allowing you to query a full year of transactions?
Although if you count only your own data it would be quite a bit less than 10 MB, multiply this by 1 million customers and you can see how quickly the data grows. Banks do retain data for a longer period, as governed by each country's laws, typically in the range of 7 to 10 years.
The cost of online data storage is quite high - 5 to 10 times more than offline storage. There are other aspects too: the more data there is, the longer disaster recovery takes. Hence, after a period of time, banks move the data into archives that are cheaper to store but not available to online query, and whose storage is not optimized for search. Retrieval of this data often takes a few days when a regulator, a court, or some other genuine request demands it.
Well, I know why the Rabobank in the Netherlands does it. I can go back around a year and a half with my internet banking, but I can only go further back (up to 7 years) after contacting the bank and paying €5 per transcript (one transcript holds around a month of activity).
I needed a year worth of transcripts for my taxes and had to cough up more than €50.
EDIT
It seems they recently changed their policy so that you can request as many transcripts as you like for a maximum total cost of €25, so the trend toward easier access is visible.
To add technical detail to the other answers: your (and some commenters') estimates of the cost of storing that data are woefully off - by many orders of magnitude.
Let's take your 10MB of transaction data per user.
You're only estimating text records like in Quicken.
Now add on the volume of storing every check's image. That's 100 KB (if not 500 KB, depending on the resolution of the scan) per check. If you write 100 checks per year (not unrealistic, if you pay all utility/mortgage bills by check, as well as purchases), you now have 10 MB/year to 50 MB/year. You're asking for 10 years of this, so you have 100-500 MB per customer - NOT the ~10 MB you initially assumed. Let's take a mid-range figure, 300 MB.
You were estimating using consumer-grade cheap-o storage (which Facebook can afford for its data, as it doesn't store transaction data). Now upgrade that to enterprise server hard drives: your storage costs just rose 2x-5x.
Now, typically you'd have RAID. So 2x more.
Most large financial institutions have multiple data centers. You typically store copies of all data in those data centers for DR purposes. That multiplier adds another 2x-4x.
Most production data servers have multiple copies (a write DB server plus one or more read-only copies). Multiply by 2x-4x.
With some rare exceptions, most banks don't just have one central database server. Each major app / business line would have its own DB, so you multiply that by 2x-20x depending on the bank, especially if it's arrived at its size by merging with other banks and has dozens of inherited legacy systems.
Multiple backups: regulatory requirements mean you don't just back up your data once a year. You do it daily, until the data is purged from the DB. Meaning, you don't store ONE copy of your transaction in backup - you store, say, 10*365 copies, assuming 10-year retention.
So, at the low end, your cost estimates are 30*2*2*2*2*2 = 960 times off (about 3 orders of magnitude) just for live database storage, and roughly 3,650 times off for backup copies (daily backups over 10 years).
At the high end, they could be 50*5*2*4*4*20 = 160,000 times off (about 5 orders of magnitude).
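The multipliers above compound, which a quick product makes concrete (the low/high figures are this answer's own rough estimates, not measured values):

```python
from math import log10, prod

# (low, high) multipliers from the answer above.
factors = {
    "data volume vs. the ~10 MB text-only estimate": (30, 50),
    "enterprise vs. consumer-grade storage":         (2, 5),
    "RAID redundancy":                               (2, 2),
    "disaster-recovery data centers":                (2, 4),
    "read-only database replicas":                   (2, 4),
    "separate per-business-line databases":          (2, 20),
}

low = prod(lo for lo, hi in factors.values())
high = prod(hi for lo, hi in factors.values())
print(f"low:  {low:,}x  (~{log10(low):.0f} orders of magnitude)")
print(f"high: {high:,}x (~{log10(high):.0f} orders of magnitude)")
```

The product runs from 960x at the low end to 160,000x at the high end, which is why the naive per-customer estimate is so far off.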
At this range, no, it isn't worth it for the bank to keep your transactions available in DB and online any longer than bare-bones absolutely critically necessary.
I would say a lot of the answers here aren't quite right.
The main issue here is that banking is a highly oligopolistic industry - there are few key players (the UK, for example, has only five major banks operating under a variety of brands; it's all the same companies underneath), and the market is very, very hard to enter owing to the immense regulatory burden.
Because the landscape is so narrow and it's possible to keep close tabs on all your competitors, there's no incentive to spend money on shiny new things to keep up with the competition - the industry is purely reactive. If nobody else has an awesome, feature-filled online portal, there's no need for any one bank to make one. If everybody is reactive, and nobody proactive, then it's a short logical deduction that improvements happen at a glacial pace.
Also take into account that when you've got this toxic "bare-minimum" form of competition, the question for these people soon turns to "what can we get away with?" which results in things like subpar online portals with as much information as you like delivered on paper for a hefty charge, and extortionate, price fixed administrative fees. Furthermore your transaction history is super valuable information. There are one or two highly profitable companies who collate international transaction data and whose sole job in life is to restrict access to that information to the highest bidders. Your transaction history is an asset in a multibillion dollar per year industry, and as such it is not surprising that banks don't want to give it out for free.
Well, some banks are seeing the light. In their most recent redesign, Alliant Credit Union has an option to download all transactions:
Heterogeneous archives
A big issue for historical data in banking is that they don't/can't reside within a single system.
The archives of a typical bank will include dozen(s) of different archives made by different companies on different, incompatible systems. For example, see www.motherjones.com/files/images/big-bank-theory-chart-large.jpg as an illustration of bank mergers and acquisitions, and AFAIK that doesn't include many smaller deals. For any given account, its 10-year history might span several different systems.
Often, when integrating such systems, a compromise is made: if bank A acquires bank B, which had earlier acquired bank C, and the acquisition of C was a few years ago, then you can skip integrating C's archives into your online systems, keep them separate, and use them only when/if needed (and minimize that need with hefty fees).
Since the price list and services are supposed to be equal for everyone, no matter how your accounts originated, if even 10% of the archives are expensive enough to integrate, it makes financial sense to restrict access to 100% of archives older than some arbitrary threshold.