• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Spreadsheet Slowing Down!

I'm attempting to run the profiler. I copied the tab ExecutionTimes into my workbook. I tried copying and dragging the module Optimize, but it won't copy or drag into my workbook. Any suggestions?
 
I'm unclear about the exact instructions even after watching the video. I think the copy/paste from-and-to messed me up. But I tried anyway, and got this when pressing the Profile macro button:
upload_2016-10-19_21-44-39.png

However, I did run an experiment to see where the problem might be. I copied and pasted values over two of the sheets in the workbook. These sheets have formulas like this:

=IF(INDEX(Data_Load_Complete_DRG_DB,MATCH(BI$1,Dates_DRG_DB,0))=TRUE,IFERROR(SUMPRODUCT((Maps_To_Class="Medicaid")*(MDC=$G33)*(Dates_DRG_DB=BI$1)*(DRG_Weight)*(DRG_DB))/SUMPRODUCT((Maps_To_Class="Medicaid")*(MDC=$G33)*(Dates_DRG_DB=BI$1)*(DRG_DB)),1),BH33)

What these formulas do is extract data from a database and arrange it for use in a summary format elsewhere in the model. It first evaluates whether we are dealing with an actual results (historical) month or whether it hasn't been reported yet (it's a future reporting period for which we will prepare a forecast). And if there's an error, for example, if there is no volume which would produce a #DIV! result, then default to the prior month. It actually works really well.

These two sheets have 120 months along the top columns and 180 categories down in the rows. There are two similar sheets like this with a combined 52,000 cells.

There are some cells that reference these sheets in the Assumptions tab, and they now take about 40 seconds to recalculate. Other cells that do not reference these sheets are much quicker, but still not instant, maybe 2-3 seconds. So I may have narrowed down the problem without using the profiler program.

I have previously removed all Volatile functions. There are still about 20 VLOOKUP formulas I could replace with INDEX/MATCH if VLOOKUP bogs things down.

The question is, what to do about it? Maybe Pivot Tables could extract the data, but that would require a lot of rewrite on my part. And, given what I want the formula above to do, this will be problematic to create. If the answer is that there's nothing I can do, that a workbook just requires a lot of horsepower when it reaches this size and complexity, will buying a faster computer, say, with 16GB RAM and a 64 bit OS solve this?
 

Attachments

  • upload_2016-10-19_21-44-14.png
    upload_2016-10-19_21-44-14.png
    21.2 KB · Views: 3
Last edited:
Hi ,

The error message you have posted merely indicates that there may be more than name timeallsheets ; do a search for this text , and delete all occurrences of this except for one , which should be the procedure that you want.

Narayan
 
Narayan,

I ran the profiler. It took 3-1/2 days. What it showed was that 3/4 of the total calculation time is taken up by the formulas that extract data from a database and perform calculations against it, as outlined in my last post above. These calculations figure out the weighting for different classes of patients and determine expense levels and revenues. I knew about this time drain because I had previously pasted over the formulas with values to see what would happen.

The other time sink I hadn't known about is a sheet that extracts from the source database patient volumes, like so:
=IF(Settings!$F$11="No","NR ",SUMPRODUCT((Maps_To_Class_APC="Charity")*(APC=$C480)*(Dates_APC_DB=O$5)*(INDEX(Data_Load_Complete_APC_DB,MATCH(O$5,Dates_APC_DB,0)))*(APC_DB))/SUMPRODUCT((Maps_To_Class_APC="Charity")*(Dates_APC_DB=O$5)*(INDEX(Data_Load_Complete_APC_DB,MATCH(O$5,Dates_APC_DB,0)))*(APC_DB))*O$42)

This array formula finds charity patients by payment code. It first determines whether the data has been finalized, and then it finds the percentage of patients for each category (which sum to 100%) and applies this percentage to reported, official patient totals. This way, if the detail is off from the official total, the percentages will ensure that everything ties out.

If it all sounds like one of these steps could be eliminated, it'd have the unfortunate effect of making the resulting formulas enormously long and complicated. Imagine tying this formula to the one in my previous posting. Yikes! Besides, I think it's better as a general practice to split formulas up for easier auditing and error checking.

While I could generate a Pivot Table to do some of the heavy lifting, I'd still have to apply most of the same formula logic to it. I'm not sure that it would save much time.

Do you have anything to suggest? I'm running an 8GB RAM Mac using Excel for Windows. That may be part of the problem. Perhaps a 64 bit, 16GB RAM Windows-only machine might do the trick. Anything you can suggest would be greatly appreciated.

Paul
 
Hi ,

The only thing I can think of is to develop the entire application in VBA , retaining only the simpler formulas.

This would need a complete functional description of what you want the application to do , what will be the input data , and how you want the outputs to be formatted , displayed.

Narayan
 
Let me try avoiding those array formulas first. That might speed up the processing time. I'm almost always using them with SUMPRODUCT formulas. Aside from trial and error to see what works, what are the rules when arrays are needed vs. not?
 
Hi ,

An array formula is required when you have to process data in multi-cell ranges or arrays ; a normal IF statement will look at one cell , and based on the contents of the cell will output its result.

An IF statement can also look at a range or array of cells , and based on an ORing or ANDing or SUMming or any other operation , can output its result ; in this case , the formula will have to be entered using CTRL SHIFT ENTER , in order for it to return the correct result.

Some Excel functions such as SUMPRODUCT naturally work on multi-cell ranges or arrays , and do not need to be entered using CTRL SHIFT ENTER , but this by itself may not result in any improvement in speed.

Thus , if you have a requirement of multiplying two ranges , and adding them up , any of the following formulae will do :

=SUMPRODUCT(A1:A10 , B1:B10)

=SUMPRODUCT(A1:A10 * B1:B10)

entered normally , or :

=SUM(A1:A10 * B1:B10)

entered using CTRL SHIFT ENTER.

But which of them will be the fastest , and how much of a difference it will make is something that will need to be tested.

As a general rule , we can take it that two functions that need to do exactly the same thing may not differ much in their execution time , unless the data is substantial enough that even a difference of a few milliseconds of difference per cell add up to minutes when the number of cells runs into hundreds of thousands.

Narayan
 
Thank you for the reply. I do not think this involves hundreds of thousands of cells, but I can find some formulas where I used control-shift-enter and didn't need to, to see if there's any measurable difference.

By the way, a friend ran this file on his machine. It's about the same as my computer, except that it has 16GB RAM whereas mine has 8. We both have about 2.3-2.6 gigahertz processors. He could recalculate a cell in about 15 seconds, whereas my recalculation time for the identical cell was 35 seconds. For a good user experience, that's still too long. I'm wondering if 32GB or even 64GB of RAM, and possibly a 4 gigahertz processor would fix everything. But I can't test this assumption until I actually buy a new computer, something I don't want to do.
 
8GB of ram is small, particularly on today's machines, you are using that for everything running on your machine, the OS, background services, open software such as Excel, if you have a browser open as well then you may as well go cut the grass while Excel try's to add up 2+2 with the meagre resources left to it.
If you have CF assigned to cells in Excel then memory will be reserved for it even if CF is not use, Excel can soon turn into a memory hungry hog.
You say you do not wish to upgrade your machine but if it only came with 8Gb you may find you are suffering even more as your files grow with CF, volatile formulas, VBA, dashboard, pivot tables and all the rest of Excels greed.
 
Most of the time, I try to design dashboard/workbook for lowest performance machine that can be reasonably expected to be in use by client (internal or external).

4Gb Ram, i5 ~ 2.5GHz is pretty common in business environment. As most companies are slow to update/upgrade computers when it is working.

Liberal use of MS Query and PowerQuery has been helpful in this regard.
 
8GB of ram is small, particularly on today's machines, you are using that for everything running on your machine, the OS, background services, open software such as Excel, if you have a browser open as well then you may as well go cut the grass while Excel try's to add up 2+2 with the meagre resources left to it.
If you have CF assigned to cells in Excel then memory will be reserved for it even if CF is not use, Excel can soon turn into a memory hungry hog.
You say you do not wish to upgrade your machine but if it only came with 8Gb you may find you are suffering even more as your files grow with CF, volatile formulas, VBA, dashboard, pivot tables and all the rest of Excels greed.
To experiment: recalculate time is now 35 seconds for a particular cell sitting at the head of a large dependency tree. I closed the browser, and the recalculate time went down to 27 seconds. Then I eliminated conditional formatting (I have a lot) and the time was further reduced by 2 seconds.

I do have some simple VBA code that I no longer need present in a lot of the model. Not sure how much that will speed things up, if at all. But I'm going to add VBA code soon to automate some operations. There are no Pivot Tables, as of yet, but I know that takes some time also.

I'm running a Mac using Parallels for Windows. I wonder if a Windows-only machine would be faster. I can't find any documentation on how things might slow down when using two operating systems at once. If I knew this was the solution, I'd go get a 16GB Windows machine and be done with it.
 
Most of the time, I try to design dashboard/workbook for lowest performance machine that can be reasonably expected to be in use by client (internal or external).

4Gb Ram, i5 ~ 2.5GHz is pretty common in business environment. As most companies are slow to update/upgrade computers when it is working.

Liberal use of MS Query and PowerQuery has been helpful in this regard.
Thanks, that's a good observation. Other users do not have the fastest machines. Of course, I could just tell my clients to run this on a machine with certain specs if I knew what those specs would be. Or be prepared to wait!
 
Back
Top