• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Large data analysis approach

ornshrn

New Member
Greetings to the Chandoo community
Hi guys

I'd like to consult with this fine community on how to analyze a very large data base file. The file contains 44 columns and ~160,000 rows, resulting the file to be 37MB data only.

Based on my experience, any formulas calculation or pivot tables will enlarge the file size and will consume CPU resource when calculating any new formula.

What should I do?

I added a small example of the columns and the top 10 rows to help you understand my challenge.

Appreciate any kind of advice.

Oren
 

Attachments

Oren

There are techniques in Excel to handle large data sets but they are well beyond what is generally used at Chandoo.org

Can you tell us what you want to do with the data ?
Also how often does it change and are the changes incremental or the whole of the data set?
 
Hi Hui,

I'll explain:

1. my objective is to prioritize column AE (LOCAL_ITEM_DESCRIPTION)& K (LOCAL_SUPPLIER_NAME) first by AQ (RCV_OR_ORD_USD_REPORTED_VALUE) and than by AM (RCV_OR_ORD_QTY).
2. The constraints are: Column Q (INCOTERMS), only what start with C or D. What do I do with all the blanks cells?
3. The partition of the above is according to T (PO_REGION) & Y (GLOBAL_TYPE_OF_GOODS)

I really hope I made myself a bot more clear :DD

Thanks
Oren
 
When you say Prioritise, do you mean sort or highlight or something else ?

Have you tried applying filters and then sorting in a Table
 
It's not a simple sorting or highlighting, it involves some conditions in order to get only the data I need. A pivot table and some vlookups will solve it but the file size will be dramatically large..

Can I use this as a raw data base and some how perform calculation from another file?
 
I would think about adding a number of helper columns to tag records according to your rules
then simply use Filtering and Sorting based on those fields

Try using Pivot Tables, they may not impact the file as much as you think
As always keep a backup just in case
 
Back
Top