• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Excel "not responding" when working with large data

trident50

New Member
Hello,
I've posted twice on here and got excellent results - when working with my "test workbook" - which is a small portion of the original 2 sheets (I'll upload the test file).

Pasted below is my original question - which was answered with a formula that works perfectly!

The problem I'm now facing is my original "sheet1" is almost 25,000 rows and 19 columns and is 9 Mb in size. "Sheet2" is 1.35Gig with 1.33 Million rows and two columns.

Even breaking sheet2 up into 10, 130 mb, files, I can't manipulate the data. Even if I apply the formula below, I still can't sort it without Excel "not responding".

I'm running a virtual server on MS Azure with 8 core and 54gb ram and nothing running on it (task manager shows cpu usage at 2% and ram at 5%).

I have access to a wide variety or tools - Access, SQL server, Foxpro 9, Filemaker Server Pro 11, Office 2010 & 2013 but, as a network engineer, my knowledge programming and databases is somewhat limited and I'm wondering if anyone can help out - maybe Excel isn't the right tool for this particular job? I've even installed Powerpivot in both versions - although I know little about it - and when trying to import the 1.3gb txt file I get memory errors.

Here is my original post:

I'm running Excel 2013 and I have one workbook with two sheets - sheet1 and sheet2. Sheet1 has 24 columns and 26K rows. Column "A" is titled "id" and contains
id numbers that are also, some, found in column "A" of Sheet2 - which is also titled "id".

Sheet1 has no "description" column. Sheet 2 only has 2 columns - "A" is the "id" column and "b" is the "description" column.

A B

ID Description

58749651 a bunch of text

How can I create a description column on sheet1 (it would be column "U") and populate it with the description from sheet2 only where the "id" values from column "A" match?

So, basically, I'm hoping to populate a clumn in sheet 1 (the next open column is "U") where the id number matches the id number on sheet 2, column 1.

So both sheets have id numbers in column 1.

Sheet 2 only has 2 columns - column "A" is "id" and column "B" is "description"

I want those descriptions, where the ID numbers match, to populate column "U" on sheet1.


And here is the formula provided -

Hello Rob,

You could use VLOOKUP in Sheet1.

Use in U2, then copy down.

=IFERROR(VLOOKUP(A2+0,Sheet2!A:B,2,0),"")
Regards,
Haseeb Avarakkan


And I've uploaded the sample file.

Again, not sure if Excel is the way to do this so I'm open to any and all suggestions.

Thank you all!
Rob
 

Attachments

Not Responding doesn't actually mean that Excel is frozen,
In fact Excel is still calculating.

I'd suggest leaving it for as long as you can, even overnight.
 
In regards to speeding it up

Backup the workbook

On sheet1 I would copy/paste all the cells as values, maybe don't copy row 2 so you have a copy of the formulas
This will stop intermediate recalculations that aren't required.

Change U2: =IFERROR(INDEX(Sheet2!B:B,MATCH(A2,Sheet2!A:A,0)),"")
Copy down

This doesn't need the source to be sorted
 
Hi Hui,
I've left it overnight, for 18 hours, and the entire system hung - not just Excel.

And I've copied and pasted all this data into a new workbook pasting only the values, and not the formula, and it makes no difference.

It's 1.3gb of data but it's running on an 8 core with 54gb of ram…and it's not even doing anything all that complex…I'll be needing to do this at least once a week…so if it takes 24 hours to do I'm thinking maybe there's a completely different way to go about this?

Thanks for your help!
 
If you have the data as a Text or Access file I'd sort it in Access if required
Then use access to import the second field as a merge of some type (I'm not an Access guy but I know it can do what you want)
1.3Million records isn't that many

I'd also be doing it on a local PC, not on a Virtual Server ?

I also suspect there is more to your workbook than you have told us.
If I copy Sheet1 Row 500 down to row 25000 and copy Sheet2 Row 2 down to row 1048576, my Workbook is only 11MB! Which is nowhere near the 1.3GB you mention

I'm also not sure how you are working with 1.3M Rows as Excel can only handle 1048575 rows of data + a header row?
 
Hmmm….okay, now I'm totally confused. what are you copying/looking at that is only 11mb? The test file I uploaded is only 600Kb or so.

The .txt file I have is 1.3 gb and 1,332,758 rows - that is the number that gets imported from the "table import" wizard in Powerpivot.

Is there a problem with a virtual system? Because I don't have anyhting nearly as powerful at home…;)
 
What I said was "If I copy Sheet1 Row 500 down to row 25000 and copy Sheet2 Row 2 down to row 1048576, my Workbook is only 11MB! Which is nowhere near the 1.3GB you mention"
Doing this should have effectively re-created your Excel file and it is nowhere near the 1.3GB you mention in your text

I also said that you can't fit 1.3 million rows into 1048576 rows

I suggested doing it all in Access using a merge function, which I am unfamiliar with
 
Hi ,

You might have got your answer by now , but if you don't mind , I'd like to ask a few questions which I don't see in this thread.

1. Why are there 2 sheets of information ?

As far as I can see both sheets have unique entries per ID ; Sheet2 does not have any other information except the description of each ID ; so why was this not in Sheet1 to start with ?

2. Is there any logic to the ID numbering ?

From what I can see , Sheet2 has IDs which all start with 651 ; the data on Sheet1 does not have a single ID starting with 651 ; what is the purpose of this formula in the first place ?

3. If at all your first file is just 9 MB in size , why not leave the second file as a text file , and use code to do a search within the text file ?

See this link for some ideas.

If at all it is possible to get the text file with the IDs in sorted order , it might be better to split it into at least 2 text files , so that we can straightaway decide which of the files will contain the ID we are looking for.

Narayan
 
Hello Narayan,

The two sheets come from Apple. The first sheet I pulled from a flat file - very simple, easy to work, etc…but, unfortunately, that flat file does not include the "description" column that is found on sheet 2. That information was part of an extremely large mysql file which I imported into mysql and then pulled out those two columns.

For whatever reason Apple did not include the "genre" category in the mysql file so the only way was to pull every product, but only what I needed - the ID field and description - so I could do the vlookup and add the description to the appropriate, corresponding, ID/item on sheet1.

It is as if Apple went out of their way to make this difficult. I asked for a flat file that contained the description field and they said they only provide the data the way it is - no customization.

As for the number schema - the sample file I provided is very small and it could just be a coincindence that there isn't a match. I have split sheet2 into 10 smaller files - all about 125mb each. Each one of those smaller files does contain matching data with sheet1.

The end result is sheet1 - all 24,000 rows, amended to include the desciption from sheet2.

I may need to go back to the mysql database and, using phpmyadmin, see if I can mesh the data together somehow that way?

You wrote "see this link" but I don't see a link in your post - could you post that link again please?

Thank you for your help!
Rob
 
Hi Narayan!
Thank you for your time….the "exceluser.com" looks promising - I will try it out shortly and let you know.

Hui's point about not using a virtual system was interesting as well - I have an imac as my primary machine but I also have a Dell XPS with 6mb ram (it's not setup at the moment) that has some tools on it - 3rd party stuff - that I've used several years ago for large files.

I may have to go back to the beginning - when I exported this data from Mysql - and rethink things…:) But I will post an update here shortly - I appreciate your help!
Rob
 
Back
Top