Ξ bigXi

March 28, 2006

IE memory leak, revisited

Filed under: Ajax,IE Memory Leak,JavaScript — bigxi @ 9:43 pm

Prelude

By now it is well known that IE (from version 4 to version 6) leaks memory with DHTML (or Ajax, if you prefer). As pointed out in numerous articles (see the list at the end), the major source of memory leaks is circular references formed between JavaScript objects and DOM nodes (host objects, or ActiveX objects). A natural question to ask is: since the problem is so well known, will Microsoft fix it? With Ajax becoming more popular each day, can we expect IE 7 to be leak free?

An answer to that question depends on why IE leaks in the first place. Unfortunately, after searching high and low, near and far, I could only find this little snippet from this MSDN blog entry:

"This page used to say that IE tears down the div when the page is navigated away, but it turns out that that's not right. Though IE did briefly do that, the application compatibility lab discovered that there were actually web pages that broke when those semantics were implemented. (No, I don't know the details.) The IE team considers breaking existing web pages that used to work to be way, way worse than leaking a little memory here and there, so they've decided to take the hit and leak the memory in this case."

So, IE leaks because of compatibility issues. Web pages actually break when garbage is rightfully collected. I can't help but wonder what those web pages look like and who owns them. Since the leak mainly concerns circular references involving ActiveX objects, can we boldly assume that the offending pages have ActiveX controls embedded? Maybe some of them belong to Microsoft? If that's the case, it is indeed much better to take the "hit" and leak the memory here and there.

Looks like we have to live with IE memory leaks until the compatibility issues are gone.

Coping with IE memory leak

Many articles on the web refer to Joel Webber's DHTML Leaks Like a Sieve, which has since disappeared from the web. Joel Webber also wrote a nice little tool called Drip to detect memory leaks in IE, which was once slashdotted but was also lost for a while, until it was found again on OutOfHanwell.

The "official" dose of medicine from Microsoft is probably Justin Rogers' Understanding and Solving Internet Explorer Leak Patterns on MSDN, which many use as the blueprint for remedying IE memory leaks. In his article, Justin describes various patterns that lead to IE leaks and ways to fix them. Interestingly enough, he also describes two other patterns besides the circular reference leak we are all familiar with: the DOM insertion order leak and the dynamic scripting leak, which are described in the "Cross-page Leaks" and "Pseudo-Leaks" sections of the article, respectively. I will also talk about these types of leaks later in this article.

To avoid leaks caused by circular references, the suggestions offered are:

  1. Do not form circular references
  2. If you have to form circular references (especially with event handlers through closures, which are so easy and convenient), break them up with an onunload handler when the page unloads.
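
The second suggestion can be sketched roughly as follows. This is a minimal illustration, not code taken from any of the cited articles: makeFakeElement, hookup, and breakCycles are hypothetical names, and plain objects stand in for DOM nodes so that the snippet runs outside a browser.

```javascript
// Hypothetical stand-in for a DOM node, so the sketch runs without a browser.
function makeFakeElement(id) {
    return { id: id, onclick: null };
}

// Every element that received a handler, remembered for cleanup.
var hooked = [];

function hookup(element) {
    // The handler closes over `element`, and `element` points back at the
    // handler through onclick: a circular reference.
    element.onclick = function () {
        return "Clicked: " + element.id;
    };
    hooked.push(element);
}

// Break each cycle by detaching the handler from its element.
function breakCycles() {
    for (var i = 0; i < hooked.length; i++) {
        hooked[i].onclick = null;
    }
    hooked.length = 0;
}

// In a real page this would be wired up as: window.onunload = breakCycles;
```

Once breakCycles has run, no element points at a handler any more, so the element-to-closure loops are gone.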

Along the lines of the second suggestion are some more complex and systematic schemes to register event handlers so that the event handlers are automatically unregistered upon page unload, therefore breaking any circular references that may be formed during event registration. Among them are:

  1. EventCache by Mark Wubben
  2. EventManager by Keith Gaughan

However, as I will discuss below, onunload handlers may be effective for cross-page leaks, but they generally can't handle same-page leaks.

What's a memory leak anyway?

In the C and C++ world, the answer is simple. Any memory that is no longer reachable but has not been released is leaked. When you allocate memory, your application footprint grows. When you release memory, the footprint shrinks.

Not so clear-cut for JavaScript. With JavaScript, you no longer have direct control of memory usage. There's a garbage collector running in the background and it decides how and when to reclaim a piece of memory that's no longer used. The garbage collector does so by determining which objects are no longer accessible and therefore can be deallocated from memory. It can detect circular loops among "garbage" by an algorithm called "mark and sweep", i.e., if a group of objects hold references to each other but are nonetheless no longer referenced by anything from the active execution path of the program, they can be marked as garbage collectively and cleaned up.
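
As a rough illustration of the idea, here is a toy mark-and-sweep pass over a made-up heap. It is only a sketch of the general technique, not a description of how the IE script engine is implemented; the heap layout and the names mark and sweep are assumptions for the example.

```javascript
// Toy heap: each key is an object name, each value lists the objects it
// references. Names and layout are made up for illustration.
var heap = {
    root: ["a"],
    a: ["b"],
    b: [],
    x: ["y"],   // x and y reference each other...
    y: ["x"]    // ...but nothing reachable from root points at them
};

// Mark phase: flag everything reachable from the given roots.
function mark(heap, roots) {
    var marked = {};
    var stack = roots.slice();
    while (stack.length > 0) {
        var name = stack.pop();
        if (marked[name]) continue;
        marked[name] = true;
        var refs = heap[name];
        for (var i = 0; i < refs.length; i++) {
            stack.push(refs[i]);
        }
    }
    return marked;
}

// Sweep phase: remove every unmarked object and report what was collected.
function sweep(heap, marked) {
    var collected = [];
    for (var name in heap) {
        if (heap.hasOwnProperty(name) && !marked[name]) {
            collected.push(name);
        }
    }
    for (var j = 0; j < collected.length; j++) {
        delete heap[collected[j]];
    }
    return collected.sort();
}

var collected = sweep(heap, mark(heap, ["root"]));
```

Here x and y are collected even though each still has an incoming reference, because neither can be reached from root. The toy model covers only a cycle that lives entirely among script objects; it does not model a cycle that crosses into DOM nodes.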

With JavaScript, you can have variables and object references going out of scope and yet see the memory consumption going up. That does not necessarily mean that there is a memory leak. It may simply be that the garbage collector hasn't had a chance to collect the garbage yet. Conversely, if the garbage collector happens to be at work while you create new objects, you may see the memory usage shrink.

Therefore, for JavaScript, there is a leak when the garbage collector is unable (vs. not cleaning up at this moment) to clean up some garbage. To determine that the garbage collector is unable to do cleanup, you have to repeat your test many many times and see the memory consumption growing without bounds where you'd expect to see a flat-line otherwise.

When you navigate the browser from one web page to another, none of the objects in the document of the previous page should be left over to the new page. Those objects used to construct the previous page should be cleaned up, sooner or later. If there are leftovers from the previous pages, and they accumulate and never go away, you have a cross-page memory leak. Likewise, if you are dynamically adding and removing objects but nonetheless stay on the same page, the garbage collector should be able to reclaim the resources used up by the removed elements (sooner or later). If the removed objects accumulate and never go away, you have a same-page leak.

Cross-page leaks are bad enough that you'll eventually have to close your browser and restart it to make it functional again. Same-page leaks are less harmful but may be important for some Ajax applications where the whole application is fitted in one page and you never navigate away from it unless exiting the application.

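Since the page is never unloaded, onunload handlers do nothing to remedy same-page leaks. In fact, schemes like EventManager or EventCache practically guarantee a same-page leak: as far as I can tell, they keep track of every registered handler (and, through it, the element) until unload, so elements removed from the page earlier can never be collected.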
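
To see why a registry that is only emptied at unload works against same-page cleanup, consider the sketch below. It is hypothetical and does not reproduce the actual EventCache or EventManager code; registry, register, and flush are made-up names, and plain objects stand in for DOM nodes.

```javascript
// Hypothetical registry: remembers every registered element until flush().
var registry = [];

function register(element, type, handler) {
    element["on" + type] = handler;
    registry.push({ element: element, type: type });
}

// Intended to be called from onunload.
function flush() {
    while (registry.length > 0) {
        var entry = registry.pop();
        entry.element["on" + entry.type] = null;
    }
}

// Simulate an Ajax page that adds and then removes three widgets
// without ever unloading.
var page = [];   // stands in for the children of document.body
for (var i = 0; i < 3; i++) {
    var widget = { id: "widget" + i };   // plain object instead of a DOM node
    register(widget, "click", function () {});
    page.push(widget);
}
page.length = 0; // "remove" all widgets from the page

// The page no longer holds the widgets, but the registry still does.
var stillReferenced = registry.length;
```

As long as flush is tied to onunload, the three removed widgets stay reachable through registry for the lifetime of the page.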

Closures are your friends

There once was a saying: "closures are your friends". But since the IE bully appeared, people have been shying away from closures whenever they can (or can't). However, there's a little secret about our quiet friends that makes it easier to befriend them again. And that's what I'm going to tell you here.

Let's start with a simple test (test1):

<html>
    <body>
        <button onclick="startTest();">Start!</button>
        <script language="JavaScript">
            function startTest() {
                for (var i = 0; i < 5000; i++) {
                    var element = document.createElement("DIV");
                    element.innerHTML = "Div #" + i;
                    hookupEvent(element);
                    document.body.appendChild(element);
                }
            }

            function hookupEvent(element) {
                element.onclick = function() { alert('Clicked: ' + this.innerHTML); }
            }
        </script>
    </body>
</html>

Does this page leak? Definitely. The DOM node element has a reference to an anonymous JavaScript function (the onclick handler), which in turn has a reference back to element through the closure formed inside hookupEvent. There is a circular reference loop encompassing both JavaScript and DOM objects. Test it for yourself. Here's how: load the page, click "Start!", reload the page, click "Start!", …, and watch the memory usage grow in the Windows Task Manager. Repeat until you are satisfied.

Now let's modify test1 a bit and call it test2:

<html>
    <body>
        <button onclick="startTest();">Start!</button>
        <script language="JavaScript">
            function startTest() {
                for (var i = 0; i < 5000; i++) {
                    var element = document.createElement("DIV");
                    element.innerHTML = "Div #" + i;
                    element.onclick = function() { alert('Clicked: ' + element.innerHTML); };
                    document.body.appendChild(element);
                }
            }
        </script>
    </body>
</html>

Is there a leak here? It definitely seems so. There's still a closure and the circular references still exist. Test it, please!

Did you see the memory consumption grow? No? Are you surprised?

Now click the "Start!" button and bring the DIVs back. Click on Div #0. What does it say? "Clicked: Div #4999"?! In fact, no matter which DIV you click, it will always say "Clicked: Div #4999".

As it turns out, JavaScript closures don't enclose the values of variables at the moment they are formed. Instead, they enclose the scope in which they were created, and the variables inside that scope can be updated even after the closure is formed. In test1, every call to hookupEvent creates a new scope, so there are 5000 scopes, each holding a different element. In test2, 5000 handler functions are still created, but they all share the single scope of one startTest call, whose element variable ends up pointing at the last DIV. In terms of what each handler sees, test2 behaves like:

<html>
    <body>
        <button onclick="startTest();">Start!</button>
        <script language="JavaScript">
            function startTest() {
                for (var i = 0; i < 5000; i++) {
                    var element = document.createElement("DIV");
                    element.innerHTML = "Div #" + i;
                    element.onclick = onclickHandler;
                    document.body.appendChild(element);
                }

                function onclickHandler() {
                    alert('Clicked: ' + element.innerHTML);
                }
            }
        </script>
    </body>
</html>

Strictly speaking, there's still a leak in test2. But instead of leaking 5000 elements, only the last element is leaked.
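
The difference between one shared scope and one scope per call can be checked without a browser. The sketch below uses plain objects instead of DIVs; sharedScope, makeHandler, and separateScopes are hypothetical names introduced for the illustration.

```javascript
// test2 style: every handler is created inside the same call, so all of them
// share one `element` variable.
function sharedScope(count) {
    var handlers = [];
    for (var i = 0; i < count; i++) {
        var element = { label: "Div #" + i };   // plain object instead of a DIV
        handlers.push(function () { return "Clicked: " + element.label; });
    }
    return handlers;
}

// test1 style: the helper is called once per element, so every handler gets
// its own scope with its own `element`.
function makeHandler(element) {
    return function () { return "Clicked: " + element.label; };
}

function separateScopes(count) {
    var handlers = [];
    for (var i = 0; i < count; i++) {
        handlers.push(makeHandler({ label: "Div #" + i }));
    }
    return handlers;
}

var shared = sharedScope(3);
var separate = separateScopes(3);
// shared[0]() and shared[2]() both report "Clicked: Div #2";
// separate[0]() reports "Clicked: Div #0".
```

The handlers in shared are distinct function objects, yet they all read the same variable, which is why only the last element stays referenced from the scope.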

Taking insight from the previous example, we can now easily make our test1 leak free:

            function hookupEvent(element) {
                element.onclick = function() { alert('Clicked: ' + this.innerHTML); }
                element = null;
            }

The only thing I did was to set element to null after the event handler is attached. And suddenly we are leak free! The closure is still there, but there's no circular reference anymore.
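
The effect of the assignment can be observed outside a browser as well. In the sketch below a plain object stands in for the DIV, and probe is a hypothetical helper added only to inspect what the closure's scope still holds; it is not part of the fix itself.

```javascript
function hookupEvent(element) {
    // The handler relies on `this`, not on the captured `element` variable.
    element.onclick = function () { return "Clicked: " + this.label; };
    // Hypothetical probe that shares the same scope as the handler.
    var probe = function () { return element; };
    element = null;   // the shared scope no longer points at the element
    return probe;
}

var div = { label: "Div #0", onclick: null };   // plain object instead of a DIV
var probe = hookupEvent(div);
var message = div.onclick();   // called as a method, so `this` is div
var captured = probe();        // what the closure's scope holds now
```

The handler keeps working because it reads this at call time, while the scope it closes over holds null instead of the element.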

In general, we should understand that closures don't necessarily mean circular references. And in some instances breaking up circular references is actually easier to do at the point of closure formation than at page unload time.

What about insertion order leaks?

One leak pattern that is presented in Justin Rogers' MSDN article is the "insertion order leak", which as far as I know isn't discussed anywhere else.

Here's the test that leaks (test3):

                var hostElement = document.getElementById("hostElement");
                var parentDiv = document.createElement("<div onClick='foo()'>");
                var childDiv = document.createElement("<div onClick='foo()'>");
                parentDiv.appendChild(childDiv);
                hostElement.appendChild(parentDiv);

And the test that does not leak (test4):

                var hostElement = document.getElementById("hostElement");
                var parentDiv = document.createElement("<div onClick='foo()'>");
                var childDiv = document.createElement("<div onClick='foo()'>");
                hostElement.appendChild(parentDiv);
                parentDiv.appendChild(childDiv);

The only difference between test3 and test4 is the order of attachment to the DOM tree. If the dynamic elements are pre-assembled first, there is a leak; if they are attached directly to the DOM without pre-assembly, there's no leak.

However, I found the explanations in Justin's article somewhat hard to believe. So I modified the tests as follows:

test5:

                var parentDiv = document.createElement("<div>");
                parentDiv.onclick = function() { foo(); }
                var childDiv = document.createElement("<div>");
                childDiv.onclick = function() { foo(); }
                parentDiv.appendChild(childDiv);
                hostElement.appendChild(parentDiv);

test6:

                var parentDiv = document.createElement("<div>");
                parentDiv.onclick = function() { foo(); }
                var childDiv = document.createElement("<div>");
                childDiv.onclick = function() { foo(); }
                hostElement.appendChild(parentDiv);
                parentDiv.appendChild(childDiv);

And there's no leak either way! Then I ran these two tests:

test7:

                var parentDiv = document.createElement("<div onClick='foo()'>");
                parentDiv.innerHTML = "parent";
                var childDiv = document.createElement("<div onClick='foo()'>");
                childDiv.innerHTML = "child";
                parentDiv.appendChild(childDiv);
                hostElement.appendChild(parentDiv);

test8:

                var parentDiv = document.createElement("<div onClick='foo()'>");
                parentDiv.innerHTML = "parent";
                var childDiv = document.createElement("<div onClick='foo()'>");
                childDiv.innerHTML = "child";
                hostElement.appendChild(parentDiv);
                parentDiv.appendChild(childDiv);

WARNING: don't ever try test7 or test8 5000 times; it may crash your machine!

So what's my take? I don't believe there's such a thing as an insertion order leak, at least not as demonstrated by test3 and test4. The leak has something to do with the way IE parses and processes the string passed to document.createElement.

BTW, none of the leaks that appeared in the above tests is cross-page. Memory usage returns to normal when you refresh the page.

Dynamic scripting leaks are real leaks

In his MSDN article, Justin Rogers demonstrated another type of leak with this test:

<html>
    <head>
        <script language="JScript">

        function LeakMemory()
        {
            for(i = 0; i < 5000; i++)
            {
                hostElement.text = "function foo() { }";
            }
        }
        </script>
    </head>

    <body>
        <button onclick="LeakMemory()">Memory Leaking Insert</button>
        <script id="hostElement">function foo() { }</script>
    </body>
</html>

Basically, when you click the "Memory Leaking Insert" button, the script element is re-written 5000 times. The memory footprint grows each time you click the button and never falls back, unless you leave the page. However, I don't quite agree with Justin's classification of this leak as a "pseudo-leak". I believe it's a real leak in the sense that scripts that are no longer needed and no longer reachable cannot be garbage collected. Somehow, I think this leak is related to the "insertion order leak" discussed above.

Resources

  1. Understanding and Solving Internet Explorer Leak Patterns
  2. Out of Hanwell: IE Memory Leaks
  3. Quirks Mode: Memory Leak Mystery
  4. Novemberborn: EventCache
  5. Talideon: EventManager
  6. IE: Where's My Memory?
  7. JavaScript Closures


7 Comments »

  1. Good article, thanks. It’s “garbage” – not “gabbage” or “garbbage.” That sort of thing is just annoying enough to make the article hard to read.

    Comment by Brett — April 5, 2006 @ 3:39 pm | Reply

  2. Thanks Brett. And sorry about the misspelled garbage. I guess my spelling was strongly influenced by the name Babbage – the “Father of Computing”. But anyway, error corrected.

    Comment by bigxi — April 5, 2006 @ 4:50 pm | Reply

  3. Maybe the only correct way to use innerHTML – LEAK FREE http://www.posos.com/page/Index.cfm?SelNavID=2714. Use outerHTML…

    Comment by Michael — August 5, 2007 @ 6:45 am | Reply

  4. This is my first post
    just saying HI

    Comment by jameswillisisthebest — September 8, 2007 @ 9:51 pm | Reply

  5. I read some of the posts and I think it is a great site. I want make the best use of my teamwork peer A JOKE! ) What geometric figure represents a lost parrot? A polygon.

    Comment by Astothtipsy — October 26, 2008 @ 8:01 pm | Reply

  6. any changes coming ?

    Comment by Sexo Cl — August 4, 2009 @ 4:34 am | Reply

  7. Regarding test2 – the reason that it always displays Div #4999 is that you used element.innerHTML instead of this.innerHTML. I suspect that there ARE 5000 function objects created (for the 5000 DIV objects) but there is only one closure (around startTest()) which is shared by them all. Within this closure, element retains its last value (the 5000th DIV object, hence the output). This means that the leak is not just the last element but the whole of startTest(). Since this is invoked only once, the leak is (about) 5000 times smaller than when the leak is the closure around hookupEvent() which is invoked 5000 times. (The “about” is included because the closures are not necessarily the same size.)

    For those of you used to more tightly scoped languages (like C++) who are wondering about how onclickHandler(), in the revision of test2, can “see” element, the answer is that Javascript (aka ECMA-262) has (had?) only function (and object) local scope. Any variable declared anywhere within a function, even implicitly by first appearance (assuming it’s not already in an enclosing scope), is visible throughout the function. Since onclickHandler() is defined WITHIN startTest(), it can see element.

    Comment by Code Master Bob — April 26, 2011 @ 2:26 pm | Reply

