Example #1: Function expression identifier leaks into an enclosing scope
var f = function g(){};
typeof g; // "function"
Remember how I mentioned that the identifier of a named function expression is not available in the enclosing scope? Well, JScript doesn't agree with the specs on this one - g in the above example resolves to a function object. This is the most widely observed discrepancy. It's dangerous in that it inadvertently pollutes the enclosing scope - a scope that might well be the global one - with an extra identifier. Such pollution can, of course, be a source of hard-to-track bugs.
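For contrast, here is a minimal sketch of what a spec-conforming engine does with the same construct. The variables insideType and outsideType are only illustrative names; the point is that g is bound inside the function body and nowhere else.

```javascript
var f = function g() {
  // inside the function body the name `g` is bound to the function itself
  return typeof g;
};

var insideType = f();        // "function"
var outsideType = typeof g;  // "undefined" in conforming engines, "function" in JScript
```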
Example #2: Named function expression is treated as BOTH - function declaration AND function expression
typeof g; // "function"
var f = function g(){};
As I explained before, function declarations are parsed before any other expressions in a particular execution context. The above example demonstrates how JScript actually treats named function expressions as function declarations. You can see that it parses g before the “actual declaration” takes place.
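As a rough illustration of the conforming behaviour, the sketch below probes both identifiers before the assignment runs. The probe function and its local variable names are made up for this example; only the var declaration of f is hoisted, and g never becomes visible in the enclosing scope.

```javascript
function probe() {
  var beforeF = typeof f; // `var f` is hoisted, but nothing is assigned yet
  var beforeG = typeof g; // `g` is not declared in this scope at all
  var f = function g() {};
  return [beforeF, beforeG, typeof f];
}

var result = probe(); // ["undefined", "undefined", "function"] in conforming engines
```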
This brings us to the next example:
Example #3: Named function expression creates TWO DISTINCT function objects!
var f = function g(){};
f === g; // false
f.expando = 'foo';
g.expando; // undefined
This is where things get interesting. Or rather - completely nuts. Here we see the dangers of having to deal with two distinct objects - augmenting one of them obviously does not modify the other one. This could be quite troublesome if you decided to employ, say, a caching mechanism and stored something in a property of f, then tried accessing it as a property of g, thinking that it is the same object you're working with.
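In a conforming engine there is only one object, which a small sketch can make visible. Here the inner name g is returned from the function so it can be compared with f from the outside; sameObject and viaInnerName are illustrative names only.

```javascript
var f = function g() {
  // in a conforming engine `g` here is the very function that `f` refers to
  return g;
};
f.expando = 'foo';

var sameObject = f() === f;      // true
var viaInnerName = f().expando;  // "foo", the property set through `f`
```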
Let's look at something a bit more complex.
Example #4: Function declarations are parsed sequentially and are not affected by conditional blocks
var f = function g() {
  return 1;
};
if (false) {
  f = function g(){
    return 2;
  };
}
g(); // 2
An example like this could cause even harder-to-track bugs. What happens here is actually quite simple. First, g is parsed as a function declaration, and since declarations in JScript are independent of conditional blocks, g is declared as the function from the “dead” if branch - function g(){ return 2 }. Then all of the “regular” expressions are evaluated and f is assigned another, newly created function object. The “dead” if branch is never entered when evaluating expressions, so f keeps referencing the first function - function g(){ return 1 }. It should be clear by now that if you're not careful enough and call g from within f, you'll end up calling a completely unrelated g function object.
You might be wondering how all this mess with different function objects compares to arguments.callee. Does callee reference f or g? Let's take a look:
var f = function g(){
  return [
    arguments.callee == f,
    arguments.callee == g
  ];
};
f(); // [true, false]
As you can see, arguments.callee references the same object as the f identifier. This is actually good news, as you will see later on.
Looking at JScript deficiencies, it becomes pretty clear what exactly we need to avoid. First, we need to be aware of the leaking identifier (so that it doesn't pollute the enclosing scope). Second, we should never reference the identifier used as a function name; the troublesome identifier is g in the previous examples. Notice how many ambiguities could have been avoided if we were to forget about g's existence. Always referencing the function via f or arguments.callee is the key here. If you use a named expression, think of that name as something that's only being used for debugging purposes. And finally, a bonus point is to always clean up the extraneous function created erroneously during NFE declaration.
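The second rule can be sketched with a recursive function. The names factorial and factorialImpl are invented for this illustration: the function calls itself through the outer variable, and the function name serves only as a label for debuggers.

```javascript
// `factorial` is the variable we call through;
// `factorialImpl` is only a label that shows up in call stacks
var factorial = function factorialImpl(n) {
  // recurse through the outer variable, never through the function name
  return n <= 1 ? 1 : n * factorial(n - 1);
};

var result = factorial(5); // 120
```

One trade-off worth keeping in mind: if the outer variable is later reassigned, the recursion follows the new value, so this style assumes factorial stays bound to the same function.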
I think the last point needs a bit of an explanation:
JScript memory management
Being familiar with JScript discrepancies, we can now see a potential problem with memory consumption when using these buggy constructs. Let's look at a simple example:
var f = (function(){
  if (true) {
    return function g(){};
  }
  return function g(){};
})();
We know that the function returned from within this anonymous invocation - the one that has the g identifier - is assigned to the outer f. We also know that named function expressions produce a superfluous function object, and that this object is not the same as the returned function. The memory issue here is caused by this extraneous g function being literally “trapped” in the closure of the returned function. This happens because the inner function is declared in the same scope as that pesky g one. Unless we explicitly break the reference to the g function, it will keep consuming memory.
var f = (function(){
  var f, g;
  if (true) {
    f = function g(){};
  }
  else {
    f = function g(){};
  }
  // null `g`, so that it doesn't reference extraneous function any longer
  g = null;
  return f;
})();
Note that we explicitly declare g as well, so that the g = null assignment doesn't create a global g variable in conforming clients (i.e. non-JScript ones). By nulling the reference to g, we allow the garbage collector to wipe off the implicitly created function object that g refers to.
When taking care of the JScript NFE memory leak, I decided to run a simple series of tests to confirm that nulling g actually does free memory.
Tests
The test was simple. It would create 10000 functions via named function expressions and store them in an array. I would then wait for about a minute and check how high the memory consumption was. After that I would null out the reference and repeat the procedure. Here's the test case I used:
function createFn(){
  return (function(){
    var f;
    if (true) {
      f = function F(){
        return 'standard';
      }
    }
    else if (false) {
      f = function F(){
        return 'alternative';
      }
    }
    else {
      f = function F(){
        return 'fallback';
      }
    }
    // var F = null;
    return f;
  })();
}
var arr = [ ];
for (var i=0; i<10000; i++) {
  arr[i] = createFn();
}
Results as seen in Process Explorer on Windows XP SP2 were:
IE6:
without `null`: 7.6K -> 20.3K
with `null`: 7.6K -> 18K
IE7:
without `null`: 14K -> 29.7K
with `null`: 14K -> 27K
The results somewhat confirmed my assumptions - explicitly nulling the superfluous reference did free memory, but the difference in consumption was relatively insignificant. For 10000 function objects, there would be a ~3MB difference. This is definitely something that should be kept in mind when designing large-scale applications, applications that will run either for a long time or on devices with limited memory (such as mobile devices). For any small script, the difference probably doesn't matter.
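For anyone who wants to poke at something similar outside of IE, here is a rough sketch and not the procedure used for the numbers above. It assumes a Node.js-like environment where process.memoryUsage() is available and falls back to zeros otherwise; measure and report are hypothetical names. Heap figures in modern engines are noisy because garbage collection runs at unpredictable times, so treat the delta as an indication at best, and remember that conforming engines never create the extra function object in the first place.

```javascript
function createFn() {
  return (function () {
    var f = function F() {
      return 'standard';
    };
    var F = null; // harmless in conforming engines; in JScript it releases the extra object
    return f;
  })();
}

// hypothetical helper: creates `count` functions and reports a rough heap delta
function measure(count) {
  var hasProcess = typeof process !== 'undefined' &&
                   typeof process.memoryUsage === 'function';
  var before = hasProcess ? process.memoryUsage().heapUsed : 0;
  var arr = [];
  for (var i = 0; i < count; i++) {
    arr[i] = createFn();
  }
  var after = hasProcess ? process.memoryUsage().heapUsed : 0;
  return { created: arr.length, heapDelta: after - before, sample: arr[0]() };
}

var report = measure(10000);
```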
You might think that it's all finally over, but we are not quite there yet :) There's a tiny little detail that I'd like to mention, and that detail is Safari 2.x.
Safari bug
An even less widely known bug with NFE is present in older versions of Safari; namely, the Safari 2.x series. I've seen some claims on the web that Safari 2.x does not support NFE at all. This is not true. Safari does support it, but has bugs in its implementation, which you will see shortly.
When encountering a function expression in a certain context, Safari 2.x fails to parse the program entirely. It doesn't throw any errors (such as SyntaxError ones). It simply bails out:
(function f(){})(); // <== NFE
alert(1); // this line is never reached, since previous expression fails the entire program
After fiddling with various test cases, I came to the conclusion that Safari 2.x fails to parse named function expressions if they are not part of assignment expressions. Some examples of assignment expressions are:
// part of variable declaration
var f = 1;
// part of simple assignment
f = 2, g = 3;
// part of return statement
(function(){
  return (f = 2);
})();
This means that putting a named function expression into an assignment makes Safari “happy”:
(function f(){}); // fails
var f = function f(){}; // works
(function(){
  return function f(){}; // fails
})();
(function(){
  return (f = function f(){}); // works
})();
setTimeout(function f(){ }, 100); // fails
It also means that we can't use such a common pattern as returning a named function expression without an assignment:
// Instead of this non-Safari-2.x-compatible syntax:
(function(){
  if (featureTest) {
    return function f(){};
  }
  return function f(){};
})();
// we should use this slightly more verbose alternative:
(function(){
  var f;
  if (featureTest) {
    f = function f(){};
  }
  else {
    f = function f(){};
  }
  return f;
})();
// or another variation of it:
(function(){
  var f;
  if (featureTest) {
    return (f = function f(){});
  }
  return (f = function f(){});
})();
/*
  Unfortunately, by doing so, we introduce an extra reference to a function
  which gets trapped in a closure of the returned function. To prevent extra memory usage,
  we can assign all named function expressions to one single variable.
*/
var __temp;
(function(){
  if (featureTest) {
    return (__temp = function f(){});
  }
  return (__temp = function f(){});
})();
...
(function(){
  if (featureTest2) {
    return (__temp = function g(){});
  }
  return (__temp = function g(){});
})();
/*
  Note that subsequent assignments destroy previous references,
  preventing any excessive memory usage.
*/
If Safari 2.x compatibility is important, we need to make sure “incompatible” constructs do not even appear in the source. This is of course quite irritating, but it is definitely possible to achieve, especially when you know the root of the problem.
It's also worth mentioning that declaring a function as an NFE in Safari 2.x exhibits another minor glitch, where the function's string representation does not contain the function identifier:
var f = function g(){};
// Notice how function representation is lacking `g` identifier
String(g); // function () { }
This is not really a big deal. As I have already mentioned, function decompilation is something that should not be relied upon anyway.
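For comparison, here is a small sketch of what engines without this glitch report. It assumes an environment where the function's string representation keeps the identifier and where the name property is supported (it was non-standard when these browsers were current and is not available in every old engine); source and label are illustrative variable names. Reading name, where it exists, avoids decompilation altogether.

```javascript
var f = function g() {};

var source = String(f); // in engines without the glitch, the text includes the identifier `g`
var label = f.name;     // "g" where the `name` property is supported
```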
Solution
var fn = (function(){
  // declare a variable to assign function object to
  var f;
  // conditionally create a named function
  // and assign its reference to `f`
  if (true) {
    f = function F(){ }
  }
  else if (false) {
    f = function F(){ }
  }
  else {
    f = function F(){ }
  }
  // Assign `null` to a variable corresponding to a function name
  // This marks the function object (referred to by that identifier)
  // available for garbage collection
  var F = null;
  // return a conditionally defined function
  return f;
})();
Finally, here's how we would apply this “technique” in real life, when writing something like a cross-browser addEvent function:
// 1) enclose declaration with a separate scope
var addEvent = (function(){
  var docEl = document.documentElement;
  // 2) declare a variable to assign function to
  var fn;
  if (docEl.addEventListener) {
    // 3) make sure to give function a descriptive identifier
    fn = function addEvent(element, eventName, callback) {
      element.addEventListener(eventName, callback, false);
    }
  }
  else if (docEl.attachEvent) {
    fn = function addEvent(element, eventName, callback) {
      element.attachEvent('on' + eventName, callback);
    }
  }
  else {
    fn = function addEvent(element, eventName, callback) {
      element['on' + eventName] = callback;
    }
  }
  // 4) clean up `addEvent` function created by JScript
  // make sure to either prepend assignment with `var`,
  // or declare `addEvent` at the top of the function
  var addEvent = null;
  // 5) finally return function referenced by `fn`
  return fn;
})();
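To try the forking logic without a browser, the same pattern can be sketched as a factory that takes the feature-test target as a parameter. makeAddEvent, fakeElement and calls are hypothetical names introduced only for this illustration; the fake object stands in for an element that knows nothing but the legacy attachEvent API.

```javascript
// hypothetical factory: same forking as above, but the object to test is passed in
function makeAddEvent(docEl) {
  var fn;
  if (docEl.addEventListener) {
    fn = function addEvent(element, eventName, callback) {
      element.addEventListener(eventName, callback, false);
    };
  }
  else if (docEl.attachEvent) {
    fn = function addEvent(element, eventName, callback) {
      element.attachEvent('on' + eventName, callback);
    };
  }
  else {
    fn = function addEvent(element, eventName, callback) {
      element['on' + eventName] = callback;
    };
  }
  // clean up the extra `addEvent` function JScript would create in this scope
  var addEvent = null;
  return fn;
}

// fake "element" that only supports attachEvent and records what it was asked for
var calls = [];
var fakeElement = {
  attachEvent: function (name, callback) { calls.push(name); }
};

var addEvent = makeAddEvent(fakeElement);
addEvent(fakeElement, 'click', function () {});
// `calls` now holds the single entry "onclick"
```

Nulling the local addEvent does not affect the returned function, since fn holds its own reference.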
Alternative solution
It's worth mentioning that there actually exist alternative ways of having descriptive names in call stacks, ways that don't require one to use named function expressions. First of all, it is often possible to define a function via a declaration, rather than via an expression. This option is only viable when you don't need to create more than one function:
var hasClassName = (function(){
  // define some private variables
  var cache = { };
  // use function declaration
  function hasClassName(element, className) {
    var _className = '(?:^|\\s+)' + className + '(?:\\s+|$)';
    var re = cache[_className] || (cache[_className] = new RegExp(_className));
    return re.test(element.className);
  }
  // return function
  return hasClassName;
})();
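A quick way to exercise this version is to pass a plain object in place of a DOM element, since the function only reads the className property. The object el and the result variables below are illustrative stand-ins, not part of the original example.

```javascript
var hasClassName = (function () {
  var cache = {};
  function hasClassName(element, className) {
    var pattern = '(?:^|\\s+)' + className + '(?:\\s+|$)';
    var re = cache[pattern] || (cache[pattern] = new RegExp(pattern));
    return re.test(element.className);
  }
  return hasClassName;
})();

// plain object standing in for a DOM element
var el = { className: 'menu item active' };

var whole = hasClassName(el, 'item');  // true: matches a whole class name
var partial = hasClassName(el, 'act'); // false: partial names do not match
```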
This obviously wouldn't work when forking function definitions. Nevertheless, there's an interesting pattern that I first saw used by Tobie Langel. It works by defining all functions upfront using function declarations, but giving them slightly different identifiers:
var addEvent = (function(){
  var docEl = document.documentElement;
  function addEventListener(){
    /* ... */
  }
  function attachEvent(){
    /* ... */
  }
  function addEventAsProperty(){
    /* ... */
  }
  if (typeof docEl.addEventListener != 'undefined') {
    return addEventListener;
  }
  else if (typeof docEl.attachEvent != 'undefined') {
    return attachEvent;
  }
  return addEventAsProperty;
})();
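The elided bodies are left as they are in the example above; the sketch below fills them in with the obvious implementations purely as an assumption, and wraps the pattern in a hypothetical makeAddEvent factory so it can run against fake objects instead of document.documentElement.

```javascript
// hypothetical factory; the function bodies are assumed, not taken from the original
function makeAddEvent(docEl) {
  function addEventListener(element, eventName, callback) {
    element.addEventListener(eventName, callback, false);
  }
  function attachEvent(element, eventName, callback) {
    element.attachEvent('on' + eventName, callback);
  }
  function addEventAsProperty(element, eventName, callback) {
    element['on' + eventName] = callback;
  }
  if (typeof docEl.addEventListener != 'undefined') {
    return addEventListener;
  }
  else if (typeof docEl.attachEvent != 'undefined') {
    return attachEvent;
  }
  return addEventAsProperty;
}

// fake feature-test targets, one per branch
var modern = makeAddEvent({ addEventListener: function () {} });
var legacy = makeAddEvent({ attachEvent: function () {} });
var bare = makeAddEvent({});

// the fallback implementation simply assigns an `on...` property
var target = {};
var handler = function () {};
bare(target, 'click', handler);
```

Each returned function carries the identifier of the implementation that was picked, which is what a debugger would show in the call stack.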
While it's an elegant approach, it has its own drawbacks. First, by using different identifiers, you lose naming consistency. Whether that's a good or a bad thing is not very clear. Some might prefer to have identical names, while others wouldn't mind varying ones; after all, different names can often “speak” about the implementation used. For example, seeing “attachEvent” in a debugger would let you know that it is an attachEvent-based implementation of addEvent. On the other hand, an implementation-related name might not be meaningful at all. If you're providing an API and name “inner” functions in such a way, the user of the API could easily get lost in all of these implementation details.
A solution to this problem might be to employ a different naming convention. Just be careful not to introduce extra verbosity. Some alternatives that come to mind are:
`addEvent`, `altAddEvent` and `fallbackAddEvent`
// or
`addEvent`, `addEvent2`, `addEvent3`
// or
`addEvent_addEventListener`, `addEvent_attachEvent`, `addEvent_asProperty`
Another minor issue with this pattern is increased memory consumption. By defining all of the function variations upfront, you implicitly create N-1 unused functions. As you can see, if attachEvent is found in document.documentElement, then neither addEventListener nor addEventAsProperty is ever really used. Yet they already consume memory; memory which is never deallocated for the same reason as with JScript's buggy named expressions - both functions are “trapped” in the closure of the returned one.
This increased consumption is of course hardly an issue. If a library such as Prototype.js were to use this pattern, there would be no more than 100-200 extra function objects created. As long as functions are not created in such a way repeatedly (at runtime) but only once (at load time), you probably shouldn't worry about it.