
A Deep Dive into PHP References (References in PHP)

黄规速
Published 2022-04-14 20:23:04


To understand PHP references in depth, I found an article by a foreign author: http://derickrethans.nl/talks/phparch-php-variables-article

Much of the content is best read in the original English; a translation sometimes fails to convey the meaning.

Basics

PHP variables are stored inside the Zend engine, and every PHP variable has a corresponding zval. The zval struct is defined in Zend/zend.h and looks like this:

typedef struct _zval_struct zval;
struct _zval_struct {
    /* Variable information */
    zvalue_value value;     /* The value of the variable */
    zend_uint refcount__gc; /* Reference count */
    zend_uchar type;        /* Concrete type of the variable */
    zend_uchar is_ref__gc;  /* Reference flag: 1 = reference, 0 = not a reference */
};

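The refcount mentioned throughout the rest of this article is the refcount__gc field (the name reflects the garbage collection mechanism introduced in PHP 5.3).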

PHP's handling of variables can be non-obvious at times. Have you ever wondered what happens at the engine level when a variable is copied to another? How about when a function returns a variable "by reference"? If so, read on.


Every computer language needs some form of container to hold data: variables. In some languages, those variables have a specific type attached to them. They can be a string, a number, an array, an object or something else. Examples of such statically-typed languages are C and Pascal. Variables in PHP do not have this specific restraint. They can be a string in one line, but a number in the next line. Converting between types is also easy to do, and often even automatic. These loosely-typed variables are one of the properties that make PHP such an easy and powerful language, although they can sometimes also cause interesting problems.

Internally, in PHP, those variables are all stored in a similar container, called a zval container (also called "variable container"). This container keeps track of several things that are related to a specific value. The most important things that a variable container contains are the value of the variable and its type. Python is similar to PHP in this regard, as it also labels each variable with a type. The variable container contains a few more fields that the PHP engine uses to keep track of whether a value is a reference or not. It also keeps a reference count of its value. Variables are stored in a symbol table, which is quite analogous to an associative array. This array has keys that represent the names of the variables, and those keys point to variable containers that contain the value (and type) of the variables. See Figure 1 for an example of this.

In short, variables are stored in a symbol table that resembles an associative array.
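The associative-array analogy can be seen from userland. The following is a minimal sketch, not taken from the original article: the function name show_symbol_table() and its variables are made up for illustration, and get_defined_vars() is used only to expose the current scope as a name-to-value array.

```php
<?php
// Hypothetical helper: returns the function's local symbol table as an array.
function show_symbol_table(): array
{
    $name = "php";
    $count = 3;

    // get_defined_vars() returns the variables of the current scope,
    // keyed by variable name, much like the symbol table described above.
    return get_defined_vars();
}

$table = show_symbol_table();
print_r($table);
```

Each key in the returned array is a variable name and each value is what the corresponding container holds.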


1. Reference Counting

PHP tries to be smart when it deals with copying variables, as in $a = $b. Using the = operator is also called an "assign-by-value" operation. While assigning by value, the PHP engine will not actually create a copy of the variable container, but it will merely increase the refcount__gc field in the variable container. As you can imagine, this saves a lot of memory in case you have a large string of text or a large array. Figure 2 shows how this "looks".

In step 1 there is one variable, $a, which contains the text "this is" and has (by default) a reference count of 1.

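In step 2, we assign variable $a to variables $b and $c. No copy of the variable container is made; only the refcount is increased, so it ends up at 3.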

In step 3, you might wonder what would happen if the variable $c gets changed. Two things might happen, depending on the value of the refcount. If the value is 1, then the container simply gets updated with its new value (and possibly its type, too). In case the refcount value is larger than 1, a new variable container gets created containing the new value (and type). You can see this in step 3 of Figure 2. The refcount value for the variable container that is linked to the variable $a is decreased by one, so that the variable container that belongs to variables $a and $b now has a refcount of 2, and the newly created container has a refcount of 1.

In step 4, when unset() is called on a variable, the refcount value of the variable container that is linked to that variable is decreased by one. This happens when we call unset($b) in step 4. If the refcount value drops below 1, the PHP engine will free the variable container.

In step 5, the variable container is then destroyed.
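The steps above can be followed from userland with a small script. This is a sketch of the behaviour rather than the article's Figure 2 listing (which is not reproduced here); it only checks the observable values, because refcount and is_ref are engine internals that a plain script does not see directly.

```php
<?php
// Step 1: one variable holding a string.
$a = "this is";

// Step 2: assign by value twice; no visible difference to the script.
$b = $a;
$c = $a;

// Step 3: writing to $c must leave $a and $b untouched (copy on write).
$c = 42;

// Step 4: unset() removes only the name $b; the value $a uses survives.
unset($b);

var_dump($a, $c, isset($b));
```

Whatever the engine does internally with the containers, the script sees $a keep its string while $c holds the new value.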

2. Passing Variables to Functions

Besides the global symbol table that every script has, every call to a user defined function creates a symbol table where a function locally stores its variables. Every time a function is called, such a symbol table is created, and every time a function returns, this symbol table is destroyed. A function returns by either using the return statement, or by implicitly returning because the end of the function has been reached.

In Figure 3, I illustrate exactly how variables are passed to functions.

In step 1, we assign a value to the variable $a, again "this is". We pass this variable to the do_something() function, where it is received in the variable $s.

In step 2, you can see that it is practically the same operation as assigning a variable to another one (like we did in the previous section with $b = $a), except that the variable is stored in a different symbol table (the one that belongs to the called function) and that the reference count is increased twice instead of the normal once. The reason for this is that the function's stack also contains a reference to the variable container.

In step 3, when we assign a new value to the variable $s, the refcount of the original variable container is decreased by one and a new variable container is created, containing the new value.

In step 4, we return the variable with the return statement. The returned variable gets an entry in the global symbol table and the refcount value is increased by 1. When the function ends, the function's symbol table will be destroyed. During the destruction, the engine will go over all variables in the symbol table and decrease the refcount of each variable container. When the refcount of a variable container reaches 0, the variable container is destroyed. As you see, the variable container is again not copied when returning it from the function, due to PHP's reference counting mechanism. If the variable $s had not been modified in step 3, then variables $a and $b would still point to the same variable container, which would have a refcount value of 2. In this situation, a copy of the variable container that was created with the statement $a = "this is" would not have been made.
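A script along the lines of the one Figure 3 describes might look like the sketch below. The function body and the string "was" are assumptions made for illustration; only the names do_something(), $a, $b and $s come from the text.

```php
<?php
// Receives $s by value: the caller's variable is never modified.
function do_something($s)
{
    $s = "was";   // step 3: only the local $s gets a new value
    return $s;    // step 4: the value is handed back to the caller
}

$a = "this is";
$b = do_something($a);

var_dump($a, $b);
```

The caller's $a is unchanged after the call, while $b receives the value that was assigned inside the function.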

3. Introducing References

References are a method of having two names for the same variable. A more technical description would be: references are a method of having two keys in a symbol table pointing to the same zval container. References can be created with the reference assignment operator =&.

Figure 4 gives a schematic overview of how references work in combination with reference counting.

In step 1, we create a variable $a that contains the string "this is".

In step 2, we create two references ($b and $c) to the same variable container. The refcount increases normally for each assignment, making the final refcount 3 after both assignments by reference ($b =& $a and $c =& $a). Because the reference assignment operator was used, the is_ref value is now set to 1.

This value is important for two reasons. The second one I will divulge a little bit later in this article, and the first reason that makes this value important is when we are reassigning a new value to one of the three variables that all point to the same variable container. If the is_ref value is set to 0 when a new value is set for a specific variable, the PHP engine will create a new variable container, as you could see in step 3 of Figure 2. But if the is_ref value is set to 1, then the PHP engine will not create a new variable container and will simply update the value to which the variable names point, as you can see in step 2 of Figure 4.

In step 3, the exact same result would be reached if the statement $a = 42 was used instead of $b = 42. After the variable container is modified, all three variables $a, $b and $c see the new value.

In step 4, we use the unset() language construct to remove a variable, in this case variable $c. Using unset() on a variable means that the refcount value of the variable container that the variable points to gets decreased by 1. This works exactly the same for referenced variables. There is one difference, though, that shows in step 5.

In step 5, when the reference count of a variable container reaches 1 and the is_ref value is set to 1, the is_ref value is reset to 0. The reason for this is that a variable container can only be marked as a referenced variable container when there is more than one variable pointing to the variable container.
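The reference steps can be tried out with the following sketch. It mirrors the statements named in the text ($b =& $a, $c =& $a, $b = 42, unset($c)); the extra $seen array and the final assignment are additions made here purely to make the effect visible.

```php
<?php
// Step 1: a plain variable.
$a = "this is";

// Step 2: two more names for the same container.
$b =& $a;
$c =& $a;

// Step 3: a write through any of the names is visible through all of them.
$b = 42;
$seen = [$a, $b, $c];

// Step 4: unset() removes the name $c only; $a and $b stay linked.
unset($c);
$a = "changed";

var_dump($seen, isset($c), $b);
```

After unset($c), the remaining two names still share one value, so the later write to $a is also seen through $b.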

4. Mixing Assign-by-Value and Assign-by-Reference

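Mixing the two kinds of assignment does not save memory; it actually uses more, because after the reference assignment a separate block of memory has to be allocated for the referenced variable. Something interesting, and perhaps unexpected, happens if you mix an assign-by-value call and an assign-by-reference call. This shows in Figure 5.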

In step 1, we create two variables $a and $b, where $b is assigned by value from $a. This creates a situation where there is one variable container with is_ref set to 0 and refcount set to 2. This should be familiar by now. In step 2, we proceed by assigning variable $c by reference to variable $b. Here, the PHP engine will create a copy of the variable container. The variable $a keeps pointing to the original variable container, but the refcount is, of course, decreased to 1, as there is only one variable pointing to this variable container now. The variables $b and $c now point to the copied container, which has is_ref set to 1 and a refcount of 2.

You can see that in this case, using a reference does not save you any memory. It actually uses more memory, as it had to duplicate the original variable container. The container had to be copied, otherwise the PHP engine would have no way of knowing how to deal with the reassignment of one of the three variables, as two of them ($b and $c) were references to the same container while the other was not supposed to be a reference. If there is only one container with refcount set to 3 and is_ref set to 1, then it is impossible to figure that out. That is the reason why the PHP engine needs to create a copy of the container when you do an assignment-by-reference.

If we switch the order of assignments—first we assign a by reference to b and then we assign a by value to c—then something similar happens. Figure 6 shows how this is handled.

In step 1, we assign the string "this is" to the variable $a and then we proceed to assign $a by reference to variable $b. We now have one variable container where is_ref is 1 and refcount is 2. In step 2, we assign variable $a by value to variable $c. Now a copy of the variable container is made in order for the PHP engine to be able to handle modifications to the variables correctly, for the same reasons as stated in the previous paragraph. But if you go back to step 2 of Figure 2, where we assign the variable $a to both $b and $c, you see that no copy is made there.
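Both orders of assignment can be checked with a short sketch. The variable names $x, $y and $z in the second half are introduced here only to keep the two cases apart; the text itself reuses $a, $b and $c. The script shows which variables end up linked, not how much memory is used.

```php
<?php
// Figure 5 order: by value first, then by reference.
$a = "this is";
$b = $a;       // assign by value
$c =& $b;      // $b and $c now share one container; $a keeps its own
$c = 42;       // changes $b and $c, but not $a
$first = [$a, $b, $c];

// Figure 6 order: by reference first, then by value.
$x = "this is";
$y =& $x;      // $x and $y share one container
$z = $x;       // assign by value: $z is independent of the pair
$y = 42;       // changes $x and $y, but not $z
$second = [$x, $y, $z];

var_dump($first, $second);
```

In both cases the by-value variable keeps the original string, which is only possible if it is backed by its own container.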

5. Passing References to Functions

Variables can also be passed-by-reference to functions. This is useful when a function needs to modify the value of a specific variable when it is called. The script in Figure 7 is a slightly modified version of the script that you have already seen in Figure 3.

The only difference is the ampersand (&) in front of the $s parameter in the function declaration.

The return $s; statement is basically the same as the $c = $a statement in step 2 of Figure 6. The global variable $a and the local variable $s are both references to the same variable container, and the logic dictates that if is_ref is set to 1 for a specific container and this container is assigned to another variable by value, the container needs to be duplicated. This is exactly what happens here, except that the newly created variable is created in the global symbol table by the assignment of the return value of the function with the statement $b = do_something($a).
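A by-reference variant of the earlier function could look like the sketch below. As before, the function body and the literal strings are assumptions for illustration; the point is that $a and $s are linked during the call, while the returned value assigned to $b is an independent copy.

```php
<?php
// The ampersand makes $s a reference to the caller's variable.
function do_something(&$s)
{
    $s = "was";   // modifies the caller's $a through the reference
    return $s;    // returned by value, not by reference
}

$a = "this is";
$b = do_something($a);

// $b holds its own value: changing it does not touch $a.
$b = "other";

var_dump($a, $b);
```

The call changes $a, and the later write to $b leaves $a alone, matching the duplication described for step 2 of Figure 6.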

6. Returning by Reference

Another feature in PHP is the ability to "return by reference". This is useful, for example, if you want to select a variable for modification with a function, such as selecting an array element or a node in a tree structure. In Figure 8 we show how returning by reference works by means of an example.

In step 1, we define a $tree variable (which is actually not a tree, but a simple array) that contains three elements. The three elements have key values of 1, 2 and 3, and all of them point to a string describing the English word that matches the key's value (i.e. one, two and three).

In step 2, this array gets passed to the find_node() function by reference, along with the key of the element that the find_node() function should look for and return. We need to pass by reference here, otherwise we cannot return a reference to one of the elements, as we would be returning a reference to a copy of $tree. When $tree is passed to the function it has a refcount of 3 and is_ref is set to 1. Nothing new here.

In step 3, the first statement in the function, $item =& $tree[$key], causes a new variable to be created in the symbol table of the function, which points to the array element where the key is "3" (because the variable $key is set to 3). Step 3 shows the creation of the $item variable.

In step 4, the interesting things happen: we return $item (by reference) back to the calling scope and assign it (by reference) to $node. This causes the refcount of the variable container to which the third array key points to be set to 3. At this point $tree[3], $item (from the function's scope) and $node all point to this variable container.

In step 5, when the symbol table of the function is destroyed, the refcount value decreases from 3 to 2. $node is now a reference to the third element in the array. If the variable $node had not been assigned by reference to the return value of the find_node() function, but had instead been assigned by value, then $node would not have been a reference to $tree[3]. In this case, the refcount value of the variable container to which $tree[3] points is 1 after the function ends, but for some strange reason the is_ref value is not reset to 0 as you might expect. My tests did not find any problems with this, though, in this simple example. If find_node() had not been a "return-by-reference function", then again the $node variable would not be a reference to $tree[3]. In this case, the is_ref value of the $tree variable container would have been reset to 0.

In step 6, finally, we modify the value in the variable container to which both $node and $tree[3] point.
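The walkthrough above corresponds to a script roughly like the following sketch. The parameter order of find_node(), the new string values and the extra $copy variable are assumptions; the names $tree, $item, $key, $node and find_node() are taken from the text.

```php
<?php
// Step 1: a simple array standing in for a tree.
$tree = [1 => 'one', 2 => 'two', 3 => 'three'];

// The leading & makes this a return-by-reference function;
// $tree must also be passed by reference, or we would bind to a copy.
function &find_node($key, array &$tree)
{
    $item =& $tree[$key];   // step 3: $item is bound to the array element
    return $item;           // step 4: returned by reference
}

// Accept the reference with =& so $node is bound to $tree[3].
$node =& find_node(3, $tree);
$node = 'new value';        // step 6: also changes $tree[3]

// Assigning by value instead yields an independent copy.
$copy = find_node(2, $tree);
$copy = 'changed copy';     // $tree[2] is unaffected

var_dump($tree);
```

Only the element reached through =& is modified; the element fetched by plain assignment keeps its original string.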

Please do note that it is harmful not to accept a reference from a function that returns a reference. In some cases, PHP will get confused and cause memory corruptions which are very hard to find and debug. It is also not a good idea to return a static value as a reference, as the PHP engine has problems with that too. In PHP 4.3, both cases can lead to very hard to reproduce bugs and crashes of PHP and the web server. In PHP 5, this all works a little bit better. Here you can expect a warning and it will behave "properly". Hopefully, a backported fix for this problem makes it into a new minor version of PHP 4 (PHP 4.4).

7. The Global Keyword

PHP has a feature that allows the use of a global variable inside a function: you can make this connection with the global keyword. This keyword will create a reference between the local variable and the global one. Figure 9 shows this in an example.

In steps 1 and 2, we create the variable $var and call the function update_var() with the string literal "one" as the sole parameter. At this point, we have two variable containers. The first one is pointed to from the global variable $var, and the second one is the $val variable in the called function. The latter variable container has a refcount value of 2, as both the variable on the stack and the local variable $val point to it.

In step 3, the global $var statement in the function creates a new variable in the local scope, which is created as a reference to the variable with the same name in the global scope. As you can see in step 3, this increases the refcount of the variable container from 1 to 2 and also sets the is_ref value to 1.

In step 4, we unset the variable $var. Against some people's expectation, the global variable $var does not get unset, as the unset() was done on a reference to the global variable $var and not on that variable itself.

In step 5, to reestablish the reference, we employ the global keyword again. As you can see, we have re-created the same situation as in step 3. Instead of using global $var we could just as well have used $var =& $GLOBALS['var'].

In step 6, we go on to assign the function's $val argument to the $var variable. This changes the value to which both the global variable $var and the local variable $var point; this is what you would expect from a referenced variable.

In step 7, when the function ends, the reference from the variable in the scope of the function disappears, and we end up with one variable container with a refcount of 1 and an is_ref value of 0.
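The sequence of steps can be reproduced with a sketch like the one below, meant to be run as a top-level script. The initial value "zero" and the $still_set bookkeeping are assumptions added here so the effect of unset() can be observed; update_var(), $var and $val follow the text.

```php
<?php
// Step 1: a global variable (the initial value is made up for this sketch).
$var = "zero";

function update_var($val)
{
    global $var;                          // step 3: local $var references the global
    unset($var);                          // step 4: removes the local name only
    $still_set = isset($GLOBALS['var']);  // the global itself is untouched
    global $var;                          // step 5: re-establish the reference
    $var = $val;                          // step 6: writes through to the global
    return $still_set;
}

// Step 2: call with the string literal "one".
$still_set = update_var("one");

var_dump($still_set, $var);
```

The unset() inside the function leaves the global in place, and the final assignment is visible in the global scope after the function returns.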

8. Abusing References

In this section, I will give a few examples that show you how references should not be used—in some cases these examples might even create memory corruptions in PHP 4.3 and lower.

Example 1: “Returning static values by-reference”. In Figure 10, we have a very small script with a return-by-reference function called definition().

This function simply returns an array that contains some elements. Returning by reference makes no sense here, as the exact same things would happen internally if the variable container holding the array was returned by value, except that in the intermediate step (step 3) the is_ref value of the container would not be set to 1, of course. In case the $def variable in the function's scope would have been referenced by another variable, something that might happen in a class method where you do $def =

Example 2: "Accepting references from a function that doesn't return references". This is potentially dangerous; PHP 4.3 (and lower) does not handle this properly. In Listing 1, you see an example of something that is not going to work properly.

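<?php
// Anti-example: references used in a mistaken attempt to avoid copies.
function &split_list($emails)
{
    // preg_split() does not return by reference, so =& gains nothing here.
    $emails =& preg_split("/[,;]/", $emails);
    return $emails;
}

$emails = split_list('derick@php.net;derick@derickrethans.nl;dr@ez.no');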

This function was implemented with performance in mind, trying not to copy variable containers by using references. As you should know after reading this article, this is not going to buy you anything. There are a few reasons why it doesn't work. The first reason is that the PHP internal function preg_split() does not return by reference; actually, no internal function in PHP can return anything by reference. So, assigning the return value by reference from a function that doesn't return a reference is pointless. The second reason why there is no performance benefit here is the same one as in Example 1: you're returning a static value, not a reference to a variable, so it does not make sense to make the split_list() function return by reference.
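For comparison, a version without any references might look like the sketch below. The type declarations are an addition for clarity and are not part of the original listing; the function simply returns what preg_split() produces and leaves copy avoidance to the engine.

```php
<?php
// Plain by-value version: no & on the function, no =& on the assignment.
function split_list(string $emails): array
{
    // preg_split() returns a new array; just hand it back to the caller.
    return preg_split('/[,;]/', $emails);
}

$emails = split_list('derick@php.net;derick@derickrethans.nl;dr@ez.no');

print_r($emails);
```

This keeps the same result as the listing above while avoiding the pattern the article warns about.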

9. Conclusion

After reading this article, I hope that you now fully understand how references, refcounting, and variables work in PHP. It should also have explained that assigning by reference does not always save you memory; it's better to let the PHP engine handle this optimization. Do not try to outsmart PHP yourself here, and only use references when they are really needed. In PHP 4.3, there are still some problems with references, for which patches are in the works. These patches are backports from PHP 5-specific code, and although they work fine, they will break binary compatibility, meaning that compiled extensions no longer work after those patches are put into PHP. In my opinion, those hard to reproduce memory corruption errors should be fixed in PHP 4 too, though, so perhaps this creates the need for a PHP 4.4 release. If you're having problems, you can try to use the patch located at http://files.derickrethans.nl/patches/ze1-return-refrence-20050429.diff.txt

The PHP Manual also has some information on references, although it does not explain the internals very well. The URL for the section in PHP's Manual is http://php.net/language.references

Originally published: 2012-04-13
