
Jun 1, 2006

What the heck is TSRMLS_CC, anyway?

If you've ever worked on the PHP internals or built an extension, you've seen this construct floating around here and there, but no one ever talks about it. Those who know what this is typically answer questions from those who don't with "Don't worry about what it is, just use it here here here and here. And if the compiler says you're missing a tsrm_ls, put it there too..." This isn't laziness on the part of the person answering the question (okay, maybe it is a little bit), it's just that the engine goes so far out of its way to simplify what this magic value does that there's no profit in a new extension developer knowing the mechanics of it. The information is like a cow's opinion: it doesn't matter, it's Moo.


Since I love to listen to myself rattle on about pointless topics (and I haven't blogged much this month), I thought I'd cover this topic and see if anyone manages to stay awake through it. You can blame Lukas, he got me rolled onto planet-php.net...

Glossary

TSRM
Thread Safe Resource Manager - This is an oft-overlooked and seldom (if ever) discussed layer hiding in the /TSRM directory of your friendly neighborhood PHP source code bundle. By default, the TSRM layer is only enabled when compiling a SAPI which requires it (e.g. apache2-worker). All Win32 builds have this layer enabled regardless of SAPI choice.


ZTS
Zend Thread Safety - Often used synonymously with the term TSRM. Specifically, ZTS is the term used by ./configure (--enable-experimental-zts for PHP4, --enable-maintainer-zts for PHP5), and the name of the #define'd preprocessor token used inside the engine to determine if the TSRM layer is being used.


tsrm_ls
TSRM local storage - This is the actual variable name being passed around inside the TSRMLS_* macros when ZTS is enabled. It acts as a pointer to the start of that thread's independent data storage block, which I'll cover in just a minute.


TSRMLS_??
A quartet of macros designed to make the differences between ZTS and non-ZTS mode as painless as possible. When ZTS is not enabled, all four of these macros evaluate to nothing. When ZTS is enabled however, they expand out to the following definitions:
  • TSRMLS_C tsrm_ls
  • TSRMLS_D void ***tsrm_ls
  • TSRMLS_CC , tsrm_ls
  • TSRMLS_DC , void ***tsrm_ls

Globals

In any normal C program (just like in PHP) you have two ways to give two different functions access to the same block of data. One is to pass the value on the parameter stack, like so:

#include <stdio.h>

void output_func(char *message)
{
    printf("%s\n", message);
}

int main(int argc, char *argv[])
{
    output_func(argv[0]);

    return 0;
}

Alternately, you could store the value in a variable up in the global scope and let the function access it there:

#include <stdio.h>

char *message;

void output_func(void)
{
    printf("%s\n", message);
}

int main(int argc, char *argv[])
{
    message = argv[0];
    output_func();

    return 0;
}

Both approaches have their merits and drawbacks and typically you'll see some combination of the two used in a real application. Indeed, PHP is covered in global variables from resource type identifiers, to function callback pointers, to request specific information such as the symbol tables used to store userspace variables. Attempting to pass these values around in the parameter stack would be more than unruly, it'd be impossible for an application like PHP where it's often necessary to register callbacks with external libraries which don't support context data.


So common information, like the execution stack, the function and class tables, and extension registries all sit up in the global scope where they can be picked up and used at any point in the application. For single-threaded SAPIs like CLI, Apache1, or even Apache2-prefork, this is perfectly fine. Request specific structures are initialized during the RINIT/Activation phase, and reset back to their original values during the RSHUTDOWN/Deactivation phase in preparation for the next request. A given webserver like Apache1 can serve up multiple pages at once because it spawns multiple processes, each in its own process space with its own independent copy of global data.


Now let's introduce threaded webservers like Apache2-worker, or IIS. Under these conditions, only one process space is active at a given time with multiple threads spun off. Each of these threads then acts in the same manner as a single-threaded process might, servicing requests one at a time as they are dispatched. The trouble starts to brew as two or more threads try to service a request at the same time. Each thread wants to use the global scope to store its request-specific information, and tries to do so by writing to the same storage space. At the least, this would result in userspace variables declared in one script showing up in another. In practice, it leads to quick and disastrous segfaults and completely unpredictable behavior as memory is double freed or written with conflicting information by separate threads.


Non-Global Globals

The solution is to require the engine, the core, and any extension using global storage to determine how much memory will be used by request-specific data. Then, at the spin-up of each new thread, allocate a chunk of memory for each of these players to store their data into, thus giving each thread its own local storage. In order to group all the individual chunks used by a given thread together, one last vector of pointers is allocated to store the individual sub-structure pointers into. It's the pointer to this vector which is passed around as the tsrm_ls variable by the TSRMLS_* family of macros. To see how this works, let's look at an example extension:


typedef struct _zend_myextension_globals {
    int foo;
    char *bar;
} zend_myextension_globals;

#ifdef ZTS
int myextension_globals_id;
#else
zend_myextension_globals myextension_globals;
#endif

/* Triggered at the beginning of a thread */
static void php_myextension_globals_ctor(zend_myextension_globals *myext_globals TSRMLS_DC)
{
    myext_globals->foo = 0;
    myext_globals->bar = NULL;
}

/* Triggered at the end of a thread */
static void php_myextension_globals_dtor(zend_myextension_globals *myext_globals TSRMLS_DC)
{
    if (myext_globals->bar) {
        efree(myext_globals->bar);
    }
}

PHP_MINIT_FUNCTION(myextension)
{
#ifdef ZTS
    ts_allocate_id(&myextension_globals_id, sizeof(zend_myextension_globals),
        php_myextension_globals_ctor, php_myextension_globals_dtor);
#else
    php_myextension_globals_ctor(&myextension_globals TSRMLS_CC);
#endif

    return SUCCESS;
}

PHP_MSHUTDOWN_FUNCTION(myextension)
{
#ifndef ZTS
    php_myextension_globals_dtor(&myextension_globals TSRMLS_CC);
#endif

    return SUCCESS;
}

Here you can see the extension declaring its global requirements to the TSRM layer by stating that it needs sizeof(zend_myextension_globals) bytes of storage, and providing callbacks to use when initializing (or destroying) a given thread's local storage. The value populated into myextension_globals_id represents the offset (common to all threads) into the tsrm_ls vector where the pointer to that thread's local storage can be found. In the event that ZTS is not enabled, the data storage is simply placed into the true global scope and the thread initialization and shutdown routines are called manually during the Module's Startup and Shutdown phases. If you're wondering why TSRMLS_CC was included in the non-ZTS blocks, then I clearly haven't made you fall asleep yet. Those aren't needed there since we know they evaluate to nothing, but it helps encourage good habits to include them anywhere the function's prototype calls for them.


Putting it all together

The final piece of this thread-safe puzzle comes from the question: "How do I access data in these structures?" And the answer to that question comes in the form of another familiar looking macro. Each extension or core component defines, in one of its header files, a macro which looks something like the following:


#ifdef ZTS
# define MYEXTENSION_G(v) \
(((zend_myextension_globals*)(*((void ***)tsrm_ls))[(myextension_globals_id)-1])->v)
#else
# define MYEXTENSION_G(v) (myextension_globals.v)
#endif

Thus, when ZTS is not enabled, this macro simply plucks the right value out of the immediate value in the global scope; otherwise it uses the ID to locate the thread's local storage copy of the structure and dereferences the value from there.


Wanna know more, like how to deal with foreign callbacks where tsrm_ls isn't available? Buy my book!

May 24, 2006

Compiled Variables

Last month at php|tek I gave a presentation on "How PHP Ticks" where I covered, among other things, the process of compiling source code into opcodes (an intermediate pseudo-language similar to what Java calls "bytecode" or what .NET calls "MSIL"). As part of this section of the presentation, I showed one of the more interesting changes between ZE 2.0 (PHP 5.0) and ZE 2.1 (PHP 5.1), namely: How variables are retrieved and used in an operation. More specifically, how they provide a small, yet cumulative, speedup to applications in a way that's transparent to the end-user -- One more reason to like PHP 5.1, right?


After listening to Marcus Whitney's interview with Brion Vibber of WikiMedia in which he mentions my presentation and makes reference to this engine change, I realized that I should clarify what this feature is (and more importantly, what it isn't) before any FUD spreads.

What Compiled Variables (CVs) are

First, let's look at the anatomy of an OpArray. Say you have the following simple script:


<?php
$a = 123;
$b = 456;
$c = $a + $b;
echo $c;

Now let's see how ZE 2.0 (PHP5.0) compiles this (ZE1.x/PHP4.x comes out to nearly identical opcodes). The $0 and ~0 references you see are (for lack of a better one sentence explanation) types of temporary variables (the latter more so than the former, but don't worry about the distinction right now). What's important to know about this block and its statements is in the accompanying comments:

FETCH_W   $0, 'a'       /* Retrieve the $a variable for writing */
ASSIGN    $1, $0, 123   /* Assign the numeric value 123 to retrieved variable 0 */
FETCH_W   $2, 'b'       /* Retrieve the $b variable for writing */
ASSIGN    $3, $2, 456   /* Assign the numeric value 456 to retrieved variable 2 */
FETCH_R   $5, 'a'       /* Retrieve the $a variable for reading */
FETCH_R   $6, 'b'       /* Retrieve the $b variable for reading */
ADD       ~7, $5, $6    /* Add the retrieved variables (5 & 6) together and store the result in 7 */
FETCH_W   $4, 'c'       /* Retrieve the $c variable for writing */
ASSIGN    $8, $4, ~7    /* Assign the value in temporary variable 7 into retrieved variable 4 */
FETCH_R   $9, 'c'       /* Retrieve the $c variable for reading */
ECHO      $9            /* Echo the retrieved variable 9 */

Seems like a lot of work for one simple addition? It is. Here's the same code snippet compiled by ZE 2.1/PHP 5.1 (or later).

ASSIGN    $0, !0, 123   /* Assign the numeric value 123 to compiled variable 0 */
ASSIGN    $1, !1, 456   /* Assign the numeric value 456 to compiled variable 1 */
ADD       ~2, !0, !1    /* Add compiled variable 0 to compiled variable 1 */
ASSIGN    $3, !2, ~2    /* Assign the value of temporary variable 2 to compiled variable 2 */
ECHO      !2            /* Echo the value of compiled variable 2 */

These !0 variables refer to a new structure in the execution stack which stores references to the "real" variables out in userspace. The hash value for each variable is computed at compile time (meaning that it's only done once per variable no matter how often it's used and that opcode caches save this work from being done during subsequent page views at all). The first time one of these CVs is used, the engine looks it up in the active symbol table and updates the CV cache to know where it is. All subsequent uses of that compiled variable use that pre-fetched address and don't have to look it up again. On an individual lookup, this isn't a major leap forward in speed, however consider a for loop where the test value is checked on every iteration. To put it in PHP terms, which would you rather do?

for ($i = 0; $i < $limit; $i++) {
    $foo = lookup_variable('foo');
    $foo->increment();
    $foo = lookup_variable('foo');
    $foo->check_value();
}

or

$foo = lookup_variable('foo');
for ($i = 0; $i < $limit; $i++) {
    $foo->increment();
    $foo->check_value();
}


What compiled variables are not

Don't assume you're going to get a speedup on all of your code, especially if you use arrays or objects (which most code taking advantage of PHP5's new features does). The CV speedup has one minor Achilles' heel: It only works on simple variables. Putting it in terms of opcodes, let's consider this PHP source:

<?php
$f->a = 123;
$f->b = 456;
$f->c = $f->a + $f->b;
echo $f->c;
?>

Basically the same code, right? Just a little oopified... Let's look at the ZE2.1/PHP5.1 compilation of that:

ASSIGN_OBJ    $0, !0, 'a'   /* Assign the numeric value 123 to property 'a' of compiled variable 0 object */
OP_DATA       123           /* Additional data for ASSIGN_OBJ opcode */
ASSIGN_OBJ    $1, !0, 'b'   /* Assign the numeric value 456 to property 'b' of compiled variable 0 object */
OP_DATA       456           /* Additional data for ASSIGN_OBJ opcode */
FETCH_OBJ_R   $3, !0, 'a'   /* Retrieve property 'a' from compiled variable 0 object */
FETCH_OBJ_R   $4, !0, 'b'   /* Retrieve property 'b' from compiled variable 0 object */
ADD           ~5, $3, $4    /* Add those values and store the result in temp var 5 */
ASSIGN_OBJ    $2, !0, 'c'   /* Assign the ADD result to property 'c' of compiled variable 0 object */
OP_DATA       ~5            /* Additional data for ASSIGN_OBJ opcode */
FETCH_OBJ_R   $6, !0, 'c'   /* Retrieve property 'c' from compiled variable 0 object */
ECHO          $6            /* Echo the value */

What's important to see here is that the properties are refetched each time a read or write is performed on them, which at first glance looks as bad as the pre PHP5.1 way of dealing with variables. Don't let your enthusiasm for compiled variables blind you though. Remember the magic __get() and __set() methods, along with the offsetGet() and offsetSet() methods of ArrayAccess, which objects allow for. These overloading tricks are great, but they mean that the variable returned by one fetch may not be the variable returned by a subsequent fetch. It's unfortunate that this can't be guaranteed, but it's the reality of a dynamic language like PHP. Know your particular class isn't overloaded? You can get that speedup (at least some of it) back by using good ol' references to turn your object variables into simple variables:

<?php
$a = &$f->a;
$b = &$f->b;
$c = &$f->c;
$a = 123;
$b = 456;
$c = $a + $b;
echo $c;
?>

Becomes:

FETCH_OBJ_W   $0, !1, 'a'   /* Retrieve property 'a' from compiled variable 1 object */
ASSIGN_REF    $1, !0, $0    /* Make compiled variable 0 a reference to the retrieved variable */
FETCH_OBJ_W   $2, !1, 'b'   /* Retrieve property 'b' from compiled variable 1 object */
ASSIGN_REF    $3, !2, $2    /* Make compiled variable 2 a reference to the retrieved variable */
FETCH_OBJ_W   $4, !1, 'c'   /* Retrieve property 'c' from compiled variable 1 object */
ASSIGN_REF    $5, !3, $4    /* Make compiled variable 3 a reference to the retrieved variable */
ASSIGN        $6, !0, 123   /* Assign the numeric value 123 to compiled variable 0 */
ASSIGN        $7, !2, 456   /* Assign the numeric value 456 to compiled variable 2 */
ADD           ~8, !0, !2    /* Add compiled variable 0 to compiled variable 2 */
ASSIGN        $9, !3, ~8    /* Assign the value of temporary variable 8 to compiled variable 3 */
ECHO          !3            /* Echo the value of compiled variable 3 */

Now, this particular example is actually a few more opcodes, and a little bit more work for the engine too, but the more your code following this point uses the local simple-variable references rather than the object properties, the more quickly the balance tips in your favor, because the variables are already fetched and don't need to go through the expensive re-fetch process (which is worse for objects than it ever was for regular variables).


Another caveat to CVs is that they are entirely scope-local. This should make sense, as $a in the global scope is not the same as $a in a given function. What this means for your script is that when execution enters a new function (execution scope) the CVs for that function are a blank slate and everything has to be fetched anew, even if that function was called before.

Globals and Statics

Statics and Globals are treated to CV status, but only by way of the reference trick I just mentioned:


<?php
static $bar;
echo $bar;

Turns into:

FETCH_W      static      $0, 'bar'
ASSIGN_REF !0, $0
ECHO !0

What does that mean for your use of the $GLOBALS array? That's right, the global keyword is technically faster. Now, I want to be really clear about one thing here. The minor speed affordance given by using your globals as localized CVs needs to be seriously weighed against the maintainability of looking at your code in five years and knowing that $foo came from the global scope. something_using($GLOBALS['foo']); will ALWAYS be clearer to you down the line than global $foo; /* buncha code */ something_using($foo); Don't be penny-wise and pound-foolish.


No, seriously. This post shouldn't be taken as a guide-book to speeding up your apps, they're slow for other reasons.