Jan 19, 2008

Understanding Opcodes

A blog reader (I have readers???) recently shared his wishlist, "I'm trying to figure out how to show the opcodes like you have in your post...". I promised that I'd throw something together, so here it is:


Slow down, wtf is an "Opcode"?

Short answer: It's the compiled form of a PHP script, similar in principle to Java bytecode or .NET's MSIL. For example, say you've got the following bit of PHP script:

<?php
echo "Hello World";
$a = 1 + 1;
echo $a;

PHP (and its actual compiler/executor component, the Zend Engine) is going to go through a multi-stage process:

  1. Scanning (a.k.a. Lexing) - The human readable source code is turned into tokens.
  2. Parsing - Groups of tokens are collected into simple, meaningful expressions.
  3. Compilation - Expressions are translated into instructions (opcodes).
  4. Execution - Opcode stacks are processed (one opcode at a time) to perform the scripted tasks.

Side note: Opcode caches (like APC) let the engine perform the first three of these steps, then store that compiled form so that the next time a given script is used, it can use the stored version without having to redo those steps only to come to the same result.
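
To make that side note concrete, here's a toy sketch of the caching idea. It is not how APC is implemented (real caches hook into the engine itself); the class name ToyOpcodeCache and the $fakeCompile callback are made up for illustration, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Toy model only: real opcode caches such as APC hook into the engine itself.
final class ToyOpcodeCache
{
    private $compiled = [];      // script path => pretend op_array
    public $compileCount = 0;    // how many times we actually "compiled"

    // $compile is a hypothetical stand-in for the scan + parse + compile steps.
    public function fetch(string $path, callable $compile): array
    {
        if (!isset($this->compiled[$path])) {
            $this->compileCount++;
            $this->compiled[$path] = $compile($path);
        }
        return $this->compiled[$path];
    }
}

$cache = new ToyOpcodeCache();
$fakeCompile = function (string $path): array {
    return ["ZEND_ECHO 'Hello World'"];   // pretend op_array
};

$first  = $cache->fetch('hello.php', $fakeCompile);  // compiles
$second = $cache->fetch('hello.php', $fakeCompile);  // served from the cache
echo $cache->compileCount, "\n";                     // prints 1
```

The second fetch never calls the compile step, which is the whole point: execution is the only stage that has to happen on every request.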

Er... okay... can you elaborate a little? What's lexing? I thought Superman put him in jail...

That's Lex Luthor, you nit-wit! The most expedient way to explain lexing is by example. Take a look at the manual page for token_get_all(). This gem is actually a wrapper around the Zend Engine's own language scanner. Play around with it a bit, and you'll notice that plugging the short script above into it will produce:

Array
(
[0] => Array
(
[0] => 367
[1] => <?php
)
[1] => Array
(
[0] => 316
[1] => echo
)
[2] => Array
(
[0] => 370
[1] =>
)
[3] => Array
(
[0] => 315
[1] => "Hello World"
)
[4] => ;
[5] => Array
(
[0] => 370
[1] =>
)
[6] => =
[7] => Array
(
[0] => 370
[1] =>
)
[8] => Array
(
[0] => 305
[1] => 1
)
[9] => Array
(
[0] => 370
[1] =>
)
[10] => +
[11] => Array
(
[0] => 370
[1] =>
)
[12] => Array
(
[0] => 305
[1] => 1
)
[13] => ;
[14] => Array
(
[0] => 370
[1] =>
)
[15] => Array
(
[0] => 316
[1] => echo
)
[16] => Array
(
[0] => 370
[1] =>
)
[17] => ;
)

In the array returned by token_get_all(), you have two types of tokens. Single-character, non-label tokens are returned as just that: the character that was found in the source file at that point. Everything else, from labels to language constructs to multi-character operators (like >>, +=, etc...), is returned as an array containing two elements: the token ID (which corresponds to the T_* constants -- e.g. T_ECHO, T_STRING, T_VARIABLE, etc...) and the actual text which that token came from. What the engine actually gets is slightly more detailed than what you see in the output from token_get_all(), but not by much...
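
If you'd rather see names than raw token IDs (the numbers change between PHP versions), something like the following sketch should do it. token_get_all() and token_name() are both part of the tokenizer extension; the describe_tokens() helper and the CHAR label are just names picked for this example, and the type declarations assume PHP 7 or later.

```php
<?php
// The same short script from the top of the post, as a string.
$source = '<?php
echo "Hello World";
$a = 1 + 1;
echo $a;
';

// Hypothetical helper: turn each token into a readable "NAME 'text'" line.
function describe_tokens(string $source): array
{
    $lines = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens carry the token ID in [0] and the matched text in [1].
            $lines[] = token_name($token[0]) . ' ' . var_export($token[1], true);
        } else {
            // Single-character tokens come back as plain strings.
            $lines[] = 'CHAR ' . var_export($token, true);
        }
    }
    return $lines;
}

echo implode("\n", describe_tokens($source)), "\n";
```

Each line of output is either a T_* name followed by the text it matched, or CHAR followed by a lone character such as ; or =.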

Okay, tokenization just breaks the script into bite-size pieces, how does parsing work then?

The first thing the parser does is throw away all whitespace (unlike some other P* language...). From the reduced set of tokens, the engine looks for irreducible expressions. How many expressions do you see in the example above? Did you say three? WRONG! There are three statements, but one of those statements is made of two distinct expressions. In the case of $a = 1 + 1; the first expression is the addition, followed by the assignment to the variable as a second, distinct expression. All together, our expression list is:

  1. echo a constant string
  2. add two numbers together
  3. store the result of the prior expression to a variable
  4. echo a variable

Hey! That's starting to sound familiar! Did I see that kind of description before?

Oh, you must mean my post about strings (plug). That's correct, because these expressions are exactly the pieces which go into making up oplines! Given the expression list we've just reached, the resulting opcodes look something like:

  • ZEND_ECHO 'Hello World'
  • ZEND_ADD ~0 1 1
  • ZEND_ASSIGN !0 ~0
  • ZEND_ECHO !0

What happened to $a? What's the difference between ~0 and !0?

Short answer: !0 is $a


So here's the deal.... oplines have five principal parts:

  • Opcode - Numeric identifier which distinguishes what the opline will do. This is what corresponds to ZEND_ECHO, ZEND_ADD, etc...
  • Result Node - Most opcodes perform "non-terminal" actions. That is, after executing there's some result which can be consumed as an input to a later opline. The result node identifies what temporary location to place the result of the operation in.
  • Op1 Node - One of two inputs to the given opcode. An input may be a constant zval, a reference to a previous result node, a simple variable (CV), or in some cases a "special" data element, such as a class definition. Note that an opcode may use both, one, or neither input node. (Some even use more, see ZEND_OPDATA)
  • Op2 Node - Ditto
  • Extended Value - Simple integer value used to differentiate specific behaviors of an overloaded opcode.
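
To see how those parts fit together, here's a toy executor for the four oplines from the example. This is a sketch of the idea, not the Zend Engine: the toy_execute() function, the [opcode, result, op1, op2] array layout, and the 'const' / 'tmp' / 'cv' labels are all invented for illustration, the extended value is left out entirely, and it assumes PHP 7.1 or later.

```php
<?php
// Toy model, not the Zend Engine: each opline is [opcode, result, op1, op2].
// A node is ['const', value], ['tmp', slot], ['cv', slot], or null if unused.
function toy_execute(array $oplines): array
{
    $tmp = [];      // temporary slots (~0, ~1, ...)
    $cv = [];       // compiled variable slots (!0, !1, ...)
    $output = '';   // collected echo output

    // Resolve an input node to the value it refers to.
    $read = function (?array $node) use (&$tmp, &$cv) {
        if ($node === null) {
            return null;
        }
        [$type, $value] = $node;
        switch ($type) {
            case 'const': return $value;
            case 'tmp':   return $tmp[$value];
            case 'cv':    return $cv[$value];
        }
        throw new InvalidArgumentException("unknown node type: $type");
    };

    // Process the oplines one at a time, in order.
    foreach ($oplines as [$opcode, $result, $op1, $op2]) {
        switch ($opcode) {
            case 'ZEND_ECHO':
                $output .= $read($op1);
                break;
            case 'ZEND_ADD':
                $tmp[$result[1]] = $read($op1) + $read($op2);
                break;
            case 'ZEND_ASSIGN':
                $cv[$op1[1]] = $read($op2);   // op1 is the target, op2 the value
                break;
            default:
                throw new InvalidArgumentException("unknown opcode: $opcode");
        }
    }
    return ['output' => $output, 'cv' => $cv];
}

$oplines = [
    ['ZEND_ECHO',   null,       ['const', 'Hello World'], null],
    ['ZEND_ADD',    ['tmp', 0], ['const', 1],             ['const', 1]],
    ['ZEND_ASSIGN', null,       ['cv', 0],                ['tmp', 0]],
    ['ZEND_ECHO',   null,       ['cv', 0],                null],
];

$state = toy_execute($oplines);
echo $state['output'], "\n";   // prints Hello World2
```

Notice that ZEND_ADD is the only opline here that fills in a result node, and that ZEND_ASSIGN picks that result up again as one of its inputs.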

So obviously the nodes are the most complicated parts of an opline. Here are the important parts of what they look like:

  • op_type - One of IS_CONST, IS_TMP_VAR, IS_VAR, IS_UNUSED, or IS_CV
  • u - A union of the following elements (the one which is used depends on the value of op_type):
    • constant (IS_CONST) - zval value. This node results when you include a literal value in your script, such as the 'Hello World' or 1 values in the example above.
    • var (IS_VAR or IS_TMP_VAR or IS_CV) - Integer value corresponding to a temporary slot in a lookup table used by the engine.

Now let's look at the difference between those optypes, particularly with respect to u.var:

  • IS_TMP_VAR - These ephemeral values are strictly for use by non-assignment non-terminal expressions. They don't support any refcounting because they're guaranteed not to be shared by any other variable. These are denoted in the examples I use on this site (and in VLD output) by tilde characters (~).
  • IS_VAR - Usually the result of a ZEND_FETCH(_DIM|_OBJ)?_(R|W|RW), or one of the assignment opcodes (which are technically non-terminal expressions, since they can be used as inputs to other expressions). Since these are tied to real variables, they have to respect reference counting and are passed about at an extra degree of indirection. They're stored in the same table though. These are denoted by the dollar sign ($).
  • IS_CV - "CV" stands for "Compiled Variable". These are basically cached hash lookups for fetching simple variables from the local symbol table. Once a variable is actually looked up at runtime, it's stored at an extra level of indirection in an even faster lookup table using an index into a vector. That's what the number in this node denotes. These types of nodes are distinguished by a bang (!).
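
If the sigils are hard to keep straight, this little sketch maps each op_type to the notation described above. The format_node() function is hypothetical, and the op_type names are passed as plain strings here rather than the engine's real integer constants. It assumes PHP 7 or later.

```php
<?php
// Sketch only: op_type names are plain strings here, not the engine's constants.
function format_node(string $opType, $value): string
{
    switch ($opType) {
        case 'IS_CONST':   return var_export($value, true);  // literal value
        case 'IS_TMP_VAR': return '~' . $value;              // temporary slot
        case 'IS_VAR':     return '$' . $value;              // variable slot
        case 'IS_CV':      return '!' . $value;              // compiled variable
        case 'IS_UNUSED':  return '';                        // nothing to show
    }
    throw new InvalidArgumentException("unknown op_type: $opType");
}

// Rebuild the ZEND_ADD line from the example: result node, then both inputs.
$line = 'ZEND_ADD ' . format_node('IS_TMP_VAR', 0)
      . ' ' . format_node('IS_CONST', 1)
      . ' ' . format_node('IS_CONST', 1);
echo $line, "\n";   // prints ZEND_ADD ~0 1 1
```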

Boggle... You...so lost me there...

Yeah, that explanation sort of got away from me didn't it? What can I clear up?


All I really want to know is how to translate some source code into an opcode..list...thingy...

Heh, okay... first off, that "opcode list thingy" is called an op_array, and you can generate those really easily using one of two PECL packages. You can use my parsekit package, which is useful for programmatic analysis of script compilation, but frankly... it's not what you're looking for and there's not much call for scripts analyzing other scripts anyway. I recommend Derick's VLD (Vulcan Logic Disassembler), which is what'll actually generate the kinds of opcode lists you'll see me use in blog posts.


Once you've got it installed (it installs like any other PECL extension), you can run it with a command like the following:

php -d vld.active=1 -d vld.execute=0 -f yourscript.php

Then sit back and watch the opcodes fly! Important note: Using -r with command line code may not work due to a quirk of the way the engine parses files in older versions of PHP (and with older versions of VLD). Be sure to put your script on disk and reference it using -f if -r doesn't work for you.


Holy schnikies! That's a lot of opcodes! How can I tell what they all do?

Take a look at Zend/zend_vm_def.h in your PHP source tree. In here you'll find a meta-definition of every single opcode used by the engine. Side note: It's used as a source for zend_vm_gen.php which generates the actual code file zend_vm_execute.h. How's that for chicken and egg? Every version of PHP since 5.1.0 has required PHP be already built in order to build it!

62 comments:

  1. Brilliant! Thanks a ton for this post.

  2. There is actually a full list of what the opcodes do now at http://www.php.net/manual/en/internals2.opcodes.list.php

  3. Thanks for sharing. Please add info about step where semantic analyzer works
