May 10, 2016

Comments Disabled

Somewhere around the beginning of this year, Google suffered a traumatic brain injury and forgot how to stop spam on Blogger. Probably forgot the product existed entirely, knowing Google. So I've had to disable comments completely. Feel free to reach out to me on Twitter @SaraMG.

May 2, 2016

Dario Blood Glucose Monitoring System

Last week, my wife found a new glucose meter via reddit or something, and it was only $40 so it sounded like it was worth giving a shot. It just arrived today, so I figured I'd share my initial thoughts below:

Ordering

My first annoyance was with buying the product. While the intro page tells you all about the meter and how much it costs, the pricing on strips was suspiciously absent. After a few minutes of clicking around the site, I finally tried hitting the [BUY] button, and was treated to actual numbers. In fairness, that could have been my first guess, but I'm easily annoyed.

After placing my order, I was given a stark white page saying "Your order has been placed." No order number, no tracking info, no estimated delivery date. Just a brief affirmation. Okay, whatevs, it'll come when it comes.

Unboxing

And it did come after only a few days, so point for promptness. The meter is just what it says on the tin, a compact housing for test strips, lancet device, and the small square dongle used for actually reading a strip and pumping it into one's phone.

One small quirk did pop up fairly early. The instructions indicate that the test strip cartridge should "snap" into the casing. It doesn't. In fact, when removing the outer cap, it pulls the cartridge out of the casing, then I have to pry the cap off the cartridge, so I can open the cartridge, and fish out the strip. For a device which hinges on convenience and ease of use, this isn't convenient or easy to use. Hopefully this is just an early model engineering defect and not indicative of the poor quality of the device overall.

Using the app

Unsurprisingly, the app wants you to register/login to their site. That's not unexpected. What surprised me here is that the account I created last week (while buying the device) wasn't usable for the actual app. Maybe that's a good thing, if the store is run by a third-party, principle of least privilege and all. But it's certainly inconvenient from a user standpoint.

But finally, I'm ready to take a reading...


Fuuuuuuuudge...
Now, don't judge me on the hipster wood case, I'm very clumsy and it helps me not break phones. I used to have plastic/vinyl wallet cases, but they fall apart in a matter of weeks, and... look, my point is, just a SLIGHTLY longer neck would have made this a lot more usable. Square knows this problem exists, which is why their 2nd gen readers don't match their 1st gen readers.

But it works well enough, the data matches my other meter within tolerance... Wait, no... one more gripe: The app doesn't talk to iOS's health data stream. My current go-to logbook app, MySugr (highly recommend, btw) is able to share data with my phone's OS, which means my apps can work better together.

Conclusion

Not a terrible little device, I'll probably keep it in my hoodie-of-many-pockets for those random occasions where I used to use OneTouch minis, but it's not quite ready for me yet.

Jan 8, 2015

HHVM Extension Writing, Part IV

Hopefully you've had some time to digest the first three parts of my HHVM Extension Writing series, and you're ready to embark into the world of Resources. Not quite as exciting as objects, they're certainly less commonly used in new extensions since Objects are so much more versatile, but they were an integral part of PHP's history prior to PHP5, and are still in heavy use by such features as streams, curl, and most database connectors.

We'll be continuing with the same examples repo at https://github.com/sgolemon/hhvm-extension-writing where I've already landed a skeleton for example3 with commit 144e9698b5.

Resource

Defining a new resource type shares some similarity with Objects, except that you won't be defining anything in systemlib initially, because a resource doesn't have an API in and of itself. It's just an opaque pointer used by seemingly unrelated functions and methods. We'll define some global functions in the second half of this post when we start accessing the resource. Any wonder Objects are winning out?

As with the Object example, we'll be implementing a bit of filesystem access to demonstrate how one might use resources.
class Example3File : public SweepableResourceData {
 public:
  DECLARE_RESOURCE_ALLOCATION_NO_SWEEP(Example3File)
  CLASSNAME_IS("example3-file")
  const String& o_getClassNameHook() const override { return classnameof(); }

  Example3File(const String& filename, const String& mode) {
    m_file = fopen(filename.c_str(), mode.c_str());
    if (!m_file) {
      throw Object(SystemLib::AllocExceptionObject(
        "Unable to open file"
      ));
    }
  }

  ~Example3File() { sweep(); }
  void sweep() { close(); }

  void close() {
    if (m_file) {
      fclose(m_file);
      m_file = nullptr;
    }
  }

  bool isInvalid() const override {
    return !m_file;
  }

  FILE* m_file{nullptr};
};

The first three lines of our class are basically boilerplate. DECLARE_RESOURCE_ALLOCATION_NO_SWEEP handles some specifics about the MemoryManager, because unlike objects which are allocated from userspace, resources are allocated from C++, and a bare C++ new won't do a "smart" allocation unless told to do so by this macro. The CLASSNAME_IS macro, and the o_getClassNameHook() virtual below it, define the "resource name" as seen from PHP when you call var_dump() or get_resource_type().

As with the Object version of this example, the class destructor is automatically called when the resource variable falls out of scope, meanwhile sweep() is invoked if the variable is still "live" at the end of a request. Since in this case we want the same behavior to occur, we simply chain one through the other, and on to their actual purpose: closing the file.

isInvalid() exists for Resource types because it's very common to invoke functions like fclose() whose purpose is to make the resource no longer usable as its normal type. HHVM's mechanism for dealing with this is very different from PHP's, but the end result is the same. So long as your class has some way of knowing that it's "dead", HHVM can detect it, and report accordingly in var_dump() and other calls.

So now that we've defined a Resource type, let's make a function to create instances and do things with them:
Resource HHVM_FUNCTION(example3_fopen, const String& filename, const String& mode) {
#ifdef NEWOBJ
  return Resource(NEWOBJ(Example3File)(filename, mode)); 
#else
  return Resource(newres<Example3File>(filename, mode));
#endif
}

void HHVM_FUNCTION(example3_fclose, const Resource& fp) {
  // By default, if fp is not of type "Example3File" and valid,
  // HHVM will throw an exception here
  auto f = fp.getTyped<Example3File>();
  f->close();
}

Variant HHVM_FUNCTION(example3_ftell, const Resource& fp) {
  // By passing "true" for "badTypeOkay", invalid resources
  // result in returning nullptr, rather than throwing an exception
  // So check the return type!
  auto f = fp.getTyped<Example3File>(true /* nullOkay */, true /* badTypeOkay */);
  if (!f) {
    raise_warning("Instance of example3-file resource expected");
    return init_null();
  }
  return (int64_t)ftell(f->m_file);
}

As you can see, allocating a resource is literally as simple as newing up a C++ class with the newres<T>() template (or if you're using an older version of HHVM, the NEWOBJ() macro) and passing it as an argument to Resource's constructor.

For getting at that C++ instance, I've presented two different, equally valid methods. The first is certainly more concise, but you may not want to throw exceptions (you've probably eschewed making an Object for a reason, after all). On the other hand, while the second form allows you to handle your error cases more explicitly, this particularly common case forced us to change our return type from int to the far less precise mixed. Take this into account when designing your APIs. Many extensions follow the latter pattern, using a simple macro to avoid excessive copypasta from one function to the next. I've added an example of that to the repo.

What's with the nullOkay arg? TL;DR version: It's a bit of legacy logic and doesn't really have a place in modern HHVM extensions. You can usually set it to the same value as you pass for badTypeOkay. Note that both of these args are false by default.
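
To make the difference between the two fetch styles concrete without dragging in the HHVM headers, here's a minimal plain C++ sketch. Everything in it (MiniResource, MiniFile, MiniSocket, get_typed(), mini_ftell()) is a hypothetical stand-in I made up for illustration; it only mimics the throw-or-nullptr behaviour described above and is not how getTyped<T>() is actually implemented.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for a resource base class and two resource types.
struct MiniResource {
  virtual ~MiniResource() {}
};
struct MiniFile : MiniResource {
  long position = 0;
};
struct MiniSocket : MiniResource {};

// Sketch of the two behaviours: throw on a type mismatch by default,
// or hand back nullptr when the caller opts in with badTypeOkay.
template <typename T>
T* get_typed(MiniResource* res, bool badTypeOkay = false) {
  T* typed = dynamic_cast<T*>(res);
  if (!typed && !badTypeOkay) {
    throw std::invalid_argument("unexpected resource type");
  }
  return typed;
}

// The "check and bail" style that extensions tend to wrap in a macro.
// Returns -1 where the PHP-facing function would warn and return null.
long mini_ftell(MiniResource* res) {
  MiniFile* f = get_typed<MiniFile>(res, true /* badTypeOkay */);
  if (!f) {
    return -1;
  }
  return f->position;
}
```

The second style costs you a sentinel (or, in the real thing, a Variant return type), which is the same trade-off described for example3_ftell above.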

Update: In the initial version of this post, I used NEWOBJ() exclusively to allocate a new resource instance, but as @maide pointed out in IRC, that's been removed from the newest versions of HHVM and replaced with newres<T>(). We use an #ifdef to figure out which version we're dealing with because the HHVM team is lousy at updating the API version constant. ;p

The code so far is at commit 5f852d45cc.

What's next...

We've covered all the userspace data types, declaration of functions, classes, and resources. Next up, in Part V, we'll back up for a few moments and look at the build system so that we can start linking in external libraries. If you're already familiar with CMake, then you can probably skip that chapter. Part VI is TBD at the moment, but I'll probably pick up a lot of the parts I glossed over in Parts I-IV, we'll see what comes out...

Jan 7, 2015

HHVM Extension Writing, Part III

In Part I of this series, we looked at the basic building blocks of a simple HHVM extension skeleton. Part II continued with three of the five "smart" datatypes used throughout the API. But now we get to have some fun. It's time to start looking into declaring classes with methods and properties and constants (oh my!).

I'll be continuing with the same git repository at https://github.com/sgolemon/hhvm-extension-writing where we left off at the end of Part II with commit 8e82e3e416.

Declaring methods

As with functions, the bulk of your class definition is going to appear in your systemlib file, maybe something like the following in ext_example1.php.
class Example1_Greeter {
  public function greet() {
    echo "Hello {$this->name}\n";
  }

  public function __construct(protected string $name = 'Stranger') {}
}

As you should expect by now, compiling your extension with this bit of code should mean that the Example1_Greeter class is now available to all requests and may be invoked like any other class definition. Let's apply what we already know from making native functions, and see how it works with methods...

  <<__Native>>
  public function getName(): string;

  <<__Native>>
  static public function DefaultGreeting(): string;

While you're probably tempted to rush off and add an HHVM_FE() and HHVM_FUNCTION() implementation to the C++ file, you'd only be half right. These aren't functions, they're methods, and as such have a different set of macros.
const StaticString
  s_Example1_Greeter("Example1_Greeter");

String HHVM_METHOD(Example1_Greeter, getName) {
  return this_->o_get(s_name, false, s_Example1_Greeter);
}

String HHVM_STATIC_METHOD(Example1_Greeter, DefaultGreeting) {
  return "Hello";
}

Meanwhile, in moduleInit(), we'll add:
  HHVM_ME(Example1_Greeter, getName);
  HHVM_STATIC_ME(Example1_Greeter, DefaultGreeting);

The code so far is at commit 335dca9573.

Similarly, properties and constants may be declared directly on the Hack definition of the class in your systemlib file. I won't bother showing it here, since you all know how to write PHP code, but I'll put an example or two in the git repo.

What's marginally more interesting, and something you can probably guess at from the coverage in Part I, is that you can declare class constants from C++, meaning that they can take on values defined in external headers or computed values. Let's add one from moduleInit().

  Native::registerClassConstant(s_Example1_Greeter.get(), s_DEFAULT_GREETING.get(), s_Hello.get());

In the examples repo, I've changed ->getName() to now use this constant, so that you can see it propagate up.

The code so far is at commit 007c314b2d.

Binding internal data

What we have so far is all well and good for simple classes, but most extension classes will need to store some opaque pointer from an external library somewhere on the class that can be easily referenced later on. For that, things start to get a little bit more complicated. To help clarify what we're doing, I'm going to wipe the slate clean by moving example1 to its own subdirectory, and starting fresh with example2.

Yeah, kinda messy changing my whole directory structure around midway, but would you rather I ran `git push --force`? Yeah, I thought not.

The code skeleton we're starting out with is at commit: 20dc4ef824 in example2/.

To illustrate something slightly less contrived than earlier examples, I'll be wrapping the POSIX FILE* object in a simple PHP class. Let's start with something basic in our systemlib, containing just a constructor:
<<__NativeData("Example2_File")>>
class Example2_File {
  <<__Native>>
  public function __construct(string $filename, string $mode): void;
}

A new user attribute has appeared! <<__NativeData("Example2_File")>> tells the runtime that this is no ordinary object. This object should be over-allocated with enough space to handle some internal C++ object, identified by the quoted name. In practice this is usually the name of the class it goes with, but it doesn't have to be. How does this hook up to internals? That comes next, within ext_example2.cpp by adding an #include "hphp/runtime/vm/native-data.h" and the following code:
const StaticString
  s_Example2_File("Example2_File");

class Example2_File {
 public:
  Example2_File() { /* new Example2_File */ }
  Example2_File(const Example2_File&) = delete;
  Example2_File& operator=(const Example2_File& src) {
    /* clone $instanceOfExample2_File */
    throw Object(SystemLib::AllocExceptionObject(
      "Cloning Example2_File is not allowed"
    ));
  }

  ~Example2_File() { sweep(); }
  void sweep() {
    if (m_file) {
      fclose(m_file);
      m_file = nullptr;
    }
  }

  FILE* m_file{nullptr};
};

void HHVM_METHOD(Example2_File, __construct, const String& filename, const String& mode) {
  auto data = Native::data<Example2_File>(this_);
  if (data->m_file) {
    throw Object(SystemLib::AllocExceptionObject(
      "File is already open!"
    ));
  }
  data->m_file = fopen(filename.c_str(), mode.c_str());
  if (!data->m_file) {
    String message("Unable to open ");
    message += filename + ": errno=" + String(errno);
    throw Object(SystemLib::AllocExceptionObject(message));
  }
}

And some glue code in moduleInit() to tie both the constructor and the data class into the class definition:
  HHVM_ME(Example2_File, __construct);

  Native::registerNativeDataInfo<Example2_File>(s_Example2_File.get());

The code so far is at commit 97b3cfd49e.

We're only opening (and ultimately closing) our file at this point, but these are really important stages in an object's lifecycle, so this is worth going through slowly. When a PHP script calls $o = new Example2_File(__FILE__, "r");, the first thing the engine does is allocate space for the object. This is done by adding sizeof(ObjectData) (the standard, base object size) to the size given by any NativeDataInfo associated with it. We made that association by calling Native::registerNativeDataInfo<T>(StringData* id);, where T is the C++ class type to allocate with the ObjectData, and id is the symbolic name we gave it in the systemlib file using <<__NativeData("id")>>.

Next, the engine invokes the constructor, providing a pointer to the object via a hidden ObjectData* this_ parameter in the C++ method's signature. From here, we can get access to our private data structure by using the Native::data<T>() accessor to jump to the correct offset from this_. At this point, we have access to a normal C++ object which just so happens to be bound to a PHP object, and we have constructor parameters as well!

From here, there are two reasons a PHP object might die. In the expected case, it runs out of references and is destructed during the course of a request's runtime. In this case, our auxiliary object has its destructor called as well, so that external pointers can be cleaned up nicely. The other time a PHP object can die is when the request is shutting down. This is somewhat more exceptional since the memory manager is sweeping ALL request-local data, not necessarily in the most ideal order. It's up to your auxiliary class to deal with non-sweepable resources, but trust that the runtime will deal with resources which are. This is resolved by having a secondary pseudo-destructor called sweep();. For simple implementations like ours, we want the regular destructor and sweep to do the same thing, since the only members of this C++ class are external pointers. If we have sweepable resources such as an HPHP::String however, we'd want to avoid the implicit member destruction which comes with calling ~Example2_File(). It's entirely possible that the internal state of that String is no longer valid because it was swept first. Hence the need for a separate sweep() function.

TL;DR? - Just have your destructor call sweep(), and deal with external pointers in sweep(). That's good 90% of the time.
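
Here's that TL;DR as a minimal, HHVM-free sketch. ExternalHandle, external_close() and Holder are hypothetical names standing in for an external library's handle, its close function, and your native data class. The point it shows is that sweep() has to be idempotent, because the destructor may run after sweep() has already released the pointer.

```cpp
#include <cassert>

// Hypothetical external library handle which counts how often it is closed.
struct ExternalHandle {
  int closeCount = 0;
};

// Hypothetical external "close" call, standing in for something like fclose().
void external_close(ExternalHandle* h) { ++h->closeCount; }

// Stand-in for a native data class: the destructor just calls sweep(),
// and sweep() releases the external pointer at most once.
class Holder {
 public:
  explicit Holder(ExternalHandle* h) : m_handle(h) {}
  Holder(const Holder&) = delete;
  Holder& operator=(const Holder&) = delete;

  ~Holder() { sweep(); }

  void sweep() {
    if (m_handle) {
      external_close(m_handle);
      m_handle = nullptr;  // makes any later sweep() a no-op
    }
  }

  bool isLive() const { return m_handle != nullptr; }

 private:
  ExternalHandle* m_handle;
};
```

Nulling the pointer after closing is what keeps a sweep-then-destruct sequence from double-closing the handle.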

You might also have noticed that I'm throwing an exception in the assignment operator. This is normally used for handling a clone, where you'd probably duplicate the FILE* handle, but I realized midway that POSIX file streams don't really have that notion, so I took the easy way out and threw a standard exception. In practice, the implementation would probably look something like:
  Example2_File& operator=(const Example2_File& src) {
    /* copy/clone class members, then return self */
    if (m_file) {
      fclose(m_file);
      m_file = nullptr;
    }
    if (src.m_file) {
      m_file = fclone(src.m_file);
    }
    return *this;
  }

But like I said, there doesn't seem to be an fclone() as such. Instead, let's add a few more methods to flesh out our class:
String HHVM_METHOD(Example2_File, read, int64_t len) {
  auto data = Native::data<Example2_File>(this_);
  String ret(len, ReserveString);
  auto slice = ret.bufferSlice();
  len = fread(slice.ptr, 1, len, data->m_file);
  return ret.setSize(len);
}

int64_t HHVM_METHOD(Example2_File, tell) {
  auto data = Native::data<Example2_File>(this_);
  return ftell(data->m_file);
}

bool HHVM_METHOD(Example2_File, seek, int64_t pos, int64_t whence) {
  if ((whence != SEEK_SET) && (whence != SEEK_CUR) && (whence != SEEK_END)) {
    raise_warning("Invalid seek-whence");
    return false;
  }
  auto data = Native::data<Example2_File>(this_);
  return 0 == fseek(data->m_file, pos, whence);
}
The code so far is at commit df4acf359ca.

Jan 6, 2015

HHVM Extension Writing, Part II

In our last installment I walked through setting up a dev environment and creating a simple HHVM extension which exposed some constants and global scope functions. Today, we'll expand on that by delving deeper into three of the five "Smart" types: String, Array, and Variant. The other two "Smart" types will be covered in Parts III (Objects) and IV (Resources) since they require a bit more explaining.

All code in the following examples can be found at https://github.com/sgolemon/hhvm-extension-writing and we'll be starting from where Part I left off: commit ad9618ac8c.

The String class

HPHP::String resembles C++'s std::string class in many ways, but also builds in several assumptions about how PHP strings should behave, is able to be encapsulated in a Variant (mixed) object, and performs common string related tasks, such as numeric conversion.

This post is going to highlight the most common features of the String class, but you should look through the header file yourself for a more in-depth exploration.

/* Basic inspection */
class String {
 public:
  const char* c_str() const;
  int size() const;
  bool empty() const { return size() == 0; }
  int length() const { return size(); }
  bool isNumeric() const;
  bool isInteger() const;
  bool isZero() const;
  bool toBoolean() const;
  char toByte() const;
  short toInt16() const;
  int toInt32() const;
  int64_t toInt64() const;
  double toDouble() const;
  std::string toCppString() const;

  char charAt(int pos) const;
  char operator[](int pos) const;
};

The meaning and use of these methods should all be straightforward. In practice, c_str(), size(), and empty() are going to cover 90% of your uses for reading values from the String class.

/* Creation */
class String {
 public:
  String(); // empty string
  String(const char* cstr);
  String(const std::string& cppstr);
  String(const String& hphpstr);
  String(int64_t num);
  String(double num);

  static StaticString FromCStr(const char* cstr);

  String(size_t cap, ReserveStringMode mode);
  MutableSlice bufferSlice();
  uint32_t capacity() const;
  const String& setSize(int len);
};

The constructors, as you can see, are generally built around making new runtime string values from an existing string or numeric value, and are again straight-forward to use. String::FromCStr() is a somewhat special case in that it creates a StaticString, rather than a String. While a String is cleaned up at the end of the request it was created in, StaticStrings live forever, and can even be shared between multiple requests. Because overuse of StaticString could easily lead to memory bloat, they're typically only used for defining persistent features (such as constant names/values) as seen in Part I.

The most interesting part of this API is the ReserveStringMode and MutableSlice. Ordinarily, you shouldn't save the pointer you get from String::c_str() as it can potentially change between calls, and you generally shouldn't go modifying a String unless you know you own it anyway. If you do have need to modify a string, call bufferSlice() on it. The MutableSlice structure you get back will contain a pointer to a (relatively) stable block of memory which can be populated. Here's an example:

String HHVM_FUNCTION(example1_count_preallocate) {
  /* 30 bytes: 3 per number: 'X, ' */
  String ret(30, ReserveString);
  auto slice = ret.bufferSlice();
  for (int i = 0; i < 10; ++i) {
    snprintf(slice.ptr + (i*3), 4, "%d, ", i);
  }
  /* Terminate just after the final digit, overwriting the last ',' with a null byte */
  return ret.setSize((9*3) + 1);
}

This contrived example allocates enough space for 10 single-digit numbers, and a comma and space following them. It uses snprintf() to fill that buffer up, then it truncates it to 28 characters, since the final ', ' wasn't actually necessary. You'll find this pattern in use anywhere an API expects you to provide it with a buffer for it to fill, such as in the intl extension where it calls into ICU.
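
If you'd like to poke at the reserve/fill/truncate pattern outside of HHVM, here's the same idea as a sketch in plain C++ with std::string standing in for HPHP::String. count_preallocate() is a hypothetical name, and resize() plays the role of setSize(). I've assumed one spare byte in the buffer so that snprintf()'s trailing NUL on the last write has somewhere to land.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Plain C++ analogue of example1_count_preallocate(): allocate once,
// let a C API fill the buffer, then trim to the bytes actually wanted.
std::string count_preallocate() {
  const int kCount = 10;
  // 3 bytes per number ("X, ") plus one spare byte for the trailing
  // NUL that snprintf() writes after the last group.
  std::string ret(kCount * 3 + 1, '\0');
  for (int i = 0; i < kCount; ++i) {
    // Each call writes 3 characters and a NUL; the next call
    // overwrites that NUL with its own first digit.
    std::snprintf(&ret[0] + (i * 3), 4, "%d, ", i);
  }
  // Drop the final ", ": 9 full "X, " groups plus the last digit.
  ret.resize(9 * 3 + 1);
  return ret;
}
```

Calling count_preallocate() gives back the 28-character string "0, 1, 2, 3, 4, 5, 6, 7, 8, 9".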

Another approach to building up a string from parts would be to use the operator+ overload which allows you to simply concatenate Strings such as in the following:

String HHVM_FUNCTION(example1_count_concatenate) {
  String ret, delimiter(", ");
  for (int i = 0; i < 10; ++i) {
    if (i > 0) {
      ret += delimiter;
    }
    ret += String(i);
  }
  return ret;
}

There are costs and benefits to both versions. The former is more efficient: it does a single allocation and far less copying around, whereas the latter does at least 11 allocations. On the other hand, the second version is far more readable and far less error prone. For the contrived example, I'd call the second version "better", but there are certainly cases where the first version is superior.
The code so far is at commit: fa82b3cd70

The Array Class

Arrays are the do-all bucket of "stuff" of the PHP language. They can behave like vectors, maps, sets, or weird hybrid hodgepodge containers without rhyme or reason. You already know how to interact with them from userspace, so let's take a look at how to interact with them from C++. As with Strings, we're only going to go into the most common API calls here; check out the header for the full story.

/* Core API */
class Array {
 public:
  static Array Create(); // array()
  static Array Create(const Variant& value); // array($value)
  static Array Create(const Variant& key, const Variant& value); // array($key => $value)

  /* Read */
  const Variant operator[](int64_t key) const;
  const Variant operator[](const String& key) const;
  const Variant operator[](const Variant& key) const;

  /* count($arr) */
  ssize_t count() const;

  /* array_key_exists($key, $arr); */
  bool exists(int64_t key) const;
  bool exists(const String& key, bool isKey = false) const;
  bool exists(const Variant& key, bool isKey = false) const;

  /* Write */
  void clear();

  /* $arr[$key] = $v; */
  void set(int64_t key, const Variant& v);
  void set(const String& key, const Variant& v, bool isKey = false);
  void set(const Variant& key, const Variant& v, bool isKey = false);

  void prepend(const Variant& v); // array_unshift($arr, $v);
  Variant dequeue();              // array_shift($arr);
  void append(const Variant& v);  // array_push($arr, $v); aka => $arr[] = $v;
  Variant pop();                  // array_pop($arr);

  /* $arr[$key] =& $v; */
  void setRef(int64_t key, const Variant& v);
  void setRef(const String& key, const Variant& v, bool isKey = false);
  void setRef(const Variant& key, const Variant& v, bool isKey = false);

  /* $arr[] =& $v; */
  void appendRef(Variant& v);

  /* unset($arr[$key]); */
  void remove(int64_t key);
  void remove(const String& key, bool isKey = false);
  void remove(const Variant& key);
};

As you can see, the Array APIs mirror PHP's userspace API very closely, down to the read API using square-bracket notation just like PHP code. Let's write a couple new methods dealing with arrays as arguments and return values.
const StaticString
  s_name("name"),
  s_hello("hello"),
  s_Stranger("Stranger");

void HHVM_FUNCTION(example1_greet_options, const Array& options) {
  String name(s_Stranger);
  if (options.exists(s_name)) {
    name = options[s_name].toString();
  }
  bool hello = true;
  if (options.exists(s_hello)) {
    hello = options[s_hello].toBoolean();
  }
  g_context->write(hello ? "Hello " : "Goodbye ");
  g_context->write(name);
  g_context->write("\n");
}

Array HHVM_FUNCTION(example1_greet_make_options, const String& name, bool hello) {
  Array ret = Array::Create();
  if (!name.empty()) {
    ret.set(s_name, name);
  }
  ret.set(s_hello, hello);
  return ret;
}

Pretty similar syntax to writing PHP code, yeah?

The code so far is at commit: 3966bb1da1

The Variant Class

The last "smart" class doesn't represent a single PHP type, rather it represents all types in a sort of meta-container which knows what it's holding, and knows how to convert between the concrete types. Variant is useful when you need to accept and/or return multiple possible types. For a start, let's list out the core API. Remember that there are far more methods than I'll cover here, and you can find the rest in the header file.

/* Creation/Assignment */
class Variant {
 public:
  Variant();
  Variant(bool bval);
  Variant(int64_t lval);
  Variant(double dval);
  Variant(const String& strval);
  Variant(const char* cstrval);
  Variant(const Array& arrval);
  Variant(const Resource& resval);
  Variant(const Object& objval);
  Variant(const Variant& val);

  template<typename T> Variant &operator=(const T &v);
};

These APIs together mean that a Variant may be initialized or assigned from any other variable type supported by userspace code. This becomes especially powerful when looking at Variant return types.
Variant HHVM_FUNCTION(example1_password, const String& guess) {
  if (guess.same(s_secret)) {
    return "Password accepted: A winner is you!";
  }
  return false;
}

These seemingly incompatible return types (const char* and bool) work because they are implicitly constructed into a Variant instance. Explicit types are generally preferred, because the IR can make better assumptions during optimization, but sometimes you just want your return values to be adaptable like that.

/* Introspection and Unboxing */
class Variant {
 public:
  bool isNull() const;
  bool isBoolean() const;
  bool isInteger() const;
  bool isDouble() const;
  bool isNumeric(bool checkString = false) const;
  bool isString() const;
  bool isArray() const;
  bool isResource() const;
  bool isObject() const;

  bool toBoolean() const;
  int64_t toInt64() const;
  double toDouble() const;
  DataType toNumeric(int64_t &ival, double &dval, bool checkString = false) const;
  String toString() const;
  Array toArray() const;
  Resource toResource() const;
  Object toObject() const;
};

These APIs allow pulling a concrete data type out of a Variant so they can be operated on directly. Note that the to*() APIs will convert the type if necessary, even if the is*() call returned false, but that not all conversions make sense. Let's make a contrived example by implementing a simplistic var_dump():

<<__Native>>
function example1_var_dump(mixed $value): void;

void HHVM_FUNCTION(example1_var_dump, const Variant &value) {
  if (value.isNull()) {
    g_context->write("null\n");
    return;
  }
  if (value.isBoolean()) {
    g_context->write("bool(");
    g_context->write(value.toBoolean() ? "true" : "false");
    g_context->write(")\n");
    return;
  }
  if (value.isInteger()) {
    g_context->write("int(");
    g_context->write(String(value.toInt64()));
    g_context->write(")\n");
    return;
  }
  // etc...
}
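
If the implicit construction and the is*()/to*() split still feel a bit magic, here's a much-reduced sketch of the idea in plain C++. MiniVariant and check_password() are hypothetical names, and this holds only null, bool, and string; the real Variant is far more involved. The conversion rules in toBoolean() and toString() are my approximation of PHP's usual ones, not something lifted from HHVM.

```cpp
#include <cassert>
#include <string>

// Hypothetical, tiny stand-in for a Variant: a tagged value with
// non-explicit constructors, so each `return` below converts implicitly.
class MiniVariant {
 public:
  enum class Kind { Null, Boolean, String };

  MiniVariant() : m_kind(Kind::Null), m_bool(false) {}
  MiniVariant(bool b) : m_kind(Kind::Boolean), m_bool(b) {}
  MiniVariant(const char* s)
      : m_kind(Kind::String), m_bool(false), m_str(s) {}

  // is*() reports what is actually held.
  bool isNull() const { return m_kind == Kind::Null; }
  bool isBoolean() const { return m_kind == Kind::Boolean; }
  bool isString() const { return m_kind == Kind::String; }

  // to*() converts even when the held type differs.
  bool toBoolean() const {
    if (m_kind == Kind::String) return !m_str.empty() && m_str != "0";
    return m_bool;
  }
  std::string toString() const {
    if (m_kind == Kind::Boolean) return m_bool ? "1" : "";
    return m_str;
  }

 private:
  Kind m_kind;
  bool m_bool;
  std::string m_str;
};

// Two "incompatible" return types, one declared return type.
MiniVariant check_password(const std::string& guess) {
  if (guess == "secret") {
    return "Password accepted";  // const char* -> MiniVariant
  }
  return false;                  // bool -> MiniVariant
}
```

A caller can then branch on isString() versus isBoolean(), or just call toBoolean() and let the value convert.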

The code so far is at commit: 8e82e3e416

What's next...

We'll continue in the next installment by exploring Objects. These get a bit more complicated with the introduction of visibility, properties, constants, inheritance, and internal data structures.

HHVM Extension Writing, Part I

I've written a number of blogposts and even one book over the years on writing extensions for PHP, but very little documentation is available for writing HHVM extensions.  This is kinda sad since I built a good portion of the latter's API. Let's fix that, starting with this article.

All the code in this post (and its followups) will be found at https://github.com/sgolemon/hhvm-extension-writing.

Setting up a build environment

The first thing you need to do is get all the dependencies in place.  I'm going to start from a clean install of Ubuntu 14.04 LTS and use the prebuilt HHVM binaries for Ubuntu. Other distros should work (with varying degrees of success), but sorting them all out is beyond the scope of this blog entry.

First, let's trust HHVM's package repo and pull in its package list. Then we can install the hhvm-dev package to pull in the binary along with all needed headers.

$ wget -O - http://dl.hhvm.com/conf/hhvm.gpg.key | \
  sudo apt-key add -
$ echo deb http://dl.hhvm.com/ubuntu trusty main | \
  sudo tee /etc/apt/sources.list.d/hhvm.list
$ sudo apt-get update

$ sudo apt-get install hhvm-dev

Creating an extension skeleton

The most basic, no-nothing extension imaginable requires two files. A C++ source file to declare itself, and a config.cmake file to describe what's being built. Let's start with the build file, which is a simple, single line:

HHVM_EXTENSION(example1 ext_example1.cpp)

This macro declares a new extension named "example1" with a single source file named "ext_example1.cpp". If we had multiple source files, we'd delimit them with a space (HHVM_EXTENSION(example1 ext_example1.cpp ex1lib.cpp utilex1.cpp etc.cpp))

The source file has a little more boilerplate, but fortunately it's also just a handful of lines:

#include "hphp/runtime/base/base-includes.h"

namespace HPHP {

class Example1Extension : public Extension {
 public:
  Example1Extension(): Extension("example1", "1.0") {}
} s_example1_extension;

HHVM_GET_MODULE(example1);

} // namespace HPHP

All we're doing here is exposing a specialization of the "Extension" class which gives itself the name "example1". It doesn't do anything more than declare itself into the runtime environment. Those familiar with PHP extension development can think of this as the zend_module_entry struct, with all callbacks and the function table set to NULL.
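
The `} s_example1_extension;` trick is just a global object whose constructor runs when the library is loaded. Here's a minimal sketch of that self-registration pattern in plain C++, with no HHVM headers involved. MiniExtension, registry() and mini_extension_loaded() are hypothetical names, and I'm assuming for illustration that registration happens in the base class constructor; the real runtime's bookkeeping may well differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the runtime's table of loaded extensions.
// A function-local static sidesteps static initialization order issues.
std::map<std::string, std::string>& registry() {
  static std::map<std::string, std::string> table;
  return table;
}

// Hypothetical stand-in for an Extension base class: constructing an
// instance records its name and version in the registry.
class MiniExtension {
 public:
  MiniExtension(const std::string& name, const std::string& version) {
    registry()[name] = version;
  }
  virtual ~MiniExtension() {}
};

// Same shape as Example1Extension: declare the class and a global
// instance in one statement, so registration happens at load time.
class MiniExample1 : public MiniExtension {
 public:
  MiniExample1() : MiniExtension("example1", "1.0") {}
} s_mini_example1;

// Rough analogue of calling extension_loaded() from userspace.
bool mini_extension_loaded(const std::string& name) {
  return registry().count(name) > 0;
}
```

Note that the lookup key is the name handed to the constructor ("example1"), not a filename.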

The code so far is at commit: 214e2e7be6

Building an extension and testing it out

To build an extension, first run hphpize to generate a CMakeLists.txt file, then cmake . to generate a Makefile from that. Finally, issue make to actually build it. You should see output like the following:

$ hphpize
** hphpize complete, now run 'cmake . && make` to build
$ cmake .
-- Configuring for HHVM API version 20140829
-- Configuring done
-- Generating done
-- Build files have been written to: /home/username/hhvm-ext-writing
$ make
Scanning dependencies of target example1
[100%] Building CXX object CMakeFiles/example1.dir/ext_example1.cpp.o
[100%] Built target example1

Now we're ready to load it into our runtime. Start by creating a simple test file, tests/loaded.php:

<?php
var_dump(extension_loaded('example1'));

Then fire up hhvm: hhvm -d extension_dir=. -d hhvm.extensions[]=example1.so tests/loaded.php and you should see bool(true).

Adding functionality

The simplest way to add functionality is to write some Hack code. You could write straight PHP code, but you'll see in a few moments why Hack is preferable for extension systemlibs. Let's introduce a new file: ext_example1.php and link it into our project:
<?hh

function example1_hello() {
  echo "Hello World\n";
}

Then load it in during the moduleInit() (aka MINIT) phase:
class Example1Extension : public Extension {
 public:
  Example1Extension(): Extension("example1", "1.0") {}
  void moduleInit() override {
    loadSystemlib();
  }
} s_example1_extension;
And finally, add the following to your config.cmake file to embed it into the .so, from which HHVM can load it at runtime:
HHVM_EXTENSION(example1 ext_example1.cpp)
HHVM_SYSTEMLIB(example1 ext_example1.php)

Rebuild your extension according to the instructions above, then try it out:
$ hhvm -d extension_dir=. -d hhvm.extensions[]=example1.so tests/hello.php
Hello World

The code so far is at commit: 54782f157d

Bridging the gap

If all you wanted to do was write PHP code implementations, you could create a normal library for that. Extensions are for bridging PHP script into native code, so let's do that. Make a new entry in your systemlib file using some Hack-specific syntax:
<<__Native>>
function example1_greet(string $name, bool $hello = true): void;

The <<__Native>> UserAttribute tells HHVM that this is the declaration for an internal function. The Hack types tell the runtime which C++ types to pair them with, and the usual rules for default arguments apply.

To pair it with an internal implementation, we'll add the following to ext_example1.cpp:

void HHVM_FUNCTION(example1_greet, const String& name, bool hello) {
  g_context->write(hello ? "Hello " : "Goodbye ");
  g_context->write(name);
  g_context->write("\n");
}


And link it to the systemlib by adding HHVM_FE(example1_greet); to moduleInit().

As you can see, internal functions are declared with the HHVM_FUNCTION() macro, where the first argument is the name of the function as exposed to userspace, and the remaining arguments map to the userspace function's argument signature. The argument types map according to the following table:

Hack type    C++ type (argument)    C++ type (return type)
void         N/A                    void
bool         bool                   bool
int          int64_t                int64_t
float        double                 double
string       const String&          String
array        const Array&           Array
resource     const Resource&        Resource
object       const Object&          Object
ClassName    const Object&          Object
mixed        const Variant&         Variant
mixed&       VRefParam              N/A
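
If the macro business feels like magic, the following standalone sketch shows one way a function-declaring macro and a name-based function table can fit together. The SKETCH_* macros, functionTable(), output(), and callByName() are hypothetical names invented for this illustration; std::string stands in for String, and a string buffer stands in for g_context->write(). HHVM's real macros and registration are more involved.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-ins, loosely modelled on how HHVM_FUNCTION() and
// HHVM_FE() are used above. These are NOT HHVM's definitions.
#define SKETCH_FN(fn) f_##fn
#define SKETCH_FUNCTION(fn, ...) SKETCH_FN(fn)(__VA_ARGS__)
#define SKETCH_FE(fn) functionTable()[#fn] = &SKETCH_FN(fn)

// For simplicity this table only holds functions with one signature.
typedef void (*GreetFn)(const std::string&, bool);

inline std::map<std::string, GreetFn>& functionTable() {
  static std::map<std::string, GreetFn> table;
  return table;
}

// Captures output so it can be inspected.
inline std::ostringstream& output() {
  static std::ostringstream out;
  return out;
}

// Expands to: void f_example1_greet(const std::string& name, bool hello)
void SKETCH_FUNCTION(example1_greet, const std::string& name, bool hello) {
  output() << (hello ? "Hello " : "Goodbye ") << name << "\n";
}

// Looks a function up by its userspace name and calls it.
// Returns false if no such function has been registered.
inline bool callByName(const std::string& fn, const std::string& name,
                       bool hello) {
  std::map<std::string, GreetFn>::const_iterator it = functionTable().find(fn);
  if (it == functionTable().end()) return false;
  it->second(name, hello);
  return true;
}
```

After SKETCH_FE(example1_greet); has run (the analogue of the line added to moduleInit()), callByName("example1_greet", "World", false) appends "Goodbye World" and a newline to the buffer.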


Since this is Hack syntax, you may declare the types as soft (with an @) or nullable (with a question mark), but since these types are not limited to a primitive, they need to be represented internally as the more generic const Variant& for arguments or Variant for return types (essentially, mixed).

Reference arguments use the VRefParam type noted above, as in the following example:

<<__Native>>
function example1_life(mixed &$meaning): void;
void HHVM_FUNCTION(example1_life, VRefParam meaning) {
  meaning = 42;
}

The code so far is at commit: df16aca35e

Constants

Constants, like any other bit of PHP, may be declared in the systemlib file. If they depend on some native value (such as a define from an external library), they may instead be declared in moduleInit() using the Native::registerConstant() template, as in the following:
const StaticString s_EXAMPLE1_YEAR("EXAMPLE1_YEAR");

class Example1Extension: public Extension {
 public:
  Example1Extension(): Extension("example1", "1.0") {}
  void moduleInit() override {
    Native::registerConstant<KindOfInt64>(s_EXAMPLE1_YEAR.get(), 2015);
  }
} s_example1_extension;
The use of this function should be mostly obvious, in that it takes the name of a constant as a StringData* (which comes from a StaticString's .get() accessor), and a value appropriate to the constant's type. The type, in turn, is given as the function's template parameter and is one of the DataType enum values. The kinds correspond roughly to the basic PHP data types.

DataType              C++ type
KindOfNull            N/A
KindOfBoolean         bool
KindOfInt64           int64_t
KindOfDouble          double
KindOfStaticString    StringData*

The code so far is at commit: ad9618ac8c
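
For the curious, here is a rough standalone sketch of how a template keyed on an enum value can pick the C++ type of a constant, in the spirit of the registerConstant() call above. It borrows a few of the DataType names from the table for familiarity, but ValueOf, Constant, makeConstant(), constantTable(), and this registerConstant() are all stand-ins written for this example and are not HHVM's implementation.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// A cut-down imitation of the DataType enum (only three kinds).
enum DataType { KindOfBoolean, KindOfInt64, KindOfDouble };

// Maps each DataType value to the C++ type that holds its value.
template <DataType DT> struct ValueOf;
template <> struct ValueOf<KindOfBoolean> { typedef bool type; };
template <> struct ValueOf<KindOfInt64> { typedef std::int64_t type; };
template <> struct ValueOf<KindOfDouble> { typedef double type; };

// A tagged value: `kind` says which of the other fields is meaningful.
struct Constant {
  DataType kind;
  bool b;
  std::int64_t i;
  double d;
};

inline Constant makeConstant(bool v) {
  Constant c = {KindOfBoolean, v, 0, 0.0};
  return c;
}
inline Constant makeConstant(std::int64_t v) {
  Constant c = {KindOfInt64, false, v, 0.0};
  return c;
}
inline Constant makeConstant(double v) {
  Constant c = {KindOfDouble, false, 0, v};
  return c;
}

// Hypothetical constant table, keyed by constant name.
inline std::map<std::string, Constant>& constantTable() {
  static std::map<std::string, Constant> table;
  return table;
}

// The template parameter fixes the value type, so passing 2015 with
// KindOfInt64 converts it to int64_t before it is stored.
// Returns false if a constant with that name already exists.
template <DataType DT>
bool registerConstant(const std::string& name,
                      typename ValueOf<DT>::type value) {
  return constantTable()
      .insert(std::make_pair(name, makeConstant(value)))
      .second;
}
```

Calling registerConstant<KindOfInt64>("EXAMPLE1_YEAR", 2015) stores an entry tagged KindOfInt64 with the value 2015, and a second registration under the same name is refused.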

What's next...

In the next part of this series, we'll look at the String, Array, and Variant types. Part III will continue with Objects, then Resources in Part IV.

Jun 11, 2009

Slashdotted; Post-mortem

About a day and a half ago, Reddit user stderr posted a link to the PHP documentation showing that GOTO will be a part of the language as of version 5.3. Somewhere in the reddit comments for this post, a link was pasted to an entry on this blog where I announced the feature being added years earlier. This naturally brought out the usual flame wars that circle around something like GOTO and drove some new traffic to my blog. No big deal, my server is pretty low-traffic, it can handle a few extra hits.


Within a few hours, burghler had browsed through other entries on my blog, finding what was, at the time, the most recent entry, about my friend's experience burying her mother. Just like the first reddit post, which had made the front page, this one also had a somewhat incendiary title.


Okay, more than a little incendiary, but I'll get to that in a moment...

The lesson

You would think, given that I'm the Architect for Yahoo WebSearch Front-end Engineering, that I would know something about configuring a server to not fall over under load. And in fact, I spend a good portion of my time on making sure that unexpected traffic spikes aren't enough to make a server get overloaded and trigger a chain-reaction of front-ends falling over. I really have no excuse for not preparing my server for what happened.


When I woke up Wed morning, I found that my server just didn't seem to respond to SSH or HTTP attempts. A reboot request didn't help, and the colo folks insisted that the server was simply running slow, but it was running. So I left my ssh connection attempt running and eventually I did get a login prompt. Several pained minutes later I managed to get an iptables rule in place to block off the flood of traffic (quicker than trying to stop the webserver). Suddenly, the CPU load was gone! Turns out I had my MaxClients setting much too high, and after a certain degree of concurrency, enough web-server children had spawned off to use up the available memory, which triggered disk swap, and made the CPU load 10x worse. Again, I know this effect exists, I really should have set up this server better.
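
The arithmetic behind a saner MaxClients is simple enough to sketch. The function below is a back-of-the-envelope estimate only: safeMaxClients() is a hypothetical helper, and the inputs (total RAM, memory reserved for everything else, and per-child footprint) are numbers you would have to measure on your own box, not values from this server.

```cpp
#include <cstdint>

// Estimates how many web-server children fit in RAM without swapping.
// All sizes are in megabytes. Returns 0 when the inputs leave no room
// (or when the per-child size is not positive).
inline std::int64_t safeMaxClients(std::int64_t totalRamMb,
                                   std::int64_t reservedMb,
                                   std::int64_t perChildMb) {
  if (perChildMb <= 0 || totalRamMb <= reservedMb) {
    return 0;
  }
  // Integer division rounds down, which errs on the side of fewer children.
  return (totalRamMb - reservedMb) / perChildMb;
}
```

With 2048 MB of RAM, 512 MB reserved for the OS and other daemons, and 24 MB per child, safeMaxClients(2048, 512, 24) comes out to 64. Setting MaxClients above that kind of figure is what invites the swap spiral described above.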


A few tweaks to the config later and I got my server running smoothly. I also took the time to re-run my access log statistics. Of the approximately 3 years of stats I've got, a full 2% of my hits were logged yesterday. Impressive reddit.... you win this round...

[Figure: two calendar heat maps of daily hits, with one row per month (Jan through Dec) and one column per day of the month (01 through 31). Bright squares indicate heavy traffic, dark squares indicate low traffic. The first panel is titled "Prior to yesterday's traffic" and the second "Current stats".]

My flickr views...



Clarifications

Despite the title of the reddit entry, I have nothing against Christians. Further, I count myself as one. So if you really want to take a piss at people who are calling Christians evil, don't aim at me. Thanks.


Moreover, I don't think those people were evil, or even horrible (though I did use that word in the heat of the moment). A loved one had just died, and everyone was in pain. Death SUCKS, and it was a bad situation no matter how you slice it. I don't hate those people for trying to erase my friend though, I pity them. I pity the fact that they turned down the chance to mourn the passing of their loved one by crying on another loved one's shoulder. I pity the fact that they didn't get nearly the closure that my friend did. I pity them for willingly becoming victims of their own grief.


The rest is between them and God.