Tags:

At the most recent jQuerySF conference, Mike Sherov and I did a joint talk on the topic of JavaScript Syntax Tree: Demystified. The highlight of the talk was the demo from Mike as he showed how to fix coding style violations automatically.

The trick is to use JSCS and its latest features. If you want to follow along, here is a step-by-step recipe.

First, you need to have JSCS installed. This is as easy as:

npm install -g jscs

Let’s pick an example project. For this illustration, I use my kinetic scrolling demo:

git clone https://github.com/ariya/kinetic.git
cd kinetic

Now you want to let JSCS analyze all the JavaScript files in the project and deduce the most suitable code style:

jscs --auto-configure .

After a few seconds, JSCS will present the list of code style presets, each with its associated number of errors, computed from your JavaScript code. If you already have a preset in mind, you can choose it. An alternative is to pick the one with the fewest violations, as it indicates that your code already gravitates towards that preset.

Once you choose a preset, JSCS will ask you a couple of self-explanatory questions. At the end of this step, the configuration file .jscsrc will be created for you. With the configuration in place, the real magic happens. You just need to invoke JSCS this way:

jscs -x .

then it will automatically reformat your JavaScript. Double check by looking at the changes and you will see that your code style now follows the specified preset.

With JSCS, you can comfortably ensure code style consistency throughout your project!

With a complex application, it is often convenient to have a function that returns more than one value. There are many different ways to achieve this in C++, from using a structure to taking advantage of the C++11 tuple class template.

The obvious choice, returning an object, seems a bit overkill in many cases. First, you need to declare the structure. Quite often the structure also needs to be available to the consumer, hence you have to expose it to the outside world. Constructing the instance is yet another ceremonial activity nobody likes to carry out unnecessarily.
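To make the ceremony concrete, here is a minimal sketch of the structure-based approach. The names PersonInfo and findPersonInfo are invented for this illustration only.

```cpp
#include <string>

// Hypothetical structure (not from the original examples): it bundles
// the values that the function wants to return.
struct PersonInfo {
    std::string name;
    int age;
};

// The structure must be declared, and visible to every caller, before
// the function can return it.
PersonInfo findPersonInfo() {
    PersonInfo info;
    info.name = "Joe Sixpack";
    info.age = 42;
    return info;
}
```

A caller would write PersonInfo p = findPersonInfo(); and then read p.name and p.age. The named fields are readable, but the extra type is exactly the overhead discussed above.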

Fortunately, if the function is supposed to return only two values, std::pair is to the rescue. Most likely, make_pair will be used to construct the pair. Each element of the pair can be accessed using first and second, respectively. This is illustrated in the following example:

#include <iostream>
#include <string>
#include <utility>

std::pair<std::string, int> findPerson() {
    return std::make_pair("Joe Sixpack", 42);
}

int main(int, char**) {
    std::pair<std::string, int> person = findPerson();
    std::cout << "Name: " << person.first << std::endl;
    std::cout << "Age: " << person.second << std::endl;
    return 0;
}

What if you need more than just two values? Well, obviously std::pair is not fit for the job. In this case, we can leverage boost::tuple from the Boost Tuple library. If you are already using std::pair, it is very easy to get familiar with boost::tuple. A tuple can be created using make_tuple, and its elements are accessed using get<n>, where n denotes the element index.

#include <iostream>
#include <string>
#include <boost/tuple/tuple.hpp>

boost::tuple<std::string, std::string, int> findPerson() {
    return boost::make_tuple("Joe", "Sixpack", 42);
}

int main(int, char**) {
    boost::tuple<std::string, std::string, int> person = findPerson();
    std::cout << "Name: " << person.get<0>() << " "
        << person.get<1>() << std::endl;
    std::cout << "Age: " << person.get<2>() << std::endl;
    return 0;
}

With C++11, there is no need to rely on a third-party library anymore since std::tuple is already available. With minor tweaks, the previous Boost example will look like this in C++11. Note also the use of auto, which saves us from unnecessary verbosity. The compiler knows the return type of findPerson, so there is no need for a lengthy type declaration anymore.

#include <iostream>
#include <string>
#include <tuple>

std::tuple<std::string, std::string, int> findPerson() {
    return std::make_tuple("Joe", "Sixpack", 42);
}

int main(int, char**) {
    auto person = findPerson();
    std::cout << "Name: " << std::get<0>(person) << " " <<
        std::get<1>(person) << std::endl;
    std::cout << "Age: " << std::get<2>(person) << std::endl;
    return 0;
}

While we are at it, we might as well mention std::tie, which makes it easy to unpack a tuple (similar to ES6 destructuring). It is a convenient alternative to element access using get. The code fragment below demonstrates its usage.

int main(int, char**) {
    std::string first_name, last_name;
    int age;
    std::tie(first_name, last_name, age) = findPerson();
    std::cout << "Name: " << first_name << std::endl;
    return 0;
}
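When only some of the elements are needed, std::tie can be combined with std::ignore, which is also provided by the <tuple> header. The following is a small sketch rather than part of the original example; the helper findAge is a made-up name.

```cpp
#include <string>
#include <tuple>

// Same shape as the findPerson() used in the examples above.
std::tuple<std::string, std::string, int> findPerson() {
    return std::make_tuple("Joe", "Sixpack", 42);
}

// Hypothetical helper: only the age is of interest here, so the two
// name elements are discarded with std::ignore.
int findAge() {
    int age = 0;
    std::tie(std::ignore, std::ignore, age) = findPerson();
    return age;
}
```

This avoids declaring throwaway variables just to satisfy the unpacking.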

From your own experience, which of these techniques do you like and why do you favor it?


There are various hosted continuous integration services out there that you can use for your Node.js projects, from Travis CI to drone.io and many others. If you feel adventurous or you are always fascinated by a DIY solution (for whatever reason), it is apparently quite easy to set up your own CI system quickly using Docker and TeamCity.

As an easy-to-use continuous integration system, TeamCity offers two free solutions for you: the Professional Server license for up to 20 build configurations or the Open Source license for your open-source projects. This is usually sufficient to get you started. Also, per the usual server-agent architecture, we will run the TeamCity server and agent in two separate containers. This is very similar to my previous blog post on TeamCity installation using Docker, with a minor tweak.

First, you need a machine for the server. This could be a physical machine, a virtual machine, or even a VPS. For a hassle-free setup, sign up for either Vultr or Digital Ocean (note: my affiliate links). Make sure you evaluate the system requirements to run the server (e.g. 2 cores and 2 GB RAM will be ideal).

On this machine, Docker must be installed properly. A useful quick test:

sudo docker run -it ariya/centos7-oracle-jre7 cat /etc/redhat-release

should show something like:

CentOS Linux release 7.0.1406 (Core)

Once Docker is there, starting TeamCity server is as easy as:

sudo docker run -dt --name teamcity_server -p 8111:8111 \
  ariya/centos7-teamcity-server

This is using a prepared container I have created called ariya/centos7-teamcity-server. Note that the container supports volume mapping of /data/teamcity. You definitely need to do this if you want to persist your TeamCity projects and other settings. Here is a fancier way to invoke the server where the data is stored on the host system under /var/data/teamcity and with automatic restart in case the server dies.

sudo docker run -dt --name teamcity_server --restart=always -p 8111:8111 \
  -v /var/data/teamcity:/data/teamcity \
  ariya/centos7-teamcity-server

Also, if you are using a firewall, make sure to accept connections on port 8111. With iptables:

sudo iptables -A INPUT -p tcp --dport 8111 -j ACCEPT
sudo service iptables save

Once the server is running, visit the site (on port 8111) using your web browser. This allows you to initialize and configure TeamCity server. In a minute or two, it should be ready to use.

You can now start creating your CI project; refer to the excellent TeamCity documentation for details. For the build process itself, it is quite common to invoke npm twice, first to install the dependencies and then to run the tests. This is illustrated in the following screenshot.

While it is sufficient to use the command-line runner to invoke e.g. npm test, if you want to be a bit more sophisticated, you can use a customized runner such as TeamCity.Node.

Of course, the project cannot be executed right now because the server does not have any connected build agents yet. Starting an agent is also extremely straightforward, as I have already prepared another container for that, ariya/centos7-teamcity-agent-nodejs. This container is already equipped with Node.js 0.10 and npm 1.3.

sudo docker run -e TEAMCITY_SERVER=http://$TEAMCITY_HOST:8111 -dt -p 9090:9090 \
  ariya/centos7-teamcity-agent-nodejs

In the above example, you need to supply the IP address of your server with the environment variable TEAMCITY_HOST. Again, the firewall needs to accept connections on port 9090.

It is of course possible to run this agent on the same host as the server, particularly if you have a beefy machine. In this case, you need to use the Docker IP address of the server container:

export TEAMCITY_HOST=$(sudo docker inspect --format \
  '{{ .NetworkSettings.IPAddress }}' teamcity_server)

It takes a while for the agent to register itself with the server. Even then, the agent is not immediately available. First, you need to authorize it so that the server will trust the agent and start dispatching build tasks to it. After that, you can start running your project.

Thanks to Docker, everything could be done in 10 minutes or less. Have fun with all the tests!


Little did I know that the start of my adventure with Esprima three years ago would result in something beyond my expectations. While the syntax tree format used by Esprima is not original (see SpiderMonkey Parser API), this de facto format has gained a lot of traction, since it provoked a Cambrian explosion of composable JavaScript language tooling: a code coverage tool, a style checker, a delta debugger, a syntax autocompleter, a complexity visualizer, and many more. Mind you, this AST format is far from perfect, which is why some of us at Shape Security are taking a journey to figure out a better format.

Throughout its development, Esprima has also been used as a playground for a rigorous workflow. For example, performance is always important, which is why a benchmark system was implemented early on. There were numerous JavaScript optimization tricks (fixed object shape, profile-guided code shuffling, object-in-a-set) which I discovered via a few interesting investigations. Esprima also enforces hard thresholds on certain metrics, such as cyclomatic complexity and test coverage. Speaking of tests, I consider Esprima’s test suite (~800 unit tests) its crown jewel. It is not uncommon to hear that this collection of tests is being used to assist the development of other similar parsers, whether written in JavaScript or other languages.

After being in the wild for a while, Esprima started to attract more contributions, not only in terms of new features but also for troubleshooting defects, solving performance challenges, and other less glamorous tasks. The growth (600 dependent packages and 3 million downloads per month on npmjs) needs to be anticipated as well. This is why, after talking to Dave Methvin some time ago, I felt confident that the jQuery Foundation would be a good new umbrella for the project. And that was how the adoption was initiated and finally completed a few weeks ago.

At the same time, JavaScript continues to evolve. The next edition, ECMAScript 6 (officially to be called ECMAScript 2015), has its specification frozen, with some JavaScript engines (SpiderMonkey, V8, Chakra, JavaScriptCore) already starting to support a few selected features. Esprima anticipated this by creating the special harmony branch in early 2012. In fact, that branch served as the basis of a (now defunct) transpiler called Harmonizr, back when writing a transpiler was not considered cool yet. Meanwhile, more folks (particularly Facebook engineers and some others) continue to enhance this branch. It is being used to drive Facebook’s JavaScript infrastructure (see JSTransform, Recast, Regenerator, JSX), among others, for its ES6 adoption. Still, this harmony branch (despite some unofficial third-party releases) is considered experimental and should not be used in production.

This brings us to the most recent 2.0 release. Among others, this release starts to include carefully selected ES6 features (e.g. arrow function, default parameter, method definition). This is to facilitate the migration of downstream language tools, per the original plan outlined several months ago in the mailing-list:

The new master, which bears the version 2.x, will start to introduce ECMAScript 6 features. We will do it piecemeal, taking features which are known to be more or less stabilized in the most recent draft spec. In a few cases, this is a matter of bringing in the existing implementation from the experimental harmony branch.

Thanks to the wonderful community, these three years have been fantastic. Let’s continue to build amazing tools!


Some time ago, I came up with a bar joke involving SMTP. Since I have needed to explain it a couple of times, I thought I would just write it down as a blog post for future reference.

The joke goes like this (as a tweet):

The key thing here is the EHLO part. To explain this, let me show you a typical conversation between an SMTP server (e.g. from your mail provider) and an SMTP client (e.g. your email application). If you want to follow along, there is a nice trick. Sign up at Mailtrap for a test account (you can authenticate using your GitHub credentials) and you will have a test server to play with.

Start by connecting to the server using telnet:

telnet mailtrap.io 2525
Trying 54.85.222.127...
Connected to mailtrap.io.
Escape character is '^]'.
220 mailtrap.io ESMTP ready

At this moment, you are supposed to greet the server (see RFC 821, Section 3.5 on Opening and Closing) using the HELO command:

HELO mailtrap.io
250 mailtrap.io

If you carefully read the above RFC 821, it is obvious that SMTP commands are 4-letter words. Thus, MAIL is for initiating a transaction, NOOP is for doing nothing, HELP is for showing some instructions, and so on.

As SMTP grew in functionality, an extension mechanism was established so that the client can recognize certain extra features of the server and perhaps leverage them. Rather than inventing a completely different opening command, EHLO was introduced (see RFC 5321, Section 3.2 on Client Initialization). This new command lets the client find out which service extensions the server supports. For example, running EHLO on Mailtrap gives us:

EHLO mailtrap.io
250-mailtrap.io
250-SIZE 5242880
250-PIPELINING
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-DSN
250-AUTH PLAIN LOGIN CRAM-MD5
250 STARTTLS

which basically lists some service extensions supported by Mailtrap’s SMTP server.
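To illustrate how a client might read such a reply, here is a rough C++ sketch. It relies only on the layout visible in the transcript above: every line starts with a three-digit code, a hyphen after the code means more lines follow, and a space marks the last line. The function name parseEhloReply is invented for this example, and the code does no validation of the reply code itself.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the text part of each line of a
// multi-line SMTP reply such as the EHLO response shown above.
std::vector<std::string> parseEhloReply(const std::string& reply) {
    std::vector<std::string> lines;
    std::istringstream input(reply);
    std::string line;
    while (std::getline(input, line)) {
        // SMTP lines end with CRLF; drop the trailing CR if present.
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);
        }
        // Skip anything too short to hold "250" plus a separator.
        if (line.size() < 4) {
            continue;
        }
        lines.push_back(line.substr(4));
        // A space after the code marks the final line of the reply.
        if (line[3] == ' ') {
            break;
        }
    }
    return lines;
}
```

Given the Mailtrap transcript, the first returned entry would be the server name and the remaining entries the advertised extensions, such as SIZE 5242880 and STARTTLS.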

Practically all modern email clients prefer EHLO over HELO. It is so widespread that it inspired the bar joke and its EHLO style of greeting. That was fun, right?

QUIT


In some cases, an instance of a C++ class should not be copied at all. There are three ways to prevent such an object copy: keeping the copy constructor and assignment operator private, using a special non-copyable mixin, or deleting those special member functions.

A class that represents a wrapper stream of a file should not have its instance copied around, as this would cause confusion in the handling of the actual I/O system. In a similar spirit, if an instance holds a unique private object, copying the pointer does not make sense. A somewhat related, though not identical, problem is object slicing.

The following illustration demonstrates a simple class Car that is supposed to have a unique owner, an instance of Person.

class Car {
public:
  Car(): owner(0) {}
  void setOwner(Person *o) { owner = o; }
  Person *getOwner() const { return owner; }
  void info() const;
private:
  Person *owner;
};

For this purpose, the implementation of Person is as simple as:

struct Person {
  std::string name;
};

To show the issue, a helper function info() is implemented as follows:

void Car::info() const
{
  if (owner) {
    std::cout << "Owner is " << owner->name << std::endl;
  } else {
    std::cout << "This car has no owner." << std::endl;
  }
}

From this example, it is obvious that an instance of Car must not be copied. In particular, another clone of a similar car should not automatically belong to the same owner. In fact, running the subsequent code:

  Person joe;
  joe.name = "Joe Sixpack";
 
  Car sedan;
  sedan.setOwner(&joe);
  sedan.info();
  Car anotherSedan = sedan;
  anotherSedan.info();

will give the output:

Owner is Joe Sixpack
Owner is Joe Sixpack

How can we prevent this accidental object copy?

Method 1: Private copy constructor and copy assignment operator

A very common technique is to declare both the copy constructor and copy assignment operator to be private. We do not even need to implement them. The idea is that any attempt to perform a copy or an assignment will provoke a compile error.

In the above example, Car will be modified to look like the following. Take a close look at the two additional private members of the class.

class Car {
public:
  Car(): owner(0) {}
  void setOwner(Person *o) { owner = o; }
  Person *getOwner() const { return owner; }
  void info() const;
private:
  Car(const Car&);
  Car& operator=(const Car&);
  Person *owner;
};

Now if we try again to assign an instance of Car to a new one, the compiler will complain loudly:

example.cpp:35:22: error: calling a private constructor of class 'Car'
  Car anotherSedan = sedan;
                     ^
example.cpp:22:3: note: declared private here
  Car(const Car&);
  ^
1 error generated.

If writing two additional lines containing repetitive names is too cumbersome, a macro could be utilized instead. This is the approach used by WebKit, see its WTF_MAKE_NONCOPYABLE macro from wtf/Noncopyable.h (do not be alarmed, in the context of WebKit source code, WTF here stands for Web Template Framework). Chromium code, as shown in the file base/macros.h, distinguishes between copy constructor and assignment, denoted as DISALLOW_COPY and DISALLOW_ASSIGN macros, respectively.
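As a rough idea of what such a macro could look like, here is a minimal sketch. The name MAKE_NONCOPYABLE is made up for this illustration; it is not the actual definition used by WebKit or Chromium.

```cpp
// Hypothetical macro for illustration only. It declares, but never
// defines, a private copy constructor and copy assignment operator.
#define MAKE_NONCOPYABLE(ClassName) \
    private: \
        ClassName(const ClassName&); \
        ClassName& operator=(const ClassName&)

struct Person;

class Car {
    MAKE_NONCOPYABLE(Car);
public:
    Car(): owner(0) {}
    void setOwner(Person *o) { owner = o; }
    Person *getOwner() const { return owner; }
private:
    Person *owner;
};
```

The macro ends in the private section, so the class switches back to public right after using it. Any attempt to copy a Car outside the class still fails to compile, just as with the hand-written declarations.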

Method 2: Non-copyable mixin

The idea above can be extended to create a dedicated class whose sole purpose is to prevent object copying. It is often called NonCopyable and is typically used as a mixin. In our example, the Car class can then be derived from this NonCopyable.

Boost users may already be familiar with boost::noncopyable, the Boost flavor of the said mixin. A conceptual, self-contained implementation of that mixin will resemble something like the following:

class NonCopyable
{
  protected:
    NonCopyable() {}
    ~NonCopyable() {}
  private: 
    NonCopyable(const NonCopyable &);
    NonCopyable& operator=(const NonCopyable &);
};

Our lovely Car class can be written as:

class Car: private NonCopyable {
public:
  Car(): owner(0) {}
  void setOwner(Person *o) { owner = o; }
  Person *getOwner() const { return owner; }
  void info() const;
private:
  Person *owner;
};

Compared to the first method, using NonCopyable has the benefit of making the intention very clear. A quick glance at the first line of the class tells you right away that its instance is not supposed to be copied.

Method 3: Deleted copy constructor and copy assignment operator

For modern applications, there is less and less reason to get stuck with the above workarounds. Thanks to C++11, the solution becomes magically simple: just delete the copy constructor and assignment operator. Our class will look like this instead:

class Car {
public:
  Car(const Car&) = delete;
  void operator=(const Car&) = delete;
  Car(): owner(0) {}
  void setOwner(Person *o) { owner = o; }
  Person *getOwner() const { return owner; }
private:
  Person *owner;
};

Note that if you use boost::noncopyable mixin with a compiler supporting C++11, the implementation of boost::noncopyable also automatically deletes the said member functions.

With this approach, any accidental copy will result in a much friendlier error message:

example.cpp:34:7: error: call to deleted constructor of 'Car'
  Car anotherSedan = sedan;
      ^              ~~~~~
example.cpp:10:3: note: 'Car' has been explicitly marked deleted here
  Car(const Car&) = delete;
  ^
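If you want the compiler to keep guarding this property, one option (a sketch, assuming a C++11 compiler and a trimmed-down Car) is to assert it with the standard type traits from <type_traits>:

```cpp
#include <type_traits>

// Trimmed-down Car with its copy operations deleted, as in Method 3.
class Car {
public:
    Car() {}
    Car(const Car&) = delete;
    Car& operator=(const Car&) = delete;
};

// Evaluated at compile time: the build fails with the given message
// if Car ever becomes copyable again.
static_assert(!std::is_copy_constructible<Car>::value,
              "Car must not be copy-constructible");
static_assert(!std::is_copy_assignable<Car>::value,
              "Car must not be copy-assignable");
```

These checks cost nothing at run time and document the intent right next to the class.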

So, which of the above three methods is your favorite?