Joining the buzzword-laden crowd, here I’d like to say that PhantomJS goes to the cloud.

Back in the real world, this blog post shows how easy it is to build and deploy PhantomJS on a Linux instance of Amazon EC2. If you are not familiar with EC2, it’s the Elastic Compute Cloud platform from Amazon Web Services, essentially computing resources you can rent and scale up or down as needed. EC2 is quite popular; it powers various consumer-oriented services, from Amazon.com itself to Netflix.

Artwork credit: Internet cloud, Cartoon ghost.

There are two keys to the enablement of PhantomJS on EC2: the improved build workflow and the true headless feature. Assuming you have an instance running, it’s a matter of the following commands:

sudo yum install gcc-c++ git chrpath openssl-devel freetype-devel fontconfig-devel
git clone https://github.com/ariya/phantomjs.git && cd phantomjs
git checkout 1.5
./build.sh --jobs 1

That was tested in a 64-bit image with the following /etc/system-release:

Amazon Linux AMI release 2011.09

Note: With Amazon Linux AMI release 2012.03, make is also needed, i.e. sudo yum install make.

As expected, there is no need to have any sort of GUI to run PhantomJS. Pure headless.

For some tweaks and other notes, read the complete PhantomJS build instructions. Please note that the build may take a long time: the Linux Micro Instance (free usage tier) took about 28 hours to complete the entire process. You may also switch to another Linux image, or even build locally first on a beefy machine and then upload the resulting build. In fact, you could also use the included script deploy/package-linux-dynamic.sh to pack the build into a tarball and transport it somewhere else, e.g. to other instances. The package will be self-contained; the proof is in the result of running ldd on the binary:

linux-vdso.so.1 =>  (0x00007fff02dff000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fc9d266b000)
libQtWebKit.so.4 => /home/ec2-user/deploy/phantomjs/bin/../lib/libQtWebKit.so.4 (0x00007fc9d0cf3000)
libQtGui.so.4 => /home/ec2-user/deploy/phantomjs/bin/../lib/libQtGui.so.4 (0x00007fc9d01d6000)
libQtNetwork.so.4 => /home/ec2-user/deploy/phantomjs/bin/../lib/libQtNetwork.so.4 (0x00007fc9cfe92000)
libQtCore.so.4 => /home/ec2-user/deploy/phantomjs/bin/../lib/libQtCore.so.4 (0x00007fc9cf93e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fc9cf722000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fc9cf41c000)
libm.so.6 => /lib64/libm.so.6 (0x00007fc9cf197000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fc9cef81000)
libc.so.6 => /lib64/libc.so.6 (0x00007fc9cebe0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc9d2877000)
libfreetype.so.6 => /usr/lib64/libfreetype.so.6 (0x00007fc9ce942000)
libfontconfig.so.1 => /usr/lib64/libfontconfig.so.1 (0x00007fc9ce70c000)
librt.so.1 => /lib64/librt.so.1 (0x00007fc9ce504000)
libexpat.so.1 => /lib64/libexpat.so.1 (0x00007fc9ce2db000)
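
After moving the tarball to another instance, a quick way to double-check the listing above is to scan the ldd output for unresolved libraries. The following is only a minimal sketch: the function name missing_libs is hypothetical (it is not part of PhantomJS), and it relies solely on the "not found" marker that ldd prints for a dependency it cannot resolve.

```shell
#!/bin/sh
# Hypothetical helper (not part of PhantomJS): read `ldd` output on stdin
# and print the name of every library that ldd reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example usage, assuming the package was unpacked under ./phantomjs:
#   ldd ./phantomjs/bin/phantomjs | missing_libs
```

An empty result suggests that every dependency resolved, either from the bundled lib directory or from the system libraries.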

Now that you have something wandering around in the cloud, what can you do with it? There are a few example uses of PhantomJS which may inspire you. Personally, what I’d like to see appear someday are a screenshot service and a next-generation network monitoring service.

For the screenshot service, it’s necessary to combine PhantomJS with other web stack frameworks. Basically, PhantomJS is just the back-end; its screen capture will be driven by another piece of middleware. There are examples of such an implementation using Perl Dancer (Screenshot), Node.js (screenshot-app), Python/Flask (bookmarking service), and Play2 (screenshot-webservice). For an example of a commercial screenshot service, take a look at URL2PNG, which seems to capture the web page using the Linux version of Chromium 11 (that’s a release from a year ago). Using Chromium might give better rendering fidelity, although a headless optimized PhantomJS is guaranteed to be more resource/CPU friendly.
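
To make the back-end role concrete, here is a rough sketch of the kind of wrapper a middleware could shell out to. It assumes the rasterize.js example that ships with PhantomJS, which takes a URL and an output file name. The function name capture and the PHANTOMJS and RASTERIZE variables are made up for this illustration; using absolute paths for both is probably wise when the caller's PATH is not under your control.

```shell
#!/bin/sh
# Hypothetical wrapper for a screenshot middleware (names are assumptions).
# PHANTOMJS: path to the phantomjs binary (defaults to "phantomjs" on PATH).
# RASTERIZE: path to the rasterize.js example script from PhantomJS.
capture() {
    url=$1
    out=$2
    # Only accept plain web URLs, so callers cannot request local files.
    case "$url" in
        http://*|https://*) ;;
        *) echo "refusing non-HTTP URL: $url" >&2; return 2 ;;
    esac
    "${PHANTOMJS:-phantomjs}" "${RASTERIZE:-examples/rasterize.js}" "$url" "$out"
}

# Example usage:
#   PHANTOMJS=/home/ec2-user/deploy/phantomjs/bin/phantomjs
#   capture http://example.com/ example.png
```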

One underrated feature of PhantomJS is its ability to track network activity, i.e. every single network request and response along with the timing information. This is used in e.g. confess.js. An export to HAR format is almost trivial. Now imagine you build an advanced network traffic and monitoring service based on this feature. You can enrich the report with tons of useful (and useless) metrics and stats: HTTP header analysis, a detailed breakdown of asset sizes, a complete network waterfall diagram, optimization opportunities, and many more. Maybe even a screen capture of the monitored site. If your client focuses on interactive web pages or rich internet apps (RIA), you can even report the code coverage and full execution trace by leveraging my other project, Esprima.
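
As a small illustration of the asset-size breakdown, suppose a PhantomJS script has already logged one line per response in the form "content-type bytes". That log format is purely an assumption for this sketch, as is the function name size_by_type; content types are expected without parameters or spaces.

```shell
#!/bin/sh
# Hypothetical post-processing step (log format is an assumption):
# read "<content-type> <bytes>" lines on stdin and print the total
# number of bytes per content type, sorted by type name.
size_by_type() {
    awk '{ total[$1] += $2 } END { for (t in total) print t, total[t] }' | sort
}

# Example usage, assuming responses.log holds the lines described above:
#   size_by_type < responses.log
```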

Do I hear a startup?

  • Raynor08

    Awesome!!

  • David Tooley

    ./build.sh --jobs 1 is throwing the following error on Amazon Linux AMI release 2012.03:

    You don't seem to have 'make' or 'gmake' in your PATH. Cannot proceed.

    ./preconfig.sh: line 110: make: command not found

    cp: cannot stat `src/3rdparty/webkit/Source/JavaScriptCore/release/*': No such file or directory

    cp: cannot stat `src/3rdparty/webkit/Source/WebCore/release/*': No such file or directory

    ./build.sh: line 48: src/qt/bin/qmake: No such file or directory

    ./build.sh: line 49: make: command not found

    Any advice?

    • http://twitter.com/ariyahidayat Ariya Hidayat

      Maybe you need to install make manually. It’s strange; gcc-c++ usually pulls in make as a dependency.

      • David Tooley

        That was it. 

        sudo yum install make.

        Thanks.

        • http://twitter.com/ariyahidayat Ariya Hidayat

          Nice. I added an update to include this info.

  • bobbybrakes

    Awesome article, was just getting “sh: cannot execute binary file” on EC2 as I was trying to run the Mac OS X version. I built from source in about 90 minutes and am moving the bundle to another instance. Will report back with any issues!

  • Daniel Gleckler

    Hey there,

    Came across this post while trying to figure out a problem I’m having. Maybe you can help me.

    Is there any obvious reason I’ve overlooked why I would be able to run phantomjs just fine from the command line, but not from a PHP exec() call?

    • http://twitter.com/ariyahidayat Ariya Hidayat

      Probably means that PHP can’t find the executable.

  • Dan

    Hey Ariya,
    I’m doing screen captures with phantomjs via php exec. It works fine for the most part, but every now and then, it just slows down drastically on the same URL that worked fine before (60secs vs 2-3 secs). I have tried debug-mode but don’t see any errors or warnings. Any idea what might cause this?

    • Dan

      Just some additional info: This is on Amazon EC2 Linux.

      • http://twitter.com/ariyahidayat Ariya Hidayat

        Hard to tell. Use the mailing-list for further discussion.

  • Dan

    Getting build errors on Amazon Linux AMI release 2012.03: Any advice?
    make[1]: Leaving directory `/home/ec2-user/phantomjs/src'
    make[1]: Entering directory `/home/ec2-user/phantomjs/src'
    g++ -m64 -Wl,-O1 -Wl,-rpath,/home/ec2-user/phantomjs/src/qt/lib -o ../bin/phantomjs phantom.o callback.o webpage.o webserver.o main.o csconverter.o utils.o networkaccessmanager.o cookiejar.o filesystem.o system.o env.o terminal.o encoding.o config.o repl.o replcompletable.o gif_err.o gifalloc.o egif_lib.o gif_hash.o quantize.o gifwriter.o mongoose.o linenoise.o utf8.o qcommandline.o minidump_file_writer.o convert_UTF.o md5.o string_conversion.o crash_generation_client.o exception_handler.o log.o linux_dumper.o linux_ptrace_dumper.o minidump_writer.o file_id.o guid_creator.o memory_mapped_file.o safe_readlink.o moc_phantom.o moc_callback.o moc_webpage.o moc_webserver.o moc_networkaccessmanager.o moc_cookiejar.o moc_filesystem.o moc_system.o moc_env.o moc_config.o moc_repl.o moc_replcompletable.o moc_qcommandline.o qrc_phantomjs.o qrc_WebKit.o qrc_InspectorBackendStub.o -L/home/ec2-user/phantomjs/src/qt/lib -lQtWebKit -L/home/ec2-user/phantomjs/src/qt/lib -lQtGui -lfreetype -lfontconfig -lQtNetwork -lQtCore -lm -ldl -lrt -lpthread
    collect2: ld terminated with signal 9 [Killed]
    make[1]: *** [../bin/phantomjs] Error 1
    make[1]: Leaving directory `/home/ec2-user/phantomjs/src'
    make: *** [sub-src-phantomjs-pro-make_default-ordered] Error 2

    • http://ariya.ofilabs.com/ Ariya Hidayat

      Maybe out of memory or out of disk space.

  • Andy

    This is probably what has enabled sites like urlbox.io to spring up which offer screenshots as a service to agencies and enterprises. I tried to create my own phantomjs in the cloud but can’t get webfonts to render properly.

  • shlooklak

    phantom js is mighty cool! but – it’s a shame you didn’t mention that the build takes so long… and that you could use the existing binaries.. But other than that – great post!

    • http://ariya.ofilabs.com/ Ariya Hidayat

      The build time is mentioned, i.e. 28 hours. The recommended use of the premade binaries has been mentioned over and over again in many places, including on the main PhantomJS download page.