$ mojo get is X-ray for the web

2011-11-30 − 🏷 command line 🏷 css 🏷 dom 🏷 http 🏷 mojolicious 🏷 perl

Mojolicious isn't just useful for Perl coders. It also includes a command line tool that can be quite handy for anybody who wants to get info from the web:

usage: /Users/marcus/perl5/perlbrew/perls/perl-5.14.2/bin/mojo get [OPTIONS] \
URL [SELECTOR] [COMMANDS]
  mojo get /
  mojo get mojolicio.us
  mojo get -v -r google.com
  mojo get --method POST --content 'content' mojolicio.us
  mojo get --header 'X-Bender: Bite my shiny metal ass!' mojolicio.us
  mojo get mojolicio.us 'head > title' text
  mojo get mojolicio.us .footer all
  mojo get mojolicio.us a attr href
  mojo get mojolicio.us '*' attr id
  mojo get mojolicio.us 'h1, h2, h3' 3 text

These options are available:
  --charset <charset>     Charset of HTML5/XML content, defaults to auto
                          detection or "UTF-8".
  --content <content>     Content to send with request.
  --header <name:value>   Additional HTTP header.
  --method <method>       HTTP method to use, defaults to "GET".
  --redirect              Follow up to 5 redirects.
  --verbose               Print verbose debug information to STDERR.

First, the name can be a bit awkward when you use it often. I tend to shorten it to 'mg':

    $ alias mg='mojo get'
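An alias is fine at an interactive prompt, but bash does not expand aliases inside scripts by default. If you want the same shortcut there, a shell function is one option. This is only a minimal sketch; the function form is an assumption of this example, not something Mojolicious ships:

```shell
# Function equivalent of the alias; unlike an alias, it also works
# inside bash scripts, where aliases are not expanded by default.
mg() {
    mojo get "$@"   # pass every argument through unchanged
}
```

Put it in your shell startup file, or at the top of a script, and call it exactly like the alias.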

mg is a command line utility similar to curl, but with some really neat tricks up its sleeve.

As you can see from the examples above, mg allows you to use familiar CSS selectors to process the response body. This turns out to be very useful. For instance, to get the links to the apps on the front page of my site iusethis.com:

$ mg -r iusethis.com 'h2 a' attr href
http://osx.iusethis.com/app/corebreach
http://osx.iusethis.com/app/terraray
http://osx.iusethis.com/app/preferencemanager
http://osx.iusethis.com/app/panoedit
http://osx.iusethis.com/app/findanyfile
http://osx.iusethis.com/app/webkit
http://osx.iusethis.com/app/nulanaslauncher
http://osx.iusethis.com/app/pagelayers
http://osx.iusethis.com/app/arrivalsampdepartures
http://osx.iusethis.com/app/iawriter

Check the top one for link tags:

$ mg -r http://osx.iusethis.com/app/corebreach link

<link href="http://osx.iusethis.com/appcast/corebreach" rel="alternate" title="Sparkle AppCast" type="application/rss+xml" />
<link href="http://osx.iusethis.com/comment/app.rss/corebreach" rel="alternate" title="Recent Comments" type="application/rss+xml" />
<link href="http://osx.iusethis.com/static/opensearch.xml" rel="search" title="iusethis" type="application/opensearchdescription+xml" />

Neat, an appcast. Let's take a look at the version history:

$ mg http://osx.iusethis.com/appcast/corebreach title text
Appcast for CoreBreach
CoreBreach 1.1
CoreBreach 1.0.2
CoreBreach 1.0.1
CoreBreach 1.0
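Since the output is plain text with one result per line, the steps above chain together nicely in a pipeline. Here is a rough sketch that prints the page title behind every link matching a selector. The name link_titles is made up for this example, and it assumes each href is an absolute URL, as in the listing above:

```shell
# Hypothetical helper: print "<link>: <title>" for every link that
# matches a selector. "link_titles" is an illustrative name only.
link_titles() {
    url=$1
    selector=$2
    # First request: one href per line for each matching element.
    mojo get -r "$url" "$selector" attr href |
    while IFS= read -r link; do
        [ -n "$link" ] || continue          # skip empty lines
        printf '%s: ' "$link"
        # Second request: the <title> of the linked page. stdin is
        # redirected as a precaution, so the inner command cannot
        # swallow the remaining links from the pipe.
        mojo get -r "$link" 'head > title' text </dev/null
    done
}
```

Called as link_titles iusethis.com 'h2 a', it makes one request for the front page and one more per link, so go easy on sites you don't own.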

Mojo understands much more complex queries than these, though. Pretty much anything you can use in jQuery or CSS 3 works.
A good way to work out the right selector is to open up your web page with the Chrome debugger.

Just right-click the element you want to know more about and choose 'Inspect Element' from the menu. This gives you a simple view of the DOM that shows which element on the page you are highlighting, along with its parent nodes, which you can use to build a selector.

As for the last argument, you can call anything that you can call on a Mojo::DOM node.
If you leave it out, you just get the markup, as you saw above.

Because the client uses Mojo::UserAgent, it already supports features like HTTPS and basic authentication credentials in the URL.
mojo get also lets you set the HTTP method, headers and request body, so POST requests are covered too, but I think it's the DOM queries that set it apart from other HTTP clients like curl, wget or lwp-download. Here's a final trick:

Set MOJO_USERAGENT_DEBUG=1 in your environment to get full traces of your HTTP requests.
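If you only want the trace for one request rather than the whole session, you can set the variable for a single command. A small sketch; the wrapper name mg_debug is made up for this example:

```shell
# Hypothetical wrapper: run one mojo get with tracing switched on.
# The variable is set for this command only, so the rest of the
# session stays quiet.
mg_debug() {
    MOJO_USERAGENT_DEBUG=1 mojo get "$@"
}
```

Use it like mg, for example mg_debug mojolicio.us 'head > title' text. To trace every request in the session instead, run export MOJO_USERAGENT_DEBUG=1 once.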

Like it? Installation is really simple and fast. Go get it!