Beyond Tag Clouds: TagArcs for WordPress Tag Visualization (part I)

Tag clouds are very useful for visualizing the most frequently used tags on a website, e.g. a blog. They steer attention through emphasized words whose font size, color, or position stands out. But they reveal nothing about when the posts behind a tag were written. This became evident to me on my own tag cloud (see left), which still ranks 'Lima' in the leading position even though the related articles are more than three years old.
Even more unfortunate is the missing relation to other tags. While a single tag is strongly highlighted, the user cannot find out anything about related tags that appear alongside it.
To push semantic visualization further, I am going to introduce TagArcs as a meaningful and eye-catching way to outline relationships between tags and posts.



On the x-axis you can see all posts. Tags themselves can be seen as links between posts. The more tags two posts have in common, the thicker the arc is drawn. A TagArc thus shows links between related posts over time.
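The rule "more shared tags, thicker arc" can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: the `posts` array, its IDs and tag names, and the helpers `sharedTagCount` and `buildLinks` are hypothetical and not part of the plugin. The output uses the same `source`/`target`/`value` fields as the link objects in this post.

```javascript
// Hypothetical input: posts in chronological order, each with its tag names.
const posts = [
  { id: 101, tags: ["lima", "peru", "travel"] },
  { id: 102, tags: ["wordpress", "visualization"] },
  { id: 103, tags: ["lima", "travel"] },
  { id: 104, tags: ["visualization", "protovis"] },
];

// Count the tags two posts have in common.
function sharedTagCount(a, b) {
  const tagsOfA = new Set(a.tags);
  return b.tags.filter((tag) => tagsOfA.has(tag)).length;
}

// Build one undirected link per pair of posts sharing at least one tag.
// source/target are node indices, value is the number of shared tags.
function buildLinks(posts) {
  const links = [];
  for (let i = 0; i < posts.length; i++) {
    for (let j = i + 1; j < posts.length; j++) {
      const value = sharedTagCount(posts[i], posts[j]);
      if (value > 0) links.push({ source: i, target: j, value });
    }
  }
  return links;
}

const links = buildLinks(posts);
console.log(links);
```

Here the first and third post share two tags, so their arc would carry twice the weight of the arc between the second and fourth post, which share only one.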

To see how this looks and to get a feeling for TagArcs, I have been developing a tiny WordPress plugin. The powerful visual programming library Protovis has been very valuable for my purposes.

Determine the nodes and the links between them from the WP posts and their tags:

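// Fetch the 500 most recent posts, newest first.
$options   = 'numberposts=500&order=DESC&orderby=date';
$postslist = get_posts($options);

$nodes = '';       // comma-separated Protovis node literals
$links = '';       // comma-separated Protovis link literals
$arr   = array();  // maps post ID => node index
$ii    = 0;

// First pass: one node per post, labelled with the post date (YYYY-MM-DD).
foreach ($postslist as $post) {
   $name   = explode(' ', $post->post_date);  // split() no longer exists in PHP 7+
   $nodes .= '{nodeName:"' . $name[0] . '", group:2},';
   $arr[$post->ID] = $ii;
   $ii++;
}

// Second pass: link each post to every other post that shares one of its tags.
foreach ($postslist as $post) {
   $tags = get_the_tags($post->ID);
   if (!is_array($tags)) {
      continue;  // get_the_tags() returns false for posts without tags
   }
   foreach ($tags as $tag) {
      // The 'tag' parameter expects the slug; get_posts() returns only
      // 5 posts unless numberposts is set.
      $t = get_posts('numberposts=500&tag=' . $tag->slug);
      foreach ($t as $relpost) {
         // Skip self-links and posts that are not among the nodes.
         // isset() also keeps node index 0 (the newest post) valid.
         if ($relpost->ID != $post->ID && isset($arr[$relpost->ID])) {
            $links .= '{source:' . $arr[$post->ID] . ', target:' . $arr[$relpost->ID] . ', value:1},';
         }
      }
   }
}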

Some more lines to feed Protovis:

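// Emitted from PHP: $nodes and $links are the strings built above.
var nodes = {nodes:[' . $nodes . ' ]};
var links = {links:[' . $links . ' ]};

// Root panel of the visualization.
var vis = new pv.Panel()
   .width(800)
   .height(400)
   .margin(10)
   .bottom(20);

// Arc layout: posts as nodes along the axis, shared tags as arcs.
var layout = vis.add(pv.Layout.Arc)
   .nodes(nodes.nodes)
   .links(links.links);

// Arc thickness grows with the link degree.
layout.link.add(pv.Line)
   .lineWidth(function(d) { return d.linkDegree * 0.1; });

// One dot per post, sized by link degree and colored by group.
layout.node.add(pv.Dot)
   .size(function(d) { return d.linkDegree + 2; })
   .fillStyle(pv.Colors.category20().by(function(d) { return d.group; }))
   .strokeStyle(function() { return this.fillStyle().brighter(); });

//layout.label.add(pv.Label);

vis.render();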

All the code snippets above can be put together in a single WordPress plugin, which I am going to release as soon as I've finished the other visualization ideas for WordPress.


These are the TagArcs of my blog. Compared to the arcs above, it can be seen that there have been more separate post clusters during the past years. Almost in the center of the timeline, the posts were very closely linked with each other. Recent activity is strongly related to a bunch of posts I wrote two years ago.

Possible improvements to TagArcs could be:
* display the exact temporal position on the x-axis
* use different colors for posts within the same category
* enable interaction, e.g. hiding unselected tag relations or showing metadata when hovering over a post dot or tag arc
* filter by category, common tag count, time, ...
* compare two categories by their intermediate tags
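
Two of these ideas, exact temporal positions and filtering by common tag count, are easy to prototype in plain JavaScript. The following is a rough sketch, not plugin code: `makeTimeScale`, `filterLinks`, and the sample dates are hypothetical names and values chosen for illustration, and wiring the resulting positions into the Protovis layout is not shown.

```javascript
// Map a post date onto a horizontal pixel position by linear interpolation
// between the oldest and the newest post.
function makeTimeScale(dates, width) {
  const times = dates.map((d) => Date.parse(d));
  const min = Math.min(...times);
  const max = Math.max(...times);
  const span = max - min || 1; // avoid division by zero for a single date
  return (date) => ((Date.parse(date) - min) / span) * width;
}

// Keep only links whose posts share at least minShared tags.
function filterLinks(links, minShared) {
  return links.filter((link) => link.value >= minShared);
}

// Hypothetical sample data.
const postDates = ["2008-01-01", "2008-01-11", "2008-01-21"];
const x = makeTimeScale(postDates, 800);
console.log(postDates.map(x));

const strongLinks = filterLinks(
  [{ source: 0, target: 1, value: 1 }, { source: 0, target: 2, value: 3 }],
  2
);
console.log(strongLinks);
```

With a scale like this, posts written in a burst would sit close together on the axis, while long pauses between posts would show up as gaps.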

Any other ideas?

Posted by nise | Filed in english, Tech
