RDFa with schema.org codelab: Comic Books

By Dan Scott

About this codelab

SchemaBibEx initially focused on getting the Periodical / PublicationVolume / PublicationIssue proposal accepted. Currently, comic books exist as a proposed specialization of Periodical.

This exercise builds on what you have already learned. It introduces roles, for describing relationships that might not exist in a controlled vocabulary, and inverse relationships, expressed with the @rev attribute.

Audience: Intermediate

Prerequisites: To complete this codelab, you should already be familiar with HTML, RDFa, and schema.org, as well as the Periodical and PublicationIssue types. A previous exercise offers a practical introduction to those concepts.

Identifying the ComicIssue

In this exercise, you (acting as a contributor to the Grand Comics Database collaborative cataloguing effort) will take a sample comic book issue page and identify some of the core information in it using the proposed ComicIssue type and its properties.

View the periodical issue page source HTML

Open step1/comic_book.html in a text editor. You should see something like the following HTML source for the web page:

<!DOCTYPE html>
<html>
<head>
  <title>GCD :: Issue :: Wolverine: Son of Canada #[nn]</title>
  <link rel="shortcut icon" href="http://files1.comics.org//img/favicon.ico">
  <link rel="stylesheet" href="http://files1.comics.org/CACHE/css/dab09e047c47.css" type="text/css" />
</head>

<body>
<div id="sizing_base" >
<h1 class="item_id">
<div class="left" id="series_and_issue">
  <span id="series_name">
    <a href="http://comics.org/series/10616/">Wolverine: Son of Canada</a>
  </span>
  
  <a href="http://comics.org/issue/119667/"><span class="issue_number"><span class="p">#</span>[nn]</span></a>
</div>
  
<div class="right">(April 2001)</div>
</h1>

<div class="item_id">
  <div class="left" class="item_data">
  <a href="http://comics.org/publisher/78/">Marvel</a>,
  2001 Series
  </div>
</div>
...

Declare the core type of the page as ComicIssue

Edit the <body> element to include your @vocab declaration for the schema.org vocabulary, add a @typeof value of ComicIssue, and set a @resource value. In this case, the Grand Comics Database uses one URI per issue, so you can set the @resource to that URI (http://comics.org/issue/119667/). Check the results with one or more of the structured data testing tools.

You should see that the page is now recognized as describing a ComicIssue entity.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <title>GCD :: Issue :: Wolverine: Son of Canada #[nn]</title>
  <link rel="shortcut icon" href="http://files1.comics.org//img/favicon.ico">
  <link rel="stylesheet" href="http://files1.comics.org/CACHE/css/dab09e047c47.css" type="text/css" />
</head>

<body vocab="http://schema.org/" typeof="ComicIssue" resource="http://comics.org/issue/119667/">
<div id="sizing_base" >
<h1 class="item_id">
<div class="left" id="series_and_issue">
  <span id="series_name">
    <a href="http://comics.org/series/10616/">Wolverine: Son of Canada</a>
  </span>
  
  <a href="http://comics.org/issue/119667/"><span class="issue_number"><span class="p">#</span>[nn]</span></a>
</div>
  
<div class="right">(April 2001)</div>
</h1>
...

Mark the basic ComicIssue properties

Declare the core datePublished, issueNumber, image, publisher, and description properties for this comic book.

Note: The issue number as described in the HTML is [nn]; because the range of issueNumber is either Integer or Text, you can reproduce that value here. Alternatively, you can omit the property altogether: although humans might interpret [nn] as meaning "no number", machines will faithfully treat it as an issue number rather than as the absence of one.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="ComicIssue" resource="http://comics.org/issue/119667/">
<div id="sizing_base" >
<h1 class="item_id">
<div class="left" id="series_and_issue">
  <span id="series_name">
    <a href="http://comics.org/series/10616/">Wolverine: Son of Canada</a>
  </span>
  
  <a href="http://comics.org/issue/119667/"><span class="issue_number"><span
    class="p">#</span><span property="issueNumber">[nn]</span></span></a>
</div>
  
<div class="right"><time property="datePublished" content="2001-04">(April 2001)</time></div>
</h1>

<div class="item_id">
  <div class="left" class="item_data">
  <a href="http://comics.org/publisher/78/" property="publisher"
    typeof="Organization"><span property="name">Marvel</span></a>,
  2001 Series
  </div>
</div>
...
     <div class="issue_notes">
       <div class="issue_notes_border">
         <h3 class="notes_header"> Issue Notes </h3>
         <p property="description">
           Giveaway comic distributed by Doritos, limited to 65,000 copies,
           and available only in Canada; Individually numbered on cover.
       </div>
     </div>
...
    <a href="http://comics.org/issue/119667/cover/4"><span  resource="http://comics.org/issue/119667/"><img
        src="http://files1.comics.org//img/gcd/covers_by_id/118/w200/118909.jpg?8888768773072555560"
        alt="Cover Thumbnail for Wolverine: Son of Canada (Marvel, 2001 series) #[nn]"
        class="cover_img" property="image"/></span></a>
...

Relate the ComicIssue to its ComicSeries

Thus far we've focused on the ComicIssue as the primary entity on the page, but we know that a ComicIssue does not mean much if it does not belong to a ComicSeries (a specialization of Periodical)—even if that ComicSeries has only one issue. Declare the ComicSeries entity and use the isPartOf property to relate the ComicIssue to its ComicSeries.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="ComicIssue" resource="http://comics.org/issue/119667/">
...
<div class="left" id="series_and_issue">
  <span id="series_name">
    <a href="http://comics.org/series/10616/" property="isPartOf"
      typeof="ComicSeries"><span property="name">Wolverine: Son of Canada</span>
    </a>
  </span>
  
  <a href="http://comics.org/issue/119667/"><span class="issue_number"><span
    class="p">#</span><span property="issueNumber">[nn]</span></span></a>
</div>
...

Checkpoint: Your original periodical issue page should now look like step1/check_a.html.
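
As an aside, the same isPartOf relationship can also be stated from the series side of the link. The following is a minimal sketch, assuming a hypothetical series page that is not part of the step1 files, and assuming a processor that supports the RDFa 1.1 Core @rev attribute, which reverses the direction of the named property:

```html
<!-- Hypothetical series page: the subject here is the ComicSeries -->
<div vocab="http://schema.org/" typeof="ComicSeries"
  resource="http://comics.org/series/10616/">
  <span property="name">Wolverine: Son of Canada</span>
  <!-- rev="isPartOf" states that the linked issue isPartOf this series,
       rather than the series being part of the issue -->
  <a rev="isPartOf" href="http://comics.org/issue/119667/">#[nn]</a>
</div>
```

Because @rev belongs to RDFa Core rather than RDFa Lite, it is worth running markup like this through a structured data testing tool to confirm that the relationship is reported in the direction you expect.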

Identifying the contributors

Like many creative works, there are a number of contributors to this comic book. We will focus on marking up the five contributors in the story section of the page.

Declare the author and associated artist properties of the work

The closest schema.org property for script-writer is author. The SchemaBibEx extension to schema.org also defines the penciller, inker, colorist, and letterer properties. Go ahead and declare those properties; just use the text values for this exercise.

Check your markup
<!DOCTYPE html>
...

...
      <dl class="credits">
        <dt class="credit_tag"><span class="credit_label">Script:</span></dt>
          <dd class="credit_def"><span class="credit_value" property="author">Howard Mackie</span></dd>
        <dt class="credit_tag"><span class="credit_label">Pencils:</span></dt>
         <dd class="credit_def"><span class="credit_value" property="penciller">Ron Lim</span></dd>
        <dt class="credit_tag"><span class="credit_label">Inks:</span></dt>
         <dd class="credit_def"><span class="credit_value" property="inker">Walden Wong</span></dd>
        <dt class="credit_tag"><span class="credit_label">Colors:</span></dt>
         <dd class="credit_def"><span class="credit_value" property="colorist">Chris Sotomayor</span></dd>
        <dt class="credit_tag"><span class="credit_label">Letters:</span></dt>
         <dd class="credit_def"><span class="credit_value" property="letterer">Dave Sharpe</span></dd>
      </dl>
...
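
The markup above uses plain text values, as the exercise asks. If you later want to describe a contributor as an entity rather than a string, one possible shape is sketched below. It assumes that you wrap the existing text in a Person with a name property; it is not part of the step1 checkpoint files.

```html
<!-- Sketch only: the author is a Person entity instead of a text value -->
<dd class="credit_def"><span class="credit_value" property="author"
  typeof="Person"><span property="name">Howard Mackie</span></span></dd>
```

Because no @resource is given, the Person has no identifier of its own; adding a URI for the person, if the site publishes one, would let other pages refer to the same entity.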

Use LoC relator codes to improve the precision of contributors

Although we have identified that certain people contributed to the comic book using the SchemaBibEx extension properties, it would be reassuring to be able to indicate a more stable identifier for each contribution should the proposal ultimately change.

Fortunately, in RDFa you can supply multiple values to a single @property attribute, including values from external vocabularies. And the Library of Congress relator terms offer http://id.loc.gov/vocabulary/relators/clr for "colorist". Add a space and that URI to the colorist value for Chris Sotomayor.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" typeof="ComicIssue" resource="http://comics.org/issue/119667/">
...
  <dd class="credit_def"><span class="credit_value"
    property="colorist http://id.loc.gov/vocabulary/relators/clr">Chris Sotomayor</span></dd>
...

Use the @prefix attribute for shorter URIs

The @vocab attribute effectively says "In the absence of a prefix, prepend this URI to the property or type". RDFa allows you to use the @prefix attribute to declare a short prefix for a set of URIs outside of the default vocabulary that you might use repeatedly in your markup. In this case, we are only using one external URI, but it's good to practice.

  1. Declare the prefix for the Library of Congress relator terms on the <body> element by adding the following attribute: prefix="lcrel: http://id.loc.gov/vocabulary/relators/".
  2. Change the colorist URI from the full URI to a prefixed version of the URI, so that the final result is lcrel:clr.

Check your markup
<!DOCTYPE html>
...
<body vocab="http://schema.org/" prefix="lcrel: http://id.loc.gov/vocabulary/relators/"
  typeof="ComicIssue" resource="http://comics.org/issue/119667/">
...
  <dd class="credit_def"><span class="credit_value"
    property="colorist lcrel:clr">Chris Sotomayor</span></dd>
...

Checkpoint: Your original periodical issue page should now look like step1/check_b.html.
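
A prefix pays off once it is reused. As an optional sketch that goes beyond the checkpoint file, and assuming you also want to tag the script-writer with the Library of Congress relator term for "author" (aut), the same lcrel prefix can be applied a second time without repeating the full URI:

```html
<body vocab="http://schema.org/" prefix="lcrel: http://id.loc.gov/vocabulary/relators/"
  typeof="ComicIssue" resource="http://comics.org/issue/119667/">
...
  <!-- lcrel:aut expands to http://id.loc.gov/vocabulary/relators/aut -->
  <dd class="credit_def"><span class="credit_value"
    property="author lcrel:aut">Howard Mackie</span></dd>
...
  <!-- lcrel:clr expands to http://id.loc.gov/vocabulary/relators/clr -->
  <dd class="credit_def"><span class="credit_value"
    property="colorist lcrel:clr">Chris Sotomayor</span></dd>
...
```

Unprefixed terms such as author and colorist are still resolved against the @vocab URI, so each span produces two statements with the same text value.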

Use the Role type to define the other roles

schema.org recently added the Role type, initially to support the many different roles a person could play on a given sports team, but now more generally to support the many different roles people can play in any endeavour.
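
A minimal sketch of how a Role might be applied to one of the credits follows. It assumes the pattern described in the schema.org Role documentation, in which the property that points to the Role is repeated inside the Role to point to the actual value. The choice of contributor as the property and "Inker" as the roleName is illustrative, and the markup in step1/check_c.html may differ.

```html
<dt class="credit_tag"><span class="credit_label">Inks:</span></dt>
<!-- The issue has a contributor that is a Role... -->
<dd class="credit_def" property="contributor" typeof="Role">
  <!-- ...the Role is named (assumed label)... -->
  <meta property="roleName" content="Inker">
  <!-- ...and the same property is repeated to reach the Person -->
  <span class="credit_value" property="contributor" typeof="Person">
    <span property="name">Walden Wong</span>
  </span>
</dd>
```

The intermediate Role entity is what makes it possible to describe a relationship that has no dedicated property in the vocabulary.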

Checkpoint: Your periodical issue page should now look like step1/check_c.html.

About the author

Dan Scott is a systems librarian at Laurentian University.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.