RDFa with schema.org codelab: Periodicals

By Dan Scott

About this codelab

As of early June 2014, schema.org includes types like Article, ScholarlyArticle, and NewsArticle but offers no way to link those articles to a given issue, volume, or even a general periodical.

The W3C Schema.org Bibliographic Extension community (SchemaBibEx) has proposed a set of new types, properties, and changes to the schema.org vocabulary to support the expression of periodicals such as newspapers, scholarly journals, and comics. This exercise introduces the proposed vocabulary so that you can express the relationship between articles, issues, and their overarching periodicals.

Note: As the Periodical extension is still a proposal in front of the W3C WebSchemas community, the links to the vocabulary documentation in this codelab currently point to a custom copy of the vocabulary at http://schemaorg-dbs.appspot.com.

Audience: Intermediate

Prerequisites: To complete this codelab, you should already be familiar with HTML, RDFa, and schema.org. A previous exercise offers a practical introduction to those concepts.

Identifying the Periodical entity (publishers)

In this exercise, you (acting as the publisher of a journal) will take a simple journal issue page and identify some of the core information in it using the Periodical type and its properties.

View the periodical issue page source HTML

Open step1/periodical_issue.html in a text editor. You should see something like the following HTML source for the web page:

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  ...
</head>
<body>
...
    <div id="content" class="listpage">
      <h1 class="pagetitle">Issue 24, 2014-04-16</h1>

      <div class="article" id="post-9345">
        <h2 class="articletitle">
          <a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
          Diversity of Voices</a>
        </h2>

        <p class="author">Ron Peterson</p>

        <div class="abstract">
          <p>Making the Journal the best that it can be.</p>
        </div>
      </div>
    </div>
...

Note: In a pinch, you can use the browser's developer tools to view and edit the source of the web page (Ctrl+Shift+I in Chrome or Firefox, then the Elements or Inspector tab respectively).

Declare the core type of the page as Periodical

Edit the <body> element to include your @vocab declaration for the schema.org vocabulary, add a @typeof value of Periodical, and set a @resource value. Check the results with one or more of the structured data testing tools.

You should see that the page is now recognized as describing a Periodical entity.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  ...
</head>
<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
...
    <div id="content" class="listpage">
      <h1 class="pagetitle">Issue 24, 2014-04-16</h1>

      <div class="article" id="post-9345">
        <h2 class="articletitle">
          <a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
          Diversity of Voices</a>
        </h2>

        <p class="author">Ron Peterson</p>

        <div class="abstract">
          <p>Making the Journal the best that it can be.</p>
        </div>
      </div>
    </div>
...

Mark the basic name, image, and description properties

As most schema.org types inherit from Thing, some of the core properties are name, image, and description. These properties are often used to generate rich snippets (short summaries displayed in search results).

Declare the appropriate name, image, and description properties for this periodical.

Note: The description of this periodical is listed on a separate page. Even though the range of description is supposed to be Text, it's better to link to the separate page than to leave the property unpopulated.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
</head>

<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <meta property="name" content="The Code4Lib Journal">
          <a href="http://journal.code4lib.org/"><span resource="#periodical"><img src="../resources/logo.png"
            alt="The Code4Lib Journal" property="image"></span></a>
        </h1>

        <h2 id="issn">ISSN 1940-5758</h2>
      </div>
    </div>

...
      <div id="about">
        <h2>About</h2>

        <ul>
          <li class="page_item page-item-5">
            <a href="http://journal.code4lib.org/mission" property="description">Mission</a>
          </li>
          <li class="page_item page-item-6">
            <a href="http://journal.code4lib.org/editorial-committee">Editorial Committee</a>
          </li>
          <li class="page_item page-item-8">
            <a href="http://journal.code4lib.org/process-and-structure">Process and Structure</a>
          </li>
          <li><a href="http://code4lib.org/">Code4Lib</a></li>
        </ul>
      </div>
...
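
As a rough sketch of the trade-off described in the note on description: when the property is declared on a link, an RDFa processor takes the link target (the URL in href) as the value, not the visible link text. If a short summary were available on the page itself, it could be marked up as text instead. The summary sentence below is a placeholder invented for illustration, not text from the journal's site.

```html
<!-- Sketch only: two ways to populate description for the same Periodical. -->
<div vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
  <!-- Option 1 (used in this codelab): the value is the URL in href,
       not the visible text "Mission". -->
  <a href="http://journal.code4lib.org/mission" property="description">Mission</a>

  <!-- Option 2 (hypothetical): the value is the text content of the element.
       The sentence is a placeholder, not the journal's real description. -->
  <p property="description">Placeholder summary of the journal goes here.</p>
</div>
```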

Declare the issn

Periodicals are often identified by an ISSN, and the Periodical type offers an issn property for that purpose. Go ahead and mark up the ISSN accordingly.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
</head>

<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <meta property="name" content="The Code4Lib Journal">
          <a href="http://journal.code4lib.org/"><span resource="#periodical"><img src="../resources/logo.png"
            alt="The Code4Lib Journal" property="image"></span></a>
        </h1>

        <h2 id="issn">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>

...
      <div id="about">
        <h2>About</h2>

        <ul>
          <li class="page_item page-item-5">
            <a href="http://journal.code4lib.org/mission" property="description">Mission</a>
          </li>
          <li class="page_item page-item-6">
            <a href="http://journal.code4lib.org/editorial-committee">Editorial Committee</a>
          </li>
          <li class="page_item page-item-8">
            <a href="http://journal.code4lib.org/process-and-structure">Process and Structure</a>
          </li>
          <li><a href="http://code4lib.org/">Code4Lib</a></li>
        </ul>
      </div>
...

Highlight the publisher

The publisher property is available to any CreativeWork, but certainly seems appropriate for periodicals. In this case, the publisher is the Code4Lib organization; use the existing link for that purpose.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
</head>

<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <meta property="name" content="The Code4Lib Journal">
          <a href="http://journal.code4lib.org/"><span resource="#periodical"><img src="../resources/logo.png"
            alt="The Code4Lib Journal" property="image"></span></a>
        </h1>

        <h2 id="issn">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>

...
      <div id="about">
        <h2>About</h2>

        <ul>
          <li class="page_item page-item-5">
            <a href="http://journal.code4lib.org/mission" property="description">Mission</a>
          </li>
          <li class="page_item page-item-6">
            <a href="http://journal.code4lib.org/editorial-committee">Editorial Committee</a>
          </li>
          <li class="page_item page-item-8">
            <a href="http://journal.code4lib.org/process-and-structure">Process and Structure</a>
          </li>
          <li><a href="http://code4lib.org/" property="publisher">Code4Lib</a></li>
        </ul>
      </div>
...
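
One possible refinement, which is not part of the checkpoint file: because the publisher is an organization, the link could also declare a type and a name for it. The sketch below assumes that the list sits inside the element that describes the Periodical, and it uses the existing schema.org Organization type.

```html
<!-- Sketch only: typing the publisher as an Organization and naming it.
     Assumes this list is nested inside the element describing the Periodical. -->
<ul>
  <li>
    <a href="http://code4lib.org/" property="publisher" typeof="Organization">
      <span property="name">Code4Lib</span>
    </a>
  </li>
</ul>
```

With the typeof attribute in place, the nested name property describes http://code4lib.org/ rather than the periodical.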

Checkpoint: Your original periodical issue page should now look like step1/check_a.html.

Identifying the periodical issue

While some periodicals are divided into volumes and issues, others (like this example) are simply a continuous sequence of issues. In this exercise, you use the PublicationIssue type to identify some of the important attributes of a periodical issue.
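
For periodicals that do use volumes, the nesting might look roughly like the sketch below. It relies on the PublicationVolume type and its volumeNumber property from the same proposal, plus the isPartOf property that a later step of this exercise introduces. The journal name, URLs, and numbers are invented for illustration.

```html
<!-- Sketch only: a hypothetical periodical organized into volumes and issues.
     The name, URLs, and numbers are invented for illustration. -->
<div vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  Issue <span property="issueNumber">3</span> of
  <span property="isPartOf" typeof="PublicationVolume" resource="#volume">
    Volume <span property="volumeNumber">12</span> of
    <span property="isPartOf" typeof="Periodical" resource="http://example.org/journal">
      <span property="name">Example Journal</span>
    </span>
  </span>
</div>
```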

Declare the PublicationIssue entity

To mark up the properties of the publication issue, you first need to declare the PublicationIssue type. At this point, let's revisit the earlier decision to mark the body of the page with the Periodical type, as the primary entity of this page is really the PublicationIssue.

  1. Change the body element so that it declares a type of PublicationIssue. You should also change the value of the associated @resource attribute as the old value of #periodical would otherwise be misleading.
  2. The image and name still belong to the Periodical type, which you can now declare directly on the link to the journal web site. The href attribute on an <a> element acts just like declaring a @resource attribute. Remove the <span> element that includes the @resource attribute and let the image and name properties apply to the Periodical type.
  3. The issn property needs to apply to the Periodical type as well. Fortunately, you can declare a @resource attribute with the value http://journal.code4lib.org/ on the surrounding <h2> element to make the property apply to the right entity.
  4. Similarly, the description also belongs to the Periodical type. Repeat the process of declaring a @resource attribute with the value http://journal.code4lib.org/ on a surrounding element.
Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  <link href="../resources/style.css" rel="stylesheet">
</head>

<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <a href="http://journal.code4lib.org/" typeof="Periodical"><img
            src="../resources/logo.png" alt="The Code4Lib Journal"
            property="image"><meta property="name" content="The Code4Lib Journal"></a>
        </h1>

        <h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>
...
    <div id="about">
      <h2>About</h2>

      <ul>
        <li class="page_item page-item-5" resource="http://journal.code4lib.org/">
          <a href="http://journal.code4lib.org/mission" property="description">Mission</a>
        </li>
...
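
The steps above depend on href and @resource naming the same entity. A minimal sketch of that idea, reduced to two elements and using only values that already appear on the page:

```html
<!-- Sketch only: two separate elements that describe the same Periodical,
     because each one names the subject http://journal.code4lib.org/. -->
<div vocab="http://schema.org/">
  <!-- href on <a> sets the subject for the properties nested inside it. -->
  <a href="http://journal.code4lib.org/" typeof="Periodical">
    <meta property="name" content="The Code4Lib Journal">
  </a>

  <!-- resource on any other element does the same job. -->
  <h2 resource="http://journal.code4lib.org/">
    ISSN <span property="issn">1940-5758</span>
  </h2>
</div>
```

The identifiers have to match character for character, including the trailing slash; otherwise the name and the issn would end up on two different entities.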

Identify the issue number and publication date

Two of the core attributes of a periodical issue are the date on which it was published and its identifying number. schema.org offers the datePublished property for the former and the issueNumber property for the latter.

Enhance the issue page by marking the datePublished and issueNumber properties.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  <link href="../resources/style.css" rel="stylesheet">
</head>

<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <a href="http://journal.code4lib.org/" typeof="Periodical"><img
            src="../resources/logo.png" alt="The Code4Lib Journal"
            property="image"><meta property="name" content="The Code4Lib Journal"></a>
        </h1>

        <h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>
...
    <div id="content" class="listpage">
      <h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>

      <div class="article" id="post-9345">
        <h2 class="articletitle">
          <a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
          Diversity of Voices</a>
        </h2>
...
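
If the page displayed the date in a reader-friendly form instead of ISO 8601, one option would be the HTML <time> element, whose datetime attribute supplies the machine-readable value. This is a sketch under that assumption; the display text "April 16, 2014" does not appear on the original page.

```html
<!-- Sketch only: a reader-friendly date with a machine-readable value.
     The display format "April 16, 2014" is an assumption for illustration. -->
<h1 class="pagetitle" vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  Issue <span property="issueNumber">24</span>,
  <time property="datePublished" datetime="2014-04-16">April 16, 2014</time>
</h1>
```

A content attribute on a <span> (content="2014-04-16") would be another way to supply the same value.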

Link the PublicationIssue to the Periodical

Now that you have identified the key properties of both the issue and the periodical, you should make an explicit connection between the two entities so that it is clear that the PublicationIssue is a part of the Periodical. You can use either the isPartOf or hasPart property to make that connection, depending on what is more convenient for your page.

Note: These properties are useful in general for declaring relationships between composite CreativeWork entities, such as the individual volumes in a book trilogy, or poems in a poetry collection.

In this case, use the isPartOf property to declare that the PublicationIssue is part of the Periodical entity.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  <link href="../resources/style.css" rel="stylesheet">
</head>

<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
            src="../resources/logo.png" alt="The Code4Lib Journal"
            property="image"><meta property="name" content="The Code4Lib Journal"></a>
        </h1>

        <h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>
...
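
For comparison, here is a sketch of the same relationship expressed with hasPart. It assumes a differently structured page on which the periodical is the outer entity and the issue is nested inside it, which is not how this issue page is laid out.

```html
<!-- Sketch only: the inverse direction, for a page whose outer element
     describes the Periodical and whose inner element describes the issue. -->
<div vocab="http://schema.org/" typeof="Periodical" resource="http://journal.code4lib.org/">
  <span property="name">The Code4Lib Journal</span>
  <div property="hasPart" typeof="PublicationIssue" resource="#issue">
    Issue <span property="issueNumber">24</span>
  </div>
</div>
```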

Checkpoint: Your periodical page should now look like step1/check_b.html.

Lessons learned

In this exercise, you learned:

Identifying articles in a periodical issue

The most interesting part of any given issue of a periodical for readers and researchers is not so much the issue number or the date of publication; it is instead the actual articles contained in that issue. So far, however, you have not yet identified any of the articles in this issue.

What you have is a table of contents for the issue, with high-level metadata for the articles, and not the content of the articles itself. Therefore, you should mark up the table of contents with ScholarlyArticle entities (it is an edited journal).

Identify the core CreativeWork properties of each article

Checking the properties for ScholarlyArticle shows that the familiar core properties inherited from CreativeWork and Thing are of primary interest: name, author, and description.

Mark up the articles in the table of contents as ScholarlyArticle entities with those core properties.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  <link href="../resources/style.css" rel="stylesheet">
</head>

<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
            src="../resources/logo.png" alt="The Code4Lib Journal"
            property="image"><meta property="name" content="The Code4Lib Journal"></a>
        </h1>

        <h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>
...
    <div id="content" class="listpage">
      <h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>

      <div class="article" id="post-9345" typeof="ScholarlyArticle">
        <h2 class="articletitle" property="name">
          <a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
          Diversity of Voices</a>
        </h2>

        <p class="author"><span property="author">Ron Peterson</span></p>

        <div class="abstract" property="description">
          <p>Making the Journal the best that it can be.</p>
        </div>
      </div>

      <div class="article" id="post-9519" typeof="ScholarlyArticle">
        <h2 class="articletitle" property="name">
          <a href=
          "http://journal.code4lib.org/articles/9519">EgoSystem: Where are our Alumni?</a>
        </h2>

        <p class="author"><span property="author">James Powell</span>, <span
          property="author">Harihar Shankar</span>, <span property="author">Marko Rodriguez</span>,
          <span property="author">Herbert Van de Sompel</span></p>

        <div class="abstract" property="description">
          <p>Comprehensive social search on the Internet remains an unsolved problem.
          Social networking sites tend to be isolated from each other, and the
          information they contain is often not fully searchable outside the confines of
...
        </div>
      </div>
...

Use absolute URIs to identify each article

As your table of contents offers metadata about each article but does not provide the actual content of the article, use the appropriate absolute URIs as the identifiers for each ScholarlyArticle entity.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  <link href="../resources/style.css" rel="stylesheet">
</head>

<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
            src="../resources/logo.png" alt="The Code4Lib Journal"
            property="image"><meta property="name" content="The Code4Lib Journal"></a>
        </h1>

        <h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>
...
    <div id="content" class="listpage">
      <h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>

      <div class="article" id="post-9345" typeof="ScholarlyArticle"
        resource="http://journal.code4lib.org/articles/9345">
        <h2 class="articletitle" property="name">
          <a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
          Diversity of Voices</a>
        </h2>

        <p class="author"><span property="author">Ron Peterson</span></p>

        <div class="abstract" property="description">
          <p>Making the Journal the best that it can be.</p>
        </div>
      </div>

      <div class="article" id="post-9519" typeof="ScholarlyArticle"
        resource="http://journal.code4lib.org/articles/9519">
        <h2 class="articletitle" property="name">
          <a href=
          "http://journal.code4lib.org/articles/9519">EgoSystem: Where are our Alumni?</a>
        </h2>

        <p class="author"><span property="author">James Powell</span>, <span
          property="author">Harihar Shankar</span>, <span property="author">Marko Rodriguez</span>,
          <span property="author">Herbert Van de Sompel</span></p>

        <div class="abstract" property="description">
          <p>Comprehensive social search on the Internet remains an unsolved problem.
          Social networking sites tend to be isolated from each other, and the
          information they contain is often not fully searchable outside the confines of
...
        </div>
      </div>
...

Link the periodical issue to the articles it contains

Now that the articles have been individually identified, you can link each article to the containing periodical issue using either the hasPart or isPartOf property. On this page, the least disruptive approach to the existing markup is to use the hasPart property.

Check your markup
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>The Code4Lib Journal – Issue 24</title>
  <link href="../resources/style.css" rel="stylesheet">
</head>

<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div id="page">
    <div id="header">
      <div id="headerbackground">
        <h1>
          <a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
            src="../resources/logo.png" alt="The Code4Lib Journal"
            property="image"><meta property="name" content="The Code4Lib Journal"></a>
        </h1>

        <h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
      </div>
    </div>
...
    <div id="content" class="listpage">
      <h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>

      <div class="article" id="post-9345" property="hasPart" typeof="ScholarlyArticle"
        resource="http://journal.code4lib.org/articles/9345">
        <h2 class="articletitle" property="name">
          <a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
          Diversity of Voices</a>
        </h2>

        <p class="author"><span property="author">Ron Peterson</span></p>

        <div class="abstract" property="description">
          <p>Making the Journal the best that it can be.</p>
        </div>
      </div>

      <div class="article" id="post-9519" property="hasPart" typeof="ScholarlyArticle"
        resource="http://journal.code4lib.org/articles/9519">
        <h2 class="articletitle" property="name">
          <a href=
          "http://journal.code4lib.org/articles/9519">EgoSystem: Where are our Alumni?</a>
        </h2>

        <p class="author"><span property="author">James Powell</span>, <span
          property="author">Harihar Shankar</span>, <span property="author">Marko Rodriguez</span>,
          <span property="author">Herbert Van de Sompel</span></p>

        <div class="abstract" property="description">
          <p>Comprehensive social search on the Internet remains an unsolved problem.
          Social networking sites tend to be isolated from each other, and the
          information they contain is often not fully searchable outside the confines of
...
        </div>
      </div>
...
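
The isPartOf alternative mentioned above would put the link on each article instead. A rough sketch, assuming the issue is still identified as #issue on the same page:

```html
<!-- Sketch only: the same link expressed from the article's side.
     Assumes the issue is identified as #issue, as on the <body> element. -->
<div class="article" id="post-9345" vocab="http://schema.org/" typeof="ScholarlyArticle"
  resource="http://journal.code4lib.org/articles/9345">
  <link property="isPartOf" href="#issue">
  <p class="author"><span property="author">Ron Peterson</span></p>
</div>
```

On this page that approach adds one extra element per article, which is why hasPart on the existing <div> is the less disruptive choice.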

Checkpoint: Your periodical issue page should now look like step1/check_c.html.

About the author

Dan Scott is a systems librarian at Laurentian University.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.