As of early June 2014, schema.org includes types like Article, ScholarlyArticle, and NewsArticle but offers no way to link those articles to a given issue, volume, or even a general periodical.
The W3C Schema.org Bibliographic Extension community (SchemaBibEx) has proposed a set of new types, properties, and changes to the schema.org vocabulary to support the expression of periodicals such as newspapers, scholarly journals, and comics. This exercise introduces the proposed vocabulary so that you can express the relationship between articles, issues, and their overarching periodicals.
Note: As the Periodical extension is still a proposal in front of the W3C WebSchemas community, the links to the vocabulary documentation in this codelab currently point to a custom copy of the vocabulary at http://schemaorg-dbs.appspot.com.
Audience: Intermediate
Prerequisites: To complete this codelab, you should already be familiar with HTML, RDFa, and schema.org. A previous exercise offers a practical introduction to those concepts.
Periodical entity (publishers)
In this exercise, you (acting as the publisher of a journal) will take a simple journal issue page and identify some of the core information in it using the Periodical type and its properties.
Open step1/periodical_issue.html in a text editor. You should see something like the following HTML source for the web page:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
...
</head>
<body>
...
<div id="content" class="listpage">
<h1 class="pagetitle">Issue 24, 2014-04-16</h1>
<div class="article" id="post-9345">
<h2 class="articletitle">
<a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
Diversity of Voices</a>
</h2>
<p class="author">Ron Peterson</p>
<div class="abstract">
<p>Making the Journal the best that it can be.</p>
</div>
</div>
</div>
...
Note: In a pinch, you can use the browser development tools to view and edit the source of the web page (Ctrl-Shift-I in Chrome or Firefox, in the Elements or Inspector tab respectively).
Periodical
Edit the <body> element to include your @vocab declaration for the schema.org vocabulary, add a @typeof value of Periodical, and set a @resource value. Check the results with one or more of the structured data testing tools. You should see that the page is now recognized as describing a Periodical entity.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
...
</head>
<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
...
<div id="content" class="listpage">
<h1 class="pagetitle">Issue 24, 2014-04-16</h1>
<div class="article" id="post-9345">
<h2 class="articletitle">
<a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
Diversity of Voices</a>
</h2>
<p class="author">Ron Peterson</p>
<div class="abstract">
<p>Making the Journal the best that it can be.</p>
</div>
</div>
</div>
...
name, image, and description properties
Because most schema.org types inherit from Thing, its core properties name, image, and description are available on almost every entity. These properties are often used to generate rich snippets (short summaries displayed in search results).
Declare the appropriate name, image, and description properties for this periodical.
Note: The description of this periodical is listed on a separate page; even though the range of description is supposed to be Text, it's better to link to the separate page rather than leave the property unpopulated.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
</head>
<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<meta property="name" content="The Code4Lib Journal">
<a href="http://journal.code4lib.org/"><span resource="#periodical"><img src="../resources/logo.png"
alt="The Code4Lib Journal" property="image"></span></a>
</h1>
<h2 id="issn">ISSN 1940-5758</h2>
</div>
</div>
...
<div id="about">
<h2>About</h2>
<ul>
<li class="page_item page-item-5">
<a href="http://journal.code4lib.org/mission" property="description">Mission</a>
</li>
<li class="page_item page-item-6">
<a href="http://journal.code4lib.org/editorial-committee">Editorial Committee</a>
</li>
<li class="page_item page-item-8">
<a href="http://journal.code4lib.org/process-and-structure">Process and Structure</a>
</li>
<li><a href="http://code4lib.org/">Code4Lib</a></li>
</ul>
</div>
...
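The note about description depends on how RDFa picks a property value. As a minimal sketch (the #example identifier, the example.org URL, and the text are illustrative and not part of the codelab files): a property attribute on an element that carries href or src records the URL as the value, while on a plain element it records the text content.

```html
<div vocab="http://schema.org/" typeof="Periodical" resource="#example">
  <!-- Plain element: the value of name is the text "An Example Journal" -->
  <span property="name">An Example Journal</span>

  <!-- Element with href: the value of description is the URL, not the word "Mission" -->
  <a href="http://example.org/mission" property="description">Mission</a>

  <!-- Element with src: the value of image is the URL of the image file -->
  <img src="logo.png" alt="An Example Journal" property="image">
</div>
```

This is why the name in the listing above is supplied through a meta element with a content attribute, while image and description can sit directly on the img and a elements.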
issn
Periodicals are often identified by an ISSN, and the Periodical type offers an issn property for that purpose. Go ahead and mark up the ISSN accordingly.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
</head>
<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<meta property="name" content="The Code4Lib Journal">
<a href="http://journal.code4lib.org/"><span resource="#periodical"><img src="../resources/logo.png"
alt="The Code4Lib Journal" property="image"></span></a>
</h1>
<h2 id="issn">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="about">
<h2>About</h2>
<ul>
<li class="page_item page-item-5">
<a href="http://journal.code4lib.org/mission" property="description">Mission</a>
</li>
<li class="page_item page-item-6">
<a href="http://journal.code4lib.org/editorial-committee">Editorial Committee</a>
</li>
<li class="page_item page-item-8">
<a href="http://journal.code4lib.org/process-and-structure">Process and Structure</a>
</li>
<li><a href="http://code4lib.org/">Code4Lib</a></li>
</ul>
</div>
...
The publisher property is available to any CreativeWork and is certainly appropriate for periodicals. In this case, the publisher is the Code4Lib organization; use the existing link for that purpose.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
</head>
<body vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<meta property="name" content="The Code4Lib Journal">
<a href="http://journal.code4lib.org/"><span resource="#periodical"><img src="../resources/logo.png"
alt="The Code4Lib Journal" property="image"></span></a>
</h1>
<h2 id="issn">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="about">
<h2>About</h2>
<ul>
<li class="page_item page-item-5">
<a href="http://journal.code4lib.org/mission" property="description">Mission</a>
</li>
<li class="page_item page-item-6">
<a href="http://journal.code4lib.org/editorial-committee">Editorial Committee</a>
</li>
<li class="page_item page-item-8">
<a href="http://journal.code4lib.org/process-and-structure">Process and Structure</a>
</li>
<li><a href="http://code4lib.org/" property="publisher">Code4Lib</a></li>
</ul>
</div>
...
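The listing above records only the URL of the publisher. One possible refinement, shown here as a sketch that is not part of the checkpoint file, is to also type the publisher as an Organization and give it a name. The wrapping div stands in for the page body.

```html
<div vocab="http://schema.org/" typeof="Periodical" resource="#periodical">
  <!-- property + typeof on the link: the publisher value is the resource
       http://code4lib.org/, which is also typed as an Organization -->
  <a href="http://code4lib.org/" property="publisher" typeof="Organization">
    <!-- Inside the link, name describes the organization, not the periodical -->
    <span property="name">Code4Lib</span>
  </a>
</div>
```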
Checkpoint: Your original periodical issue page should now look like step1/check_a.html.
While some periodicals are divided into volumes and issues, others (like this example) are simply a continuous sequence of issues. In this exercise, you use the PublicationIssue type to identify some of the important attributes of a periodical issue.
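For a periodical that is divided into volumes, the chain could look roughly like the following sketch. It assumes the PublicationVolume type and volumeNumber property from the same proposal, uses the isPartOf property that is covered later in this exercise, and relies on made-up identifiers, numbers, and an example.org URL.

```html
<div vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  Issue <span property="issueNumber">3</span> of
  <!-- The issue is part of a volume -->
  <span property="isPartOf" typeof="PublicationVolume" resource="#volume">
    volume <span property="volumeNumber">12</span> of
    <!-- The volume, in turn, is part of the periodical -->
    <a href="http://example.org/journal" property="isPartOf" typeof="Periodical">
      <span property="name">An Example Journal</span>
    </a>
  </span>
</div>
```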
PublicationIssue entity
To be able to mark up the properties for the publication issue, you first need to declare the PublicationIssue type. At this point, let's revisit the earlier decision to mark the body of the page with the Periodical type, as the primary entity of this page is really the PublicationIssue.
Edit the body element so that it declares a type of PublicationIssue. You should also change the value of the associated @resource attribute, as the old value of #periodical would otherwise be misleading.
The image and name properties still belong to the Periodical type, which you can now declare directly on the link to the journal web site. The href attribute on an <a> element acts just like declaring a @resource attribute. Remove the <span> element that includes the @resource attribute and let the image and name properties apply to the Periodical type.
The issn property needs to apply to the Periodical type as well. Fortunately, you can declare a @resource attribute with the value http://journal.code4lib.org/ on the surrounding <h2> element to make the property apply to the right type.
The description property also belongs to the Periodical type. Repeat the process of declaring a @resource attribute with the value http://journal.code4lib.org/ on a surrounding element.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
<link href="../resources/style.css" rel="stylesheet">
</head>
<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<a href="http://journal.code4lib.org/" typeof="Periodical"><img
src="../resources/logo.png" alt="The Code4Lib Journal"
property="image"><meta property="name" content="The Code4Lib Journal"></a>
</h1>
<h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="about">
<h2>About</h2>
<ul>
<li class="page_item page-item-5" resource="http://journal.code4lib.org/">
<a href="http://journal.code4lib.org/mission" property="description">Mission</a>
</li>
...
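Stripped of the page layout, the two ways of pointing properties at the periodical can be compared side by side. This is only a sketch; the wrapping div is an assumption added to keep the fragment self-contained.

```html
<div vocab="http://schema.org/">
  <!-- Variant 1: the href on the link identifies the entity that the
       nested properties describe -->
  <a href="http://journal.code4lib.org/" typeof="Periodical">
    <span property="name">The Code4Lib Journal</span>
  </a>

  <!-- Variant 2: a resource attribute on any other element does the same job -->
  <h2 resource="http://journal.code4lib.org/">
    ISSN <span property="issn">1940-5758</span>
  </h2>
</div>
```

Because both variants use exactly the same URI, including the trailing slash, a parser should treat name and issn as properties of a single Periodical entity.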
Two of the core attributes of an issue of a periodical are the date on which it was published and the identifying number. For the former, schema.org offers the datePublished property, and the issueNumber property for the latter.
Enhance the issue page by marking the datePublished and issueNumber properties.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
<link href="../resources/style.css" rel="stylesheet">
</head>
<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<a href="http://journal.code4lib.org/" typeof="Periodical"><img
src="../resources/logo.png" alt="The Code4Lib Journal"
property="image"><meta property="name" content="The Code4Lib Journal"></a>
</h1>
<h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="content" class="listpage">
<h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>
<div class="article" id="post-9345">
<h2 class="articletitle">
<a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
Diversity of Voices</a>
</h2>
...
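The sample page already displays the date in ISO 8601 form, so a plain span is enough. If a page showed the date in prose instead (an assumption; this page does not), a time element with a datetime attribute is one way to keep the machine-readable value intact:

```html
<h1 class="pagetitle" vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  Issue <span property="issueNumber">24</span>,
  <!-- The datetime attribute supplies the value of datePublished;
       the element content stays human-friendly -->
  <time property="datePublished" datetime="2014-04-16">April 16, 2014</time>
</h1>
```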
Connect the PublicationIssue to the Periodical
Now that you have identified the key properties of both the issue and the periodical, you should make an explicit connection between the two entities so that it is clear that the PublicationIssue is a part of the Periodical. You can use either the isPartOf or hasPart property to make that connection, depending on what is more convenient for your page.
Note: These properties are useful in general for declaring relationships between composite CreativeWork entities, such as the individual volumes in a book trilogy, or poems in a poetry collection.
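As a rough illustration of the trilogy case from the note, here is a sketch that uses hasPart with placeholder titles and hypothetical fragment identifiers:

```html
<div vocab="http://schema.org/" typeof="Book" resource="#trilogy">
  <h1 property="name">An Example Trilogy</h1>
  <ul>
    <!-- Each list item is a Book that the trilogy contains -->
    <li property="hasPart" typeof="Book" resource="#volume1">
      <span property="name">Volume One</span>
    </li>
    <li property="hasPart" typeof="Book" resource="#volume2">
      <span property="name">Volume Two</span>
    </li>
    <li property="hasPart" typeof="Book" resource="#volume3">
      <span property="name">Volume Three</span>
    </li>
  </ul>
</div>
```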
In this case, use the isPartOf property to declare that the PublicationIssue is part of the Periodical entity.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
<link href="../resources/style.css" rel="stylesheet">
</head>
<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
src="../resources/logo.png" alt="The Code4Lib Journal"
property="image"><meta property="name" content="The Code4Lib Journal"></a>
</h1>
<h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
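The same relationship can be stated from the other side with hasPart. The following sketch assumes a hypothetical layout in which the markup for the periodical wraps the issue; the sample page is not structured this way, which is why isPartOf is the more convenient choice there.

```html
<div vocab="http://schema.org/" typeof="Periodical" resource="http://journal.code4lib.org/">
  <meta property="name" content="The Code4Lib Journal">
  <!-- hasPart points from the periodical to the issue -->
  <div property="hasPart" typeof="PublicationIssue" resource="#issue">
    Issue <span property="issueNumber">24</span>
  </div>
</div>
```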
Checkpoint: Your periodical page should now look like step1/check_b.html.
In this exercise, you learned:
How to use the @href and @resource attributes to declare absolute URIs for identifiers instead of simply using relative fragment (#foobar) identifiers.
How to express that a CreativeWork is part of, or contains, another CreativeWork.
The most interesting part of any given issue of a periodical for readers and researchers is not so much the issue number or the date of publication; it is instead the actual articles contained in that issue. So far, however, you have not yet identified any of the articles in this issue.
What you have is a table of contents for the issue, with high-level metadata for the articles, not the content of the articles themselves. Therefore, you should mark up the entries in the table of contents as ScholarlyArticle entities (a fitting type, as this is an edited journal).
CreativeWork properties of each article
Checking the properties for ScholarlyArticle shows that the familiar core properties inherited from CreativeWork and Thing are of primary interest: name, author, and description.
Mark up the articles in the table of contents as ScholarlyArticle entities with those core properties.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
<link href="../resources/style.css" rel="stylesheet">
</head>
<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
src="../resources/logo.png" alt="The Code4Lib Journal"
property="image"><meta property="name" content="The Code4Lib Journal"></a>
</h1>
<h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="content" class="listpage">
<h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>
<div class="article" id="post-9345" typeof="ScholarlyArticle">
<h2 class="articletitle">
<a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
Diversity of Voices</a>
</h2>
<p class="author"><span property="author">Ron Peterson</span></p>
<div class="abstract" property="description">
<p>Making the Journal the best that it can be.</p>
</div>
</div>
<div class="article" id="post-9519" typeof="ScholarlyArticle">
<h2 class="articletitle">
<a href=
"http://journal.code4lib.org/articles/9519">EgoSystem: Where are our Alumni?</a>
</h2>
<p class="author"><span property="author">James Powell</span>, <span
property="author">Harihar Shankar</span>, <span property="author">Marko Rodriguez</span>,
<span property="author">Herbert Van de Sompel</span></p>
<div class="abstract" property="description">
<p>Comprehensive social search on the Internet remains an unsolved problem.
Social networking sites tend to be isolated from each other, and the
information they contain is often not fully searchable outside the confines of
...
</div>
</div>
...
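The listing above marks up author and description but leaves the article title without a name property. One way to add it, offered as a sketch rather than as the contents of the checkpoint file, is to wrap the title link in a span. Placing property="name" on the a element itself would record the link URL instead of the title text.

```html
<div vocab="http://schema.org/" class="article" id="post-9345" typeof="ScholarlyArticle">
  <h2 class="articletitle">
    <!-- The span has no href, so the value of name is its text content -->
    <span property="name"><a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a Diversity of Voices</a></span>
  </h2>
  <p class="author"><span property="author">Ron Peterson</span></p>
</div>
```

The vocab attribute on the div is there only to make the fragment self-contained; on the real page it is inherited from the body element.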
As your table of contents offers metadata about each article but does not provide the actual content of the article, use the appropriate absolute URIs as the identifiers for each ScholarlyArticle entity.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
<link href="../resources/style.css" rel="stylesheet">
</head>
<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
src="../resources/logo.png" alt="The Code4Lib Journal"
property="image"><meta property="name" content="The Code4Lib Journal"></a>
</h1>
<h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="content" class="listpage">
<h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>
<div class="article" id="post-9345" typeof="ScholarlyArticle"
resource="http://journal.code4lib.org/articles/9345">
<h2 class="articletitle">
<a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
Diversity of Voices</a>
</h2>
<p class="author"><span property="author">Ron Peterson</span></p>
<div class="abstract" property="description">
<p>Making the Journal the best that it can be.</p>
</div>
</div>
<div class="article" id="post-9519" typeof="ScholarlyArticle"
resource="http://journal.code4lib.org/articles/9519">
<h2 class="articletitle">
<a href=
"http://journal.code4lib.org/articles/9519">EgoSystem: Where are our Alumni?</a>
</h2>
<p class="author"><span property="author">James Powell</span>, <span
property="author">Harihar Shankar</span>, <span property="author">Marko Rodriguez</span>,
<span property="author">Herbert Van de Sompel</span></p>
<div class="abstract" property="description">
<p>Comprehensive social search on the Internet remains an unsolved problem.
Social networking sites tend to be isolated from each other, and the
information they contain is often not fully searchable outside the confines of
...
</div>
</div>
...
Now that the articles have been individually identified, you can link each article to the containing periodical issue using either the hasPart or isPartOf property. On this page, the least disruptive approach to the existing markup is to use the hasPart property.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>The Code4Lib Journal – Issue 24</title>
<link href="../resources/style.css" rel="stylesheet">
</head>
<body vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
<div id="page">
<div id="header">
<div id="headerbackground">
<h1>
<a href="http://journal.code4lib.org/" property="isPartOf" typeof="Periodical"><img
src="../resources/logo.png" alt="The Code4Lib Journal"
property="image"><meta property="name" content="The Code4Lib Journal"></a>
</h1>
<h2 id="issn" resource="http://journal.code4lib.org/">ISSN <span property="issn">1940-5758</span></h2>
</div>
</div>
...
<div id="content" class="listpage">
<h1 class="pagetitle">Issue <span property="issueNumber">24</span>, <span property="datePublished">2014-04-16</span></h1>
<div class="article" id="post-9345" property="hasPart" typeof="ScholarlyArticle"
resource="http://journal.code4lib.org/articles/9345">
<h2 class="articletitle">
<a href="http://journal.code4lib.org/articles/9345">Editorial Introduction: Seeking a
Diversity of Voices</a>
</h2>
<p class="author"><span property="author">Ron Peterson</span></p>
<div class="abstract" property="description">
<p>Making the Journal the best that it can be.</p>
</div>
</div>
<div class="article" id="post-9519" property="hasPart" typeof="ScholarlyArticle"
resource="http://journal.code4lib.org/articles/9519">
<h2 class="articletitle">
<a href=
"http://journal.code4lib.org/articles/9519">EgoSystem: Where are our Alumni?</a>
</h2>
<p class="author"><span property="author">James Powell</span>, <span
property="author">Harihar Shankar</span>, <span property="author">Marko Rodriguez</span>,
<span property="author">Herbert Van de Sompel</span></p>
<div class="abstract" property="description">
<p>Comprehensive social search on the Internet remains an unsolved problem.
Social networking sites tend to be isolated from each other, and the
information they contain is often not fully searchable outside the confines of
...
</div>
</div>
...
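For comparison, the isPartOf direction would need one extra element per article. The sketch below assumes a non-rendered link element inside each article that points back at the #issue identifier; it is an alternative to the hasPart markup above, not an addition to it.

```html
<div vocab="http://schema.org/" typeof="PublicationIssue" resource="#issue">
  <div class="article" id="post-9345" typeof="ScholarlyArticle"
       resource="http://journal.code4lib.org/articles/9345">
    <!-- The link element is not displayed; it only states that the
         article is part of the issue -->
    <link property="isPartOf" href="#issue">
    <p class="author"><span property="author">Ron Peterson</span></p>
  </div>
</div>
```

Either direction expresses the same relationship, so the choice mostly comes down to which one requires fewer changes to the existing page.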
Checkpoint: Your periodical issue page should now look like step1/check_c.html.
Dan Scott is a systems librarian at Laurentian University.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.