These are the materials for a workshop on getting started with Wikibase, offered at Semantic Web in Libraries 2018 in Bonn, Germany on November 26, 2018.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Originally developed for the Wikidata project, Wikibase "is a collection of applications and libraries for creating, managing and sharing structured data." (http://wikiba.se/) It offers a multilingual platform for linked open data, including a human-friendly editing interface, a SPARQL endpoint, and programmatic means of loading and accessing data, making it a potential match for libraries that like Wikidata’s platform but want to maintain a local store of linked open data.

In this workshop, we discuss how a local Wikibase instance can support library needs, and work through exercises that include:

  • Setting up a local Wikibase instance

  • Adding users

  • Creating custom classes and properties

  • Adding and editing entries

  • Loading data in bulk

  • Querying the data

  • Integrating data with external applications.

What is Wikibase?

Wikibase is developed and supported by Wikimedia Deutschland. It is a structured data repository built as an extension to the MediaWiki software. Wikidata, the best-known implementation of Wikibase, was developed to provide better control over the multilingual Wikipedia environment by linking "like" Wikipedia entries across language editions (think lots of "same as" statements). For example, through Wikidata all of the pages about Celine Dion in the different language Wikipedias are linked.

It was also developed as a way to support additional functionality on Wikipedia. Statements created on Wikidata could be managed in one place and used in multiple Wikipedia pages. Examples include adding links to external data repositories (more “same as” statements, but instead to external sources such as national libraries) as well as information such as birth dates or geographic coordinates that could be used in info boxes.

Wikibase includes functionality for creating and managing a knowledge base, including user-defined properties. It has a comprehensive JavaScript-based user interface that makes it easy to access and update your data. Its data model is open and multilingual, accommodating a wide variety of use cases. You can export data in a variety of formats, including JSON, RDF/XML, N3, and Turtle; query and visualize the data with SPARQL; and rely on the version history that is tracked for every change. One of Wikibase's biggest strengths is its human-friendly interface.
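As a concrete illustration of the JSON export, every Wikibase instance serves entity JSON through the Special:EntityData page (for example, http://localhost:8181/wiki/Special:EntityData/Q1.json). The following is a minimal sketch of reading the labels section of that format, using a hard-coded sample in place of a live request; the helper function is illustrative, not part of Wikibase:

```python
import json

# Simplified excerpt of the JSON that Special:EntityData returns for an item.
# Real responses contain many more keys (claims, sitelinks, datatypes, etc.).
sample = json.loads("""
{
  "entities": {
    "Q1": {
      "type": "item",
      "labels": {
        "en": {"language": "en", "value": "Ludwig van Beethoven"},
        "de": {"language": "de", "value": "Ludwig van Beethoven"}
      },
      "aliases": {
        "en": [{"language": "en", "value": "Beethoven"}]
      }
    }
  }
}
""")

def label(entity_json, qid, lang="en"):
    """Return the label of an entity in the requested language, or None."""
    entity = entity_json["entities"][qid]
    lang_entry = entity.get("labels", {}).get(lang)
    return lang_entry["value"] if lang_entry else None

print(label(sample, "Q1"))        # Ludwig van Beethoven
print(label(sample, "Q1", "fr"))  # None
```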

It’s also important to recognize that Wikibase, like Wikipedia and Wikidata, is a community, although one still in the early stages of development. We, as potential community members, can play an active role in shaping what it becomes. Part of today’s discussion will focus on documenting and discussing Wikibase as a community, and how we might define community structures, needs, and culture. At the end of the workshop we will loop back to this discussion.

Setting up a local Wikibase instance

This workshop assumes a Docker install, rather than a Vagrant or bare-metal install.

Docker prerequisites

Laptop prerequisites

Ensure that virtualization support and memory protection are enabled in your BIOS/UEFI; Docker CE will not run without them.

If you do not have virtualization support on your computer, Docker CE will not run. In that case, you can use a cloud-hosted virtual machine (OpenStack, Google Cloud Compute VM, Amazon AWS… for example Quickstart for Google Cloud Compute VM on Linux)

Linux

The user account that you use to run the Docker commands either needs to be a member of the "docker" group (but be warned, that effectively means the user has superuser privileges), or you will need to prefix every Docker command with "sudo".

  • Install and start Docker using your Linux flavour’s commands

  • Ensure Docker has at least 4 GB of RAM available, and raise the kernel’s vm.max_map_count setting (Elasticsearch requires this):

    sudo sysctl -w vm.max_map_count=262144

This setting will revert to your system default when you reboot. To make it persistent across reboots, something like the following command should work on most Linux flavours:

sudo sh -c 'echo vm.max_map_count=262144 > /etc/sysctl.d/10-vm_max_map_count.conf'

Windows 10 Pro/Enterprise Edition

Note: Docker CE does not work on Windows 10 Home edition.

Mac OS 10.11.3 (El Capitan) or above

Installing Wikibase (all operating systems)

Once you have Docker CE set up and configured for your operating system, perform the following steps:

  1. Download https://raw.githubusercontent.com/wmde/wikibase-docker/master/docker-compose.yml to the location of your choice.

  2. Start a command line/terminal/Powershell and navigate to the same location as the docker-compose.yml file you just downloaded.

  3. Issue the docker-compose command that downloads prebuilt Wikibase Docker images:

    docker-compose -f docker-compose.yml pull
  4. Issue the docker-compose command that initializes and starts the Wikibase Docker images:

    docker-compose -f docker-compose.yml up

You should now be able to browse to http://localhost:8181/ and see the generic Wikibase landing page. Congratulations, you have arrived!

What’s going on under the hood?

The Docker compose configuration actually creates and controls 8 separate containers, which combined provide Wikibase, the Wikibase Query Service, and the Wikibase Quickstatements service:

  • wikibase/wikibase - provides MediaWiki with extensions, including the Wikibase Repo and Wikibase Client extensions

  • mariadb or mysql - provides the relational database that MediaWiki/Wikibase rely on

  • elasticsearch - provides full-text search support

  • wdqs - provides the Wikibase Query Service (Blazegraph instance)

  • wdqs (updater) - retrieves updated statements from the relational database and loads them in the Wikibase Query Service

  • wdqs (frontend) - provides the custom user interface for the Wikibase Query Service

  • wdqs (proxy) - provides read-only access to the Wikibase Query Service, and enforces maximum query execution times

  • quickstatements - provides the Wikibase QuickStatements service

Stop the Wikibase services

To stop all of the Wikibase services, navigate to the same location as the docker-compose.yml file and issue the following command:

docker-compose -f docker-compose.yml stop

To stop a single Wikibase service defined in the docker-compose.yml file, navigate to the same location as the docker-compose.yml file and issue the following command:

docker-compose -f docker-compose.yml stop <service-name>

Start the Wikibase services

To start all of the Wikibase services, navigate to the same location as the docker-compose.yml file and issue the following command:

docker-compose -f docker-compose.yml start

To start a single Wikibase service defined in the docker-compose.yml file, navigate to the same location as the docker-compose.yml file and issue the following command:

docker-compose -f docker-compose.yml start <service-name>

Wikibase 1.31 images

The current Wikibase Docker images package 1.30, but there are some features in 1.31 such as linkages to external vocabularies that you might want. While it is experimental, you can build your own 1.31 images.

  1. Retrieve a copy of the Wikibase 1.31 working branch.

  2. From that location, build the Wikibase 1.31 image, then issue the docker-compose build command to build the complete set of images:

    docker build -t wikibase/wikibase:1.31 wikibase/1.31/base/
    docker-compose -f docker-compose-build.yml build

    If all goes well, all of the images will build successfully.

  3. Start the images, specifying the build file:

    docker-compose -f docker-compose-build.yml up

You’re now up and running with Wikibase 1.31!

Removing all data and starting fresh

If the data in your query service doesn’t seem to be updating, it may be the case that you have an old volume which is preventing the query service updater from running. Your logs might show the error message:

java.lang.IllegalStateException: RDF store reports the last update time is
before the minimum safe poll time.  You will have to reload from scratch or you
might have missing data.

To remove all of the data from your Docker images and start from scratch, run the following command (adding the -f flag to point at the docker-compose.yml or docker-compose-build.yml file that you initially used to create the volumes):

docker-compose -f <compose-file> down --volumes

Adding users

The default Administrator account

When you first launch the Docker containers, a default Administrator account is created. You can find the user name and password for the account in the docker-compose.yml or docker-compose-build.yml file you used to start the instance. By default, those values are:

  • MW_ADMIN_NAME=admin

  • MW_ADMIN_PASS=adminpass

To change these values, edit the docker-compose configuration file before the containers are created with the first docker-compose up command. If you have already created the containers and volumes, see Removing all data and starting fresh.
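For example, the admin credentials appear in the environment section of the wikibase service. The fragment below is a sketch; the image tag and exact layout vary between versions of the wikibase-docker configuration:

```yaml
services:
  wikibase:
    image: wikibase/wikibase:1.30
    environment:
      - MW_ADMIN_NAME=admin
      # Change this value before the first "docker-compose up"!
      - MW_ADMIN_PASS=adminpass
```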

Creating user accounts

While using an Administrator account is fine for getting started, it is not good security practice to routinely use an account with elevated privileges, and eventually you will want to involve other users. Therefore, you should create a regular user account for yourself. The simplest way to create an account is to click the Create Account link that appears in the top right corner when you are not logged in.

You can also create accounts for other users. Click Special Pages → Create account (under the "Log in / Create account" heading).

Differentiate between privilege levels: admin, non-admin, bot

Disabling account creation and anonymous edits

You might want to expose your Wikibase content to the rest of the world so they can read and query the Wikibase contents, while at the same time restricting the ability to edit your Wikibase to users with accounts. To do this, you need to edit the LocalSettings.php file that is generated in the /var/www/html directory of the Wikibase container.

The name of your running wikibase container will generally follow the format: <prefix>_<service>_<suffix>, where:

  • <prefix> matches the name of the directory in which the docker-compose.yml file lives;

  • <service> matches the name of the particular service defined in the docker-compose.yml file;

  • <suffix> matches the integer of the instance, supporting multiple instances of the same container, possibly combined with a hash representing the version of the container

The following examples use wikibase-docker_wikibase_1; substitute the name of your wikibase container.

  1. Run the following command to find the name of your wikibase container:

    docker ps
  2. Copy the /var/www/html/LocalSettings.php file from inside the wikibase container to a location on your host computer. This example copies the file to /tmp/ but you should adjust according to your operating system:

    docker cp wikibase-docker_wikibase_1:/var/www/html/LocalSettings.php /tmp/.
  3. On your host computer, open the file in the text editor of your choice (Vim, Notepad++, Sublime, Atom, whatever) and add the following lines to the bottom of the file:

    $wgGroupPermissions['*']['edit'] = false;
    $wgGroupPermissions['*']['createaccount'] = false;

    Note: the file might have been created on your host computer with permissions that prevent your regular user from changing it.

    For example, on Linux, if you are using sudo to run Docker commands, the file will be owned by the root user. You can run sudo chmod a+w /tmp/LocalSettings.php to grant write privileges, or change the ownership of the file.

  4. Copy the LocalSettings.php file from the location on your host computer back into the wikibase container at /var/www/html/LocalSettings.php. This example copies the file from /tmp/ but you should adjust according to your operating system:

    docker cp /tmp/LocalSettings.php wikibase-docker_wikibase_1:/var/www/html/.

Some changes are cached by the Wikibase HTTP server, so when you reload a Wikibase page, it might show the "Create account" and "Edit" links until those pages are purged from the cache. The links will not function, however.

The easiest way to purge the cache is to restart the wikibase service with the following command:

docker-compose -f docker-compose.yml restart wikibase

Customizing Wikibase

Creating classes and properties

Let’s add some sample data to ensure the basics are working.

  1. To create your first item in Wikibase, click Special Pages (on the left-hand menu) → Create a new item (under the "Wikibase" heading). You can add a label, a description, and some aliases for the item. This is not very exciting; you need some properties to truly describe the item!

  2. To create a property in Wikibase, click Special Pages (on the left-hand menu) → Create a new property (under the "Wikibase" heading). Now you can add a label, description, some aliases for the property, and choose a data type.

  3. Search for the item that you just created, retrieve the item, and click Edit. You should now be able to add a value for the property that you just created.

Considerations

  • Consider your use case(s) and what data model is required

  • Consider mapping to existing vocabularies

Customizing the logo

Docker can use "volumes" to make files in your containers persistent, accessible from multiple containers, and (on Linux) directly available from your host system. If you search through the docker-compose.yml or docker-compose-build.yml file you used to start the instance, you will find declarations like the following:

services:
  wikibase:
    image: wikibase/wikibase:1.31-bundle
    volumes:
      - mediawiki-images-data:/var/www/html/images

This tells us that the wikibase service has defined a volume where images will be stored and made accessible from multiple containers.

  1. Get a banner image. The following example downloads the SWIB 10-year anniversary banner and (in the optional next step) scales it down to an appropriately sized logo named jubi-logo.jpg; you can use any image you like, adjusting the names in the following instructions accordingly. If you skip the scaling step, rename the downloaded file to jubi-logo.jpg.

    Note: on Windows, the curl command will likely not work; you may instead need to use your browser to download the logo.

    curl -O http://swib.org/swib18/images/jubi-banner.jpg
  2. (Optional): The MediaWiki manual says the logo size should be 135x135 pixels. The SWIB jubilee banner is too large, so if you have ImageMagick installed, you can scale it down to create an appropriately sized logo:

    convert -scale 135x135 jubi-banner.jpg jubi-logo.jpg
  3. Use the "docker cp" commands demonstrated in Disabling account creation and anonymous edits to copy the logo image file into the /var/www/html/images/ directory:

    docker cp jubi-logo.jpg wikibase-docker_wikibase_1:/var/www/html/images/.

    The banner should be visible at http://localhost:8181/images/jubi-logo.jpg.

  4. Now you can edit the LocalSettings.php file to point to the new logo, following the process outlined in Disabling account creation and anonymous edits. This time you want to add the following line to the end of the file:

    $wgLogo = "$wgResourceBasePath/images/jubi-logo.jpg";

    Remember to copy the modified LocalSettings.php file from your host computer back into the container!

  5. Restart the wikibase container to flush the cache. The next page you load should display the logo.

This is a total hack, but if you dislike the empty "wikibase-sitelinks-wikinews" and assorted other link boxes that don’t really make sense outside of Wikidata, you can:

  1. Edit /var/www/html/extensions/Wikibase/view/src/ItemView.php to make the getSideHtml() function return an empty string. Something like:

    protected function getSideHtml( EntityDocument $entity ) {
      if ( !( $entity instanceof Item ) ) {
        throw new InvalidArgumentException( '$item must be an Item' );
      }
      return '';
      //return $this->getHtmlForPageImage()
      //     . $this->getHtmlForSiteLinks( $entity );
    }

Customizing the menu

It can be painful to go through the Special Pages menu every time you want to add a new item or property. Let’s customize the menu so those entries are readily available:

  1. Open http://localhost:8181/wiki/MediaWiki:Sidebar. You will see that this special page just contains wiki markup (oh right, this is running on top of a wiki!) like:

    * navigation
    ** mainpage|mainpage-description
    ** recentchanges-url|recentchanges
    ** randompage-url|randompage
    ** helppage|help
    * SEARCH
    * TOOLBOX
    * LANGUAGES
  2. Click Edit to edit the page, and change the wiki markup to add new links to the "New item" and "New Property" pages. The markup for these links looks like:

    ** Special:NewItem|New item
    ** Special:NewProperty|New property

    The final version of your menu wiki markup text should look something like:

    * navigation
    ** mainpage|mainpage-description
    ** recentchanges-url|recentchanges
    ** Special:NewItem|New item
    ** Special:NewProperty|New property
    ** randompage-url|randompage
    ** helppage|help
    * SEARCH
    * TOOLBOX
    * LANGUAGES
  3. Click Save Changes. The page refreshes and the new links appear on your menu.

Adding media properties (sound files, images, video)

Problem: there is a data type for Commons Media, and this is hardcoded to search Wikimedia Commons and present a typeahead selector box.

  • On Wikibase 1.30, it seems to work, but links out to Wikimedia Commons (which may not be appropriate for many use cases).

  • On Wikibase 1.31, it fails the first time, and then the second time it will create a link to a local file that does not exist. So for now, linking to external sources is the best option.

Language support extension

The Wikibase Docker image currently ships without the UniversalLanguageSelector extension, which means that when you enter text and are prompted for a language, you don’t get any typeahead support, and the entry may fail to save. That’s annoying, so let’s fix it.

As in Disabling account creation and anonymous edits, the following examples use the container name wikibase-docker_wikibase_1; substitute the name of your wikibase container.

The steps to add the UniversalLanguageSelector extension are as follows:

  1. Run the following command to find the name of your wikibase container:

    docker ps
  2. Attach to the wikibase container by running a bash shell in it:

    docker exec -it wikibase-docker_wikibase_1 /bin/bash
  3. Change to the /var/www/html/extensions/ directory:

    cd /var/www/html/extensions/
  4. Create the extension:

    git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/UniversalLanguageSelector.git
  5. Load the extension:

    echo "wfLoadExtension( 'UniversalLanguageSelector' );" >> /var/www/html/LocalSettings.php
  6. Run the update maintenance script to ensure any required tables are created, etc. You should run this every time you add another extension, or update to a new version of MediaWiki:

    php /var/www/html/maintenance/update.php

To check if it worked, enter text for a string property and you should now have typeahead support for en, fr, etc.

Adding quality constraints for property values

Wikidata includes a mechanism for defining constraints on property values (take a look at the property constraints defined on ISSN (P236), for example). Like the UniversalLanguageSelector extension, this support is provided through a set of extensions: the Wikibase Quality Extensions.
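Installing them follows the same pattern as the UniversalLanguageSelector extension described earlier. The following is a sketch, assuming the WikibaseQualityConstraints repository name on Gerrit; check the extension’s documentation for the exact repository and any companion extensions it requires:

```shell
# Attach to the wikibase container (substitute your container name)
docker exec -it wikibase-docker_wikibase_1 /bin/bash

# Inside the container: fetch the extension into the extensions directory
# (the repository name is an assumption; verify it on gerrit.wikimedia.org)
cd /var/www/html/extensions/
git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/WikibaseQualityConstraints.git

# Load the extension and run the update maintenance script,
# as with any MediaWiki extension
echo "wfLoadExtension( 'WikibaseQualityConstraints' );" >> /var/www/html/LocalSettings.php
php /var/www/html/maintenance/update.php
```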

Formatter URLs for external identifiers

Wikidata supports nicely formatted URLs for properties that resolve to external identifiers in the HTML, such as the VIAF property.

In the Wikidata page for Melissa McClelland, if you hover over the VIAF property value "106549076", it is a link that leads to https://viaf.org/viaf/106549076/ - this magic happens thanks to the P1630 property, which enables you to define how the identifier should be formatted.
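The $1 substitution that formatter URLs perform is simple string templating. A sketch in Python; the template shown is the one Wikidata defines for VIAF, and the helper function is illustrative, not part of Wikibase:

```python
def formatter_url(template: str, value: str) -> str:
    """Expand a Wikibase formatter URL template by substituting $1
    with the property value (e.g. a VIAF identifier)."""
    return template.replace("$1", value)

# The formatter URL defined on Wikidata's VIAF property (P214):
viaf_template = "https://viaf.org/viaf/$1/"

print(formatter_url(viaf_template, "106549076"))
# https://viaf.org/viaf/106549076/
```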

To support this in Wikibase, perform the following steps:

  1. Create a new property using the "String" data type and the following suggested values:

    • Label: formatter URL

    • Description: web page URL; URI template from which "$1" can be automatically replaced with the effective property value on items

    • Data type: String

      A new property is created. Substitute that property ID for P1630 in the following steps!

  2. Add the following setting to LocalSettings.php and deploy it to your wikibase container (see Disabling account creation and anonymous edits if you need a refresher!):

    $wgWBRepoSettings['formatterUrlProperty'] = 'P1630';
  3. Create a new property using the "External identifier" data type that mirrors Wikidata’s VIAF identifier:

    • Label: VIAF identifier

    • Description: Virtual International Authority File identifier

    • Data type: External identifier

      A new property is created. Substitute that property ID for P214 in the following steps.

  4. Edit the new property (P214) to add a statement that uses your "formatter URL" property with the value https://viaf.org/viaf/$1/

  5. Edit one of your items to add a VIAF identifier statement to it. For example, edit the item "Ella Fitzgerald" to add the statement:

    • Property: VIAF identifier

    • Value: 6148211

When you reload the item for "Ella Fitzgerald", the VIAF identifier should now display 6148211 but be linked to https://viaf.org/viaf/6148211.

However, if you check the RDF generated for "Ella Fitzgerald", you will not find a meaningful link from your item to VIAF. In Wikidata, these links are generated through property P1921 ('formatter URI for RDF resource').

  1. To add links to your Wikibase item RDF, you need to create a new property using the "String" data type that mirrors Wikidata’s VIAF identifier:

    • Label: canonical URI

    • Description: generates RDF links to external identifiers

    • Data type: String

      A new property is created. Substitute that property ID for P1921 in the following steps.

  2. Add the following setting to LocalSettings.php and deploy it to your wikibase container (see Disabling account creation and anonymous edits if you need a refresher!):

    $wgWBRepoSettings['canonicalUriProperty'] = 'P1921';
  3. Edit your "VIAF identifier" property to add a statement that uses your "canonical URI" property with the value https://viaf.org/viaf/$1/

Problem: Unfortunately canonicalUriProperty does not seem to be working, just as it was not working in July.

Wikibase data model

This is a brief outline of the basic structure of the Wikibase data model.

Wikibase is not tied to one specific ontology or data model and you are therefore free to “model the world” as suits your needs. However it does have a base data structure.

The Basic outline primer states that: “A Wikibase knowledge base is a collection of Entities.” These Entities come in two “kinds”: items (with a prefix of Q) and properties (with a prefix of P). Note: properties can have subclasses.

Each item in Wikibase is a page (a nice thing about building on MediaWiki is that each page/item can be edited). Every item has a label and a description, which can be entered in multiple languages and which document the meaning of the item and help users understand its use. Items can also include aliases, which are useful for alternate spellings or alternate names. For example: Label: Ludwig van Beethoven, with the aliases Beethoven, Louis van Beethoven, and L. van Beethoven.

Properties have datatypes. These are available as a selection in the Wikibase interface, and the list can be extended by developers, but not by users (you will want to give your users different levels of access).

Qualifiers are used in conjunction with particular kinds of properties. For example, time-based properties such as “employed at” or “president” need to be qualified with dates in order to make sense. Qualifiers can also record the kind or method of data; the example from the “Primer” is the way a set of geographic coordinates was collected. The qualifier is integral to the use and interpretation of the property. As stated above, however, Wikibase has an open data model, and the development and definition of qualifiers is determined by the use case.

Wikibase also includes a mechanism for documenting statement-level provenance, termed “references” in Wikibase and Wikidata. References should themselves be items in Wikibase.

Statements = Claim (Property, Value, Qualifier) + Reference + Rank
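The formula above can be sketched as a plain data structure. This is a simplification of the real data model: the property and item IDs in the example are Wikidata’s (P39 “position held”, P580 “start time”), the reference item ID is hypothetical, and the class names are illustrative:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Claim:
    property_id: str                # e.g. "P39" ("position held" on Wikidata)
    value: str                      # e.g. "Q11696" ("President of the United States")
    qualifiers: Dict[str, str] = field(default_factory=dict)  # e.g. {"P580": "1861-03-04"}

@dataclass
class Statement:
    claim: Claim
    references: List[str] = field(default_factory=list)  # item IDs of sources (hypothetical)
    rank: str = "normal"            # "preferred", "normal", or "deprecated"

s = Statement(
    claim=Claim("P39", "Q11696", qualifiers={"P580": "1861-03-04"}),
    references=["Q100"],
)
print(s.rank)  # normal
```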

What are the positives and drawbacks of this set-up? Where might there be points of tension in relation to libraries?

Loading data in bulk

A note on disk space: any method of loading data in bulk can use up a lot of disk space. If your disk is running out of space, you can hit CTRL-C to stop the data load after a few items.

QuickStatements

Problem: QuickStatements is not currently working in the Docker images. There is a Phabricator ticket for this. Also Jakob Voß and Joachim Neubert are running a whole workshop on Adding your own stuff to Wikidata that includes QuickStatements.

Via a WikidataIntegrator script

This exercise is based on a set of scripts, data, and instructions created by Matt Miller of the Linked Jazz project (blog post). It demonstrates how you can repurpose the WikidataIntegrator Python module, built to bulk load genetic and protein information into Wikidata, to instead add custom properties and items to a Wikibase instance.

The scripts use an older version of the module to allow the properties to be overloaded; the current version appears to pull all property information from Wikidata directly.

Prerequisites

  • Python 3 installed and available from the command line

    • Windows users: when you install Python, we recommend that you check the "Add Python 3.7 to your PATH" box; this will make it easier to run Python commands the next time you log into your Windows account.

  • pipenv installed and available from the command line

    • Windows users: when you install pipenv, it will tell you that you need to set the path for Python scripts. You can set the path temporarily each time you open a command prompt using a command like the following (modifying the part before ";%PATH" to match what pipenv told you):

      set PATH="C:\Users\denia\AppData\Roaming\Python\Python37\Scripts";%PATH%

Steps

The scripts assume that you have an empty Wikibase instance. See Removing all data and starting fresh for a refresher on removing all of the data from your Wikibase instance.

  1. Clone or download and extract the branch from https://github.com/dbs/data-2-wikibase/tree/wikidataintegrator_version

    If you used git, you need to check out the correct branch:

    git checkout origin/wikidataintegrator_version
  2. Inside the directory that you have just created, run the following commands:

    pipenv install

    If this fails, then run "pipenv run pip install pip==18.0" and try again

  3. Start a pipenv shell so that the modules you have installed are available in your environment:

    pipenv shell
  4. From a web browser, log into your Wikibase instance at http://localhost:8181/ as the administrator (user name: admin, password: adminpass)

  5. From a web browser, navigate to the Special pages → Bot passwords page (http://localhost:8181/wiki/Special:BotPasswords) and create a new bot:

  6. Give the bot a name (we will use "otto") and click Create. A list of possible privileges to grant to a bot is displayed.

  7. Check the boxes for "Basic rights", "High volume editing", and "Edit existing pages" and click Create to grant basic privileges to the bot. The bot is created and the password for the bot is displayed.

  8. Copy the password into the password file in your directory.

  9. Update line 30 of the add_items.py file with the otto@password bot name and password that you just created.

  10. Now add the Linked Jazz properties to your Wikibase instance with the following command:

    python add_properties.py add_properties.csv

    The script tells you it has logged into the Wikibase instance and lists the properties as it creates them.

  11. Now add the Linked Jazz core items to your Wikibase instance with the following command:

    python add_items.py add_core_items.csv
  12. Now add the Linked Jazz people to your Wikibase instance with the following command:

    python add_items.py add_jazz_people.csv

The script lists the people as they are added. There are 2009 of them and it will take a long time to complete. If you get tired of waiting you can hit CTRL-C to stop the script at any point.

Now you can check your Wikibase instance to see if Billie Holiday or Oscar Peterson show up in your data set. (They should!)

Exposition

Walk through the code around line 54 of add_properties.py

Loading data from Wikidata

Via WikibaseImport (importEntities.php)

This method imports the target item, as well as any of the properties and items needed to describe the entity. It assigns new P and Q values for all of the imported properties and items, and does not add a statement that links back to the original Wikidata item. The code comes from https://github.com/filbertkm/WikibaseImport and offers many options for bulk importing Wikidata items!

As before, the examples below use the container name wikibase-docker_wikibase_1; substitute the name of your wikibase container.

  1. To find the name of your wikibase container, run the following command:

    docker ps
  2. Import the target item Q2882604 (representing the musical group Whitehorse) by invoking the importEntities.php script from the wikibase container, using the --all-properties flag to also import any of the properties and items needed to describe the item:

    docker exec -it wikibase-docker_wikibase_1 php /var/www/html/extensions/WikibaseImport/maintenance/importEntities.php --all-properties --entity Q2882604

    It will take a minute or two to import all of the properties and items; there are a lot of them!

  3. Search for "Whitehorse" in your Wikibase instance; if everything went well, you should find a well-described item!

    If it failed near the end with the error "DB connection was already closed or the connection dropped," try again.

    Note: theoretically, to import all of the items which have received the Juno award for alternative album of the year, substitute --query P166:Q6314039 for --all-properties --entity Q2882604 in the preceding command.

If you leave in --all-properties, this seems to import every property (over 6,000 currently) from Wikidata!

Note that this is not working 100% for every targeted entity; see Phabricator ticket T209803

Querying the data

Now that we have a Wikibase instance running with some data, we can explore ways to query that data. In this next section we walk through a few queries. You are welcome to try them against your own Wikibase instance, or you can shift over to the query service on Wikidata: https://query.wikidata.org/

About the Wikidata Query Service (WDQS)

The docker image of Wikibase comes with the same SPARQL query service available in Wikidata; in the docker-compose setup it is available by default at http://localhost:8282/. The Wikidata Query Service (WDQS) provides a human-readable interface to the Wikibase SPARQL endpoint and allows users to query the data. It is separate from the MediaWiki/Wikibase platform, and includes an RDF triplestore as well as a SPARQL query API.

The query service offers two methods for building a query: the query helper and the SPARQL interface.

There are numerous existing presentations and tutorials on the query service.

Prerequisites

Make sure you have enough RAM available otherwise the query service will not run. See the instructions above for ensuring Docker has at least 4 GB of RAM available.

Using the Query Service

We’ll work through some of the tutorial to get a feel for the WDQS and to discuss any issues.

It is best to have some idea of what data is available and how it is modelled before you start writing queries. Today we all have some sense of the data we are working with; in other situations, such as querying Wikidata, it can be helpful to search for a particular item first to get a sense of the available data and how it is modelled. The "success" of a query depends on the data available.

Queries are constructed like sentences. The query below can be copied and pasted into the query box of the query service. You can modify it to search for data you know is in your Wikibase database, or you can hop over to Wikidata to try it out.

SELECT ?child
WHERE
{
# ?child  father  Schumann
  ?child wdt:P22 wd:Q7351.
}
If we want to add labels, we add ?childLabel to the SELECT clause and append the label service: SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
SELECT ?child ?childLabel
WHERE
{
# ?child  father   Schumann
  ?child wdt:P22 wd:Q7351.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

Autocompletion

The WDQS includes a helpful feature that allows for autocompletion of text. How it works:

  • At any point when entering text after a wd: or wdt: prefix, hold down the Control key and press the spacebar; matching codes (plus descriptive text) will appear. Click on the entry that matches to fill in the value.

  • It also works for the SERVICE text. Hold down the control key and the space bar to see suggested text and select the appropriate option.

More complex queries

We can spend some time working on building more complex queries. For the purposes of this example we’ll use Wikidata.

You can add multiple triple statements.

You can attach multiple predicate-object pairs to the same subject by separating them with a semicolon ( ; ), and list multiple objects for the same predicate by separating them with a comma ( , ).
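For example, the following query (against Wikidata) combines two triple patterns about the same subject with a semicolon. The entity IDs are assumptions from Wikidata: Q1339 for Johann Sebastian Bach, P106 for "occupation", and Q36834 for "composer"; adjust them if you run this against your own Wikibase.

```sparql
# Children of Johann Sebastian Bach (Q1339) whose occupation (P106)
# is composer (Q36834); the semicolon attaches a second predicate to ?child
SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339 ;
         wdt:P106 wd:Q36834 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```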


Take some time to try it out

Integrating data with external applications

SPARQL - hey, we just saw that!

Content negotiation

You can request Turtle, N-Triples, RDF/XML, and JSON representations of any single entity in Wikibase using the /entity/<id> path. For example, to request a Turtle representation of entity Q10 in your Wikibase installation:

curl -LH 'Accept: text/turtle' http://localhost:8181/entity/Q10

And here’s the same entity, but in N-Triples:

curl -LH 'Accept: application/n-triples' http://localhost:8181/entity/Q10
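The JSON representation can be requested the same way; this sketch assumes the stock docker setup, which serves entity data at the same /entity/<id> path on port 8181:

```shell
# Request a JSON representation of entity Q10 via content negotiation
curl -LH 'Accept: application/json' http://localhost:8181/entity/Q10
```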

Digging into MariaDB

You can connect to the MariaDB or MySQL database in which Wikibase stores its data through the corresponding Docker container; for example:

  1. Connect to the database inside the MariaDB container (for Wikibase 1.31) or MySQL container (for Wikibase 1.30). You can find the user name, password, and database name defined in the LocalSettings.php file. The defaults are "wikiuser", "sqlpass", and "my_wiki" respectively:

    docker exec -it wikibase-docker_mariadb_1 mysql -u wikiuser -psqlpass -D my_wiki

    You should see a database prompt similar to the following:

    MariaDB [my_wiki]>

    You can see some of the tables that are unique to Wikibase using the following statement:

    SHOW TABLES WHERE tables_in_my_wiki LIKE 'wb_%';
    
    +-------------------------+
    | Tables_in_my_wiki       |
    +-------------------------+
    | wb_changes              |
    | wb_changes_dispatch     |
    | wb_changes_subscription |
    | wb_id_counters          |
    | wb_items_per_site       |
    | wb_property_info        |
    | wb_terms                |
    | wbc_entity_usage        |
    | wbs_entity_mapping      |
    +-------------------------+
  2. Show all of the information about the properties you have defined so far:

    SELECT * FROM wb_property_info;
    
    +----------------+---------------+--------------------------
    | pi_property_id | pi_type       | pi_info
    +----------------+---------------+--------------------------
    |              2 | wikibase-item | {"type":"wikibase-item"}
    |              3 | wikibase-item | {"type":"wikibase-item"}
    |              4 | string        | {"type":"string"}
    |              5 | string        | {"type":"string"}
    |              6 | string        | {"type":"string"}
    |              7 | url           | {"type":"url"}
    |             10 | string        | {"type":"string"}
    |             11 | string        | {"type":"string"}
    |             12 | string        | {"type":"string"}
    |             13 | external-id   | {"type":"external-id","fo
    +----------------+---------------+--------------------------
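Labels and other terms live in the wb_terms table in this era of the schema. The column names below are from the pre-1.33 wb_terms layout and may differ in newer Wikibase releases:

```sql
-- English labels of all items (assumes the pre-1.33 wb_terms schema)
SELECT term_entity_id, term_text
FROM wb_terms
WHERE term_entity_type = 'item'
  AND term_type = 'label'
  AND term_language = 'en';
```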

wikidata-taxonomy

The wikidata-taxonomy tool enables you to easily explore the ontology in Wikidata or in your own Wikibase.

Prerequisites

  • NodeJS LTS ("Long Term Support") installed and available from the command line

  • Windows users: You can accept the default install options.

Install the wikidata-taxonomy package. To install and run it inside a directory, run the following commands:

npm install wikidata-taxonomy
cd node_modules/wikidata-taxonomy

Alternately, you can install it globally so that you can invoke it directly from the command line, although this could complicate your NodeJS/npm environment if you work with other NodeJS/npm packages:

npm install -g wikidata-taxonomy

Now you can explore the ontology of Wikidata. For example, you can explore the subclasses of the "library" item.

  • If you installed wikidata-taxonomy inside a directory, issue the following command:

    node wdtaxonomy.js Q7075
  • If you installed it globally, then you don’t need to call "node" or add the ".js" suffix; issue the following command:

    wdtaxonomy Q7075

In either case, you should see output like the following:

library (Q7075) •150 ×9299 ↑↑↑↑↑
├──public library (Q28564) •30 ×7142 ↑
├──national library (Q22806) •44 ×247
│  └──State public library (Q11834910) •2 ×28

To explore your Wikibase taxonomy instead of the Wikidata taxonomy, you need to:

  • specify the SPARQL endpoint of your Wikibase SPARQL proxy server using the -e/--sparql-endpoint command line option

  • specify the property that corresponds to P1709 ("equivalent class") using the -m/--mappings command line option

  • specify the properties that correspond to P279 ("subclass of") and P31 ("instance of") using the -P/--property command line option

So your final command might look like:

node wdtaxonomy.js Q4 -e http://localhost:8989/bigdata/sparql -P P297,P28 -m P251

Unfortunately, all of the results currently seem to come from Wikidata instead; see bug #45

Community development and support

The Wikibase development community

  • Phabricator

  • mailing list

  • Discourse

  • IRC channel

  • Technical Advice IRC meetings

  • Slack

  • Telegram

  • OMGWTFBBQ!

Wrap-up discussion

  • Going forward, what do we want for this community if Wikibase is to become a piece of library infrastructure?

  • What are our expectations?

  • How will we enable this? Let’s take some notes that we can share with the Wikibase team and the Wikimedia Foundation.


CC-BY-SA logo This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.