These are the materials for a workshop on getting started with Wikibase, offered at Semantic Web in Libraries 2018 in Bonn, Germany on November 26, 2018.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Originally developed for the Wikidata project, Wikibase "is a collection of applications and libraries for creating, managing and sharing structured data." (http://wikiba.se/) It offers a multilingual platform for linked open data, including a human-friendly editing interface, a SPARQL endpoint, and programmatic means of loading and accessing data, making it a potential match for libraries that like Wikidata’s platform but want to maintain a local store of linked open data.
In this workshop, we discuss how a local Wikibase instance can support library needs, and work through exercises that include:
- Setting up a local Wikibase instance
- Adding users
- Creating custom classes and properties
- Adding and editing entries
- Loading data in bulk
- Querying the data
- Integrating data with external applications
What is Wikibase?
Wikibase is developed and supported by Wikimedia Deutschland. It is a structured data repository created through an extension of the MediaWiki software. Wikidata, an implementation of Wikibase, was developed as a method for providing better control over the multilingual Wikipedia environment by providing links between “like” Wikipedia entries (think lots of “same as”). For example, through Wikidata all the pages on Celine Dion in the different language Wikipedias are linked.
It was also developed as a way to support additional functionality on Wikipedia. Statements created on Wikidata could be managed in one place and used in multiple Wikipedia pages. Examples include adding links to external data repositories (more “same as” statements, but instead to external sources such as national libraries) as well as information such as birth dates or geographic coordinates that could be used in info boxes.
Wikibase includes functionality for creating and managing a knowledge base, including user-defined properties. It has a comprehensive JavaScript-based user interface for easy access and updating your data. It has an open data model and is multilingual, making it very accommodating to a wide variety of use cases. You can export to a variety of formats including JSON, RDF/XML, N3, and Turtle. It also includes mechanisms for querying and viewing data with SPARQL. Version history is tracked. One of the biggest features of Wikibase is an excellent human-friendly interface.
It’s also important to recognize that Wikibase is a community. As with Wikipedia and Wikidata, Wikibase is part of a community. However, Wikibase is in the early stages of community development. We, as potential community members, can play an active role in shaping what it can be. We want to focus today’s discussion in part on documenting and discussing Wikibase as a community, and on how we might define community structures, needs, and culture. At the end of the workshop we will loop back to a discussion of community.
Setting up a local Wikibase instance
Assumes a Docker install, rather than a Vagrant or bare metal install.
Docker prerequisites
Laptop prerequisites
Ensure virtualization support and memory protection are enabled in your BIOS/UEFI before installing; Docker CE will not run without them.
If you do not have virtualization support on your computer, Docker CE will not run. In that case, you can use a cloud-hosted virtual machine (OpenStack, Google Cloud Compute VM, Amazon AWS… for example Quickstart for Google Cloud Compute VM on Linux)
Linux
The user account that you use to run the Docker commands either needs to be a member of the "docker" group (but be warned, that effectively means the user has superuser privileges), or you will need to prefix every Docker command with "sudo".
- Install and start Docker using your Linux flavour’s commands
- Raise the kernel’s memory map limit (required by the Elasticsearch container):
  sudo sysctl -w vm.max_map_count=262144
  This setting will revert to your system default when you reboot. To make it persistent across reboots, something like the following command should work on most Linux flavours:
  sudo sh -c 'echo vm.max_map_count=262144 > /etc/sysctl.d/10-vm_max_map_count.conf'
Windows 10 Pro/Enterprise Edition
Note: Docker CE does not work on Windows 10 Home edition.
- Install Docker Community Edition
- Allocate 4GB of RAM to Docker (Settings → Advanced → Memory)
Mac OS 10.11.3 (El Capitan) or above
- Install Docker Community Edition
- Allocate at least 4 GB of RAM to Docker (Preferences → Advanced → Memory)
Installing Wikibase (all operating systems)
Once you have Docker CE set up and configured for your operating system, perform the following steps:
- Download https://raw.githubusercontent.com/wmde/wikibase-docker/master/docker-compose.yml to the location of your choice.
- Start a command line/terminal/PowerShell and navigate to the same location as the docker-compose.yml file you just downloaded.
- Issue the docker-compose command that downloads prebuilt Wikibase Docker images:
  docker-compose -f docker-compose.yml pull
- Issue the docker-compose command that initializes and starts the Wikibase Docker images:
  docker-compose -f docker-compose.yml up
You should now be able to browse to http://localhost:8181/ and see the generic Wikibase landing page. Congratulations, you have arrived!
What’s going on under the hood?
The Docker compose configuration actually creates and controls 8 separate containers, which combined provide Wikibase, the Wikibase Query Service, and the Wikibase Quickstatements service:
- wikibase/wikibase - provides MediaWiki with extensions, including the Wikibase Repo and Wikibase Client extensions
- mariadb or mysqldb - provides the relational database that MediaWiki/Wikibase rely on
- elasticsearch - provides full-text search support
- wdqs - provides the Wikibase Query Service (Blazegraph instance)
- wdqs (updater) - retrieves updated statements from the relational database and loads them into the Wikibase Query Service
- wdqs (frontend) - provides the custom user interface for the Wikibase Query Service
- wdqs (proxy) - provides read-only access to the Wikibase Query Service, and enforces maximum query execution times
- quickstatements - provides the Wikibase QuickStatements service
Stop the Wikibase services
To stop all of the Wikibase services, navigate to the same location as the docker-compose.yml file and issue the following command:
docker-compose -f docker-compose.yml stop
To stop a single Wikibase service defined in the docker-compose.yml file, navigate to the same location and issue the following command:
docker-compose -f docker-compose.yml stop <service-name>
Start the Wikibase services
To start all of the Wikibase services, navigate to the same location as the docker-compose.yml file and issue the following command:
docker-compose -f docker-compose.yml start
To start a single Wikibase service defined in the docker-compose.yml file, navigate to the same location and issue the following command:
docker-compose -f docker-compose.yml start <service-name>
Wikibase 1.31 images
The current Wikibase Docker images package 1.30, but there are some features in 1.31 such as linkages to external vocabularies that you might want. While it is experimental, you can build your own 1.31 images.
- Retrieve a copy of the Wikibase 1.31 working branch. Either:
  - Download and extract https://github.com/dbs/wikibase-docker/archive/wikibase-1.31.zip to a known location.
  - Or clone the branch using git:
    git clone --single-branch -b wikibase-1.31 https://github.com/dbs/wikibase-docker.git
- From that location, build the Wikibase 1.31 base image, then issue the docker-compose build command to build the complete set of images:
  docker build -t wikibase/wikibase:1.31 wikibase/1.31/base/
  docker-compose -f docker-compose-build.yml build
If all goes well, all of the images will build successfully.
- Start the images, specifying the build file:
  docker-compose -f docker-compose-build.yml up
You’re now up and running with Wikibase 1.31!
Removing all data and starting fresh
If the data in your query service doesn’t seem to be updating, it may be the case that you have an old volume which is preventing the query service updater from running. Your logs might show the error message:
java.lang.IllegalStateException: RDF store reports the last update time is before the minimum safe poll time. You will have to reload from scratch or you might have missing data.
To remove all of the data from your Docker images and start from scratch, run the following command (adjusting the -f flag to point at the docker-compose.yml or docker-compose-build.yml file that you initially used to create the volumes):
docker-compose -f <compose-file> down --volumes
Adding users
The default Administrator account
When you first launch the Docker containers, a default Administrator account is created. You can find the user name and password for the account in the docker-compose.yml or docker-compose-build.yml file you used to start the instance. By default, those values are:
- MW_ADMIN_NAME=admin
- MW_ADMIN_PASS=adminpass
To change these values, edit the docker-compose configuration file before the containers are created with the first docker-compose up command. If you have already created the containers and volumes, see Removing all data and starting fresh.
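For reference, these values are set in the environment section of the wikibase service. A sketch of what that part of the file looks like (the exact keys and surrounding layout may differ between wikibase-docker versions):

```yaml
services:
  wikibase:
    environment:
      # Change these values BEFORE the first `docker-compose up`;
      # they are only applied when the volumes are first created.
      - MW_ADMIN_NAME=admin
      - MW_ADMIN_PASS=adminpass
```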
Creating user accounts
While using an Administrator account is good for getting started, it is not a good security practice to always use an account with elevated privileges, and eventually you will want to involve other users. Therefore, you should create a regular user account for yourself. The simplest way to create an account is to click the Create Account link that appears in the top right corner when you are not logged in.
You can also create accounts for other users. Click Special Pages → Create account (under the "Log in / Create account" heading).
Differentiate between privilege levels: admin, non-admin, bot
Disabling account creation and anonymous edits
You might want to expose your Wikibase content to the rest of the world so they can read and query the Wikibase contents, while at the same time restricting the ability to edit your Wikibase to users with accounts. To do this, you need to edit the LocalSettings.php file that is generated in the /var/www/html directory of the Wikibase container.
The name of your running wikibase container will generally follow the format <prefix>_<service>_<suffix>, where:
- <prefix> matches the name of the directory in which the docker-compose.yml file lives;
- <service> matches the name of the particular service defined in the docker-compose.yml file;
- <suffix> matches the integer of the instance, supporting multiple instances of the same container, possibly combined with a hash representing the version of the container.
The following examples use wikibase-docker_wikibase_1; substitute the name of your wikibase container.
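As a quick illustration (not part of the workshop materials), the naming convention can be sketched in Python; the helper name is made up:

```python
def split_container_name(name):
    """Split a Docker Compose container name of the form
    <prefix>_<service>_<suffix> into its three parts.
    Splitting from the right keeps underscores in <prefix> intact;
    a service name containing underscores would break this simple split."""
    prefix, service, suffix = name.rsplit("_", 2)
    return prefix, service, suffix

print(split_container_name("wikibase-docker_wikibase_1"))
# ('wikibase-docker', 'wikibase', '1')
```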
- Run the following command to find the name of your wikibase container:
  docker ps
- Copy the /var/www/html/LocalSettings.php file from inside the wikibase container to a location on your host computer. This example copies the file to /tmp/ but you should adjust according to your operating system:
  docker cp wikibase-docker_wikibase_1:/var/www/html/LocalSettings.php /tmp/.
- On your host computer, open the file in the text editor of your choice (Vim, Notepad++, Sublime, Atom, whatever) and add the following lines to the bottom of the file:
  $wgGroupPermissions['*']['edit'] = false;
  $wgGroupPermissions['*']['createaccount'] = false;
  Note: the file might have been created on your host computer with permissions that prevent your regular user from changing it. For example, on Linux, if you are using sudo to run Docker commands, the file will be owned by the root user. You can run sudo chmod a+w /tmp/LocalSettings.php to grant write privileges, or change the ownership of the file.
- Copy the LocalSettings.php file from the location on your host computer back into the wikibase container at /var/www/html/LocalSettings.php. This example copies the file from /tmp/ but you should adjust according to your operating system:
  docker cp /tmp/LocalSettings.php wikibase-docker_wikibase_1:/var/www/html/.
Some changes are cached by the Wikibase HTTP server, so when you reload a Wikibase page, it might show the "Create account" and "Edit" links until those pages are purged from the cache. The links will not function, however.
The easiest way to purge the cache is to restart the wikibase service with the following command:
docker-compose -f docker-compose.yml restart wikibase
Customizing Wikibase
Creating classes and properties
Let’s add some sample data to ensure the basics are working.
- To create your first item in Wikibase, click Special Pages (on the left-hand menu) → Create a new item (under the "Wikibase" heading). You can add a label, a description, and some aliases for the item. This is not very exciting; you need some properties to truly describe the item!
- To create a property in Wikibase, click Special Pages (on the left-hand menu) → Create a new property (under the "Wikibase" heading). Now you can add a label, a description, some aliases for the property, and choose a data type.
- Search for the item that you just created, retrieve the item, and click Edit. You should now be able to add a value for the property that you just created.
Considerations
- Consider your use case(s) and what data model is required
- Consider mapping to existing vocabularies
Adding a custom logo
Docker can use "volumes" to make files in your containers persistent, accessible from multiple containers, and (on Linux) directly available from your host system. If you search through the docker-compose.yml or docker-compose-build.yml file you used to start the instance, you will find declarations like the following:
services:
  wikibase:
    image: wikibase/wikibase:1.31-bundle
    volumes:
      - mediawiki-images-data:/var/www/html/images
This tells us that the wikibase service has defined a volume where images will be stored and made accessible from multiple containers.
- Get a banner image. The following example uses the SWIB 10 year anniversary logo, scaled down to an appropriate size, with the name jubi-logo.jpg; but you can use any image you like. Just adjust the name in the following instructions accordingly.
  Note: on Windows, the curl command will likely not work; you may instead need to use your browser to download the logo.
  curl -O http://swib.org/swib18/images/jubi-banner.jpg
- (Optional): The MediaWiki manual says the logo size should be 135x135 pixels. The SWIB jubilee banner is too large, so if you have ImageMagick installed, you can scale it down to create an appropriately sized logo:
  convert -scale 135x135 jubi-banner.jpg jubi-logo.jpg
- Use the "docker cp" commands demonstrated in Disabling account creation and anonymous edits to copy the logo image file into the /var/www/html/images/ directory:
  docker cp jubi-logo.jpg wikibase-docker_wikibase_1:/var/www/html/images/.
  The banner should be visible at http://localhost:8181/images/jubi-logo.jpg.
- Now you can edit the LocalSettings.php file to point to the new logo, following the process outlined in Disabling account creation and anonymous edits. This time you want to add the following line to the end of the file:
  $wgLogo = "$wgResourceBasePath/images/jubi-logo.jpg";
  Remember to copy the modified LocalSettings.php file from your host computer back into the container!
- Restart the wikibase container to flush the cache. The next page you load should display the logo.
Hide link boxes on the right hand side of item pages
This is a total hack, but if you dislike the empty "wikibase-sitelinks-wikinews" and assorted other link boxes that don’t really make sense outside of Wikidata, you can:
- Edit /var/www/html/extensions/Wikibase/view/src/ItemView.php to make the getSideHtml() function return an empty string. Something like:
  protected function getSideHtml( EntityDocument $entity ) {
      if ( !( $entity instanceof Item ) ) {
          throw new InvalidArgumentException( '$item must be an Item' );
      }
      return '';
      //return $this->getHtmlForPageImage()
      //    . $this->getHtmlForSiteLinks( $entity );
  }
Customizing the menu
It can be painful to go through the Special Pages menu every time you want to add a new item or property. Let’s customize the menu so those entries are readily available:
- Open http://localhost:8181/wiki/MediaWiki:Sidebar. You will see that this special page just contains wiki markup (oh right, this is running on top of a wiki!) like:
  * navigation
  ** mainpage|mainpage-description
  ** recentchanges-url|recentchanges
  ** randompage-url|randompage
  ** helppage|help
  * SEARCH
  * TOOLBOX
  * LANGUAGES
- Click Edit to edit the page, and change the wiki markup to add new links to the "New item" and "New property" pages. The markup for these links looks like:
  ** Special:NewItem|New item
  ** Special:NewProperty|New property
  The final version of your menu wiki markup text should look something like:
  * navigation
  ** mainpage|mainpage-description
  ** recentchanges-url|recentchanges
  ** Special:NewItem|New item
  ** Special:NewProperty|New property
  ** randompage-url|randompage
  ** helppage|help
  * SEARCH
  * TOOLBOX
  * LANGUAGES
- Click Save Changes. The page refreshes and the new links appear on your menu.
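The edit above is purely textual, so it can be sketched as a small Python helper; the function name and insertion point are our own invention, not anything the wiki provides:

```python
def add_sidebar_links(sidebar, links):
    """Insert '** Special:...' entries into MediaWiki:Sidebar wikitext,
    right after the recentchanges entry. Illustrative only; in practice
    you edit the page through the wiki interface."""
    out = []
    for line in sidebar.splitlines():
        out.append(line)
        if line.startswith("** recentchanges-url"):
            out.extend(links)  # new entries land after "Recent changes"
    return "\n".join(out)

sidebar = """* navigation
** mainpage|mainpage-description
** recentchanges-url|recentchanges
** randompage-url|randompage
** helppage|help
* SEARCH
* TOOLBOX
* LANGUAGES"""

new = add_sidebar_links(sidebar, [
    "** Special:NewItem|New item",
    "** Special:NewProperty|New property",
])
print(new)
```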
Adding media properties (sound files, images, video)
Problem: there is a data type for Commons Media, but it is hardcoded to search Wikimedia Commons and present a typeahead selector box.
- On Wikibase 1.30, it seems to work, but links out to Wikimedia Commons (which may not be appropriate for many use cases).
- On Wikibase 1.31, it fails the first time, and then the second time it will create a link to a local file that does not exist. So for now, linking to external sources is the best option.
Language support extension
Wikibase currently ships without the UniversalLanguageSelector extension, which means that when you enter text and are prompted for a language, you don’t get any typeahead support, and the edit may fail to save. That’s annoying, so let’s fix it.
The name of your running wikibase container will generally follow the format <prefix>_<service>_<suffix>, where:
- <prefix> matches the name of the directory in which the docker-compose.yml file lives;
- <service> matches the name of the particular service defined in the docker-compose.yml file;
- <suffix> matches the integer of the instance, supporting multiple instances of the same container, possibly combined with a hash representing the version of the container.
The following examples use wikibase-docker_wikibase_1; substitute the name of your wikibase container.
The steps to add the UniversalLanguageSelector extension are as follows:
- Run the following command to find the name of your wikibase container:
  docker ps
- Attach to the wikibase container by running a bash shell in it:
  docker exec -it wikibase-docker_wikibase_1 /bin/bash
- Change to the /var/www/html/extensions/ directory:
  cd /var/www/html/extensions/
- Clone the extension:
  git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/UniversalLanguageSelector.git
- Load the extension:
  echo "wfLoadExtension( 'UniversalLanguageSelector' );" >> /var/www/html/LocalSettings.php
- Run the update maintenance script to ensure any required tables are created, etc. You should run this every time you add another extension, or update to a new version of MediaWiki:
  php /var/www/html/maintenance/update.php
To check if it worked, enter text for a string property and you should now have typeahead support for en, fr, etc.
Adding quality constraints for property values
Wikidata includes a mechanism for defining constraints on property values (take a look at the property constraints defined on ISSN (P236), for example). Like the UniversalLanguageSelector extension, this support is provided through a set of extensions: the Wikibase Quality Extensions.
Supporting links to external vocabularies
Wikidata supports nicely formatted URLs for properties that resolve to external identifiers in the HTML, such as for the VIAF property.
In the Wikidata page for Melissa McClelland, if you hover over the VIAF property value "106549076", it is a link that leads to https://viaf.org/viaf/106549076/ - this magic happens thanks to the P1630 ('formatter URL') property, which enables you to define how the identifier should be formatted.
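The formatter URL mechanism is simple template substitution: the property value replaces the "$1" placeholder. A minimal sketch (the helper name is ours):

```python
def format_identifier(template, value):
    """Expand a Wikibase 'formatter URL' template by replacing the
    $1 placeholder with an identifier value."""
    return template.replace("$1", value)

url = format_identifier("https://viaf.org/viaf/$1", "106549076")
print(url)  # https://viaf.org/viaf/106549076
```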
To support this in Wikibase, perform the following steps:
- Create a new property using the "String" data type, with the following suggested values:
  - Label: formatter URL
  - Description: web page URL; URI template from which "$1" can be automatically replaced with the effective property value on items
  - Data type: String
  A new property is created. Substitute that property ID for P1630 in the following steps!
- Add the following setting to LocalSettings.php and deploy it to your wikibase container (see Disabling account creation and anonymous edits if you need a refresher!):
  $wgWBRepoSettings['formatterUrlProperty'] = 'P1630';
- Create a new property using the "External identifier" data type that mirrors Wikidata’s VIAF identifier:
  - Label: VIAF identifier
  - Description: Virtual International Authority File identifier
  - Data type: External identifier
  A new property is created. Substitute that property ID for P214 in the following steps.
- Edit the new property (P214) to add a statement:
  - Property: formatter URL
  - Value: https://viaf.org/viaf/$1
- Edit one of your items to add a VIAF identifier statement to it. For example, edit the item "Ella Fitzgerald" to add the statement:
  - Property: VIAF identifier
  - Value: 6148211
- When you reload the item for "Ella Fitzgerald", the VIAF identifier should now display 6148211 but be linked to https://viaf.org/viaf/6148211.
However, if you check the RDF generated for "Ella Fitzgerald", you will not find a meaningful link from your item to VIAF. In Wikidata, these links are generated through property P1921 ('formatter URI for RDF resource').
- To add links to your Wikibase item RDF, you need to create a new property using the "String" data type that mirrors Wikidata’s P1921:
  - Label: canonical URI
  - Description: generates RDF links to external identifiers
  - Data type: String
  A new property is created. Substitute that property ID for P1921 in the following steps.
- Add the following setting to LocalSettings.php and deploy it to your wikibase container (see Disabling account creation and anonymous edits if you need a refresher!):
  $wgWBRepoSettings['canonicalUriProperty'] = 'P1921';
- Edit your "VIAF identifier" property to add a statement:
  - Property: canonical URI
  - Value: https://viaf.org/viaf/$1
Problem: Unfortunately canonicalUriProperty does not seem to be working, just as it was not working in July.
Wikibase data model
This is a brief outline of the basic structure of the Wikibase data model.
Wikibase is not tied to one specific ontology or data model, and you are therefore free to “model the world” as suits your needs. However, it does have a base data structure.
The Basic outline primer states that “A Wikibase knowledge base is a collection of Entities.” These Entities come in two “kinds”: items (with a prefix of Q) and properties (with a prefix of P). Note: properties can have subclasses.
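The ID convention is just a prefix plus a number, which can be sketched in a few lines of Python (the function name is ours, not a Wikibase API):

```python
import re

# Q-prefixed IDs denote items, P-prefixed IDs denote properties.
ENTITY_ID = re.compile(r"^([PQ])([1-9]\d*)$")

def entity_kind(entity_id):
    """Return 'item' for Q-ids, 'property' for P-ids, None otherwise."""
    m = ENTITY_ID.match(entity_id)
    if not m:
        return None
    return "item" if m.group(1) == "Q" else "property"

print(entity_kind("Q42"))   # item
print(entity_kind("P22"))   # property
```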
Each page in Wikibase is an item. (The nice thing about using MediaWiki is that each page/item can be edited.) Every item has a label and a description; labels and descriptions can be entered in multiple languages, and they document the meaning and help users understand the use of each item. Items can also include aliases, which are useful for alternate spellings or alternate names. For example, Label: Ludwig van Beethoven, with the aliases Beethoven, Louis van Beethoven, and L. van Beethoven.
Properties can have Datatypes. These are available as a selection in the Wikibase interface and the list can be extended by developers, but not by users (you will want to give your users different levels of access.)
Qualifiers need to be used in conjunction with particular kinds of properties, for example time-based properties or properties involving a duration. Properties such as “employed at” or “president” need to be qualified with dates in order to make sense. Another example is providing information on the kind of data or the method by which it was collected; the example from the “Primer” is geographic coordinates. In these cases the qualifier is integral to the use and interpretation of the property. However, as stated above, Wikibase is an open data model, and the development and definition of qualifiers is determined by the use case.
Wikibase also includes a mechanism for documenting statement-level provenance, termed “references” on Wikibase and Wikidata. References should themselves be items in Wikibase.
Statements = Claim (Property, Value, Qualifier) + Reference + Rank
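The formula above can be sketched as plain data types; the class and field names here are our own illustration, not a Wikibase API:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    property_id: str           # e.g. "P22"
    value: object              # item ID, string, quantity, ...
    qualifiers: dict = field(default_factory=dict)  # e.g. {"start date": "1950"}

@dataclass
class Statement:
    claim: Claim
    references: list = field(default_factory=list)  # references should themselves be items
    rank: str = "normal"       # "preferred", "normal", or "deprecated"

s = Statement(claim=Claim("P22", "Q7351"), references=["Q100"])
print(s.claim.property_id, s.rank)  # P22 normal
```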
What are the positives and drawbacks of this set-up? Where might there be points of tension in relation to libraries?
Loading data in bulk
A note on disk space: any method of loading data in bulk can use up a lot of disk space. If your disk is running out of space, you can hit CTRL-C to stop the data load after a few items.
QuickStatements
Problem: QuickStatements is not currently working in the Docker images. There is a Phabricator ticket for this. Also Jakob Voß and Joachim Neubert are running a whole workshop on Adding your own stuff to Wikidata that includes QuickStatements.
Via a WikidataIntegrator script
This exercise is based on a set of scripts, data, and instructions created by Matt Miller of the Linked Jazz project (blog post). It demonstrates how you can repurpose the WikidataIntegrator Python module, built to bulk load genetic and protein information into Wikidata, to instead add custom properties and items to a Wikibase instance.
The scripts use an older version of the module to allow the properties to be overloaded; the current version appears to pull all property information from Wikidata directly.
Prerequisites
- Python 3 installed and available from the command line
  - Windows users: when you install Python, we recommend that you check the "Add Python 3.7 to your PATH" box; this will make it easier to run Python commands the next time you log into your Windows account.
- pipenv installed and available from the command line
  - Windows users: when you install pipenv, it will tell you that you need to set the path for Python scripts. You can set the path temporarily each time you open a command prompt using a command like the following (modifying the part before ";%PATH%" to match what pipenv told you):
    set PATH="C:\Users\denia\AppData\Roaming\Python\Python37\Scripts";%PATH%
Steps
The scripts assume that you have an empty Wikibase instance. See Removing all data and starting fresh for a refresher on removing all of the data from your Wikibase instance.
- Clone or download and extract the branch from https://github.com/dbs/data-2-wikibase/tree/wikidataintegrator_version
  If you used git, you need to check out the correct branch:
  git checkout origin/wikidataintegrator_version
- Inside the directory that you have just created, run the following command:
  pipenv install
  If this fails, run "pipenv run pip install pip==18.0" and try again.
- Start a pipenv shell so that the modules you have installed are available in your environment:
  pipenv shell
- From a web browser, log into your Wikibase instance at http://localhost:8181/ as the administrator (user name: admin; password: adminpass)
- From a web browser, navigate to the Special pages → Bot passwords page (http://localhost:8181/wiki/Special:BotPasswords) and create a new bot:
  - Give the bot a name (we will use "otto") and click Create. A list of possible privileges to grant to a bot is displayed.
  - Check the boxes for "Basic rights", "High volume editing", and "Edit existing pages" and click Create to grant basic privileges to the bot. The bot is created and the password for the bot is displayed.
  - Copy the password into the password file in your directory.
- Update line 30 of the add_items.py file with the otto@password bot name and password that you just created.
- Now add the Linked Jazz properties to your Wikibase instance with the following command:
  python add_properties.py add_properties.csv
  The script tells you it has logged into the Wikibase instance and lists the properties as it creates them.
- Now add the Linked Jazz core items to your Wikibase instance with the following command:
  python add_items.py add_core_items.csv
- Now add the Linked Jazz people to your Wikibase instance with the following command:
  python add_items.py add_jazz_people.csv
  The script lists the people as they are added. There are 2009 of them and it will take a long time to complete. If you get tired of waiting you can hit CTRL-C to stop the script at any point.
Now you can check your Wikibase instance to see if Billie Holiday or Oscar Peterson show up in your data set. (They should!)
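The general CSV-driven pattern can be sketched generically. Note that the column names below are invented for illustration and are not necessarily those used in add_properties.csv:

```python
import csv
import io

# Hypothetical input; the real add_properties.csv may use different columns.
sample = io.StringIO(
    "label,description,datatype\n"
    "instrument played,instrument a musician plays,wikibase-item\n"
    "birth name,name at birth,string\n"
)

properties = list(csv.DictReader(sample))
for row in properties:
    # A real script would call the Wikibase API here to create each property.
    print(f"would create property {row['label']!r} ({row['datatype']})")
```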
Exposition
Walk through the code around line 54 of add_properties.py
Loading data from Wikidata
Via WikibaseImport (importEntities.php)
This method imports the target item, as well as any of the properties and items needed to describe the entity. It assigns new P and Q values for all of the imported properties and items, and does not add a statement that links back to the original Wikidata item. The code comes from https://github.com/filbertkm/WikibaseImport and offers many options for bulk importing Wikidata items!
The name of your running wikibase container will generally follow the format <prefix>_<service>_<suffix>, where:
- <prefix> matches the name of the directory in which the docker-compose.yml file lives;
- <service> matches the name of the particular service defined in the docker-compose.yml file;
- <suffix> matches the integer of the instance, supporting multiple instances of the same container, possibly combined with a hash representing the version of the container.
The following examples use wikibase-docker_wikibase_1; substitute the name of your wikibase container.
- To find the name of your wikibase container, run the following command:
  docker ps
- Import the target item Q2882604 (representing the musical group Whitehorse) by invoking the importEntities.php script from the wikibase container, using the --all-properties flag to also import any of the properties and items needed to describe the item:
  docker exec -it wikibase-docker_wikibase_1 php /var/www/html/extensions/WikibaseImport/maintenance/importEntities.php --all-properties --entity Q2882604
It will take a minute or two to import all of the properties and items—there are a lot of them!
- Search for "Whitehorse" in your Wikibase instance; if everything went well, you should find a well-described item!
If it failed near the end with the error "DB connection was already closed or the connection dropped," try again.
Note: theoretically, to import all of the items which have received the Juno Award for alternative album of the year, substitute --query P166:Q6314039 for --all-properties --entity Q2882604 in the preceding command. If you leave in --all-properties, this seems to import every property (over 6,000 currently) from Wikidata!
Note that this is not working 100% for every targeted entity; see Phabricator ticket T209803.
Querying the data
Now that we have a Wikibase instance running with some data, we can explore ways to query that data. In this next section we are going to walk through a few queries using the query service on Wikidata. You are welcome to try this out against your own Wikibase instance or you can shift over to the query service on Wikidata: https://query.wikidata.org/
About the Wikidata Query Service (WDQS)
The Docker image of Wikibase comes with the same SPARQL query service that is available in Wikidata; on your local Wikibase it is available by default at http://localhost:8282/. The Wikidata Query Service (WDQS) provides a human-readable interface to the Wikibase SPARQL endpoint and allows users to query the data. It is separate from the MediaWiki/Wikibase platform, and includes an RDF triplestore as well as a SPARQL query API.
The query service has two different methods for creating the query. You can either use the query helper or the SPARQL interface.
There are numerous existing presentations and tutorials on the query service.
-
This tutorial provides a clear and detailed set of instructions on using the service: https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial (as a former technical writer it rocks Dan’s world)
-
There are also “Wikidata in One Page” and the “Wikidata Query Service in Brief,” which are useful as quick references for query syntax: https://www.wikidata.org/wiki/Wikidata:In_one_page
Prerequisites
Make sure you have enough RAM available, otherwise the query service will not run. See the instructions above for ensuring Docker has at least 4 GB of RAM available.
Using the Query Service
We’ll work through some of the tutorial to get a feel for the WDQS and to discuss any issues.
It is best to have some idea of what data is available and how that data is modelled before starting to write queries. Today we all have some sense of the data we are working with, but when working with data from other users, or when querying Wikidata, it can be helpful to search for a particular item first to get a sense of what data is available and how it is modelled. The “success” of a query depends on the data available.
Queries are constructed like sentences. The text below can be copied and pasted into the query box of the query service. You can modify this query to search for data you know is in your Wikibase database, or you can hop over to Wikidata to try it out.
SELECT ?child
WHERE
{
  # ?child father Bach
  ?child wdt:P22 wd:Q7351.
}
If we want to add labels, we add ?childLabel to the SELECT clause and append the label service:
SELECT ?child ?childLabel
WHERE
{
  # ?child father Bach
  ?child wdt:P22 wd:Q7351.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
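You can also run queries programmatically. When a query is sent to the endpoint with the header Accept: application/sparql-results+json, the response comes back in the standard SPARQL 1.1 JSON results format. Here is a minimal Python sketch of unpacking such a response; the sample document below is made up for illustration, not real Wikidata output:

```python
import json

# A response in the standard SPARQL 1.1 JSON results format, as returned
# when a query is sent with "Accept: application/sparql-results+json".
# The binding values below are invented for illustration.
sample = json.loads("""
{
  "head": {"vars": ["child", "childLabel"]},
  "results": {"bindings": [
    {"child": {"type": "uri", "value": "http://www.wikidata.org/entity/Q123"},
     "childLabel": {"type": "literal", "value": "Example Child"}}
  ]}
}
""")

# Each binding maps a variable name to a {type, value} pair.
for binding in sample["results"]["bindings"]:
    print(binding["child"]["value"], binding["childLabel"]["value"])
```

The "head" section lists the variables from the SELECT clause, and each entry in "bindings" is one result row.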
Autocompletion
The WDQS includes a helpful feature that allows for autocompletion of text. How it works:
-
At any point when entering text after either wdt: or wd:, hold down the Control key and the spacebar; the codes (plus descriptive text) will appear. Click on the entry that matches to fill in the value.
-
It also works for the SERVICE text. Hold down the control key and the space bar to see suggested text and select the appropriate option.
More complex queries
We can spend some time working on building more complex queries. For the purposes of this example we’ll use Wikidata.
You can add multiple triple statements.
You can attach additional predicate-object pairs to the same subject with a semicolon (;), and additional objects to the same predicate with a comma (,).
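In SPARQL, a semicolon attaches another predicate-object pair to the same subject, and a comma lists another object for the same predicate. As a sketch, the snippet below assembles such a query as a string; the P/Q identifiers are Wikidata's (instance of, occupation, human, composer, pianist) and would need to be replaced with your own Wikibase's identifiers:

```python
# Semicolon (;) repeats the subject; comma (,) repeats both the subject
# and the predicate. This query matches people who are both composers
# and pianists (Wikidata identifiers; substitute your own P/Q IDs).
query = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5 ;          # instance of: human
          wdt:P106 wd:Q36834 ,     # occupation: composer
                   wd:Q486748 .    # occupation: pianist
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
LIMIT 10
"""
print(query)
```

The comma expands to two full triples, so this finds people with both occupations, not either one.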
Take some time to try it out
Integrating data with external applications
SPARQL - hey, we just saw that!
Content negotiation
You can request Turtle, N-Triples, RDF/XML, and JSON representations of any single entity in Wikibase using the /entity/<id> path. For example, to request a Turtle representation of entity Q10 in your Wikibase installation:
curl -LH 'Accept: text/turtle' http://localhost:8181/entity/Q10
And here’s the same entity, but in N-Triples:
curl -LH 'Accept: application/n-triples' http://localhost:8181/entity/Q10
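The same requests can be issued from Python's standard library. As a sketch, the snippet below only builds the request object with the appropriate Accept header; the http://localhost:8181 base URL matches the Docker setup used in this workshop, so adjust it for your installation:

```python
import urllib.request

# Map short format names to the MIME types used for content negotiation.
ACCEPT = {
    "turtle": "text/turtle",
    "ntriples": "application/n-triples",
    "rdfxml": "application/rdf+xml",
    "json": "application/json",
}

def entity_request(base_url, entity_id, fmt="turtle"):
    """Build a content-negotiation request for the /entity/<id> path."""
    url = f"{base_url.rstrip('/')}/entity/{entity_id}"
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})

req = entity_request("http://localhost:8181", "Q10")
print(req.full_url)  # http://localhost:8181/entity/Q10
# To actually fetch the data (with your Wikibase running):
# body = urllib.request.urlopen(req).read().decode("utf-8")
```

The fetch itself is left commented out, since it requires a running Wikibase instance at that address.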
Digging into MariaDB
You can connect to the MariaDB or MySQL database in which Wikibase stores its data through the corresponding Docker container; for example:
-
Connect to the database inside the MariaDB container (for Wikibase 1.31) or MySQL container (for Wikibase 1.30). You can find the user name, password, and database name defined in the
LocalSettings.php
file. The defaults are "wikiuser", "sqlpass", and "my_wiki", respectively:
docker exec -it wikibase-docker_mariadb_1 mysql -u wikiuser -psqlpass -D my_wiki
You should see a database prompt similar to the following:
MariaDB [my_wiki]>
You can see some of the tables that are unique to Wikibase using the following statement:
SHOW TABLES WHERE tables_in_my_wiki LIKE 'wb_%';
+-------------------------+
| Tables_in_my_wiki       |
+-------------------------+
| wb_changes              |
| wb_changes_dispatch     |
| wb_changes_subscription |
| wb_id_counters          |
| wb_items_per_site       |
| wb_property_info        |
| wb_terms                |
| wbc_entity_usage        |
| wbs_entity_mapping      |
+-------------------------+
-
Show all of the information about the properties you have defined so far:
SELECT * FROM wb_property_info;
+----------------+---------------+--------------------------
| pi_property_id | pi_type       | pi_info
+----------------+---------------+--------------------------
|              2 | wikibase-item | {"type":"wikibase-item"}
|              3 | wikibase-item | {"type":"wikibase-item"}
|              4 | string        | {"type":"string"}
|              5 | string        | {"type":"string"}
|              6 | string        | {"type":"string"}
|              7 | url           | {"type":"url"}
|             10 | string        | {"type":"string"}
|             11 | string        | {"type":"string"}
|             12 | string        | {"type":"string"}
|             13 | external-id   | {"type":"external-id","fo
+----------------+---------------+--------------------------
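The pi_info column stores a small JSON blob per property. If you pull these rows out with a database driver, the standard json module decodes them; as a sketch, the rows below mirror the sample query output above:

```python
import json

# (pi_property_id, pi_info) pairs as they appear in wb_property_info;
# values copied from the sample query output above.
rows = [
    (2, '{"type":"wikibase-item"}'),
    (4, '{"type":"string"}'),
    (7, '{"type":"url"}'),
]

# Decode each JSON blob to recover the property's datatype.
types = {pid: json.loads(info)["type"] for pid, info in rows}
print(types)  # {2: 'wikibase-item', 4: 'string', 7: 'url'}
```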
wikidata-taxonomy
The wikidata-taxonomy tool enables you to easily explore the ontology in Wikidata or in your own Wikibase.
Prerequisites
-
NodeJS LTS ("Long Term Support") installed and available from the command line
-
Windows users: You can accept the default install options.
Install the wikidata-taxonomy package. To install and run it inside a directory, run the following commands:
npm install wikidata-taxonomy
cd node_modules/wikidata-taxonomy
Alternatively, you can install it globally so that you can invoke it directly from the command line; note that this could complicate your NodeJS/npm environment if you work with other NodeJS/npm packages:
npm install -g wikidata-taxonomy
Now you can explore the ontology of Wikidata. For example, you can explore the subclasses of the "library" item.
-
If you installed
wikidata-taxonomy
inside a directory, issue the following command:
node wdtaxonomy.js Q7075
-
If you installed it globally, then you don’t need to call "node" or add the ".js" suffix; issue the following command:
wdtaxonomy Q7075
In either case, you should see output like the following:
library (Q7075) •150 ×9299 ↑↑↑↑↑
├──public library (Q28564) •30 ×7142 ↑
│ ├──national library (Q22806) •44 ×247
│ │ └──State public library (Q11834910) •2 ×28
To explore your Wikibase taxonomy instead of the Wikidata taxonomy, you need to:
-
specify the SPARQL endpoint of your Wikibase SPARQL proxy server using the -e/--sparql-endpoint command line option
-
specify the property that corresponds to P1709 ("equivalent class") using the -m/--mappings command line option
-
specify the properties that correspond to P279 ("subclass of") and P31 ("instance of") using the -P/--property command line option
So your final command might look like:
node wdtaxonomy.js Q4 -e http://localhost:8989/bigdata/sparql -P P297,P28 -m P251
Unfortunately, all of the results seem to come from Wikidata instead; see bug #45.
Community development and support
The Wikibase development community
-
Phabricator
-
mailing list
-
Discourse
-
IRC channel
-
Technical Advice IRC meetings
-
Slack
-
Telegram
-
OMGWTFBBQ!
Wrap-up discussion
-
Going forward, what do we want for this community if Wikibase can become a piece of library infrastructure?
-
What are our expectations?
-
How will we enable this? Let’s take some notes that we can share with the Wikibase team and the Wikimedia Foundation.
ARL Whitepaper: https://www.arl.org/news/arl-news/4682-arl-wikimedia-and-linked-open-data-draft-white-paper-open-for-comments-through-november-30#.W_wEiJNKhKk
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.